Sign up for the course: here
Marks: link
Prerequisites:
- Python 3.8 + Jupyter notebook
- Java 12 (the scripts should also work with Java 8 or later)
- Using your own laptop is recommended.
Grading: link
No | Topic | Assignment | Materials |
---|---|---|---|
1. | Data collection: Web crawling | Python | Files |
2. | Data extraction: Apache Tika | Java | Files |
3. | Text processing: Apache OpenNLP | Java | Files |
4. | Indexing + document representation + similarity | Files | |
5. | Query expansion | Python | Files |
6. | Search engine: Apache Lucene | Java | Files |
7. | HITS + PageRank | Python | Files |
8. | Clustering + Classification | Files | |
9. | Log Analysis | Python | Files |
10. | Image Retrieval | Python | Files |
11. | MapReduce + Hadoop + Spark | Python | Files |
12. | Repetition | Files | |