Information Retrieval

Sign up for the course: here

Marks: link


Requirements:

  • Python 3.8 + Jupyter notebook
  • Java 12 (scripts should also work with Java 8+)
  • Bringing your own laptop is recommended.

Grading: link

1. Data collection: Web crawling | Python | Files
2. Data extraction: Apache Tika | Java | Files
3. Text processing: Apache OpenNLP | Java | Files
4. Indexing + document representation + similarity | Files
5. Query expansion | Python | Files
6. Search engine: Apache Lucene | Java | Files
7. HITS + PageRank | Python | Files
8. Clustering + Classification | Files
9. Log analysis | Python | Files
10. Image retrieval | Python | Files
11. MapReduce + Hadoop + Spark | Python | Files