Paper
Mentor
Corpus
None
Ratna
Key phrase Extraction
Single Document Keyphrase Extraction Using Neighborhood Knowledge, by Xiaojun Wan and Jianguo Xiao
Manisha
Key Phrase Extraction
CollabRank: Towards a Collaborative Approach to Single-Document Keyphrase Extraction
Ratna
Multilingual Word Associations
A Graph-based Approach to Mining Multilingual Word Associations from Wikipedia
Ratna
Improving Weak Ad-Hoc Queries using Wikipedia as External Corpus
Using d-gap Patterns for Index Compression
Output
Infobox template built from the text on the page, with the template values filled in
All the keyphrases extracted using the algorithm in the paper (precision, recall, and F-score will be evaluated, with a comparison against tf-idf)
Manisha Hanisha
Spelling suggestion
Question Answering
An information retrieval approach to spelling suggestion, by Sai Krishna
Quantitative Evaluation of Passage Retrieval Algorithms for Question Answering (any two approaches)
Hanisha
Hanisha
Review Spam Detection
Review Spam Detection, by Nitin Jindal and Bing Liu
Manisha
Ad-hoc search
Query Expansion
Movie Review Summarization (extensions possible)
Movie review mining and summarization (a lengthy project)
Using Wikipedia Categories for Ad Hoc Search
Cluster-Based Query Expansion, by IG Kalmanovich
Ratna Manisha
Same as above
Wikipedia and Fire data (Hindi and English)
Fire data (news corpus in English)
Wikipedia 25 GB dump
Wikipedia and a list of commonly misspelled words
Wikipedia and a list of 49 questions
Amazon reviews (1.4 GB) and spammed review list
Wikipedia and Fire data (English news corpus) + 50 queries
Fire data + queries
IMDb movie reviews
Same as above
Words with all possible associations
All the results using Wikipedia vs. all the results using normal search vs. your index from the mini project
Compressed index size vs. normal Lucene index size
Misspelled words
Questions
Spam
Spam or not
Queries Queries
Results with Wikipedia vs. results of normal Lucene index
Results with expansion vs. normal results
Manisha
Movie
Topic
Opinion Extraction
Paper
Opinion Extraction, Summarization and Tracking in News and Blog Corpora, by Lun-Wei Ku (opinion extraction only)
Mentor
Corpus
Reviews of Amazon products
Wikipedia (extract disambiguation pages for input)
Wikipedia + snapshot of Google News page
Wikipedia + some webpages
News corpus of Fire in English
Wikipedia and WordSim353 collection
Input
Output
Manisha
Karan
Clustering texts
Clustering Short Texts using Wikipedia
Linking outer web pages with Wikipedia pages
Wikify!: Linking Documents to Encyclopedic Knowledge, by Rada Mihalcea (lengthy)
Opinion Classification
A Fact/Opinion Classifier for News Articles (will need labeling of data)
Semantic relatedness of words
WikiRelate! Computing Semantic Relatedness Using Wikipedia
For all the teams participating in the track, mail the track name and your team number.
Karan
news page
Results with clustering vs. normal results
Possible Wikipedia pages that can be linked to that page
Fact or opinion
Karan Karan
Karan