Topic: Infobox Template Generation
Paper: None
Mentor: Ratna
Corpus: Wikipedia
Input: A page with no infobox
Output: Infobox template generated from the text on the page, with the values in the template filled in too

Topic: Key Phrase Extraction
Paper: Single Document Keyphrase Extraction Using Neighborhood Knowledge, by Xiaojun Wan and Jianguo Xiao
Mentor: Manisha
Corpus: Research papers in plain text (annotated with key phrases)
Input: A research paper from the corpus
Output: All the keyphrases found using the algorithm in the paper (precision, recall and F-score will be evaluated; comparison with tf-idf)

Topic: Key Phrase Extraction
Paper: CollabRank: Towards a Collaborative Approach to Single-Document Keyphrase Extraction
Mentor: Ratna
Corpus: Same as above
Input: Same as above
Output: Same as above

Topic: Multilingual Word Associations
Paper: A Graph-based Approach to Mining Multilingual Word Associations from Wikipedia
Mentor: Ratna
Corpus: Wikipedia and FIRE data (Hindi and English)
Input: Words without associations
Output: Words with all possible associations

Topic: Improving Queries
Paper: Improving Weak Ad-Hoc Queries using Wikipedia as External Corpus
Mentor: Manisha
Corpus: FIRE data (news corpus in English)
Input: Queries
Output: All the results using Wikipedia vs. all the results using normal search vs. your mini-project index

Topic: Index Compression
Paper: Using d-gap Patterns for Index Compression
Mentor: Hanisha
Corpus: Wikipedia 25 GB dump
Input: 10 GB dump of Wikipedia
Output: Compressed index size vs. normal Lucene index size

Topic: Spelling Suggestion
Paper: An information retrieval approach to spelling suggestion, by Sai Krishna
Mentor: Hanisha
Corpus: Wikipedia and a list of commonly misspelled words
Input: Misspelled words
Output: Correct spelling

Topic: Question Answering
Paper: Quantitative Evaluation of Passage Retrieval Algorithms for Question Answering (any two approaches)
Mentor: Hanisha
Corpus: Wikipedia and a list of 49 questions
Input: Questions
Output: Passages which answer the questions

Topic: Review Spam Detection
Paper: Review Spam Detection, by Nitin Jindal and Bing Liu
Mentor: Manisha
Corpus: Amazon reviews (1.4 GB) and spammed review list
Input: Spam
Output: Spam or not

Topic: Ad-hoc Search
Paper: Using Wikipedia Categories for Ad Hoc Search
Mentor: Ratna
Corpus: Wikipedia and FIRE data (English news corpus) + 50 queries
Input: Queries
Output: Results with Wikipedia vs. results of normal Lucene index

Topic: Query Expansion
Paper: Cluster-Based Query Expansion, by IG Kalmanovich
Mentor: Manisha
Corpus: FIRE data + queries
Input: Queries
Output: Results with expansion vs. normal results

Topic: Movie Review Summarization (extensions possible)
Paper: Movie review mining and summarization (a lengthy project)
Mentor: Manisha
Corpus: IMDb movie reviews
Input: Movie
Output: Summary of the reviews

Topic: Opinion Extraction
Paper: Opinion Extraction, Summarization and Tracking in News and Blog Corpora, by Lun-Wei Ku (opinion extraction only)
Mentor: Manisha
Corpus: Reviews of Amazon products
Input: Reviews
Output: Positive review or negative review

Topic: Word Sense Disambiguation
Paper: Which "Apple" Are You Talking About?
Mentor: Karan
Corpus: Wikipedia (extract disambiguation pages for input)
Input: One of the disambiguation pages
Output: List of possible pages in Wikipedia

Topic: Clustering Texts
Paper: Clustering Short Texts using Wikipedia
Mentor: Karan
Corpus: Wikipedia + snapshot of Google News page
Input: News page
Output: Results with clustering vs. normal results

Topic: Linking Outer Web Pages with Wikipedia Pages
Paper: Wikify!: linking documents to encyclopedic knowledge, by Rada Mihalcea (lengthy)
Mentor: Karan
Corpus: Wikipedia + some webpages
Input: Page
Output: Possible Wikipedia pages that can be linked to that page

Topic: Opinion Classification
Paper: A Fact/Opinion Classifier for News Articles (will need labeling of data)
Mentor: Karan
Corpus: News corpus of FIRE in English
Input: News article
Output: Fact or opinion

Topic: Semantic Relatedness of Words
Paper: WikiRelate! Computing Semantic Relatedness Using Wikipedia
Mentor: Karan
Corpus: Wikipedia and WordSim353 collection
Input: Words from the WordSim353 corpus
Output: Words related to input

All teams participating in the track: mail the track name and your team number.
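
The key phrase extraction projects are evaluated on precision, recall and F-score against the annotated key phrases. A minimal sketch of that computation follows. It assumes exact matching after lowercasing and trimming whitespace, which may be stricter or looser than what the mentors actually use. The function name `precision_recall_f1` and the sample phrases are hypothetical and only serve the illustration.

```python
def precision_recall_f1(extracted, gold):
    """Score extracted key phrases against gold annotations.

    Assumption: two phrases match only if they are identical after
    lowercasing and stripping surrounding whitespace.
    """
    ext = {p.strip().lower() for p in extracted}
    ref = {p.strip().lower() for p in gold}
    correct = len(ext & ref)

    # Guard against empty sets so the function never divides by zero.
    precision = correct / len(ext) if ext else 0.0
    recall = correct / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical sample data, not taken from any project corpus.
    extracted = ["Query Expansion", "index compression", "lucene"]
    gold = ["query expansion", "index compression",
            "spelling suggestion", "wikipedia"]
    p, r, f = precision_recall_f1(extracted, gold)
    print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

The same function can score a tf-idf baseline on the same paper, so the two sets of numbers can be compared directly.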
