by T.Mehbub Basha
Overview
Introduction
Document Preprocessing
Introduction
Information Retrieval (IR):
Concerned with satisfying the information needs of users. Ex: the documents and websites of the World Wide Web (WWW) require efficient approaches to retrieve relevant subsets.
Cont.: Many information retrieval approaches are based on Machine Translation (MT) systems. However, these systems still have high error rates (for example, in grammar and meaning).
This motivates the development of multilingual retrieval methods that do not depend on MT, or that can at least compensate for errors introduced by the translation systems.
DEFINITION OF INFORMATION RETRIEVAL: Given a collection D containing information items di and a keyword query q representing an information need, IR is defined as the task of
Common techniques used for document preprocessing: handling of document syntax and encoding, tokenization, and normalization of tokens.
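The preprocessing steps listed above can be illustrated with a minimal Python sketch. The slides do not specify an implementation, so everything here is an assumption made for illustration: the function names (`decode_document`, `tokenize`, `normalize`, `preprocess`) are hypothetical, tokenization is approximated with a simple word-character regular expression, and normalization is taken to mean Unicode decomposition, accent stripping, and case folding. Handling of document syntax (for example, stripping markup) is left out.

```python
import re
import unicodedata


def decode_document(raw: bytes, encoding: str = "utf-8") -> str:
    """Encoding step: turn raw bytes into text (the encoding is assumed known)."""
    return raw.decode(encoding, errors="replace")


def tokenize(text: str) -> list[str]:
    """Tokenization step: split text into runs of word characters (a simplification)."""
    return re.findall(r"\w+", text)


def normalize(token: str) -> str:
    """Normalization step: strip combining accents and fold case (an assumed policy)."""
    decomposed = unicodedata.normalize("NFKD", token)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_accents.casefold()


def preprocess(raw: bytes, encoding: str = "utf-8") -> list[str]:
    """Run decoding, tokenization, and normalization in sequence."""
    text = decode_document(raw, encoding)
    return [normalize(token) for token in tokenize(text)]


if __name__ == "__main__":
    sample = "Caf\u00e9 Retrieval, WWW!".encode("utf-8")
    print(preprocess(sample))
```

Keeping each step in its own function makes it easy to swap in a language-specific tokenizer or a different normalization policy, which matters in multilingual retrieval where a single rule set rarely fits every language.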