ENGINES
Search Engine?
Search engines are an invaluable tool for
retrieving information from the Web. In
response to a user query, they return a list of
results ranked in order of relevance to the
query.
E.g.: Google, Yahoo, etc.
Issues in Implementation Of
clusters
ii. Tokenization:
The text of each search result is split into a
sequence of basic independent units called
tokens, each representing a word, number, or symbol.
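A minimal sketch of this step, assuming ASCII text and a simple regular expression. The name `tokenize` and the pattern are illustrative choices, not taken from any particular clustering engine:

```python
import re

# Assumed token pattern: a run of ASCII letters, a run of digits,
# or any single character that is neither whitespace, letter, nor digit.
TOKEN_PATTERN = re.compile(r"[A-Za-z]+|\d+|[^\sA-Za-z\d]")

def tokenize(text):
    """Split a search-result snippet into word, number, and symbol tokens."""
    return TOKEN_PATTERN.findall(text)

tokens = tokenize("Web 2.0 search!")
print(tokens)
```

Real systems usually add language-specific rules (for example, keeping "2.0" as one token); this sketch treats every symbol as its own token.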
iii. Stemming:
Remove the inflectional prefixes and suffixes of
each word to reduce the different grammatical forms of
the word to a common base form called a stem.
E.g.:
connected, connecting, interconnection
→ connect
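The example above can be reproduced with a toy affix stripper. This is only a sketch: the prefix and suffix lists and the `min_stem_len` guard are assumptions made for illustration, and it is not the Porter stemmer or any other standard algorithm.

```python
# Hypothetical affix lists, chosen only to cover the slide's example.
PREFIXES = ("inter",)
SUFFIXES = ("ing", "ion", "ed")

def toy_stem(word, min_stem_len=4):
    """Strip at most one known prefix and one known suffix from a word."""
    stem = word.lower()
    for prefix in PREFIXES:
        # Only strip if enough characters remain to form a plausible stem.
        if stem.startswith(prefix) and len(stem) - len(prefix) >= min_stem_len:
            stem = stem[len(prefix):]
            break
    for suffix in SUFFIXES:
        if stem.endswith(suffix) and len(stem) - len(suffix) >= min_stem_len:
            stem = stem[:-len(suffix)]
            break
    return stem

for word in ("connected", "connecting", "interconnection"):
    print(word, "->", toy_stem(word))
```

The length guard keeps short words such as "sing" or "interest" from being cut down to fragments, but a rule list this small will still mis-stem many words.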
iv. Feature selection:
Extract features from each search result present
in the input.
Features are atomic entities by which we can
describe an object and represent its most
important characteristics to an algorithm.
Features vary from single words to tuples of
words.
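One simple way to obtain both single-word and word-tuple features is to take contiguous n-grams of the token sequence. The sketch below assumes that interpretation; `extract_features` and `max_n` are illustrative names, and real systems may select features differently (for example, frequent phrases only).

```python
def extract_features(tokens, max_n=2):
    """Return all contiguous word tuples of length 1..max_n, shortest first."""
    features = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            features.append(tuple(tokens[i:i + n]))
    return features

features = extract_features(["web", "search", "engine"])
print(features)
```

Each feature is a tuple, so single words and longer phrases can be handled uniformly by later steps.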
VSM (vector space model) representation
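In a vector space model, each search result is commonly represented as a vector with one component per feature. The sketch below assumes raw term-frequency weights and cosine similarity, which are common choices but are not stated on the slide; the function names are illustrative.

```python
import math
from collections import Counter

def build_vectors(documents):
    """Map each tokenized document to a term-frequency vector over a shared vocabulary."""
    vocabulary = sorted({term for doc in documents for term in doc})
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocabulary])
    return vocabulary, vectors

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

docs = [["web", "search"], ["web", "cluster"]]
vocabulary, vectors = build_vectors(docs)
print(vocabulary, vectors, cosine_similarity(vectors[0], vectors[1]))
```

A clustering algorithm can then group results whose vectors are close to each other under the chosen similarity measure.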
4. Visualization of Clustered
Results
One prominent approach is based on hierarchical folders.
Clusty, CREDO, Lingo3G - hierarchical folder visualization
approach
Grokker - nesting and zooming approach
KartOO - graph-based interface
THANK YOU