
Data/Document Creation Process

Persons, Places, Things, Ideas
that exist in the
Physical World

describe
Descriptions
Data/Document

are represented by
Storage System
represent

Data/Document
Retrieval System

Data Objects make up Feeds, which are organized into Tables, which provide structure to Databases. Databases could be contained in a Graph.

Data Objects
Objects residing in databases that are not referenceable by a simple URI, such as company names, addresses, phone numbers, names of people, personal profiles, horoscopes, credit card numbers, product names, prices, maps, event listings, news clips, weather.

... provide multiple views of data in a database.

News, Products, Related Links, Trademarks, Recirculation, Stocks

3rd Parties
Alexa, Bigfoot, At Hand

Metadata
Data that contains information about its format.

WWW Sites
Groups of WWW Pages in context.

Graph
A repository of data that could be stored in RDF format. Can be updated independently of the index that references it. Entries in the graph can reference each other. For instance, a request for a stock quote of an entry could point to the symbol, the symbol could point to the company name and the current quote, the current quote could point to the quote history, and the quote history could point to a quote histogram.
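The way graph entries reference each other, as in the stock quote example above, can be sketched with a small in-memory structure. This is only an illustration: the entry keys (`request:quote`, `symbol:XYZ`, and so on), the relation names, and the helper functions are hypothetical and do not come from the map, which only says the data could be stored in RDF format.

```python
# Hypothetical graph: each entry maps a relation name to the key of another entry.
graph = {
    "request:quote": {"symbol": "symbol:XYZ"},
    "symbol:XYZ": {
        "company_name": "name:Example Corp",
        "current_quote": "quote:XYZ:now",
    },
    "name:Example Corp": {},
    "quote:XYZ:now": {"quote_history": "history:XYZ"},
    "history:XYZ": {"quote_histogram": "histogram:XYZ"},
    "histogram:XYZ": {},
}


def follow(graph, start, *relations):
    """Walk from `start` along the named relations and return the entry reached."""
    entry = start
    for relation in relations:
        entry = graph[entry][relation]
    return entry


def reachable(graph, start):
    """Return every entry reachable from `start`, in breadth-first order."""
    seen, queue = [start], [start]
    while queue:
        current = queue.pop(0)
        for target in graph[current].values():
            if target not in seen:
                seen.append(target)
                queue.append(target)
    return seen
```

Calling `follow(graph, "request:quote", "symbol", "current_quote", "quote_history", "quote_histogram")` walks the same chain the description gives, from the request through to the histogram entry.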

Attributes describe WWW Sites, which make up Submissions, which are organized into Hierarchical Taxonomies, which provide structure to WWW Site Directory Databases. These databases could be contained in a Graph.

Examples named in this part of the map: Usenet, Yahoo!, Open Directory, Infoseek, Dewey Decimal, Library of Congress.
WWW Users can be Editors.

Editors
Paid or volunteer. Can assign quality ratings such as editor’s choice or cool site of the day.

WWW Objects make up WWW Pages, which make up Page Databases, which are organized into Sorted Lists, which provide structure to WWW Document Databases.

Relevance Ranking
A collection of technologies that attempt to rank the relative value of a set of results.

Google ranks results during the indexing process by examining the relationships created by the hyperlinks in the documents. More popular sites score higher.

Clever does the ranking during post processing. It uses an iterative process that looks at the popularity of a site (much like Google) but does so in regard to the keywords used in the search.

Amazon ranks related books and CDs by keeping track of the purchasing habits of groups (group profiles).

Grapevine allows users to actively rank results (a customization process) but then attempts to guess what the user will want based on what it has learned (a personalization process).

Direct Hit analyzes the time spent by users on a given result. It also keeps track of whether or not a user returns to the results page after viewing a result. If the user returns, the result receives a lower score.
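The idea that more popular sites score higher when ranking looks at hyperlink relationships can be illustrated with a deliberately simple stand-in: counting inbound links. This is not the algorithm of any engine named above; the function name and the input format (a dict of page to outgoing links) are assumptions made for the sketch.

```python
from collections import Counter


def rank_by_inbound_links(links):
    """Rank pages by how many distinct other pages link to them.

    `links` maps each page to the list of pages it links to.
    Returns page names, most-linked first; ties are broken alphabetically.
    """
    inbound = Counter()
    for source, targets in links.items():
        inbound[source] += 0  # ensure pages with no inbound links still appear
        for target in set(targets):
            if target != source:  # ignore self-links
                inbound[target] += 1
    return sorted(inbound, key=lambda page: (-inbound[page], page))
```

A real link-analysis ranker would weight links by the standing of the linking page and iterate; the count here only shows the direction of the effect.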

WWW Objects
Objects referenceable by a URI, such as html, text, gif, jpeg, png, avi, mpeg, real, quicktime, wav, pdf, rss, xml, xul files.

WWW Pages
Groups of WWW Objects in context.

Sorted Lists
Alphabetically, by relevance, by modification date.

WWW Document Databases
Inktomi, Google, Alta Vista, Lycos.

Organizing Principle

Algorithms
Such as text analysis, link analysis, meta data analysis.

Post Processors
Any processes done on the data returned from the index. Could involve sorting, ordering, or compiling the data retrieved from the index.

Crawlers
... gather all of the text on a given web page, then continue gathering text on all of the pages linked to the given page, then follow the links on those pages, and so on.
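The crawler behaviour described above (gather a page's text, then the pages it links to, then the pages those link to) amounts to a breadth-first traversal. The sketch below runs over an in-memory dict instead of the network so that it stays self-contained; the `pages` format, the `max_pages` limit, and the function name are assumptions, not part of the map.

```python
from collections import deque


def crawl(pages, start, max_pages=100):
    """Gather text breadth-first, starting from `start`.

    `pages` maps a URL to a (text, [linked URLs]) tuple and stands in for the web.
    Returns a dict of URL -> text in the order the pages were gathered.
    """
    gathered = {}
    queue = deque([start])
    while queue and len(gathered) < max_pages:
        url = queue.popleft()
        if url in gathered or url not in pages:
            continue  # already gathered, or a dead link
        text, links = pages[url]
        gathered[url] = text
        queue.extend(link for link in links if link not in gathered)
    return gathered
```

The `gathered` check is what stops the crawl from looping when pages link back to each other.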
Representations are “sensed” by Sensors, which send data to Analyzers, which send data to Indexers, which create an Index, which sends data to a Post Processor.

Results Data

Representations
Stored proxies of physical world objects and ideas.

Sensors
... are the input devices of the data/document storage system.

Indexers
An indexer creates a lookup table from all of the data fed into it. If a graph is not in use, all of the data fed into it is stored as entries in the lookup table. I-search and PLS are existing technologies.

Index
If a graph is in place, the index functions as a lightweight lookup table. If no graph is used, the index contains all of the data in addition to the lookup table, in a list format.
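The lookup table an indexer creates can be sketched as a minimal inverted index, mapping each term to the documents that contain it. This corresponds to the lightweight case, where the table holds only references and the data itself lives elsewhere (for instance in a graph). The tokenization rule and the function names are assumptions made for the example.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Build a lookup table of term -> sorted list of document ids.

    `documents` maps a document id to its text. Terms are lower-cased runs
    of letters and digits, which is a simplifying assumption.
    """
    postings = defaultdict(set)
    for doc_id, text in documents.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def lookup(index, term):
    """Return the ids of documents containing `term`, or an empty list."""
    return index.get(term.lower(), [])
```

Storing the full text next to each entry instead of just the ids would correspond to the case where no graph is used and the index holds all of the data.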

Results Data

Metadata and Data
Such as XML, XUL, or RDF. Data is self-describing and separate from the form it will eventually take.

Raw Data
Such as HTML streams. Form and content are merged.

Scraper
Scrapes unwanted form information from the data stream. Classifies remaining data.

Interpreter
Turns the user’s query into something readable by the index.
Could handle:
boolean operators
spell checking
word stemming
case folding
internationalization
thesauri
phrase searching
related terms
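A few of the interpreter duties listed above (boolean operators, word stemming, case folding, phrase searching) can be shown in a small query tokenizer. It is a sketch under stated assumptions: operators are recognized only when written in upper case, phrases are marked with double quotes, and `stem` is a naive suffix stripper standing in for a real stemmer. None of these conventions come from the map.

```python
import re

OPERATORS = {"AND", "OR", "NOT"}  # assumed operator spelling


def stem(word):
    """Very naive suffix stripping; a stand-in for a real stemming algorithm."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def interpret(query):
    """Turn a raw query string into (kind, value) tokens for an index."""
    tokens = []
    for phrase, word in re.findall(r'"([^"]*)"|(\S+)', query):
        if phrase:
            tokens.append(("phrase", phrase.casefold()))  # phrase searching
        elif word in OPERATORS:
            tokens.append(("op", word))  # boolean operators
        elif word:
            tokens.append(("term", stem(word.casefold())))  # folding + stemming
    return tokens
```

Spell checking, thesauri, and related terms would be further passes over the `term` tokens and are left out here.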

Aggregator
Combines or interleaves data from multiple sources.
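The two aggregator behaviours named above, combining and interleaving, differ only in ordering. The sketch below treats each source as a list of results; the function names and the round-robin interleaving policy are assumptions chosen for illustration.

```python
from itertools import chain, zip_longest

_MISSING = object()  # sentinel so that None can still be a legitimate result


def combine(*sources):
    """Concatenate the sources one after another."""
    return list(chain.from_iterable(sources))


def interleave(*sources):
    """Take one item from each source in turn until all are exhausted."""
    merged = []
    for group in zip_longest(*sources, fillvalue=_MISSING):
        merged.extend(item for item in group if item is not _MISSING)
    return merged
```

Interleaving keeps the top result of every source near the top of the merged list, which matters when the sources are already ranked.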
Articulator
or ‘Layout Engine’
Combines form and content.

Templates
Provide the architecture, or “form”, of the data for the creation of the results page. The templates can exist in multiple, localized formats.
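How an articulator might combine form (a template) and content (results data), with templates held in several localized formats, can be sketched with Python's standard `string.Template`. The template wording, the locale keys, and the `articulate` function are all hypothetical.

```python
from string import Template

# Hypothetical localized templates: the "form" of the results page.
TEMPLATES = {
    "en": Template("Results for $query:\n$items"),
    "de": Template("Ergebnisse für $query:\n$items"),
}


def articulate(query, results, locale="en"):
    """Merge results data (content) into the template (form) for a locale."""
    template = TEMPLATES.get(locale, TEMPLATES["en"])  # fall back to English
    items = "\n".join(f"{n}. {title}" for n, title in enumerate(results, start=1))
    return template.substitute(query=query, items=items)
```

Because the data stays separate from the template until this step, the same results can be redrawn in another locale or layout without being fetched again.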

Input Device
Could be the same as the Output Device.

Output Device
Any device that can receive data and render it for the user.

Keywords

Active Searchers
Scope is reduced by specifying additional restrictions upfront. Best for data retrieval.

Views
Could be by relevance, creation date, modification date, alphabetical, by source, by media type, involve progressive disclosure, pagination, or custom look and feel (themes).
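Two of the view options the map lists, display order and pagination, can be sketched as a single function over results data. The field names (`title`, `relevance`, `modification_date`), the choice of which orders sort descending, and the function name are assumptions for the example.

```python
from operator import itemgetter

# Assumed: scores and dates read best with the highest or newest first.
DESCENDING = {"relevance", "creation_date", "modification_date"}


def make_view(results, order_by="relevance", page=1, page_size=10):
    """Return one page of results, ordered by the given field.

    `results` is a list of dicts; `page` is 1-based.
    """
    ordered = sorted(results, key=itemgetter(order_by), reverse=order_by in DESCENDING)
    start = (page - 1) * page_size
    return ordered[start : start + page_size]
```

Passing `order_by="title"` gives the alphabetical view; the other orders differ only in the field used as the sort key.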

Options
Such as keyverbs, boolean operators, data type, media type, language or domain.

Customization Settings
Options that are stored either locally or on a server.

Results Pages
Any UI page or widget that displays the results data.

People make Queries of a Source in order to get Answers.

People
... can be described by their goals, age, gender, income, geographic location, education, hardware and software, connection speed, member status.

Behavioral History can be stored in Personalization Profiles, which can be collected and stored into Group Profiles.

Behavioral History
Click paths, session tracking, decision tracking.

Personalization Profiles
Behaviors and states that are stored either locally or on a server.

Group Profiles
Large collections of profiles that can be analyzed to produce trend information.

Passive Searchers
Scope is reduced by reducing choices from a found set, in a sequence of ever smaller sets. Best for document retrieval.

Data/Document Retrieval Process

Environmental State
Browser type and version, language setting, IP address, current page, monitor size, color settings, javascript capability.

Information to enable Actions toward Goals

Goals
Could be organized as cultural responses to human needs (Malinowski), roughly: Food, Kinship, Shelter, Protection, Activities, Training, Hygiene.
Or more simply: Work, Play, Learn and Food, Clothing, Shelter, Love.

User
Context

Understanding Internet Search
Concepts, Systems and Processes
Search Concept Map, version 1.2
8 August 1999
Designed by Matt Leacock

The suggested starting point for reading is “People”.
I would like to acknowledge Hugh Dubberly for his many suggestions, and Ken Hickman and Paul Pangaro for their contributions.
