Web search exercise


A search for “public libraries collection weeding”

Submitted by

Hanem Ibrahim

At first it was very hard for me to choose a topic for this exercise. I attempted it more than
three times with different topics, such as West Nile Virus and HIV, and the results came back in
the millions. I could not narrow any of them to under 100 hits unless I restricted the file format
to ppt, and that option was not available in some search engines, such as Bing, which I did not
use for this topic. I then tried an easier subject that I had studied before, so that I would be
able to evaluate the results. This table shows the steps I took to bring the results under 100 hits.

Search step                     Google    Yahoo     AltaVista   monstercrawler

Simple search                   54,200    183,000   183,000     77
OR, AND, “”                     51,000    173,000   209,000     74
OR, AND, “”, NOT                4,400     16,800    25,500      68
PDF format                      1,280     1,070     1,350       7
English PDF                     1,260     1,050     1,330       -
PDF, .edu domain                17        10        17          -
PDF, .com domain                96        40        63          -
DOC format                      157       166       207         -
DOC, .com domain                24        10        14          -
DOC, .edu domain                7         6         10          -
Exact words                     3         1         2           -
  “public library” “weeding”
  in all formats, languages
  and domains

1- I chose the narrow topic of weeding in public libraries. Because the topic had to be
non-unitary, I added some related terms that I knew would make it a non-unitary search:
collection, development, and selection.
2- I chose three web search engines (Google, Yahoo, AltaVista) and one meta search
engine (monstercrawler).
3- I entered these terms without any AND or OR operators to run a preliminary, simple
search, the way I used to search before this course taught me the right way to search the
web. The results came in the thousands, as always. Only the meta search engine returned
fewer than 100 hits.
4- I tried the advanced search in each of the three web search engines first, because they
have almost the same features, and I thought about how to use the Boolean operators to
narrow the results. I put the words (public library collection) in the box that searches
for all of the words, where the default is AND. Then I put the word (weeding) in the box
that searches for the exact word or phrase (“”), and the two words (development
selection) in the box that searches for one or more of the words (OR). The results from
Google and Yahoo were lower than before, but AltaVista unexpectedly returned a higher
number than before (209,000 hits), and all three were still in the thousands!
5- I noticed that the results included some words I did not want in my topic, such as
“college” or “school” libraries. To eliminate them, I put these and similar words
(academic, university) in the box for unwanted words (NOT). The results were far fewer
than before, but still not under 100 hits.
6- I used the tools available in each search engine to try to reduce the results further. I
thought that limiting the results to papers in PDF format would get me to the goal, but
there were still over a thousand hits.
7- I used the language box to limit the results to papers in English only, and the results
were still in the thousands.
8- I limited the English PDF papers to those published in the .edu domain, so that they
would be suitable for a university student like me. Finally I reached the goal: fewer than
a hundred English papers in PDF format in the .edu domain in each of the three web
search engines.
9- I tried the same format under the .com domain, and there were more than a hundred.
10- I also tried the DOC format in both the .edu and .com domains, and the results were
fewer than a hundred.

11- I wanted to see what results I would get if I put the terms (public libraries, weeding)
into the exact word or phrase box. The results were three hits in Google, one hit in
Yahoo, and two hits in AltaVista.
12- The meta search engine (monstercrawler) cannot limit a search by format, so I counted
the PDF papers in its results by hand and found only 7.
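
The steps above can be sketched as a small query builder. This is only an illustration: the function name `build_query` is hypothetical, and the operator spelling (quotes for an exact phrase, `OR`, a leading minus for NOT, `filetype:` and `site:`) follows the Google-style syntax, which other engines may not accept in the same form.

```python
def build_query(all_words, exact_phrase="", any_words=(), none_words=(),
                filetype="", site=""):
    """Combine the advanced-search boxes into one operator-style query string.

    Hypothetical helper for illustration; operator syntax is Google-style.
    """
    parts = list(all_words)                          # "all these words" box (AND)
    if exact_phrase:
        parts.append(f'"{exact_phrase}"')            # exact word or phrase box
    if any_words:
        parts.append("(" + " OR ".join(any_words) + ")")  # "one or more" box (OR)
    parts.extend(f"-{word}" for word in none_words)  # unwanted words box (NOT)
    if filetype:
        parts.append(f"filetype:{filetype}")         # format limit, e.g. pdf
    if site:
        parts.append(f"site:{site}")                 # domain limit, e.g. .edu
    return " ".join(parts)


query = build_query(
    all_words=["public", "libraries", "collection"],
    exact_phrase="weeding",
    any_words=["development", "selection"],
    none_words=["academic", "college", "university", "school"],
    filetype="pdf",
    site=".edu",
)
print(query)
```

Each argument corresponds to one box of the advanced search form, so dropping an argument reproduces an earlier, broader step of the table.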

The evaluation of the results

In terms of quantity, I noticed that Google started the simple search with fewer results
than the other two search engines. Yahoo and AltaVista returned exactly the same number
(183,000 hits)! I thought I had made a mistake with the spelling or the spacing, as I sometimes
do without noticing, but nothing was wrong, and I am still wondering why that happened. Is it
possible that both Yahoo and AltaVista use OR rather than AND as the default Boolean operator?
According to a review of AltaVista (1), the default Boolean operation of AltaVista for multiple
terms has changed many times and depends on which search form is used: a simple search with
four or fewer terms is an automatic AND, while a search with five or more terms is an automatic
OR. That could be the reason I got more results from Yahoo and AltaVista than from Google in
the simple search, because I used more than five words (public, libraries, collection, weeding,
development, and selection). When it comes to the different formats and domains, however,
Google returned more results than the other two in some cases, such as PDF files in the .com domain.
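
The term-count rule described in the review can be written out as a short sketch. The function name `default_operator` is hypothetical, and the rule is only the one summarized above from the 2004 review; it is not a statement of how any engine actually behaves today.

```python
def default_operator(query):
    """Return the implied Boolean operator for a simple search.

    Assumption taken from the review as summarized in the text:
    four or fewer terms are combined with AND, five or more with OR.
    """
    terms = query.split()
    return "AND" if len(terms) <= 4 else "OR"


simple_search = "public libraries collection weeding development selection"
print(len(simple_search.split()), default_operator(simple_search))
```

With six terms the simple search falls on the OR side of the rule, which would widen the result set rather than narrow it.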

In terms of the quality of the results, I found that it is important to narrow the topic
with the limiters offered by the advanced search, especially when searching for a specific
subject or a specific format. However, it can also be very useful to browse the results of a
broad search. I know it is impossible to go through the thousands of items returned by a broad
search, so I went through the first ten or twenty results from each of the three web search
engines and found useful sites and web pages containing good information about my topic. In the
advanced search with Google, for example, using (AND, “”, OR, NOT), there were 4,400 results. I
went through the first 10 hits and found them all very relevant to the subject, yet none of them
appeared in the final list of fewer than a hundred results, simply because I had assumed that
PDF format is better than HTML format and that the .edu domain would give the best results I
could have. Those first ten results are missing from the final list because they are in HTML
format, or in PDF format but within the .org or .gov domains. So we can work through a large
number of results if we choose terms that are really related to the research topic and take
advantage of the advanced search tools to get the right results for our topic.

I then compared Google's first ten results with the first ten results in Yahoo and
AltaVista. In Yahoo, there were no important results among the first ten of its 16,800 hits with
(AND, “”, OR, NOT). Most were libraries' web pages that are not very relevant to the topic; they
only mention the word “weeding” in the library's collection development policy. AltaVista shows
some Sponsored Matches before the actual results, which are a way of advertising books for sale.
These sponsored hits match some of the words in the query, but not all of them. The engine has
to show them for the sponsors who support its existence on the web. Its first ten results out of
25,500 hits were mostly the same as Yahoo's: some public libraries' web pages without much
detail on the subject, plus a book for sale.

(1) Notess, Greg R. (2004). Review of AltaVista. Retrieved on July 13th 2009 from:
http://www.searchengineshowdown.com/features/av/index.shtml

As for the meta search engine (monstercrawler), it did not give me the best results, which
I had expected, and the same was true of the other meta search engines I have tried before. They
either have no advanced search, or they have one without good search tools, so the results are
not as good as they should be. Dogpile, for example, is a meta search engine whose results do
not seem right to me, and it shows no result count that I could use to track the progress of my
search. Monster Crawler, as its web site states (2), combines the power of all the leading search
engines in one search box to deliver the best combined results. It combines Google, Yahoo
Search, MSN Bing, and Ask. This is the idea behind meta search engines: the process is supposed
to be more efficient and to yield many more relevant results. That did not happen here. In the
first twenty results of monstercrawler I found three that were complete false drops. The first
was the fourth hit, and it was about a wedding collection! The sixth and the eighth were about
the same wrong topic. I know it was not a spelling mistake on my part; the engine failed to
match the right term, even though I had put the word “weeding” in the exact word or phrase box.
Is that the relevancy they are talking about? I also found the word “school” in the title of one
of the results, although I had put this word in the box of unwanted words, which is very
disturbing for results that are supposed to be the best of all search engines.

Another problem with this meta search engine is the domain box, which has two options:
include or exclude the domain. When I chose to include a domain without entering any domain
abbreviation (.edu or .com), I had trouble with both options, and the result counts in all the
domain searches do not seem accurate. For example, the simple search returned 77 hits and the
search with (AND, OR, “”, NOT) returned 68. Choosing to exclude a domain without entering any
domain still showed 68 hits, but including .com showed 71 hits and including .edu showed 53
hits. How can that be, when the whole set is only 68 hits? When I ran the same search terms the
next day, the number of results was slightly different, the content had changed somewhat, and
there were more false drops.

(2) http://monstercrawler.com/about.htm Retrieved on July 13th 2009.
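
The problem with the domain counts can be checked with simple arithmetic. The sketch below uses only the hit counts reported above; the function name `count_problems` is hypothetical. It relies on one assumption: a page belongs to a single top-level domain, so the .com and .edu subsets cannot overlap and neither can be larger than the unrestricted result set.

```python
def count_problems(total, per_domain):
    """List the ways domain-restricted hit counts contradict the total.

    Assumes each page sits in exactly one top-level domain, so the
    restricted sets are disjoint subsets of the unrestricted set.
    """
    problems = []
    for domain, hits in per_domain.items():
        if hits > total:
            problems.append(f"{domain}: {hits} hits exceed the total of {total}")
    combined = sum(per_domain.values())
    if combined > total:
        problems.append(f"combined {combined} hits exceed the total of {total}")
    return problems


# Counts reported for monstercrawler with (AND, OR, "", NOT)
problems = count_problems(total=68, per_domain={".com": 71, ".edu": 53})
for line in problems:
    print(line)
```

Both checks fail for the reported numbers, which supports the impression that the engine's counts are estimates rather than exact totals.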

If I tested the final lists from these search engines with the precision equation, it would
be a good measure of whether the results are what I really wanted. I went through the contents
of the first ten results from each engine to measure their relevancy. Of the only ten results
from Yahoo in PDF format under the .edu domain, none is relevant to the search topic, so the
precision is 0%. AltaVista had three relevant results among the first ten of its 17 hits, so the
precision is 30%, and Google had two relevant results among the first ten of its 17 hits, a
precision of 20%. These results are not reliable, because the limits I applied to bring the
totals under a hundred made me lose the right results. Now I know that not all PDF files on the
web are important for a search, and I also learned that not everything published under the .edu
domain is relevant. If I measured relevancy on the first ten hits of the first thousand results,
the outcome would be different. The results in the final lists of fewer than a hundred hits were
not very useful for the research topic: the PDF files within the .edu domain contained copies of
library policies or guidelines for weeding or collection development. They are not like the PDF
files I found in the first ten of the 4,400 Google results mentioned earlier. In the final lists,
the first ten or twenty results did not include the most important results I expected. When I
put the terms (public library weeding) in the exact word or phrase box, I got very few hits in
all the search engines, but the results did not match the topic; they were electronic messages
published on the net. I think that searching the web search engines for exact terms may not be
useful unless we only want to count how many sites mention those exact words. I think that these
web search engines, and the meta search engines in particular, need to be improved: the way they
classify and organize the information they gather should be improved, and so should their
advanced search tools.
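
The precision figures above follow from dividing the number of relevant results by the number of results examined. A minimal sketch, using only the relevance judgments reported in this exercise (the function name `precision` and the dictionary layout are illustrative choices):

```python
def precision(relevant, examined):
    """Precision = relevant results / results examined."""
    if examined <= 0:
        raise ValueError("at least one result must be examined")
    return relevant / examined


# (relevant, examined) for the first ten PDF hits in the .edu domain
judged = {"Yahoo": (0, 10), "AltaVista": (3, 10), "Google": (2, 10)}

for engine, (relevant, examined) in judged.items():
    print(f"{engine}: {precision(relevant, examined):.0%}")
```

Because only the first ten hits were judged, these values describe the top of each list and not the whole result set.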

If I had more time to repeat this search, I would be able to choose more effective terms
than I did here, which might lead to better results while still staying under a hundred hits. I
think I will get better at searching each time I practice and learn more about the right way to
search the web. I also think that the web is not always the right place to search for study or
research; it is better to use the libraries' databases. Still, we have to know how to think of
more effective terms and formats, with the help of the advanced search tools, to reach better
results.

References

1- AltaVista search
http://www.altavista.com/web/results?itag=ody&pg=aq&aqmode=s&aqa=public+libraries+collection++&aqp=weeding&aqo=selection+development&aqn=academic+college+university+school&kgs=0&kls=1&dt=tmperiod&d2=0&dfr%5Bd%5D=1&dfr%5Bm%5D=1&dfr%5By%5D=1980&dto%5Bd%5D=14&dto%5Bm%5D=7&dto%5By%5D=2009&filetype=pdf&rc=dmn&swd=.edu&lh=&nbq=10
2- Google search
http://www.google.com/search?hl=en&rlz=1T4RNWN_enUS320US321&as_q=public+libraries+collection+&as_epq=weeding&as_oq=development+selection&as_eq=academic+college+university+school&num=10&lr=lang_en&as_filetype=pdf&ft=i&as_sitesearch=.edu&as_qdr=all&as_rights=&as_occt=any&cr=&as_nlo=&as_nhi=&safe=images
3- Meta search (monstercrawler)
http://search.monstercrawler.com/monster/ws/results/Web/public%20libraries%20collection%20development%20selection%20'weeding'%20-academic%20-college%20-school%20-university/1/485/TopNavigation/Relevance/iq=true/zoom=off/_iceUrlFlag=7?_IceUrl=true&adv=qall%3dpublic%2blibraries%2bcollection%26qphrase%3dweeding%26qany%3ddevelopment%2bselection%26qnot%3dacademic%2bcollege%2bschool%2buniversity%26domaini%3dinclude
4- Notess, Greg R. (2004). Review of AltaVista. Retrieved on July 13th 2009 from:
http://www.searchengineshowdown.com/features/av/index.shtml
5- The web page of monstercrawler. Retrieved on July 13th 2009 from:
http://monstercrawler.com/about.htm
6- Yahoo search
http://search.yahoo.com/search?n=10&ei=UTF-8&va_vt=any&vo_vt=any&ve_vt=any&vp_vt=any&vd=all&vf=pdf&vm=p&fl=0&fr=slv8-grpj&p=public+libraries+collection+development+OR+selection+%22weeding%22+-academic+-college+-university+-school&vs=.edu
