What is a Search Engine Spider?

By Craig Mazur | December 17, 2003 | Copyright 2003 - All Rights Reserved
Crawlers, robots, algorithms, Web bots and spiders. They feed upon the content of Web pages. They skitter through the endless maze of electronic paths of Cyberspace, seeking, following, comparing, ranking, and ultimately passing judgement upon each Web page they find, assigning it a position relative to all others. Given that there are billions of Web pages and documents on the Internet (some estimate 20+ billion), with millions more being added each month, their endless task is formidable.
These sneaky little spiders are smart little critters. But just what are search engine spiders, and what do they look like? Where did they come from, and how do they travel from site to site? Are they like viruses passing from one computer to another? Just what type of creatures are these busy little guys?
No matter what you want to call them, it is usually considered a good thing when a search engine spider visits your Web site. These mysterious little entities can come calling at any time of the day or night. While there, they seemingly scurry through your site and curiously follow links from page to page. They inspect and dissect the code in your Web pages and form an opinion about the quality of your content. Finally, they determine which keywords best represent your pages and rank each page relative to the thousands, or sometimes millions, of Web pages with similar content.
Their monikers may vary, but each evokes an image of some type of tireless, intelligent, biological or mechanical creature unleashed and roaming the Internet. The reality is much simpler, but doesn't tease the imagination with the same visual appeal. Search engine spiders are just computer algorithms: cold and lifeless computer code.
A little history
Way back in the early 1990s (eons ago in Cyberspace time), the Internet was primarily made up of a network of servers containing text documents. No images or sound. No Flash, videos or other multimedia. Just text documents. In those days a user had to know the specific address of a site or a document in order to find it. Directories were created where Webmasters could post information about their Web sites so users could find them. As the amount of content on the Internet grew rapidly, the need arose for a more methodical way to gather and index information.
Out of this need arose a range of computer programs, which programmers and denizens of the Internet whimsically called robots or spiders in order to evoke images of tireless, methodical beings with free range over the Internet. Meta tags were added to documents to provide the spiders with a description of a page and keywords to be used to find it. As the World Wide Web came into existence and evolved, those methods became outdated and prone to easy deceit.
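To make the meta tag idea concrete, here is a minimal sketch in Python of how a program might read a page's description and keywords. The sample page, the MetaTagParser class and the read_meta function are assumptions made up for illustration; they are not taken from any actual search engine.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects the description and keywords meta tags from a page."""

    def __init__(self):
        super().__init__()
        self.description = ""
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute values can be None (e.g. <meta name>), so default to "".
        name = (attr.get("name") or "").lower()
        content = attr.get("content") or ""
        if name == "description":
            self.description = content.strip()
        elif name == "keywords":
            # Keywords are conventionally a comma-separated list.
            self.keywords = [k.strip() for k in content.split(",") if k.strip()]


def read_meta(html):
    """Return (description, keywords) found in the given HTML text."""
    parser = MetaTagParser()
    parser.feed(html)
    return parser.description, parser.keywords


# Hypothetical page used only for this example.
SAMPLE_PAGE = """<html><head>
<title>Garden Tools</title>
<meta name="description" content="Hand tools for home gardeners.">
<meta name="keywords" content="garden, tools, trowel">
</head><body><p>Welcome.</p></body></html>"""

description, keywords = read_meta(SAMPLE_PAGE)
print(description)
print(keywords)
```

Because the page author writes these tags, nothing stops them from listing keywords that have little to do with the page, which is why this approach proved so easy to abuse.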
With billions of Web pages and documents on today's World Wide Web, valuable information and content is easily diluted, making it difficult to find the exact information you are looking for. Search engines have developed into systems that not only search for and index as many Web pages and documents as they can find, but also automatically analyze and categorize the content they find. The ultimate goal of every modern search engine is to provide the most meaningful search results for its users. After all, if users feel that their keyword searches are not producing the results they expect, they will find a search engine that does satisfy their needs.
Current search engine algorithms have evolved into very sophisticated computer programs that request a Web page much as a browser does. A request is sent out from the search engine's computer to retrieve a copy of the Web page code for a specific URL. The code is transferred to the search engine's computer. But instead of displaying the Web page, the algorithm parses, dissects and analyzes the code. The algorithm is made up of logical rules that allow it to automatically make hundreds of judgements about a Web page and its content. Based upon this analysis, points and demerits are assigned and are used to rank the page. The algorithm also parses out any other URLs it finds and adds them to its database. Requests are sent out for those URLs and the cycle continues over and over.
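The request, parse and follow cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the crawl function, the LinkParser class and the tiny FAKE_WEB dictionary standing in for the Internet are all hypothetical, and the ranking step is left out because real ranking rules are not public.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawl(start_url, fetch, max_pages=10):
    """Visit pages breadth-first, starting at start_url.

    fetch(url) is supplied by the caller and returns the page's HTML
    text, or None if the page cannot be retrieved.
    """
    queue = deque([start_url])   # URLs waiting to be requested
    seen = {start_url}           # URLs already queued, to avoid repeats
    visited = []                 # URLs successfully retrieved, in order
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited


# A made-up three-page site; a real spider would fetch over HTTP instead.
FAKE_WEB = {
    "http://example.com/": '<a href="/a.html">A</a> <a href="/b.html">B</a>',
    "http://example.com/a.html": '<a href="/">home</a> <a href="b.html#top">B</a>',
    "http://example.com/b.html": '<a href="/missing.html">gone</a>',
}

order = crawl("http://example.com/", FAKE_WEB.get)
print(order)
```

Passing the fetch function in as a parameter keeps the sketch runnable without a network connection. A real spider would issue HTTP requests at that point and would normally also honor each site's robots.txt file.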
All search engine algorithms are proprietary and the rules each one applies to Web pages are closely guarded secrets. They remain secret in order to prevent people from circumventing or defeating them. Many search engines publish guidelines that suggest the right and wrong ways to do things. Other techniques have evolved through trial and error and, when applied to Web pages, have been shown to produce positive results. These are called search engine friendly techniques, and they are the basis for the Top Rank Solutions philosophy.
Algorithms are in a constant state of evolution, and their ranking criteria frequently change without warning. Changes in algorithms become apparent only after large numbers of Web pages change positions in a search engine's results. Sometimes a Web page moves to a higher rank position, and sometimes to a lower position. An elevation to a higher position can mean that a page exhibits traits that are deemed to be better or more significant than those of similar pages. Sometimes a page that previously ranked well is demoted in rank.
For several years the intelligence of search engine algorithms lagged behind the techniques being developed to artificially elevate a page's rank. These techniques, referred to as spam, were designed to fool a search engine algorithm or feed it erroneous information. In recent times, most algorithms have become sophisticated enough to detect these methods, and they will now penalize a Web page if they detect something in the page design or code that they determine to be suspicious. Some violations are deemed to be so egregious that an entire site may be temporarily or permanently removed from a search engine's index. The best long-term plan for dealing with these issues is to adopt the search engine friendly philosophy and apply it to your Web development techniques.
Sorry to disillusion those of you who may believe otherwise, but there are no friendly, energetic little arachnids or other mythical or mechanical beings cruising the Internet or rummaging through your Web site. And any intelligence that search engine spiders exhibit is not the enlightened work of metaphysical creatures from Cyberspace. A spider is just a metaphor for an algorithm, and an algorithm is just the work of mere mortal programmers.
