Unit 1

Unit 1: World Wide Web
The World Wide Web, abbreviated as WWW and commonly known as the Web, is a system of
interlinked hypertext documents accessed via the Internet. With a web browser, one can view
web pages that may contain text, images, videos, and other multimedia and navigate between
them via hyperlinks. Using concepts from earlier hypertext systems, English engineer and
computer scientist Sir Tim Berners-Lee, now the Director of the World Wide Web Consortium,
wrote a proposal in March 1989 for what would eventually become the World Wide Web. At
CERN in Geneva, Switzerland, Berners-Lee and Belgian computer scientist Robert Cailliau
proposed in 1990 to use "Hypertext to link and access information of various kinds as a web of
nodes in which the user can browse at will", and publicly introduced the project in December.
"The World-Wide Web (W3) was developed to be a pool of human knowledge, and human
culture, which would allow collaborators in remote sites to share their ideas and all aspects of
a common project."
WWW
W World (the ability to access information around the world)
W Wide (a large span of computers; vast information stored on a number of web servers)
W Web (information that is linked to one another; text and visual information stored at
multiple locations)
Information is cross-platform
1. Access information from any hardware: Intel, Apple
2. Access information from any software (operating system):
   1. Windows
   2. UNIX
   3. Linux
What is required to exchange/access information? A protocol:
1. HTTP (Hypertext Transfer Protocol)
2. FTP (File Transfer Protocol)
Internet Vs World Wide Web:

The terms Internet and World Wide Web are often used in every-day speech without much
distinction. However, the Internet and the World Wide Web are not one and the same.
Internet: The Internet is a global system of interconnected computer networks.
World Wide Web: The Web is one of the services that run on the Internet. It is a collection of
interconnected documents and other resources, linked by hyperlinks and URLs. In short, the
Web is an application running on the Internet.
Viewing a web page on the World Wide Web normally begins either by typing the URL of the
page into a web browser, or by following a hyperlink to that page or resource. The web
browser then initiates a series of communication messages, behind the scenes, in order to
fetch and display it.
First, the server-name portion of the URL is resolved into an IP address using the global,
distributed Internet database known as the Domain Name System (DNS). This IP address is
necessary to contact the Web server. The browser then requests the resource by sending an
HTTP request to the Web server at that particular address. In the case of a typical web page,
the HTML text of the page is requested first and parsed immediately by the web browser,
which then makes additional requests for images and any other files that complete the page
image. Statistics measuring a website's popularity are usually based either on the number of
page views or associated server 'hits' (file requests) that take place.
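The steps a browser takes can be sketched roughly in Python. This is a minimal illustration, not how any real browser is implemented: the function names build_get_request and resolve are made up for this example, and www.example.com is a placeholder address. The request is only built and printed here; nothing is sent over the network.

```python
import socket
from urllib.parse import urlsplit


def build_get_request(url):
    """Return (host, port, raw HTTP request bytes) for the given URL."""
    parts = urlsplit(url)
    host = parts.hostname
    # Use the port from the URL, or the default for the scheme.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
    return host, port, request.encode("ascii")


def resolve(host):
    """Ask DNS (through the operating system) for an IPv4 address.

    Needs a working network connection, so it is not called below.
    """
    return socket.gethostbyname(host)


host, port, request = build_get_request("http://www.example.com/index.html")
print(host, port)
print(request.decode("ascii"))
```

A real browser would next call something like resolve(host), open a TCP connection to that address and port, send the request bytes, and then parse the HTML it receives to find further files to request.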
Internet: The Internet is a massive network of networks, a networking infrastructure. It
connects millions of computers together globally, forming a network in which any computer can
communicate with any other computer as long as they are both connected to the Internet.
Information that travels over the Internet does so via a variety of languages known as protocols.
World Wide Web (Web): The World Wide Web, or simply Web, is a way of accessing information
over the medium of the Internet. It is an information-sharing model that is built on top of the
Internet. The Web uses the HTTP protocol, only one of the languages spoken over the Internet,
to transmit data. Web services, which use HTTP to allow applications to communicate in order
to exchange business logic, use the Web to share information. The Web also utilizes browsers,
such as Internet Explorer or Firefox, to access Web documents called Web pages that are linked
to each other via hyperlinks. Web documents also contain graphics, sounds, text and video.
Web Services Uses:
1. HTTP to allow applications to communicate in order to exchange business logic.
2. Share information.
3. Utilizes browsers, such as Internet Explorer or Firefox, to access Web documents
called Web pages that are linked to each other via hyperlinks.
The Internet, not the Web, is also used for e-mail, which relies on SMTP, Usenet news
groups, instant messaging and FTP. So the Web is just a portion of the Internet, albeit a large
portion, but the two terms are not synonymous and should not be confused.
URL
In computing, a Uniform Resource Locator (URL) is a Uniform Resource Identifier (URI) that
specifies where an identified resource is available and the mechanism for retrieving it. In
popular usage and in many technical documents and verbal discussions it is often incorrectly
used as a synonym for URI. The best-known example of the use of URLs is for the addresses of
web pages on the World Wide Web, such as http://www.example.com/.
Every URL consists of some of the following: the scheme name (commonly called protocol),
followed by a colon, then, depending on scheme, a domain name (alternatively, IP address), a
port number, the path of the resource to be fetched or the program to be run, then, for
programs such as Common Gateway Interface (CGI) scripts, a query string, and an optional
fragment identifier.
The syntax is scheme://domain:port/path?query_string#fragment_id
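As an illustration of this syntax, Python's standard urllib.parse module can split a URL into the parts named above. The URL used here is a made-up example.

```python
from urllib.parse import urlsplit

# A hypothetical URL that uses every part of the syntax.
url = "http://www.example.com:8080/docs/page.html?lang=en#intro"
parts = urlsplit(url)

print(parts.scheme)    # http
print(parts.hostname)  # www.example.com
print(parts.port)      # 8080
print(parts.path)      # /docs/page.html
print(parts.query)     # lang=en
print(parts.fragment)  # intro
```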
Scheme
The scheme name defines the namespace, purpose, and the syntax of the remaining part of
the URL.
Software will try to process a URL according to its scheme and context. For example, a web
browser will usually dereference the URL http://example.org:80 by performing an HTTP
request to the host at example.org, using port number 80.
Other examples of scheme names include https, gopher, and ftp.
Secure Website
URLs with https as a scheme (such as https://example.com/) require that requests and
responses be made over a secure connection to the website.
Some schemes that require authentication allow a username and perhaps a password too, to
be embedded in the URL, for example ftp://asmith@ftp.example.org. Passwords embedded in
this way are not conducive to secure working, but the full possible syntax is
scheme://username:password@domain:port/path?query_string#fragment_id
Domain Name
The domain name or IP address gives the destination location for the URL.
The IP address of a website can be looked up with a tool such as
http://www.selfseo.com/find_ip_address_of_a_website.php
Website            IP address
www.google.com     173.194.70.19
www.gmail.com      173.194.70.19
www.rediff.com     92.123.68.170
www.youtube.com    208.65.153.238
                   209.85.171.83
The domain google.com, or its IP address 209.85.153.104, is the address of Google's website.
(The IP addresses behind a domain name can change over time.)
The domain name portion of a URL is not case sensitive since DNS ignores case:
http://en.example.org/ and HTTP://EN.EXAMPLE.ORG/ both open the same page.
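A small sketch of this rule: Python's urlsplit reports the host name in lower case, so two URLs that differ only in the case of the domain compare as the same host. The helper name same_host is made up for this example.

```python
from urllib.parse import urlsplit


def same_host(url_a, url_b):
    """Compare the host names of two URLs, ignoring case."""
    return urlsplit(url_a).hostname == urlsplit(url_b).hostname


print(same_host("http://en.example.org/", "HTTP://EN.EXAMPLE.ORG/"))  # True
```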
Port Number
A port is associated with an IP address of the host, as well as the type of protocol used for
communication.
The port number is optional; if omitted, the default for the scheme is used.
For example, http://vnc.example.com:5800 connects to port 5800 of vnc.example.com, which
may be appropriate for a VNC remote control session.
If the port number is omitted for an http: URL, the browser will connect on port 80, the default
HTTP port.
The default port for an https: request is 443.
Protocol   Default port
HTTP       80
HTTPS      443
SMTP       25
FTP        21
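The "use the default when the port is omitted" rule can be sketched as follows. The table DEFAULT_PORTS and the function effective_port are names invented for this example; the port numbers are the standard well-known ports (21 is the FTP control port).

```python
from urllib.parse import urlsplit

# Well-known default ports for a few schemes.
DEFAULT_PORTS = {"http": 80, "https": 443, "smtp": 25, "ftp": 21}


def effective_port(url):
    """Return the port a client would connect to for this URL."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port  # an explicit port always wins
    return DEFAULT_PORTS.get(parts.scheme)  # None if the scheme is unknown


print(effective_port("http://vnc.example.com:5800"))  # 5800
print(effective_port("http://example.org/"))          # 80
print(effective_port("https://example.com/"))         # 443
```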
Path
The path is used to specify and perhaps find the resource requested. It is case-sensitive, though
it may be treated as case-insensitive by some servers, especially those based on Microsoft
Windows. If the server is case sensitive and http://en.example.org/wiki/URL is correct,
http://en.example.org/WIKI/URL/ or http://en.example.org/wiki/url/ will display an HTTP 404
error page, unless these URLs point to valid resources themselves.
Parts of URL
Consider the URL http://wiki.answers.com/.
The first part is the protocol. In this case, we are requesting to view a file using Hypertext Transfer Protocol (http). Another
popular protocol is ftp (File Transfer Protocol).
The protocol is followed by ://
The next part, wiki, is usually the name of the server that stores the file you will view through
the browser. http://www.answers.com/ is not the same as http://wiki.answers.com/ even
though you see answers.com in both URLs.
The next part, answers.com, is the domain. This is made up of two parts. The first is the host
name (in this case "answers") and the second is the top-level domain (in this case "com"). Other
top-level domains include .org and .mil.
Some URLs include directories and files. For example, in the URL
http://www.answers.com/main/business.jsp there is a directory called main and in that
directory is a file called business.jsp.
Notice that the domain, directories, and files are separated by slashes and the filename and the file extension are
separated by a period. There are endless types of files on the Web, including .pdf, .php, .html and others.
You'll notice that many times you do not see a filename in a URL. In cases like that, you are
actually looking at a default file. Many sites default to a file called index.html. If you go to
http://wiki.answers.com/index.html and http://wiki.answers.com/ you are actually looking at
the same page. You just don't have to type the entire URL to see the main/default page.
The optional port number defines the port with which to connect to the server or service
(this is specified by the server, and you can only connect to a special port if one exists on the
server). By default, websites communicate over port 80, so when no port is specified port 80 is
assumed. Other ports can be defined in the following format:
http://somesite.com:port/ (e.g. http://somesite.com:1010/).
The file path defines the path to the page or file to be viewed. When you load a website
without the file path, e.g. http://www.google.com/, you are directed to the root level of the
Public_Html or www folder and, if it exists, the file in that directory named index.htm,
index.html, index.asp or index.php. Defining a file path will take you to a different location
such as http://www.somesite.com/My_pet_photos.htm.
Finally, the optional query string defines any variables when the file path is a script such as
php, asp or cgi. Query strings can cause the script to react in different ways. For example:
http://www.somewiki.com/wiki/Page_Name?Action=edit.
Query String
The query string contains data to be passed to software running on the server. It may contain
name/value pairs separated by ampersands, for example ?first_name=John&last_name=Doe.
The fragment identifier, if present, specifies a part or a position within the overall resource or
document. When used with HTTP, it usually specifies a section or location within the page, and
the browser may scroll to display that part of the page.
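As a sketch of how server-side software might read these parts, the standard urllib.parse module can decode a query string into name/value pairs and build one from a dictionary. The URL below is a made-up example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical URL with a query string and a fragment identifier.
url = "http://www.example.com/search?first_name=John&last_name=Doe#results"
parts = urlsplit(url)

# parse_qs returns a list for every name, because a name may be repeated.
params = parse_qs(parts.query)
print(params)          # {'first_name': ['John'], 'last_name': ['Doe']}
print(parts.fragment)  # results

# Going the other way: build a query string from name/value pairs.
query = urlencode({"first_name": "John", "last_name": "Doe"})
print(query)           # first_name=John&last_name=Doe
```

Note that the fragment is handled by the browser; it is normally not sent to the server as part of the HTTP request.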
Domain
A domain is a group of computers that are part of a network and share a common directory
address.
A domain is registered as a unit with common rules and procedures. Each domain has a
unique name.
An Active Directory domain is a collection of computers defined by the administrator of a
Windows network.
These computers share a common directory address, security policies, and security
relationships with other domains. An Active Directory domain provides access to the centralized
user accounts and group accounts maintained by the domain administrator.
IP Address (Internet Protocol Address):

An IP address is a unique number that information technology devices (printers, routers,
modems, and so on) use to identify themselves and to communicate with each
other on a computer network. The standard that governs this communication is the Internet
Protocol (IP).
In simple terms, it is the postal address of your computer system. It lets the network route
a packet from its source to your system over the Internet. When somebody sends you
mail, the address gives the Internet routing protocols the unique information they need to route
packets to your desktop from anywhere across the Internet.
The IP address describes a location in the virtual world, and the addresses of both the
source and destination systems are stored in the header of every packet.
An IPv4 address consists of 4 octets (numbers from 0 to 255), each separated by a dot.
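To make the dotted form concrete, here is a small sketch using Python's standard ipaddress module. It treats the address 192.168.1.1 (used as an example in these notes) as four octets, which together form one 32-bit number.

```python
import ipaddress

addr = ipaddress.IPv4Address("192.168.1.1")

# The four octets, each a number from 0 to 255.
octets = list(addr.packed)
print(octets)     # [192, 168, 1, 1]

# The same address as a single 32-bit integer.
print(int(addr))  # 3232235777


def is_valid_ipv4(text):
    """Return True if text is a well-formed dotted IPv4 address."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False


print(is_valid_ipv4("256.1.1.1"))  # False, because 256 does not fit in one octet
```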
Domain Name System (DNS): This allows the IP address to be translated to words. It is much
easier for us to remember a word than a series of numbers. The same is true for email
addresses.
For example, it is much easier for you to remember a web address name such as
whatismyip.com than it is to remember 192.168.1.1 or in the case of email it is much easier to
remember email@somedomain.com than email@192.168.1.1
Dynamic IP Address: An IP address that is not static and could change at any time. This IP
address is issued to you from a pool of IP addresses allocated by your ISP. This suits the large
number of customers that do not require the same IP address all the time.
Static IP Address: An IP address that is fixed and never changes. This is in contrast to a dynamic
IP address, which may change at any time. Most ISPs offer a single static IP or a block of static IPs
for a few extra bucks a month.

Domain Name Extensions
gTLDs: Description
.com: The .com domain extension, which stands for commercial, was established in 1985 and is used by
commercial and non-commercial websites around the world. It is the most popular top-level
domain.
.org: The .org domain extension stands for organization. It was established in January 1985, with the
purpose of being given to organizations that do not fulfill the requirements of other generic top-level
domains. Organizations all over the world can register for the .org domain extension. It can also be
used by individuals, although individuals can also use domain extensions such as .name and .info.
There are no requirements for registration of the .org extension.
.edu: It stands for education and is widely used by educational institutions across the United States.
Not all websites using the .edu extension are educational institutions. Some of them are museums
or research organizations linked with education.
.gov: It is a sponsored domain extension that is used by government entities in the United States.
Federal agencies in the United States can also use the .fed.us domain. The Department of Defense
and its subordinate organizations use the .mil domain extension.
.net: One of the earliest top-level domains in use. Established in 1985, it is currently managed by VeriSign.
Similar to the .org domain, the .net domain also has no requirements for registration. It ranks third
in the list of most popular top-level domains.
.info and .name: The .info and .name (for personal use) extensions, and other domain extensions like .aero,
.biz and .pro, are some of the relatively new domains added to the list of generic top-level domains.
They were developed and began to be used in the period between 2000 and 2002. The .aero domain
stands for aeroplane and is used by businesses associated with aviation. .biz is used by businesses.
It was designed with the aim of providing businesses with an alternative to the .com domain. The .pro
generic top-level domain can be used by qualified professionals.
.jobs, .mobi and .travel: More recent domains developed by the Internet Corporation for Assigned Names
and Numbers (ICANN). Company websites intended for seeking employees and dealing with issues related to
company employment use the .jobs domain extension. The .mobi domain extension is used by
mobile devices gaining access to the Internet. Supported by Google, Microsoft, the GSM
Association and many prominent telecom companies, .mobi is one of the very important domain
extensions. The .travel domain is meant to be used by travel agents and tourism agencies.
.ae: used by the United Arab Emirates
.asia: used by organizations and individuals located in the Asia-Pacific region
.uk: used by the United Kingdom
.pk: used by Pakistan
.us: used by the state and local governments in the United States
Web gets new domain addresses: .guru, .bike

(NEWS Jan 2014 Times of India)
The humble .com is set to receive some competition from a new set of unusual web addresses such as .guru
and .singles! Internet users will now be able to register in targeted and specific domains ending
in .guru, .bike, .singles, .plumbing and .clothing among others as a US company is offering a wave of new web
addresses.
Donuts Inc will kick off the general availability period for seven new internet domain names, marking the
beginning of a new era for the Internet in which users will have unprecedented choice in how they identify
and brand themselves online.
The new generic top-level domains (gTLDs), the first of hundreds Donuts will launch this year,
are .bike, .clothing, .guru, .holdings, .plumbing, .singles, and .ventures, the company said. Anyone can register
names in these gTLDs on a first-come, first-served basis from accredited registrars worldwide. According to the
company, on February 5, .camera, .equipment, .estate, .gallery, .graphics, .lighting and .photography will be open
for registration by anyone interested in an online identity connected to these terms.
Internet Protocols
The Internet Protocol (IP) is the method or protocol by which data is sent from one
computer to another on the Internet. Each computer (known as a host) on the
Internet has at least one IP address that uniquely identifies it from all other
computers on the Internet. When you send or receive data (for example, an e-mail
note or a Web page), the message gets divided into little chunks called packets.
Each of these packets contains both the sender's Internet address and the
receiver's address. Any packet is sent first to a gateway computer that understands
a small part of the Internet. The gateway computer reads the destination address
and forwards the packet to an adjacent gateway that in turn reads the destination
address and so forth across the Internet until one gateway recognizes the packet
as belonging to a computer on its own network and delivers it to that computer.
Because a message is divided into a number of packets, each packet can, if necessary, be sent
by a different route across the Internet. Packets can arrive in a different order than the order
they were sent in. The Internet Protocol just delivers them. It's up to another protocol, the
Transmission Control Protocol (TCP) to put them back in the right order.
IP is a connectionless protocol, which means that there is no continuing connection
between the end points that are communicating. Each packet that travels through
the Internet is treated as an independent unit of data without any relation to any
other unit of data. (The reason the packets do get put in the right order is because
of TCP, the connection-oriented protocol that keeps track of the packet sequence
in a message).
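The idea of splitting a message into packets and putting them back in order can be sketched with a toy simulation. This is only an illustration of the principle: real TCP numbers bytes rather than whole packets and also handles acknowledgements and retransmission, none of which is modelled here. The function names are made up for this example.

```python
import random


def split_into_packets(message, size):
    """Cut a message into (sequence_number, chunk) pairs."""
    return [
        (seq, message[start:start + size])
        for seq, start in enumerate(range(0, len(message), size))
    ]


def reassemble(packets):
    """Sort packets by sequence number and join the chunks."""
    return b"".join(chunk for _, chunk in sorted(packets))


message = b"Packets may arrive in any order."
packets = split_into_packets(message, 8)

# Simulate packets taking different routes and arriving out of order.
random.shuffle(packets)

restored = reassemble(packets)
print(restored == message)  # True
```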
The most widely used version of IP today is Internet Protocol Version 4 (IPv4).
However, IP Version 6 (IPv6) is also beginning to be supported.
Services provided by IP
The Internet Protocol is responsible for addressing hosts and routing datagrams (packets)
from a source host to the destination host across one or more IP networks.
For this purpose the Internet Protocol defines an addressing system that has two functions.
1. Addresses identify hosts and
2. Provide a logical location service.
Each packet is tagged with a header that contains the meta-data for the purpose of delivery. This process of tagging is
also called encapsulation.
TCP/IP
TCP (Transmission Control Protocol) and IP (Internet Protocol) are two different protocols
that are often linked together.
In fact, the term "TCP/IP" is normally used to refer to a whole suite of protocols, each with
different functions. This suite of protocols is what carries out the basic operations of the Web.
TCP/IP is also used on many local area networks.
When information is sent over the Internet, it is generally broken up into smaller pieces or
"packets". The use of packets facilitates speedy transmission since different parts of a message
can be sent by different routes and then reassembled at the destination. It is also a safety
measure to minimize the chances of losing information in the transmission process.
TCP is the means for creating the packets, putting them back together in the correct order at
the end, and checking to make sure that no packets got lost in transmission. If necessary, TCP
will request that a packet be resent.
Internet Protocol (IP) is the method used to route information to the proper address.
Every computer on the Internet has to have its own unique address known as the IP address.
Every packet sent will contain an IP address showing where it is supposed to go. A packet may
go through a number of computer routers before arriving at its final destination and IP controls
the process of getting everything to the designated computer. Note that IP does not make
physical connections between computers but relies on TCP for this function. IP is also used
in conjunction with other protocols that create connections.
User Datagram Protocol (UDP), another member of the TCP/IP suite, is used together with IP
when small amounts of information are involved. It is simpler than TCP and thus uses fewer
system resources.
Hypertext Transfer Protocol
Web pages are constructed according to a standard method called Hypertext Markup
Language (HTML). An HTML page is transmitted over the Web in a standard way and format
known as Hypertext Transfer Protocol (HTTP). This protocol uses TCP/IP to manage the Web
transmission.
HTTP functions as a request-response protocol in the client-server computing model.
In HTTP, a web browser, for example, acts as a client, while an application running on a
computer hosting a web site functions as a server. The client submits an HTTP request message
to the server. The server, which stores content, or provides resources, such as HTML files, or
performs other functions on behalf of the client, returns a response message to the client. A
response contains completion status information about the request and may contain any
content requested by the client in its message body.
HTTP is an application layer network protocol built on top of TCP.
HTTP clients (such as Web browsers) and servers communicate via HTTP request and response
messages. The three main HTTP request methods are GET, POST, and HEAD.
HTTP utilizes TCP port 80 by default, though other ports such as 8080 can alternatively be used.
A related protocol is Hypertext Transfer Protocol over Secure Sockets Layer (HTTPS), first
introduced by Netscape. It transmits data in encrypted form to provide security
for sensitive data. A Web page using this protocol will have https: at the front of its URL.
HTTP Methods
1. GET
GET is one of the simplest HTTP methods. Its main job is to ask the server for a resource.
If the resource is available, it is returned to the user's browser.
That resource may be an HTML page, a sound file, a picture file (JPEG), etc. We can say that
the GET method is for getting something from the server. This does not mean that you cannot send
parameters to the server, but the total number of characters in a GET request is limited. With the
GET method, the data we send is appended to the URL, so whatever you send is visible
to others (for example in the address bar and browser history); GET is therefore not suitable
for sensitive data.
2. POST
POST is a more powerful request. By using POST we can request a resource as well as
send data to the server. We use the POST method when we have to send a large amount of data
to the server; for example, a long enquiry form can be sent
using the POST method.
3. HEAD
HEAD is the same as GET but returns only HTTP headers and no document body.
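The difference between the three methods can be sketched with Python's standard urllib. The requests below are only constructed, not sent, and the address http://www.example.com/enquiry is a made-up placeholder. Notice that the GET data ends up in the URL, while the POST data travels in the request body.

```python
from urllib.parse import urlencode
from urllib.request import Request

form = {"first_name": "John", "last_name": "Doe"}
encoded = urlencode(form)  # first_name=John&last_name=Doe

# GET: the data is appended to the URL as a query string.
get_req = Request("http://www.example.com/enquiry?" + encoded)

# POST: the data is carried in the body of the request.
post_req = Request("http://www.example.com/enquiry", data=encoded.encode("ascii"))

# HEAD: like GET, but only the response headers are wanted.
head_req = Request("http://www.example.com/enquiry", method="HEAD")

print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.data)
print(head_req.get_method())
```

To actually send one of these requests, it would be passed to urllib.request.urlopen, which needs a reachable server.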
File Transfer Protocol

File Transfer Protocol (FTP), a standard Internet protocol, is the
simplest way to exchange files between computers on the Internet.
Like the Hypertext Transfer Protocol (HTTP), which transfers
displayable Web pages and related files, and the Simple Mail
Transfer Protocol (SMTP), which transfers e-mail, FTP is an
application protocol that uses the Internet's TCP/IP protocols. FTP
is commonly used to transfer Web page files from their creator to
the computer that acts as their server for everyone on the Internet. It's also commonly used
to download programs and other files to your computer from other servers.
Protocol   Usage
HTTP       Displaying Web pages and related files
SMTP       Transferring e-mail
FTP        Downloading files/programs from other servers
To transfer files with FTP, you use a program often called the "client."
The FTP client program initiates a connection to a remote computer running
FTP "server" software. After the connection is established, the client can
choose to send and/or receive copies of files, singly or in groups. To connect
to an FTP server, a client requires a username and password as set by the
administrator of the server.
SMTP stands for "Simple Mail Transfer Protocol."
It's a set of communication guidelines that allow software to transmit email
over the Internet. Most email software is designed to use SMTP for
communication purposes when sending email, and it only works for outgoing
messages. When people set up their email programs, they will typically have
to give the address of their Internet service provider's SMTP server for
outgoing mail. There are two other protocols, POP3 (Post Office Protocol)
and IMAP (Internet Message Access Protocol), that are used for retrieving
and storing email.
Your e-mail client (such as Outlook Express, Eudora, or Mac OS X Mail) uses
SMTP to send a message to the mail server, and the mail server uses SMTP
to relay that message to the correct receiving mail server.
Basically, SMTP is a set of commands that authenticate and direct the
transfer of electronic mail. When configuring the settings for your e-mail
program, you usually need to set the SMTP server to your local Internet
Service Provider's SMTP settings (e.g. "smtp.yourisp.com").
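What an e-mail program does when sending can be sketched with Python's standard email and smtplib modules. The addresses are placeholders and the helper names build_message and send are invented for this example. The send function is defined but deliberately not called, because it needs a real outgoing SMTP server such as the one your ISP provides.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Create an e-mail message with the usual headers."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send(msg, smtp_host, port=25):
    """Hand the message to an outgoing SMTP server (not called here)."""
    with smtplib.SMTP(smtp_host, port) as server:
        server.send_message(msg)


msg = build_message(
    "alice@example.com", "bob@example.org", "Hello", "Sent over SMTP."
)
print(msg["From"], "->", msg["To"])
print(msg["Subject"])
```

With a reachable server, sending would be a call such as send(msg, "smtp.yourisp.com"). Many providers also require encryption and a log-in, which this sketch leaves out.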
Internet, Intranet & Extranet

The Internet is an open, public space, while an intranet is designed to be a
private space. An intranet may be accessible from the Internet, but as a rule
it's protected by a password and accessible only to employees or other
authorized users.
From within a company, an intranet server may respond much more quickly
than a typical Web site. This is because the public Internet is at the mercy
of traffic spikes, server breakdowns and other problems that may slow the
network. Within a company, however, users have much more bandwidth
and network hardware may be more reliable. This makes it easier to serve
high bandwidth content, such as audio and video, over an
intranet.
The Extranet is a portion of an organization's Intranet that is made
accessible to authorized outside users without full access to an entire
organization's intranet.
Today I think of intranets, extranets, and the Web as collections of content. An intranet is a set
of content shared by a well-defined group within a single organization. An extranet is a set of
content shared by a well-defined group, but one that crosses enterprise boundaries.
Web Portal Vs Website

A web portal is a vehicle by which a user gains access to a broad array of resources, while a website
is a destination in itself.
A website represents an organization to the outside world, but a portal provides multiple user roles with a
common access point.
A website is also a portal if it broadcasts information from different independent resources, thus offering a
public service function to visitors.
Web Portal
A web portal is a website or service that offers a broad array of resources and services such as email,
forums, search engines and online shopping malls. It is an organized gateway that helps to configure
access to information found on the Internet. Web portal applications offer a consistent look and feel with
access control and procedures for multiple applications and databases. Examples of web portals include AOL,
iGoogle and Yahoo.
Typical Portal Attributes
1. Is a public and private interface (extranet, intranet, etc.)
2. Offers access for multiple user roles
3. Offers personalization and role-specific functionality and content
4. Offers versatile, enhanced functionality and flexibility
5. Gives the user access to broad resources
6. Supports the user in multiple tasks
7. Offers content from diverse resources
8. Spans content, collaboration and e-commerce
9. Extensive, unfocused content can be created to accommodate unidentified users' needs
10. Is customizable; the content is tailored to every user
Websites
A website is a location on the Internet: a collection of web pages, images and videos which are
addressed relative to a common Uniform Resource Locator (URL). It is essentially a domain name
hosted on a server which is accessible via the Internet or a private local area network.
Owning a website has become essential for any business; a company with no web presence
runs the risk of losing business opportunities.
Typical Website Attributes
1. Is a public interface
2. Supports the user in a specific task (marketing or e-commerce)
3. Provides targeted content from independent resources to a specific audience
4. Content is generally focused, which eliminates the need to visit different sites
5. Selects and organizes the materials to be accessed
6. Establishes your presence in the online global market
7. Reaches the targeted audience

The portal and the website can be differentiated as follows:
Authentication:
Portal: Provides a log-in facility and gives you information based on who you are.
e.g. mail.yahoo.com, gmail.com, rediffmail.com
Website: No log-in.
e.g. www.yahoo.com
Personalization:
Portal: Limited, focused content. Eliminates the need to visit many different sites.
e.g. You type in your user name and password and see your Yahoo mail only.
Website: Extensive, unfocused content written to accommodate anonymous users' needs.
Customization:
Portal: You select and organize the materials you want to access.
Website: Searchable, but not customizable. All content is there for every visitor.
e.g. On Yahoo you can navigate to Yahoo Mail, Yahoo Shopping, GeoCities and Yahoo Groups. To use any of these services
you either authenticate yourself and see things personalized to you, or you simply visit sections that are
for everyone, such as Yahoo News, where if you are not signed in you are treated as a guest.
Web Server
A web server can refer to either the hardware (the computer) or the software (the computer
application) that helps to deliver content that can be accessed through the Internet. A web server
makes it possible to access content such as web pages or other data from anywhere, as long as it is connected
to the Internet. The hardware houses the content, while the software makes the content accessible through the
Internet.
The most common use of web servers is to host websites, but there are other uses such as data storage or
running enterprise applications.
There are also different ways to request content from a server. The most common is the Hypertext
Transfer Protocol (HTTP), but there are also other protocols such as the Internet Message Access Protocol (IMAP)
or the File Transfer Protocol (FTP).
Function of a web Server

The primary function of a web server is to deliver web pages to clients on request. This means delivery
of HTML documents and any additional content that may be included by a document, such as images, style
sheets and scripts.
A client, commonly a web browser or web crawler, initiates communication by making a request for a specific
resource using HTTP and the server responds with the content of that resource or an error message if unable
to do so. The resource is typically a real file on the server's secondary memory, but this is not necessarily the
case and depends on how the web server is implemented.
While the primary function is to serve content, a full implementation of HTTP also includes ways of receiving
content from clients. This feature is used for submitting web forms, including uploading of files.
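The request and response cycle described above can be sketched in a few lines of Python. This is only an illustration, not how any real web server is implemented: the names RESOURCES, UPLOADS and handle_request are hypothetical, the "files" live in a dictionary instead of on disk, and the network layer is left out entirely.

```python
from http import HTTPStatus

# Hypothetical in-memory "document root"; a real server would usually
# read these resources from files on secondary storage.
RESOURCES = {
    "/index.html": b"<html><body>Hello</body></html>",
    "/style.css": b"body { margin: 0 }",
}

# Content received from clients (e.g. submitted web forms).
UPLOADS = []


def handle_request(method, path, body=b""):
    """Return (status_code, response_body) for a simplified HTTP request."""
    if method == "GET":
        if path in RESOURCES:
            # Serve the content of the requested resource.
            return HTTPStatus.OK.value, RESOURCES[path]
        # The server cannot deliver the resource: answer with an error.
        return HTTPStatus.NOT_FOUND.value, b"Not Found"
    if method == "POST":
        # Receiving content from the client.
        UPLOADS.append(body)
        return HTTPStatus.CREATED.value, b"Created"
    return HTTPStatus.METHOD_NOT_ALLOWED.value, b"Method Not Allowed"
```

Calling handle_request("GET", "/index.html") returns status 200 with the page content, while a request for a path that is not in RESOURCES returns 404.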
Web servers are not always used for serving the World Wide Web. They can also be found embedded in
devices such as printers, routers, webcams and serving only a local network. The web server may then be
used as a part of a system for monitoring and/or administrating the device in question. This usually means that
no additional software has to be installed on the client computer; since only a web browser is required (which
now is included with most operating systems).
Common features
1. Virtual hosting to serve many Web sites using one IP address.
2. Large file support to be able to serve files whose size is greater than 2 GB on a 32-bit OS.
3. Bandwidth throttling to limit the speed of responses in order to not saturate the network and to be
able to serve more clients.
Bandwidth throttling is a reactive measure employed in communication networks to regulate network
traffic and minimize bandwidth congestion.
4. Server-side scripting to generate dynamic Web pages, still keeping Web server and Web site
implementations separate from each other. This is different from client-side scripting where scripts
are run by the viewing web browser, usually in JavaScript.
The primary advantage to server-side scripting is the ability to highly customize the response based
on the user's requirements, access rights, or queries into data stores.
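One common way to implement bandwidth throttling is a token bucket. The notes do not say which technique web servers use, so the sketch below is an assumption chosen for illustration; the class name TokenBucket and its parameters are hypothetical. The current time is passed in explicitly so the behaviour is easy to follow.

```python
class TokenBucket:
    """Allow at most `rate` bytes per second, with bursts up to `capacity`."""

    def __init__(self, rate_bytes_per_sec, capacity_bytes):
        self.rate = rate_bytes_per_sec
        self.capacity = capacity_bytes
        self.tokens = capacity_bytes  # start with a full bucket
        self.last = 0.0               # time of the previous call, in seconds

    def allow(self, nbytes, now):
        """Return True if `nbytes` may be sent at time `now` (seconds)."""
        # Refill tokens for the time that has passed, up to the capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False  # the caller should delay or drop this response chunk
```

A server using this sketch would call allow() before sending each chunk of a response and wait briefly whenever it returns False.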
Load limits
A Web server (program) has defined load limits, because it can handle only a limited number of concurrent
client connections (usually between 2 and 80,000, by default between 500 and 1,000) per IP address (and
TCP port) and it can serve only a certain maximum number of requests per second depending on:
1. its own settings;
2. the HTTP request type;
3. content origin (static or dynamic);
4. the fact that the served content is or is not cached;
5. the hardware and software limitations of the OS where it is working.
When a Web server is near to or over its limits, it becomes unresponsive.
Overload causes
At any time web servers can be overloaded because of:
1. Too much legitimate web traffic. Thousands or even millions of clients connecting to the web site in a short
interval
2. Computer worms that sometimes cause abnormal traffic because of millions of infected computers (not
coordinated among them);
A computer worm is a self-replicating malware computer program, which uses a computer network to send
copies of itself to other nodes (computers on the network) and it may do so without any user intervention. This
is due to security shortcomings on the target computer. Unlike a computer virus, it does not need to attach itself
to an existing program. Worms almost always cause at least some harm to the network, even if only by
consuming bandwidth, whereas viruses almost always corrupt or modify files on a targeted computer.
3. XSS viruses can cause high traffic because of millions of infected browsers and/or Web servers;
4. Internet (network) slowdowns, so that client requests are served more slowly and the number of connections
increases so much that server limits are reached;
5. Web servers (computers) partial unavailability. This can happen because of required or urgent maintenance or
upgrade, hardware or software failures, back-end (e.g., database) failures, etc.; in these cases the remaining
web servers get too much traffic and become overloaded.
6. Internet bots, when their traffic is not filtered or limited on large web sites with very few resources
(bandwidth, etc.). Internet bots, also known as web robots, WWW robots or simply bots, are software
applications that run automated tasks over the Internet. Typically, bots perform tasks that are both simple
and structurally repetitive, at a much higher rate than would be possible for a human alone.
Symptoms of overload
The symptoms of an overloaded web server are:
1. Requests are served with long delays (from 1 second to a few hundred seconds).
2. The web server returns an HTTP error code, such as 500, 502, 503, 504, or 408, or even 404, which is
inappropriate for an overload condition.
3. The web server refuses or resets (interrupts) TCP connections before it returns any content.
4. In very rare cases, the web server returns only a part of the requested content. This behavior can be
considered a bug, even if it usually arises as a symptom of overload.
Anti-overload techniques
1. Managing network traffic, by using:
a. Firewalls to block unwanted traffic coming from bad IP sources or having bad patterns;
b. Bandwidth management and traffic shaping, in order to smooth down peaks in network usage;
2. Deploying Web cache techniques;
3. Using many web servers (computers) that are grouped together so that they act or are seen as one big web
server.
Market structure
Below are the most recent statistics on the market share of the top web servers on the Internet, from the
Netcraft survey of January 2012.
Internet History
The Internet was originally developed by DARPA, the Defense Advanced Research
Projects Agency, as a means to share information on defense research between
involved universities and defense research facilities. Originally it was just email and
FTP sites as well as the Usenet where scientists could question and answer each other.
It was originally called ARPANET (Advanced Research Projects Agency Network),
and its development started in 1964. Since networking computers was new to begin with,
standards were being developed on the fly. Once the concept was proven, the
organizations involved started to lay out some ground rules for standardization.
Web Browsers
A web browser or Internet browser is a software application for retrieving, presenting,
and traversing information resources on the World Wide Web. An information resource
is identified by a Uniform Resource Identifier (URI) and may be a web page, image,
video, or other piece of content. Hyperlinks present in resources enable users to easily
navigate their browsers to related resources.
Although browsers are primarily intended to access the World Wide Web, they can
also be used to access information provided by Web servers in private networks or
files in file systems. Some browsers can also be used to save information resources
to file systems.
History
The first web browser was written in 1990 by Tim
Berners-Lee. It was called WorldWideWeb (no spaces)
and was later renamed Nexus.
The history of the Web browser dates back to the
late 1980s, when a variety of technologies laid the
foundation for WorldWideWeb, which was released in 1991. That browser brought together a variety of
existing and new software and hardware technologies.
Microsoft responded with its browser Internet Explorer in 1995 (also heavily influenced by
Mosaic), initiating the industry's first browser war. Microsoft was able to leverage its
dominance in the operating system market to take over the Web browser market; Internet
Explorer usage share peaked at over 95% by 2002.
Opera first appeared in 1996; it has never achieved widespread use, with a browser
usage share that fluctuated between 2.2% and 2.4% throughout 2010.
In 1998, Netscape launched what was to become the Mozilla Foundation in an attempt to
produce a competitive browser using the open source software model. That browser would
eventually evolve into Firefox, which developed a respectable following while still in the beta
stage of development; shortly after the release of Firefox 1.0 in late 2004, Firefox (all versions)
accounted for 7.4% of browser use. The Firefox usage share slowly declined during 2010, from
24.4% in January to 22.8% in December.
Apple's Safari had its first beta release in January 2003; it has a dominant share of Apple-based
Web browsing, having risen from 4.5% usage share in January 2010 to 5.9% in December 2010.
Its rendering engine, called WebKit, is also running in the standard browsers of several mobile
phone platforms, including Apple iOS, Google Android, Nokia S60 and Palm webOS.
The most recent major entrant to the browser market is Google's Chrome, first released in September 2008. Chrome's
take-up has increased significantly year on year, by doubling its usage share from 7.7 percent to 15.5 percent by August
2011.
This increase seems largely to be at the expense of Internet Explorer, whose share has tended to decrease from month
to month.
In December 2011 Google Chrome overtook Internet Explorer 8 as the most widely used web browser.
However, when all versions of Internet Explorer are put together, IE is still most popular.
Function
The primary purpose of a web browser is to bring information resources to the user. This process begins when the user
inputs a Uniform Resource Identifier (URI), for example http://en.wikipedia.org/, into the browser. The prefix of the
URI determines how the URI will be interpreted.
The most commonly used kind of URI starts with http:
and identifies a resource to be retrieved over the Hypertext Transfer Protocol (HTTP).
Many browsers also support a variety of other prefixes, such as https: for HTTPS, ftp: for the File Transfer Protocol, and
file: for local files. Prefixes that the web browser cannot directly handle are often handed off to another application
entirely. For example, mailto: URIs are usually passed to the user's default e-mail application and news: URIs are passed
to the user's default newsgroup reader.
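The way a browser uses the URI prefix (the scheme) to decide what to do can be sketched with Python's standard urllib.parse module. The sets of schemes and the dispatch function below are illustrative assumptions based on the examples in the text, not the behaviour of any particular browser.

```python
from urllib.parse import urlsplit

# Schemes this hypothetical browser handles itself.
BROWSER_SCHEMES = {"http", "https", "ftp", "file"}

# Schemes handed off to another application.
EXTERNAL_HANDLERS = {
    "mailto": "default e-mail application",
    "news": "default newsgroup reader",
}


def dispatch(uri):
    """Return a description of which program should handle the URI."""
    scheme = urlsplit(uri).scheme.lower()
    if scheme in BROWSER_SCHEMES:
        return "browser"
    # Unknown prefixes fall through to "unknown" in this sketch.
    return EXTERNAL_HANDLERS.get(scheme, "unknown")
```

For example, dispatch("http://en.wikipedia.org/") is handled by the browser itself, while dispatch("mailto:someone@example.com") is passed to the e-mail application.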
Features of a Web Browser
1. All major web browsers allow the user to open multiple information resources at
the same time, either in different browser windows or in different tabs of the same
window.
2. Major browsers also include pop-up blockers to prevent unwanted windows from
"popping up" without the user's consent.
3. Most web browsers can display a list of web pages that the user has bookmarked
so that the user can quickly return to them. Bookmarks are also called "Favorites"
in Internet Explorer.
4. In addition, all major web browsers have some form of built-in web feed
aggregator.
5. In Mozilla Firefox, web feeds are formatted as "live bookmarks" and behave like
a folder of bookmarks corresponding to recent entries in the feed.
6. In Opera, a more traditional feed reader is included which stores and displays
the contents of the feed.
7. Furthermore, most browsers can be extended via plug-ins, downloadable
components that provide additional features.
User interface
1. Back and forward buttons to go back to the previous resource and forward
again.
2. A history list, showing resources previously visited in a list (typically, the list is
not visible all the time and has to be summoned)
3. A refresh or reload button to reload the current resource.
4. A stop button to cancel loading the resource. In some browsers, the stop button
is merged with the reload button.
5. A home button to return to the user's home page.
6. An address bar to input the Uniform Resource Identifier (URI) of the desired
resource and display it.
7. A search bar to input terms into a search engine.
8. A status bar to display progress in loading the resource and also the URI of
links when the cursor hovers over them, and page zooming capability.
Privacy and security

Most browsers support HTTP Secure and offer quick and easy ways to delete the
web cache, cookies, and browsing history. For a comparison of the current security
vulnerabilities of browsers, see comparison of web browsers.
Standards support
Early web browsers supported only a very simple version of HTML. The rapid
development of web browsers led to the development of non-standard dialects of
HTML, leading to problems with interoperability. Modern web browsers support a
combination of standards-based and de facto HTML and XHTML, which should be
rendered in the same way by all browsers.
Comparison of Web Browser
Source: Median values from summary table.
Internet Explorer (38.9%)
Firefox (25.0%)
Google Chrome (20.9%)
Safari (8.0%)
Opera (2.7%)
Mobile browsers (6.7%)
While Microsoft Internet Explorer comes preinstalled on all PCs running the Windows operating system, many
consumers look to third-party browsers when the features better suit their needs. In the case of the Netscape Navigator
browser, several key differences set it apart from Internet Explorer.
1. Interface
Netscape Navigator's interface is more bare bones, with its simple gray windows and minimal clutter. Internet
Explorer has a multi-faceted interface, ideal for some advanced users but unnecessarily complicated for
others.
2. Speed
Though Netscape takes a little bit longer to initialize than Internet Explorer, the Netscape browser is able to
offer quicker real-time browsing due to its smaller file sizes.
3. Support
Perhaps the most glaring shortcoming of Netscape Navigator is its complete lack of support and upgrades.
While Internet Explorer is a current, regularly upgraded product, the Associated Press reports that Netscape
Navigator was officially cancelled by AOL on Feb 1, 2009.
4. Security
According to Microsoft, Internet Explorer offers a "cross-site scripting filter" and a "SmartScreen Filter" to help
avoid security risks. Netscape employs security certificates, but it does not have active security updates.
5. Compatibility
Both browsers allow for basic DirectX, Java, and Flash compatibility. However, in terms of toolbars and other
browser add-ons, Internet Explorer is more widely compatible.
History of Internet Browsers

1st Internet browser Worldwideweb
1st Graphical web browser - NCSA Mosaic
The first internet browser, WorldWideWeb, was created in 1990 and released in
1991. It wasn't until the creation of NCSA Mosaic, the first graphical web browser, that the
internet began to see widespread use.
The leader of the Mosaic team then left to create Netscape Navigator in 1994, which went on
to become the most widely used internet browser in the world, accounting for 90 percent of all web
use.
In 1995, Microsoft released its answer to Netscape Navigator, Internet
Explorer. This was the beginning of the Internet's browser war. Internet Explorer quickly took over
from Netscape and had a 95 percent market share by 2002.
In the past five years dozens of other internet browsers have come onto the market, each offering
bigger and better features than the last. Mozilla Firefox was released in 2004 and has now taken
over from Internet Explorer as the world's most used internet browser. Google Chrome was
released in September 2008 and is quickly becoming another popular choice.
We shall compare the five main internet browsers that make up the majority of the world's market
share in internet browsers. These are: Internet Explorer, Mozilla Firefox, Safari, Opera and Google
Chrome.
All of the top five internet browsers have several things in common: they are fast, lightweight,
reliable, and provide internet security. So, if that is all you are after from an internet browser, then
any of them will suit you fine. However, if you want additional features and add-ons, then there are
slight variations between the browsers that we shall look at below.
Internet Explorer
Internet Explorer has been the leading internet browser for many years, only recently being overtaken by Mozilla
Firefox. However, many people still use Internet Explorer and it has many fantastic features.
There are two main benefits to using Internet Explorer: compatibility and security. Internet Explorer is compatible
with all websites, whereas other browsers may have difficulty opening some websites. It is also incredibly safe and
helps protect against phishing and malware attacks. It is fast, safe and easy to use; however, when compared to its
biggest rival, Firefox, it doesn't have as many features or the same ability to customize.
The latest Internet Explorer offers crash recovery and a fast start-up, and its address bar provides auto-complete.
Mozilla Firefox (Mozilla)

The leader of the pack of internet browsers is Mozilla Firefox. It has all the features of the other leading browsers with
many more. It is very customizable and there are hundreds of add-ons to help personalize your internet browsing
experience.
It is one of the fastest browsers on the market and is constantly being updated. It is very safe with many security
features built in to it.
Firefox also offers private browsing, a built-in spell checker, and support for open video and audio formats.
The only negative point to be made about Firefox is it can be quite slow to start up in comparison to other internet
browsers.
Safari (Apple)
Safari is the standard internet browser for Mac OS X users. It is a fast and reliable browser that has a sleek and
easy-to-use interface, like most of Apple's products.
The best thing about Safari is its speed, which Apple claims is the fastest of all web browsers. However, it lacks
many features that other browsers have, which is what allows it to be so fast and light. If you don't require flashy extras
then Safari is a great choice for you, but if you want something with more features then you may want to try a different
browser.
Opera (Opera Software ASA)

Opera is a great internet browser and is becoming the first choice for an increasing number of internet users around
the world. There are many exciting features built into the browser that make browsing much more fun and functional
than other browsers.
Some of the features that Opera offers include: interactive voice, fast browsing, thumbnail previews, mouse gestures
and the ability to customize skins. If you want a reliable internet browser that is a little out of the ordinary then Opera
is the perfect choice.
Google Chrome
Google Chrome is the latest competitor in the internet browser game. It was only released in September 2008 but
already has a huge fan base.
Chrome offers a sleek and simple interface that incorporates speed, simplicity and compatibility all rolled into one
package. Some of the extra features included are: the ability to drag, drop and rearrange tabs as well as an excellent
task manager feature.
The main downside of Chrome is that it has very few add-ons; however, it is expected that these will arrive over the
coming months as Chrome continues to become more popular.
Differences between Internet Explorer (IE) and Netscape Navigator (NN)
In their newer versions, these two major browsers are converging in the DHTML effects they
support.
However you will need to remember that IE is more flexible than Netscape and due to small differences something
that works really well in IE might not work at all in Netscape. So you need to be really careful and alert when
programming for both browsers.
Another major limitation of Netscape as compared to IE is that not all properties of a page can be changed at any time.
This is because when the web page is once written to the screen, only position, visibility and clipping can be
manipulated dynamically.
The good news is that from the web designing point of view you can now forget completely about debugging all your
websites for Netscape 4.x as a very small fraction of the Netscape community still use it. Think of it this way, if you are
bent on making the website work perfectly for version 4.x then you cannot use some effects (especially javascript and
CSS) that are easily supported by the latest versions of all the major browsers.
Web search engine
A web search engine is a program that searches documents for specified keywords and
returns a list of the documents where the keywords were found. Although search engine is
really a general class of programs, the term is often used to specifically describe systems
like Google, AltaVista and Excite that enable users to search for documents on the
World Wide Web and USENET newsgroups.
Typically, a search engine works by sending out a spider to fetch as many documents
as possible. Another program, called an indexer, then reads these documents and
creates an index based on the words contained in each document. Each search engine
uses a proprietary algorithm to create its indices such that, ideally, only meaningful
results are returned for each query.
A web search engine is designed to search for information on the World Wide Web
and FTP servers. The search results are generally presented in a list of results and are
often called hits. The information may consist of web pages, images, information and
other types of files. Some search engines also mine data available
in databases or open directories. Unlike web directories, which are maintained by
human editors, search engines operate algorithmically or are a mixture of algorithmic
and human input.
On the Internet, a search engine is a coordinated set of programs that includes:
1. A spider (also called a "crawler" or a "bot") that goes to every page or representative pages on every
Web site that wants to be searchable and reads it, using hypertext links on each page to discover and
read a site's other pages.
2. A program that creates a huge index (sometimes called a "catalog") from the pages that have been
read.
3. A program that receives your search request, compares it to the entries in the index, and returns results
to you.
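The second and third components listed above (the indexer and the query program) can be illustrated with a tiny inverted index. This is a minimal sketch under stated assumptions: PAGES is a hypothetical stand-in for pages a spider has already fetched, and ranking results by how often the query words occur is a simplification chosen here, not the proprietary algorithm of any real search engine.

```python
from collections import Counter, defaultdict

# Hypothetical pages, standing in for what a spider has already read.
PAGES = {
    "http://example.com/a": "web servers deliver web pages",
    "http://example.com/b": "browsers display pages",
    "http://example.com/c": "search engines index the web",
}


def build_index(pages):
    """Map each word to a Counter of {url: number of occurrences}."""
    index = defaultdict(Counter)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] += 1
    return index


def search(index, query):
    """Return URLs containing every query word, most occurrences first."""
    words = query.lower().split()
    if not words:
        return []
    # Start with the pages for the first word, then keep only those
    # that also contain each of the remaining words.
    candidates = set(index.get(words[0], {}))
    for word in words[1:]:
        candidates &= set(index.get(word, {}))

    def score(url):
        return sum(index[word][url] for word in words)

    return sorted(candidates, key=lambda url: (-score(url), url))


INDEX = build_index(PAGES)
```

Note that search() never looks at the pages themselves, only at the index, which is why a real search engine can return results for pages that have since changed or disappeared.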
An alternative to using a search engine is to explore a structured directory of topics. Yahoo, which also lets
you use its search engine, is the most widely-used directory on the Web. A number of Web portal sites offer
both the search engine and directory approaches to finding information.
Year  Engine       Description
1990  Archie       The very first tool used for searching on the Internet. The program
                   downloaded the directory listings of all the files located on public
                   anonymous FTP (File Transfer Protocol) sites, creating a searchable
                   database of file names; however, Archie did not index the contents of
                   these sites, since the amount of data was so limited it could be readily
                   searched manually.
1993  W3Catalog    The web's first primitive search engine, released on September 2, 1993.
1993  Wandex       Index built by the first web robot, the Perl-based World Wide Web
                   Wanderer.
1993  JumpStation  Used a web robot to find web pages and to build its index, and used a
                   web form as the interface to its query program. It was thus the first
                   WWW resource-discovery tool to combine the three essential features of
                   a web search engine (crawling, indexing, and searching) as described
                   below.
1994  WebCrawler   One of the first "full text" crawler-based search engines. Unlike its
                   predecessors, it let users search for any word in any webpage, which
                   has become the standard for all major search engines since. It was also
                   the first one to be widely known by the public.
1994  Lycos        Launched and became a major commercial endeavor.
1994  Yahoo        Was among the most popular ways for people to find web pages of
                   interest, but its search function operated on its web directory, rather
                   than full-text copies of web pages. Information seekers could also
                   browse the directory instead of doing a keyword-based search.
2000  Google       The company achieved better results for many searches with an
                   innovation called PageRank. Google also maintained a minimalist
                   interface to its search engine. In contrast, many of its competitors
                   embedded a search engine in a web portal.
2003  AltaVista    Yahoo with Google
2004  Msnbot       Microsoft began a transition to its own search technology.
2009  Bing         Microsoft's rebranded search engine.
A search engine operates, in the following order
1. Web crawling
2. Indexing
3. Searching
There are basically three types of search engines:
1. Those that are powered by robots (called crawlers, ants or spiders);
2. Those that are powered by human submissions;
3. Those that are a hybrid of the two.
Web Crawler
Web search engines work by storing information about many web pages, which they
retrieve from the HTML itself. These pages are retrieved by a Web crawler (sometimes
also known as a spider), an automated Web browser which follows every link on the
site. Exclusions can be made by the use of robots.txt.
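How a well-behaved crawler honours robots.txt exclusions can be sketched with Python's standard urllib.robotparser module. The robots.txt content, the example.com URLs and the agent name "ExampleBot" are made up for illustration; a real crawler would download the file from the site it is about to visit.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an example site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(url, agent="ExampleBot"):
    """Return True if the rules above allow `agent` to fetch `url`."""
    return parser.can_fetch(agent, url)
```

With these rules, may_crawl("http://example.com/public/page.html") is allowed, while any URL under /private/ is excluded.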
Different Search Engine Approaches

1. Major search engines such as Google, Yahoo (which uses Google), AltaVista, and Lycos index the
content of a large portion of the Web and provide results that can run for pages - and consequently
overwhelm the user.
2. Specialized content search engines are selective about what part of the Web is crawled and indexed.
For example, TechTarget sites for products such as the AS/400 (http://www.search400.com) and CRM
applications (http://www.searchCRM.com) selectively index only the best sites about these products and
provide a shorter but more focused list of results.
3. Ask Jeeves (http://www.ask.com) provides a general search of the Web but allows you to enter a search
request in natural language, such as "What's the weather in Seattle today?"
4. Special tools and some major Web sites such as Yahoo let you use a number of search engines at the
same time and compile results for you in a single list.
5. Individual Web sites, especially larger corporate sites, may use a search engine to index and retrieve
the content of just their own site. Some of the major search engine companies license or sell their
search engines for use on individual sites.
Powered by Robots
A Web crawler is a computer program that browses the World Wide Web in a methodical, automated manner
or in a particular order. Other terms for Web crawlers are ants, automatic indexers, bots, Web spiders, Web
robots.
Crawler-based search engines are those that use automated software agents (called crawlers) that visit a Web
site, read the information on the actual site, read the site's meta tags and also follow the links that the site
connects to, performing indexing on all linked Web sites as well.
The crawler returns all that information back to a central depository, where the data is indexed. The crawler will
periodically return to the sites to check for any information that has changed. The frequency with which this
happens is determined by the administrators of the search engine.
This process is called Web crawling or spidering. Many sites, in particular search
engines, use spidering as a means of providing up-to-date data. Web crawlers are
mainly used to create a copy of all the visited pages for later processing by a search
engine that will index the downloaded pages to provide fast searches. Crawlers can
also be used for automating maintenance tasks on a Web site, such as checking links
or validating HTML code. Also, crawlers can be used to gather specific types of
information from Web pages, such as harvesting e-mail addresses (usually for spam).
A Web crawler is one type of bot, or software agent. In general, it starts with a list
of URLs to visit, called the seeds. As the crawler visits these URLs, it identifies all
the hyperlinks in the page and adds them to the list of URLs to visit, called the crawl
frontier. URLs from the frontier are recursively visited according to a set of policies.
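The seed list and crawl frontier described above can be sketched as a simple breadth-first loop. To keep the example self-contained, the "Web" here is a hypothetical dictionary (FAKE_WEB) mapping URLs to HTML; a real crawler would fetch pages over HTTP and apply politeness and revisit policies that this sketch omits.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical miniature Web: URL -> HTML content.
FAKE_WEB = {
    "http://a.example/": '<a href="http://b.example/">B</a> '
                         '<a href="http://c.example/">C</a>',
    "http://b.example/": '<a href="http://a.example/">A</a>',
    "http://c.example/": "no links here",
}


def crawl(seeds, fetch=FAKE_WEB.get, max_pages=100):
    """Visit pages starting from the seeds; return URLs in visit order."""
    frontier = deque(seeds)   # URLs waiting to be visited
    seen = set(seeds)         # URLs already queued, to avoid revisiting
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:      # page could not be retrieved
            continue
        visited.append(url)
        extractor = LinkExtractor()
        extractor.feed(html)
        for link in extractor.links:
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited
```

Starting from the single seed "http://a.example/", the crawler discovers and visits the other two pages through the hyperlinks it finds.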
Examples of Web Crawler
Yahoo Slurp, Msnbot, Fast Crawler, Googlebot, World Wide Web Worm, WebFountain, WebCrawler
Human Powered Search Engine

Human-powered search engines, better known as Web directories, are popular simply because
of the higher quality of links submitted and the caliber of the sites hand-picked to be included
in the index. Here are some of the most popular human-powered search engines on the Web.
In both cases, when you query a search engine to locate information, you're actually searching through the
index that the search engine has created; you are not actually searching the Web itself. These indices are
giant databases of information that is collected and stored and subsequently searched. This explains why
sometimes a search on a commercial search engine, such as Yahoo! or Google, will return results that are, in
fact, dead links. Since the search results are based on the index, if the index hasn't been updated since a
Web page became invalid the search engine treats the page as still an active link even though it no longer is.
It will remain that way until the index is updated.
Basic Search Technique
This tutorial is a how-to guide for creating AND, OR, NOT, phrase, and field searches on Web search engines.
We'll be using Google as an example. Keep in mind that the illustrated searches will work on most general search engines on
the Web.
Putting together a search is a three-step process.
1. Identify your concepts

When planning your search, break down your topic into its separate concepts. Let's say you're interested in the effects of
global warming on crops. In this case, you have two concepts: GLOBAL WARMING and CROPS.
2. Make a list of search terms for each concept

Once you have identified your concepts, list the terms which describe each concept. Some concepts may have only one term,
while others may have many.
global warming
greenhouse effect
greenhouse gases
climate change
crops
crop yields
crop production
food supply
These lists are a suggestion. Depending on the focus of your search, there may be other terms more suited to what you're
looking for.
3. Specify the logical relationships among your search terms

Once you know the words you want to search, you need to establish the logical relationships among them using Boolean
logic: AND, OR, NOT.
To keep things simple, you don't need to use all the words you've compiled in a single search. The words are there to help you
experiment with different searches until you find the results you want.
TIP! There are also optional things you can do to focus a search. One useful option is known as field searching, and is
covered later on in this tutorial.
Boolean AND search

Let's start with a very simple two-word search. In this type of search, we want Web pages that contain both of
our search terms. This is Boolean AND logic. This is probably the most common type of search that people
want to do.
With most general search engines on the Web, including Google, all you need to do is type your search terms
in the input box and the terms will be searched using Boolean AND logic. In other words, Boolean AND is the
default logic.
In our example, we're asking for documents that contain the words rain and snow. To do this, we simply
type the two words into the search box with a space between them.
Notice how both words appear in the results. This is exactly what we wanted.
A variant of an AND search is the plus sign (+). In many search engines, the plus sign signals an AND
search. It guarantees that the words or phrases you include in your search will appear in your search results.
For example, +rain +snow. In most search engines, you don't need to use the plus sign because the
search engine will assume it.
Boolean OR search
What if we want results that include either the word rain or the word snow? This calls for Boolean OR logic. With OR logic,
we're asking for one word, or the other word, or both. An easy way to use OR logic is to use an advanced search page. Most
search engines have such an option and it's very useful.
And the results are in, as you can see in the screenshot below - all 551,000,000 of them! The search results
include pages with just the word rain or just the word snow, exactly as we wanted. Farther down in the
results will be documents containing both words - the overlap in the Venn diagram that you learned about
in Boolean Searching on the Internet.
Notice that Google has translated this search into its own syntax: rain OR snow. Google requires that
the word OR be typed in CAPITAL LETTERS. So do some other search engines. Since this may not be easy
to remember, it's best to go to the advanced search page and let the search engine do the rest.
An OR search is usually used to search for synonyms, for example, global warming OR climate
change.
Boolean NOT search
Sometimes you want to retrieve documents that do not contain a particular word. This can help when
associated words are not really relevant and can muddy the focus of your results. To do this, place a minus
sign (-) in front of the word you want to exclude.
Let's go back to our rain-snow example. In this case, we want documents that contain the word rain, but not
the word snow. So, we've placed the minus sign immediately in front of the word snow: rain -snow.
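The three Boolean searches above map directly onto set operations. The following is a minimal sketch, not how any particular search engine works internally: the four sample documents and the names DOCS, build_index and INDEX are made up for illustration.

```python
# Toy document collection; the texts are invented for this example.
DOCS = {
    1: "heavy rain is expected tonight",
    2: "snow closed the mountain pass",
    3: "rain turning to snow by morning",
    4: "sunny and dry all week",
}


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index


INDEX = build_index(DOCS)
rain = INDEX.get("rain", set())
snow = INDEX.get("snow", set())

and_hits = rain & snow  # rain snow (or +rain +snow): both words
or_hits = rain | snow   # rain OR snow: either word, or both
not_hits = rain - snow  # rain -snow: rain but not snow

print("AND:", sorted(and_hits))
print("OR: ", sorted(or_hits))
print("NOT:", sorted(not_hits))
```

Document 3 is the overlap in the Venn diagram: it is the only AND result, it also appears in the OR results, and it is the document that the NOT search removes.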
Combined Boolean AND, OR search

Sometimes you need a search that is more complex than a single AND or OR search. It is possible to combine both
types of Boolean logic in the same search. Most search engines offer a way to do this. Given the variety in search
engine syntax, it is best to try this type of search using an advanced search page. Advanced search pages are great
for...advanced types of searches!
Let's say you want to learn about the behavior of cats. You believe that using both the words cats and
felines will help you get more results than using just one of these words. The example below shows you how to do
this type of search on an advanced search page.
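The logic of such a combined search can be sketched as (cats OR felines) AND behavior. The sample documents and the helper name matching are assumptions made for this illustration only.

```python
# Invented sample documents for the cats / felines example.
DOCS = {
    1: "cats and their hunting behavior",
    2: "feeding felines in winter",
    3: "behavior of wild felines",
    4: "dog behavior explained",
}


def matching(word):
    """Return the ids of the documents that contain the given word."""
    return {
        doc_id
        for doc_id, text in DOCS.items()
        if word in text.lower().split()
    }


# OR widens the search to both synonyms, AND then narrows it to behavior.
hits = (matching("cats") | matching("felines")) & matching("behavior")
print(sorted(hits))
```

The parentheses matter: the OR part is evaluated first, so a page about dog behavior is not returned even though it contains the word behavior.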
Phrase Search
Some words naturally appear in the context of a phrase, for example, freedom of the press. To search on
phrases in most search engines, simply enclose the phrase within double quotes: "freedom of the
press".
Phrases are especially important when there are STOP WORDS in your search. These are "little" words such as
a, and, the, in, it, etc. Most search engines tend to ignore these words. If you want to be sure they are
included in your search results, enclose them with the rest of your search within quotation marks. You can
also put a plus sign (+) in front of them. Yahoo! suggests a combination of quotation marks and the plus sign,
e.g., "+in thing".
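The difference between a keyword search that drops stop words and a phrase search can be sketched as follows. This is an illustration under assumptions: the stop-word set (including "of") and the function names keyword_terms and phrase_match are hypothetical, and real engines differ in which words they ignore.

```python
# Assumed stop-word list; real search engines use their own lists.
STOP_WORDS = {"a", "and", "the", "in", "it", "of"}


def keyword_terms(query):
    """Terms left after an engine drops the stop words from a query."""
    return [w for w in query.lower().split() if w not in STOP_WORDS]


def phrase_match(phrase, text):
    """True if the words of the phrase occur consecutively and in order."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


text_a = "freedom of the press is protected by law"
text_b = "the press reported on freedom marches"

print(keyword_terms("freedom of the press"))
print(phrase_match("freedom of the press", text_a))
print(phrase_match("freedom of the press", text_b))
```

Both sample texts contain the keywords freedom and press, so a plain keyword search would return both. Only the first text contains the exact phrase, which is why quoting the phrase gives more focused results.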
List of Common HTTP Status Codes

Code  Description                    Comment
100   Continue
101   Switching Protocols
200   OK                             Action completed successfully.
201   Created                        Success following a POST command.
202   Accepted                       The request has been accepted for processing, but the processing has not been completed.
203   Partial Information            Response to a GET command; indicates that the returned meta information is from a private overlaid web.
204   No Content                     Server has received the request but there is no information to send back.
205   Reset Content
206   Partial Content                The requested file was partially sent. Usually caused by stopping or refreshing a web page.
300   Multiple Choices
301   Moved Permanently              Requested a directory instead of a specific file. The web server added the filename index.html, index.htm, home.html, or home.htm to the URL.
302   Moved Temporarily
303   See Other
304   Not Modified                   The cached version of the requested file is the same as the file to be sent.
305   Use Proxy
400   Bad Request                    The request had bad syntax or was impossible to be satisfied.
401   Unauthorized                   User failed to provide a valid user name / password required for access to file / directory.
402   Payment Required
403   Forbidden                      The request does not specify the file name. Or the directory or the file does not have the permission that allows the pages to be viewed from the web.
404   Not Found                      The requested file was not found.
405   Method Not Allowed
406   Not Acceptable
407   Proxy Authentication Required
408   Request Time-Out
409   Conflict
410   Gone
411   Length Required
412   Precondition Failed
413   Request Entity Too Large
414   Request-URL Too Large
415   Unsupported Media Type
500   Server Error                   In most cases, this error is a result of a problem with the code or program you are calling rather than with the web server itself.
501   Not Implemented                The server does not support the facility required.
502   Bad Gateway
503   Out of Resources               The server cannot process the request due to a system overload. This should be a temporary condition.
504   Gateway Time-Out               The service did not respond within the time frame that the gateway was willing to wait.
505   HTTP Version not supported
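The first digit of a code tells you which group it belongs to. The sketch below uses Python's standard http.HTTPStatus enumeration to look up a code. The CLASSES dictionary and the describe function are names chosen for this example, and the reason phrases in Python's library may differ from the descriptions in the table above (for example, 503 is listed there as "Service Unavailable").

```python
from http import HTTPStatus

# Groups of status codes, keyed by the first digit of the code.
CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def describe(code):
    """Return (group name, standard reason phrase) for a status code."""
    status = HTTPStatus(code)  # raises ValueError for an unknown code
    return CLASSES[code // 100], status.phrase


for code in (200, 301, 404, 503):
    group, phrase = describe(code)
    print(code, group, "-", phrase)
```

A quick rule of thumb follows from the grouping: 4xx codes usually point to a problem with the request (a wrong URL or missing login), while 5xx codes point to a problem on the server side.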
Practical List:-
1. Basic Tags (Head, Body & Title)
2. Use of various Heading Tags (H1, H2, H3 ... H6)
3. Changing the webpage background color using a hexadecimal code
4. Implicit tag (Bold, Italic, Underline)
5. Explicit Tags (<cite>,<tt>)
6. Paragraph with different attributes
Font tag (with attributes)
Horizontal Rule (Line) with attributes.
7. Pre-formatted tags
   1. Resume (With Table for Qualification)
   2. Drawing like Hut, Teddy Bear
8. List Tag
   1. Ordered List
      1. Number 1,2,3
      2. Number A,B,C
      3. Number in Roman
   2. Unordered List
      1. With Disc
      2. With circle
      3. With square
   3. Mixed List (Combination of Ordered & Unordered).
   4. Definition List (Glossary List)
      <dl>
      <dt> </dt>
      <dd> </dd>
      </dl>
   5. Directory List
   6. Menu List
9. <q> Vs <blockquote>
10. Paragraph Vs Division tag (<div align=right>)
11. Font Vs Base font
12. Text formatting tag (Bold, italic, underline, strike)
13. Moving Text in Web page (marquee tag) with attributes
14. Making a comment in a webpage (Comment tag)
15. <abbr> tag
HTML <basefont> Tag

The basefont element is deprecated.
Example
Specify a default font-color and font-size for text on page:
<head>
<basefont color="red" size="5" />
</head>
<body>
<h1>This is a header</h1>
<p>This is a paragraph</p>
</body>
Definition and Usage

The <basefont> tag specifies a default font-color, font-size, or font-family for all the text in a document.
Browser Support
<p>This is the standard font size for this document.<br/>
<basefont size="4" />
And now the font is a bit larger.
<basefont size="3" />
And now the font is back to normal.</p>
Internet service provider
An Internet service provider (ISP), also sometimes referred to as an Internet access provider (IAP), is a
company that offers its customers access to the Internet. The ISP connects to its customers using a data
transmission technology appropriate for delivering Internet Protocol packets or frames, such as dial-
up, DSL, cable modem, wireless or dedicated high-speed interconnects.
An Internet Service Provider is a company that provides Internet services, including personal and
business access to the Internet. For a monthly fee, the service provider usually provides a software
package, username, password and access phone number. Equipped with a modem, you can then log on to
the Internet and browse the World Wide Web and USENET, and send and receive e-mail.
For broadband access you typically receive the broadband modem hardware or pay a monthly fee for this
equipment that is added to your ISP account billing.
In addition to serving individuals, ISPs also serve large companies, providing a direct connection from the
company's networks to the Internet. ISPs themselves are connected to one another through Network Access
Points (NAPs). ISPs may also be called IAPs (Internet Access Providers).
ISPs may provide Internet e-mail accounts to users which allow them to communicate with one another by
sending and receiving electronic messages through their ISP's servers. ISPs may provide services such as
remotely storing data files on behalf of their customers, as well as other services unique to each particular ISP.
Typical home user connection
1. Broadband wireless access
2. Cable Internet
3. Dial-up
   1. ISDN (Integrated Services Digital Network)
   2. Modem
4. DSL (typically Asymmetric Digital Subscriber Line, ADSL)
5. FTTH (fiber to the home)
6. Wi-Fi (a term used for a narrow range of connectivity technologies, including wireless local area
networks (WLAN) based on the IEEE 802.11 standards)
Typical business-type connection
1. DSL (Digital Subscriber Line)
2. Ethernet technologies
3. Leased line
4. SHDSL (Single pair high speed digital subscriber line)
ISPs having all-India licence include:
BSNL
CMC
RPG Infotech
Essel Shyam Communications
Gateway Systems (India)
World Phone Internet Services
Sify
Siti Cable Network
Hughes Escorts Communications
Astro India Networks
VSNL
Guj Info Petro
Primus Telecommunications India
RailTel Corporation
Reliance
ERNET India
Data Infosys
GTL
Jumpp India
L&T Finance
Tata Internet Services
Tata Power Broadband
HCL Infinet
Primenet Global
Reliance Engineering Associates
Pacific Internet India
Bharti Infotel
In2Cable (India)
Swiftmail Communications
Estel Communication
BG Broad India
Bharti Aquanet
Trak Online Net India
Reach Network India
Spectra Net
i2i Enterprise
Tata Teleservices (Maharashtra)
Comsat Max
Gujarat Narmada Valley Fertilizers Corporation
HCL Comnet Systems and Services
Harthway Cable
More recently, wireless Internet service providers or WISPs have emerged that offer Internet access through wireless
LAN or wireless broadband networks.
In addition to basic connectivity, many ISPs also offer related Internet services like email, Web hosting and access to
software tools.
A few companies also offer free ISP service to those who need occasional Internet connectivity. These free offerings
feature limited connect time and are often bundled with some other product or service.
Types of internet access

Most ISPs offer several types of internet access, which essentially differ in connection speed, that is, the
time taken for download and upload. Many also offer different plans or packages that vary in the download
limit, number of email accounts on offer, etc.
Dialup internet access is probably the slowest connection and requires you to connect to the internet via
your phone line by dialling a number specified by the ISP. This means dialup connections are not always on:
unless you want to run up a huge phone bill, you would sever the connection when you finish work online.
Cable internet access can be obtained from the local cable TV operator. However, ask them for a demo first
or check with your neighbours on the quality of service.
Internet access via DSL broadband is indeed very fast, and ISPs can offer different download speeds: the
quicker the speed, the higher the price. If you are planning for a DSL internet connection, ask the ISP if
they would also install a wireless modem and router at your location. A wireless internet connection gives
you freedom and flexibility: you need not be confined to one place (the work table, for instance) and can
access the internet from any spot (even the bathroom) as long as you are within range of the wireless signal.
How does the ISP connect you to the Internet?
When you are connected to the Internet through your service provider, communication between you and the
ISP is established using a simple protocol: PPP (Point to Point Protocol), a protocol making it possible for two
remote computers to communicate without having an IP address.
In fact, your computer does not have an IP address of its own at this point. However, an IP address is necessary
to go onto the Internet, because the protocol used on the Internet is TCP/IP, which makes it possible for
a very large number of computers, each located by its address, to communicate.
So, communication between you and the service provider is established according to the PPP protocol which
is characterized by:
1. a telephone call
2. initialization of communication
3. verification of the user name (login or userid)
4. verification of the password
Once you are "connected", the Internet service provider lends you an IP address, which you keep for the whole
duration of your connection. However, this address is not fixed: at the next connection the service provider
gives you one of its free addresses, which will usually be a different one because, depending on its capacity,
the provider may have several hundred thousand addresses.
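The idea of an address that is lent for one session and then reused can be sketched as a small pool. This is an illustration only, not an implementation of PPP: the class AddressPool, the accounts dictionary, the user name and the network 192.0.2.0/30 (a range reserved for documentation) are all assumptions made for the example.

```python
import ipaddress


class AddressPool:
    """Hypothetical ISP pool that lends an address for one session."""

    def __init__(self, network):
        # All usable host addresses of the network start out free.
        self.free = list(ipaddress.ip_network(network).hosts())
        self.leased = {}

    def connect(self, user, password, accounts):
        # Verify the user name, then the password, as in the list above.
        if user not in accounts or accounts[user] != password:
            raise PermissionError("login rejected")
        address = self.free.pop(0)  # lend the first free address
        self.leased[user] = address
        return address

    def disconnect(self, user):
        # A returned address goes to the back of the queue for reuse.
        self.free.append(self.leased.pop(user))


accounts = {"asha": "secret"}
pool = AddressPool("192.0.2.0/30")

first = pool.connect("asha", "secret", accounts)
pool.disconnect("asha")
second = pool.connect("asha", "secret", accounts)

print(first, second)
```

The same user receives a different address on the second connection because the first address was returned to the back of the queue, which mirrors the point that the address is lent rather than fixed.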
Your connection is therefore a proxy connection because it is your service provider who sends all the requests
you make and the service provider who receives all the pages that you request and who returns them to you.
It is for these reasons for example that when you have Internet access via an ISP, you must pick up your email
on each connection because generally it is the service provider that receives your email (it is stored on one of
its servers).
Differences between ISPs

Selecting an ISP depends on many criteria including the number of services offered and the quality of these
services. So what are these criteria?
1. Cover: some ISPs only offer cover in large towns, others offer national coverage, i.e. a number which
is charged as a local call wherever you are calling from
2. Bandwidth: this is the total speed that the ISP offers. This bandwidth is shared between the
subscribers, so as the number of subscribers increases, each subscriber's share becomes smaller (the bandwidth
allocated to each subscriber must be greater than his transmission capacity in order to provide him
with a quality service).
3. Price: this depends on the ISP and the type of package chosen. Some ISPs now offer free access
4. Access (unlimited or not): some ISPs offer a package where your connection time is taken into account, i.e.
if you exceed a set number of hours of connection per month, the call charge is
subject to a price increase (additional minutes are very expensive). Some providers even offer tariffs
without subscription, i.e. only the communication is paid for (but obviously is more expensive than a
local call!)
5. Technical service: this is a team responsible for responding to your technical problems (also called a
hotline or even customer service). ISPs generally charge for this type of service (sometimes 1.35 for
the call then 0.34/min)
6. Supplementary services:
1. Number of email addresses
2. Space made available for the creation of a personal page (HTML)
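The bandwidth criterion above comes down to simple arithmetic. The sketch below assumes an even split among the subscribers who are online at the same time, which is a simplification. The function names and all the figures (a 100,000 kbps link, a 56 kbps modem) are made-up values for illustration.

```python
def per_subscriber_kbps(total_kbps, active_subscribers):
    """Even share of the ISP's total bandwidth for each active subscriber."""
    if active_subscribers <= 0:
        raise ValueError("need at least one active subscriber")
    return total_kbps / active_subscribers


def is_adequate(total_kbps, active_subscribers, modem_kbps):
    """True if each share is greater than the subscriber's modem speed."""
    return per_subscriber_kbps(total_kbps, active_subscribers) > modem_kbps


TOTAL_KBPS = 100_000  # assumed total bandwidth of the ISP
MODEM_KBPS = 56       # assumed transmission capacity of a dial-up modem

for subscribers in (1_000, 2_500):
    share = per_subscriber_kbps(TOTAL_KBPS, subscribers)
    print(subscribers, share, is_adequate(TOTAL_KBPS, subscribers, MODEM_KBPS))
```

With 1,000 active subscribers each share is 100 kbps, more than the modem can use. With 2,500 subscribers the share drops to 40 kbps, below the modem's capacity, so the quality of service suffers.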
Internet Explorer Vs Netscape Navigator

Differences between Internet Explorer (IE) and Netscape Navigator (NN)
These two major browsers are coming closer to each other in the DHTML effects that their newer
versions support. However, you will need to remember that IE is more flexible than Netscape, and due to
small differences something that works really well in IE might not work at all in Netscape. So you need
to be really careful and alert when programming for both browsers. One hint you can follow in most
cases is that if you get it working in Netscape it should most probably work in IE.
Another major limitation of Netscape as compared to IE is that not all properties of a page can be
changed at any time. This is because once the web page has been written to the screen, only position,
visibility and clipping can be manipulated dynamically.
The good news is that from the web designing point of view you can now forget completely about
debugging all your websites for Netscape 4.x, as only a very small fraction of the Netscape community still uses
it. Think of it this way: if you are bent on making the website work perfectly for version 4.x then you
cannot use some effects (especially JavaScript and CSS) that are easily supported by the latest versions
of all the major browsers.
For Dreamweaver to not keep throwing up Netscape 4 errors set the browser check settings to show
Netscape 6 instead of the default 4.0. To do this click on the Results panel, select the Target Browser
Check tab, click on the green arrow to show the list of options - select the Settings option and
set Netscape Navigator to version 6.0.

