
FUNDAMENTALS OF WEB TECHNOLOGY

UNIT -1

INTRODUCTION

● Introduction to web technology


● World wide web
● TCP/IP
● Protocols
● Internet services
● Web server
● Client-server model
● Routing
● Gateways

Interconnected Network
● The Internet is a global (huge) network connecting millions of computers.
● A network may include PCs, and other devices like servers or printers.
● A network is connected through a communication channel

We use the Internet for:


● Email
● WWW, hypertext, browsers
● FTP, P2P file distribution
● Mobile Internet
● IM, IRC, Skype
● Blogging, microblogging
● Gaming
● Learning
● Video Conferencing
● Remote Backup
● Streaming video and audio
● Collaboration-Participation (Wiki)
● Collaborative tagging
● Software over the web
● Social networks
● Business and finance

Internet – Brief History:


● Grew out of a research network originally funded by U.S. Department of Defense.
● Development of this network, known as the ARPAnet after the Advanced Research Projects
Agency (ARPA), began in 1969.
● As the network grew, it was used for applications beyond research, such as electronic
mail.
● In the early 1980s, the current versions of the core Internet protocols, TCP and IP, were
introduced across the network.
● In 1992, the European Organization for Nuclear Research (CERN) released the first versions of
World Wide Web software.

Internet – Key properties:


● The Internet is interoperable.
● The Internet is global.
● The Web makes it easy.
● The costs of the network are shared across multiple applications and borne by the end
users.
● The striking characteristic of the Internet is its heterogeneity.

World Wide Web (WWW)

The World Wide Web is a way of exchanging information between computers on the Internet,
tying them together into a vast collection of interactive multimedia resources.
The Internet and the Web are not the same thing: the Web uses the Internet to carry its
information. The WWW is an application that runs on the Internet.

The Web didn’t exist until the 1980s. In 1989 Tim Berners-Lee created a set of technologies that
allowed information on the Internet to be linked together through the use of links, or
connections in documents.
His goals were in two areas:
● Developing ways of linking documents that were not stored on the same computer, but
were scattered across many different physical locations
● Enabling users to work together
The Web was mostly text based until Marc Andreessen began developing the Mosaic browser in 1992
(first released in 1993). Mosaic is credited with popularizing the WWW. People then started
thinking about adding videos, sound, and graphics to the Web.

The World Wide Web Consortium (W3C) coordinates the development of WWW standards and ensures uniformity.

Two types of computers:

● Computers that offer information to be read: “Servers”
e.g. a machine running the IIS web server software
● Computers that read the information offered: “Clients”
e.g. a machine running the Internet Explorer browser

WWW Architecture

WWW architecture is divided into several layers:


● Identifiers and Character Set
Uniform Resource Identifiers (URIs) are used to uniquely identify resources on the web, and
Unicode makes it possible to build web pages that can be read and written in human
languages.

● Syntax
XML (Extensible Markup Language) helps to define a common syntax in the semantic web.

● Data Interchange
The Resource Description Framework (RDF) helps in defining a core representation
of data for the web. RDF represents data about resources in graph form.

● Taxonomies
RDF Schema (RDFS) allows a more standardized description of taxonomies and other
ontological constructs.

● Ontologies
Web Ontology Language (OWL) offers more constructs than RDFS. It comes in the following
three versions:

○ OWL Lite for taxonomies and simple constraints.

○ OWL DL for full description logic support.

○ OWL Full for more syntactic freedom of RDF.


● Rules
RIF and SWRL offer rules beyond the constructs that are available from RDFS and OWL.
SPARQL (SPARQL Protocol and RDF Query Language) is an SQL-like language used for
querying RDF data and OWL ontologies.

● Proof
All semantics and rules executed at the layers below Proof, and their results, are
used to prove deductions.

● Cryptography
Cryptographic means, such as digital signatures, are used to verify the origin of
sources.

● User Interface and Applications
On top of these layers, the User Interface and Applications layer is built for user interaction.

TCP/IP

● TCP/IP is the protocol suite used to communicate on the Internet.


● Every machine connected to the Internet must have an address by which it can be
located on the Internet.
● TCP/IP (in IP version 4) uses four numbers to address a computer. Each number is between 0
and 255, so an address is 32 bits long. Traditionally the address space was divided into
classes A, B, C and D (plus a reserved class E); classes A, B and C differ in how the 32
bits are split between the network part and the host part.
● The rapid growth of the Internet led to a shortage of IP addresses. No one could have
anticipated the size of the Internet when the protocol was first devised. Internet Protocol
version 6 (IPv6) provided relief to this problem by lengthening the IP address from 32 bits
to 128 bits.
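
As a small illustration of the addressing described above, the sketch below uses Python's standard `ipaddress` module to show that a dotted IPv4 address is one 32-bit number and that an IPv6 address is 128 bits wide. The example addresses and the helper name `dotted_to_int` are assumptions chosen for illustration; they do not come from these notes.

```python
import ipaddress

# Example addresses from ranges reserved for documentation (illustrative only).
v4 = ipaddress.ip_address("192.0.2.10")
v6 = ipaddress.ip_address("2001:db8::1")


def dotted_to_int(dotted: str) -> int:
    """Combine the four 0-255 numbers of an IPv4 address into one 32-bit integer."""
    parts = dotted.split(".")
    if len(parts) != 4:
        raise ValueError("an IPv4 address has exactly four numbers")
    value = 0
    for part in parts:
        octet = int(part)
        if not 0 <= octet <= 255:
            raise ValueError(f"number out of range: {octet}")
        value = (value << 8) | octet  # shift in 8 bits per number
    return value


print(dotted_to_int("192.0.2.10") == int(v4))  # the library agrees with the manual sum
print(v4.max_prefixlen)  # address width in bits for IPv4
print(v6.max_prefixlen)  # address width in bits for IPv6
```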

Domain Names:
Names are easier to remember than a 12 digit (or longer) number.
Some applications let you identify a computer or an IP network by using a logical or domain
name.
For example, www.ncirl.ie is a domain name.
When you address a website, like http://www.tcd.ie, the name is translated to a number by a
Domain Name System (DNS) server.
When a new domain is registered together with an IP address, this information is made
available to DNS servers all over the world.

Uniform Resource Locators (URL):


A Uniform Resource Locator (URL) is used to address a document on the Web. The host part of a
URL is a domain name, which the DNS maps to an IP address. A full Web address looks like:
http://www.scss.tcd.ie/Owen.Conlan/php/xpath/index.html
A URL usually follows these syntax rules:
scheme://host.domain.country_code:port/path/filename
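
To make the syntax rule concrete, the following sketch splits the example address above into its parts with Python's standard `urllib.parse.urlsplit`. The `DEFAULT_PORTS` table and the name `effective_port` are illustrative assumptions; they only reflect the usual convention that a URL without an explicit port uses the scheme's default.

```python
from urllib.parse import urlsplit

url = "http://www.scss.tcd.ie/Owen.Conlan/php/xpath/index.html"
parts = urlsplit(url)

# Conventional default ports, used when the URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}
effective_port = parts.port or DEFAULT_PORTS.get(parts.scheme)

print("scheme:", parts.scheme)
print("host:  ", parts.hostname)
print("port:  ", effective_port)
print("path:  ", parts.path)
```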

Retrieving a URL:
➔ TCP/IP is a collection of communication protocols that controls the way that information
is broken up and sent over the Internet. HTTP takes care of the communication
between a web server and a web browser.
➔ To retrieve a Web resource, the user either types a URL in the Web browser’s
address bar or clicks on a hyperlink in a document. HTTP is used for sending requests from
a web client (a browser) to a web server, and for returning web content (web pages) from
the server back to the client.
➔ The Web browser specifies the details of the required Web page in an HTTP Request
message.
➔ The Web server receives this request and, after processing it, completes the operation by
returning either the document or an error in the HTTP Response message.
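
The request and response messages mentioned above are plain text. The sketch below builds a minimal HTTP/1.1 GET request for a URL and reads the status line of a response. It does not open a network connection; the function names and the sample response text are assumptions made up for illustration.

```python
from urllib.parse import urlsplit


def build_get_request(url: str) -> str:
    """Return the text of a minimal HTTP/1.1 GET request for the given URL."""
    parts = urlsplit(url)
    path = parts.path or "/"  # an empty path means the site root
    if parts.query:
        path += "?" + parts.query
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {parts.netloc}",
        "Connection: close",
        "",  # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)


def parse_status_line(response_text: str) -> tuple:
    """Extract (status code, reason phrase) from the first line of a response."""
    status_line = response_text.split("\r\n", 1)[0]
    _version, code, reason = status_line.split(" ", 2)
    return int(code), reason


request = build_get_request("http://www.tcd.ie")
# Hand-written sample response, standing in for what a server might return.
sample_response = "HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n"

print(request)
print(parse_status_line(sample_response))
```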

OSI Model
Open Systems Interconnection (OSI) is an open standard for all communication systems. This
layered model is a conceptual view of how one system should communicate with another, using
various protocols defined in each layer. Each layer is assigned a well-defined part of the
communication system.
The OSI model has the following seven layers:
● Application
● Presentation
● Session
● Transport
● Network
● Data Link
● Physical

Physical Layer (layer-1): This layer defines the hardware, cabling, wiring, power output, pulse
rate etc.

Data Link Layer (layer-2): This layer is responsible for reading and writing data from and onto
the line. Link errors are detected at this layer.

Network Layer (layer-3): This layer is responsible for address assignment and for uniquely
addressing hosts in a network. This layer helps to uniquely identify hosts beyond the subnets
and defines the path which the packets will follow, or be routed along, to reach the destination.

Transport Layer (layer-4): This layer provides end-to-end data delivery among hosts. This layer
takes data from the layer above, breaks it into smaller units called Segments and then gives
them to the Network layer for transmission.

Session Layer (layer-5): This layer maintains sessions between remote hosts. For example, once
user/password authentication is done, the remote host maintains this session for a while and
does not ask for authentication again in that time span. This layer can assist in synchronization,
dialog control and critical operation management (e.g. an online bank transaction).

Presentation Layer (layer-6): This layer defines how data in the native format of the remote host
should be presented in the native format of the local host. Data from the sender is
converted to on-the-wire data (a general standard format), and at the receiver’s end it is
converted to the native representation of the receiver.

Application Layer (layer-7): This layer is responsible for providing an interface to the application
user. It encompasses protocols which directly interact with the user. This is where the user
applications sit that need to transfer data between or among hosts. Examples: HTTP, File
Transfer Protocol (FTP), and electronic mail.

TCP/IP Model:
TCP/IP (Transmission Control Protocol/Internet Protocol) has its own reference model, which is
the one followed on the Internet.
This model has the following layers:
1. Application
2. Transport
3. Internet
4. Link

Application Layer: defines the protocols which enable the user to interact with the network, for
example FTP, HTTP, etc.
Transport Layer: defines how data should flow between hosts. The major protocol at this layer is
Transmission Control Protocol (TCP). This layer ensures that data delivered between hosts is in
order and is responsible for end-to-end delivery.
Internet Layer: Internet Protocol (IP) works on this layer. This layer facilitates host addressing
and recognition. This layer defines routing.
Link Layer: provides the mechanism for sending and receiving actual data. Unlike its OSI model
counterpart, this layer is independent of underlying network architecture and hardware.
TCP/IP and OSI Model
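
The two models are usually lined up as in the sketch below. This is the commonly taught correspondence rather than an exact equivalence, and the dictionary and function names are illustrative assumptions.

```python
# Commonly cited correspondence: OSI layer name -> TCP/IP layer name.
OSI_TO_TCPIP = {
    "Application": "Application",
    "Presentation": "Application",
    "Session": "Application",
    "Transport": "Transport",
    "Network": "Internet",
    "Data Link": "Link",
    "Physical": "Link",
}


def osi_layers_for(tcpip_layer: str) -> list:
    """List the OSI layers that are usually grouped under one TCP/IP layer."""
    return [osi for osi, tcpip in OSI_TO_TCPIP.items() if tcpip == tcpip_layer]


for layer in ["Application", "Transport", "Internet", "Link"]:
    print(f"{layer:12} <- {', '.join(osi_layers_for(layer))}")
```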

TCP (Transmission Control Protocol)

TCP is an alternative to UDP as a transport layer protocol over IP.

TCP provides: Connection-oriented, Reliable, Full-duplex, Byte-Stream

Connection-Oriented means that a virtual connection is established before any user data is
transferred. If the connection cannot be established, the user program is notified. If the
connection is ever interrupted, the user program(s) is notified.

Reliable means that every transmission of data is acknowledged by the receiver. If the sender
does not receive acknowledgement within a specified amount of time, the sender retransmits
the data.

Byte Stream: Stream means that the connection is treated as a stream of bytes. The user
application does not need to package data in individual datagrams (as with UDP).

Buffering: TCP is responsible for buffering data and determining when it is time to send a
datagram. It is possible for an application to tell TCP to send the data it has buffered without
waiting for a buffer to fill up.

Full Duplex: TCP provides transfer in both directions. To the application program these appear
as two unrelated data streams, although TCP can piggyback control and data communication by
providing control information (such as an ACK) along with user data.

TCP Ports: Interprocess communication via TCP is achieved with the use of ports (just like UDP).
TCP port numbers and UDP port numbers are independent of each other.

TCP Segments: The chunk of data that TCP asks IP to deliver is called a TCP segment. Each
segment contains data bytes from the byte stream and control information that identifies the
data bytes.

TCP Segment Format: The control flags in the segment header include URG, ACK, PSH, RST, SYN and FIN.

TCP Lingo:
● SYN: When a client requests a connection, it sends a “SYN” segment (a special TCP
segment) to the server port. SYN stands for synchronize. The SYN message includes the
client’s ISN.
● ISN: Initial Sequence Number.
● Sequence Number: Every TCP segment includes a Sequence Number that refers to the first
byte of data included in the segment.
● Acknowledgement Number: Every TCP segment includes an Acknowledgement Number that
indicates the byte number of the next data that is expected to be received.
● Window: Every ACK includes a Window field that tells the sender how many bytes it can
send before the receiver would have to discard data (because its buffer has a fixed size).

TCP Connection Creation (client-server model):

In the client-server model, any process can act as a server or a client. It is not the type of
machine, its size, or its computing power that makes it a server; it is the ability to serve
requests that makes a machine a server.

A system can act as server and client simultaneously. That is, one process acts as a server
while another acts as a client. It may also happen that both client and server processes
reside on the same machine.

A server accepts a connection.


A client requests a connection.

A client starts by sending a SYN segment with the following information:


● Client’s ISN (generated pseudo-randomly)
● Maximum Receive Window for client
● Optionally (but usually) MSS, the Maximum Segment Size (largest segment accepted).
● No payload. (Only TCP headers)

When a waiting server sees a new connection request, the server sends back a SYN segment
(with the ACK flag also set) containing:
● Server’s ISN (generated pseudo-randomly)
● Acknowledgement Number is Client ISN+1
● Maximum Receive Window for server
● Optionally (but usually) MSS
● No payload (only TCP headers)

When the Server’s SYN is received, the client sends back an ACK with:
● Acknowledgement Number is Server’s ISN+1
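
The three segments of the connection setup can be traced with a small model. The sketch below only tracks flags, sequence numbers and acknowledgement numbers; it is not a TCP implementation. The `Segment` class and `three_way_handshake` function are hypothetical names, and fixed ISNs are passed in so the numbers are easy to follow (real ISNs are generated pseudo-randomly).

```python
from dataclasses import dataclass
from typing import Optional

SEQ_MOD = 2**32  # sequence numbers are 32-bit values and wrap around


@dataclass
class Segment:
    flags: frozenset
    seq: int
    ack: Optional[int] = None  # None when the ACK flag is not set


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the SYN, SYN+ACK and ACK segments exchanged to open a connection."""
    syn = Segment(frozenset({"SYN"}), seq=client_isn)
    syn_ack = Segment(
        frozenset({"SYN", "ACK"}),
        seq=server_isn,
        ack=(syn.seq + 1) % SEQ_MOD,  # server expects client ISN + 1 next
    )
    ack = Segment(
        frozenset({"ACK"}),
        seq=(client_isn + 1) % SEQ_MOD,
        ack=(syn_ack.seq + 1) % SEQ_MOD,  # client expects server ISN + 1 next
    )
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(sorted(segment.flags), "seq =", segment.seq, "ack =", segment.ack)
```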

TCP data and ACK:


● Once the connection is established, data can be sent.
● Each data segment includes a sequence number identifying the first byte in the segment.
● Each segment (data or empty) includes an acknowledgement number indicating what data has
been received.

Buffering:
● TCP is part of the Operating System. The OS takes care of all these details
asynchronously.
● The TCP layer doesn’t know when the application will ask for any received data.
● TCP buffers incoming data so it's ready when we ask for it.

TCP Buffers:
● Both the client and server allocate buffers to hold incoming and outgoing data. The TCP
layer does this.
● Both the client and server announce with every ACK how much buffer space remains.

Send Buffers
● The application gives the TCP layer some data to send.
● The data is put in a send buffer, where it stays until the data is ACK’d.
● The TCP layer won’t accept data from the application unless (or until) there is buffer
space.

● A receiver doesn’t have to ACK every segment (it can ACK many segments with a single
ACK segment).
● Each ACK can also contain outgoing data (piggybacking).
● If a sender doesn’t get an ACK after some time limit, it resends the data.
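
The send-buffer behaviour described above can be sketched with a toy class: data stays in the buffer until it is acknowledged, a single ACK can cover several writes, and whatever is still unacknowledged is what a sender would resend after a timeout. The class name `SendBuffer` and its methods are assumptions for illustration, and the model ignores sequence-number wraparound and timers.

```python
class SendBuffer:
    """Toy model of a TCP send buffer: bytes stay buffered until acknowledged."""

    def __init__(self, capacity: int, first_seq: int = 0):
        self.capacity = capacity
        self.base_seq = first_seq  # sequence number of the oldest unACKed byte
        self.data = bytearray()    # bytes handed over by the application, not yet ACKed

    def write(self, payload: bytes) -> int:
        """Accept as many bytes as fit in the free space; return how many were accepted."""
        room = self.capacity - len(self.data)
        accepted = payload[:room]
        self.data.extend(accepted)
        return len(accepted)

    def on_ack(self, ack_number: int) -> None:
        """Cumulative ACK: every byte before ack_number has been received."""
        acked = ack_number - self.base_seq
        if 0 < acked <= len(self.data):
            del self.data[:acked]      # free the space taken by acknowledged bytes
            self.base_seq = ack_number

    def unacked(self) -> bytes:
        """Bytes that would be resent if no ACK arrived within the time limit."""
        return bytes(self.data)


buf = SendBuffer(capacity=8, first_seq=100)
print(buf.write(b"hello"))  # all five bytes fit
print(buf.write(b"world"))  # only part fits: the buffer is nearly full
buf.on_ack(105)             # one ACK covering the first five bytes
print(buf.unacked())
```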

Termination:
● The TCP layer can send a RST segment that terminates a connection if something is wrong.
● Usually the application tells TCP to terminate the connection with a FIN segment.
● When a FIN is sent, it means that the application is done sending data.
● The FIN is ACK’d.
● The other end must now send a FIN.
● That FIN must be ACK’d.

Web Server

A web server is a computer where web content is stored. It is used to host web sites, but
there are also other kinds of servers, such as gaming, storage, FTP and email servers.

Web Server Working:


A web server responds to a client request in either of the following two ways:
● Sending the file associated with the requested URL to the client.
● Generating a response by invoking a script and communicating with a database.

➔ When a client sends a request for a web page, the web server searches for the requested
page. If the requested page is found, the server sends it to the client in an HTTP response.

➔ If the requested web page is not found, the web server sends an HTTP error response:
404 Not Found.

➔ If the client has requested some other resource, the web server contacts the
application server and data store to construct the HTTP response.
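
The decision steps above can be sketched as one function. This is a simplified model, not a real server: the `static_files` dictionary stands in for files on disk, the `handlers` dictionary stands in for scripts backed by an application server and data store, and all names and sample paths are hypothetical.

```python
def handle_request(path, static_files, handlers):
    """Return (status code, body) for a requested path.

    static_files: dict mapping a path to stored file content
    handlers:     dict mapping a path to a function that generates a body
    """
    if path in static_files:      # the requested page exists as a file
        return 200, static_files[path]
    if path in handlers:          # the response must be generated by a script
        return 200, handlers[path]()
    return 404, "Not Found"       # nothing matches the requested URL


# Sample content, made up for illustration.
static_files = {"/index.html": "<h1>Home</h1>"}
handlers = {"/report": lambda: "<p>generated from the data store</p>"}

for requested in ["/index.html", "/report", "/missing.html"]:
    print(requested, handle_request(requested, static_files, handlers))
```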

Routing

When a device has multiple paths to reach a destination, it selects one path by
preferring it over the others. This selection process is termed Routing. Routing is done by
special network devices called routers, or it can be done by means of software processes.
Software-based routers have limited functionality and limited scope.

A router is typically configured with a default route. A default route tells the router where to
forward a packet if no route is found for a specific destination. If multiple paths exist to
reach the same destination, the router can make its decision based on the following
information:
● Hop Count
● Bandwidth
● Metric
● Prefix-length
● Delay

Routes can be statically configured or dynamically learnt. One route can be configured to be
preferred over others.
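
A minimal sketch of route selection follows. It assumes one particular preference order (longest matching prefix first, then lowest metric), which is a common convention but not the only one, and it uses a made-up routing table with documentation addresses as next hops.

```python
import ipaddress

# Hypothetical routing table: (destination prefix, next hop, metric).
ROUTES = [
    ("0.0.0.0/0", "192.0.2.1", 10),    # default route: matches every destination
    ("10.0.0.0/8", "192.0.2.2", 10),
    ("10.1.0.0/16", "192.0.2.3", 20),
    ("10.1.0.0/16", "192.0.2.4", 5),   # same prefix, better (lower) metric
]


def select_route(destination: str) -> str:
    """Return the next hop chosen for a destination address."""
    dest = ipaddress.ip_address(destination)
    candidates = [
        (ipaddress.ip_network(prefix), next_hop, metric)
        for prefix, next_hop, metric in ROUTES
        if dest in ipaddress.ip_network(prefix)
    ]
    # Prefer the longest prefix; break ties with the lowest metric.
    best = max(candidates, key=lambda route: (route[0].prefixlen, -route[2]))
    return best[1]


for address in ["10.1.2.3", "10.9.9.9", "198.51.100.7"]:
    print(address, "->", select_route(address))
```

Because the table contains a default route, every destination has at least one candidate.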

Firewalls

Firewall: A machine and its software that serve as a special gateway to a network, protecting it
from inappropriate access. It filters the network traffic that comes in, checking the validity of
the messages as much as possible and, where necessary, denying some messages altogether.
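
The filtering idea can be sketched as an ordered rule list where the first matching rule decides. The rules, addresses and the function name `filter_packet` are all hypothetical, and a real firewall inspects far more than a source address and a destination port.

```python
import ipaddress

# Hypothetical rules, checked top to bottom: (action, source network, destination port).
# A port of None means the rule applies to any port.
RULES = [
    ("deny", "203.0.113.0/24", None),  # block one source network entirely
    ("allow", "0.0.0.0/0", 80),        # web traffic from anywhere
    ("allow", "0.0.0.0/0", 443),
]


def filter_packet(src_ip: str, dst_port: int) -> str:
    """Return 'allow' or 'deny' for an incoming packet; the first matching rule wins."""
    src = ipaddress.ip_address(src_ip)
    for action, network, port in RULES:
        if src in ipaddress.ip_network(network) and port in (None, dst_port):
            return action
    return "deny"  # anything not explicitly allowed is dropped


print(filter_packet("203.0.113.7", 80))
print(filter_packet("198.51.100.5", 443))
print(filter_packet("198.51.100.5", 22))
```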
