
Web Programming

Introduction to HTML and XHTML


WEB
The Web is a system of interlinked hypertext documents
accessed via the Internet.
Hypertext documents are written in HyperText Markup
Language (HTML) or Extensible HyperText Markup Language
(XHTML) and are transmitted from web servers to web
browsers using the Hypertext Transfer Protocol (HTTP).
Standard Generalized Markup Language (SGML) is an ISO
standard for defining generalized markup languages for
documents. HTML is an application of SGML.
XML is a subset of SGML that defines a set of rules for
encoding documents in a format that is both human
readable and machine readable. XHTML is an application of
XML.
HTML
HyperText Markup Language (HTML) is the markup
language for creating web pages and other
information that can be displayed by a browser.
HTML is written in the form of HTML elements
consisting of tags and content.
Tags: The fundamental syntactic units of HTML are
called tags. In general, tags are used to specify
categories of content. The syntax of a tag is the tag's
name (written in lowercase) surrounded by angle
brackets (< >).
Most tags appear in pairs: an opening tag and a
closing tag. The name of closing tag is the
name of its corresponding opening tag with a
slash attached to it.
Content: Whatever appears between the
opening and closing tags is the content of the element.
A browser display of an XHTML document shows
the content of all of the document's tags; it is the
information the document is meant to portray. Not
all tags have content.
Element: The opening and closing tag
together specify a container for the content
they enclose. The container and its content
together are called an element.
For example:
<p> This is the content of the paragraph tag </p>
The purpose of a web browser is to read HTML
documents. The browser does not display the tags,
but uses them to interpret the content.
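The relationship between tag, content, and element can be sketched in a few lines of JavaScript. This is only an illustration: makeElement is a hypothetical helper name, not a browser API, and it does no escaping of the content.

```javascript
// Build an element: opening tag + content + closing tag.
// makeElement is a hypothetical helper used only for illustration.
function makeElement(tagName, content) {
  const name = tagName.toLowerCase();     // tag names are written in lowercase
  const openingTag = "<" + name + ">";
  const closingTag = "</" + name + ">";   // same name with a slash attached
  return openingTag + content + closingTag;
}

console.log(makeElement("P", "This is the content of the paragraph tag"));
// <p>This is the content of the paragraph tag</p>
```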
Attributes: These are used to specify alternative
meanings of a tag. They appear between an
opening tag's name and its right angle
bracket.
They are specified in keyword form, i.e., the attribute
name followed by an equals sign and the attribute's
value. Attribute names are written in lowercase and
attribute values are delimited by double quotes.
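As a rough sketch of the keyword form described above, the following JavaScript builds an opening tag with its attributes. The helper names formatAttributes and openingTag, and the file name page2.html, are assumptions made up for this example.

```javascript
// Format attributes in keyword form: name="value".
// formatAttributes and openingTag are hypothetical helper names.
function formatAttributes(attributes) {
  return Object.keys(attributes)
    .map(function (name) {
      // Escape &, < and " so the value stays inside its double quotes.
      const value = String(attributes[name])
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/"/g, "&quot;");
      return name.toLowerCase() + '="' + value + '"';
    })
    .join(" ");
}

function openingTag(tagName, attributes) {
  const attrs = formatAttributes(attributes);
  // Attributes sit between the tag's name and its right angle bracket.
  return "<" + tagName.toLowerCase() + (attrs ? " " + attrs : "") + ">";
}

console.log(openingTag("a", { HREF: "page2.html" }));
// <a href="page2.html">
```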
Comments: These increase the readability of the
document. They appear in XHTML in the
following form:
<!-- anything except two adjacent dashes -->
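The restriction on adjacent dashes can be checked mechanically. A minimal sketch, assuming a hypothetical helper named makeComment that wraps text in comment delimiters:

```javascript
// Wrap text in XHTML comment delimiters, rejecting text that
// contains two adjacent dashes. makeComment is a hypothetical helper.
function makeComment(text) {
  if (text.indexOf("--") !== -1) {
    throw new Error("comment text must not contain two adjacent dashes");
  }
  return "<!-- " + text + " -->";
}

console.log(makeComment("greet.html: a trivial example"));
// <!-- greet.html: a trivial example -->
```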
HTML versions:
HTML 2.0: the first classic version of HTML,
introduced on 24 November 1995.
HTML 3.2: introduced in January 1997; included
tags for tables, images, subscripts, superscripts and
a few more.
HTML 4.0: introduced in December 1997;
supported style sheets.
HTML 4.01: the current official version of HTML,
introduced in December 1999; supports CSS,
tables, forms and scripting.
HTML 5 draft version: the latest version of HTML,
first published as a draft in January 2008.
XHTML
XHTML is a family of current and future
document types and modules that extend
HTML 4. XHTML document types are XML
based and ultimately are designed to work in
conjunction with XML-based user agents.
According to the W3C, XHTML 1.0 is a
reformulation of HTML 4.01 in XML and
combines the strength of HTML 4 with the
power of XML.
XHTML versions
XHTML is an application of XML, a more restrictive
subset of SGML. XHTML documents therefore need to
be well formed and can be parsed using a standard
XML parser.
XHTML 1.0 became a W3C recommendation on
January 26, 2000.
XHTML 1.1 became a W3C recommendation on
May 31, 2001.
XHTML 5 is undergoing development as of
September 2009, as part of the HTML 5 specification.
Motivation to develop XHTML
XHTML was developed to make HTML more
extensible and to increase interoperability with other
data formats.
HTML 4 was ostensibly an application of SGML;
however, the specification for SGML was complex,
and neither web browsers nor the HTML 4
recommendation were fully conformant to it.
The XML standard, approved in 1998, provided a
simpler data format closer in simplicity to HTML 4.
By shifting to an XML format, it was hoped HTML
would become compatible with common XML tools.
HTML versus XHTML:
HTML over XHTML
XHTML requires that all elements be closed, either by a
separate closing tag or by using self-closing syntax (e.g. <br />),
while HTML syntax permits some elements to be unclosed
because either they are always empty or their end can be
determined implicitly.
XHTML is case sensitive for element and attribute names,
while HTML is not.
In general, HTML has lax syntax rules and is much easier and
simpler to write, whereas XHTML requires a level of discipline
to write documents.
Since there is a huge number of HTML documents available on
the web, browsers will continue to support HTML in the future. But
some older browsers have problems with some parts of
XHTML.
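The closing and case rules listed above lend themselves to a mechanical check. The sketch below is a deliberately simplified nesting checker, not a real XML parser or validator: the function name checkXhtmlNesting is hypothetical, and the regular expression ignores comments, CDATA sections, and angle brackets inside attribute values.

```javascript
// Simplified check of two XHTML rules: every element is closed
// (by a closing tag or by self-closing syntax) and element names
// are lowercase. Returns null when no problem is found, otherwise
// a short description of the first problem.
function checkXhtmlNesting(markup) {
  const tagPattern = /<(\/?)([A-Za-z][A-Za-z0-9]*)([^<>]*?)(\/?)>/g;
  const openElements = [];   // stack of currently open element names
  let match;
  while ((match = tagPattern.exec(markup)) !== null) {
    const isClosing = match[1] === "/";
    const name = match[2];
    const isSelfClosing = match[4] === "/" && !isClosing;
    if (name !== name.toLowerCase()) {
      return "element name must be lowercase: " + name;
    }
    if (isSelfClosing) {
      continue;                       // e.g. <br /> opens and closes at once
    }
    if (!isClosing) {
      openElements.push(name);        // opening tag: remember it
      continue;
    }
    const expected = openElements.pop();
    if (expected !== name) {
      return "unexpected </" + name + ">" +
        (expected ? ", <" + expected + "> is still open" : "");
    }
  }
  if (openElements.length > 0) {
    return "unclosed element: <" + openElements[openElements.length - 1] + ">";
  }
  return null;
}

console.log(checkXhtmlNesting("<p>Hello<br /></p>")); // null
console.log(checkXhtmlNesting("<p>Hello<br></p>"));   // reports a problem
```

The second call fails because <br> is never closed, which HTML syntax would tolerate but XHTML does not.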
XHTML over HTML
Quality and consistency in any endeavor, be it electrical
wiring, software development or web development, rely on
standards.
XHTML has strict syntactic rules, which impose a consistent
structure on all XHTML documents.
When an XHTML document is created, its syntactic correctness
can be checked either by an XML browser or by a validation
tool.
XHTML can be written correctly by using XHTML editors,
which provide a simple and effective approach to creating
syntactically correct XHTML documents.
It is also possible to convert HTML documents to XHTML
documents using software tools.
Standard XHTML document structure
Every XHTML document must begin with an xml
declaration element that simply identifies the
document as being one based on XML.
This element includes an attribute that specifies the
version number, which is still 1.0. The xml declaration
usually includes a second attribute, encoding, which
specifies the encoding used for the document.
This declaration must begin in the first character
position of the document file.
<?xml version = "1.0" encoding = "utf-8" ?>
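To make the two attributes and the first-character rule concrete, here is a small sketch that reads the version and encoding out of an xml declaration. The function name parseXmlDeclaration is hypothetical, and the pattern is a simplification: it accepts only double-quoted values, although XML also allows single quotes.

```javascript
// Extract version and encoding from an xml declaration.
// The ^ anchor enforces that the declaration starts at the very
// first character; leading whitespace makes the match fail.
function parseXmlDeclaration(text) {
  const pattern =
    /^<\?xml\s+version\s*=\s*"([^"]+)"(?:\s+encoding\s*=\s*"([^"]+)")?\s*\?>/;
  const match = pattern.exec(text);
  if (match === null) {
    return null;
  }
  return { version: match[1], encoding: match[2] || null };
}

const declaration = parseXmlDeclaration('<?xml version = "1.0" encoding = "utf-8" ?>');
console.log(declaration.version);   // 1.0
console.log(declaration.encoding);  // utf-8
console.log(parseXmlDeclaration(' <?xml version="1.0"?>')); // null (not in first position)
```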
Immediately following the xml declaration is the document type
declaration, which names the document type definition (DTD) using
the DOCTYPE command.
A document type definition (DTD) specifies the syntax rules
for a particular category of XHTML documents; in other words,
a DTD is the grammar of an XML or XHTML document type.
XML uses a subset of the SGML DTD.
Purpose of a DTD: a DTD provides a defined structure for the
document. It is easy to write an XML document without a DTD,
but then there are no syntax rules against which a computer
can check the document.
A DTD is associated with an XML or SGML document by
means of a document type declaration (DOCTYPE), which
appears at the start of the document.
Document type declaration (DOCTYPE)
It is an instruction that associates a particular SGML or
XML document with a DTD.
The general syntax for a document type declaration is:
<!DOCTYPE root-element PUBLIC "FPI" ["URI"] [
<!-- internal subset declarations -->
]>
The root element, which represents the whole document, is the
first element in the document. For example, in XHTML the root
element is <html>, being the first element opened and the last
closed.
The keyword PUBLIC or SYSTEM indicates what kind of
DTD it is (one that is open to the public or one that is on
a private system, respectively).
If the keyword PUBLIC is chosen, it is followed
by a restricted form of public identifier called a Formal Public
Identifier (FPI).
If the keyword SYSTEM is chosen, it is
followed by only a system identifier.
The system identifier allows the XML parser to locate the DTD
on a specific system, by means of a URI reference to the DTD
enclosed in double quotes.
The most commonly used system identifiers are:
http://www.w3.org/TR/html4/strict.dtd (for HTML 4.01 documents)
http://www.w3.org/TR/html4/loose.dtd (to include some older
attributes and deprecated tags)
http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd (for the current version
of XHTML)
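The parts of a document type declaration described above (root element, keyword, public identifier, system identifier) can be pulled apart with a short script. This is a sketch under simplifying assumptions: parseDoctype is a hypothetical name, only double-quoted identifiers are accepted, and an internal subset in square brackets is not supported.

```javascript
// Split a DOCTYPE declaration into its parts.
// Returns null if the text does not fit the simplified pattern.
function parseDoctype(declaration) {
  const pattern =
    /^<!DOCTYPE\s+(\S+)\s+(PUBLIC|SYSTEM)\s+"([^"]*)"(?:\s+"([^"]*)")?\s*>$/;
  const match = pattern.exec(declaration.trim());
  if (match === null) {
    return null;
  }
  const rootElement = match[1];
  const keyword = match[2];
  if (keyword === "PUBLIC") {
    // PUBLIC: Formal Public Identifier, then an optional system identifier.
    return { rootElement: rootElement, keyword: keyword,
             publicId: match[3], systemId: match[4] || null };
  }
  if (match[4] !== undefined) {
    return null;   // SYSTEM takes only one identifier
  }
  return { rootElement: rootElement, keyword: keyword,
           publicId: null, systemId: match[3] };
}

const doctype = parseDoctype(
  '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n' +
  '  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
);
console.log(doctype.rootElement); // html
console.log(doctype.publicId);    // -//W3C//DTD XHTML 1.0 Strict//EN
console.log(doctype.systemId);    // http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd
```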

In our XHTML document we use:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
XML namespaces are used to avoid
ambiguity between identically named elements
or attributes within XML documents.
A namespace name is a uniform resource
identifier (URI).
Typically,
http://www.w3.org/1999/xhtml is used to
identify the XHTML namespace. Using a URI reduces the
probability of different namespaces using
duplicate identifiers.
<html xmlns = "http://www.w3.org/1999/xhtml" >
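One way to see how a namespace removes ambiguity is to pair each element name with its namespace URI. The sketch below writes the pair in the "{URI}name" convention; expandedName is a hypothetical helper, and the furniture namespace URI is invented for the example.

```javascript
// Two elements share the local name "table" but belong to
// different namespaces, so their expanded names differ.
const XHTML_NS = "http://www.w3.org/1999/xhtml";
const FURNITURE_NS = "http://example.com/furniture"; // hypothetical namespace URI

function expandedName(namespaceUri, localName) {
  return "{" + namespaceUri + "}" + localName;
}

const htmlTable = expandedName(XHTML_NS, "table");
const woodTable = expandedName(FURNITURE_NS, "table");

console.log(htmlTable);               // {http://www.w3.org/1999/xhtml}table
console.log(htmlTable === woodTable); // false
```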
Basic Tags:
<html> is typed before all the text in the document. This marks the
beginning of the html document.
<head> Web pages are divided into two main sections: the head and
the body. The head provides information about the document,
including the author, description, keywords, title, and other
information.
</head> This marks the closing of the head section.
<title> You must give your document a title. This title doesn't actually
appear within the web page, but appears in the title bar of the
browser window. It is also the title of the page that will be
displayed by default in search engine results or in a user's Favorites.
</title> closes the title element.
<body> The body section contains the contents of your document.
</body> closes the body tag.
</html> ends the html document.
Opening Tag                      Closing Tag          Description
<h1> to <h6>                     </h1> to </h6>       Headings. h1 is the main heading,
                                                      h2 is secondary, etc.
<p>                              </p>                 New paragraph.
<div> or <span>                  </div> or </span>    Serve as a container for content.
<em>                             </em>                Gives the contained text emphasis
                                                      (usually as italics).
<strong>                         </strong>            Gives the contained text strong
                                                      emphasis (usually as bold).
<a href = "document location">   </a>                 Link to another document.
<a name = "label">               </a>                 Named anchor: the target of a link
                                                      to another section of the same page.
<ol>                             </ol>                Makes ordered lists.
<ul>                             </ul>                Makes unordered (or bulleted) lists.
<li>                             </li>                Marks items in either the ordered
                                                      or unordered list.
Tag                              Description
<br />                           Causes a line break. It may be repeated
                                 for multiple line breaks.
<hr />                           Horizontal rule. It creates a line to
                                 separate content.
<img src = "image location" />   Inserts an image into a web page.
<p />                            The paragraph tag used in this manner
                                 serves as a double line break. It does
                                 not contain text. Unlike the <br /> tag,
                                 it cannot be used multiple times to
                                 generate more white space.
Opening Tag         Closing Tag   Sample Attributes               Description
<table>             </table>                                      Adds a table.
                                  border="number"                 Border for rows and columns.
                                  cellpadding="number"            Spacing between the cell border
                                                                  and the cell contents.
                                  cellspacing="number"            Spacing between cells.
                                  bgcolor="color"                 Background color of cells.
<tr>                </tr>                                         Table row (start and end).
                                  align="left, center, right"     Aligns text in the row horizontally.
                                  valign="top, middle, bottom"    Aligns text in the row vertically.
<th scope="row">    </th>                                         When creating a table to display
<th scope="col">                                                  data, use this tag to differentiate
                                                                  the first row or column of cells as
                                                                  heading cells for all the other cells
                                                                  in the same column or row. Content
                                                                  will automatically be bold and
                                                                  center aligned. The scope attribute
                                                                  defines which data cells pertain to
                                                                  the heading.
<td>                </td>                                         Defines a data cell.
                                  colspan="number"                Spans the cell across that many columns.
                                  rowspan="number"                Spans the cell across that many rows.
                                  align                           Alignment in the cell.
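To show how the table tags above fit together, the following sketch generates the markup for a small table from JavaScript arrays. The names buildTable and escapeText and the sample data are made up for this example; it uses only the table, tr, th (with scope="col"), and td tags and the border attribute.

```javascript
// Escape characters that would otherwise be read as markup.
function escapeText(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build an XHTML table: one heading row, then one row per data array.
function buildTable(headings, rows) {
  const lines = ['<table border="1">'];
  lines.push("  <tr>");
  headings.forEach(function (heading) {
    // scope="col": this heading applies to the cells below it.
    lines.push('    <th scope="col">' + escapeText(heading) + "</th>");
  });
  lines.push("  </tr>");
  rows.forEach(function (row) {
    lines.push("  <tr>");
    row.forEach(function (cell) {
      lines.push("    <td>" + escapeText(cell) + "</td>");
    });
    lines.push("  </tr>");
  });
  lines.push("</table>");
  return lines.join("\n");
}

console.log(buildTable(["Tag", "Purpose"], [["<tr>", "Table row"], ["<td>", "Data cell"]]));
```

Note that the tag names in the sample data are escaped (for example &lt;tr&gt;), so a browser would display them as text instead of interpreting them.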
Sample document
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<!-- greet.html
A trivial example -->
<head><title> Our First document </title>
</head>
<body>
<p>
Greetings from Web Master!
</p>
</body>
</html>
Scripting Languages
A scripting language is a programming language
that supports the writing of scripts.
Scripts are programs written for a special runtime
environment that can interpret and automate the
execution of tasks.
We usually embed these scripts in
HTML/XHTML documents to add functionality
to a web page, such as different menu styles or
graphic displays or to serve dynamic
advertisements.
These types are client-side scripting languages,
affecting the data that the end user sees in a
browser window.
The other scripting languages are server-side
languages that manipulate the data, usually in a
database, on the server.
JavaScript, ASP, JSP, PHP, Perl, Tcl and Python
are examples of scripting languages.
The original goal of JavaScript was to provide
programming capability at both the server and
client ends of a web connection.
JavaScript scripts are embedded, either directly or
indirectly, in XHTML documents. Scripts can
appear directly as the content of a <script> tag.
The type attribute of <script> must be set to
text/javascript.
A JavaScript script can be indirectly
embedded in an XHTML document using the src
attribute of a <script> tag, whose value is the
name of a file that contains the script.
<script type="text/javascript" src="tst_numbers.js"></script>
In JavaScript, identifiers, or names, are similar to those of other
programming languages: they must begin with a letter,
an underscore or a dollar sign ($). The letters in a variable
name in JavaScript are case sensitive.
JavaScript has a large collection of predefined
words, including alert, open, java and self.
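The naming rule above can be written as a regular expression. This sketch assumes ASCII-only names (JavaScript itself also accepts many Unicode letters), assumes that digits may follow the first character, and does not check for reserved or predefined words; isValidIdentifier is a hypothetical helper name.

```javascript
// First character: letter, underscore, or dollar sign.
// Remaining characters: letters, digits, underscores, or dollar signs.
function isValidIdentifier(name) {
  return /^[A-Za-z_$][A-Za-z0-9_$]*$/.test(name);
}

console.log(isValidIdentifier("_count")); // true
console.log(isValidIdentifier("$total")); // true
console.log(isValidIdentifier("sum1"));   // true
console.log(isValidIdentifier("2fast"));  // false (begins with a digit)
```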
Declaring variables:
A variable can be declared either by assigning it a
value, in which case the interpreter implicitly declares
it to be a variable, or by listing it in a declaration
statement that begins with the reserved word var.
Example:
var counter, index, pi = 3.14;
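A short sketch of what the declaration statement does in practice. The extra variable Counter is added here only to illustrate case sensitivity. Implicit declaration by assignment is left out because strict-mode JavaScript rejects it.

```javascript
// Declare three variables; only pi receives an initial value.
var counter, index, pi = 3.14;

console.log(typeof counter); // "undefined": declared but not yet assigned
console.log(pi);             // 3.14

// Names are case sensitive, so Counter is a different variable from counter.
var Counter = 10;
counter = 0;
index = counter + 1;

console.log(counter, Counter, index); // 0 10 1
```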
JavaScript models the XHTML document with the
Document object. The Document object has
several methods; its write method is used to
create script output, which is dynamically created
XHTML document content.
This content is specified in the parameter to write.
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<!-- A trivial example -->
<head><title> Our first scripting document </title>
</head>
<body>
<script type="text/javascript">
document.write("Hello, enjoy scripting languages");
</script>
</script>
</body>
</html>
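The same write method can produce content that depends on data. The sketch below is meant to run outside a browser, so it passes in a stand-in object with a write method in place of the real Document object. The names fakeDocument and writeGreetings and the sample names are assumptions for illustration; in a page, the script would call document.write directly.

```javascript
// Stand-in for the browser's Document object: it only collects
// whatever is passed to write. fakeDocument is hypothetical.
const fakeDocument = {
  output: "",
  write: function (markup) {
    this.output += markup;
  }
};

// Write one list item per name, as dynamically created content.
function writeGreetings(doc, names) {
  doc.write("<ul>");
  for (let i = 0; i < names.length; i++) {
    doc.write("<li>Hello, " + names[i] + "</li>");
  }
  doc.write("</ul>");
}

writeGreetings(fakeDocument, ["Ada", "Grace"]);
console.log(fakeDocument.output);
// <ul><li>Hello, Ada</li><li>Hello, Grace</li></ul>
```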
