
Introduction, HTTP

Academia Tehnică Militară, 2011

Web Development
- What is it?
  - Broad term for the work involved in developing a web site for the Internet (World Wide Web) or an intranet (a private network).
  - This can include web design, web content development, client liaison, client-side/server-side scripting, web server and network security configuration, and e-commerce development.
  - However, among web professionals, "web development" usually refers to the main non-design aspects of building web sites: writing markup and coding.
  - Web development can range from developing the simplest static single page of plain text to the most complex web-based internet applications, electronic businesses, or social network services.

Typical Areas
- Basic Technologies
  - HTTP, HTML, CSS, XML, XPath, XLink, XPointer, XSLT, ...
- Client Side Coding
  - JavaScript, Ajax, Flash, Microsoft Silverlight, ...
- Server Side Coding
  - ASP, ColdFusion, CGI, Perl, Groovy, Java EE, Lotus Domino, PHP, Python, Ruby, Smalltalk, Server-Side JavaScript, Websphere, .NET, ...
- Client Side + Server Side
  - Google Web Toolkit, Pyjamas, ...
- Database Technology
  - ...

The Web Is Dead. Long Live the Internet

What do we learn in this course?

- Basic Technologies
  - HTTP
  - URI
  - HTML
  - CSS
  - XML
  - XPath
  - XLink
  - XPointer
  - RSS
  - XSLT
- Client & Server Side Development
  - PHP
  - JavaScript
  - AJAX
  - Web Security
  - Web Services

The HTTP Protocol

HTTP
- Quick Facts:
  - HTTP stands for HyperText Transfer Protocol
  - Standardized by IETF
  - Current Version: HTTP/1.1
  - Relevant documents:
    - RFC2616 (June 1999)
    - RFC2068 (January 1997)
- What it is:
  - Application layer protocol for distributed, collaborative, hypermedia information systems
  - Request / Response, typical for the Client-Server model
  - Web browsers or spiders [user agents] act as clients and access resources located on an origin server.
    - In between, there can be intermediaries: proxies, gateways, tunnels
- HTTP requires reliable transport.
  - Usually TCP/IP is used

Definitions
- Hypertext
  - "Text displayed on a computer or other electronic device with references (hyperlinks) to other text that the reader can immediately access, usually by a mouse click or keypress sequence"
- Hypermedia
  - "Used as a logical extension of the term hypertext in which graphics, audio, video, plain text and hyperlinks intertwine to create a generally non-linear medium of information. This contrasts with the broader term multimedia, which may be used to describe non-interactive linear presentations as well as hypermedia."

The OSI Stack

Sample HTTP Exchange


GET /index.html HTTP/1.1
Host: www.example.com

HTTP/1.1 200 OK
Date: Mon, 23 May 2005 22:38:34 GMT
Server: Apache/1.3.3.7 (Unix) (Red-Hat/Linux)
Last-Modified: Wed, 08 Jan 2003 23:11:55 GMT
Etag: "3f80f-1b6-3e1cb03b"
Accept-Ranges: bytes
Content-Length: 438
Connection: close
Content-Type: text/html; charset=UTF-8

<html>
<head>
<title>Welcome to example.com</title>
</head>
..........................

Request / Response Message


- The request message contains the following:
  - Request line, such as GET /images/logo.png HTTP/1.1
  - Headers, such as Accept-Language: en
  - An empty line
  - An optional message body
- The response message contains the following:
  - The status line, such as HTTP/1.0 200 OK
  - Headers
  - An empty line
  - Message body

Message body
- An HTTP message may have a body of data sent after the header lines
  - In responses, usually the resource returned to the client
  - In requests, usually user entered data or uploaded files
- The following header lines are usually included if a body is present:
  - The Content-Type: header gives the MIME-type of the data in the body, such as text/html or image/gif.
  - The Content-Length: header gives the number of bytes in the body.
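The message layout above (start line, headers, empty line, body) can be sketched in Python as follows. This is a minimal illustration only: the function name parse_http_message and the sample bytes are assumptions made for this example, and real clients also deal with chunked transfer encoding and other cases not covered here.

```python
def parse_http_message(raw: bytes):
    """Split a raw HTTP/1.x message into start line, headers and body."""
    # The first empty line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # Content-Length, when present, gives the number of bytes in the body.
    if "content-length" in headers:
        body = body[: int(headers["content-length"])]
    return start_line, headers, body


# Hypothetical sample response, shortened for illustration.
raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=UTF-8\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<html></html>"
)
start_line, headers, body = parse_http_message(raw)
print(start_line, headers["content-type"], len(body))
```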

Header Lines
- Header lines provide information about the request or response, or about the object sent in the message body.
    Last-Modified: Fri, 31 Dec 1999 23:59:59 GMT
- Format defined in RFC822:
  - One line per header, of the form "Header-Name: value", ending with CRLF.
  - Details:
    - Headers should end in CRLF, but you should handle LF correctly.
    - The header name is not case-sensitive (though the value may be).
    - Any number of spaces or tabs may be between the ":" and the value.
    - Header lines beginning with space or tab are actually part of the previous header line, folded into multiple lines for easy reading.
  - Same format as in email and news postings
- HTTP 1.0 defines 16 headers (none mandatory)
- HTTP 1.1 defines 46 headers (Host: mandatory)
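The header rules listed above (case-insensitive names, optional whitespace after the colon, folded continuation lines, tolerance for bare LF) could be handled roughly as in the sketch below. The function name parse_headers and the X-Long header are hypothetical and exist only for this illustration.

```python
def parse_headers(text: str) -> dict:
    """Parse RFC822-style header lines into a dict keyed by lowercase name."""
    headers = {}
    last = None
    for line in text.splitlines():  # accepts CRLF as well as bare LF
        if not line:
            break  # an empty line ends the header section
        if line[0] in " \t" and last is not None:
            # Continuation line: part of the previous header (folding).
            headers[last] += " " + line.strip()
            continue
        name, _, value = line.partition(":")
        last = name.strip().lower()  # header names are not case-sensitive
        headers[last] = value.strip()  # spaces/tabs after ":" are ignored
    return headers


sample = (
    "Last-Modified: Fri, 31 Dec 1999 23:59:59 GMT\r\n"
    "X-Long:   first\n"
    "\tsecond\r\n"
    "\r\n"
)
parsed = parse_headers(sample)
print(parsed)
```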

HTTP Methods
- HEAD
  - Asks for the response identical to the one that would correspond to a GET request, but without the response body.
- GET
  - Requests a representation of the specified resource.
  - Note: GET should not be used for operations that cause side-effects.
- POST
  - Submits data to be processed (e.g., from an HTML form) to the identified resource.
  - Note: Data is sent in the body of the request
- PUT
  - Uploads a representation of the specified resource.
- DELETE
  - Deletes the specified resource.
- TRACE
  - Echoes back the received request, so that a client can see what intermediate servers are adding or changing in the request.
- OPTIONS
  - Returns the HTTP methods that the server supports for the specified URL.
- CONNECT
  - Converts the request connection to a transparent TCP/IP tunnel

Sample HTTP POST Exchange


POST /path/script.cgi HTTP/1.0
From: frog@jmarshall.com
User-Agent: HTTPTool/1.0
Content-Type: application/x-www-form-urlencoded
Content-Length: 32

home=Cosby&favorite+flavor=flies

HTTP/1.0 200 OK
..........................
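As a rough illustration of where the body and the Content-Length of the POST exchange come from, the form fields can be encoded with Python's standard urllib.parse.urlencode. The dictionary below is an assumption reconstructed from the body shown in the exchange.

```python
from urllib.parse import urlencode

# Form fields, as they might have been entered in an HTML form.
form = {"home": "Cosby", "favorite flavor": "flies"}

# application/x-www-form-urlencoded: key=value pairs joined by "&",
# with spaces encoded as "+".
body = urlencode(form)
content_length = len(body.encode("ascii"))

request = (
    "POST /path/script.cgi HTTP/1.0\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {content_length}\r\n"
    "\r\n"
    f"{body}"
)
print(request)
```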

Status line / Status code / Reason phrase


- The status code is meant to be computer-readable; the reason phrase is meant to be human-readable
    HTTP/1.0 404 Not Found
- The status code is a three-digit integer, and the first digit identifies the general category of response:
  - 1xx indicates an informational message only
  - 2xx indicates success of some kind
  - 3xx redirects the client to another URL
  - 4xx indicates an error on the client's part
  - 5xx indicates an error on the server's part
- Most common codes:
  - 200 OK, 404 Not Found, 301 Moved Permanently, 302 Found (Moved Temporarily in HTTP/1.0), 500 Internal Server Error, ...
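A small sketch of how a client might read a status line and derive the response category from the first digit of the code. The names CATEGORIES and parse_status_line are made up for this example.

```python
# Category of a response, keyed by the first digit of the status code.
CATEGORIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def parse_status_line(line: str):
    """Split 'HTTP/1.0 404 Not Found' into version, code, reason, category."""
    version, code, reason = line.split(" ", 2)
    code = int(code)
    return version, code, reason, CATEGORIES.get(code // 100, "unknown")


result = parse_status_line("HTTP/1.0 404 Not Found")
print(result)
```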

HTTP Proxies
- An HTTP Proxy mediates the access between the client browser and the destination web server
- Most proxies provide additional services: caching, authentication, authorization, etc.
- Two main differences when using a proxy:
  1. The browser sends the full URL (including protocol prefix http://)
  2. If HTTPS is used, the proxy acts as a pure TCP relay:
     - Browser sends CONNECT to the proxy
     - If the proxy authorizes the request, it responds with 200 OK
     - From here, the proxy keeps the connection open and acts as a pure TCP relay
- Web Application Proxies are tools for inspecting applications
  - Examples: WebScarab, Paros, ...

Cookies
- What are cookies?
  - A small piece of text stored on a user's computer by a web browser
  - It consists of one or more name-value pairs
  - It contains user information (preferences, shopping cart contents, etc.)
- Why Cookies?
  - Because the HTTP protocol is stateless
  - A server cannot identify a user that performs several requests
  - Without cookies it is nearly impossible to implement a shopping cart application

Cookies: How do they work?

GET /index.html HTTP/1.1
Host: www.example.org

HTTP/1.1 200 OK
Content-type: text/html
Set-Cookie: RMID=732423sdfs73242; expires=Fri, 31-Dec-2010 23:59:59 GMT; path=/; domain=.example.org

(content of page)

GET /spec.html HTTP/1.1
Host: www.example.org
Cookie: RMID=732423sdfs73242
Accept: */*

- name = value
  - Cookie content
- expires
  - Validity period of the cookie
- domain
  - The domain for which the cookie is valid
- path
  - The URL path for which it is valid
- Secure
  - Only submit it over HTTPS
- HttpOnly
  - Not possible to access it via JavaScript
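To make the attribute list concrete, the sketch below splits a Set-Cookie header value into the name=value pair and its attributes. It is a simplified, hand-written parser for illustration (the function name parse_set_cookie is an assumption); production code would normally rely on a library such as Python's http.cookies.

```python
def parse_set_cookie(header_value: str):
    """Return (name, value, attributes) for one Set-Cookie header value."""
    parts = [p.strip() for p in header_value.split(";")]
    # The first part is always the name=value pair (the cookie content).
    name, _, value = parts[0].partition("=")
    attributes = {}
    for part in parts[1:]:
        if not part:
            continue
        key, _, val = part.partition("=")
        # Flags such as Secure and HttpOnly carry no value.
        attributes[key.strip().lower()] = val.strip() if val else True
    return name, value, attributes


header = (
    "RMID=732423sdfs73242; expires=Fri, 31-Dec-2010 23:59:59 GMT; "
    "path=/; domain=.example.org; Secure; HttpOnly"
)
name, value, attributes = parse_set_cookie(header)
print(name, value, attributes)
```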

HTTPS
- HTTP is a plain text protocol
  - All data can be eavesdropped easily
  - HTTP is subject to man-in-the-middle attacks
- What is HTTPS?
  - The same protocol (HTTP)
  - SSL/TLS is used to provide:
    - Authentication (two-sided authentication possible)
    - Encryption of data
- PKI is used for authentication / encryption
  - A server certificate is mandatory
  - A client certificate is required for two-sided authentication
- Port 443 is used by default (instead of 80)
- HTTPS is specified in RFC2818 (May 2000)

HTTP Authentication
- What is it?
  - A mechanism included in HTTP for authenticating users
  - It is defined in RFC1945 (HTTP/1.0), RFC2616 (HTTP/1.1) and RFC2617 (HTTP Authentication)
- It uses the following authentication schemes:
  - Basic
    - Sends user credentials as Base64 encoded strings
  - NTLM
    - Challenge-response mechanism, using a version of the Windows NTLM protocol
  - Digest
    - Challenge-response mechanism using MD5 checksums
- Quick facts:
  - It is rare to see Web applications using HTTP Authentication on the Internet
    - However, it is common in corporate intranets
  - "Basic authentication is insecure" is a common myth
    - If used properly, it is at least as secure as other methods of authentication (e.g. form-based authentication)

How does it work?

HTTP Basic Authentication


User Agent:
    GET /private/index.html HTTP/1.0
    Host: localhost

Server:
    HTTP/1.0 401 UNAUTHORIZED
    Server: HTTPd/1.0
    Date: Sat, 27 Nov 2004 10:18:15 GMT
    WWW-Authenticate: Basic realm="Secure Area"
    Content-Type: text/html
    Content-Length: 311

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
    <HTML>
    <HEAD>
    <TITLE>Error</TITLE>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    </HEAD>
    <BODY><H1>401 Unauthorised.</H1></BODY>
    </HTML>

User Agent:
    GET /private/index.html HTTP/1.0
    Host: localhost
    Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==

Server:
    HTTP/1.0 200 OK
    Server: HTTPd/1.0
    Date: Sat, 27 Nov 2004 10:19:07 GMT
    Content-Type: text/html
    Content-Length: 10476
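The Authorization header in the Basic exchange is simply "user:password" encoded as Base64, which is an encoding and not encryption. The sketch below decodes the token from the exchange and rebuilds it; the helper names basic_auth_header and decode_basic are assumptions for this illustration.

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Build the value of an Authorization header for the Basic scheme."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def decode_basic(header_value: str):
    """Recover (user, password) from a Basic Authorization header value."""
    _scheme, _, token = header_value.partition(" ")
    decoded = base64.b64decode(token).decode("utf-8")
    user, _, password = decoded.partition(":")
    return user, password


credentials = decode_basic("Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==")
print(credentials)
```

Because anyone who sees the header can decode it, the scheme only protects credentials when the connection itself is encrypted.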

HTTP Digest Authentication


Server:
    HTTP/1.0 401 UNAUTHORIZED
    Server: HTTPd/1.0
    Date: Sat, 27 Nov 2004 10:18:15 GMT
    WWW-Authenticate: Digest realm="Protected", qop="auth,auth-int", nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093", opaque="5ccc069c403ebaf9f0171e9517f40e41"
    Content-Type: text/html
    Content-Length: 311

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
    <HTML>
    <HEAD>
    <TITLE>Error</TITLE>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    </HEAD>
    <BODY><H1>401 Unauthorised.</H1></BODY>
    </HTML>

User Agent:
    GET /private/index.html HTTP/1.0
    Host: localhost
    Authorization: Digest username="frank", realm="Protected", nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093", uri="/dir/index.html", qop=auth, nc=00000001, cnonce="0a4f113b", response="6629fae49393a05397450978507c4ef1", opaque="5ccc069c403ebaf9f0171e9517f40e41"

HTTP Digest Authentication


- Server message:
  - realm
    - Name of the authentication realm
  - qop
    - For compatibility reasons [specifies how the hash is calculated]
  - nonce
    - Uniquely generated at each response [included in the hash to avoid replay attacks]
- User-agent message:
  - username
    - Name of the user in clear text
  - realm
    - Name of the authentication realm
  - cnonce [optional]
    - Nonce, generated by the client
  - nc [optional]
    - Nonce count, the number of requests the user agent has sent with this nonce
  - response
    - The password, as MD5 [to avoid replay attacks, the hash contains nonce, cnonce and nc]
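How the response field combines the password with nonce, nc and cnonce can be sketched as below, following the qop=auth computation described in RFC2617. The function names and the password "secret" are hypothetical; the other values are taken from the exchange shown earlier, so the printed digest is not expected to match the one on the slide.

```python
import hashlib


def md5_hex(text: str) -> str:
    """MD5 checksum of a string, as lowercase hex."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri,
                    nonce, nc, cnonce, qop="auth"):
    """Compute the 'response' value for qop=auth (sketch after RFC2617)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # who is asking
    ha2 = md5_hex(f"{method}:{uri}")                 # what is requested
    # nonce, nc and cnonce make every response different (anti-replay).
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")


response = digest_response(
    username="frank",
    realm="Protected",
    password="secret",  # hypothetical password, not from the slides
    method="GET",
    uri="/dir/index.html",
    nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
    nc="00000001",
    cnonce="0a4f113b",
)
print(response)
```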

Conclusions
- HyperText Transfer Protocol
  - HTTP is a plain text protocol
  - HTTP is request/reply oriented
  - HTTP is stateless
  - Main HTTP methods are HEAD, GET, POST, ...
    - GET should be used in an idempotent* context
    - POST can be used in a non-idempotent context
- Cookies allow browsers to store preferences
  - Cookies are stored on the client side
- HTTP Authentication is a form of authentication built into the HTTP protocol

* Idempotence is the property of certain operations in mathematics and computer science that they can be applied multiple times without changing the result

Resources
- Web Development
  - Web Development, http://en.wikipedia.org/wiki/Web_development
  - Web Development timeline, http://en.wikipedia.org/wiki/File:Web_development_timeline.png
- HTTP
  - HTTP, http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol
  - HTTP tutorial, http://www.jmarshall.com/easy/http

URIs

URIs
- What are URIs?
  - A string of characters used to identify a name or a resource on the Internet.
  - Such identification enables interaction with representations of the resource over a network (typically the World Wide Web) using specific protocols.
  - Schemes specifying a concrete syntax and associated protocols define each URI.
- URL vs. URN
  - A URI can be a name (URN)
    - This works like a person's name
  - A URI can be a locator (URL)
    - This works like a person's address

An example
- The ISBN system for uniquely identifying books provides a typical example of the use of URNs.
- ISBN 0486275574 (urn:isbn:0-486-27557-4) cites unambiguously a specific edition of Shakespeare's play Romeo and Juliet.
- To gain access to this object and read the book, one needs its location: a URL address.
- A typical URL for this book on a Unix-like operating system would be a file path such as file:///home/username/RomeoAndJuliet.pdf, identifying the electronic book saved in a file on a local hard disk.
- URNs and URLs have complementary purposes.

Technical view
- Both URLs and URNs uniquely identify a resource
- A URL specifies the means of acting upon or obtaining the representation of the resource
  - For example, the URL http://www.wikipedia.org/ identifies a resource (Wikipedia's home page) and implies that a representation of that resource (such as the home page's current HTML code, as encoded characters) is obtainable via HTTP from a network host named www.wikipedia.org.
- A Uniform Resource Name (URN) is a URI that identifies a resource by name, in a particular namespace.
- One can use a URN to talk about a resource without implying its location or how to access it.
  - For example, the URN urn:isbn:0-395-36341-1 is a URI that specifies the identifier system, i.e. International Standard Book Number (ISBN), as well as the unique reference within that system and allows one to talk about a book, but doesn't suggest where and how to obtain an actual copy of it.

Generic Syntax
- RFC3986 defines the generic syntax for all URI schemes:
    <scheme name> : <hierarchical part> [ ? <query> ] [ # <fragment> ]
- Scheme name is terminated with a colon (:)
- The hierarchical part is intended for hierarchical information
  - It usually begins with //
  - Followed by an authority part (e.g. hostname)
  - Followed by an optional path
- Query contains additional information that is not hierarchical by nature
  - It is commonly organized as a sequence of <key>=<value> pairs
- Fragment is an optional part separated by a hash (#)
  - It holds additional identifying information

Examples: URL and URN


foo://username:password@example.com:8042/over/there/index.dtb?type=animal;name=ferret#nose

  scheme:      foo
  authority:   username:password@example.com:8042
    userinfo:  username:password
    hostname:  example.com
    port:      8042
  path:        /over/there/index.dtb
               (index is interpretable as filename, dtb as extension)
  query:       type=animal;name=ferret
  fragment:    nose

urn:example:animal:ferret:nose

  scheme:      urn
  path:        example:animal:ferret:nose
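The same decomposition can be reproduced with Python's standard urllib.parse.urlsplit, shown here as a quick way to experiment with the generic syntax. Only the variable name parts is introduced for this example.

```python
from urllib.parse import urlsplit

url = ("foo://username:password@example.com:8042"
       "/over/there/index.dtb?type=animal;name=ferret#nose")
parts = urlsplit(url)

# Generic components: scheme, authority (netloc), path, query, fragment.
print(parts.scheme)
print(parts.netloc)
print(parts.path)
print(parts.query)
print(parts.fragment)

# The authority can be broken down further.
print(parts.username, parts.password, parts.hostname, parts.port)
```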

- Some officially IANA-registered schemes:
  - fax, file, ftp, geo, http, https, ldap, mailto, news, nfs, sip, sms, tel, tv, urn
- Some unofficial but commonly used URI schemes:
  - about, chrome, cvs, feed, gtalk, irc, jar, javascript, skype, ssh, svn, ymsgr

Examples of URIs
- URNs (ISBN)
    urn:isbn:0451450523
- TEL
    tel:<phonenumber>
- SMS
    sms:<phone number>?<action>
    sms:+15105550101,+15105550102?body=hello%20there
- GEO
    geo:<lat>,<lon>[,<alt>][;u=<uncertainty>]
    geo:37.786971,-122.399677
- Many schemes cannot be categorized as either locators or names
  - The term URI is more appropriate in these cases

Encoding of URIs
- RFC1630 defines a uniform encoding scheme using ASCII characters
- The characters used are part of the ISO-Latin-1 set
  - 256 characters in total
  - Of these, the upper 128 characters and some other characters (e.g. space, @) are considered unsafe characters
  - For these, a %HH encoding is used, where HH is the character code according to the ISO-Latin-1 set, in hex
  - Example:
    - Ein schöner Name
    - Ein%20sch%F6ner%20Name
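The %HH encoding from the example can be reproduced with urllib.parse.quote, provided ISO-Latin-1 is selected explicitly (Python's default is UTF-8, which would give a different result for non-ASCII characters). A minimal sketch:

```python
from urllib.parse import quote, unquote

name = "Ein sch\u00f6ner Name"  # \u00f6 is the o-umlaut character

# Space and the o-umlaut are unsafe and become %HH (hex code in ISO-Latin-1).
encoded = quote(name, encoding="latin-1")
decoded = unquote(encoded, encoding="latin-1")

print(encoded)
print(decoded == name)
```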

URL: Generic syntax


- RFC1808 defines the following generic syntax for URLs:
    <scheme>://<host>/<path>;<parameter>?<query string>#<fragment>
- Host is specified in the following form:
    user:password@host:port
- Path is the access path in a hierarchical system
  - Most commonly, it is a path in a file system
- Parameter consists of a name-value pair
  - It is rarely used
  - One example is the ftp scheme:
      ;type=a (ftp TYPE command)

URL: Generic syntax


- Query string contains non-hierarchical information
  - It is organized as a sequence of <key>=<value> pairs
      ?operation=add&group=abc
- Fragment is not used to identify the resource itself, but to indicate a specific fragment or a part of it
  - It is commonly used in the http scheme to indicate a specific part of a web page to a browser
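A short sketch of how the key=value pairs of a query string can be read with Python's urllib.parse. The full URL and the fragment "results" are hypothetical and only serve to show query and fragment side by side.

```python
from urllib.parse import parse_qs, parse_qsl, urlsplit

# Hypothetical URL built around the query string shown above.
url = "http://example.com/groups?operation=add&group=abc#results"
parts = urlsplit(url)

pairs = parse_qsl(parts.query)   # ordered list of (key, value) tuples
as_dict = parse_qs(parts.query)  # dict; values are lists (keys may repeat)

print(pairs)
print(as_dict)
print(parts.fragment)
```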

ftp and file schemes


- FILE
    file://host/path
  - It is possible to specify the host
  - However, it is not possible to specify what protocol to use to access the resource
      file://opincaru.ro/usr/oppy/doc/phd.tex
      file://localhost/usr/oppy/doc/phd.tex
      file:///usr/oppy/doc/phd.tex
      file:///c:/Documents%20and%20Settings/oppy/My%20Documents/phd.doc
- FTP
    ftp://[<user>[:<password>]@]<host>[:<port>]/<url-path>
    ftp://anonymous:cristian.opincaru%40gmail.com@ftp.opincaru.ro/pub/phd/phd.zip
    ftp://ftp.opincaru.ro:21/pub/phd/phd.zip

The http scheme


- The http scheme is used to locate resources that are accessible via the HTTP protocol
    http://www.google.ro/search?q=web%20development&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a
    http://www.wunderground.com/cgi-bin/findweather/getForecast?query=bucharest&wuSelect=WEATHER

Relative URLs
- All URLs mentioned earlier were absolute URLs
- According to RFC1808, in the case of a relative URL the prefix is not given; it is taken from the context (base URL)
- Base URL (in order of precedence):
  - Specified in the resource:
      <HEAD>
        <TITLE>Example Document</TITLE>
        <BASE href="http://example.com/example.html">
      </HEAD>
  - If the resource is included in another resource
    - Specified in the external resource
  - The URL used to access the resource
  - Default
    - In this case the URL is considered absolute

Relative URLs
Base URL: http://r/a/b/c;p?q#f

Relative URL    Resolved URL
//xxx           http://xxx
/xxx            http://r/xxx
xxx             http://r/a/b/xxx
.               http://r/a/b/
..              http://r/a/
../xxx          http://r/a/xxx
;xxx            http://r/a/b/c;xxx
?xxx            http://r/a/b/c;p?xxx
#xxx            http://r/a/b/c;p?q#xxx
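Relative resolution against this base URL can be tried out with urllib.parse.urljoin. The sketch below covers the cases on which RFC1808 and the newer RFC3986 (which current Python versions follow) agree; the ";xxx" case is left out because the two specifications resolve it differently.

```python
from urllib.parse import urljoin

base = "http://r/a/b/c;p?q#f"

# Relative references and the absolute URLs they are expected to resolve to.
cases = {
    "//xxx": "http://xxx",
    "/xxx": "http://r/xxx",
    "xxx": "http://r/a/b/xxx",
    ".": "http://r/a/b/",
    "..": "http://r/a/",
    "../xxx": "http://r/a/xxx",
    "?xxx": "http://r/a/b/c;p?xxx",
    "#xxx": "http://r/a/b/c;p?q#xxx",
}

resolved = {relative: urljoin(base, relative) for relative in cases}
for relative, absolute in resolved.items():
    print(f"{relative:8} {absolute}")
```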


Design Failures
- Transcribable, but not easily spoken
  - aitch tee tee pee colon slash slash double-you double-you double-you dot eye see es dot you see eye dot ee dee you slash tilde fielding slash
- Hierarchical path assumes only one root
  - not true for FTP resources, leading to ambiguity
  - gopher path isn't layered left-to-right
- Reliance on DNS as only naming authority
  - vanity hostname explosion
  - flat namespace under dot com

(presentation of Roy Fielding, 1999)

Conclusions
- URI = URL + URN
- The generic syntax for URLs:
    <scheme>://<host>/<path>;<parameter>?<query string>#<fragment>
- Relative URLs have the prefix omitted
  - The prefix is extracted from the context (base URL)

References
- URI
  - URIs, http://en.wikipedia.org/wiki/Uniform_Resource_Identifier
  - Relative URIs, http://www.webreference.com/html/tutorial2/3.html
  - Presentation of Roy Fielding, http://www.ics.uci.edu/~fielding/talks/uri_twist99/uri_twist99.ppt
  - Article about URIs, http://www.ltg.ed.ac.uk/~ht/WhatAreURIs/


Hypertext Markup Language



Introduction
- What is HTML?
  - HTML stands for Hyper Text Markup Language
  - HTML is not a programming language, it is a markup language
  - A markup language is a set of markup tags
  - HTML uses markup tags to describe web pages
- HTML Tags
  - HTML tags are keywords surrounded by angle brackets like <html>
  - HTML tags normally come in pairs like <b> and </b>
  - The first tag in a pair is the start tag, the second tag is the end tag
  - Start and end tags are also called opening tags and closing tags

Markup
- A markup language is a modern system for annotating a text in a way that is syntactically distinguishable from that text.
- The idea and terminology evolved from the "marking up" of manuscripts, i.e. the revision instructions by editors, traditionally written with a blue pencil on authors' manuscripts.

Document Structure
<html>
  <head>
    <title>My Page</title>
  </head>
  <body>
    <h1>My First Heading</h1>
    <p>My first paragraph.</p>
  </body>
</html>

HTML Basics
<h1>This is a heading</h1>
<h2>This is a heading</h2>
<h3>This is a heading</h3>

<p>This is a paragraph</p>
<p>This is another paragraph</p>

<a href="http://www.w3schools.com">This is a link</a>

<img src="w3schools.jpg" width="104" height="142" />

HTML Elements
- Syntax
  - An HTML element starts with a start tag / opening tag
  - An HTML element ends with an end tag / closing tag
  - The element content is everything between the start and the end tag
  - Some HTML elements have empty content
  - Empty elements are closed in the start tag
  - Most HTML elements can have attributes
- What if the end tag is missing?
- Empty HTML elements
- Lower case vs. Upper case

HTML Attributes
- HTML Attributes
  - HTML elements can have attributes
  - Attributes provide additional information about the element
  - Attributes are always specified in the start tag
  - Attributes come in name/value pairs like: name="value"

    <a href="http://www.w3schools.com">This is a link</a>

- Quotes in HTML attributes: both " and ' are allowed
- Lowercase vs. Uppercase
- HTML Core attributes:

Other standard HTML attributes


Language attributes

Keyboard attributes

Headings
- HTML Headings
  - Must show document structure, not display mode
  - Search engines use this information for indexing

    <h1>This is a heading</h1>
    <h2>This is a heading</h2>

- HTML Rules
- HTML Comments

    <p>This is a paragraph</p>
    <hr />
    <!-- This is a comment -->

Paragraphs
- Paragraphs
  - Browsers add an empty line before and after a paragraph
- Line breaks
  - Start a new line without defining a paragraph

    <p>This is <br> a paragraph</p>
    <p>This is <br /> a paragraph</p>

Formatting
- Formatting tags
  - HTML uses tags like <b> and <i> for formatting output, like bold or italic text.
  - These HTML tags are called formatting tags.

Formatting tags

Styles
- Why?
  - Provide a common way to style all HTML elements
  - Introduced in HTML 4 to separate content from display
  - Can be specified either through an attribute or a stylesheet

    style="background-color:yellow"
    style="font-size:10px"
    style="font-family:Times"
    style="text-align:center"

- Deprecated tags (should be avoided)

Links
- Hyperlink
  - In web terms, a hyperlink is a reference (an address) to a resource on the web.
  - Can be any resource type: movie, picture, text, etc.
- Anchor
  - An anchor is a term used to define a hyperlink destination inside a document.
- The HTML anchor element <a> is used to define both hyperlinks and anchors.

    <a href="http://www.w3schools.com/" target="_blank">Visit W3Schools!</a>

- href attribute
  - The target address
- target attribute
  - Where the link should open
- name attribute
  - The name of the anchor (for named anchors)
  - If a browser cannot find a named anchor that has been specified, it goes to the top of the document.

Link examples
<a href="http://www.w3schools.com/" target="_blank">Visit W3Schools!</a>
<a href="http://www.w3schools.com/" target="_top">Click here</a>

<a name="tips">Useful Tips Section</a>
<a name="tips" />
<a href="#tips">Jump to the Useful Tips Section</a>
<a href="http://www.w3schools.com/html_tutorial.htm#tips">Jump to the Useful Tips Section</a>

<a href="lastpage.htm">
  <img border="0" src="buttonnext.gif" width="65" height="38">
</a>

<a href="mailto:someone@microsoft.com?subject=Hello%20again">Send Mail</a>

Images
- The <img> tag
  - Empty tag, the image is referred to via src
- The alt attribute
  - Alternate text (in case the image cannot be displayed)

    <img src="boat.gif" alt="Big Boat" />

- Background images
  - Use the background attribute

    <html>
    <body background="background.jpg">

- Image maps
  - Used to define clickable regions inside an image

Image maps
<img src="planets.gif" width="145" height="126" usemap="#planetmap">
<map id="planetmap" name="planetmap">
  <area shape="rect" coords="0,0,82,126" alt="Sun" href="sun.htm">
  <area shape="circle" coords="90,58,3" alt="Mercury" href="mercur.htm">
  <area shape="circle" coords="124,58,8" alt="Venus" href="venus.htm">
</map>

Tables
- Tables
  - Are defined with a <table> tag
  - A table is divided in rows (<tr> tag)
  - Each row is divided into cells (data, <td> tag)
  - Cells may contain text, images, paragraphs, forms, tables, etc.

<table border="1">
  <tr>
    <td>row 1, cell 1</td>
    <td>row 1, cell 2</td>
  </tr>
  <tr>
    <td>row 2, cell 1</td>
    <td>row 2, cell 2</td>
  </tr>
</table>

Table tags

- thead, tbody, and tfoot are seldom used because of bad browser support
- Cells with no content are not displayed in most browsers
- To force the browser to display a cell, use &nbsp;

Tables with headings


<h4>Table headers:</h4>
<table border="1">
  <tr>
    <th>Name</th>
    <th>Telephone</th>
    <th>Telephone</th>
  </tr>
  <tr>
    <td>Bill Gates</td>
    <td>555 77 854</td>
    <td>555 77 855</td>
  </tr>
</table>

<h4>Vertical headers:</h4>
<table border="1">
  <tr>
    <th>First Name:</th>
    <td>Bill Gates</td>
  </tr>
  <tr>
    <th>Telephone:</th>
    ...................

Cells spanning
<table border="1">
  <tr>
    <th>Name</th>
    <th colspan="2">Telephone</th>
  </tr>
  <tr>
    <td>Bill Gates</td>
    <td>555 77 854</td>
    <td>555 77 855</td>
  </tr>
</table>

<table border="1">
  <tr>
    <th>First Name:</th>
    <td>Bill Gates</td>
  </tr>
  <tr>
    <th rowspan="2">Telephone:</th>
    <td>555 77 854</td>
  </tr>
  <tr>
    <td>555 77 855</td>
  </tr>
</table>

More tables

Lists
- Unordered lists (bullets, different types)
  - <ul>, <li>
- Ordered lists (numbered, different types)
  - <ol>, <li>
- Definition lists
  - <dl>, <dt>, <dd>
- Lists can be nested with several levels

Forms
- Forms
  - Are used to collect user input
  - A form is defined with the <form> tag
  - A form may contain different input types:
    - Text fields
    - Radio buttons
    - Checkboxes
    - Dropdown box
    - Text area

    <form name="input" action="html_form_submit.asp" method="get">
      Username: <input type="text" name="user" />
      <input type="submit" value="Submit" />
    </form>

- The action attribute specifies where the information is sent
- Forms usually contain a Submit button
- Forms may also be submitted via scripting

Form tags

Colors
- Colors
  - Are defined via the hex codes of the Red, Green and Blue components
  - Hex values are written as three two-digit numbers, starting with a # sign.
      #000000 for black
      #FFFFFF for white
- 16 colors are defined by name in HTML and CSS
  - aqua, black, blue, fuchsia, gray, green, lime, maroon, navy, olive, purple, red, silver, teal, white, and yellow.
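The hex notation for colors can be illustrated with a small conversion sketch: each pair of hex digits is one color component between 0 and 255. The helper names hex_to_rgb and rgb_to_hex are made up for this example.

```python
def hex_to_rgb(color: str):
    """Convert '#RRGGBB' into a (red, green, blue) tuple of integers."""
    digits = color.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected a color of the form #RRGGBB")
    # Each two-digit hex number is one component: red, green, blue.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(red: int, green: int, blue: int) -> str:
    """Convert three 0..255 components back into '#RRGGBB'."""
    return "#{:02X}{:02X}{:02X}".format(red, green, blue)


print(hex_to_rgb("#000000"))
print(hex_to_rgb("#FFFFFF"))
print(rgb_to_hex(255, 255, 0))
```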

Layout
- For the layout of the page one can use:
  - Tables (remember, table cells can contain many HTML elements)
  - Frames
  - Floats (CSS)
- The website below is designed using tables:

Frames
- Frames
  - With frames you can display more than one HTML document in the same page
  - Each document is called a frame and is independent from the others
  - Disadvantage: printing is difficult
- Frameset
  - A frameset divides the browser window in rows or columns
  - One can specify the space occupied by each frame
    - As percentage (e.g. cols="25%,75%")
    - As fixed width in pixels (e.g. cols="200,500")
- <noframes> can be specified for browsers with no frames support
- <body> cannot be used together with frames
- However, <iframe> can be used inside an HTML document

Frames
<html>
  <frameset rows="50%,50%">
    <frame src="frame_a.htm">
    <frameset cols="25%,75%">
      <frame src="frame_b.htm">
      <frame src="frame_c.htm">
    </frameset>
  </frameset>
</html>

Inline frames
<html>
  <body>
    <iframe src="default.html"> </iframe>
    <p>Some older browsers don't support iframes.</p>
    <p>If they don't, the iframe will not be visible.</p>
  </body>
</html>

Character entities
- Some characters are reserved for HTML (e.g. <)
  - If used, they can be mistaken for markup
- A character entity looks like this:
  - &entity_name; OR &#entity_number;
- To display a less than sign we must write: &lt; or &#60;
- Commonly used character entities:
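As a quick way to experiment with character entities, Python's standard html module converts reserved characters to entities and back. This is an illustrative sketch; the sample strings are arbitrary.

```python
import html

# Reserved characters are replaced by named entities.
escaped = html.escape("a < b & c")

# Both the named form and the numeric form decode to the same character.
from_name = html.unescape("&lt;")
from_number = html.unescape("&#60;")

print(escaped)
print(from_name, from_number)
```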

Head & Meta


- Contains meta-information about the document
- Elements present in the <head> are not displayed by the browser
- The <meta> element
  - Used by some search engines when indexing the pages
  - However, because of abuse, some engines no longer use meta-tags
  - Stores information about author, content, license, etc.
  - Redirects
  - There can also be custom meta-data

<meta name="description" content="Free Web tutorials on HTML and XHTML" />
<meta name="keywords" content="HTML, DHTML, XHTML, JavaScript" />
<meta name="author" content="Jan Egil Refsnes">
<meta http-equiv="Refresh" content="5;url=http://www.w3schools.com/">
<meta name="security" content="low" />

Conclusions

HTML 3.2, HTML 4.0, XHTML


- HTML 3.2
  - Original HTML was not intended to contain style information
  - Markup was designed for content, not style (e.g. <h1>, <p>, ...)
  - HTML 3.2 added tags like <font> and color attributes, which made web pages hard to develop and keep consistent
- HTML 4.0
  - Main goal: formatting can be completely removed from the document and kept in a separate stylesheet
  - Separation between the presentation and document structure
- XHTML
  - Is almost identical to HTML 4.01
  - Is a stricter and cleaner version of HTML
  - Is an XML language (can be validated)

HTML vs. XHTML


- Most important differences
  - XHTML elements must be properly nested
  - XHTML elements must always be closed
  - XHTML elements must be in lowercase
  - XHTML documents must have one root element
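Because XHTML is an XML language, three of these rules (proper nesting, closed elements, a single root element) are ordinary XML well-formedness rules and can be checked with any XML parser. The sketch below uses Python's xml.etree.ElementTree; the function name is_well_formed is an assumption, and the check does not cover the lowercase rule or validation against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as a well-formed XML document."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Closed, properly nested, single root element.
print(is_well_formed("<html><body><p>This is <br /> a paragraph</p></body></html>"))
# <br> is never closed.
print(is_well_formed("<p>This is <br> a paragraph</p>"))
# Elements are not properly nested.
print(is_well_formed("<b><i>text</b></i>"))
# Two root elements.
print(is_well_formed("<p>one</p><p>two</p>"))
```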

HTML5
- New features
  - The canvas element for drawing
  - The video and audio elements for media playback
  - Better support for local offline storage
  - New content specific elements, like article, footer, header, nav, section
  - New form controls, like calendar, date, time, email, url, search
- Status:
  - Working Draft at W3C (25 May 2011)
  - The latest versions of all major browsers (Safari, Chrome, Firefox, Opera and Internet Explorer 9) support some HTML5 features

References
- HTML on Wikipedia
  - http://en.wikipedia.org/wiki/HTML
- HTML 4.01 Specification
  - http://www.w3.org/TR/html4
- HTML Tutorial on w3schools
  - http://www.w3schools.com/html/default.asp
- HTML5 Tutorial on w3schools
  - http://www.w3schools.com/html5/default.asp
- HTML Tags on w3schools
  - http://www.w3schools.com/tags/default.asp
- Very good example of Image Maps
  - http://cliptank.com/PeopleofInfluencePainting.htm

Cheat Sheet

Acknowledgement
Slides containing the w3schools.com logo contain information taken from www.w3schools.com The use is permitted for academic purposes and these materials shall not be freely made available on the internet.

Cascading Style Sheets



Introduction [1]
- What is CSS?
  - CSS stands for Cascading Style Sheets
  - Styles define how to display HTML elements
  - Styles were added to HTML 4.0 to solve a problem
  - External Style Sheets can save a lot of work
  - External Style Sheets are stored in CSS files
  - CSS is mostly used with HTML; however, it can be applied to any XML
- History
  - Style sheets have been around since SGML (1970s), in one form or another
  - CSS level 1 was published by W3C in 1996
  - CSS level 2 was published by W3C in 1998
    - CSS 2.1: Recommendation on 07 June 2011
  - CSS level 3 was started in 2005 and is still under development
- Adoption:
  - Internet Explorer 5.0 / Mac was the first to achieve full support of CSS 1
  - As of August 2010, no browser fully implemented CSS2

Introduction [2]
- Why cascading?
  - The style of one document can be influenced by several style sheets
  - One style can inherit or cascade from another
- Main advantages
  - Separation of content from presentation
  - Site wide consistency
  - Bandwidth
  - Table-less design
  - Accessibility
    - Web page design using tables changes the structure of the document
    - For human readers it makes no difference; however, special user agents (spiders, text-to-speech, etc.) have difficulties
  - Maintainability
    - A single style file can be used for a whole web site

Syntax
selector [, selector2, ...][:pseudo-class] { property: value; [property2: value2; ...] } /* comment*/

Selector
Usually, the HTML element / tag you wish to define

Property
The attribute you wish to change Several properties can be specified
p { text-align:center; color:black; font-family:arial }

Selectors
The class selector
p.right {text-align:right}
p.center {text-align:center}
.center {text-align:center} /* all elements of one class */

<p class="right">This paragraph will be right-aligned.</p>
<p class="center">This paragraph will be center-aligned.</p>
<p class="center bold">Two styles</p>

The id selector
#green {color:green}
p#para1 { text-align:center; color:red }

<p id="para1">This paragraph will be center-aligned and red.</p>
<p>This paragraph is not affected by the style.</p>

External, Internal, Inline


External style sheet
<head> <link rel="stylesheet" type="text/css" href="mystyle.css" /> </head>

Internal style sheet


<head>
<style type="text/css">
hr {color:sienna}
p {margin-left:20px}
body {background-image:url("images/back40.gif")}
</style>
</head>

Inline style
<p style="color:sienna;margin-left:20px">This is a paragraph.</p>

Multiple styles. Cascading order.


Multiple styles
Several styles can affect the same element In this case, the values will be inherited from the more specific style

Cascading order
From lowest to highest priority:
1. Browser default
2. External style sheet
3. Internal style sheet
4. Inline style

/* External */
h3 { color:red; text-align:left; font-size:8pt }

/* Internal */
h3 { text-align:right; font-size:20pt }

/* Inline */
<h3 style="font-size:16pt">My heading</h3>

Resulting style for this heading: color:red; text-align:right; font-size:16pt

Background
Properties
background-color background-image background-repeat background-attachment background-position

body { background-color:#ffffff; background-image:url('img_tree.png'); background-repeat:no-repeat; background-position:top right; } body { background:#ffffff url('img_tree.png') no-repeat top right; }

Text
Text color
body {color:blue}
h1 {color:#00ff00}
h2 {color:rgb(255,0,0)}

Text alignment
h1 {text-align:center}
p.date {text-align:right}
p.main {text-align:justify}

Text decoration
h1 {text-decoration:overline}
h2 {text-decoration:line-through}
h3 {text-decoration:underline}
h4 {text-decoration:blink}

Text transformation
p.uppercase {text-transform:uppercase}
p.lowercase {text-transform:lowercase}
p.capitalize {text-transform:capitalize}

Text indentation
p {text-indent:50px}

Other: vertical align, letter spacing, word spacing, text direction, ...

Font
Font family
p { font-family:"Times New Roman",Georgia,Serif; }

Font style
p.normal {font-style:normal}
p.italic {font-style:italic}
p.oblique {font-style:oblique}

Font size
h1 {font-size:40px}
h2 {font-size:30px}
p {font-size:14px}

Using em
h1 {font-size:2.5em} /* 40px/16=2.5em */
h2 {font-size:1.875em} /* 30px/16=1.875em */
p {font-size:0.875em} /* 14px/16=0.875em */

Font [2]
Generic property

p.ex1 { font:15px arial,sans-serif; } /* the shorthand needs at least a size and a family */
p.ex2 { font:italic bold 12px/30px arial,sans-serif; }

The Box Model


All HTML elements can be considered as boxes
Margin - Clears an area around the border. The margin does not have a background color, and it is completely transparent
Border - A border that lies around the padding and content. The border is affected by the background color of the box
Padding - Clears an area around the content. The padding is affected by the background color of the box
Content - The content of the box, where text and images appear

Size: when setting the size you only specify the size of the content

Border
Border style
none, dotted, dashed, solid, double, groove, ridge, inset, outset

Border width
The width of the border

Border color
The color of the border
p { border-top-style:dotted; border-right-style:solid; border-bottom-style:dotted; border-left-style:solid; } p { border-style:solid; border-top:thick double #ff0000; }

p { border:5px solid red; }

Margin
Possible values

p { margin-top:100px; margin-bottom:100px; margin-right:50px; margin-left:50px; }

p { margin:100px 50px; }

Padding
Possible values

p { padding-top:25px; padding-bottom:25px; padding-right:50px; padding-left:50px; }

p { padding:25px 50px; }

Lists
list-style-type
none, disc, circle, square, decimal, decimal-leading-zero, armenian, georgian, lower-alpha, upper-alpha, lower-greek, lower-latin, upper-latin, lower-roman, upper-roman, inherit

list-style-image

ul { list-style-image:url('arrow.gif'); list-style-type:square; }

Other possibility:

ul { list-style-type:none; padding:0px; margin:0px; } li { background-image:url(arrow.gif); background-repeat:no-repeat; background-position:0px 5px; padding-left:14px; }

Table

Advanced features
Sizes, positioning, pseudo elements/classes

Sizes

height
<head> <style type="text/css"> img.normal {height:auto} img.big {height:120px} </style> </head> <body> <img class="normal" src="logocss.gif" width="95" height="84" /><br /> <img class="big" src="logocss.gif" width="95" height="84" />

max-width [in percentage]


<html>
<head>
<style type="text/css">
p { max-width: 50% }
</style>
</head>
<body>
<p>This is some text. This is some text. This is some text. This is some text. This is some text. This is some text. This is some text. This is some text. This is some text.</p>
</body>
</html>

display / visibility
display
Sets how/if an element is displayed
Possible values: none, inline, block, list-item, ...

visibility
Sets if an element should be visible or invisible
Possible values: visible, hidden, collapse

Block elements
A block element is an element that takes up the full width available, and has a line break before and after it. Examples: <h1>, <p>, <div>

Inline elements
An inline element only takes up as much width as necessary, and does not force line breaks. Examples: <a>, <span>

display / visibility
<html> <head> <style type="text/css"> h1.hid {visibility:hidden} </style> </head> <body> <h1>This is a visible heading</h1> <h1 class="hid">This is a hidden heading</h1> <p>Notice that the hidden heading still takes up space.</p> </body> </html>

Positioning [1]
Static positioning
HTML elements are positioned static by default. A static positioned element is always positioned according to the normal flow of the page. Static positioned elements are not affected by the top, bottom, left, and right properties.

Fixed positioning
An element with fixed position is positioned relative to the browser window.
Fixed positioned elements are removed from the normal flow.
The document and other elements behave as if the fixed positioned element did not exist.
Fixed positioned elements can overlap other elements.
p.pos_fixed { position:fixed; top:30px; right:5px; }

Positioning [2]
Relative positioning
A relatively positioned element is positioned relative to its normal position.
The content of a relatively positioned element can be moved and can overlap other elements, but the space reserved for the element is still preserved in the normal flow.
h2.pos_left { position:relative; left:-20px; }

Absolute positioning
An absolutely positioned element is positioned relative to the nearest ancestor element that has a position other than static.
If no such element is found, the containing block is <html>.
Absolutely positioned elements are removed from the normal flow.
The document and other elements behave as if the absolutely positioned element did not exist.
Absolutely positioned elements can overlap other elements.
h2 { position:absolute; left:100px; top:150px; }

Overlapping elements
z-index
Specifies the stack order of an element. An element with greater stack order is always in front of an element with a lower stack order.

img { position:absolute; left:0px; top:0px; z-index:-1; } <h1>This is a heading</h1> <img src="../images/w3css.png" width="100" height="140" /> <p>Because the image has a z-index of -1, it will be placed behind the text.</p>

Floating
How?
Elements are floated horizontally: an element can only be floated left or right, not up or down.
The elements after the floating element will flow around it.
The elements before the floating element will not be affected.

<style type="text/css">
img { float:right; }
</style>
<p>In the paragraph below, we have added an image with style <b>float:right</b>. The result is that the image will float to the right in the paragraph.</p>
<p>
<img src="logocss.gif" width="95" height="84" />
This is some text. This is some text. This is some text. ....................

Floating

<style type="text/css">
.thumbnail { float:left; width:110px; height:90px; margin:5px; }
.text_line { clear:both; margin-bottom:2px; }
</style>

Align
Using the margin property
p.right { margin-left:auto; width:70%; background-color:#b0e0e6; }

Using the position property
p.right { position:absolute; right:0px; width:300px; background-color:#b0e0e6; }

Using the float property
p.right { float:right; width:300px; background-color:#b0e0e6; }

Pseudo-class
CSS pseudo-classes are used to add special effects to some selectors.
selector:pseudo-class {property:value} selector.class:pseudo-class {property:value}

Examples:
a:link {color:#FF0000}    /* unvisited link */
a:visited {color:#00FF00} /* visited link */
a:hover {color:#FF00FF}   /* mouse over link */
a:active {color:#0000FF}  /* selected link */

p:first-child { /* a paragraph that is the first child of its parent */
  color:blue;
}
p > i:first-child { /* an i element that is the first child of a paragraph */
  font-weight:bold
}

Pseudo-elements
CSS pseudo-elements are used to add special effects to some selectors.
selector:pseudo-element {property:value} selector.class:pseudo-element {property:value}

Examples:

p:first-line { color:#ff0000; font-variant:small-caps; }

p:first-letter { color:#ff0000; font-size:xx-large; }

Media property
Media Types allow you to specify how documents will be presented in different media. The document can be displayed differently on the screen, on the paper, with an aural browser, etc.
<html>
<head>
<style>
@media screen { p.test {font-family:verdana,sans-serif;font-size:14px} }
@media print { p.test {font-family:times,serif;font-size:10px} }
@media screen,print { p.test {font-weight:bold} }
</style>
</head>

Conclusions

CSS 1, 2, 3, ...
CSS Level 1
Font properties such as typeface and emphasis
Color of text, backgrounds, and other elements
Text attributes such as spacing between words, letters, and lines of text
Alignment of text, images, tables and other elements
Margin, border, padding, and positioning for most elements
Unique identification and generic classification of groups of attributes

CSS Level 2
Superset of CSS Level 1
Positioning (absolute, relative, fixed, z-index)
Support for media types
Some new properties (font, etc.)

CSS Level 2, Revision 1 (CSS 2.1)
Removes some poorly implemented features
Adds some already implemented browser extensions
Candidate Recommendation of W3C as of 19 July 2007

CSS Level 3
Under development
Will consist of several separate recommendations
As of March 2011, there are over 40 CSS modules published by the CSS Working Group

Limitations
Vertical control limitations
While horizontal placement of elements is generally easy to control, vertical placement is frequently unintuitive, convoluted, or impossible.

Absence of expressions
There is currently no ability to specify property values as simple expressions (such as margin-left: 10% - 3em + 4px;).

Lack of orthogonality
Multiple properties often end up doing the same job. For instance, position, display and float specify the placement model, and most of the time they cannot be combined meaningfully.

Control of element shapes


CSS currently only offers rectangular shapes. Rounded corners or other shapes may require non-semantic markup.

Inconsistent browser support


Different browsers will render CSS layout differently as a result of browser bugs or lack of support for CSS features.

Inspecting styles and layouts


Firebug (plugin for Firefox):
see www.getfirebug.com

References
CSS on Wikipedia
http://en.wikipedia.org/wiki/CSS

CSS property reference


http://www.w3schools.com/css/css_reference.html

CSS tutorial on w3schools.com


http://www.w3schools.com/css/

CSS specification on W3C


http://www.w3.org/Style/CSS/

Tableless Web Design


http://en.wikipedia.org/wiki/Tableless_web_design.html

Cheat Sheet

PHP
Academia Tehnică Militară, 2011

Introduction
What is PHP?
PHP stands for PHP: Hypertext Preprocessor
PHP is a server-side scripting language, like ASP
PHP scripts are executed on the server
PHP supports many databases (MySQL, Informix, Oracle, etc.)
PHP is open source software

What is a PHP file?


PHP files can contain text, HTML tags and scripts
PHP files are returned to the browser as plain HTML
PHP files have a file extension of ".php", ".php3", or ".phtml"

Why PHP?
PHP runs on different platforms (Windows, Linux, Unix, etc.)
PHP is compatible with almost all servers used today (Apache, IIS, etc.)
PHP is FREE to download
PHP is easy to learn and runs efficiently on the server side

History [1]
PHP originally stood for Personal Home Page

History [2]

The past: CGI


#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>

#define ishex(x) (((x) >= '0' && (x) <= '9') || ((x) >= 'a' && (x) <= 'f') || ((x) >= 'A' && (x) <= 'F'))

/* Convert the two hex digits at s to their integer value */
int htoi(char *s) {
    int value;
    char c;

    c = s[0];
    if (isupper(c)) c = tolower(c);
    value = (c >= '0' && c <= '9' ? c - '0' : c - 'a' + 10) * 16;

    c = s[1];
    if (isupper(c)) c = tolower(c);
    value += c >= '0' && c <= '9' ? c - '0' : c - 'a' + 10;

    return (value);
}

The past: CGI [2]


int main(int argc, char *argv[]) {
    char *params, *data, *dest, *s, *tmp;
    char *name = "", *age = "";   /* empty until found in the query string */

    /* HTTP header, followed by a blank line */
    puts("Content-type: text/html\r\n");
    puts("<html><head><title>Form Example</title></head>");
    puts("<body><h1>My Example Form</h1>");
    puts("<form action=\"form.cgi\" method=\"GET\">");
    puts("Name: <input type=\"text\" name=\"name\">");
    puts("Age: <input type=\"text\" name=\"age\">");
    puts("<br><input type=\"submit\">");
    puts("</form>");

The past: CGI [3]


    data = getenv("QUERY_STRING");
    if (data && *data) {
        params = data;
        dest = data;
        /* URL-decode the query string in place */
        while (*data) {
            if (*data == '+')
                *dest = ' ';
            else if (*data == '%' && ishex(*(data+1)) && ishex(*(data+2))) {
                *dest = (char) htoi(data + 1);
                data += 2;
            } else
                *dest = *data;
            data++;
            dest++;
        }
        *dest = '\0';
        /* Split into name=value pairs */
        s = strtok(params, "&");
        do {
            tmp = strchr(s, '=');
            if (tmp) {
                *tmp = '\0';
                if (!strcmp(s, "name")) name = tmp+1;
                else if (!strcmp(s, "age")) age = tmp+1;
            }
        } while ((s = strtok(NULL, "&")) != NULL);
        printf("Hi %s, you are %s years old\n", name, age);
    }
    puts("</body></html>");
}

The past: Perl


The same, in Perl using CGI.pm:
use CGI qw(:standard);

print header;
print start_html('Form Example'),
      h1('My Example Form'),
      start_form,
      "Name: ", textfield('name'), p,
      "Age: ", textfield('age'), p,
      submit,
      end_form;
if (param()) {
    print "Hi ", em(param('name')),
          ", you are ", em(param('age')),
          " years old";
}
print end_html;

PHP Alternative
<html>
<head>
<title>Form Example</title>
</head>
<body>
<h1>My Example Form</h1>
<form action="form.phtml" method="POST">
Name: <input type="text" name="name">
Age: <input type="text" name="age">
<br><input type="submit">
</form>
<?php if (isset($_POST['name'], $_POST['age'])): ?>
Hi <?php echo htmlspecialchars($_POST['name']) ?>, you are <?php echo htmlspecialchars($_POST['age']) ?> years old
<?php endif ?>
</body>
</html>

Template processing
PHP is server-side and all PHP tags will be replaced by the server before anything is sent to the web browser.
<html> <?php echo "Hello World" ?> </html>

<html> Hello World </html>

Template Processing Technologies


Other template processing languages:
Cold Fusion (Adobe)
Active Server Pages (Microsoft)
Java Server Pages (Sun / Java)
Ruby on Rails (Open Source)
Velocity (Apache Software Foundation)

Tags in PHP
Four types of tags available:
Short-style
Long-style
ASP-style
Block-style

<html>
<body>
<? echo 'Short Tags - Most common' ?>
<br />
<?php echo 'Long Tags - Portable' ?>
<br />
<%= 'ASP Tags' %>
<br />
<script language="php">
echo 'Really Long Tags - rarely used';
</script>
<br />
</body>
</html>

Language Basics
<?php // Variables
$foo = 1;
$bar = "Testing";
$xyz = 3.14;
$foo = $foo + 1;
?>

<?php // Functions
phpinfo();
foo();
$len = strlen($foo);
?>

<?php // Output
echo $foo;
printf("%.2f", $price);
?>

<?php // Arrays
$foo[1] = 1;
$foo[2] = 2;
$bar[1][2] = 3;
?>

<?php // Control structures
while($foo) { ... }
?>

Constants in PHP
Definition
Constants are named values that cannot be modified during the execution of the script
Constants are typically recognizable by their recommended naming convention of ALL_CAPS
Constants are defined via the define() function

Predefined constants
PHP_VERSION, PHP_OS, PHP_INT_MIN, M_PI, ...
See http://www.php.net/manual/en/reserved.constants.php

<?php
define('CONSTANT', 'Hello');
echo CONSTANT; // Would echo 'Hello'
?>

Variables in PHP
Basic Syntax
All variable names in PHP start with a dollar sign ($)
Variable names usually start with a lowercase letter

Dynamic Typing
C, C++, Java and others use static typing
  Variables have a given data type that remains the same until they are destroyed

PHP automatically converts the variable to the correct data type, depending on its value
  A variable does not need to be declared before a value is assigned to it
  You do not need to tell PHP which data type a variable is

The following data types are supported by PHP:
int, float, double, string, boolean

Variables in PHP: Examples


<?php
$a = 1234;
$b = 0777;
$c = 0xff;
$d = 1.25;
echo "$a $b $c $d<br />\n";   // 1234 511 255 1.25

$name = 'Rasmus $last';
$str = "Hi $name\n";
echo $str;                    // Hi Rasmus $last

$greeting = true;
if($greeting) {
    echo "Hi Carl";           // Hi Carl
    $greeting = false;
}

echo 5 + "1.5" + "10e2";      // 1006.5
?>
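The type changes behind the example above can be made visible with gettype(). The following is only a minimal sketch, not part of the original slides; the names $value, $types and $sum are illustrative.

```php
<?php
// The same variable holds values of different types over its lifetime.
$value = 1234;                 // an integer
$types = [gettype($value)];

$value = 1.25;                 // now a float (gettype() reports "double")
$types[] = gettype($value);

$value = "10e2";               // now a string
$types[] = gettype($value);

// Arithmetic converts a numeric string to a number: "10e2" is 1000.0
$sum = 5 + $value;

echo implode(", ", $types), "\n";
echo $sum, "\n";
```

The variable is never declared with a type; PHP tracks the type of whatever value is currently stored in it.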

Variables in PHP: Scope

<?php
$var = 1;
require 'another_page.php'; // another_page.php has full access to $var
?>

<?php
$var = 'Test';
function test() {
    echo $var; // $var is not visible here
}
test();
?>

Global variables
<?php
$var = 'Test';
function test() {
    global $var;
    echo $var; // $var is visible here
}
test();
?>

<?php
$GLOBALS['var'] = 'Test';
function test() {
    echo $GLOBALS['var'];
}
test();
?>

Variables in PHP: Scope


Static variables
Are declared with the static keyword
Exist within a function and keep their value after the function returns

<?php
function increment() {
    static $a = 0;
    echo $a;
    $a++;
}
increment(); // 0
increment(); // 1
increment(); // 2
increment(); // 3
?>

What would each call print if $a were not static?
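One way to answer the question is to run both variants side by side. This is a small sketch with hypothetical function names (count_static, count_plain) that return the value instead of echoing it, so the results can be compared.

```php
<?php
// With "static": the local variable keeps its value between calls.
function count_static() {
    static $a = 0;
    return $a++;
}

// Without "static": $a is re-created and reset to 0 on every call.
function count_plain() {
    $a = 0;
    return $a++;
}

$static_results = [count_static(), count_static(), count_static()];
$plain_results  = [count_plain(), count_plain(), count_plain()];

echo implode(" ", $static_results), "\n"; // 0 1 2
echo implode(" ", $plain_results), "\n";  // 0 0 0
```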

References
What are references?
References are a means to access the same variable content by different names
References are NOT C pointers: they are not memory addresses
  You cannot do pointer arithmetic, etc. with PHP references

<?php
$foo = 'Hello';
$bar = 'World';
?>

<?php
$bar = & $foo;
?>

What can you do with references ?


Assign by reference
<?php
$a = & $b;
?>

Pass by reference
<?php
function inc(& $b) {
    $b++;
}
$a = 1;
inc($a);
echo $a; // 2
?>

Return by reference
Return a reference instead of a copy of the data
<?php
function & get_data() {
    $data = "Hello World";
    return $data;
}
$foo = & get_data();
?>
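The effect of assigning and passing by reference can be traced in a few lines. The sketch below is illustrative only; $original, $alias and add_suffix() are made-up names, not part of the slides.

```php
<?php
// Assign by reference: $alias and $original name the same content.
$original = 'Hello';
$alias = &$original;
$alias = 'World';            // also changes $original

// Pass by reference: the function modifies the caller's variable.
function add_suffix(&$text, $suffix) {
    $text .= $suffix;
}
add_suffix($original, '!');

// unset() removes only the name $alias, not the shared content.
unset($alias);

echo $original, "\n";
```

After unset() the content is still reachable through $original, which is one way references differ from C pointers being freed.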

IF ELSE ELSEIF
Syntax:
if (condition)
    code to be executed if condition is true;

if (condition)
    code to be executed if condition is true;
else
    code to be executed if condition is false;

if (condition)
    code to be executed if condition is true;
elseif (condition)
    code to be executed if condition is true;
else
    code to be executed if condition is false;

IF ELSE ELSEIF
<html>
<body>
<?php
$d=date("D");
if ($d=="Fri") echo "Have a nice weekend!";
?>
</body>
</html>

<html>
<body>
<?php
$d=date("D");
if ($d=="Fri") {
    echo "Hello!<br />";
    echo "Have a nice weekend!";
    echo "See you on Monday!";
}
?>
</body>
</html>

<html>
<body>
<?php
$d=date("D");
if ($d=="Fri")
    echo "Have a nice weekend!";
else
    echo "Have a nice day!";
?>
</body>
</html>

Switching mode
It is possible to combine PHP tags with HTML, also in decision statements

<?
if(strstr($_SERVER['HTTP_USER_AGENT'],"MSIE")) {
    echo '<b>You are using Internet Explorer</b>';
} else {
    echo '<b>You are not using Internet Explorer</b>';
}
?>

<? if(strstr($_SERVER['HTTP_USER_AGENT'],"MSIE")) { ?>
<b>You are using Internet Explorer</b>
<? } else { ?>
<b>You are not using Internet Explorer</b>
<? } ?>

Arrays
Three kinds of arrays:
Numeric array
Associative array
Multidimensional array

$cars=array("Saab","Volvo","BMW","Toyota");
// or
$cars[0]="Saab";
$cars[1]="Volvo";
$cars[2]="BMW";
$cars[3]="Toyota";

$ages = array("Peter"=>32, "Quagmire"=>30, "Joe"=>34);
// or
$ages['Peter'] = "32";
$ages['Quagmire'] = "30";
$ages['Joe'] = "34";

Multidimensional arrays
$families = array (
    "Griffin"=>array ( "Peter", "Lois", "Megan" ),
    "Quagmire"=>array ( "Glenn" ),
    "Brown"=>array ( "Cleveland", "Loretta", "Junior" )
);

echo "Is " . $families['Griffin'][2] . " a part of the Griffin family?";

Is Megan a part of the Griffin family?

WHO?
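Nested arrays such as $families are typically traversed with nested foreach loops. The following sketch reuses a shortened version of the array above; $lines, $surname, $members and $first_name are illustrative names.

```php
<?php
$families = [
    "Griffin"  => ["Peter", "Lois", "Megan"],
    "Quagmire" => ["Glenn"],
];

// Outer loop: one family per iteration (key = surname, value = members).
// Inner loop: one member of that family per iteration.
$lines = [];
foreach ($families as $surname => $members) {
    foreach ($members as $first_name) {
        $lines[] = "$first_name $surname";
    }
}

echo implode("\n", $lines), "\n";
```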

Loops
<?php // While
$i = 0;
while($i < 5) {
    echo $i."<br>";
    $i++;
}
?>

<?php // Do-While
$i = 0;
do {
    echo $i."<br>";
    $i++;
} while($i < 0);
?>

<?php // For
for($i = 0; $i < 5; $i++) {
    echo $i."<br />";
}
?>

<?php // For Each
$animals = array("Dog", "Cat", "Snake", "Tiger");
foreach($animals as $animal)
    echo $animal . "<br />";
?>

Form handling
<form action="action.php" method="POST">
Your name: <input type="text" name="name"><br>
Your age: <input type="text" name="age"><br>
<input type="submit">
</form>

action.php:
Hi <?php echo $_POST['name']?>. You are <?php echo $_POST['age']?> years old.

$_GET and $_POST


The built-in $_GET predefined variable is used to collect values in a form with method="get"
The built-in $_POST predefined variable is used to collect values in a form with method="post"

POST vs. GET
GET encodes parameters in the URL
  Pages can be bookmarked [+]
  Parameters are visible [-] (not a good idea for passwords)
  Length is limited [-]
POST encodes parameters in the body of the request
  Pages cannot be bookmarked [-]
  Parameters are not visible in the URL [+]
  The URL length limit does not apply [+]
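The encoding itself can be reproduced by hand. The sketch below assumes the default form encoding (application/x-www-form-urlencoded) and uses made-up field values; $fields, $encoded and $decoded are illustrative names.

```php
<?php
// Hypothetical form values, as a browser would submit them.
$fields = ['name' => 'Peter Griffin', 'age' => '32'];

// With method="get" the browser appends this string to the URL after "?".
// With method="post" and the default encoding, the same string travels
// in the request body instead.
$encoded = http_build_query($fields);

// On the server PHP decodes it into $_GET or $_POST;
// parse_str() performs the same decoding explicitly.
parse_str($encoded, $decoded);

echo "action.php?" . $encoded . "\n";
```

Either way the server receives the same name-value pairs; only the place where they travel differs.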

???

$_REQUEST combines $_GET, $_POST and $_COOKIE


Welcome <?php echo $_REQUEST["fname"]; ?>!<br /> You are <?php echo $_REQUEST["age"]; ?> years old.

Exercise
Write HTML code and PHP script for an authentication form Passwords are maintained in an array, declared in the script
<HTML>
<?php

?>
</HTML>

User defined functions


Function names may start with a letter or an underscore
Return type and the type of parameters are not specified

// Simple
function functionName() {
    code to be executed;
}

// With parameters
function functionName($param1, $param2, etc.) {
    code to be executed;
}

// With return value
function functionName($param1, $param2, etc.) {
    code to be executed;
    return some_value;
}

Examples
<?php
function head($title) {?>
<HTML><HEAD><TITLE>
<? echo $title ?>
</TITLE></HEAD><BODY><?
}
head("Welcome to my home page!");
?>

<?php
function head($title="default") {?>
<HTML><HEAD><TITLE>
<? echo $title ?>
</TITLE></HEAD><BODY><?
}
head();
?>

<?php
function add($x, $y) {
    $total = $x + $y;
    return $total;
}
echo "1 + 2 = " . add(1,2);
?>

<?php
$pi = 3.14;
function area($radius) {
    global $pi; // $pi is defined outside the function
    return $pi * $radius * $radius;
}
echo "5m circle area: " . area(5);
?>

Built in functions
Array, Calendar, Date, Directory, Error, Filesystem, Filter, FTP, HTTP, LibXML, Mail, Math, Misc, MySQL, SimpleXML, String, XML Parser and Zip functions

Array manipulation
Sorting:
sort(), rsort(), ksort(), usort(), array_multisort()

Traversal:
reset(), end(), next(), each(), current(), key(), array_walk()

Advanced:
array_diff(), array_intersect(), array_merge(), array_merge_recursive(), array_slice(), array_splice() ....
<?php
$my_array = array("a" => "Dog", "b" => "Cat", "c" => "Horse");
sort($my_array);
print_r($my_array);
?>

Array ( [0] => Cat [1] => Dog [2] => Horse )

sort: sorts the array, does not maintain associations
asort: sorts the array and maintains associations
rsort: reverse sort
ksort: sorts an array by key, maintaining key to data correlations
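The difference between these four functions is easiest to see on one input. This is a small sketch with illustrative variable names; each function sorts in place, so the array is copied first.

```php
<?php
$pets = ["b" => "Dog", "c" => "Cat", "a" => "Horse"];

$by_value_reindexed = $pets;
sort($by_value_reindexed);   // values sorted, keys replaced by 0, 1, 2

$by_value_keyed = $pets;
asort($by_value_keyed);      // values sorted, original keys kept

$by_key = $pets;
ksort($by_key);              // sorted by key, values follow their keys

$reversed = $pets;
rsort($reversed);            // values in descending order, keys replaced

print_r($by_value_keyed);
```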

Strings
The concatenation operator is .
<?php
$str = "Fast String Manipulation";
echo substr($str,0,4) . substr($str,-9);
?>

Fastipulation

More complex example:


<?php
// Explode breaks a string into an array
$a = explode(":", "This:string:has:delimiters.");
foreach ($a as $value) {
    if (strcmp($value, "has") == 0) {
        echo "had ";
    } else
        echo $value." ";
}
?>

This string had delimiters.

Cheat sheet

References
Wikipedia
http://en.wikipedia.org/wiki/PHP

PHP Manual
http://php.net/manual/en/index.php

Tutorials
http://www.w3schools.com/PHP
http://en.wikiversity.org/wiki/PHP
http://talks.php.net/show/oscon2002
http://www.php5-tutorial.com/

Cheat sheet
http://www.emezeta.com/weblog/emezeta-php-card-v0.2.png

Acknowledgement
Slides containing the w3schools.com logo contain information taken from www.w3schools.com. The use is permitted for academic purposes, and these materials shall not be made freely available on the internet.

PHP Advanced Topics


Academia Tehnică Militară, 2011

File Upload [html]


Enctype = multipart/form-data
Used when a form requires binary data, like the contents of a file, to be uploaded

Type = file
The input should be processed as a file. For example, when viewed in a browser, there will be a browse-button next to the input field

<html>
<body>
<form action="upload_file.php" method="post" enctype="multipart/form-data">
<label for="file">Filename:</label>
<input type="file" name="file" id="file" />
<br />
<input type="submit" name="submit" value="Submit" />
</form>
</body>
</html>

File Upload
When a file is uploaded, PHP stores it in a temporary location
The $_FILES global variable contains all the uploaded file information
  $_FILES["file"]["name"]: the name of the uploaded file
  $_FILES["file"]["type"]: the type of the uploaded file
  $_FILES["file"]["size"]: the size in bytes of the uploaded file
  $_FILES["file"]["tmp_name"]: the name of the temporary copy of the file stored on the server
  $_FILES["file"]["error"]: the error code resulting from the file upload

move_uploaded_file(string $filename, string $destination)
  Moves an uploaded file to a new location

File upload [php]


<?php
if (($_FILES["file"]["type"] == "image/jpeg") && ($_FILES["file"]["size"] < 20000)) {
    if ($_FILES["file"]["error"] > 0) {
        echo "Return Code: " . $_FILES["file"]["error"] . "<br />";
    } else {
        echo "Upload: " . $_FILES["file"]["name"] . "<br />";
        echo "Type: " . $_FILES["file"]["type"] . "<br />";
        echo "Size: " . ($_FILES["file"]["size"] / 1024) . " Kb<br />";
        echo "Temp file: " . $_FILES["file"]["tmp_name"] . "<br />";
        if (file_exists("upload/" . $_FILES["file"]["name"])) {
            echo $_FILES["file"]["name"] . " already exists. ";
        } else {
            move_uploaded_file($_FILES["file"]["tmp_name"], "upload/" . $_FILES["file"]["name"]);
            echo "Stored in: " . "upload/" . $_FILES["file"]["name"];
        }
    }
} else {
    echo "Only image/jpeg allowed";
}
?>
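The name and type fields of $_FILES are supplied by the browser, so a script may want to check an upload a little more defensively before moving it. The following is only a sketch of such a check; check_upload() and the sample array are hypothetical and not part of PHP or of the original slides.

```php
<?php
// Hypothetical check on one $_FILES entry.
// Returns an error message, or null when the entry looks acceptable.
function check_upload(array $file, $max_bytes) {
    if ($file['error'] !== UPLOAD_ERR_OK) {
        return "Upload failed with error code " . $file['error'];
    }
    if ($file['size'] > $max_bytes) {
        return "File is larger than $max_bytes bytes";
    }
    // basename() drops any directory part a client may have put in the name.
    if (basename($file['name']) !== $file['name']) {
        return "File name must not contain a path";
    }
    return null;
}

// Sample entry shaped like $_FILES["file"]; the values are made up.
$sample = [
    'name' => 'photo.jpg', 'type' => 'image/jpeg', 'size' => 15000,
    'tmp_name' => '/tmp/phpA1', 'error' => UPLOAD_ERR_OK,
];
$result = check_upload($sample, 20000);
echo $result === null ? "OK\n" : $result . "\n";
```

In a real script the call would receive $_FILES["file"] and be followed by move_uploaded_file() only when the result is null.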

File handling
File functions are similar to C
// Open a file
$file = fopen("welcome.txt", "r");
// Check for EOF [End-Of-File]
if (feof($file)) echo "End of file";
// Read a line
fgets($file);
// Read a character
fgetc($file);
// Close the file
fclose($file);

Example read all lines from a file

<?
// Read all lines from a file
$file = fopen("users.txt", "r") or exit("Unable to open file!");
// Output a line of the file until the end is reached
while(!feof($file)) {
    echo fgets($file). "<br />";
}
fclose($file);
?>

Server-side log

<?php
$ip = $_SERVER['REMOTE_ADDR'];          // Get their IP address.
$agent = $_SERVER['HTTP_USER_AGENT'];   // Get their user agent.
// Referer: how they got to your website, who linked them,
// where they clicked that link. It is not always sent.
$ref = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
$date = date("H:i dS F");               // Get the date and time.
$file = "log.txt";                      // Where the log will be saved.
$open = fopen($file, "a+");             // Open the file (log.txt).
fwrite($open, "$date || $ip || $agent || $ref\n");
fclose($open);                          // Close the handle, not the file name.
?>

08:39 20th November || 188.25.154.233 || Mozilla/5.0 (Macintosh; Intel Mac OS X 10_5_8) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.121 Safari/535.2 || http://opincaru.ro/tmp/
08:41 20th November || 188.25.154.233 || Mozilla/5.0 (Macintosh; Intel Mac OS X 10.5; rv:6.0.2) Gecko/20100101 Firefox/6.0.2 ||

GET /tmp/log.php HTTP/1.1
Host: opincaru.ro
Connection: keep-alive
Cache-Control: max-age=0
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_5_8) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.121 Safari/535.2
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3

HTTP Headers
HTTP headers can be added to the response
Because the headers are sent before the actual response, headers have to be added before outputting any data

HTTP/1.1 200 OK
Date: Mon, 23 May 2005 22:38:34 GMT
Server: Apache/1.3.3.7 (Unix) (Red-Hat/Linux)
Last-Modified: Wed, 08 Jan 2003 23:11:55 GMT
Etag: "3f80f-1b6-3e1cb03b"
Accept-Ranges: bytes
Content-Length: 438
Connection: close
Content-Type: text/html; charset=UTF-8

// Redirection
<?php header('Location: http://www.php.net')?>

// Setting the Last-Modified Header
<?php header('Last-Modified: '. gmdate('D, d M Y H:i:s',getlastmod()).' GMT')?>

// Avoid all caching
<?php
Header('Cache-Control: no-cache, must-revalidate');
Header('Pragma: no-cache');
Header('Expires: Mon,26 Jul 1980 05:00:00 GMT');
?>

Cookies
What are cookies?
A small piece of text stored on a user's computer by a web browser
It consists of one or more name-value pairs
It contains user information (preferences, shopping cart contents, etc.)

Why cookies?
The HTTP protocol is stateless
A server cannot identify a user who performs several requests
Without cookies it is nearly impossible to implement a shopping cart application

How do cookies work?

GET /index.html HTTP/1.1
Host: www.example.org

HTTP/1.1 200 OK
Content-type: text/html
Set-Cookie: RMID=732423sdfs73242; expires=Fri, 31-Dec-2010 23:59:59 GMT; path=/; domain=.example.org

(content of page)

GET /spec.html HTTP/1.1
Host: www.example.org
Cookie: RMID=732423sdfs73242
Accept: */*

name = value
  Cookie content
expires
  Validity period of the cookie
domain
  The domain for which the cookie is valid
path
  The URL path for which it is valid
Secure
  Only submit it over HTTPS
HttpOnly
  Not possible to access it via JavaScript
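To see how these attributes end up in the Set-Cookie header, the header value can be assembled by hand. This is a sketch only: build_set_cookie() is a hypothetical helper written for illustration, and in practice PHP's setcookie() produces the header for you.

```php
<?php
// Hypothetical helper: assembles a Set-Cookie header value from the
// attributes listed above.
function build_set_cookie($name, $value, $expires, $path, $domain, $secure, $http_only) {
    $parts = [rawurlencode($name) . '=' . rawurlencode($value)];
    $parts[] = 'expires=' . gmdate('D, d-M-Y H:i:s', $expires) . ' GMT';
    $parts[] = 'path=' . $path;
    $parts[] = 'domain=' . $domain;
    if ($secure)    { $parts[] = 'Secure'; }
    if ($http_only) { $parts[] = 'HttpOnly'; }
    return implode('; ', $parts);
}

// 31 Dec 2010 23:59:59 GMT as a Unix timestamp.
$expires = gmmktime(23, 59, 59, 12, 31, 2010);
$header = build_set_cookie('RMID', '732423sdfs73242', $expires, '/', '.example.org', true, true);

echo 'Set-Cookie: ' . $header . "\n";
```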

Cookies in PHP
Setting a cookie
// Syntax
setcookie(name, value, expire, path, domain);
// Example - Setting a session cookie
setcookie("user", "copincar");
// Example - Setting a persistent cookie
// expiration time is a Unix timestamp in seconds, here one hour from now
setcookie("user", "copincar", time()+3600);

Reading a cookie
<? echo $_COOKIE["user"]; ?>

Testing if a cookie is set
<?php
// isset: determines if a variable is set and is not NULL
if (isset($_COOKIE["user"]))
    echo "Welcome " . $_COOKIE["user"] . "!";
else
    echo "Welcome guest!";
?>

Deleting cookies
<? SetCookie('Cookie_Name',''); ?>
<? SetCookie('Cookie_Name','', mktime(12,0,0,11,22,1970) ); ?>

Cookie functions must appear before the <html> tag

Page views using cookies


Create a page that counts the number of page views
The page counter shall be stored in a cookie
Do not use sessions

<?php
// isset: determines if a variable is set and is not NULL
if (isset($_COOKIE["count"])) {
    $times = $_COOKIE["count"]+1;
    setcookie("count", $times, time()+3600);
} else {
    $times = 1;
    setcookie("count", 1, time()+3600);
}
echo "You have been here $times times.";
?>

???

You have been here 5 times.

Sessions
PHP implements session management
It creates a unique ID (UID) for each visitor
This is stored in a cookie or propagated through the URL

$_SESSION
Allows user information to be associated with one session
Is available to all pages in one application

<?php session_start(); ?>
<html>
<body>
</body>
</html>

<?php
session_start();
// store session data
$_SESSION['views']=1;
?>
..................................
<?php
// retrieve session data
echo "Pageviews=". $_SESSION['views'];
?>

// To destroy a session
<? unset($_SESSION['views']); ?>
<? session_destroy(); ?>

Page views using session


<?php session_start(); ?>
<HTML>
<BODY>
You have been here
<?php
if (isset($_SESSION['views']))
    $_SESSION['views'] = $_SESSION['views'] + 1;
else
    $_SESSION['views'] = 1;
echo $_SESSION['views'];
?>
times.
</BODY>
</HTML>

Output (on the fifth visit):
You have been here 5 times.

Error handling
By default, on error PHP will:
Display an error message with the filename, the line number and a message describing the error

<?php
$file = fopen("welcome.txt", "r");
?>

Warning: fopen(welcome.txt) [function.fopen]: failed to open stream: No such file or directory in C:\webfolder\test.php on line 2

Default error handling may lead to:
Unprofessional look
Security risks

Basic error handling: the die function

<?php
if (!file_exists("welcome.txt")) {
    die("File not found");
} else {
    $file = fopen("welcome.txt", "r");
}
?>

Output:
File not found

Error handlers
An error handler is a function that is called when an error occurs
Syntax: error_function(error_level, error_message, error_file, error_line, error_context)

Error report levels:
E_WARNING: Non-fatal run-time error
E_NOTICE: Run-time notice
E_USER_ERROR: Fatal user-generated error, for example one raised with trigger_error
E_USER_WARNING: User-generated warning
E_USER_NOTICE: User-generated notice
E_ALL: All errors and warnings

Custom handlers
Functions for error handlers:
set_error_handler(callback, error_types)
restore_error_handler()
trigger_error(error_msg, error_type)

<?php
// error handler function
function customError($errno, $errstr) {
    echo "<b>Error:</b> [$errno] $errstr<br />";
    echo "Ending Script";
    die();
}

// set error handler
set_error_handler("customError", E_USER_WARNING);

// trigger error
$test = 2;
if ($test > 1) {
    trigger_error("Value must be 1 or below", E_USER_WARNING);
}
?>

Data Filtering in PHP


A filter is used to validate and filter data coming from insecure sources
External data should be filtered before it is used:
Input data from a form
Cookies
Web services data
Server variables
Database query results

Two types of filters:


Validation is used to validate or check if the data meets certain qualifications
Example: FILTER_VALIDATE_EMAIL will determine if the data is a valid email address, but will not change the data itself.

Sanitization is used to sanitize the data


Example: FILTER_SANITIZE_EMAIL will remove characters that are inappropriate for an email address to contain.

Custom filters are defined through callback functions

PHP Functions for Filters


Functions:

filter_var($variable_name, $filter)
Filters a variable with a specified filter

filter_var_array($array_data, $array_filter)
Filters the elements of an array. A separate filter can be defined for each element of array_data

filter_input($type, $variable_name, $filter)
Gets the value of an external variable and filters it according to the specified filter. $type can be INPUT_GET, INPUT_POST, INPUT_COOKIE, INPUT_SERVER, or INPUT_ENV

filter_has_var($type, $variable_name)
Verifies whether an external variable exists

filter_input_array($input_type, $array_filter)
Gets the value of several external variables. A separate filter can be defined for each of the variables.

Standard PHP Filters


Sanitize Filter Constants
FILTER_SANITIZE_EMAIL FILTER_SANITIZE_ENCODED FILTER_SANITIZE_MAGIC_QUOTES FILTER_SANITIZE_NUMBER_FLOAT FILTER_SANITIZE_NUMBER_INT FILTER_SANITIZE_SPECIAL_CHARS FILTER_SANITIZE_STRING FILTER_SANITIZE_STRIPPED FILTER_SANITIZE_URL FILTER_UNSAFE_RAW

Validate Filter Constants


FILTER_VALIDATE_BOOLEAN FILTER_VALIDATE_EMAIL FILTER_VALIDATE_FLOAT FILTER_VALIDATE_INT FILTER_VALIDATE_IP FILTER_VALIDATE_REGEXP FILTER_VALIDATE_URL

Filter examples: Validation


<?php
// Validating an email address
$email_a = 'joe@example.com';
$email_b = 'bogus';

if (filter_var($email_a, FILTER_VALIDATE_EMAIL)) {
    echo "This (email_a) email address is considered valid.";
}
if (filter_var($email_b, FILTER_VALIDATE_EMAIL)) {
    echo "This (email_b) email address is considered valid.";
}
?>

Output:
This (email_a) email address is considered valid.

Filter examples: Sanitizing


<?php
$a = 'joe@example.org';
$b = 'bogus - at - example dot org';
$c = '(bogus@example.org)';

$sanitized_a = filter_var($a, FILTER_SANITIZE_EMAIL);
if (filter_var($sanitized_a, FILTER_VALIDATE_EMAIL)) {
    echo "This (a) sanitized email address is considered valid.\n";
}

$sanitized_b = filter_var($b, FILTER_SANITIZE_EMAIL);
if (filter_var($sanitized_b, FILTER_VALIDATE_EMAIL)) {
    echo "This sanitized email address is considered valid.";
} else {
    echo "This (b) sanitized email address is considered invalid.\n";
}

$sanitized_c = filter_var($c, FILTER_SANITIZE_EMAIL);
if (filter_var($sanitized_c, FILTER_VALIDATE_EMAIL)) {
    echo "This (c) sanitized email address is considered valid.\n";
    echo "Before: $c\n";
    echo "After: $sanitized_c\n";
}
?>

Output:
This (a) sanitized email address is considered valid.
This (b) sanitized email address is considered invalid.
This (c) sanitized email address is considered valid.
Before: (bogus@example.org)
After: bogus@example.org

Filter Input
<?php
// Filter an email address
if (!filter_has_var(INPUT_GET, "email")) {
    echo("Input type does not exist");
} else {
    if (!filter_input(INPUT_GET, "email", FILTER_VALIDATE_EMAIL)) {
        echo "E-Mail is not valid";
    } else {
        echo "E-Mail is valid";
    }
}
?>

<?php
// Filter a URL from a form transmitted via POST
if (!filter_has_var(INPUT_POST, "url")) {
    echo("Input type does not exist");
} else {
    $url = filter_input(INPUT_POST, "url", FILTER_SANITIZE_URL);
}
?>

http://www.W3Schools.com/ ---> http://www.W3Schools.com/

Filter Input
All input to the application must be checked for special characters
To prevent Cross-Site Scripting attacks, all output of the application must be escaped
The following example shows how to use filtering to avoid Cross-Site Scripting in a Web search form:

<?php
$search_html = filter_input(INPUT_GET, 'search', FILTER_SANITIZE_SPECIAL_CHARS);
$search_url = filter_input(INPUT_GET, 'search', FILTER_SANITIZE_ENCODED);
echo "You have searched for $search_html.\n";
echo "<a href='?search=$search_url'>Search again.</a>";
?>

Output (for the search string "Me & son"):
You have searched for Me &#38; son.
<a href='?search=Me%20%26%20son'>Search again.</a>

Stupid things one can do: Examples of code open to injection


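<?php
// Shell command injection: user input is passed straight to the shell
$dir = $_REQUEST["something"];
system("ls $dir");
?>

<?php
// Safer: escape a string to be used as a shell argument
$dir = escapeshellarg($dir);
system("ls $dir");
?>

<?php
// Variable overwriting: any GET parameter can overwrite any variable, including $safevar
$safevar = "0";
$param1 = "";
$param2 = "";
$param3 = "";
# my own "register globals" for param[1,2,3]
foreach ($_GET as $key => $value) {
    $$key = $value;
}
?>

<?php
// eval injection: the request supplies code that is evaluated
$myvar = 'somevalue';
$x = $_GET['arg'];
eval('$myvar = ' . $x . ';');
?>

<?php
// File inclusion: the request chooses which file is included
$color = 'blue';
if (isset($_GET['COLOR']))
    $color = $_GET['COLOR'];
require($color . '.php');
?>

<?php
// Dynamic function calls: the request chooses which function runs
$myfunc = $_GET['myfunc'];
$myfunc();

$myfunc = $_GET['myfunc'];
${"myfunc"}();
?>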

Debugging in PHP
Useful functions for debugging:
print_r($var) displays information about a variable in a human-readable way
var_dump($var) same as print_r, but it also shows type information
die($msg), exit($msg) halt the execution of further code

<?php
$variable = "Hello World";
print_r($variable);
$array = array('a', 'c', 'd', 'b');
print_r($array);
?>

Output:
Hello World
Array ([0] => a [1] => c [2] => d [3] => b)

<?php
echo "Hello";
die(); // or exit(), or exit("Text to output") or die("Text to output");
echo " World";
?>

<?php
$variable = "Hello World";
var_dump($variable);
?>

Output:
string(11) "Hello World"

Web site acceleration & Load balancing


Caching Proxy
Accelerates service requests by retrieving content saved from a previous request made by the same client or by other clients
Caching proxies keep local copies of frequently requested resources, allowing large organizations to significantly reduce their upstream bandwidth usage and cost, while significantly increasing performance

Reverse Proxy
A proxy server that is installed in the neighborhood of one or more web servers All traffic coming from the Internet and with a destination of one of the web servers goes through the proxy server

Squid
Open source, licensed under the GNU GPL
http://www.squid-cache.org/
Can be configured as both Caching Proxy and Reverse Proxy

SQUID and PHP

single-server accelerator

front-end cache

Conclusions

Extensions
GD (GIF, JPEG, PNG, WBMP) LDAP SNMP IMAP (POP, NNTP) FTP MCAL IMSP IPTC BC/GMP (arbitrary precision math) Hyperwave XML parser PDF generation FDF (PDF forms) System V Semaphores and Shared memory DCOM (Win32 only) Java Connectivity mnogosearch (udmsearch support) Cybermut Iconv Satellite Curl gettext (GNU internationalization) zlib (compressed IO) Charset/text conversion (UTF-8, Cyrillic, Hebrew) Browser capabilities extension EXIF SWF (Flash) ASPELL/PSPELL MCRYPT Cybercash Recode Readline XSLT (Sablotron, libxslt, Xalan) WDDX NIS YAZ (Z39.50 client) Payflow Pro CCVS (Credit Card Verification System) Fribidi Ncurses Muscat

Conclusions
Today we learned about:
File uploads & File handling HTTP headers, Cookies and Sessions Error handling Data filtering and Security Proxies & Reverse proxies (Squid) Object Oriented PHP


References
Wikipedia, http://en.wikipedia.org/wiki/PHP
PHP Manual, http://php.net/manual/en/index.php
Code Injection, http://en.wikipedia.org/wiki/Code_injection
Tutorials
http://www.w3schools.com/PHP
http://en.wikiversity.org/wiki/PHP
http://talks.php.net/show/oscon2002
http://www.php5-tutorial.com/

Cheat sheet
http://www.emezeta.com/weblog/emezeta-php-card-v0.2.png

Acknowledgement
Slides containing the w3schools.com logo contain information taken from www.w3schools.com The use is permitted for academic purposes and these materials shall not be freely made available on the internet.

Object Oriented PHP


Basics, Exception Handling (Optional)

Object Oriented PHP


PHP5 has a full object model
Visibility
public, protected, private

Abstract classes and methods
Final classes and methods
Magic methods
__construct, __destruct, __toString, __get, __set, etc.

Interfaces
Cloning
Type hinting
Functions may force parameters to be objects or arrays
Failing to satisfy the type hint results in a catchable fatal error.

The Basics of OO PHP


<?php
// Class definition
class Cart {
    var $items;
    function add_item($artnr, $num) {
        $this->items[$artnr] += $num;
    }
}
?>

<?php
// Inheritance
class NamedCart extends Cart {
    var $owner;
    function NamedCart($name) {
        $this->owner = $name;
    }
}
?>

<?php
// Invocation
$cart = new NamedCart("PenguinGear");
$cart->add_item(170923, 2);
?>

<?php
// Static method calls
class foo {
    function bar() {
        echo "bar() called";
    }
}
foo::bar();
?>

The Basics of OO PHP Calling methods in the parent class


<?php
// Calling methods in the parent class
class foo2 {
    function foo2() {
        echo "Constructor for foo2";
    }
}

class bar extends foo2 {
    function bar() {
        echo "Constructor for bar<br/>";
        $name = get_parent_class($this);
        parent::$name();
    }
}

$a = new bar();
?>

<?php
// The same thing, using magic methods
class bar extends foo2 {
    function bar() {
        echo "Constructor for bar<br/>";
        parent::__construct();
    }
}
?>

The Basics of OO PHP: Interfaces


<?php
// Declare the interface 'iTemplate'
interface iTemplate {
    public function setVariable($name, $var);
    public function getHtml($template);
}

// Implement the interface
// This will work
class Template implements iTemplate {
    private $vars = array();

    public function setVariable($name, $var) {
        $this->vars[$name] = $var;
    }

    public function getHtml($template) {
        foreach ($this->vars as $name => $value) {
            $template = str_replace('{' . $name . '}', $value, $template);
        }
        return $template;
    }
}

// This will not work
// Fatal error: Class BadTemplate contains 1 abstract methods
// and must therefore be declared abstract (iTemplate::getHtml)
class BadTemplate implements iTemplate {
    private $vars = array();

    public function setVariable($name, $var) {
        $this->vars[$name] = $var;
    }
}
?>

Exceptions
The exception model is similar to that of other programming languages
Exceptions can be:
Thrown
Caught
Extended from the base class Exception

<?php
try {
    // some operations
} catch (Exception $e) {
    // deal with the error
}
// Continue execution
?>

If an exception is not caught, a fatal error will be issued:

Fatal error: Uncaught exception 'Exception' with message 'Value must be 1 or below' in C:\webfolder\test.php:6
Stack trace:
#0 C:\webfolder\test.php(12): checkNum(28)
#1 {main}
thrown in C:\webfolder\test.php on line 6

Exceptions - example
<?php
function inverse($x) {
    if (!$x) {
        throw new Exception('Division by zero.');
    }
    else return 1/$x;
}

try {
    echo inverse(5) . "\n";
    echo inverse(0) . "\n";
} catch (Exception $e) {
    echo 'Caught exception: ', $e->getMessage(), "\n";
}

// Continue execution
echo 'Hello World';
?>

Output:
0.2
Caught exception: Division by zero.
Hello World

Custom exceptions
Custom exceptions extend the base class Exception
<?php
// Custom Exception Class
class customException extends Exception {
    public function errorMessage() {
        $errorMsg = $this->getMessage() . ' is not a valid E-Mail address.';
        return $errorMsg;
    }
}

// Sample input (illustrative value, not part of the original slide)
$email = "someone@example.com";

try {
    // check for "example" in mail address
    if (strpos($email, "example") !== FALSE) {
        // throw exception if email is not valid
        throw new customException($email);
    }
} catch (customException $e) {
    // display custom message
    echo $e->errorMessage();
}
?>

The exception class definition


<?php
class Exception {
    protected $message = 'Unknown exception';  // exception message
    private   $string;                         // __toString cache
    protected $code = 0;                       // user defined exception code
    protected $file;                           // source filename of exception
    protected $line;                           // source line of exception
    private   $trace;                          // backtrace
    private   $previous;                       // previous exception if nested exception

    public function __construct($message = null, $code = 0, Exception $previous = null);

    final private function __clone();          // Inhibits cloning of exceptions.

    final public function getMessage();        // message of exception
    final public function getCode();           // code of exception
    final public function getFile();           // source filename
    final public function getLine();           // source line
    final public function getTrace();          // an array of the backtrace()
    final public function getPrevious();       // previous exception
    final public function getTraceAsString();  // formatted string of trace

    /* Overrideable */
    public function __toString();              // formatted string for display
}
?>

Exception handlers
User-defined exception handlers can be defined These will be called if an exception is not caught via try/catch
<?php
function exception_handler($exception) {
    echo "Uncaught exception: ", $exception->getMessage(), "\n";
}

set_exception_handler('exception_handler');

throw new Exception('Uncaught Exception');
echo "Not Executed\n";
?>

JavaScript & Active Browser Pages


Academia Tehnică Militară, 2011

Active Browser Pages


HTML limitation
It is simply a markup language
Does not provide a mechanism to modify the page once it is loaded
Does not provide any means for interactivity

DHTML
Dynamic HTML
Collection of technologies to create interactive and animated web pages
DHTML is NOT a different version of HTML

Technologies used for dynamic web pages

Style-sheet language (such as CSS)
Client-side scripting language (such as JavaScript)
HTML Document Object Model (HTML DOM)
AJAX (Asynchronous JavaScript and XML)

JavaScript - Introduction
What is JavaScript?
JavaScript was designed to add interactivity to HTML pages
JavaScript is a scripting language
A scripting language is a lightweight programming language
JavaScript is usually embedded directly into HTML pages
JavaScript is an interpreted language (scripts execute without preliminary compilation)
Everyone can use JavaScript without purchasing a license

Java vs. JavaScript?


Two completely different languages Both use a syntax derived from C, but this is the only similarity

What can you do with JavaScript?


JavaScript gives HTML designers a programming tool - HTML authors are normally not programmers, but JavaScript is a scripting language with a very simple syntax! Almost anyone can put small "snippets" of code into their HTML pages
JavaScript can put dynamic text into an HTML page - A JavaScript statement like this: document.write("<h1>" + name + "</h1>") can write variable text into an HTML page
JavaScript can react to events - A script can be set to execute when something happens, like when a page has finished loading or when a user clicks on an HTML element
JavaScript can read and write HTML elements - A script can read and change the content of an HTML element
JavaScript can be used to validate data - A script can validate form data before it is submitted to a server. This saves the server from extra processing
JavaScript can be used to detect the visitor's browser - A script can detect the visitor's browser and, depending on the browser, load another page specifically designed for that browser
JavaScript can be used to create cookies - A script can store and retrieve information on the visitor's computer

JavaScript: The World's Most Misunderstood Programming Language


Most people don't know but:
JavaScript is a complete object-oriented language
JavaScript can be used as a general-purpose programming language:
JavaScript is used as an embedded scripting language in Adobe Acrobat, OpenOffice, Apple Safari 5 extensions, and many more
JavaScript is supported by Microsoft Active Scripting (cscript.exe), and is embedded in JDK 1.6 and in the Qt toolkit
ActionScript, the programming language of Adobe Flash and Adobe AIR, is another ECMAScript-based language
The Mozilla platform uses JavaScript to implement its graphical user interface

Amateurs
"Most of the people writing in JavaScript are not programmers. They lack the training and discipline to write good programs. JavaScript has so much expressive power that they are able to do useful things in it, anyway. This has given JavaScript a reputation of being strictly for the amateurs, that it is not suitable for professional programming. This is simply not the case."
For more, read JavaScript: The World's Most Misunderstood Programming Language, http://javascript.crockford.com/javascript.html

History and flavors


JavaScript vs. JScript vs. ECMAScript
JavaScript was introduced by Netscape in 1995 (Navigator 2.0)
Previous names: Mocha, LiveScript

JScript was introduced by Microsoft in 1996 (IE 3.0)
ECMA currently standardizes JavaScript as ECMA-262
ECMAScript is also an ISO standard (ISO/IEC 16262)

History
Edition 1, 1997: First edition
Edition 2, 1998: Editorial changes for ISO/IEC 16262
Edition 3, 1999: Added regular expressions, control statements, try/catch and other features
Edition 4: Abandoned; the work was split between Edition 5 and Harmony
Edition 5, December 2009: Adds strict mode and new language features (getters / setters, JSON and others)
Harmony, work in progress: Multiple new concepts and language features

ActionScript
A scripting language based on ECMAScript, used by the Adobe Flash player
ECMA International was formerly the European Computer Manufacturers Association (ECMA)

JavaScript / JScript / ECMA Script


JavaScript
Trademark of Oracle
Further developed by the Mozilla Foundation
Current version: 1.8.5, from 27.07.2010

JScript
Trademark of Microsoft
Further developed into JScript .NET

ECMA Script
Standardized by ECMA International
Current version: 5.1, from June 2011

What does it look like?

To embed JavaScript in a page, use the <script> tag

<html>
<body>
<script type="text/javascript">
document.write("This is my first JavaScript!");
</script>
</body>
</html>

If the browser does not support JavaScript

It will display the code as page content
To prevent this, the script can be wrapped in an HTML comment:

<html>
<body>
<script type="text/javascript">
<!--
document.write("This is my first JavaScript!");
//-->
</script>
</body>
</html>

The <script> tag


Inline script

<script type="text/javascript">
document.write("Hello World!");
</script>

Script located in an external file

<script type="text/javascript" src="yui/build/json/json-min.js">
</script>

No script
Alternate content if scripts cannot be executed
Can contain any HTML elements

<script type="text/javascript">
document.write("Hello World!")
</script>
<noscript>Your browser does not support JavaScript!</noscript>

JavaScript Syntax
JavaScript is caSe seNsitive
Unlike HTML, but like most programming languages

Statements are delimited via ;

alert("First message");
alert("Second message");

Statements are grouped together in blocks via { }

if (a > 2) {
    alert("First message");
    alert("Second message");
}

Comments are similar to C, Java

// Single line comment
/* Multiple
   Lines
   Comment */

Variables
Like all identifiers in JavaScript, the names of variables:
Must begin with a letter, dollar sign or underscore
Can contain letters (upper and lower case), digits, dollar signs or underscores
Naming conventions are inherited from Java:
Use myVariable, threeWordsName, etc.
Not my_variable, three_words_name, etc.

Declaration (recommended):
var myVariable;
var anotherVariable = 5;

Declaration (without using the var keyword):

// myVariable not declared so far
myVariable = 5;

JavaScript is loosely typed:


The data type is not specified at declaration
Variables are automatically converted as needed during execution
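
To make the automatic conversion concrete, here is a minimal sketch. The variable names are illustrative only and do not come from the slides; the snippet uses no browser objects, so it can be run in any JavaScript engine.

```javascript
// A variable can hold values of different types over its lifetime
var value = 5;              // Number
value = "five";             // now a String

// Operators trigger conversions as needed
var sum = "5" + 2;          // + with a string operand concatenates: "52"
var product = "5" * 2;      // * converts the string to a number: 10

// == converts before comparing, === also compares the types
var looseEqual = (5 == "5");    // true
var strictEqual = (5 === "5");  // false
```

Because of these conversions, === is usually the safer comparison when the types of both operands matter.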

Constants:
const CURRENT_VERSION = 1.0;
const YEAR = 2010;

Primitive Data Types


Primitive Data types:
Undefined
Variables that have not been assigned a value

Null
Similar to C, Java, etc.

Boolean
true / false

Number
Double precision
Includes two special values: Infinity, NaN

String
Both single and double quotes are allowed

Declaration                       Type / Value
var birthday = true;              Boolean / true
var min = -Infinity;              Number / -Infinity
var hex = 0xFF;                   Number / 255
var oct = 077;                    Number / 63
var msg1 = 'This is a string';    String / This is a string
var msg2 = "Message='abc'";       String / Message='abc'
var variable;                     Undefined
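
The type of a value can be inspected at run time with the typeof operator. The following is a small sketch with illustrative variable names; the comments show what typeof reports for each primitive type listed above.

```javascript
var unassigned;   // declared, but never assigned a value

var reportedTypes = [
  typeof unassigned,   // "undefined"
  typeof true,         // "boolean"
  typeof 0xFF,         // "number"
  typeof -Infinity,    // "number"
  typeof NaN,          // "number" (NaN is a special Number value)
  typeof 'abc',        // "string"
  typeof null          // "object" (a long-standing quirk of the language)
];
```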

Scope of variables
A variable can be:
Local to a specific function
A variable with local scope is one that is defined, initialized, and used within a function; when the function terminates, the variable ceases to exist

Global to the entire JavaScript application
A global variable, on the other hand, can be accessed anywhere within any JavaScript contained within a web page, whether the JS is embedded directly in the page or imported through a JavaScript library

// Local variable declared with var inside the function
var message = "global";
function testScope() {
    var message = "local";
    alert(message);
}
alert(message);   // global
testScope();      // local
alert(message);   // global

// Without var, the function modifies the global variable
var message = "global";
function testScope() {
    message = "local";
    alert(message);
}
alert(message);   // global
testScope();      // local
alert(message);   // local

Statements
// if, else
if (condition) {
    code to be executed
}

if (condition) {
    code to be executed
} else {
    code to be executed
}

// switch
switch (n) {
    case 1:
        execute code block 1
        break;
    case 2:
        execute code block 2
        break;
    default:
        code to be executed if n is different from case 1 and 2
}

// do-while, while, for, for-in
do {
    code to be executed
} while (var <= endvalue);

while (var <= endvalue) {
    code to be executed
}

for (var = start; var <= end; var++) {
    code to be executed
}

for (variable in object) {
    code to be executed
}

// break & continue in loops
for (i = 0; i < 10; i++) {
    // some code here
    if (i == 3) continue;
    if (i == 7) break;
}

Exercise: Sorting an array


var list = new Array(10, 2, 7, 9, -3, 44, 22, 11);
var total = 8;

// Bubble sort: repeat passes until no swap is needed
var sorted = false;
while (!sorted) {
    sorted = true;
    for (var i = 0; i < total - 1; i++)
        if (list[i] > list[i+1]) {
            var tmp = list[i];
            list[i] = list[i+1];
            list[i+1] = tmp;
            sorted = false;
        }
}
alert(list);
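
For comparison, the same result can be obtained with the built-in sort() method of Array. This is a sketch of an alternative, not the solution intended by the exercise. Note that without a comparison function sort() compares the elements as strings, so a numeric comparator is passed explicitly.

```javascript
var numbers = [10, 2, 7, 9, -3, 44, 22, 11];

// Default sort: elements are compared as strings, so "10" comes before "2"
var sortedAsStrings = numbers.slice().sort();

// Numeric sort: the comparator returns a negative, zero or positive number
var sortedNumerically = numbers.slice().sort(function (a, b) {
  return a - b;
});
```

slice() is used here only to copy the array, because sort() modifies the array it is called on.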

User defined functions


Syntax:
// Function definition
function myFunction1(var1,var2,...,varX) {
    some code
}

// Function definition with return value
function myFunction2(var1,var2,...,varX) {
    some code
    return some_value;
}

// Function call
myFunction1(var1,var2,...,varX);
x = myFunction2(var1,var2,...,varX);

Example:

// Celsius - Fahrenheit
function fahrenheit(celsius) {
    return celsius * 9 / 5 + 32;
}

JavaScript Objects
JavaScript is an object-oriented language
An object in JavaScript contains:
Properties
Read-only
Both readable and writable
Methods

// Create a new instance of an object
item = new Object();
myArray = new Array(10);

// Access methods and properties
len = myArray.length;
myArray.sort();

It is possible to define your own classes & objects

// Create an empty object, then add a property
money = new Object();
money.quarters = 10;

// The same object, written as an object literal
money = { 'quarters': 10 };

// Object literal with a property and a method
money = {
    'quarters': 10,
    'addQuarters': function(amount) {
        this.quarters += amount;
    }
}
money.addQuarters(10);
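
The object literals above create a single object. When several objects of the same kind are needed, a constructor function with methods on its prototype is a common approach. The sketch below is one way to do it; Wallet, mine and yours are hypothetical names chosen for illustration.

```javascript
// Constructor function: called with new, it initializes a fresh object
function Wallet(quarters) {
  this.quarters = quarters;
}

// Methods placed on the prototype are shared by all Wallet instances
Wallet.prototype.addQuarters = function (amount) {
  this.quarters += amount;
  return this.quarters;
};

var mine = new Wallet(10);
var yours = new Wallet(1);
mine.addQuarters(10);   // only mine changes; yours keeps its own state
```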

Native Objects in JavaScript


The following are native objects defined in JavaScript
Global, Object, Function, Array, String, Boolean, Number, Math, Date, RegExp, Error
The methods of the Global object are called top-level functions. They can be called directly, without an object prefix.

When developing a Web Application, the following objects are also accessible:
Browser Objects (Window, Navigator, History, ...) HTML DOM Objects (Document, Button, Form, ...)

Top-Level functions in JavaScript


eval
Syntax: eval ( string )
Evaluates a string and executes it as if it were script code

<script type="text/javascript">
eval("x=10;y=20;document.write(x*y)");
document.write("<br />" + eval("2+2"));
document.write("<br />" + eval(x+17));
</script>

Output:
200
4
27

encodeURI
Syntax: encodeURI ( uri )
Encodes the given string as a URI, escaping the special characters

decodeURI
Syntax: decodeURI ( uri )
Decodes the given URI by replacing escape sequences with the characters they represent

parseInt
Syntax: parseInt ( string, radix )
Parses a string and returns an integer in the given radix

parseFloat
Syntax: parseFloat ( string )
Parses a string and returns a floating point number
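
A short sketch of these top-level functions in use. The URL and the input strings are made-up sample values; the snippet uses no browser objects.

```javascript
// encodeURI escapes characters that are not allowed in a URI (here: spaces)
var encoded = encodeURI("http://www.example.org/my page.html?q=a b");
// decodeURI reverses the encoding
var decoded = decodeURI(encoded);

// parseInt reads an integer prefix of the string in the given radix
var decimal = parseInt("42px", 10);     // 42, parsing stops at "p"
var fromHex = parseInt("FF", 16);       // 255
var notANumber = parseInt("abc", 10);   // NaN, no digits found

// parseFloat reads a floating point prefix of the string
var pi = parseFloat("3.14 meters");     // 3.14
```

Passing the radix to parseInt explicitly avoids surprises with strings that start with 0 or 0x.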

Array
Declaration:
// Regular array definition
myArray = new Array(10);   // 10 = initial length
// Without specifying the size
foo = new Array();
// With initialization
var myCars = new Array("Saab", "Volvo", "BMW");
// Literal array
var myCars = ["Saab", "Volvo", "BMW"];

Access:
// Read element
alert("Today I drive a " + myCars[0]);
// Modify element
myCars[1] = "Benz";

Most important properties:
length

Most important methods:
concat(), indexOf(), join(), lastIndexOf(), pop(), push(), reverse(), shift(), sort(), toString(), unshift()
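
The following sketch walks through several of the array methods listed above. The array contents are sample values; the comments show the state of the array after each call.

```javascript
var cars = ["Saab", "Volvo", "BMW"];

cars.push("Audi");                      // add at the end:   Saab, Volvo, BMW, Audi
var last = cars.pop();                  // remove the last:  "Audi"
cars.unshift("Fiat");                   // add at the front: Fiat, Saab, Volvo, BMW
var first = cars.shift();               // remove the first: "Fiat"

var position = cars.indexOf("Volvo");   // 1
var joined = cars.join(" - ");          // "Saab - Volvo - BMW"
var more = cars.concat(["Opel"]);       // new array; cars itself is unchanged

cars.reverse();                         // in place: BMW, Volvo, Saab
cars.sort();                            // in place: BMW, Saab, Volvo
var asText = cars.toString();           // "BMW,Saab,Volvo"
```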

String
Declaration:
var txt = 'Hello';
var txt = "Hello";
var txt = new String("hello");

Concatenation:
var txt1 = "Hello";
var txt2 = "World";
alert(txt1 + txt2);

Properties:
length

Methods:
charAt(), indexOf(), lastIndexOf(), search(), concat(), match(), split(), replace(), slice(), substring(), toLowerCase(), toUpperCase()
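
A brief sketch of the most common string methods, applied to a sample string. Strings are immutable, so each method returns a new value and leaves txt unchanged.

```javascript
var txt = "Hello World";

var len = txt.length;                        // 11
var firstChar = txt.charAt(0);               // "H"
var firstO = txt.indexOf("o");               // 4
var lastO = txt.lastIndexOf("o");            // 7
var found = txt.search(/W/);                 // 6 (search accepts a regular expression)
var start = txt.substring(0, 5);             // "Hello"
var end = txt.slice(-5);                     // "World" (negative index counts from the end)
var replaced = txt.replace("World", "there"); // "Hello there"
var upper = txt.toUpperCase();               // "HELLO WORLD"
var words = txt.split(" ");                  // ["Hello", "World"]
```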

Exercise Validate an email address


<html>
<body>
<script type="text/javascript">
/**
 * Function that validates an email address.
 * Parameter: email - the email address to be validated
 * Return: true / false whether or not the address is valid
 */
function emailvalidation(email) {
    var apos = email.indexOf("@");
    var dotpos = email.lastIndexOf(".");
    var lastpos = email.length - 1;
    if (apos < 1 || dotpos - apos < 2 || lastpos - dotpos > 3 || lastpos - dotpos < 2)
        return false;
    else
        return true;
}
</script>
</body>
</html>

Extend the function above to validate the domain using a list of Top Level Domains.
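
One possible approach to this extension is sketched below, under stated assumptions: KNOWN_TLDS is a hypothetical and deliberately short list (a real list of top level domains is much longer), and emailValidationWithTld is an illustrative name. The structural checks on "@" and the last dot follow the exercise; the length check on the last part is replaced by the list lookup.

```javascript
// Hypothetical, incomplete list of top level domains, for illustration only
var KNOWN_TLDS = ["com", "org", "net", "edu", "ro"];

function emailValidationWithTld(email) {
  var apos = email.indexOf("@");
  var dotpos = email.lastIndexOf(".");

  // There must be something before "@" and at least one
  // character between "@" and the last dot
  if (apos < 1 || dotpos - apos < 2) {
    return false;
  }

  // Everything after the last dot is the top level domain
  var tld = email.substring(dotpos + 1).toLowerCase();
  return KNOWN_TLDS.indexOf(tld) !== -1;
}
```

For example, emailValidationWithTld("joe@example.org") returns true, while "joe@example.xyz" is rejected because "xyz" is not in the assumed list.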

Error, Try / Catch, Throw


Instances of Error objects are thrown as exceptions when runtime errors occur

Create and throw an Error object:
throw new Error("Error message");

Any other value, such as a string, can also be thrown:
throw "Error Message";

Properties and Methods of Error:
name, message, toString()

Types of Error:
EvalError, RangeError, ReferenceError, SyntaxError, TypeError, URIError (the ECMAScript specification refers to these collectively as NativeError)

Try / Catch:

<html>
<body>
<script type="text/javascript">
try {
    // The evaluated string contains a syntax error on purpose
    eval("var a='abc'alert(a);");
} catch (err) {
    alert("Error: " + err.name + ":" + err.message);
}
</script>
</body>
</html>
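
The following sketch shows throwing specific error types and reading their name and message properties in a catch block. The functions inverse and describe are illustrative names, not part of the slides; the snippet returns strings instead of calling alert so that it also runs outside a browser.

```javascript
// Throws a different error type depending on what is wrong with the argument
function inverse(x) {
  if (typeof x !== "number") {
    throw new TypeError("x must be a number");
  }
  if (x === 0) {
    throw new RangeError("Division by zero");
  }
  return 1 / x;
}

// Catches any error thrown by inverse and turns it into a message
function describe(x) {
  try {
    return "Result: " + inverse(x);
  } catch (err) {
    return err.name + ": " + err.message;
  }
}
```

describe(4) returns "Result: 0.25", while describe(0) returns "RangeError: Division by zero".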

How do you see errors in browsers?


Internet Explorer:

Firefox:

Debugging JavaScript
For large non-trivial scripts access to a debugger is invaluable
Start / Stop scripts, Add breakpoints Step over, Step into, Step out, ... Inspect values / types of variables

Debuggers in Internet Explorer


Microsoft Visual Studio Microsoft Script Editor (part of MS Office) Microsoft Script Debugger

Debuggers in Firefox
Firebug Venkman

Debuggers in Opera
DragonFly

Debuggers in Chrome
Built in

Input-Output functions
(provided by the browser via the Window object)
Alert Box
Syntax: alert(message);
Displays the message in a dialog box containing an OK button

Confirm Box
Syntax: confirm(question);
Displays the message in a dialog box containing OK and Cancel buttons
Returns true / false

Prompt Box
Syntax: prompt(some text, default value);
Opens a prompt popup, asking the user for input
Returns the input of the user, or null if the user pressed Cancel

<script type="text/javascript">
// Celsius - Fahrenheit
function fahrenheit(celsius) {
    return celsius * 9 / 5 + 32;
}

var c = prompt("Enter a temperature in Celsius", "0");
if (c != null) {
    alert("Fahrenheit value: " + fahrenheit(c));
} else {
    alert("You canceled");
}
</script>

Cheat Sheets - JavaScript


HTML DOM
Academia Tehnică Militară, 2011

HTML DOM Objects


What is the DOM?
Document Object Model
W3C standard
Defines the objects and properties of all document elements, together with methods to access them
Works with HTML and XML documents
Current release:
DOM Level 3, April 2004
This one builds on DOM Level 2 and DOM Level 1

What is the HTML DOM?

A standard object model for HTML
A standard programming interface for HTML
Platform- and language-independent

The HTML DOM is a standard for how to get, change, add or delete HTML elements.

DOM Nodes
In the DOM, everything in an HTML document is a node:
The entire document is a document node
Every HTML element is an element node
The text in the HTML elements is made up of text nodes
Every HTML attribute is an attribute node
Comments are comment nodes

<html>
<head>
<title>My title</title>
</head>
<body>
<h1>My header</h1>
<a href=...>My link</a>
</body>
</html>

The Node Object


Properties
nodeName
nodeName is read-only
nodeName of an element node is the same as the tag name
nodeName of an attribute node is the attribute name
nodeName of a text node is always #text
nodeName of the document node is always #document

nodeValue
nodeValue for element nodes is null
nodeValue for text nodes is the text itself
nodeValue for attribute nodes is the attribute value

nodeType

The Node Object


Other properties:
parentNode, childNodes, firstChild, lastChild, previousSibling, nextSibling, attributes, ownerDocument

Most important functions:
insertBefore(newChild, refChild)
replaceChild(newChild, oldChild)
appendChild(newChild)
hasChildNodes()
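
The nodeType and childNodes properties are enough to walk a whole document tree. The sketch below counts nodes per type; countNodes and fakeBody are illustrative names. To keep the example runnable outside a browser, it is applied to a hand-built stand-in for a small page, which is an assumption made for this sketch; in a browser the same function could be called as countNodes(document.body). The DOM defines nodeType 1 for element nodes, 3 for text nodes and 8 for comment nodes.

```javascript
// Recursively counts the nodes of each nodeType below (and including) node
function countNodes(node, counts) {
  counts = counts || {};
  counts[node.nodeType] = (counts[node.nodeType] || 0) + 1;

  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    countNodes(children[i], counts);
  }
  return counts;
}

// Hand-built stand-in for: <body><h1>My header</h1><a>My link</a></body>
var fakeBody = {
  nodeType: 1, nodeName: "BODY", childNodes: [
    { nodeType: 1, nodeName: "H1", childNodes: [
      { nodeType: 3, nodeName: "#text", nodeValue: "My header", childNodes: [] }
    ] },
    { nodeType: 1, nodeName: "A", childNodes: [
      { nodeType: 3, nodeName: "#text", nodeValue: "My link", childNodes: [] }
    ] }
  ]
};

var nodeCounts = countNodes(fakeBody);   // 3 element nodes, 2 text nodes
```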

Accessing nodes in the DOM


By navigation

<html>
<body>
<a href="http://cnn.com">CNN</a>
<script type="text/javascript">
alert(document.body.firstChild.nextSibling.text);
</script>
</body>
</html>

getElementById()
Returns the element with the given ID

<html>
<body>
<a href="http://cnn.com" id="a1">CNN</a>
<script type="text/javascript">
alert(document.getElementById("a1").text);
</script>
</body>
</html>

getElementsByTagName()
Returns all elements with a given tag name

<html>
<body>
<a href="http://cnn.com" id="a1">CNN</a>
<script type="text/javascript">
alert(document.getElementsByTagName("a")[0].text);
</script>
</body>
</html>

Exercise:
Capitalize first word of each paragraph in a web page
<html>
<head>
<script type="text/javascript">
function capitalize() {
    // Get all paragraphs
    var list = document.getElementsByTagName("p");
    for (var i = 0; i < list.length; i++) {
        var text = list[i].innerHTML;
        // Split the text in words
        var words = text.split(" ");
        // Uppercase the first word
        var newText = words[0].toUpperCase();
        // Add the other words
        for (var j = 1; j < words.length; j++)
            newText = newText + " " + words[j];
        // Modify the text
        list[i].innerHTML = newText;
    }
}
</script>
</head>
<body onLoad="capitalize();">
<p>This is the first paragraph.</p>
<p>This is the second one.</p>
<p>Yet another one.</p>
</body>
</html>

Hints:
1) The innerHTML property gives the HTML content of an element.
2) The split() method splits a string (returns a list of substrings).
3) The toUpperCase() method transforms a string to upper case.

Objects in JavaScript
JavaScript Objects Array String Date Math etc. Browser Objects Window Navigator Screen History Location HTML DOM Objects Document Anchor Body Button Form Image etc.

There is no public standard that applies to the Browser Objects, but most browsers implement them.

The Window Object


The window object represents an open window in a browser If the document contains frames, the browser will create
One window object for the HTML document One additional window object for each frame

Important properties:
frames[] document history location name opener status top

Important Methods:
alert(message) close() confirm(message) open(URL, name, specs, replace) prompt(text, default) resizeBy() resizeTo()

The Navigator Object


The navigator object contains information about the browser.
Important properties:
appCodeName appName appVersion userAgent platform cookieEnabled

Browser detection:
var browser = navigator.appName;
var b_version = navigator.appVersion;
var version = parseFloat(b_version);
document.write("Browser name: " + browser);
document.write("<br />");
document.write("Browser version: " + version);
if (((browser == "Netscape") || (browser == "Microsoft Internet Explorer")) && (version >= 4))
  alert("Your browser is OK.");
else
  alert("Page may not load optimally.");

The History Object


The history object contains the URLs visited by the user (within a browser window). The history object is part of the window object and is accessed through the window.history property. Important properties:
length

Important methods:
back() forward() go(number)
ex: window.history.go(-1);

The Location Object


The location object contains information about the current URL. The location object is part of the window object and is accessed through the window.location property. Important properties:
protocol hostname port pathname href
// Browser detection with redirection
var browser = navigator.appName;
var b_version = navigator.appVersion;
var version = parseFloat(b_version);
document.write("Browser name: " + browser);
document.write("<br />");
document.write("Browser version: " + version);
// The target URLs must be strings
if (browser == "Netscape")
  window.location = "index.netscape.html";
else if (browser == "Microsoft Internet Explorer")
  window.location = "index.msie.html";

Important methods:
reload() replace(newURL)
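
The same component names (protocol, hostname, port, pathname, href) are exposed by the standard URL class, so the way a URL is split up can be explored outside a browser window. This is only an illustration under that assumption: describeUrl and the example address are hypothetical, and window.location itself is not involved.

```javascript
// Splits a URL string into the components listed for the location object.
// describeUrl is a hypothetical helper built on the standard URL class.
function describeUrl(href) {
  const url = new URL(href);
  return {
    protocol: url.protocol, // includes the trailing colon
    hostname: url.hostname,
    port: url.port,         // empty string when the default port is used
    pathname: url.pathname,
    href: url.href
  };
}

console.log(describeUrl("http://www.example.com:8080/docs/index.html?lang=en"));
```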

HTML DOM Objects


The HTML DOM defines objects for each HTML element
Document, Anchor, Body, Button, Form, Image, etc.

The Document Object:


Important properties:
anchors[] forms[] images[] links[] cookie domain title URL

Important methods:
getElementById(id) getElementsByName(name) getElementsByTagName(tag) write(exp1, exp2, exp3, ...)

Events
What are events?
Events are actions that can be detected by the browser Events are associated with an HTML element Events can be associated with a custom handler Events are defined as attributes for HTML elements
<input type="text" size="20" onchange="validateEmail()">

Most important events:

Form Validation
HTML Code

<html> <body> <form action="target.php" method="post" onsubmit="return validateForm(this)"> Email: <input type="text" name="email" size="30"> <br> Password: <input type="password" name="password"> <br> Retype: <input type="password" name="retype"> <br> <input type="submit"> </form> </body> </html>

* Hint * By returning false, the event action is canceled

Form Validation
JavaScript Code
<script type="text/javascript">
// Returns false (and focuses the field) if the field is empty
function validateField(field, text) {
  if ((field.value == null) || (field.value == "")) {
    alert(text);
    field.focus();
    return false;
  }
  return true;
}

function validateForm(f) {
  if (!validateField(f.email, "Email is empty")) return false;
  if (!validateField(f.password, "Password is empty")) return false;
  if (!validateField(f.retype, "Retype is empty")) return false;
  if (f.password.value != f.retype.value) {
    alert("Passwords don't match");
    return false;
  }
  return true;
}
</script>
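
The validation rules themselves can be separated from the form and the alert() calls, which makes them easier to check in isolation. The following is a minimal sketch under that assumption: checkRegistration and the shape of its argument are hypothetical and not part of the slide's code.

```javascript
// Checks the three form values and returns an error message,
// or null when everything is fine. Hypothetical helper.
function checkRegistration(values) {
  const required = ["email", "password", "retype"];
  for (const name of required) {
    if (values[name] == null || values[name] === "") {
      return name + " is empty";
    }
  }
  if (values.password !== values.retype) {
    return "Passwords don't match";
  }
  return null;
}

console.log(checkRegistration({ email: "a@example.com", password: "x", retype: "y" }));
```

An onsubmit handler could then show the returned message and return false whenever the result is not null.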

CSS and DOM


The style of elements can be manipulated via the Style object Style is a property of HTML elements To access the style object:
document.getElementById("id").style.property="value"

The style object contains properties for all style properties defined by css Examples:
document.body.style.background="#FFCC80 url(bgdesert.jpg) repeat-y"; document.getElementById("p1").style.visibility="hidden"; document.getElementById("p1").style.font="italic bold 12px arial,serif"; document.getElementById("p1").style.pageBreakAfter="always";

The className property represents the associated style:


document.getElementById("p1").className = "userParagrah1";

Exercise: Sliding Frame [1]


A sliding frame similar to the one in Google Reader:
Master Frame HTML Code

<HTML>
<HEAD>
<TITLE>Sliding Frame Example</TITLE>
</HEAD>
<FRAMESET cols="250,10,*" id="frameset" frameborder="no">
<FRAME src="l-frm.html" name="l-frm">
<FRAME src="sliding.html" name="s-frm" scrolling="no">
<FRAME src="r-frm.html" name="r-frm">
</FRAMESET>
</HTML>

Exercise: Sliding Frame [2]


Sliding Frame HTML Code
<html>
<head>
<style type="text/css">
body {background:white; padding:0px; margin:0px}
div.right:hover {background:#A0A0A0; height:100%; width:10px}
div.right {background:#F0F0F0; height:100%; width:10px}
div.left:hover {background:#A0A0A0; height:100%; width:10px}
div.left {background:#F0F0F0; height:100%; width:10px}
</style>
<script type="text/javascript" src="slide.js"></script>
</head>
<body onLoad="init();">
<div id="div-toggle" onClick="toggleFrame();">&nbsp;</div>
</body>
</html>

Exercise: Sliding Frame [3]


JavaScript Code
var toggle; function init() { toggle = true; el = document.getElementById("div-toggle"); el.className = "right"; } function toggleFrame() { fs = top.document.getElementById("frameset"); if (toggle == true) { fs.cols = "0,15,*"; toggle = false; el.className = "left"; } else { fs.cols = "250,15,*"; toggle = true; el.className = "right"; } }

Leverage 3rd party JavaScripts


Numerous sites provide JavaScript API
Google Maps, Facebook, Ebay, Amazon, Picasa, etc.

How to embed a map on your homepage (using Google Maps):

<script type="text/javascript" src="http://www.google.com/jsapi?key=ABCDEFG"></script>
<script type="text/javascript">
google.load("maps", "2.x");
// Call this function when the page has been loaded
function initialize() {
  var map = new google.maps.Map2(document.getElementById("map"));
  map.setCenter(new google.maps.LatLng(37.4419, -122.1419), 13);
}
google.setOnLoadCallback(initialize);
</script>

JavaScript Frameworks
Existing JavaScript frameworks provide numerous features:
User Interface (Widgets)
Example: Progress bar, Tree, etc.

Special FX:
Example: Animations, etc.

Data manipulation objects


Example: Atom Parser, CSV Parser

Some of the JavaScript frameworks:


Yahoo User Interface (YUI) The Dojo Toolkit (dojo)

Google Web Toolkit


Toolkit for rapid Web application development Contains a Java-to-JavaScript compiler Both client and server-side code is developed in Java
JavaScript client-side code is then generated from the Java code

Abstracts the browser layer:


Different code is generated for each browser type / version

YUI: TreeView Example

AJAX
Asynchronous JavaScript and XML

AJAX
What is AJAX?
Asynchronous JavaScript and XML
A technique used for client-side programming (not a technology)
Web applications retrieve data asynchronously, in the background
Usually, the XMLHttpRequest object is used
Despite the name, JavaScript and XML are not required
Nor does it need to be asynchronous

Technologies used:
HTML/XHTML and CSS for presentation Document Object Model for dynamic display and interaction with data XML (sometimes JSON) for data interchange The XMLHttpRequest for asynchronous communication JavaScript to glue the technologies together

Example applications:
Google Maps, Google Reader, Google Suggest

The XMLHttpRequest Object


Allows scripts to interact with servers
It is supported by most browsers:
Internet Explorer, Firefox, Chrome, Opera, Safari

Attributes:
readyState: State of the request (4 means the request is complete)
status: HTTP status
responseText: The loaded data as String
responseXML: The loaded data as XML (DOM can be used)
onreadystatechange: Handler for the ready event

Methods:
open(mode, url, asyncMode)
  mode: GET / POST
  url: The URL to open
  asyncMode: true (asynchronous) / false (synchronous)
send(string)
  Sends the request to the server
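
How readyState and status are typically combined in an onreadystatechange handler can be sketched as follows. This is an illustration under stated assumptions: makeStateHandler and its callback parameters are hypothetical names, and the function only relies on the readyState, status and responseText attributes listed above, so it accepts any object that has them.

```javascript
// Builds a handler for the onreadystatechange event of an
// XMLHttpRequest-like object. Hypothetical helper.
function makeStateHandler(xhr, onSuccess, onError) {
  return function () {
    // Ignore intermediate states; 4 means the request is complete
    if (xhr.readyState !== 4) return;
    if (xhr.status === 200) {
      onSuccess(xhr.responseText);
    } else {
      onError(xhr.status);
    }
  };
}

// Stand-in object used only to demonstrate the handler without a browser
const fakeRequest = { readyState: 4, status: 200, responseText: "Anna, Anton" };
makeStateHandler(fakeRequest, console.log, console.error)();
```

With a real request object, the handler would be assigned to its onreadystatechange attribute before calling open() and send().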

How to get the XMLHttpRequest Object


/* Returns an instance of XMLHttpRequest. Deals with differences in browsers. */ function GetXmlHttpObject() { if (window.XMLHttpRequest) { // code for IE7+, Firefox, Chrome, Opera, Safari return new XMLHttpRequest(); } if (window.ActiveXObject) { // code for IE6, IE5 return new ActiveXObject("Microsoft.XMLHTTP"); } return null; }

Suggest Example
Client HTML code

<html> <head> <script src="clienthint.js"></script> </head> <body> <form> First Name: <input type="text" id="txt1" onkeyup="showHint(this.value)" /> </form> <p>Suggestions: <span id="txtHint"></span></p> </body> </html>

Suggest Example
Client JavaScript code
var xmlhttp;

function showHint(str) {
  if (str.length == 0) {
    document.getElementById("txtHint").innerHTML = "";
    return;
  }
  xmlhttp = GetXmlHttpObject();
  if (xmlhttp == null) {
    alert("Your browser does not support XMLHTTP!");
    return;
  }
  // Encode the input so that spaces, & and similar characters survive in the query string
  var url = "gethint.php?q=" + encodeURIComponent(str);
  xmlhttp.onreadystatechange = stateChanged;
  xmlhttp.open("GET", url, true);
  xmlhttp.send(null);
}

function stateChanged() {
  // 4 = the request is complete
  if (xmlhttp.readyState == 4) {
    document.getElementById("txtHint").innerHTML = xmlhttp.responseText;
  }
}

Suggest Example
Server PHP code
You write it!

Drawbacks of AJAX
If JavaScript is not activated, AJAX won't work
Because data is loaded dynamically:
Content will not be indexed by search engines
It is hard to bookmark the content

AJAX pages are substantially harder to develop than static pages
Pages do not automatically register with the browser history engine:
The back button may not work
This may be overcome.

Acknowledgement
Slides containing the w3schools.com logo contain information taken from www.w3schools.com The use is permitted for academic purposes and these materials shall not be freely made available on the internet.

eXtensible Markup Language


Academia Tehnică Militară, 2011

Introduction
What is XML?
XML stands for eXtensible Markup Language XML is not itself a markup language
It is a specification for defining markup languages

XML was designed to carry data, not to display data XML tags are not predefined. You must define your own tags XML is designed to be self-descriptive XML has strong support via Unicode for many languages XML is a W3C Recommendation There are hundreds of XML-based languages
RSS, SOAP, XHTML, SVG, SMIL, Office Open XML, OpenDocument, etc.

XML Technologies
XML Languages
Languages defined on top of XML
XHTML, SOAP, WSDL, RSS, SMIL, SVG

Schema definition languages


Languages for definition of XML content
DTD, XML Schema, RELAX NG, Schematron

XML Processing
Languages for processing XML
XPath, XQuery, XSLT, XSL-FO

XML APIs
Programming interfaces for dealing with XML
DOM, SAX, StAX

XML Applications
Applications using any of the above technologies
Web Services (SOAP, WSDL, UDDI)

History & Status


Standard Generalized Markup Language SGML
Originated in the 1970s SGML is not itself a markup language
Rather a specification for defining markup languages

Applied by military, aerospace, technical reference Very complex

HTML
Simplified and specialized version of SGML Not powerful enough, not rigorous enough

XML
Developed and maintained by W3C
Current versions:
XML 1.0, 5th edition (published on 26.11.2008) XML 1.1, 2nd edition (published on 16.08.2006)

For more about history, read: http://www.itwriting.com/xmlintro.php

First example
<?xml version="1.0"?> <note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>

XML does not DO anything XML is used for data serialization XML is just plain text With XML you invent your own tags XML is a complement (not replacement) for HTML

General structure
An XML document consists of the following: Prolog
Contains meta-information about the XML document
XML Version File encoding - ex. UTF-8, ISO-8859-1, etc. Processing instructions

Document Type Definition


Defines the description of the document structure

Instance
Consists of the actual XML data
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE note SYSTEM "hello.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

XML Tree
XML documents must contain a single root element All elements can have sub-elements
<root>
<child>
<sub-child> ... </sub-child>
...
</child>
</root>

XML Tree

<bookstore> <book category="COOKING"> <title lang="en">Everyday Italian</title> <author>Giada De Laurentiis</author> <year>2005</year> <price>30.00</price> </book> <book category="CHILDREN"> <title lang="en">Harry Potter</title> <author>J K. Rowling</author> <year>2005</year> <price>29.99</price> </book> <book category="WEB"> <title lang="en">Learning XML</title> <author>Erik T. Ray</author> <year>2003</year> <price>39.95</price> </book> </bookstore>

Parent Child Sibling

Syntax
1. All elements must have a closing tag
2. XML tags are case-sensitive
3. XML documents must be properly nested
4. XML documents must have one single root element
5. XML attribute values must be quoted
6. Entity references are similar to HTML
7. Comments in XML have the same syntax as in HTML
8. White spaces are preserved in XML

Find the flaws

All elements must have a closing tag (1)
XML tags are case-sensitive (2)
XML documents must be properly nested (1)
XML documents must have one single root element (1)
XML attribute values must be quoted (3)
Entity references are similar to HTML (1)
Comments in XML have the same syntax as in HTML (0)
White spaces are preserved in XML (0)

<?xml version="1.0"?>
<book category="COOKING">
<title lang=de>Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
<!-- No description for this book -->
</book>
<book category="CHILDREN">
<title lang=en>Harry Potter</title>
<author>J K. Rowling & Other</author>
<year>2005</year>
<price>29.99</price>
<description>Children <i> book</description>
</Book>
<book category="WEB">
<title lang=en>Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</Year>
<price>39.95</price>
<description>The best <b><i>XML Book</b></i> on the market &lt; 50$</description>
</book>

XML Elements
What is an element?
An XML element is everything from (including) the element's start tag to (including) the element's end tag An element can contain other elements, simple text or a mixture of both Elements can also have attributes

Naming rules
Names can contain letters, numbers, and other characters Names cannot start with a number or punctuation character Names cannot start with the letters xml (or XML, or Xml, etc) Names cannot contain spaces No names are reserved, all names can be used Non-English characters are perfectly legal

XML Elements
Elements can be easily extended
Most applications will support such data changes without modifications
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE note SYSTEM "hello.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<body>Don't forget me this weekend!</body>
</note>

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE note SYSTEM "hello-new.dtd">
<note date="2008-01-10">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

Attributes
XML elements can have attributes in the start tag
Similar to HTML

XML attributes provide more information about elements XML Attributes must be quoted
<person sex="female"> <person sex='female'> <gangster name='George "Shotgun" Ziegler'> <gangster name="George &quot;Shotgun&quot; Ziegler">
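
When attribute values are generated by a program, the quoting rule is usually satisfied by replacing the special characters with entity references before the value is written. A minimal sketch, assuming the value is always wrapped in double quotes; escapeAttribute is a hypothetical helper name:

```javascript
// Escapes a string so that it can be placed inside a
// double-quoted XML attribute value. Hypothetical helper.
function escapeAttribute(value) {
  return String(value)
    .replace(/&/g, "&amp;") // must come first, otherwise the other entities are escaped twice
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

const tag = '<gangster name="' + escapeAttribute('George "Shotgun" Ziegler') + '">';
console.log(tag);
```

This produces the last form shown above, with &quot; in place of the inner double quotes.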

XML Attributes vs. XML Elements


<note date="10/01/2008"> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>How are you?</body> </note> <note> <date>10/01/2008</date> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>How are you?</body> </note> <note> <date> <day>10</day> <month>01</month> <year>2008</year> </date> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>How are you?</body> </note>

There are no rules about whether a piece of information should be an element or an attribute
When deciding, consider the following:
Attributes cannot contain multiple values (elements can)
Attributes cannot contain tree structures (elements can)
Attributes are not easily expandable (for future changes)

Good / Bad examples


<note day="10" month="01" year="2008" to="Tove" from="Jani" heading="Reminder" body="Don't forget me this weekend!"> </note> <messages> <note id="501"> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note> <note id="502"> <to>Jani</to> <from>Tove</from> <heading>Re: Reminder</heading> <body>I will not</body> </note> </messages>

Use elements for data. Use attributes for information that is not relevant to the data.

XML Namespaces
<table> <tr> <td>Apples</td> <td>Bananas</td> </tr> </table> <table> <name>African Coffee Table</name> <width>80</width> <length>120</length> </table>

Elements in XML are not fixed, but defined by developers Assume you want to combine the two, in one single document
A name conflict can occur:
Table has a different meaning in the two documents

XML Namespaces
<root> <h:table xmlns:h="http://www.w3.org/TR/html4/"> <h:tr> <h:td>Apples</h:td> <h:td>Bananas</h:td> </h:tr> </h:table> <f:table xmlns:f="http://www.w3schools.com/furniture"> <f:name>African Coffee Table</f:name> <f:width>80</f:width> <f:length>120</f:length> </f:table> </root>

Name conflicts are resolved with prefixes The namespace is defined by the xmlns attribute in the start tag of an element. The namespace declaration has the following syntax:

xmlns:prefix="URI"

XML Namespaces
Some rules:
The namespace is defined by the xmlns attribute in the start tag of an element
When a namespace is defined for an element, all child elements with the same prefix are associated with the same namespace.
Namespaces can be declared in the elements where they are used or in the XML root element
The namespace URI is not used by the parser, it is simply an identifier
A default namespace can be defined for all child elements: xmlns="namespaceURI"
<table xmlns="http://www.w3.org/TR/html4/"> <tr> <td>Apples</td> <td>Bananas</td> </tr> </table>
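
The scoping rules above can be sketched as a lookup through the declarations that are in effect for an element, from the innermost element outwards. This is an illustration, not how any particular parser is implemented: resolveName and the scopes data structure are assumptions. Each scope is an object mapping a prefix to a URI, with the empty string standing for the default namespace.

```javascript
// Resolves an element name such as "h:td" or "tr" to its namespace URI.
// scopes: declarations in effect, outermost element first. Hypothetical helper.
function resolveName(qname, scopes) {
  const colon = qname.indexOf(":");
  const prefix = colon === -1 ? "" : qname.slice(0, colon);
  const local = colon === -1 ? qname : qname.slice(colon + 1);
  // The innermost declaration of a prefix wins
  for (let i = scopes.length - 1; i >= 0; i--) {
    if (Object.prototype.hasOwnProperty.call(scopes[i], prefix)) {
      return { namespace: scopes[i][prefix], local: local };
    }
  }
  // No declaration found: the element is in no namespace
  // (for a non-empty prefix a real parser would report an error)
  return { namespace: null, local: local };
}

// <root> declares nothing, <h:table> declares the prefix h
const scopes = [{}, { h: "http://www.w3.org/TR/html4/" }];
console.log(resolveName("h:td", scopes));
```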

Find the namespace


<?xml version="1.0" encoding="UTF-8"?> <Invoice xmlns:it="http://www.example.ro/item" xmlns="http://www.example.ro/invoice"> <Header invoiceNumber="12345"> <Date xmlns="http://www.example.ro/date-def"> <Month>July</Month> <Day>15</Day> <Year>2001</Year> </Date> <BillTo custNumber="X5739" name="Milton McGoo" phone="416-448-4414"> <ad:address xmlns:ad="http://www.external.com/address"> <ad:street1>IBM</ad:street1> <ad:street2>1150 Eglinton Ave East</ad:street2> <ad:city>Toronto</ad:city> <ad:state>Ontario</ad:state> <ad:zip>M3C 1H7</ad:zip> <ad:country>Canada</ad:country> </ad:address> </BillTo> </Header> <it:Item discount="promotion" price="57"> <it:description>high speed 3D graphics card</it:description> </it:Item> </Invoice>

Namespaces
<?xml version="1.0" encoding="UTF-8"?>
<in:Invoice xmlns:in="http://www.example.ro/invoice">
<in:Date>
<in:Month>July</in:Month>
<in:Day>15</in:Day>
<in:Year>2001</in:Year>
</in:Date>
<in:Item discount="promotion" price="57">
<in:description>high speed 3D graphics card</in:description>
</in:Item>
</in:Invoice>

<?xml version="1.0" encoding="UTF-8"?> <other:Invoice xmlns:other="http://www.example.ro/invoice"> <other:Date> <other:Month>July</other:Month> <other:Day>15</other:Day> <other:Year>2001</other:Year> </other:Date> <other:Item discount="promotion" price="57"> <other:description>high speed 3D graphics card</other:description> </other:Item> </other:Invoice>

Well formed XML


Well formed: An XML document is well formed if it conforms to the XML specification. Syntax rules:
XML documents must have a root element XML elements must have a closing tag XML tags are case sensitive XML elements must be properly nested XML attribute values must be quoted
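
Some of these rules can be checked mechanically with a stack of open tags. The sketch below is a deliberately simplified illustration and not a real XML parser: checkNesting is a hypothetical name, it covers only nesting, case-sensitive tag matching and the single-root rule, and it assumes no CDATA sections, no internal DTD subset and no ">" characters inside attribute values.

```javascript
// Returns null if the tags are properly nested under a single root,
// otherwise a short description of the first problem found.
function checkNesting(xml) {
  // Drop the prolog, processing instructions, comments and <!...> declarations
  const body = xml.replace(/<\?[\s\S]*?\?>|<!--[\s\S]*?-->|<![^>]*>/g, "");
  // Group 1: "/" for a closing tag, group 2: tag name, group 3: "/" for an empty element
  const tagPattern = /<(\/?)([^\s\/>]+)[^>]*?(\/?)>/g;
  const open = [];
  let roots = 0;
  let match;
  while ((match = tagPattern.exec(body)) !== null) {
    const name = match[2];
    if (match[1] === "/") {
      if (open.length === 0) return "Unexpected closing tag </" + name + ">";
      const expected = open.pop();
      // Comparison is case-sensitive, so <book> ... </Book> is an error
      if (expected !== name) {
        return "Expected </" + expected + "> but found </" + name + ">";
      }
    } else {
      if (open.length === 0) roots++;
      if (match[3] !== "/") open.push(name);
    }
  }
  if (open.length > 0) return "Missing closing tag for <" + open[open.length - 1] + ">";
  if (roots !== 1) return "Expected exactly one root element, found " + roots;
  return null;
}

console.log(checkNesting("<note><to>Tove</to></note>"));
console.log(checkNesting("<b><i>XML Book</b></i>"));
```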

Valid XML
Valid: An XML document is called valid if it is well formed and it conforms to the rules of a Document Type Definition (DTD) or an XML Schema
<?xml version="1.0" encoding="ISO-8859-1"?> <note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note> <?xml version="1.0" encoding="ISO-8859-1"?> <!DOCTYPE note SYSTEM "Note.dtd"> <note> <to>Tove</to> <from>Jani</from> <heading>Reminder</heading> <body>Don't forget me this weekend!</body> </note>

Well-Formed

Valid

XML DTD
The purpose of a DTD (Document Type Definition) is to define the legal building blocks of an XML document. A DTD defines the document structure with a list of legal elements and attributes. DTD is part of the XML Specification
<!DOCTYPE NEWSPAPER [
<!ELEMENT NEWSPAPER (ARTICLE+)>
<!ELEMENT ARTICLE (HEADLINE,BYLINE,LEAD,BODY,NOTES)>
<!ELEMENT HEADLINE (#PCDATA)>
<!ELEMENT BYLINE (#PCDATA)>
<!ELEMENT LEAD (#PCDATA)>
<!ELEMENT BODY (#PCDATA)>
<!ELEMENT NOTES (#PCDATA)>
<!ATTLIST ARTICLE AUTHOR CDATA #REQUIRED>
<!ATTLIST ARTICLE EDITOR CDATA #IMPLIED>
<!ATTLIST ARTICLE DATE CDATA #IMPLIED>
<!ATTLIST ARTICLE EDITION CDATA #IMPLIED>
]>

DTD declaration
Internal Declaration: <!DOCTYPE root-element [element-declarations]>
<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Hi there!</body>
</note>

External Declaration: <!DOCTYPE root-element SYSTEM "filename">
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Hi there!</body>
</note>

note.dtd:
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>

DTD Declaring elements


Declaration: <!ELEMENT element-name category> or <!ELEMENT element-name (element-content)>
<!ELEMENT element-name EMPTY>
<!ELEMENT element-name (#PCDATA)>
<!ELEMENT element-name ANY>
<!ELEMENT element-name (child1)>
<!ELEMENT element-name (child1,child2,...)>

<!ELEMENT element-name (child-name)>
<!ELEMENT element-name (child-name+)>
<!ELEMENT element-name (child-name*)>
<!ELEMENT element-name (child-name?)>
<!ELEMENT element-name (child1|child2)>

<!ELEMENT note (#PCDATA|to|from|header|message)*>
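
The content models above behave much like regular expressions over the sequence of child elements: a comma means "followed by", | means "or", and +, * and ? mean one or more, zero or more, and optional. The sketch below makes that analogy concrete. It is only an illustration: contentModelToRegExp and matchesModel are hypothetical helpers, and they handle element-only content models (no #PCDATA, EMPTY or ANY).

```javascript
// Translates an element-only DTD content model, e.g. "(to,from,heading,body)",
// into a regular expression over child names, each followed by a space.
function contentModelToRegExp(model) {
  const body = model
    .replace(/\s+/g, "")
    // Every element name becomes a group that matches "name "
    .replace(/[A-Za-z_][\w.:-]*/g, function (name) {
      return "(?:" + name.replace(/\./g, "\\.") + " )";
    })
    // A comma means plain sequence, which is concatenation in a regular expression
    .replace(/,/g, "");
  return new RegExp("^" + body + "$");
}

// Checks a list of child element names against a content model
function matchesModel(model, children) {
  const sequence = children.map(function (name) { return name + " "; }).join("");
  return contentModelToRegExp(model).test(sequence);
}

console.log(matchesModel("(to,from,heading,body)", ["to", "from", "heading", "body"]));
console.log(matchesModel("(to,from,heading,body)", ["to", "from", "body"]));
```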

DTD Declaring attributes


Declaration:
<!ATTLIST element-name attribute-name attribute-type default-value>

Attribute Type:

Default Value:

DTD Attributes Examples


<!ATTLIST payment type CDATA "check">

<!ELEMENT square EMPTY>
<!ATTLIST square width CDATA "0">
<square width="100" />
<square width="-10" />

<!ATTLIST person number CDATA #REQUIRED>
<person number="5677" />
<person />

<!ATTLIST contact fax CDATA #IMPLIED>
<contact fax="555-667788" />
<contact />

<!ATTLIST sender company CDATA #FIXED "Microsoft">
<sender company="Microsoft" />
<sender company="W3Schools" />

<!ATTLIST payment type (check|cash) "cash">
<payment type="check" />
<payment type="cash" />
<payment type="credit" />

DTD Entities
Entities are variables used to define shortcuts to standard text or special characters.
Entity references are references to entities Entities can be declared internal or external

Declaration:
<!ENTITY name "entity_value">

Internal entity
<?xml version="1.0" standalone="yes" ?>
<!DOCTYPE author [
<!ELEMENT author (#PCDATA)>
<!ENTITY js "Jo Smith">
]>
<author>&js;</author>

External entity
<?xml version="1.0" standalone="no" ?>
<!DOCTYPE copyright [
<!ELEMENT copyright (#PCDATA)>
<!ENTITY c SYSTEM "http://www.xmlwriter.net/copyright.xml">
]>
<copyright>&c;</copyright>

To make things more complicated, you can even have entities inside entities

Write a valid XML document


<?xml version="1.0" standalone="no" ?>
<!DOCTYPE NEWSPAPER [
<!ELEMENT NEWSPAPER (ARTICLE+)>
<!ELEMENT ARTICLE (HEADLINE,BYLINE,LEAD,BODY,NOTES)>
<!ELEMENT HEADLINE (#PCDATA)>
<!ELEMENT BYLINE (#PCDATA)>
<!ELEMENT LEAD (#PCDATA)>
<!ELEMENT BODY (#PCDATA)>
<!ELEMENT NOTES (#PCDATA)>
<!ATTLIST ARTICLE AUTHOR CDATA #REQUIRED>
<!ATTLIST ARTICLE EDITOR CDATA #IMPLIED>
<!ATTLIST ARTICLE DATE CDATA #IMPLIED>
<!ATTLIST ARTICLE EDITION CDATA #IMPLIED>
<!ENTITY NEWSPAPER "Vervet Logic Times">
<!ENTITY PUBLISHER "Vervet Logic Press">
<!ENTITY COPYRIGHT "Copyright 1998 Vervet Logic Press">
]>
<NEWSPAPER>
<ARTICLE AUTHOR="Cristian">
<HEADLINE>University today</HEADLINE>
<BYLINE/>
<LEAD>A survey on the quality of education</LEAD>
<BODY>The body of the article</BODY>
<NOTES>Published by &PUBLISHER; &COPYRIGHT;</NOTES>
</ARTICLE>
</NEWSPAPER>

XML Schema
What is XML Schema?
XML Schema is an XML-based alternative to DTD. An XML Schema describes the structure of an XML document.

Why XML Schema?


XML Schemas are extensible to future additions XML Schemas are richer and more powerful than DTDs XML Schemas are written in XML XML Schemas support data types XML Schemas support namespaces

Status
XML Schema became a W3C Recommendation on 2 May 2001.

Example
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.w3schools.com" xmlns="http://www.w3schools.com" elementFormDefault="qualified">
<xs:element name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>

<?xml version="1.0"?>
<note xmlns="http://www.w3schools.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3schools.com note.xsd">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

XML Editors
XML is text based and can be edited with any editor
Notepad
You can edit XML; however, it lacks:
Syntax highlighting Validation

Notepad ++ [http://notepad-plus.sourceforge.net]
Syntax highlighting No validation

Altova XML Spy [http://www.altova.com/]


Syntax highlighting, XML Validation, XSD definition, XPath evaluation, etc. Commercial license

Other:
Most IDEs today come with an intelligent XML editor For example: Eclipse [http://www.eclipse.org/]

How to view, verify syntax and validate XML?


How to view an XML file?
Open it with a browser To view the source:
Right-Click View Source

How to verify the syntax of an XML file?


Open it with a browser

How to validate an XML file?


Open it with a browser

Conclusions

Conclusion
Well formed: An XML document is well formed if it conforms to the XML specification. Valid: An XML document is called valid if it is well formed and it conforms to the rules of a Document Type Definition (DTD) or an XML Schema DTD: A DTD defines the document structure with a list of legal elements and attributes. XML Schema: XML Schema is an XML-based alternative to DTD

Cheat Sheet

References
XML and XML Schema on Wikipedia:
http://en.wikipedia.org/wiki/XML http://en.wikipedia.org/wiki/XML_Schema_(W3C)

W3C XML homepage:


http://www.w3.org/XML/

XML Tutorial on w3schools.com:


http://www.w3schools.com/xml/

Stay up-to-date about XML Technologies


http://xml.coverpages.org/

there is even a Romanian translation


http://xml.coverpages.org/REC-xml-19980210-ro.html


XML DOM + Really Simple Syndication


Academia Tehnică Militară, 2011

What is the DOM?


What is the DOM?
Document Object Model W3C standard Defines the objects and properties of all document elements together with methods to access them Works with any structured document, ex. HTML and XML documents Current release:
DOM Level 3, April 2004 This one builds on DOM Level 2 and DOM Level 1

What is the XML DOM?


A standard object model for XML Documents
A standard programming interface for XML Documents
Platform- and language-independent

The XML DOM is a standard for how to get, change, add or delete XML elements from a scripting / programming language.

The DOM Structure Model


The DOM presents documents as a hierarchy of Node objects Nodes can be of various types
Each sub-type implements additional methods and properties

Some nodes have child nodes of different types


<?xml version="1.0"?> <books> <book> <author>Carson</author> <price format="dollar">31.95</price> <pubdate>05/01/2001</pubdate> </book> <pubinfo> <publisher>MSPress</publisher> <state>WA</state> </pubinfo> </books>

Important Methods and Properties


Typical DOM properties
x.nodeName: the name of x
x.nodeValue: the value of x
x.parentNode: the parent node of x
x.childNodes: the child nodes of x
x.attributes: the attribute nodes of x

Typical DOM methods


x.getElementsByTagName(name): get all elements with a specified tag name
x.appendChild(node): append a child node to x
x.removeChild(node): remove a child node from x

Example:
txt = xmlDoc.getElementsByTagName("title")[0].childNodes[0].nodeValue
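
The example reads the text of an element through its first child because the text is stored in a separate text node. When an element contains nested elements, the same properties (nodeType, nodeValue, childNodes) can be used to walk the whole subtree. A minimal sketch: collectText is a hypothetical helper, and the plain object at the end only stands in for a real DOM node so that the function can be tried without a parser.

```javascript
// Concatenates the values of all text nodes below the given node.
// Relies only on nodeType, nodeValue and childNodes.
function collectText(node) {
  if (node.nodeType === 3) {
    // nodeType 3 is a text node; its value is the text itself
    return node.nodeValue;
  }
  let text = "";
  const children = node.childNodes || [];
  for (let i = 0; i < children.length; i++) {
    text += collectText(children[i]);
  }
  return text;
}

// Stand-in for the node of <title>Everyday Italian</title>
const titleNode = {
  nodeType: 1,
  nodeName: "title",
  nodeValue: null,
  childNodes: [{ nodeType: 3, nodeName: "#text", nodeValue: "Everyday Italian", childNodes: [] }]
};
console.log(collectText(titleNode));
```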

Loading & Parsing XML Documents


The XMLHttpRequest object
Implements an interface exposed by a scripting engine that allows scripts to perform HTTP client functionality, such as submitting form data or loading data from a server

XMLHttpRequest can be used to load and parse XML Documents


responseXML returns the loaded data as XML (DOM Document Object)

// Loads an XML document via XMLHttpRequest
function loadXMLDoc(docName) {
  var xhttp;
  if (window.XMLHttpRequest) {
    xhttp = new XMLHttpRequest();
  } else {
    xhttp = new ActiveXObject("Microsoft.XMLHTTP");
  }
  try {
    // false = synchronous: send() returns only after the document is loaded
    xhttp.open("GET", docName, false);
    xhttp.send();
  } catch (e) {
    alert("Unable to load XML: " + e);
    return null;
  }
  return xhttp.responseXML;
}

AJAX XML Exercise


Build a suggest function for an online music library The details about the CDs are stored in an XML file

<html> <head> <script src="clienthint.js"></script> </head> <body> <h1>Online Music Library</h1> <form> Search Title: <input type="text" id="txt1" onkeyup="showHint(this.value)" /> </form> <p>Suggestions: <span id="txtHint"></span></p> </body> </html>

The XML Document


<?xml version='1.0'?> <catalog> <cd> <title>Empire Burlesque</title> <artist>Bob Dylan</artist> <country>USA</country> <company>Columbia</company> <price>10.90</price> <year>1985</year> </cd> <cd> <title>Hide your heart</title> <artist>Bonnie Tyler</artist> <country>UK</country> <company>CBS Records</company> <price>9.90</price> <year>1988</year> </cd> ............................. </catalog>

The JavaScript
// Load the CD database
var library = loadXMLDoc("cd.xml");

// Called on each update of the search field
function showHint(str) {
  var suggest = "";
  var upperStr = str.toUpperCase();
  // XML tag names are case-sensitive: the document uses <title>, not <TITLE>
  var list = library.getElementsByTagName("title");
  for (var i = 0; i < list.length; i++) {
    if (list[i].firstChild.nodeValue.toUpperCase().indexOf(upperStr) == 0) {
      if (suggest.length > 0) suggest += ", ";
      suggest += list[i].firstChild.nodeValue;
    }
  }
  document.getElementById("txtHint").innerHTML = suggest;
}
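
The matching rule used by the suggest function (case-insensitive prefix match, results joined with commas) can be tried without a browser when it is written against a plain array of titles. This is a sketch under that assumption: suggestTitles is a hypothetical helper, and unlike the slide's code it returns no suggestions for an empty search string.

```javascript
// Returns the titles that start with the input, ignoring case,
// as a comma-separated string. Hypothetical helper.
function suggestTitles(titles, input) {
  const prefix = input.toUpperCase();
  // Assumption: an empty search field should not list the whole library
  if (prefix.length === 0) return "";
  return titles
    .filter(function (title) { return title.toUpperCase().indexOf(prefix) === 0; })
    .join(", ");
}

const titles = ["Empire Burlesque", "Hide your heart"];
console.log(suggestTitles(titles, "h"));
```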

Really Simple Syndication

What is RSS?
What is RSS?
RSS stands for Really Simple Syndication RSS allows you to syndicate your site content RSS defines an easy way to share and view headlines and content RSS files can be automatically updated RSS allows personalized views for different sites RSS is written in XML

How does this work ?


The Website or Blog creates an RSS feed. The RSS Feed lives on an Internet Server The Feed Reader reads the RSS file and displays it.

Benefits
Users do not disclose their email address when subscribing to a feed, which avoids exposure to spam, viruses, phishing, and identity theft.
Users do not have to send an unsubscribe request to stop receiving news.
Feed items are sorted automatically, because each feed URL has its own set of entries (unlike an email inbox).

Short History / Terminology


Web Feed
Generic name for data format used for providing users with frequently updated content

Technology for web feeds


RSS
Atom
  Atom Syndication Format
  Atom Publishing Protocol

History
1999: Netscape develops RSS 0.90
2000: Userland develops RSS 0.92
2003: RSS 2.0 is published

Status
RSS is an informal specification, not published by a standards consortium

Syntax
RSS is XML
All XML syntax rules apply
<?xml version="1.0" encoding="ISO-8859-1" ?>
<rss version="2.0">
  <channel>
    <title>W3Schools Home Page</title>
    <link>http://www.w3schools.com</link>
    <description>Free web building tutorials</description>
    <item>
      <title>RSS Tutorial</title>
      <link>http://www.w3schools.com/rss</link>
      <description>New RSS tutorial on W3Schools</description>
    </item>
    <item>
      <title>XML Tutorial</title>
      <link>http://www.w3schools.com/xml</link>
      <description>New XML tutorial on W3Schools</description>
    </item>
  </channel>
</rss>

Channel
The distribution channel describes the RSS feed.

Mandatory:
<title> Defines the title of the channel
<link> Defines the hyperlink to the channel
<description> Describes the channel
<item> Each <item> element defines an article or "story" in an RSS feed

Optional:
<category> Defines one or more categories for the feed
<copyright> Notifies about copyrighted material
<image> Allows an image to be displayed when aggregators present a feed

Channel - example
<?xml version="1.0" encoding="ISO-8859-1" ?>
<rss version="2.0">
  <channel>
    <title>W3Schools Home Page</title>
    <link>http://www.w3schools.com</link>
    <description>Free web building tutorials</description>
    <category>Web development</category>
    <copyright>2006 Refsnes Data as. All rights reserved.</copyright>
    <image>
      <url>http://www.w3schools.com/images/logo.gif</url>
      <title>W3Schools.com</title>
      <link>http://www.w3schools.com</link>
    </image>
    <item>
      <title>RSS Tutorial</title>
      <link>http://www.w3schools.com/rss</link>
      <description>New RSS tutorial on W3Schools</description>
    </item>
  </channel>
</rss>

Items
Each <item> element defines an article or "story" in an RSS feed.

Mandatory:
<title> Defines the title of the item
<link> Defines the hyperlink to the item
<description> Describes the item

Optional:
<author> Specifies the e-mail address of the author of an item
<comments> Allows an item to link to comments about that item
<enclosure> Allows a media file to be included with an item

Item - example
<?xml version="1.0" encoding="ISO-8859-1" ?>
<rss version="2.0">
  <channel>
    <title>W3Schools Home Page</title>
    <link>http://www.w3schools.com</link>
    <description>Free web building tutorials</description>
    <item>
      <title>RSS Tutorial</title>
      <link>http://www.w3schools.com/rss</link>
      <description>New RSS tutorial on W3Schools</description>
      <author>hege@refsnesdata.no</author>
      <comments>http://www.w3schools.com/comments</comments>
      <enclosure url="http://www.w3schools.com/rss/rss.mp3" length="5000" type="audio/mpeg" />
    </item>
  </channel>
</rss>
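Because RSS is plain XML, a site can generate its feed from whatever data it already has. The following is a minimal sketch of that step, assuming the channel is available as a plain JavaScript object; the function names (escapeXml, tag, buildRss) and the object shape are hypothetical, and the sketch declares UTF-8 rather than the ISO-8859-1 used in the examples above. The important detail is the escaping: all XML syntax rules apply, so characters such as & and < must not appear raw in element content.

```javascript
// Escapes the five characters that are special in XML text and attribute values.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Renders <name>value</name> with the given indentation.
const tag = (name, value, indent) =>
  `${indent}<${name}>${escapeXml(value)}</${name}>`;

// Builds an RSS 2.0 document from a plain object (assumed shape:
// { title, link, description, items: [{ title, link, description }] }).
function buildRss(channel) {
  const lines = [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<rss version="2.0">',
    "  <channel>",
    tag("title", channel.title, "    "),
    tag("link", channel.link, "    "),
    tag("description", channel.description, "    "),
  ];
  for (const item of channel.items) {
    lines.push(
      "    <item>",
      tag("title", item.title, "      "),
      tag("link", item.link, "      "),
      tag("description", item.description, "      "),
      "    </item>"
    );
  }
  lines.push("  </channel>", "</rss>");
  return lines.join("\n");
}

console.log(buildRss({
  title: "W3Schools Home Page",
  link: "http://www.w3schools.com",
  description: "Free web building tutorials",
  items: [{
    title: "RSS & XML Tutorial",
    link: "http://www.w3schools.com/rss",
    description: "New RSS tutorial on W3Schools",
  }],
}));
```

Optional elements such as <author> or <enclosure> could be added in the same way; they are left out here to keep the sketch short.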

Exercise: Build a DTD for RSS 2.0


<!ELEMENT rss (channel)>
<!ATTLIST rss version CDATA #FIXED "2.0">

<!-- A channel can apparently either have one or more items, or just a title, link, and description of its own -->
<!ELEMENT channel ((item+) |
                   (title, link, description,
                    (language | copyright | managingEditor | webMaster | pubDate | lastBuildDate |
                     category | generator | docs | cloud | ttl | image |
                     textInput | skipHours | skipDays)*))>

<!ELEMENT item ((title | description)+, link?,
                (author | category | comments | enclosure | guid | pubDate | source)*)>

<!ELEMENT author (#PCDATA)>
<!ELEMENT category (#PCDATA)>
<!ATTLIST category domain CDATA #IMPLIED>
<!ELEMENT cloud (#PCDATA)>
<!ATTLIST cloud domain CDATA #IMPLIED
                port CDATA #IMPLIED
                path CDATA #IMPLIED
                registerProcedure CDATA #IMPLIED
                protocol CDATA #IMPLIED>
<!ELEMENT comments (#PCDATA)>
<!ELEMENT copyright (#PCDATA)>
..........
For a complete DTD see www.silmaril.ie/software/rss2.dtd

Linking RSS to Websites


To let browsers know there is a feed on a website
<link rel="alternate" type="application/rss+xml" href="http://www.xul.fr/rss.xml" title="Your title">

How to validate a feed?

Via DTD or XML Schema
http://www.feedvalidator.org/

RSS in practice
RSS is used for:
News feeds
http://www.hotnews.ro/rss
http://rss.cnn.com/rss/cnn_topstories.rss
http://rss.slashdot.org/Slashdot/slashdot

Podcasts
http://www.manager-tools.com/feeds/mt_podcasts.xml
http://www.radioguerrilla.ro/rss/podcast

Video podcasts
http://feeds.feedburner.com/tedtalks_video

BitTorrent and RSS [Broadcatching]

Many more

References
Wikipedia
http://en.wikipedia.org/wiki/Web_feed
http://en.wikipedia.org/wiki/RSS

Tutorials
http://www.w3schools.com/rss/
http://www.xul.fr/en-xml-rss.html

Feed validator
http://www.feedvalidator.org/

Some RSS feeds


http://rss.cnn.com/rss/cnn_topstories.rss
http://rss.slashdot.org/Slashdot/slashdot
http://www.joelonsoftware.com/rss.xml
http://www.manager-tools.com/feeds/mt_podcasts.xml

Acknowledgement
Slides containing the w3schools.com logo contain information taken from www.w3schools.com. Use is permitted for academic purposes; these materials shall not be made freely available on the internet.

XPath
Academia Tehnică Militară, 2012

What is XPath?
XPath is a syntax for defining parts of an XML document
XPath uses path expressions to navigate in XML documents
XPath contains a library of standard functions
XPath is a major element in XSLT
XPath is a W3C recommendation

XPath uses path expressions to select nodes or node-sets in an XML document.
XPath 2.0 includes over 100 built-in functions; the XPath 1.0 core function library is considerably smaller.
Status:
Version 1.0, W3C Recommendation, 16.11.1999
Version 2.0, W3C Recommendation, 23.01.2007

XPath, XQuery, XPointer, XLink, XSLT


XPath, the XML Path Language, is a query language for selecting nodes from an XML document.
XQuery is a query and functional programming language that is designed to query collections of XML data.
XPointer is a system for addressing components of XML-based internet media.
The XML Linking Language, or XLink, is an XML markup language used for creating hyperlinks in XML documents.
XSL Transformations (XSLT) is a declarative XML-based language used for the transformation of XML documents into other XML documents.

Terminology
Relationships
Parent: Each element or attribute has one parent
Children: Element nodes can have zero or more children
Siblings: Nodes that have the same parent
Ancestors: A node's parent, parent's parent, etc.
Descendants: A node's children, children's children, etc.

<bookstore>
  <book>
    <title>Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
</bookstore>

Abbreviated Syntax
XPath uses expressions to select nodes or node-sets in an XML document.
A very simple example:

<A>
  <B>
    <C/>
  </B>
</A>

/A/B/C

Most useful path expressions:

Abbreviated Syntax - Examples


<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
  <book>
    <title lang="eng">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title lang="eng">Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>


Predicates
Predicates are used to find a specific node or a node that contains a specific value.
Predicates are always embedded in square brackets.
There is no limit to the number of predicates in a step.

<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
  <book>
    <title lang="eng">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title lang="eng">Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>

/bookstore/book[1]
/bookstore/book[last()]
/bookstore/book[last()-1]
/bookstore/book[position()<3]
//title[@lang]
//title[@lang='eng']
/bookstore/book[price>35.00]
/bookstore/book[price>35.00]/title
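To see what these predicates select, it helps to evaluate a few of them by hand. The following is a toy sketch, not a real XPath engine: it works on plain JavaScript objects instead of a DOM, supports only absolute paths made of child steps, and understands just the predicate forms [n], [last()], [last()-k], [position()<n] and [child>number]. The helper names (el, applyPredicate, select) are hypothetical. In a browser, document.evaluate would be the usual way to run real XPath expressions.

```javascript
// Toy element tree mirroring the bookstore document above (plain objects, not a DOM).
const el = (name, attrs, ...kids) => ({
  name,
  attrs,
  children: kids.filter((k) => typeof k === "object"),
  text: kids.filter((k) => typeof k === "string").join(""),
});

const bookstore = el("bookstore", {},
  el("book", {},
    el("title", { lang: "eng" }, "Harry Potter"),
    el("price", {}, "29.99")),
  el("book", {},
    el("title", { lang: "eng" }, "Learning XML"),
    el("price", {}, "39.95")));

// Filters the nodes selected by one step, relative to a single context node.
function applyPredicate(nodes, predicate) {
  let m;
  if (/^\d+$/.test(predicate)) {
    // [n]: positions are 1-based in XPath
    const node = nodes[Number(predicate) - 1];
    return node ? [node] : [];
  }
  if ((m = /^last\(\)(?:-(\d+))?$/.exec(predicate))) {
    // [last()] or [last()-k]
    const node = nodes[nodes.length - 1 - Number(m[1] || 0)];
    return node ? [node] : [];
  }
  if ((m = /^position\(\)<(\d+)$/.exec(predicate))) {
    return nodes.slice(0, Number(m[1]) - 1);
  }
  if ((m = /^(\w+)>([\d.]+)$/.exec(predicate))) {
    // [child>number]: keep nodes with a child element whose numeric text exceeds the bound
    return nodes.filter((n) =>
      n.children.some((c) => c.name === m[1] && Number(c.text) > Number(m[2])));
  }
  throw new Error("Unsupported predicate: " + predicate);
}

// Evaluates an absolute path such as /bookstore/book[1]/title step by step.
function select(root, path) {
  // A virtual document node whose only child is the root element
  let context = [{ name: "#document", attrs: {}, children: [root], text: "" }];
  for (const step of path.split("/").filter(Boolean)) {
    const m = /^(\w+|\*)(?:\[(.+)\])?$/.exec(step);
    if (!m) throw new Error("Unsupported step: " + step);
    const [, nameTest, predicate] = m;
    // The predicate is applied per context node, as in XPath
    context = context.flatMap((node) => {
      const matches = node.children.filter(
        (c) => nameTest === "*" || c.name === nameTest);
      return predicate ? applyPredicate(matches, predicate) : matches;
    });
  }
  return context;
}

const texts = (path) => select(bookstore, path).map((n) => n.text);
console.log(texts("/bookstore/book[1]/title"));           // [ 'Harry Potter' ]
console.log(texts("/bookstore/book[last()]/title"));      // [ 'Learning XML' ]
console.log(texts("/bookstore/book[price>35.00]/title")); // [ 'Learning XML' ]
```

Each step maps the current node list to the matching children of those nodes, and the predicate then filters that result; this is the same left-to-right reading that applies to the full expressions above.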

Predicates (answers)

Unknown nodes & several paths


Expanded Syntax - Axes


An axis defines a node-set relative to the current node.
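One way to make this concrete is to write a few axes as functions that take a context node and return a list of nodes. The following is a minimal sketch on a toy tree of plain objects (element nodes only, no attributes or text nodes); the helper name node and the axes table are assumptions of this sketch, while the axis names themselves are standard XPath axes.

```javascript
// Builds a toy element node and links each child back to its parent.
function node(name, ...children) {
  const n = { name, parent: null, children };
  children.forEach((c) => { c.parent = n; });
  return n;
}

// Each axis maps a context node to an array of nodes. Forward axes are
// returned in document order; reverse axes (ancestor, preceding-sibling)
// are returned nearest-first.
const axes = {
  self: (n) => [n],
  child: (n) => n.children,
  parent: (n) => (n.parent ? [n.parent] : []),
  descendant: (n) => n.children.flatMap((c) => [c, ...axes.descendant(c)]),
  "descendant-or-self": (n) => [n, ...axes.descendant(n)],
  ancestor: (n) => (n.parent ? [n.parent, ...axes.ancestor(n.parent)] : []),
  "following-sibling": (n) =>
    n.parent ? n.parent.children.slice(n.parent.children.indexOf(n) + 1) : [],
  "preceding-sibling": (n) =>
    n.parent
      ? n.parent.children.slice(0, n.parent.children.indexOf(n)).reverse()
      : [],
};

// The bookstore tree from the Terminology slide
const title = node("title");
const author = node("author");
const year = node("year");
const price = node("price");
const book = node("book", title, author, year, price);
const store = node("bookstore", book);

const names = (axis, n) => axes[axis](n).map((x) => x.name);
console.log(names("ancestor", title));           // [ 'book', 'bookstore' ]
console.log(names("following-sibling", author)); // [ 'year', 'price' ]
console.log(names("descendant", store));         // [ 'book', 'title', 'author', 'year', 'price' ]
```

The relationships from the Terminology slide (parent, children, siblings, ancestors, descendants) map directly onto these axes.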

Expanded Syntax - Path Expressions


Absolute location path
Starts with /
/step/step/step/...

Relative location path


Does not start with /
step/step/...

A step consists of:


An axis (defines the tree-relationship between the selected nodes and the current node)
A node-test (identifies a node within an axis)
Zero or more predicates (to further refine the selected node-set)
axisname::nodetest[predicate]

A node test consists of a node name or more general expressions


child::book[1]
child::text()[@lang='eng']

Abbreviated vs. Expanded


Abbreviated syntax
Easier to type, less verbose

Expanded syntax
More verbose, less cryptic, more flexible

Abbreviated: /A/B/C
Expanded: /child::A/child::B/child::C

Abbreviated: A//B/*[1]
Expanded: child::A/descendant-or-self::node()/child::B/child::*[position()=1]
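The mapping between the two notations is mechanical, which the following sketch tries to show. It is an illustration under stated assumptions, not a full parser: the function names expandStep and expand are hypothetical, and the code handles only simple paths in which predicates and string literals contain no "/" or "@" characters of their own. The rewrite rules themselves come from the XPath 1.0 abbreviations: a missing axis means child::, @ means attribute::, // means /descendant-or-self::node()/, . means self::node(), .. means parent::node(), and [n] means [position()=n].

```javascript
// Expands one abbreviated step, e.g. "title[@lang]" or "*[1]".
function expandStep(step) {
  if (step === ".") return "self::node()";
  if (step === "..") return "parent::node()";
  const expanded = step
    .replace(/@/g, "attribute::")              // @name -> attribute::name
    .replace(/\[(\d+)\]/g, "[position()=$1]"); // [n]   -> [position()=n]
  // Only the part before any predicate decides whether an axis is already present.
  const head = expanded.split("[")[0];
  return head.includes("::") ? expanded : "child::" + expanded;
}

// Expands a whole abbreviated location path.
function expand(path) {
  return path
    .replace(/\/\//g, "/descendant-or-self::node()/") // "//" first, so the split below sees one step
    .split("/")
    .map((step) => (step === "" ? "" : expandStep(step)))
    .join("/");
}

console.log(expand("/A/B/C"));
// /child::A/child::B/child::C
console.log(expand("A//B/*[1]"));
// child::A/descendant-or-self::node()/child::B/child::*[position()=1]
console.log(expand("//title[@lang='eng']"));
// /descendant-or-self::node()/child::title[attribute::lang='eng']
```

The first two results match the expanded forms given above.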

Functions
The available operators are:
The "/", "//" and "[...]" operators, used in path expressions, as described
A union operator, "|", which forms the union of two node-sets
Boolean operators "and" and "or", and a function "not()"
Arithmetic operators "+", "-", "*", "div" (divide), and "mod"
Comparison operators "=", "!=", "<", ">", "<=", ">="

The function library includes:


Functions to manipulate strings: concat(), substring(), contains(), substring-before(), substring-after(), translate(), normalize-space(), string-length()
Functions to manipulate numbers: sum(), round(), floor(), ceiling()
Functions to get properties of nodes: name(), local-name(), namespace-uri()
Functions to get information about the processing context: position(), last()
Type conversion functions: string(), number(), boolean()

Usage of operators and functions


<?xml version='1.0'?>
<EXAMPLE>
  <CUSTOMER id="1" type="B">Mr. Jones</CUSTOMER>
  <CUSTOMER id="2" type="C">Mr. Johnson</CUSTOMER>
</EXAMPLE>

//item[@price > 2*@discount]
//EXAMPLE/CUSTOMER[@id='2' or @type='C']
//EXAMPLE/CUSTOMER[@id='1' and (@type='B' or @type='C')]
//EXAMPLE/CUSTOMER[. !='EggHeadCafe']
//EXAMPLE/CUSTOMER[substring(@type,1,2) ='DE']
//EXAMPLE/CUSTOMER[contains(@type,'DECEA')]
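A predicate behaves much like a filter function applied to each candidate node. The sketch below pairs the CUSTOMER predicates above with hand-written JavaScript equivalents, using plain objects in place of the two CUSTOMER elements; the object shape (id, type, text) and the names customers, predicates and matches are assumptions of this sketch, and nothing here evaluates XPath itself.

```javascript
// The two CUSTOMER elements from the example, as plain objects.
const customers = [
  { id: "1", type: "B", text: "Mr. Jones" },
  { id: "2", type: "C", text: "Mr. Johnson" },
];

// Each XPath predicate paired with a hand-translated JavaScript test.
const predicates = [
  { xpath: "[@id='2' or @type='C']",
    test: (c) => c.id === "2" || c.type === "C" },
  { xpath: "[@id='1' and (@type='B' or @type='C')]",
    test: (c) => c.id === "1" && (c.type === "B" || c.type === "C") },
  { xpath: "[. != 'EggHeadCafe']",
    test: (c) => c.text !== "EggHeadCafe" },
  // XPath string positions start at 1, JavaScript indices at 0.
  { xpath: "[substring(@type,1,2) = 'DE']",
    test: (c) => c.type.substring(0, 2) === "DE" },
  { xpath: "[contains(@type,'DECEA')]",
    test: (c) => c.type.includes("DECEA") },
];

// Returns the text of every customer that passes the given test.
const matches = (test) => customers.filter(test).map((c) => c.text);

for (const { xpath, test } of predicates) {
  console.log("//EXAMPLE/CUSTOMER" + xpath, "->", matches(test));
}
```

With this sample data the first predicate selects Mr. Johnson, the second Mr. Jones, the third both customers, and the last two select nothing, because neither type attribute starts with or contains the tested strings.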

More examples ...


<?xml version="1.0" encoding="utf-8"?>
<wikimedia>
  <projects>
    <project name="Wikipedia" launch="2001-01-05">
      <editions>
        <edition language="English">en.wikipedia.org</edition>
        <edition language="German">de.wikipedia.org</edition>
        <edition language="French">fr.wikipedia.org</edition>
        <edition language="Polish">pl.wikipedia.org</edition>
        <edition language="Spanish">es.wikipedia.org</edition>
      </editions>
    </project>
    <project name="Wiktionary" launch="2002-12-12">
      <editions>
        <edition language="English">en.wiktionary.org</edition>
        <edition language="French">fr.wiktionary.org</edition>
        <edition language="Vietnamese">vi.wiktionary.org</edition>
        <edition language="Turkish">tr.wiktionary.org</edition>
        <edition language="Spanish">es.wiktionary.org</edition>
      </editions>
    </project>
  </projects>
</wikimedia>

/wikimedia/projects/project/@name
/wikimedia//editions
/wikimedia/projects/project/editions/edition[@language="English"]/text()
/wikimedia/projects/project[@name="Wikipedia"]/editions/edition/text()

XPath 1.0 vs. XPath 2.0


XPath 2.0
Not as widely implemented as XPath 1.0
Language is significantly larger than XPath 1.0
Some basic concepts have changed:
  Data model
  Type system

XPath 2.0 was developed together with XSLT 2.0 and XQuery 1.0

Conclusions

References
Wikipedia
http://en.wikipedia.org/wiki/XPath_1.0
http://en.wikipedia.org/wiki/XPath_2.0

XPath tutorial on W3Schools.com


http://www.w3schools.com/xpath

XPath function reference


http://www.w3schools.com/xpath/xpath_functions.asp

The XPath specification


http://www.w3.org/TR/xpath

Some complex XPath examples


http://www.eggheadcafe.com/articles/20030627d.asp
http://wiki.novell.com/index.php/XPATH_Examples

Cheat Sheet

Acknowledgement
Slides containing the w3schools.com logo contain information taken from www.w3schools.com. Use is permitted for academic purposes; these materials shall not be made freely available on the internet.
