
Networking Basics

Computers running on the Internet communicate with each other using either the Transmission Control Protocol (TCP) or the User Datagram Protocol (UDP), as this diagram illustrates:

When you write Java programs that communicate over the network, you are programming at the application layer. Typically, you don't need to concern yourself with the TCP and UDP layers. Instead, you can use the classes in the java.net package. These classes provide system-independent network communication. However, to decide which Java classes your programs should use, you do need to understand how TCP and UDP differ.

TCP
When two applications want to communicate with each other reliably, they establish a connection and send data back and forth over that connection. This is analogous to making a telephone call. If you want to speak to Aunt Beatrice in Kentucky, a connection is established when you dial her phone number and she answers. You send data back and forth over the connection by speaking to one another over the phone lines. Like the phone company, TCP guarantees that data sent from one end of the connection actually gets to the other end and in the same order it was sent. Otherwise, an error is reported.

TCP provides a point-to-point channel for applications that require reliable communications. The Hypertext Transfer Protocol (HTTP), File Transfer Protocol (FTP), and Telnet are all examples of applications that require a reliable communication channel. The order in which the data is sent and received over the network is critical to the success of these applications. When HTTP is used to read from a URL, the data must be received in the order in which it was sent. Otherwise, you end up with a jumbled HTML file, a corrupt zip file, or some other invalid information.

Definition: TCP (Transmission Control Protocol) is a connection-based protocol that provides a reliable flow of data between two computers.

UDP
The UDP protocol provides for communication that is not guaranteed between two applications on the network. UDP is not connection-based like TCP. Rather, it sends independent packets of data, called datagrams, from one application to another. Sending datagrams is much like sending a letter through the postal service: The order of delivery is not important and is not guaranteed, and each message is independent of any other.

Definition: UDP (User Datagram Protocol) is a protocol that sends independent packets of data, called datagrams, from one computer to another with no guarantees about arrival. UDP is not connection-based like TCP.

For many applications, the guarantee of reliability is critical to the success of the transfer of information from one end of the connection to the other. However, other forms of communication don't require such strict standards. In fact, they may be slowed down by the extra overhead, or the reliable connection may invalidate the service altogether.

Consider, for example, a clock server that sends the current time to its client when requested to do so. If the client misses a packet, it doesn't really make sense to resend it because the time will be incorrect when the client receives it on the second try. If the client makes two requests and receives packets from the server out of order, it doesn't really matter because the client can figure out that the packets are out of order and make another request. The reliability of TCP is unnecessary in this instance because it causes performance degradation and may hinder the usefulness of the service.

Another example of a service that doesn't need the guarantee of a reliable channel is the ping command. The purpose of the ping command is to test the communication between two programs over the network. In fact, ping needs to know about dropped or out-of-order packets to determine how good or bad the connection is. A reliable channel would invalidate this service altogether.

Note: Many firewalls and routers have been configured not to allow UDP packets. If you're having trouble connecting to a service outside your firewall, or if clients are having trouble connecting to your service, ask your system administrator if UDP is permitted.
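The connectionless behavior described above can be sketched with the DatagramSocket and DatagramPacket classes from java.net. The following is a minimal illustration, not part of the original text: the class name UdpLoopbackSketch, the method name sendAndReceive, the 512-byte buffer, and the two-second timeout are all assumptions chosen for the example. Both sockets live in one process on the loopback interface, so the sketch runs without a network.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

class UdpLoopbackSketch {

    // Sends one datagram to a receiver on the loopback interface and returns what arrived.
    static String sendAndReceive(String message) throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the system to pick any free port for the receiver.
        try (DatagramSocket receiver = new DatagramSocket(0, loopback);
             DatagramSocket sender = new DatagramSocket()) {
            // UDP never reports a lost packet, so the receiver gives up after 2 seconds.
            receiver.setSoTimeout(2000);

            byte[] payload = message.getBytes(StandardCharsets.UTF_8);
            // No connection is established; each datagram carries its own destination.
            sender.send(new DatagramPacket(payload, payload.length,
                    loopback, receiver.getLocalPort()));

            byte[] buffer = new byte[512];
            DatagramPacket incoming = new DatagramPacket(buffer, buffer.length);
            receiver.receive(incoming); // throws SocketTimeoutException if nothing arrives
            return new String(incoming.getData(), 0, incoming.getLength(),
                    StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("received: " + sendAndReceive("what time is it?"));
    }
}
```

Note that the sender never learns whether the datagram arrived. Any retry or ordering logic would have to be written by the application itself.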

Understanding Ports
Generally speaking, a computer has a single physical connection to the network. All data destined for a particular computer arrives through that connection. However, the data may be intended for different applications running on the computer. So how does the computer know to which application to forward the data? Through the use of ports. Data transmitted over the Internet is accompanied by addressing information that identifies the computer and the port for which it is destined. The computer is identified by its IP address (32 bits long in IPv4), which IP uses to deliver data to the right computer on the network. Ports are identified by a 16-bit number, which TCP and UDP use to deliver the data to the right application.

In connection-based communication such as TCP, a server application binds a socket to a specific port number. This has the effect of registering the server with the system to receive all data destined for that port. A client can then rendezvous with the server at the server's port, as illustrated here:

Definition: The TCP and UDP protocols use ports to map incoming data to a particular process running on a computer.

In datagram-based communication such as UDP, the datagram packet contains the port number of its destination and UDP routes the packet to the appropriate application, as illustrated in this figure:

Port numbers range from 0 to 65,535 because ports are represented by 16-bit numbers. The port numbers ranging from 0 - 1023 are restricted; they are reserved for use by well-known services such as HTTP and FTP and other system services. These ports are called well-known ports. Your applications should not attempt to bind to them.
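The port ranges above can be checked with a few lines of code. This is a small sketch under stated assumptions: the class PortRangeSketch and its helper names are invented for illustration. It relies on the documented behavior of ServerSocket that port 0 means "let the system choose a free port", which is a convenient way to avoid colliding with the well-known range.

```java
import java.io.IOException;
import java.net.ServerSocket;

class PortRangeSketch {
    static final int MAX_PORT = 0xFFFF;           // 65,535, the largest 16-bit value
    static final int LAST_WELL_KNOWN_PORT = 1023; // end of the restricted range

    // True when the port falls in the range reserved for well-known services.
    static boolean isWellKnown(int port) {
        return port >= 0 && port <= LAST_WELL_KNOWN_PORT;
    }

    // Binds to port 0, which asks the system for a free port, and reports the choice.
    static int bindToFreePort() throws IOException {
        try (ServerSocket server = new ServerSocket(0)) {
            return server.getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        int port = bindToFreePort();
        System.out.println("system-assigned port: " + port);
        System.out.println("is port 80 well-known? " + isWellKnown(80));
        System.out.println("is port 4444 well-known? " + isWellKnown(4444));
    }
}
```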

Networking Classes in the JDK


Through the classes in java.net, Java programs can use TCP or UDP to communicate over the Internet. The URL, URLConnection, Socket, and ServerSocket classes all use TCP to communicate over the network. The DatagramPacket, DatagramSocket, and MulticastSocket classes are for use with UDP.

Working with URLs


URL is the acronym for Uniform Resource Locator. It is a reference (an address) to a resource on the Internet. You provide URLs to your favorite Web browser so that it can locate files on the Internet in the same way that you provide addresses on letters so that the post office can locate your correspondents.

Java programs that interact with the Internet also may use URLs to find the resources on the Internet they wish to access. Java programs can use a class called URL in the java.net package to represent a URL address. Terminology Note: The term URL can be ambiguous. It can refer to an Internet address or a URL object in a Java program. Where the meaning of URL needs to be specific, this text uses "URL address" to mean an Internet address and "URL object" to refer to an instance of the URL class in a program.

What Is a URL?

If you've been surfing the Web, you have undoubtedly heard the term URL and have used URLs to access HTML pages from the Web. It's often easiest, although not entirely accurate, to think of a URL as the name of a file on the World Wide Web because most URLs refer to a file on some machine on the network. However, remember that URLs also can point to other resources on the network, such as database queries and command output.

Definition: URL is an acronym for Uniform Resource Locator and is a reference (an address) to a resource on the Internet.

The following is an example of a URL which addresses the Java Web site hosted by Sun Microsystems:

As in the previous diagram, a URL has two main components:

1. Protocol identifier
2. Resource name

Note that the protocol identifier and the resource name are separated by a colon and two forward slashes. The protocol identifier indicates the name of the protocol to be used to fetch the resource. The example uses the Hypertext Transfer Protocol (HTTP), which is typically used to serve up hypertext documents. HTTP is just one of many different protocols used to access different types of resources on the net. Other protocols include File Transfer Protocol (FTP), Gopher, File, and News.

The resource name is the complete address to the resource. The format of the resource name depends entirely on the protocol used, but for many protocols, including HTTP, the resource name contains one or more of the components listed in the following table:

Host Name
    The name of the machine on which the resource lives.
Filename
    The pathname to the file on the machine.
Port Number
    The port number to which to connect (typically optional).
Reference
    A reference to a named anchor within a resource that usually identifies a specific location within a file (typically optional).

For many protocols, the host name and the filename are required, while the port number and reference are optional. For example, the resource name for an HTTP URL must specify a server on the network (Host Name) and the path to the document on that machine (Filename); it also can specify a port number and a reference. In the URL for the Java Web site, java.sun.com is the host name, and an empty path or the trailing slash is shorthand for the file named /index.html.

Creating a URL

The easiest way to create a URL object is from a String that represents the human-readable form of the URL address. This is typically the form that another person will use for a URL. For example, the URL for the Gamelan site, which is a directory of Java resources, takes the following form:

    http://www.gamelan.com/

In your Java program, you can use a String containing this text to create a URL object:

    URL gamelan = new URL("http://www.gamelan.com/");

The URL object created above represents an absolute URL. An absolute URL contains all of the information necessary to reach the resource in question. You can also create URL objects from a relative URL address.

Creating a URL Relative to Another

A relative URL contains only enough information to reach the resource relative to (or in the context of) another URL.

Relative URL specifications are often used within HTML files. For example, suppose you write an HTML file called JoesHomePage.html. Within this page are links to other pages, PicturesOfMe.html and MyKids.html, that are on the same machine and in the same directory as JoesHomePage.html. The links to PicturesOfMe.html and MyKids.html from JoesHomePage.html could be specified just as filenames, like this:

    <a href="PicturesOfMe.html">Pictures of Me</a>
    <a href="MyKids.html">Pictures of My Kids</a>

These URL addresses are relative URLs. That is, the URLs are specified relative to the file in which they are contained, JoesHomePage.html.

In your Java programs, you can create a URL object from a relative URL specification. For example, suppose you know two URLs at the Gamelan site:

    http://www.gamelan.com/pages/Gamelan.game.html
    http://www.gamelan.com/pages/Gamelan.net.html

You can create URL objects for these pages relative to their common base URL, http://www.gamelan.com/pages/, like this:

    URL gamelan = new URL("http://www.gamelan.com/pages/");
    URL gamelanGames = new URL(gamelan, "Gamelan.game.html");
    URL gamelanNetwork = new URL(gamelan, "Gamelan.net.html");

This code snippet uses the URL constructor that lets you create a URL object from another URL object (the base) and a relative URL specification. The general form of this constructor is:

    URL(URL baseURL, String relativeURL)

The first argument is a URL object that specifies the base of the new URL. The second argument is a String that specifies the rest of the resource name relative to the base. If baseURL is null, then this constructor treats relativeURL like an absolute URL specification. Conversely, if relativeURL is an absolute URL specification, then the constructor ignores baseURL.

This constructor is also useful for creating URL objects for named anchors (also called references) within a file. For example, suppose the Gamelan.net.html file has a named anchor called BOTTOM at the bottom of the file. You can use the relative URL constructor to create a URL object for it like this:

    URL gamelanNetworkBottom = new URL(gamelanNetwork, "#BOTTOM");

Other URL Constructors

The URL class provides two additional constructors for creating a URL object. These constructors are useful when you are working with URLs, such as HTTP URLs, that have host name, filename, port number, and reference components in the resource name portion of the URL. They help when you do not have a String containing the complete URL specification, but you do know various components of the URL.

For example, suppose you design a network browsing panel similar to a file browsing panel that allows users to choose the protocol, host name, port number, and filename. You can construct a URL from the panel's components. The first constructor creates a URL object from a protocol, host name, and filename. The following code snippet creates a URL to the Gamelan.net.html file at the Gamelan site:

    new URL("http", "www.gamelan.com", "/pages/Gamelan.net.html");

This is equivalent to:

    new URL("http://www.gamelan.com/pages/Gamelan.net.html");

The first argument is the protocol, the second is the host name, and the last is the pathname of the file. Note that the filename contains a forward slash at the beginning. This indicates that the filename is specified from the root of the host.

The final URL constructor adds the port number to the list of arguments used in the previous constructor. As before, the filename begins with a forward slash:

    URL gamelan = new URL("http", "www.gamelan.com", 80, "/pages/Gamelan.network.html");

This creates a URL object for the following URL:

    http://www.gamelan.com:80/pages/Gamelan.network.html

If you construct a URL object using one of these constructors, you can get a String containing the complete URL address by using the URL object's toString method or the equivalent toExternalForm method.
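The constructors discussed so far can be compared side by side by printing the address each one produces. The sketch below is illustrative only: the class UrlConstructorsSketch and its method names are assumptions, and the Gamelan addresses are simply the ones used in the text. No network access takes place, because constructing a URL object does not contact the host. On recent JDKs (20 and later) the URL constructors are marked deprecated in favor of URI.toURL(), so the compiler may print a deprecation warning, but the code still runs.

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlConstructorsSketch {

    // A relative specification is resolved against the base URL.
    static String relative() throws MalformedURLException {
        URL base = new URL("http://www.gamelan.com/pages/");
        return new URL(base, "Gamelan.net.html").toExternalForm();
    }

    // A specification that starts with '#' adds a reference to the base URL.
    static String anchor() throws MalformedURLException {
        URL page = new URL("http://www.gamelan.com/pages/Gamelan.net.html");
        return new URL(page, "#BOTTOM").toExternalForm();
    }

    // Protocol, host, port, and filename supplied as separate arguments.
    static String fromParts() throws MalformedURLException {
        return new URL("http", "www.gamelan.com", 80,
                "/pages/Gamelan.network.html").toExternalForm();
    }

    public static void main(String[] args) throws MalformedURLException {
        System.out.println(relative());  // http://www.gamelan.com/pages/Gamelan.net.html
        System.out.println(anchor());    // http://www.gamelan.com/pages/Gamelan.net.html#BOTTOM
        System.out.println(fromParts()); // http://www.gamelan.com:80/pages/Gamelan.network.html
    }
}
```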

URL Addresses with Special Characters

Some URL addresses contain special characters, for example the space character, like this:

    http://foo.com/hello world/

To make these characters legal, they need to be encoded before you pass them to the URL constructor:

    URL url = new URL("http://foo.com/hello%20world");

Encoding the special character in this example is easy because only one character needs encoding. For URL addresses that have several of these characters, or if you are unsure when writing your code which URL addresses you will need to access, you can use the multi-argument constructors of the java.net.URI class to take care of the encoding for you. The last argument is the fragment; passing null leaves it out, whereas an empty string would append a bare "#" to the result:

    URI uri = new URI("http", "foo.com", "/hello world/", null);

Then convert the URI to a URL:

    URL url = uri.toURL();
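A short runnable version of this encoding step might look like the following. The class name UriEncodingSketch and the helper encode are assumptions made for the example; the constructor used is the four-argument URI(scheme, host, path, fragment) form, with null passed for the fragment so that no "#" is appended.

```java
import java.net.URI;
import java.net.URL;

class UriEncodingSketch {

    // Builds an http URL from a host and an unencoded path, letting URI quote illegal characters.
    static String encode(String host, String path) throws Exception {
        URI uri = new URI("http", host, path, null); // null fragment: nothing after the path
        URL url = uri.toURL();
        return url.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(encode("foo.com", "/hello world/")); // http://foo.com/hello%20world/
    }
}
```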

MalformedURLException

Each of the four URL constructors throws a MalformedURLException if the arguments to the constructor refer to a null or unknown protocol. Typically, you want to catch and handle this exception by embedding your URL constructor statements in a try/catch pair, like this:

    try {
        URL myURL = new URL(. . .);
    } catch (MalformedURLException e) {
        . . .
        // exception handler code here
        . . .
    }

See Exceptions for information about handling exceptions.
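To see when the exception is actually thrown, it can help to probe a few strings. The sketch below is an illustration only: the class MalformedUrlSketch, the helper isWellFormed, and the made-up scheme name "nosuchprotocol" are assumptions. It also shows that the constructor checks only the form of the address and does not verify that the host exists.

```java
import java.net.MalformedURLException;
import java.net.URL;

class MalformedUrlSketch {

    // Returns true when the URL constructor accepts the specification.
    static boolean isWellFormed(String spec) {
        try {
            new URL(spec);
            return true;
        } catch (MalformedURLException e) {
            // Thrown for a missing or unknown protocol; the host is never contacted.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("http://www.gamelan.com/")); // true
        System.out.println(isWellFormed("www.gamelan.com"));         // false: no protocol
        System.out.println(isWellFormed("nosuchprotocol://host/"));  // false: unknown protocol
    }
}
```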

Note: URLs are "write-once" objects. Once you've created a URL object, you cannot change any of its attributes (protocol, host name, filename, or port number).

Parsing a URL

The URL class provides several methods that let you query URL objects. You can get the protocol, authority, host name, port number, path, query, filename, and reference from a URL using these accessor methods:

getProtocol
    Returns the protocol identifier component of the URL.
getAuthority
    Returns the authority component of the URL.
getHost
    Returns the host name component of the URL.
getPort
    Returns the port number component of the URL. The getPort method returns an integer that is the port number. If the port is not set, getPort returns -1.
getPath
    Returns the path component of this URL.
getQuery
    Returns the query component of this URL.
getFile
    Returns the filename component of the URL. The getFile method returns the same as getPath, plus the concatenation of the value of getQuery, if any.
getRef
    Returns the reference component of the URL.

Note: Remember that not all URL addresses contain these components. The URL class provides these methods because HTTP URLs do contain these components and are perhaps the most commonly used URLs. The URL class is somewhat HTTP-centric.

You can use these getXXX methods to get information about the URL regardless of the constructor that you used to create the URL object. The URL class, along with these accessor methods, frees you from ever having to parse URLs again! Given any string specification of a URL, just create a new URL object and call any of the accessor methods for the information you need. This small example program creates a URL from a string specification and then uses the URL object's accessor methods to parse the URL:

    import java.net.*;
    import java.io.*;

    public class ParseURL {
        public static void main(String[] args) throws Exception {
            URL aURL = new URL("http://java.sun.com:80/docs/books/tutorial"
                               + "/index.html?name=networking#DOWNLOADING");
            System.out.println("protocol = " + aURL.getProtocol());
            System.out.println("authority = " + aURL.getAuthority());
            System.out.println("host = " + aURL.getHost());
            System.out.println("port = " + aURL.getPort());
            System.out.println("path = " + aURL.getPath());
            System.out.println("query = " + aURL.getQuery());
            System.out.println("filename = " + aURL.getFile());
            System.out.println("ref = " + aURL.getRef());
        }
    }

Here's the output displayed by the program:

    protocol = http
    authority = java.sun.com:80
    host = java.sun.com
    port = 80
    path = /docs/books/tutorial/index.html
    query = name=networking
    filename = /docs/books/tutorial/index.html?name=networking
    ref = DOWNLOADING

Reading Directly from a URL

After you've successfully created a URL, you can call the URL's openStream() method to get a stream from which you can read the contents of the URL. The openStream() method returns a java.io.InputStream object, so reading from a URL is as easy as reading from an input stream. The following small Java program uses openStream() to get an input stream on the URL http://www.yahoo.com/. It then opens a BufferedReader on the input stream and reads from the BufferedReader, thereby reading from the URL. Everything read is copied to the standard output stream:

    import java.net.*;
    import java.io.*;

    public class URLReader {
        public static void main(String[] args) throws Exception {
            URL yahoo = new URL("http://www.yahoo.com/");
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(yahoo.openStream()));

            String inputLine;
            while ((inputLine = in.readLine()) != null)
                System.out.println(inputLine);

            in.close();
        }
    }

When you run the program, you should see, scrolling by in your command window, the HTML commands and textual content from the HTML file located at http://www.yahoo.com/. Alternatively, the program might hang or you might see an exception stack trace. If either of the latter two events occurs, you may have to set the proxy host so that the program can find the Yahoo server.
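If a proxy or firewall gets in the way, the same openStream() technique can be tried against a local resource first, since URL also understands the file protocol. The following sketch is an assumption-laden illustration rather than part of the original program: the class LocalUrlReaderSketch, the temporary file, and its three-line HTML content are invented for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class LocalUrlReaderSketch {

    // Reads every line from the URL's stream, exactly as URLReader does, and returns the text.
    static String readAll(URL url) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                text.append(line).append('\n');
            }
        }
        return text.toString();
    }

    // Writes a small page to a temporary file and reads it back through a file: URL.
    static String demo() throws IOException {
        Path page = Files.createTempFile("page", ".html");
        try {
            Files.write(page, "<html>\n<body>hello</body>\n</html>\n"
                    .getBytes(StandardCharsets.UTF_8));
            return readAll(page.toUri().toURL());
        } finally {
            Files.delete(page);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.print(demo());
    }
}
```

Using try-with-resources here closes the reader even if reading fails, which the original listing leaves to an explicit in.close() call.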

Connecting to a URL

After you've successfully created a URL object, you can call the URL object's openConnection method to get a URLConnection object, or one of its protocol-specific subclasses, for example java.net.HttpURLConnection. You can use this URLConnection object to set up parameters and general request properties that you may need before connecting. Connection to the remote object represented by the URL is only initiated when the URLConnection.connect method is called. When you do this, you are initializing a communication link between your Java program and the URL over the network. For example, you can open a connection to the Yahoo site with the following code:

    try {
        URL yahoo = new URL("http://www.yahoo.com/");
        URLConnection yahooConnection = yahoo.openConnection();
        yahooConnection.connect();
    } catch (MalformedURLException e) {     // new URL() failed
        . . .
    } catch (IOException e) {               // openConnection() failed
        . . .
    }

A new URLConnection object is created every time by calling the openConnection method of the protocol handler for this URL. You are not always required to explicitly call the connect method to initiate the connection. Operations that depend on being connected, like getInputStream, getOutputStream, etc, will implicitly perform the connection, if necessary. Now that you've successfully connected to your URL, you can use the URLConnection object to perform actions such as reading from or writing to the connection. The next section shows you how.

URLConnection class

The abstract class URLConnection is the superclass of all classes that represent a communications link between the application and a URL. Instances of this class can be used both to read from and to write to the resource referenced by the URL. In general, creating a connection to a URL is a multistep process:

1. The connection object is created by invoking the openConnection method on a URL.
2. The setup parameters and general request properties are manipulated.
3. The actual connection to the remote object is made, using the connect method.
4. The remote object becomes available. The header fields and the contents of the remote object can be accessed.
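The steps above can be sketched in code. To keep the example runnable without a network, it connects to a temporary local file through a file: URL instead of a remote host; that choice, the class name UrlConnectionStepsSketch, and the file content are assumptions made for illustration. The same sequence of calls applies to an HTTP URL.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class UrlConnectionStepsSketch {

    // Walks through the connection steps and returns the resource's content as text.
    static String fetch(URL url) throws IOException {
        // Step 1: create the connection object; nothing is opened yet.
        URLConnection connection = url.openConnection();

        // Step 2: adjust setup parameters. This must happen before connecting.
        connection.setUseCaches(false);
        connection.setDoInput(true);

        // Step 3: make the actual connection.
        connection.connect();

        // Step 4: header fields and contents are now available.
        System.out.println("content length: " + connection.getContentLengthLong());
        try (InputStream in = connection.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    // Creates a small temporary file and fetches it through a file: URL.
    static String demo() throws IOException {
        Path file = Files.createTempFile("resource", ".txt");
        try {
            Files.write(file, "hello\n".getBytes(StandardCharsets.UTF_8));
            return fetch(file.toUri().toURL());
        } finally {
            Files.delete(file);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.print(demo());
    }
}
```

Calling a setter such as setDoInput after connect() raises an IllegalStateException, which is why the order of steps 2 and 3 matters.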

The setup parameters are modified using the following methods:

1. setAllowUserInteraction
2. setDoInput
3. setDoOutput (sets the value of the doOutput field for this URLConnection to the specified value)
4. setIfModifiedSince
5. setUseCaches

and the general request properties are modified using the method:

1. setRequestProperty

Default values for the AllowUserInteraction and UseCaches parameters can be set using the methods setDefaultAllowUserInteraction and setDefaultUseCaches. Each of the above set methods has a corresponding get method to retrieve the value of the parameter or general request property. The specific parameters and general request properties that are applicable are protocol specific.

The following methods are used to access the header fields and the contents after the connection is made to the remote object:

1. getContent
2. getHeaderField
3. getInputStream
4. getOutputStream

Certain header fields are accessed frequently. The methods:

1. getContentEncoding
2. getContentLength
3. getContentType
4. getDate
5. getExpiration
6. getLastModified

provide convenient access to these fields. The getContentType method is used by the getContent method to determine the type of the remote object; subclasses may find it convenient to override the getContentType method.

Constructor Summary

protected URLConnection(URL url)
    Constructs a URL connection to the specified URL.

Method Summary

void addRequestProperty(String key, String value)
    Adds a general request property specified by a key-value pair.
abstract void connect()
    Opens a communications link to the resource referenced by this URL, if such a connection has not already been established.
boolean getAllowUserInteraction()
    Returns the value of the allowUserInteraction field for this object.
Object getContent()
    Retrieves the contents of this URL connection.
Object getContent(Class[] classes)
    Retrieves the contents of this URL connection.
String getContentEncoding()
    Returns the value of the content-encoding header field.
int getContentLength()
    Returns the value of the content-length header field.
String getContentType()
    Returns the value of the content-type header field.
long getDate()
    Returns the value of the date header field.
static boolean getDefaultAllowUserInteraction()
    Returns the default value of the allowUserInteraction field.
static String getDefaultRequestProperty(String key)
    Deprecated. The instance specific getRequestProperty method should be used after an appropriate instance of URLConnection is obtained.
boolean getDefaultUseCaches()
    Returns the default value of a URLConnection's useCaches flag.
boolean getDoInput()
    Returns the value of this URLConnection's doInput flag.
boolean getDoOutput()
    Returns the value of this URLConnection's doOutput flag.
long getExpiration()
    Returns the value of the expires header field.
static FileNameMap getFileNameMap()
    Loads filename map (a mimetable) from a data file.
String getHeaderField(int n)
    Returns the value for the nth header field.
String getHeaderField(String name)
    Returns the value of the named header field.
long getHeaderFieldDate(String name, long Default)
    Returns the value of the named field parsed as date.
int getHeaderFieldInt(String name, int Default)
    Returns the value of the named field parsed as a number.
String getHeaderFieldKey(int n)
    Returns the key for the nth header field.
Map getHeaderFields()
    Returns an unmodifiable Map of the header fields.
long getIfModifiedSince()
    Returns the value of this object's ifModifiedSince field.
InputStream getInputStream()
    Returns an input stream that reads from this open connection.
long getLastModified()
    Returns the value of the last-modified header field.
OutputStream getOutputStream()
    Returns an output stream that writes to this connection.
Permission getPermission()
    Returns a permission object representing the permission necessary to make the connection represented by this object.
Map getRequestProperties()
    Returns an unmodifiable Map of general request properties for this connection.
String getRequestProperty(String key)
    Returns the value of the named general request property for this connection.
URL getURL()
    Returns the value of this URLConnection's URL field.
boolean getUseCaches()
    Returns the value of this URLConnection's useCaches field.
static String guessContentTypeFromName(String fname)
    Tries to determine the content type of an object, based on the specified "file" component of a URL.
static String guessContentTypeFromStream(InputStream is)
    Tries to determine the type of an input stream based on the characters at the beginning of the input stream.
void setAllowUserInteraction(boolean allowuserinteraction)
    Set the value of the allowUserInteraction field of this URLConnection.
static void setContentHandlerFactory(ContentHandlerFactory fac)
    Sets the ContentHandlerFactory of an application.
static void setDefaultAllowUserInteraction(boolean defaultallowuserinteraction)
    Sets the default value of the allowUserInteraction field for all future URLConnection objects to the specified value.
static void setDefaultRequestProperty(String key, String value)
    Deprecated. The instance specific setRequestProperty method should be used after an appropriate instance of URLConnection is obtained.
void setDefaultUseCaches(boolean defaultusecaches)
    Sets the default value of the useCaches field to the specified value.
void setDoInput(boolean doinput)
    Sets the value of the doInput field for this URLConnection to the specified value.
void setDoOutput(boolean dooutput)
    Sets the value of the doOutput field for this URLConnection to the specified value.
static void setFileNameMap(FileNameMap map)
    Sets the FileNameMap.
void setIfModifiedSince(long ifmodifiedsince)
    Sets the value of the ifModifiedSince field of this URLConnection to the specified value.
void setRequestProperty(String key, String value)
    Sets the general request property.
void setUseCaches(boolean usecaches)
    Sets the value of the useCaches field of this URLConnection to the specified value.
String toString()
    Returns a String representation of this URL connection.

Reading from and Writing to a URLConnection The URLConnection class contains many methods that let you communicate with the URL over the network. URLConnection is an HTTP-centric class; that is, many of its methods are useful only when you are working with HTTP URLs. However, most URL protocols allow you to read from and write to the connection. This section describes both functions.

Reading from a URLConnection

The following program performs the same function as the URLReader program shown in Reading Directly from a URL. However, rather than getting an input stream directly from the URL, this program explicitly retrieves a URLConnection object and gets an input stream from the connection. The connection is opened implicitly by calling getInputStream. Then, like URLReader, this program creates a BufferedReader on the input stream and reads from it. The differences from the previous example are the call to openConnection and the use of yc.getInputStream():

    import java.net.*;
    import java.io.*;

    public class URLConnectionReader {
        public static void main(String[] args) throws Exception {
            URL yahoo = new URL("http://www.yahoo.com/");
            URLConnection yc = yahoo.openConnection();
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(yc.getInputStream()));

            String inputLine;
            while ((inputLine = in.readLine()) != null)
                System.out.println(inputLine);

            in.close();
        }
    }

The output from this program is identical to the output from the program that opens a stream directly from the URL. You can use either way to read from a URL. However, reading from a URLConnection instead of reading directly from a URL might be more useful, because you can use the URLConnection object for other tasks (like writing to the URL) at the same time.

Again, if the program hangs or you see an error message, you may have to set the proxy host so that the program can find the Yahoo server.

Writing to a URLConnection

Many HTML pages contain forms: text fields and other GUI objects that let you enter data to send to the server. After you type in the required information and initiate the query by clicking a button, your Web browser writes the data to the URL over the network. At the other end, the server receives the data, processes it, and then sends you a response, usually in the form of a new HTML page.
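A Java program can send form data in the same way by writing to the connection's output stream. The sketch below is one possible illustration, not the original tutorial's program. To stay self-contained it starts a throwaway echo server in the same process using the JDK's com.sun.net.httpserver.HttpServer; that server, the /echo path, the class name UrlConnectionWriterSketch, and the form field "name" are all assumptions made for the example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class UrlConnectionWriterSketch {

    // Starts a local server whose /echo handler sends the request body straight back.
    static HttpServer startEchoServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/echo", exchange -> {
            byte[] body = exchange.getRequestBody().readAllBytes();
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    // Writes the form data to the URL and returns the server's response.
    static String post(URL url, String formData) throws IOException {
        URLConnection connection = url.openConnection();
        connection.setDoOutput(true); // enable writing; for HTTP this sends a POST request
        connection.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        try (OutputStream out = connection.getOutputStream()) {
            out.write(formData.getBytes(StandardCharsets.UTF_8));
        }
        // Reading the response is what actually completes the exchange.
        try (InputStream in = connection.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    static String demo() throws IOException {
        HttpServer server = startEchoServer();
        try {
            URL url = new URL("http://127.0.0.1:" + server.getAddress().getPort() + "/echo");
            // Form values must be URL-encoded; the space becomes '+'.
            String form = "name=" + URLEncoder.encode("Aunt Beatrice", "UTF-8");
            return post(url, form);
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo()); // name=Aunt+Beatrice
    }
}
```

The key call is setDoOutput(true), which must be made before the connection is opened. Without it, getOutputStream throws a ProtocolException.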

Connecting to Server

What Is a Socket?

Normally, a server runs on a specific computer and has a socket that is bound to a specific port number. The server just waits, listening to the socket for a client to make a connection request. On the client-side: The client knows the hostname of the machine on which the server is running and the port number on which the server is listening. To make a connection request, the client tries to rendezvous with the server on the server's machine and port. The client also needs to identify itself to the server so it binds to a local port number that it will use during this connection. This is usually assigned by the system.

If everything goes well, the server accepts the connection. Upon acceptance, the server gets a new socket bound to the same local port and also has its remote endpoint set to the address and port of the client. It needs a new socket so that it can continue to listen to the original socket for connection requests while tending to the needs of the connected client.

On the client side, if the connection is accepted, a socket is successfully created and the client can use the socket to communicate with the server. The client and server can now communicate by writing to or reading from their sockets.

Definition: A socket is one endpoint of a two-way communication link between two programs running on the network. A socket is bound to a port number so that the TCP layer can identify the application to which data is destined to be sent.

An endpoint is a combination of an IP address and a port number. Every TCP connection can be uniquely identified by its two endpoints. That way you can have multiple connections between your host and the server.

The java.net package in the Java platform provides a class, Socket, that implements one side of a two-way connection between your Java program and another program on the network. The Socket class sits on top of a platform-dependent implementation, hiding the details of any particular system from your Java program. By using the java.net.Socket class instead of relying on native code, your Java programs can communicate over the network in a platform-independent fashion.
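The relationship between the listening socket, the accepted socket, and the client's socket can be observed directly by printing their port numbers. The sketch below makes a few assumptions for illustration: the class name SocketEndpointsSketch, the use of the loopback interface, and a system-assigned port so that it does not clash with anything already running.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class SocketEndpointsSketch {

    // Returns {listening port, accepted socket's local port,
    //          accepted socket's remote port, client's local port}.
    static int[] connectOnce() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket listener = new ServerSocket(0, 1, loopback);
             // The connection request is queued by the system until accept() is called.
             Socket client = new Socket(loopback, listener.getLocalPort());
             Socket accepted = listener.accept()) {
            return new int[] {
                listener.getLocalPort(),
                accepted.getLocalPort(),
                accepted.getPort(),
                client.getLocalPort()
            };
        }
    }

    public static void main(String[] args) throws IOException {
        int[] ports = connectOnce();
        System.out.println("server listens on port:        " + ports[0]);
        System.out.println("accepted socket, local port:   " + ports[1]);
        System.out.println("accepted socket, remote port:  " + ports[2]);
        System.out.println("client socket, local port:     " + ports[3]);
    }
}
```

The first two numbers match because the accepted socket is bound to the same local port as the listener, and the last two match because the accepted socket's remote endpoint is the client's system-assigned port.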

Additionally, java.net includes the ServerSocket class, which implements a socket that servers can use to listen for and accept connections from clients. This lesson shows you how to use the Socket and ServerSocket classes. If you are trying to connect to the Web, the URL class and related classes (URLConnection, URLEncoder) are probably more appropriate than the socket classes. In fact, URLs are a relatively high-level connection to the Web and use sockets as part of the underlying implementation.

Reading from and Writing to a Socket

Let's look at a simple example that illustrates how a program can establish a connection to a server program using the Socket class, and then how the client can send data to and receive data from the server through the socket. The example program implements a client, EchoClient, that connects to the Echo server. The Echo server simply receives data from its client and echoes it back. EchoClient creates a socket, thereby getting a connection to the Echo server. It reads input from the user on the standard input stream, and then forwards that text to the Echo server by writing the text to the socket. The server echoes the input back through the socket to the client. The client program reads and displays the data passed back to it from the server:

import java.io.*;
import java.net.*;

public class EchoClient {
    public static void main(String[] args) throws IOException {
        String host = "localhost";
        Socket echoSocket = null;
        PrintWriter out = null;
        BufferedReader in = null;
        try {
            InetAddress address = InetAddress.getByName(host);
            echoSocket = new Socket(address, 4444);
            out = new PrintWriter(echoSocket.getOutputStream(), true);
            in = new BufferedReader(new InputStreamReader(echoSocket.getInputStream()));
        } catch (UnknownHostException e) {
            System.err.println("Don't know about host");
            System.exit(1);
        } catch (IOException e) {
            System.err.println("Couldn't get I/O for the connection.");
            System.exit(1);
        }
        BufferedReader stdIn = new BufferedReader(new InputStreamReader(System.in));
        String userInput;
        while ((userInput = stdIn.readLine()) != null) {
            out.println(userInput);
            System.out.println("echo: " + in.readLine());
        }
        out.close();
        in.close();
        stdIn.close();
        echoSocket.close();
    }
}

The EchoServer class implements the server, which listens on port 4444 for client requests. Only one client is served.

import java.io.*;
import java.net.*;

public class EchoServer {
    public static void main(String[] args) throws Exception {
        // create socket
        int port = 4444;
        ServerSocket serverSocket = new ServerSocket(port);
        System.err.println("Started server on port " + port);
        // wait for a connection, and process it
        while (true) {
            // a "blocking" call which waits until a connection is requested
            Socket clientSocket = serverSocket.accept();
            System.err.println("Accepted connection from client");
            // open up IO streams
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(clientSocket.getInputStream()));
            PrintWriter out = new PrintWriter(clientSocket.getOutputStream(), true);
            // waits for data and reads it in until connection dies
            // readLine() blocks until the server receives a new line from client
            String s;
            while ((s = in.readLine()) != null) {
                out.println(s);
                System.out.println(s);
            }
            // close IO streams, then sockets
            System.err.println("Closing connection with client");
            out.close();
            in.close();
            clientSocket.close();
            serverSocket.close();
            // this simple server handles a single client and then stops
            System.exit(0);
        }
    }
}

This client program is straightforward and simple because the Echo server implements a simple protocol. The client sends text to the server, and the server echoes it back. When your client programs are talking to a more complicated server such as an HTTP server, your client program will also be more complicated. However, the basics are much the same as they are in this program:

1. Open a socket.
2. Open an input stream and output stream to the socket.
3. Read from and write to the stream according to the server's protocol.
4. Close the streams.
5. Close the socket.

Only step 3 differs from client to client, depending on the server. The other steps remain largely the same.

Serving Multiple Clients at a Time

There is one problem with the simple server in the preceding example.

Suppose we want to allow multiple clients to connect to our server at the same time. Typically, a server runs constantly on a server computer, and clients from all over the Internet may want to use the server at the same time. Rejecting multiple connections allows any one client to monopolize the service by connecting to it for a long time. We can do much better through the magic of threads. Every time the program has established a new socket connection, that is, when the call to accept was successful, we launch a new thread to take care of the connection between the server and that client. The main program just goes back and waits for the next connection.

import java.io.*;
import java.net.*;

public class ThreadedEchoServer {
    public static void main(String[] args) throws Exception {
        // create socket
        int port = 4444;
        int i = 1;
        try {
            ServerSocket serverSocket = new ServerSocket(port);
            System.err.println("Started server on port " + port);
            // repeatedly wait for connections, and process
            for (;;) {
                // a "blocking" call which waits until a connection is requested
                Socket clientSocket = serverSocket.accept();
                System.err.println("Accepted connection from client " + i);
                ClientThread ct = new ClientThread(clientSocket, i);
                ct.start();
                i++;
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

class ClientThread extends Thread {
    Socket sc;
    int counter;

    public ClientThread(Socket sc, int counter) {
        this.sc = sc;
        this.counter = counter;
    }

    public void run() {
        try {
            // open up IO streams
            BufferedReader in = new BufferedReader(new InputStreamReader(sc.getInputStream()));
            PrintWriter out = new PrintWriter(sc.getOutputStream(), true /* autoFlush */);
            // waits for data and reads it in until connection dies
            // readLine() blocks until the server receives a new line from client
            String s;
            while ((s = in.readLine()) != null) {
                out.println(s);
                System.out.println(s);
            }
            // close IO streams, then socket
            System.err.println("Closing connection with client: " + counter);
            out.close();
            in.close();
            sc.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
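Starting one thread per connection is simple, but the number of threads grows with the number of clients. A common variation, which is not what the listing above does, hands each accepted socket to a fixed-size thread pool. The sketch below is an illustration under assumptions: the names PooledEchoServer, serve, acceptLoop and echoOnce are invented, the pool size of 4 is arbitrary, and the server uses a system-chosen port so that the example can act as its own client.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PooledEchoServer {
    // Echoes every line back to one client until that client closes the connection.
    static void serve(Socket client) {
        try (Socket sc = client;
             BufferedReader in = new BufferedReader(new InputStreamReader(sc.getInputStream()));
             PrintWriter out = new PrintWriter(sc.getOutputStream(), true)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println(line);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Accepts connections until the listener is closed, giving each one to a pool worker.
    static void acceptLoop(ServerSocket listener, ExecutorService pool) {
        try {
            while (true) {
                Socket client = listener.accept();
                pool.execute(() -> serve(client));
            }
        } catch (IOException e) {
            // accept() fails once the listener is closed; treat that as shutdown.
        }
    }

    // Test client: sends one line to the local server and returns the echoed reply.
    static String echoOnce(int port, String text) throws IOException {
        try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port);
             PrintWriter out = new PrintWriter(s.getOutputStream(), true);
             BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()))) {
            out.println(text);
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        ExecutorService pool = Executors.newFixedThreadPool(4); // arbitrary pool size
        try (ServerSocket listener = new ServerSocket(0)) {     // 0 = any free port
            pool.execute(() -> acceptLoop(listener, pool));
            System.out.println(echoOnce(listener.getLocalPort(), "hello"));
            System.out.println(echoOnce(listener.getLocalPort(), "world"));
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Note that the accept loop itself occupies one pool thread in this sketch, so at most three clients are served at once; further clients wait in the queue until a worker is free.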

Methods in java.net.Socket
Socket(String host, int port)
    creates a socket and connects it to a port on a remote host.
Socket()
    creates a socket that has not yet been connected.
void connect(SocketAddress address)
    connects this socket to the given address. (Since SDK 1.4)
void connect(SocketAddress address, int timeout)
    connects this socket to the given address, or throws a SocketTimeoutException if the time interval (in milliseconds) expires first.
boolean isConnected()
    returns true if the socket is connected. (Since SDK 1.4)
void close()
    closes the socket.
boolean isClosed()
    returns true if the socket is closed. (Since SDK 1.4)
InputStream getInputStream()
    gets the input stream to read from the socket.
OutputStream getOutputStream()
    gets an output stream to write to this socket.
void setSoTimeout(int timeout)
    sets the blocking time (in milliseconds) for read requests on this Socket. If the timeout is reached, a SocketTimeoutException (a subclass of InterruptedIOException) is raised.
void shutdownOutput()
    sets the output stream to "end of stream."
void shutdownInput()
    sets the input stream to "end of stream."
boolean isOutputShutdown()
    returns true if output has been shut down. (Since SDK 1.4)
boolean isInputShutdown()
    returns true if input has been shut down.
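The two timeout-related methods in this list are easy to mix up: the timeout argument of connect limits how long the connection attempt may take, while setSoTimeout limits how long each read may block. The following sketch shows both. The class TimeoutDemo and the method readTimesOut are hypothetical names, and the "server" is just a local ServerSocket that never sends any data.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class TimeoutDemo {
    // Connects to a local server that never sends anything and reports
    // whether the read gave up after readTimeoutMillis.
    static boolean readTimesOut(int readTimeoutMillis) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket silentServer = new ServerSocket(0, 1, loopback);
             Socket socket = new Socket()) {            // created unconnected
            // Give up on the connection attempt after one second.
            socket.connect(new InetSocketAddress(loopback, silentServer.getLocalPort()), 1000);
            socket.setSoTimeout(readTimeoutMillis);     // limit how long read() may block
            try {
                socket.getInputStream().read();         // nobody writes, so this blocks
                return false;
            } catch (SocketTimeoutException e) {
                return true;                            // the read timeout fired
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

After a read timeout the socket is still usable, so a client can retry the read or decide to close the connection.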

Methods in java.net.ServerSocket
ServerSocket(int port) throws IOException
    creates a server socket that monitors a port.
Socket accept() throws IOException
    waits for a connection. This method will block (that is, idle) the current thread until the connection is made. The method returns a Socket object through which the program can communicate with the connecting client.
void close() throws IOException
    closes the server socket.

Sending E-Mail

To send e-mail, you make a socket connection to port 25, the SMTP port. SMTP is the Simple Mail Transfer Protocol, which describes how e-mail messages are handed over to a mail server. You can connect to any server that runs an SMTP service. On UNIX machines, that service is typically implemented by the sendmail daemon. Here are the details:

1. Open a socket to your host.
   Socket s = new Socket("mail.yourserver.com", 25); // 25 is SMTP
   PrintWriter out = new PrintWriter(s.getOutputStream());
2. Send the following information to the print stream:
   HELO sending host
   MAIL FROM: <sender email address>
   RCPT TO: <recipient email address>
   DATA
   mail message
   (any number of lines)
   .
   QUIT

The SMTP specification (RFC 821) states that lines must be terminated with \r followed by \n.
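The command sequence above, together with the \r\n rule, can be captured in a few lines that involve no networking at all. This is only a sketch: SmtpLines, commands and wireFormat are invented names, and a real client must also read the server's reply after each command instead of sending everything in one go.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SmtpLines {
    static final String CRLF = "\r\n"; // line terminator required by the SMTP specification

    // Builds the command sequence for one message, in the order listed above.
    static List<String> commands(String helloHost, String from, String to, List<String> body) {
        List<String> lines = new ArrayList<>();
        lines.add("HELO " + helloHost);
        lines.add("MAIL FROM:<" + from + ">");
        lines.add("RCPT TO:<" + to + ">");
        lines.add("DATA");
        lines.addAll(body);
        lines.add(".");       // a line with a single dot ends the message
        lines.add("QUIT");
        return lines;
    }

    // Joins the lines exactly as they would travel over the socket.
    static String wireFormat(List<String> lines) {
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(line).append(CRLF);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> lines = commands("myhost", "a@example.com", "b@example.com",
                Arrays.asList("Hello,", "this is a test."));
        // Show the terminators explicitly so they are visible on the console.
        System.out.print(wireFormat(lines).replace("\r\n", "\\r\\n\n"));
    }
}
```

Writing the terminator explicitly avoids relying on println, which uses the platform's line separator and therefore produces only \n on UNIX-like systems.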

The following program creates a client that sends a mail to an SMTP server on the local host, such as the DevNullSMTP server:

import java.io.*;
import java.net.*;

public class SMTPDemo {
    public static void main(String[] args) throws IOException {
        String msgFile = "file1.txt";
        String from = "java2s@java2s.com";
        String to = "someone@otherend.com";
        String mailHost = "localhost";
        SMTP mail = new SMTP(mailHost);
        if (mail.send(new FileReader(msgFile), from, to)) {
            System.out.println("Mail sent.");
        } else {
            System.out.println("Connect to SMTP server failed!");
        }
        System.out.println("Done.");
    }

    static class SMTP {
        private static final int SMTP_PORT = 25;
        InetAddress mailHost;
        InetAddress localhost;
        BufferedReader in;
        PrintWriter out;

        public SMTP(String host) throws UnknownHostException {
            mailHost = InetAddress.getByName(host);
            localhost = InetAddress.getLocalHost();
            System.out.println("mailhost = " + mailHost);
            System.out.println("localhost= " + localhost);
            System.out.println("SMTP constructor done\n");
        }

        // Echoes one command to the console and sends it, terminated with \r\n
        // as the SMTP specification requires.
        private void command(String line) {
            System.out.println(line);
            out.print(line + "\r\n");
            out.flush();
        }

        public boolean send(FileReader msgFileReader, String from, String to) throws IOException {
            BufferedReader msg = new BufferedReader(msgFileReader);
            Socket smtpPipe;
            try {
                smtpPipe = new Socket(mailHost, SMTP_PORT);
            } catch (IOException e) {
                // the constructor throws instead of returning null when the connection fails
                return false;
            }
            in = new BufferedReader(new InputStreamReader(smtpPipe.getInputStream()));
            out = new PrintWriter(new OutputStreamWriter(smtpPipe.getOutputStream()));
            System.out.println(in.readLine());   // greeting sent by the server
            command("HELO " + localhost.getHostName());
            System.out.println(in.readLine());   // welcome
            command("MAIL FROM:<" + from + ">");
            System.out.println(in.readLine());   // sender accepted
            command("RCPT TO:<" + to + ">");
            System.out.println(in.readLine());   // recipient accepted
            command("DATA");
            System.out.println(in.readLine());   // server is ready for the message text
            String line;
            while ((line = msg.readLine()) != null) {
                out.print(line + "\r\n");
            }
            command(".");
            System.out.println(in.readLine());   // message accepted
            command("QUIT");
            msg.close();
            smtpPipe.close();
            return true;
        }
    }
}

Internet Addressing
Every computer on the Internet has an address. An Internet address is a number that uniquely identifies each computer on the Net. Originally, all Internet addresses consisted of 32-bit values. This address type was specified by IPv4 (Internet Protocol, version 4). However, a new addressing scheme, called IPv6 (Internet Protocol, version 6), has come into play. IPv6 uses a 128-bit value to represent an address. Although there are several reasons for and advantages to IPv6, the main one is that it supports a much larger address space than does IPv4. Fortunately, IPv6 is downwardly compatible with IPv4. Currently, IPv4 is by far the most widely used scheme, but this situation is likely to change over time.

Because of the emerging importance of IPv6, Java 2, version 1.4 has begun to add support for it. However, at the time of this writing, IPv6 is not supported by all environments. Furthermore, for the next few years, IPv4 will continue to be the dominant form of addressing. For these reasons, the form of Internet addresses discussed here, and used in this chapter, is the IPv4 form. As mentioned, IPv4 is, loosely, a subset of IPv6, and the material contained in this chapter is largely applicable to both forms of addressing.

There are 32 bits in an IPv4 address, and we often refer to them as a sequence of four numbers between 0 and 255 separated by dots (.). This makes them easier to remember, because they are not randomly assigned; they are hierarchically assigned. The first few bits define which class of network, lettered A, B, C, D, or E, the address represents. Most Internet users are on a class C network, since there are over two million networks in class C. The first byte of a class C network is between 192 and 223, with the last byte actually identifying an individual computer among the 256 allowed on a single class C network. This scheme allows for half a billion devices to live on class C networks.

Domain Name System (DNS)

The Internet wouldn't be a very friendly place to navigate if everyone had to refer to their addresses as numbers. For example, it is difficult to imagine seeing http://192.9.9.1/ at the bottom of an advertisement. Thankfully, a clearing house exists for a parallel hierarchy of names to go with all these numbers. It is called the Domain Name System (DNS). Just as the four numbers of an IP address describe a network hierarchy from left to right, the name of an Internet address, called its domain name, describes a machine's location in a name space, from right to left. For example, www.osborne.com is in the COM domain (reserved for U.S. commercial sites), it is called osborne (after the company name), and www is the name of the specific computer that is Osborne's web server. www corresponds to the rightmost number in the equivalent IP address.

Java and the Net

Now that the stage has been set, let's take a look at how Java relates to all of these network concepts. Java supports TCP/IP both by extending the already established stream I/O interface and by adding the features required to build I/O objects across the network. Java supports both the TCP and UDP protocol families. TCP is used for reliable stream-based I/O across the network. UDP supports a simpler, hence faster, point-to-point datagram-oriented model.

The Networking Classes and Interfaces

The classes contained in the java.net package are listed here:

The java.net package's interfaces are listed here:

InetAddress

Whether you are making a phone call, sending mail, or establishing a connection across the Internet, addresses are fundamental. The InetAddress class is used to encapsulate both the numerical IP address we discussed earlier and the domain name for that address. You interact with this class by using the name of an IP host, which is more convenient and understandable than its IP address. The InetAddress class hides the number inside. As of Java 2, version 1.4, InetAddress can handle both IPv4 and IPv6 addresses. This discussion assumes IPv4.

Factory Methods

The InetAddress class has no visible constructors. To create an InetAddress object, you have to use one of the available factory methods. Factory methods are merely a convention whereby static methods in a class return an instance of that class. This is done in lieu of overloading a constructor with various parameter lists when having unique method names makes the results much clearer. Three commonly used InetAddress factory methods are shown here.

static InetAddress getLocalHost( ) throws UnknownHostException
static InetAddress getByName(String hostName) throws UnknownHostException
static InetAddress[ ] getAllByName(String hostName) throws UnknownHostException

The getLocalHost( ) method simply returns the InetAddress object that represents the local host. The getByName( ) method returns an InetAddress for a host name passed to it. If these methods are unable to resolve the host name, they throw an UnknownHostException.

On the Internet, it is common for a single name to be used to represent several machines. In the world of web servers, this is one way to provide some degree of scaling. The getAllByName( ) factory method returns an array of InetAddresses that represent all of the addresses that a particular name resolves to. It will also throw an UnknownHostException if it can't resolve the name to at least one address.

Java 2, version 1.4 also includes the factory method getByAddress( ), which takes an IP address and returns an InetAddress object. Either an IPv4 or an IPv6 address can be used.

The following example prints the addresses and names of the local machine and two well-known Internet web sites:

// Demonstrate InetAddress.
import java.net.*;

class InetAddressTest {
    public static void main(String args[]) throws UnknownHostException {
        InetAddress address = InetAddress.getLocalHost();
        System.out.println(address);
        address = InetAddress.getByName("osborne.com");
        System.out.println(address);
        InetAddress sw[] = InetAddress.getAllByName("www.nba.com");
        for (int i = 0; i < sw.length; i++)
            System.out.println(sw[i]);
    }
}

Here is the output produced by this program. (Of course, the output you see will be slightly different.)

default/206.148.209.138
osborne.com/198.45.24.162
www.nba.com/64.241.238.153
www.nba.com/64.241.238.142

Instance Methods

The InetAddress class also has several other methods, which can be used on the objects returned by the methods just discussed. Here are some of the most commonly used.

boolean equals(Object other)
    Returns true if this object has the same Internet address as other.
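The description of an IPv4 address as 32 bits written as four numbers, with the first byte deciding the network class, can be tried out directly with InetAddress. The sketch below is illustrative only: Ipv4Demo, toUnsigned32 and networkClass are invented names, the code assumes it is given an IPv4 address, and it passes numeric address literals so that no name lookup is needed.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class Ipv4Demo {
    // Packs the four address bytes into one 32-bit value
    // (kept in a long so that it prints as an unsigned number).
    static long toUnsigned32(InetAddress address) {
        long value = 0;
        for (byte b : address.getAddress()) {
            value = (value << 8) | (b & 0xFF);
        }
        return value;
    }

    // Classful network letter, decided by the first byte of the address.
    static char networkClass(InetAddress address) {
        int first = address.getAddress()[0] & 0xFF;
        if (first < 128) return 'A';
        if (first < 192) return 'B';
        if (first < 224) return 'C';
        if (first < 240) return 'D';
        return 'E';
    }

    public static void main(String[] args) throws UnknownHostException {
        InetAddress byName = InetAddress.getByName("192.9.9.1"); // numeric literal, no lookup
        InetAddress byBytes = InetAddress.getByAddress(new byte[] {(byte) 192, 9, 9, 1});
        System.out.println(byName.getHostAddress() + " = " + toUnsigned32(byName));
        System.out.println("class " + networkClass(byName));
        System.out.println("same address: " + byName.equals(byBytes));
    }
}
```

The equals( ) call compares only the address bytes, so the two objects are equal even though they were created through different factory methods.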

Internet addresses are looked up in a series of hierarchically cached servers. That means that your local computer might know a particular name-to-IP-address mapping automatically, such as for itself and nearby servers. For other names, it may ask a local DNS server for IP address information. If that server doesn't have a particular address, it can go to a remote site and ask for it. This can continue all the way up to the root server, called InterNIC (internic.net). This process might take a long time, so it is wise to structure your code so that you cache IP address information locally rather than look it up repeatedly.
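One way to follow the advice about caching is to keep resolved addresses in a map and consult it before asking InetAddress again. The class below is a minimal sketch under assumptions: LookupCache, resolve and lookupCount are invented names, entries never expire, and the class is not thread-safe. The Java runtime also keeps its own cache of successful lookups, so an application-level cache like this mainly saves repeated calls in code that resolves the same names very often.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.HashMap;
import java.util.Map;

class LookupCache {
    private final Map<String, InetAddress> cache = new HashMap<>();
    private int lookups = 0; // how many times InetAddress was actually asked

    // Returns the cached address for host, resolving it only on first use.
    InetAddress resolve(String host) throws UnknownHostException {
        InetAddress cached = cache.get(host);
        if (cached != null) {
            return cached;
        }
        InetAddress fresh = InetAddress.getByName(host);
        lookups++;
        cache.put(host, fresh);
        return fresh;
    }

    int lookupCount() {
        return lookups;
    }

    public static void main(String[] args) throws UnknownHostException {
        LookupCache cache = new LookupCache();
        cache.resolve("127.0.0.1"); // resolved
        cache.resolve("127.0.0.1"); // served from the map
        System.out.println("lookups performed: " + cache.lookupCount());
    }
}
```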
What Is a Network Interface?

A network interface is the point of interconnection between a computer and a private or public network. A network interface is generally a network interface card (NIC), but does not have to have a physical form. Instead, the network interface can be implemented in software. For example, the loopback interface (127.0.0.1 for IPv4 and ::1 for IPv6) is not a physical device but a piece of software simulating a network interface. The loopback interface is commonly used in test environments. The java.net.NetworkInterface class represents both types of interfaces. NetworkInterface is useful for a multihomed system, which is a system with multiple NICs. Using NetworkInterface, you can specify which NIC to use for a particular network activity. For example, assume you have a machine with two configured NICs, and you want to send data to a server. You create a socket like this:

Socket soc = new java.net.Socket();
soc.connect(new InetSocketAddress(address, port));

To send the data, the system determines which interface will be used. However, if you have a preference or otherwise need to specify which NIC to use, you can query the system for the appropriate interfaces and find an address on the interface you want to use. When you create the socket and bind it to that address, the system will use the associated interface. For example:

NetworkInterface nif = NetworkInterface.getByName("bge0");
Enumeration<InetAddress> nifAddresses = nif.getInetAddresses();
Socket soc = new java.net.Socket();
soc.bind(new InetSocketAddress(nifAddresses.nextElement(), 0));
soc.connect(new InetSocketAddress(address, port));

You can also use NetworkInterface to identify the local interface on which a multicast group is to be joined. For example:

NetworkInterface nif = NetworkInterface.getByName("bge0");
MulticastSocket ms = new MulticastSocket();
ms.joinGroup(new InetSocketAddress(hostname, port), nif);

NetworkInterface can be used with Java APIs in many other ways beyond the two uses described here.
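Binding a socket to a local address before connecting can be tried without a second NIC by using the loopback interface. The sketch below is an illustration, not part of the original text: BindDemo and connectFrom are invented names, and the server is a local ServerSocket on a system-chosen port.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

class BindDemo {
    // Connects to server using the given local address and returns the
    // local address the connection actually used.
    static InetAddress connectFrom(InetAddress localAddress, InetSocketAddress server)
            throws IOException {
        try (Socket soc = new Socket()) {
            // bind() takes a SocketAddress; port 0 lets the system pick a free local port
            soc.bind(new InetSocketAddress(localAddress, 0));
            soc.connect(server);
            return soc.getLocalAddress();
        }
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket listener = new ServerSocket(0, 1, loopback)) {
            InetSocketAddress server = new InetSocketAddress(loopback, listener.getLocalPort());
            System.out.println("connected from " + connectFrom(loopback, server));
        }
    }
}
```

On a multihomed machine the same method could be called with an address taken from NetworkInterface.getInetAddresses() to force traffic out of a particular interface.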
Retrieving Network Interfaces

The NetworkInterface class has no public constructor. Therefore, you cannot just create a new instance of this class with the new operator. Instead, the following static methods are available so that you can retrieve the interface details from the system: getByInetAddress(), getByName(), and getNetworkInterfaces(). The first two methods are used when you already know the IP address or the name of the particular interface. The third method, getNetworkInterfaces() returns the complete list of interfaces on the machine. Network interfaces can be hierarchically organized. The NetworkInterface class includes two methods, getParent() and getSubInterfaces(), that are pertinent to a network interface hierarchy. The getParent() method returns the parent NetworkInterface of an interface. If a network interface is a subinterface, getParent() returns a non-null value. The getSubInterfaces() method returns all the subinterfaces of a network interface. The following example program lists the name of all the network interfaces and subinterfaces (if any exist) on a machine.

import java.io.*;
import java.net.*;
import java.util.*;
import static java.lang.System.out;

public class ListNIFs {
    public static void main(String args[]) throws SocketException {
        Enumeration<NetworkInterface> nets = NetworkInterface.getNetworkInterfaces();
        for (NetworkInterface netIf : Collections.list(nets)) {
            out.printf("Display name: %s\n", netIf.getDisplayName());
            out.printf("Name: %s\n", netIf.getName());
            displaySubInterfaces(netIf);
            out.printf("\n");
        }
    }

    static void displaySubInterfaces(NetworkInterface netIf) throws SocketException {
        Enumeration<NetworkInterface> subIfs = netIf.getSubInterfaces();
        for (NetworkInterface subIf : Collections.list(subIfs)) {
            out.printf("\tSub Interface Display name: %s\n", subIf.getDisplayName());
            out.printf("\tSub Interface Name: %s\n", subIf.getName());
        }
    }
}

The following is sample output from the example program:

Display name: bge0
Name: bge0
    Sub Interface Display name: bge0:3
    Sub Interface Name: bge0:3
    Sub Interface Display name: bge0:2
    Sub Interface Name: bge0:2
    Sub Interface Display name: bge0:1
    Sub Interface Name: bge0:1

Display name: lo0
Name: lo0
Listing Network Interface Addresses

One of the most useful pieces of information you can get from a network interface is the list of IP addresses that are assigned to it. You can obtain this information from a NetworkInterface instance by using one of two methods. The first method, getInetAddresses(), returns an Enumeration of InetAddress. The other method, getInterfaceAddresses(), returns a list of java.net.InterfaceAddress instances. This method is used when you need more information about an interface address beyond its IP address. For example, you might need additional information about the subnet mask and broadcast address when the address is an IPv4 address, and a network prefix length in the case of an IPv6 address.

The following example program lists all the network interfaces and their addresses on a machine:

import java.io.*;
import java.net.*;
import java.util.*;
import static java.lang.System.out;

public class ListNets {
    public static void main(String args[]) throws SocketException {
        Enumeration<NetworkInterface> nets = NetworkInterface.getNetworkInterfaces();
        for (NetworkInterface netint : Collections.list(nets))
            displayInterfaceInformation(netint);
    }

    static void displayInterfaceInformation(NetworkInterface netint) throws SocketException {
        out.printf("Display name: %s\n", netint.getDisplayName());
        out.printf("Name: %s\n", netint.getName());
        Enumeration<InetAddress> inetAddresses = netint.getInetAddresses();
        for (InetAddress inetAddress : Collections.list(inetAddresses)) {
            out.printf("InetAddress: %s\n", inetAddress);
        }
        out.printf("\n");
    }
}

The following is sample output from the example program:

Display name: bge0
Name: bge0
InetAddress: /fe80:0:0:0:203:baff:fef2:e99d%2
InetAddress: /121.153.225.59

Display name: lo0
Name: lo0
InetAddress: /0:0:0:0:0:0:0:1%1
InetAddress: /127.0.0.1


Network Interface Parameters

You can access network parameters about a network interface beyond the name and IP addresses assigned to it. You can discover if a network interface is up (that is, running) with the isUp() method. The following methods indicate the network interface type:

isLoopback() indicates if the network interface is a loopback interface.
isPointToPoint() indicates if the interface is a point-to-point interface.
isVirtual() indicates if the interface is a virtual interface.

The supportsMulticast() method indicates whether the network interface supports multicasting. The getHardwareAddress() method returns the network interface's physical hardware address, usually called the MAC address, when it is available. The getMTU() method returns the Maximum Transmission Unit (MTU), which is the largest packet size.

The following example expands on the example in Listing Network Interface Addresses by adding the additional network parameters described on this page:

import java.io.*;
import java.net.*;
import java.util.*;
import static java.lang.System.out;

public class ListNetsEx {
    public static void main(String args[]) throws SocketException {
        Enumeration<NetworkInterface> nets = NetworkInterface.getNetworkInterfaces();
        for (NetworkInterface netint : Collections.list(nets))
            displayInterfaceInformation(netint);
    }

    static void displayInterfaceInformation(NetworkInterface netint) throws SocketException {
        out.printf("Display name: %s\n", netint.getDisplayName());
        out.printf("Name: %s\n", netint.getName());
        Enumeration<InetAddress> inetAddresses = netint.getInetAddresses();
        for (InetAddress inetAddress : Collections.list(inetAddresses)) {
            out.printf("InetAddress: %s\n", inetAddress);
        }
        out.printf("Up? %s\n", netint.isUp());
        out.printf("Loopback? %s\n", netint.isLoopback());
        out.printf("PointToPoint? %s\n", netint.isPointToPoint());
        out.printf("Supports multicast? %s\n", netint.supportsMulticast());
        out.printf("Virtual? %s\n", netint.isVirtual());
        out.printf("Hardware address: %s\n", Arrays.toString(netint.getHardwareAddress()));
        out.printf("MTU: %s\n", netint.getMTU());
        out.printf("\n");
    }
}

The following is sample output from the example program:

Display name: bge0
Name: bge0
InetAddress: /fe80:0:0:0:203:baff:fef2:e99d%2
InetAddress: /129.156.225.59
Up? true
Loopback? false
PointToPoint? false
Supports multicast? false
Virtual? false
Hardware address: [0, 3, 4, 5, 6, 7]
MTU: 1500

Display name: lo0
Name: lo0
InetAddress: /0:0:0:0:0:0:0:1%1
InetAddress: /127.0.0.1
Up? true
Loopback? true
PointToPoint? false
Supports multicast? false
Virtual? false
Hardware address: null
MTU: 8232
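The sample output prints the hardware address as an array of signed byte values, which is hard to compare with the usual colon-separated MAC notation. A small helper can convert the byte array returned by getHardwareAddress(). This is a sketch with invented names (MacFormat, format), and it returns "n/a" for interfaces such as the loopback interface that report null.

```java
class MacFormat {
    // Formats a hardware address as colon-separated hex, e.g. 00:03:04:05:06:07.
    static String format(byte[] mac) {
        if (mac == null) {
            return "n/a"; // getHardwareAddress() returns null when no address is available
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < mac.length; i++) {
            if (i > 0) {
                sb.append(':');
            }
            // mask with 0xFF so that negative bytes print as two hex digits
            sb.append(String.format("%02x", mac[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(format(new byte[] {0, 3, 4, 5, 6, 7}));
        System.out.println(format(null));
    }
}
```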

Cookies
A cookie, also known as a web cookie, browser cookie, or HTTP cookie, is a piece of text stored by a user's web browser. A cookie can be used for authentication, storing site preferences, shopping cart contents, the identifier for a server-based session, or anything else that can be accomplished through storing text data. A cookie consists of one or more name-value pairs containing bits of information, which may be encrypted for information privacy and data security purposes. The cookie is sent as an HTTP header by a web server to a web browser and then sent back unchanged by the browser each time it accesses that server.

Cookies, as with a cache, can be cleared to reclaim file storage space. If not manually deleted by the user, cookies usually have an expiration date associated with them (established by the server that set them). Once that date has passed, the cookies stored by the client are automatically deleted.

As text, cookies are not executable. Because they are not executed, they cannot replicate themselves and are not viruses. However, due to the browser mechanism to set and read cookies, they can be used as spyware. Anti-spyware products may warn users about some cookies because cookies can be used to track people, which is a privacy concern. Most modern browsers allow users to decide whether to accept cookies, and the time frame to keep them, but rejecting cookies makes some websites unusable.

Uses
Session management
Cookies may be used to maintain data related to the user during navigation, possibly across multiple visits. Cookies were introduced to provide a way to implement a "shopping cart" (or "shopping basket"),[2][3] a virtual device into which users can store items they want to purchase as they navigate throughout the site. Shopping basket applications today usually store the list of basket contents in a database on the server side, rather than storing basket items in the cookie itself. A web server typically sends a cookie containing a unique session identifier. The web browser will send back that session identifier with each subsequent request and shopping basket items are stored associated with a unique session identifier.

Allowing users to log in to a website is a frequent use of cookies. Typically the web server will first send a cookie containing a unique session identifier. Users then submit their credentials and the web application authenticates the session and allows the user access to services.

Personalization
Cookies may be used to remember the information about the user who has visited a website in order to show relevant content in the future. For example a web server may send a cookie containing the username last used to log in to a web site so that it may be filled in for future visits. Many websites use cookies for personalization based on users' preferences. Users select their preferences by entering them in a web form and submitting the form to the server. The server encodes the preferences in a cookie and sends the cookie back to the browser. This way, every time the user accesses a page, the server is also sent the cookie where the preferences are stored, and can personalize the page according to the user preferences. For example, the Wikipedia website allows authenticated users to choose the webpage skin they like best; the Google search engine allows users (even non-registered ones) to decide how many search results per page they want to see.

Tracking
Tracking cookies may be used to track internet users' web browsing habits. This can also be done in part by using the IP address of the computer requesting the page or the referrer field of the HTTP header, but cookies allow for greater precision. This can be done, for example, as follows:

1. If the user requests a page of the site, but the request contains no cookie, the server presumes that this is the first page visited by the user; the server creates a random string and sends it as a cookie back to the browser together with the requested page.
2. From this point on, the cookie will be automatically sent by the browser to the server every time a new page from the site is requested; the server sends the page as usual, but also stores the URL of the requested page, the date/time of the request, and the cookie in a log file.

By looking at the log file, it is then possible to find out which pages the user has visited and in what sequence. For example, if the log contains some requests done using the cookie id=abc, it can be determined that these requests all come from the same user. The URL and date/time stored with the cookie allow for finding out which pages the user has visited, and at what time.

Third-party cookies and Web bugs, explained below, also allow for tracking across multiple sites. Tracking within a site is typically used to produce usage statistics, while tracking across sites is typically used by advertising companies to produce anonymous user profiles (which are then used to determine what advertisements should be shown to the user).

A tracking cookie may potentially infringe upon the user's privacy, but it can be easily removed. Current versions of popular web browsers include options to delete 'persistent' cookies when the application is closed.
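The server-side bookkeeping described above can be modelled in a few lines without any web framework. The sketch below is purely illustrative: VisitTracker, handleRequest and pagesVisitedBy are invented names, the "log file" is an in-memory list, and the random string is a UUID, which is one possible choice rather than anything the text prescribes.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

class VisitTracker {
    // One log line: which cookie requested which URL, and when.
    static final class Entry {
        final String cookieId;
        final String url;
        final Instant time;

        Entry(String cookieId, String url, Instant time) {
            this.cookieId = cookieId;
            this.url = url;
            this.time = time;
        }
    }

    private final List<Entry> log = new ArrayList<>();

    // Handles one request. cookieId is the value the browser sent, or null if
    // the request carried no cookie. Returns the id to send back as a cookie.
    String handleRequest(String cookieId, String url) {
        String id = (cookieId != null) ? cookieId : UUID.randomUUID().toString();
        log.add(new Entry(id, url, Instant.now()));
        return id;
    }

    // Reconstructs, in order, the pages requested with the given cookie.
    List<String> pagesVisitedBy(String cookieId) {
        List<String> pages = new ArrayList<>();
        for (Entry e : log) {
            if (e.cookieId.equals(cookieId)) {
                pages.add(e.url);
            }
        }
        return pages;
    }

    public static void main(String[] args) {
        VisitTracker tracker = new VisitTracker();
        String id = tracker.handleRequest(null, "/home"); // first visit, no cookie yet
        tracker.handleRequest(id, "/products");           // browser sends the cookie back
        System.out.println(tracker.pagesVisitedBy(id));
    }
}
```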

Third-party cookies
When viewing a Web page, images or other objects contained within the page may reside on servers other than the one named in the URL shown in your browser. While rendering the page, the browser downloads all these objects. Most modern websites contain information from many different sources. For example, if you type www.domain.com into your browser, widgets and advertisements within this page are often served from a different domain. While this information is being retrieved, some of these sources may set cookies in your browser. First-party cookies are cookies that are set by the same domain that is in your browser's address bar. Third-party cookies are cookies set by one of these widgets or other inserts coming from a different domain.

The standards for cookies, RFC 2109 and RFC 2965, specify that browsers should protect user privacy and not allow third-party cookies by default. But most browsers, such as Mozilla Firefox, Internet Explorer and Opera, do allow third-party cookies by default, though they allow users to block them. Some Internet users disable them because they can be used to track a user browsing from one website to another. This tracking is most often done by on-line advertising companies to assist in targeting advertisements. For example, suppose a user visits www.domain1.com and an advertiser sets a cookie in the user's browser, and then the user later visits www.domain2.com. If the same company advertises on both sites, the advertiser knows that this particular user who is now viewing www.domain2.com also viewed www.domain1.com in the past, and may thus more effectively target the user's interests or avoid repeating advertisements. The advertiser can then build up profiles on users.

Implementation
Cookies are arbitrary pieces of data chosen by the Web server and sent to the browser. The browser returns them unchanged to the server, introducing a state (memory of previous events) into otherwise stateless HTTP transactions. Without cookies, each retrieval of a Web page or component of a Web page is an isolated event, mostly unrelated to all other views of the pages of the same site. Other than being set by a web server, cookies can also be set by a script in a language such as JavaScript, if supported and enabled by the Web browser.

Cookie specifications[8][9] suggest that browsers should be able to save and send back a minimum number of cookies. In particular, an internet browser is expected to be able to store at least 300 cookies of four kilobytes each, and at least 20 cookies per server or domain.

The cookie setter can specify a deletion date, in which case the cookie will be removed on that date. If the cookie setter does not specify a date, the cookie is removed once the user quits the browser. As a result, specifying a date is a way of making a cookie survive across sessions. For this reason, cookies with an expiration date are called persistent. As an example application, a shopping site can use persistent cookies to store the items users have placed in their basket. (In reality, the cookie may refer to an entry in a database stored at the shopping site, not on your computer.) This way, if users quit their browser without making a purchase and return later, they still find the same items in the basket and do not have to look for these items again. If these cookies were not given an expiration date, they would expire when the browser is closed, and the information about the basket content would be lost.

Cookies can also be limited in scope to a specific domain, subdomain or path on the web server which created them.
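The difference between a session cookie and a persistent cookie can be illustrated with the java.net.HttpCookie class. This is a minimal sketch, not a shopping-site implementation: the cookie name basket, the value 42, the domain www.example.org and the path /shop are made-up examples. Note also that HttpCookie expresses lifetime as a maximum age in seconds rather than as an absolute deletion date.

```java
import java.net.HttpCookie;

/** Hypothetical sketch contrasting session and persistent cookies. */
final class CookieLifetimeSketch {
    /** A cookie with no lifetime set: kept only until the browser exits. */
    static HttpCookie sessionCookie() {
        return new HttpCookie("basket", "42");
    }

    /** A persistent cookie: lives for the given number of seconds, scoped to one domain and path. */
    static HttpCookie persistentCookie(long lifetimeSeconds) {
        HttpCookie cookie = new HttpCookie("basket", "42");
        cookie.setMaxAge(lifetimeSeconds);    // a maximum age of 0 means "delete now"
        cookie.setDomain("www.example.org");  // example scope, not from the text
        cookie.setPath("/shop");
        return cookie;
    }

    public static void main(String[] args) {
        HttpCookie session = sessionCookie();
        HttpCookie persistent = persistentCookie(7L * 24 * 60 * 60); // one week

        // getMaxAge() returns -1 when no lifetime was specified.
        System.out.println("session max age:    " + session.getMaxAge());
        System.out.println("persistent max age: " + persistent.getMaxAge());
        System.out.println("persistent expired: " + persistent.hasExpired());
    }
}
```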

Setting a cookie
Transfer of Web pages follows the HyperText Transfer Protocol (HTTP). Regardless of cookies, browsers request a page from web servers by sending them a usually short piece of text called an HTTP request. For example, to access the page http://www.example.org/index.html, browsers connect to the server www.example.org and send it a request that looks like the following one:
GET /index.html HTTP/1.1
Host: www.example.org

The server replies by sending the requested page preceded by a similar packet of text, called an HTTP response. This packet may contain lines requesting the browser to store cookies:
HTTP/1.1 200 OK
Content-type: text/html
Set-Cookie: name=value

(content of page)

The server sends the line Set-Cookie only if the server wishes the browser to store a cookie. Set-Cookie is a request for the browser to store the string name=value and send it back in all future requests to the server. If the browser supports cookies and cookies are enabled, every subsequent page request to the same server will include the cookie. For example, the browser requests the page http://www.example.org/spec.html by sending the server www.example.org a request like the following:
GET /spec.html HTTP/1.1
Host: www.example.org
Cookie: name=value
Accept: */*

This is a request for another page from the same server, and differs from the first one above because it contains the string that the server has previously sent to the browser. This way, the server knows that this request is related to the previous one. The server answers by sending the requested page, possibly adding other cookies as well. The value of a cookie can be modified by the server by sending a new Set-Cookie: name=newvalue line in response to a page request. The browser then replaces the old value with the new one.
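The browser's side of this exchange can be sketched as follows. The class CookieExchangeSketch and its methods are hypothetical, and the sketch ignores cookie attributes such as domain, path and expiry; it only shows how a Set-Cookie line is stored and how a later Set-Cookie line with the same name replaces the stored value. Parsing is delegated to java.net.HttpCookie.parse.

```java
import java.net.HttpCookie;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical, simplified model of a browser's cookie store for one server. */
final class CookieExchangeSketch {
    // Cookie name -> value, in the order the cookies were first received.
    private final Map<String, String> jar = new LinkedHashMap<>();

    /** Stores (or replaces) the cookies carried by one Set-Cookie response line. */
    void receive(String setCookieLine) {
        for (HttpCookie cookie : HttpCookie.parse(setCookieLine)) {
            jar.put(cookie.getName(), cookie.getValue());
        }
    }

    /** Builds the Cookie line for the next request, or null if nothing is stored. */
    String cookieRequestLine() {
        if (jar.isEmpty()) {
            return null;
        }
        StringJoiner pairs = new StringJoiner("; ", "Cookie: ", "");
        for (Map.Entry<String, String> entry : jar.entrySet()) {
            pairs.add(entry.getKey() + "=" + entry.getValue());
        }
        return pairs.toString();
    }

    public static void main(String[] args) {
        CookieExchangeSketch browser = new CookieExchangeSketch();
        browser.receive("Set-Cookie: name=value");
        System.out.println(browser.cookieRequestLine()); // prints "Cookie: name=value"
        browser.receive("Set-Cookie: name=newvalue");
        System.out.println(browser.cookieRequestLine()); // prints "Cookie: name=newvalue"
    }
}
```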

The term "cookie crumb" is sometimes used to refer to the name-value pair.[10] This is not the same as breadcrumb web navigation, which is the technique of showing in each page the list of pages the user has previously visited; this technique, however, may be implemented using cookies.

The Set-Cookie line is typically not created by the base HTTP server but by a CGI program. The basic HTTP server facility (e.g. Apache) just sends the result of the program (a document preceded by the header containing the cookies) to the browser.

Cookies can also be set by JavaScript or similar scripts running within the browser. In JavaScript, the object document.cookie is used for this purpose. For example, the instruction document.cookie = "temperature=20" creates a cookie of name temperature and value 20.

CookieHandler Callback Mechanism
HTTP state management is implemented in Java SE through the java.net.CookieHandler class. A CookieHandler object provides a callback mechanism to provide an HTTP state management policy implementation in the HTTP protocol handler. That is, URLs that use HTTP as the protocol, new URL("http://java.sun.com") for example, will use the HTTP protocol handler. This protocol handler calls back to the CookieHandler object, if set, to handle the state management.

The CookieHandler class is an abstract class that has two pairs of related methods. The first pair, getDefault() and setDefault(cookieHandler), are static methods that enable you to discover the current handler that is installed and to install your own handler. No default handler is installed, and installing a handler is done on a system-wide basis. For applications running within a secure environment, that is, applications that have a security manager installed, you must have special permission to get and set the handler. For more information, see java.net.CookieHandler.getDefault.

The second pair of related methods, put(uri, responseHeaders) and get(uri, requestHeaders), enable you to set and get all the applicable cookies to and from a cookie cache for the specified URI in the response/request headers, respectively. These methods are abstract, and a concrete implementation of a CookieHandler must provide the implementation.

Java Web Start and Java Plug-in have a default CookieHandler installed. However, if you are running a stand-alone application and want to enable HTTP state management, you must set a system-wide handler. The next two pages in this lesson show you how to do so.
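As a minimal sketch of the two method pairs, the following example installs java.net.CookieManager (a concrete CookieHandler shipped with Java SE 6 and later) as the system-wide handler, then calls put and get by hand the way the HTTP protocol handler would. The class name CookieHandlerSketch, the helper methods and the example URIs are assumptions made for illustration; no network access takes place.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.URI;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the CookieHandler callback methods. */
final class CookieHandlerSketch {
    /** Installs a CookieManager as the system-wide handler and returns it. */
    static CookieManager install() {
        CookieManager manager = new CookieManager();
        CookieHandler.setDefault(manager);
        return manager;
    }

    /**
     * Passes one Set-Cookie response header to put(), then asks get() which
     * Cookie header values apply to a later request URI.
     */
    static List<String> roundTrip(CookieHandler handler, String setCookieValue,
                                  URI responseUri, URI nextRequestUri) {
        try {
            Map<String, List<String>> responseHeaders = new HashMap<>();
            responseHeaders.put("Set-Cookie", Collections.singletonList(setCookieValue));
            handler.put(responseUri, responseHeaders);

            Map<String, List<String>> requestHeaders =
                    handler.get(nextRequestUri, new HashMap<String, List<String>>());
            List<String> cookies = requestHeaders.get("Cookie");
            return (cookies == null) ? Collections.<String>emptyList() : cookies;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        CookieHandler handler = install();
        List<String> cookies = roundTrip(handler, "name=value",
                URI.create("http://www.example.org/index.html"),
                URI.create("http://www.example.org/spec.html"));
        System.out.println(cookies);
    }
}
```

Once a handler is installed with setDefault, the HTTP protocol handler performs the put and get calls itself for every HTTP request made through java.net.URL connections.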

Understanding the Sockets Direct Protocol

For high performance computing environments, the capacity to move data across a network quickly and efficiently is a requirement. Such networks are typically described as requiring high throughput and low latency. High throughput refers to an environment that can deliver a large amount of processing capacity over a long period of time. Low latency refers to the minimal delay between processing input and providing output, such as you would expect in a real-time application. In these environments, conventional networking using socket streams can create bottlenecks when it comes to moving data. Introduced in 1999 by the InfiniBand Trade Association, InfiniBand (IB) was created to address the need for high performance computing. One of the most important features of IB is Remote Direct Memory Access (RDMA). RDMA enables moving data directly from the memory of one computer to another computer, bypassing the operating system of both computers and resulting in significant performance gains. The Sockets Direct Protocol (SDP) is a networking protocol developed to support stream connections over InfiniBand fabric. SDP support was introduced to Java Platform, Standard Edition ("Java SE Platform") in JDK7 for applications deployed in the Solaris Operating System ("Solaris OS"). The Solaris OS has supported SDP and InfiniBand since Solaris 10 5/08.

Overview
SDP support is essentially a TCP bypass technology.

When SDP is enabled and an application attempts to open a TCP connection, the TCP mechanism is bypassed and communication goes directly to the IB network. For example, when your application attempts to bind to a TCP address, the underlying software will decide, based on information in the configuration file, if it should be rebound to an SDP protocol. This process can happen during the binding process or the connecting process (but happens only once for each socket).

There are no API changes required in your code to take advantage of the SDP protocol: the implementation is transparent and is supported by the classic networking (java.net) and the New I/O (java.nio.channels) packages. SDP support is disabled by default. The steps to enable SDP support are:

1. Create an SDP configuration file.
2. Set the system property that specifies the location of the configuration file.

Creating an SDP Configuration File


An SDP configuration file is a text file, and you decide where on the file system this file will reside. Every line in the configuration file is either a comment or a rule. A comment is indicated by the hash character (#) at the beginning of the line, and everything following the hash character will be ignored.

There are two types of rules, as follows:

A "bind" rule indicates that the SDP protocol transport should be used when a TCP socket binds to an address and port that match the rule.
A "connect" rule indicates that the SDP protocol transport should be used when an unbound TCP socket attempts to connect to an address and port that match the rule.

A rule has the following form:
("bind"|"connect") 1*LWSP-char (hostname|ipaddress["/"prefix]) 1*LWSP-char ("*"|port)["-"("*"|port)]

Decoding the notation: 1*LWSP-char means that one or more linear whitespace characters (tabs or spaces) separate the tokens. The square brackets indicate optional text. The notation (xxx | yyy) indicates that the token will include either xxx or yyy, but not both. Quoted characters indicate literal text. The first keyword indicates whether the rule is a bind or a connect rule. The next token specifies either a host name or a literal IP address. When you specify a literal IP address, you can also specify a prefix, which indicates an IP address range. The third and final token is a port number or a range of port numbers. Consider the following notation in this sample configuration file:
# Use SDP when binding to 192.168.1.1
bind 192.168.1.1 *

# Use SDP when connecting to all application services on 192.168.1.*
connect 192.168.1.0/24 1024-*

# Use SDP when connecting to the http server or MySQL database on hpccluster
connect hpccluster.foo.com 80
connect hpccluster.foo.com 3306

The first rule in the sample file specifies that SDP is used for any port (*) on the local IP address 192.168.1.1. You would add a bind rule for each local address assigned to an InfiniBand adaptor. (An InfiniBand adaptor is the equivalent of a network interface card (NIC) for InfiniBand.) If you had several IB adaptors, you would use a bind rule for each address that is assigned to those adaptors.

The second rule in the sample file specifies that whenever connecting to 192.168.1.* and the target port is 1024 or greater, SDP is used. The prefix on the IP address /24 indicates that the first 24 bits of the 32-bit IP address should match the specified address. Each portion of the IP address uses 8 bits, so 24 bits indicates that the IP address should match 192.168.1 and the final byte can be any value. The -* notation on the port token specifies "and above." A range of ports, such as 1024-2056, would also be valid and would include the end points of the specified range.

The final rules in the sample file specify a host name (hpccluster.foo.com), first with the port assigned to an http server (80) and then with the port assigned to a MySQL database (3306). Unlike a literal IP address, a host name can translate into multiple addresses. When you specify a host name, it matches all addresses that the host name is registered to in the name service.
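The prefix and port-range semantics described above can be checked with a little bit arithmetic. The following is only a sketch of the matching logic as explained in the text, not the JDK's actual rule parser; the class SdpRuleSketch and its methods are hypothetical, it handles IPv4 literals only, and it assumes that "*" as an upper port bound can be represented as 65535.

```java
/** Hypothetical sketch of prefix and port-range matching for IPv4 rules. */
final class SdpRuleSketch {
    /** Packs a dotted-quad IPv4 literal such as 192.168.1.0 into a 32-bit int. */
    static int toInt(String dottedQuad) {
        String[] parts = dottedQuad.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not an IPv4 literal: " + dottedQuad);
        }
        int value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("octet out of range: " + part);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    /** True if the first prefixLength bits of candidate equal those of ruleAddress. */
    static boolean addressMatches(String ruleAddress, int prefixLength, String candidate) {
        // Java shifts by (count mod 32), so a prefix of 0 needs its own case.
        int mask = (prefixLength == 0) ? 0 : -1 << (32 - prefixLength);
        return (toInt(ruleAddress) & mask) == (toInt(candidate) & mask);
    }

    /** True if port lies in the inclusive range [low, high]. */
    static boolean portMatches(int low, int high, int port) {
        return port >= low && port <= high;
    }

    public static void main(String[] args) {
        // Models the sample rule: connect 192.168.1.0/24 1024-*
        System.out.println(addressMatches("192.168.1.0", 24, "192.168.1.77")); // prints "true"
        System.out.println(addressMatches("192.168.1.0", 24, "192.168.2.77")); // prints "false"
        System.out.println(portMatches(1024, 65535, 8080));                    // prints "true"
        System.out.println(portMatches(1024, 65535, 80));                      // prints "false"
    }
}
```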

Enabling the SDP Protocol


SDP support is disabled by default. To enable SDP support, set the com.sun.sdp.conf system property by providing the location of the configuration file. The following example starts an application using a configuration file named sdp.conf:
% java -Dcom.sun.sdp.conf=sdp.conf -Djava.net.preferIPv4Stack=true MyApplication

MyApplication refers to the client application that is attempting to connect to the IB adaptor.

Note that this example specifies another system property, java.net.preferIPv4Stack. See the Issues section for more information about why this property is used.

Technical Issues with SDP


IPv4 and IPv6 incompatibility
Internet Protocol version 4 (IPv4) has long been the industry standard version of the Internet Protocol (IP) for delivering data over the Internet. Internet Protocol version 6 (IPv6) is the next generation Internet layer protocol. Both versions of IP are in use today. IPv4 addresses are 32 bits long, written in decimal format, and separated by periods. IPv6 addresses are 128 bits long, written in hexadecimal format, and separated by colons. IPv4 addresses cannot be used as is in IPv6, but IPv6 does support a special class of addresses: the IPv4-mapped address. In an IPv4-mapped address, the first 80 bits are set to zero, the next 16 bits are set to 1, and the last 32 bits represent the IPv4 address. For example, here is the same IP address expressed in both formats:
IPv4 address: 192.168.0.1
IPv4-mapped address (for use in IPv6): ::ffff:192.168.0.1
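The bit layout of an IPv4-mapped address can be reproduced directly. In this sketch the class Ipv4MappedSketch and its methods are hypothetical helpers written for illustration; they build the 16-byte form from a 4-byte IPv4 address and print it in the mixed notation used above.

```java
/** Hypothetical sketch that builds the IPv4-mapped form of an IPv4 address. */
final class Ipv4MappedSketch {
    /** Returns the 16-byte IPv4-mapped IPv6 form of a 4-byte IPv4 address. */
    static byte[] toIpv4Mapped(byte[] ipv4) {
        if (ipv4.length != 4) {
            throw new IllegalArgumentException("expected 4 bytes, got " + ipv4.length);
        }
        byte[] mapped = new byte[16];             // bytes 0-9 stay zero: the first 80 bits
        mapped[10] = (byte) 0xff;                 // the next 16 bits are all set to 1
        mapped[11] = (byte) 0xff;
        System.arraycopy(ipv4, 0, mapped, 12, 4); // the last 32 bits hold the IPv4 address
        return mapped;
    }

    /** Renders a mapped address in the mixed notation ::ffff:a.b.c.d */
    static String format(byte[] mapped) {
        return String.format("::ffff:%d.%d.%d.%d",
                mapped[12] & 0xff, mapped[13] & 0xff, mapped[14] & 0xff, mapped[15] & 0xff);
    }

    public static void main(String[] args) {
        byte[] ipv4 = {(byte) 192, (byte) 168, 0, 1};
        System.out.println(format(toIpv4Mapped(ipv4))); // prints "::ffff:192.168.0.1"
    }
}
```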

By default, if IPv6 is enabled on any of the IB adaptors, the Java platform uses IPv6. However, IPv4-mapped addresses are not currently available in the Solaris OS (see RFE #6622184). For this reason, if you want to use the IPv4 address format, you must specify the java.net.preferIPv4Stack property, as shown in this example:
% java -Dcom.sun.sdp.conf=sdp.conf -Djava.net.preferIPv4Stack=true MyApplication

Bugs

A few bugs were found in the early InfiniBand implementation. These bugs are fixed in the Solaris 10 10/09 release. Make sure that you are using at least this release.
