
COMPUTER NETWORK MI 0035 Set 2

Name: Ihas Mohammed
Roll number: 521145658
Learning centre: 01524
Subject: MI 0035 - COMPUTER NETWORK
Assignment No.: Set 2
Date of submission at learning centre:


ASSIGNMENTS
Subject code: MI 0035 (4 credits)
Set 2, Marks 60
SUBJECT NAME: COMPUTER NETWORKS
Note: Each question carries 10 marks.

Q.1. Write down the features of Fast Ethernet and Gigabit Ethernet.

ANS:
Fast Ethernet Technology

Fast Ethernet, or 100BaseT, is conventional Ethernet but faster, operating at 100 Mbps instead of 10 Mbps. Fast Ethernet is based on the proven CSMA/CD Media Access Control (MAC) protocol and can use existing 10BaseT cabling (see the Appendix for a pinout diagram and table). Data can move from 10 Mbps to 100 Mbps without protocol translation or changes to application and networking software.

Data-Link Layer

Fast Ethernet maintains CSMA/CD, the Ethernet transmission protocol. However, Fast Ethernet reduces the duration of time each bit is transmitted by a factor of 10, enabling the packet speed to increase tenfold from 10 Mbps to 100 Mbps. Data can move between Ethernet and Fast Ethernet without requiring protocol translation, because Fast Ethernet also maintains the 10BaseT error control functions as well as the frame format and length. Other high-speed technologies, such as 100VG-AnyLAN, FDDI, and Asynchronous Transfer Mode (ATM), achieve 100 Mbps or higher speeds by implementing different protocols that require protocol translation when moving data to and from 10BaseT. This protocol translation involves changes to the frame that typically mean higher latencies when frames are passed through layer 2 LAN switches.

Physical Layer Media Options


Fast Ethernet can run over the same variety of media as 10BaseT, including UTP, shielded twisted-pair (STP), and fiber. The Fast Ethernet specification defines separate physical sublayers for each media type:

100BaseT4 for four pairs of voice- or data-grade Category 3, 4, and 5 UTP wiring
100BaseTX for two pairs of data-grade Category 5 UTP and STP wiring
100BaseFX for two strands of 62.5/125-micron multimode fiber

In many cases, organizations can upgrade to 100BaseT technology without replacing existing wiring. However, for installations with Category 3 UTP wiring in all or part of their locations, four pairs must be available to implement Fast Ethernet. The MII layer of 100BaseT couples these physical sublayers to the CSMA/CD MAC layer (see Figure 1). The MII provides a single interface that can support external transceivers for any of the 100BaseT physical sublayers. For the physical connection, the MII is implemented on Fast Ethernet devices such as routers, switches, hubs, and adapters, and on transceiver devices using a 40-pin connector (see the Appendix for pinout and connector diagrams). Cisco Systems contributed to the MII specification.

Physical Layer Signaling Schemes

Each physical sublayer uses a signaling scheme appropriate to its media type. 100BaseT4 uses three pairs of wire for 100-Mbps transmission and the fourth pair for collision detection. This lowers the 100BaseT4 signaling rate to 33 Mbps per pair, making it suitable for Category 3, 4, and 5 wiring. 100BaseTX uses one pair of wires for transmission (a 125-MHz signal operating at 80-percent efficiency to allow for 4B5B encoding, i.e. 125 Mbaud × 4/5 = 100 Mbps) and the other pair for collision detection and receive. 100BaseFX uses one fiber for transmission and the other fiber for collision detection and receive. The 100BaseTX and 100BaseFX physical signaling channels are based on FDDI physical layers developed and approved by the American National Standards Institute (ANSI) X3T9.5 committee. 100BaseTX uses the MLT-3 line encoding scheme, which Cisco developed and contributed to the ANSI committee as the specification for FDDI over Category 5 UTP. Today MLT-3 is also used as the signaling scheme for ATM over Category 5 UTP.

Gigabit Ethernet:


Gigabit Ethernet is a 1-gigabit/sec (1,000-Mbit/sec) extension of the IEEE 802.3 Ethernet networking standard. Its primary niches are corporate LANs, campus networks, and service provider networks, where it can be used to tie together existing 10-Mbit/sec and 100-Mbit/sec Ethernet networks. Gigabit Ethernet can replace 100-Mbit/sec FDDI (Fiber Distributed Data Interface) and Fast Ethernet backbones, and it competes with ATM (Asynchronous Transfer Mode) as a core networking technology. Many ISPs use Gigabit Ethernet in their data centers.

Gigabit Ethernet provides an ideal upgrade path for existing Ethernet-based networks. It can be installed as a backbone network while retaining the existing investment in Ethernet hubs, switches, and wiring plants. In addition, management tools can be retained, although network analyzers will require updates to handle the higher speed.

Gigabit Ethernet provides an alternative to ATM as a high-speed networking technology. While ATM has built-in QoS (quality of service) to support real-time network traffic, Gigabit Ethernet may be able to provide a high level of service quality simply by providing more bandwidth than is needed.

This topic continues in "The Encyclopedia of Networking and Telecommunications" with a discussion of the following:

Gigabit Ethernet features and specifications
Gigabit Ethernet modes and functional elements
Gigabit Ethernet committees and specifications, including:
    1000Base-LX (IEEE 802.3z)
    1000Base-SX (IEEE 802.3z)
    1000Base-CX (IEEE 802.3z)
    1000Base-T (IEEE 802.3ab)
    10-Gigabit Ethernet (IEEE 802.3ae)
Gigabit Ethernet switches
Network configuration and design
Flat network or subnets
Gigabit Ethernet backbones
Switch-to-server links


Gigabit Ethernet to the desktop
Switch-to-switch links
Gigabit Ethernet versus ATM
Hybrid Gigabit Ethernet/ATM core network

10-Gigabit Ethernet

As if 1 Gbit/sec wasn't enough, the IEEE is working to define 10-Gigabit Ethernet (sometimes called "10 GE"). The new standard is being developed by the IEEE 802.3ae Working Group. Service providers will be the first to take advantage of this standard; it is being deployed in emerging metro-Ethernet networks. See "MAN (Metropolitan Area Network)" and "Network Access Services."

As with 1-Gigabit Ethernet, 10-Gigabit Ethernet will preserve the 802.3 Ethernet frame format, as well as minimum and maximum frame sizes. It will support full-duplex operation only. The topology is star-wired LANs that use point-to-point links and structured cabling topologies. 802.3ad link aggregation will also be supported. The new standard will support new multimedia applications, distributed processing, imaging, medical, and CAD/CAM applications, and a variety of others, many of which cannot even be perceived today. Most certainly it will be used in service provider data centers and as part of metropolitan area networks. The technology will also be useful in the SAN (Storage Area Network) environment.

Refer to the following Web site for more information: 10 GEA (10 Gigabit Ethernet Alliance), http://www.10gea.org/Tech-whitepapers.htm

Q.2. Differentiate the working between pure ALOHA and slotted ALOHA.

ANS:
ALOHA: ALOHA is a computer networking system that was introduced in the early 1970s by Norman Abramson and his colleagues at the University of Hawaii to solve the channel allocation problem. On the basis of global time synchronization, ALOHA is divided into two different versions or protocols, i.e. Pure ALOHA and Slotted ALOHA.

Pure ALOHA:


Pure ALOHA does not require global time synchronization. The basic idea of the pure ALOHA system is that it allows users to transmit whenever they have data. A sender, just like any other user, can listen to what it is transmitting, and thanks to this feedback the broadcasting system is able to detect a collision, if any occurs. If a collision is detected, the sender waits a random period of time and attempts the transmission again. The waiting time must be random, or the same frames will collide and be destroyed over and over. Systems in which multiple users share a common channel in a way that can lead to conflicts are widely known as contention systems.

Efficiency of Pure ALOHA:
Let T be the time needed to transmit one frame on the channel, and define a "frame-time" as a unit of time equal to T. Let G refer to the mean of the Poisson distribution over transmission attempts; that is, on average there are G transmission attempts per frame-time. Let t be the time at which the sender wants to send a frame. We want to use the channel for one frame-time beginning at t, so we need all other stations to refrain from transmitting during this time. Moreover, we need the other stations to refrain from transmitting between t - T and t as well, because a frame sent during this interval would overlap with our frame.

Efficiency of ALOHA:

The vulnerable period for a frame is 2T, where T is the frame time: a frame will not collide only if no other frame is sent within one frame-time of its start, before or after. For any frame-time, the probability of there being k transmission attempts during that frame-time is

    P(k) = (G^k e^(-G)) / k!

If throughput (the number of successful frames per frame-time) is represented by S, then under any load S = G * P0, where P0 is the probability that the frame does not suffer a collision. A frame suffers no collision if no other frames are sent during its vulnerable period. In one frame-time T, P0 = e^(-G); in the 2T vulnerable period, P0 = e^(-2G), as the mean number of frames generated in 2T is 2G. The pure ALOHA throughput is therefore

    S = G * P0 = G * e^(-2G)

which peaks at S = 1/(2e), roughly 0.184, when G = 0.5.

Slotted ALOHA Channel:
Slotted ALOHA does require global time synchronization.

Efficiency of the Slotted ALOHA Channel:
Assume that a sending station has to wait until the beginning of the next time slot (one time slot equals one frame-time) and that arrivals still follow a Poisson distribution and are assumed probabilistically independent. In this case the vulnerable period is just T time units, so the probability that k frames are generated in a frame-time is again

    P(k) = (G^k e^(-G)) / k!

and the probability of zero competing frames in that single frame-time is P0 = e^(-G). The throughput therefore becomes

    S = G * P0 = G * e^(-G)

which peaks at S = 1/e, roughly 0.368, when G = 1.
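As a quick numerical check of these two formulas, here is a minimal Python sketch (the function names are just illustrative):

    import math

    def pure_aloha_throughput(G):     # S = G e^(-2G)
        return G * math.exp(-2 * G)

    def slotted_aloha_throughput(G):  # S = G e^(-G)
        return G * math.exp(-G)

    # Pure ALOHA peaks at G = 0.5, slotted ALOHA at G = 1:
    print(pure_aloha_throughput(0.5))   # ~0.184, i.e. 18.4% utilization
    print(slotted_aloha_throughput(1))  # ~0.368, i.e. 36.8% utilization

Halving the vulnerable period is what doubles the achievable peak throughput of slotted ALOHA over pure ALOHA.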


Comparison of Pure ALOHA and Slotted ALOHA:

[Figure: Pure ALOHA vs Slotted ALOHA: throughput versus offered traffic for the two systems, i.e. a plot of S against G from the formulas S = G e^(-2G) and S = G e^(-G).]

CSMA:
CSMA is a set of rules by which the devices attached to a network first determine whether the channel (the carrier) is in use or free and then act accordingly. Because in this MAC protocol the network devices or nodes sense the channel before transmitting, the protocol is known as carrier sense multiple access. "Multiple access" indicates that many devices can connect to and share the same network, and if a node transmits anything, it is heard by all the stations on the network.

Q.3. Write down the distance vector algorithm. Explain the path vector protocol.

ANS:
Distance Vector Routing algorithm (a small code sketch of this update rule appears at the end of this answer):
1) For each node, estimate the cost from itself to each destination.
2) For each node, send the cost information to its neighbors.
3) On receiving cost information from a neighbor, update the routing table accordingly.
4) Repeat steps 1 to 3 periodically.

A path vector protocol is a computer network routing protocol which maintains path information that is updated dynamically. Updates that have looped through the network and returned to the same node are easily detected and discarded. This approach is sometimes used with Bellman-Ford routing algorithms to avoid "count to infinity" problems. It is different from distance vector routing and link state routing. Each entry in the routing table contains the destination network, the next router, and the path to reach the destination.

Path Vector Messages in BGP:
The autonomous system boundary routers (ASBRs) that participate in path vector routing advertise the reachability of networks. Each router that receives a path vector message must verify that the advertised path is according to its policy. If the messages comply


with the policy, the ASBR modifies its routing table and the message before sending it to the next neighbor. In the modified message it sends its own AS number and replaces the next router entry with its own identification. BGP is an example of a path vector protocol. In BGP the routing table maintains the autonomous systems that are traversed in order to reach the destination system. Exterior Gateway Protocol (EGP) does not use path vectors.
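The distance vector update in steps 1 to 4 above is essentially the Bellman-Ford relaxation step. The following is a minimal sketch of that update; the table layout and function name are illustrative, not taken from any particular router implementation:

    # Each node keeps {destination: (cost, next_hop)} and merges the
    # distance vectors advertised by its neighbors.
    INF = float("inf")

    def update_table(table, neighbor, link_cost, neighbor_vector):
        # table:           this node's {dest: (cost, next_hop)}
        # neighbor:        name of the advertising neighbor
        # link_cost:       cost of the direct link to that neighbor
        # neighbor_vector: {dest: cost} as advertised by the neighbor
        # Returns True if any route improved (so we re-advertise).
        changed = False
        for dest, advertised in neighbor_vector.items():
            candidate = link_cost + advertised
            if candidate < table.get(dest, (INF, None))[0]:
                table[dest] = (candidate, neighbor)
                changed = True
        return changed

    # Example: A has a cost-1 link to B, and B advertises C at cost 2,
    # so A records C at cost 3 via B.
    table_a = {"B": (1, "B")}
    update_table(table_a, "B", 1, {"B": 0, "C": 2})
    print(table_a)   # {'B': (1, 'B'), 'C': (3, 'B')}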

Q.4. State the working principle of the TCP segment header and the UDP header.

ANS:
TCP segments are sent as internet datagrams. The Internet Protocol header carries several information fields, including the source and destination host addresses [2]. A TCP header follows the internet header, supplying information specific to the TCP protocol. This division allows for the existence of host-level protocols other than TCP.

TCP Header Format

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |          Source Port          |       Destination Port        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                        Sequence Number                        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                    Acknowledgment Number                      |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |  Data |           |U|A|P|R|S|F|                               |
    | Offset| Reserved  |R|C|S|S|Y|I|            Window             |
    |       |           |G|K|H|T|N|N|                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |           Checksum            |         Urgent Pointer        |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                    Options                    |    Padding    |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                             data                              |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Source Port: 16 bits
The source port number.

Destination Port: 16 bits
The destination port number.

Sequence Number: 32 bits
The sequence number of the first data octet in this segment (except when SYN is present). If SYN is present, the sequence number is the initial sequence number (ISN) and the first data octet is ISN+1.

Acknowledgment Number: 32 bits
If the ACK control bit is set, this field contains the value of the next sequence number the sender of the segment is expecting to receive. Once a connection is established this is always sent.

Data Offset: 4 bits


The number of 32-bit words in the TCP header. This indicates where the data begins. The TCP header (even one including options) is an integral number of 32 bits long.

Reserved: 6 bits
Reserved for future use. Must be zero.

Control Bits: 6 bits (from left to right):
URG: Urgent Pointer field significant
ACK: Acknowledgment field significant
PSH: Push Function
RST: Reset the connection
SYN: Synchronize sequence numbers
FIN: No more data from sender

Window: 16 bits
The number of data octets, beginning with the one indicated in the acknowledgment field, which the sender of this segment is willing to accept.

Checksum: 16 bits
The checksum field is the 16-bit one's complement of the one's complement sum of all 16-bit words in the header and text. If a segment contains an odd number of header and text octets to be checksummed, the last octet is padded on the right with zeros to form a 16-bit word for checksum purposes. The pad is not transmitted as part of the segment. While computing the checksum, the checksum field itself is replaced with zeros.
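A minimal sketch of that one's complement checksum (the folding approach described in RFC 1071; the function name is illustrative):

    def internet_checksum(data: bytes) -> int:
        # 16-bit one's complement of the one's complement sum of the data.
        if len(data) % 2:
            data += b"\x00"                 # pad an odd-length input with zeros
        total = 0
        for i in range(0, len(data), 2):
            total += (data[i] << 8) | data[i + 1]
            total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
        return ~total & 0xFFFF

    # For TCP, the input would be the pseudo header described next, plus
    # the TCP header (with the checksum field zeroed) and the payload.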

The checksum also covers a 96 bit pseudo header conceptually prefixed to the TCP header. This pseudo header contains the Source Address, the Destination Address, the Protocol, and TCP length. This gives the TCP protection against misrouted segments. This information is carried in the Internet Protocol and is transferred across the TCP/Network interface in the arguments or results of calls by the TCP on the IP.


                 +--------+--------+--------+--------+
                 |           Source Address          |
                 +--------+--------+--------+--------+
                 |         Destination Address       |
                 +--------+--------+--------+--------+
                 |  zero  |  PTCL  |    TCP Length   |
                 +--------+--------+--------+--------+

The TCP Length is the TCP header length plus the data length in octets (this is not an explicitly transmitted quantity, but is computed), and it does not count the 12 octets of the pseudo header.

Urgent Pointer: 16 bits
This field communicates the current value of the urgent pointer as a positive offset from the sequence number in this segment. The urgent pointer points to the sequence number of the octet following the urgent data. This field is only interpreted in segments with the URG control bit set.

Options: variable
Options may occupy space at the end of the TCP header and are a multiple of 8 bits in length. All options are included in the checksum. An option may begin on any octet boundary. There are two cases for the format of an option:

Case 1: A single octet of option-kind.
Case 2: An octet of option-kind, an octet of option-length, and the actual option-data octets. The option-length counts the two octets of option-kind and option-length as well as the option-data octets.


Note that the list of options may be shorter than the data offset field might imply. The content of the header beyond the End-of-Option option must be header padding (i.e., zero). A TCP must implement all options. Currently defined options include (kind indicated in octal):

      Kind     Length     Meaning
      ----     ------     -------
       0          -       End of option list.
       1          -       No-Operation.
       2          4       Maximum Segment Size.

Specific Option Definitions

End of Option List

      +--------+
      |00000000|
      +--------+
       Kind=0

This option code indicates the end of the option list. This might not coincide with the end of the TCP header according to the Data Offset field. This is used at the end of all options, not the end of each option, and need only be used if the end of the options would not otherwise coincide with the end of the TCP header.

No-Operation

      +--------+
      |00000001|
      +--------+
       Kind=1


This option code may be used between options, for example, to align the beginning of a subsequent option on a word boundary. There is no guarantee that senders will use this option, so receivers must be prepared to process options even if they do not begin on a word boundary.

Maximum Segment Size

      +--------+--------+---------+--------+
      |00000010|00000100|   max seg size   |
      +--------+--------+---------+--------+
       Kind=2   Length=4

Maximum Segment Size Option Data: 16 bits
If this option is present, then it communicates the maximum receive segment size at the TCP which sends this segment. This field must only be sent in the initial connection request (i.e., in segments with the SYN control bit set). If this option is not used, any segment size is allowed.

Padding: variable
The TCP header padding is used to ensure that the TCP header ends and data begins on a 32-bit boundary. The padding is composed of zeros.
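As an illustration of the two option formats above, here is a minimal sketch of an options-block parser (the function name and return shape are illustrative):

    def parse_tcp_options(block: bytes):
        # Returns a list of (kind, option-data) pairs.
        options, i = [], 0
        while i < len(block):
            kind = block[i]
            if kind == 0:                  # End of Option List: stop
                break
            if kind == 1:                  # No-Operation: a single octet
                options.append((1, b""))
                i += 1
                continue
            length = block[i + 1]          # counts the kind and length octets
            options.append((kind, block[i + 2:i + length]))
            i += length
        return options

    # Example: an MSS option (kind=2, length=4, MSS=1460), a No-Operation,
    # then End of Option List:
    print(parse_tcp_options(bytes([2, 4, 5, 180, 1, 0])))
    # [(2, b'\x05\xb4'), (1, b'')]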


The User Datagram Protocol (UDP)

The User Datagram Protocol (UDP) is a transport layer protocol defined for use with the IP network layer protocol. It is defined by RFC 768, written by Jon Postel. It provides a best-effort datagram service to an End System (IP host). The service provided by UDP is an unreliable service that provides no guarantees for delivery and no protection from duplication (e.g. if this arises due to software errors within an Intermediate System (IS)). The simplicity of UDP reduces the overhead from using the protocol, and the services may be adequate in many cases.

UDP provides a minimal, unreliable, best-effort, message-passing transport to applications and upper-layer protocols. Compared to other transport protocols, UDP and its UDP-Lite variant are unique in that they do not establish end-to-end connections between communicating end systems. UDP communication consequently does not incur connection establishment and teardown overheads, and there is minimal associated end system state. Because of these characteristics, UDP can offer a very efficient communication transport to some applications. A second unique characteristic of UDP is that it provides no inherent congestion control or reliability: on many platforms, applications can send UDP datagrams at the line rate of the link interface, which is often much greater than the available path capacity, and doing so would contribute to congestion along the path. Applications therefore need to be designed responsibly [RFC 4505].

One increasingly popular use of UDP is as a tunneling protocol, where a tunnel endpoint encapsulates the packets of another protocol inside UDP datagrams and transmits them to another tunnel endpoint, which decapsulates the UDP datagrams and forwards the original packets contained in the payload. Tunnels establish virtual links that appear to directly connect locations that are distant in the physical Internet topology and can be used to create virtual (private) networks. Using UDP as a tunneling protocol is attractive when the payload protocol is not supported by middleboxes that may exist along the path, because many middleboxes support UDP transmissions.

UDP does not provide any communications security. Applications that need to protect their communications against eavesdropping, tampering, or message forgery therefore need to separately provide security services using additional protocol mechanisms.

Protocol Header

A computer may send UDP packets without first establishing a connection to the recipient. A UDP datagram is carried in a single IP packet and is hence limited to a maximum payload of 65,507 bytes for IPv4 and 65,527 bytes for IPv6. The transmission of large IP packets usually requires IP fragmentation. Fragmentation decreases communication reliability and efficiency and should therefore be avoided. To transmit a UDP datagram, a computer completes the appropriate fields in the UDP header (PCI) and forwards the data together with the header for transmission by the IP network layer. The UDP protocol header consists of 8 bytes of Protocol Control Information (PCI). The UDP header consists of four fields, each 2 bytes in length:


Source Port: UDP packets from a client use this as a service access point (SAP) to indicate the session on the local client that originated the packet. UDP packets from a server carry the server SAP in this field.

Destination Port: UDP packets from a client use this as a service access point (SAP) to indicate the service required from the remote server. UDP packets from a server carry the client SAP in this field.

UDP Length: The number of bytes comprising the combined UDP header information and payload data.

UDP Checksum: A checksum to verify that the end-to-end data has not been corrupted by routers or bridges in the network or by the processing in an end system. The algorithm to compute the checksum is the standard Internet checksum algorithm. This allows the receiver to verify that it was the intended destination of the packet, because it covers the IP addresses, port numbers, and protocol number, and it verifies that the packet is not truncated or padded, because it covers the size field. Therefore, this protects an application against receiving corrupted payload data in place of, or in addition to, the data that was sent. In cases where this check is not required, the value 0x0000 is placed in this field, in which case the data is not checked by the receiver.

As with other transport protocols, the UDP header and data are not processed by Intermediate Systems (IS) in the network and are delivered to the final destination in the same form as originally transmitted. At the final destination, the UDP protocol layer receives packets from the IP network layer. These are checked using the checksum (when > 0, this checks correct end-to-end operation of the network service), and all invalid PDUs are discarded. UDP does not make any provision for error reporting if the packets are not delivered. Valid data are passed to the appropriate session layer protocol identified by the source and destination port numbers (i.e. the session service access points). UDP and UDP-Lite may also be used for multicast and broadcast, allowing senders to transmit to multiple receivers.
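A minimal sketch of packing these four 16-bit fields (the function name is illustrative; the checksum is left as 0x0000, which for IPv4 means "not computed", as noted above):

    import struct

    def build_udp_header(src_port: int, dst_port: int, payload: bytes) -> bytes:
        length = 8 + len(payload)   # the 8-byte header plus the payload
        checksum = 0                # 0x0000: receiver skips the check
        # Four 16-bit fields in network (big-endian) byte order:
        return struct.pack("!HHHH", src_port, dst_port, length, checksum)

    header = build_udp_header(5000, 53, b"query")
    print(header.hex())   # '13880035000d0000'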


Using UDP

Application designers are generally aware that UDP does not provide any reliability; e.g., it does not retransmit any lost packets. Often, this is a main reason to consider UDP as a transport. Applications that do require reliable message delivery therefore need to implement appropriate protocol mechanisms in their applications (e.g. TFTP). UDP's best-effort service does not protect against datagram duplication, i.e., an application may receive multiple copies of the same UDP datagram. Application designers therefore need to verify that their application gracefully handles datagram duplication, and may need to implement mechanisms to detect duplicates. The Internet may also significantly delay some packets with respect to others, e.g., due to routing transients, intermittent connectivity, or mobility. This can cause reordering, where UDP datagrams arrive at the receiver in an order different from the transmission order. Applications that require ordered delivery must restore datagram ordering themselves. The burden of needing to code all these protocol mechanisms can be avoided by using TCP!

Q.5. What is IP addressing? Discuss the different classes of IP addressing.

ANS:
An IP address is an identifier for a computer or device on a TCP/IP network. Networks using the TCP/IP protocol route messages based on the IP address of the destination. The format of an IP address is a 32-bit numeric address written as four numbers separated by periods. Each number can be zero to 255. For example, 1.160.10.240 could be an IP address. Within an isolated network, you can assign IP addresses at random as long as each one is unique. However, connecting a private network to the Internet requires using registered IP addresses (called Internet addresses) to avoid duplicates. The four numbers in an IP address are used in different ways to identify a particular network and a host on that network. Four regional Internet registries -- ARIN, RIPE NCC, LACNIC and APNIC -- assign Internet addresses from the following three classes:

Class A - supports 16 million hosts on each of 126 networks
Class B - supports 65,000 hosts on each of 16,000 networks
Class C - supports 254 hosts on each of 2 million networks

The number of unassigned Internet addresses is running out, so a new classless scheme called CIDR is gradually replacing the system based on classes A, B, and C, and is tied to the adoption of IPv6.

IP address classes
These IP addresses can further be broken down into classes: A, B, C, D, and E. Their possible ranges can be seen in Figure 2 below.


    Class     Start address    Finish address
    A         0.0.0.0          126.255.255.255
    B         128.0.0.0        191.255.255.255
    C         192.0.0.0        223.255.255.255
    D         224.0.0.0        239.255.255.255
    E         240.0.0.0        255.255.255.255

Figure 2. IP address classes

If you look at the table you may notice something strange: the range of IP addresses from Class A to Class B skips the 127.0.0.0 - 127.255.255.255 range. That is because this range is reserved for the special addresses called loopback addresses, which are discussed below. The rest of the classes are allocated to companies and organizations based upon the number of IP addresses that they may need. Listed below are descriptions of the IP classes and the organizations that will typically receive that type of allocation.

Default Network: The special network 0.0.0.0 is generally used for routing.

Class A: From the table above you see that there are 126 Class A networks. These networks each consist of 16,777,214 possible IP addresses that can be assigned to devices and computers. This type of allocation is generally given to very large networks, such as multi-national companies.

Loopback: This is the special 127.0.0.0 network that is reserved as a loopback to your own computer. These addresses are used for testing and debugging of your programs or hardware.

Class B: This class consists of 16,384 individual networks, each allocation consisting of 65,534 possible IP addresses. These blocks are generally allocated to Internet Service Providers and large networks, like a college or major hospital.

Class C: There are a total of 2,097,152 Class C networks available, each network consisting of 254 usable IP addresses. This type of class is generally given to small to mid-sized companies.

Class D: The IP addresses in this class are reserved for a service called multicast.

Class E: The IP addresses in this class are reserved for experimental use.
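As a quick illustration, the class of a (classful) IPv4 address can be read directly off its first octet; a minimal sketch (the function name is illustrative):

    def ip_class(addr: str) -> str:
        # Classify an IPv4 address by its first octet (classful rules).
        first = int(addr.split(".")[0])
        if first == 127:
            return "Loopback"
        if first <= 126:
            return "A"
        if first <= 191:
            return "B"
        if first <= 223:
            return "C"
        if first <= 239:
            return "D (multicast)"
        return "E (experimental)"

    print(ip_class("10.1.2.3"))       # A
    print(ip_class("172.20.0.5"))     # B
    print(ip_class("192.168.1.10"))   # C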


Broadcast: This is the special address 255.255.255.255, used for broadcasting messages to the entire network that your computer resides on.

Private Addresses
There are also blocks of IP addresses that are set aside for internal private use, for computers not directly connected to the Internet. These IP addresses are not supposed to be routed through the Internet, and most service providers will block the attempt to do so. These IP addresses are used internally by company or home networks that need to use TCP/IP but do not want to be directly visible on the Internet. These IP ranges are:

    Class     Private Start Address    Private End Address
    A         10.0.0.0                 10.255.255.255
    B         172.16.0.0               172.31.255.255
    C         192.168.0.0              192.168.255.255

If you are on a home/office private network and want to use TCP/IP, you should assign your computers/devices IP addresses from one of these three ranges. That way your router/firewall is the only device with a true Internet IP address, which makes your network more secure.
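Python's standard ipaddress module already knows these reserved blocks, which gives a quick way to check an address:

    import ipaddress

    print(ipaddress.ip_address("192.168.1.10").is_private)   # True
    print(ipaddress.ip_address("172.31.0.9").is_private)     # True
    print(ipaddress.ip_address("8.8.8.8").is_private)        # False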

Common Problems and Resolutions

The most common problem people have is accidentally assigning an IP address to a device on the network that is already assigned to another device. When this happens, the other computers will not know which device should get the information, and you can experience erratic behavior. On most operating systems and devices, if two devices on the local network have the same IP address, you will generally get an "IP conflict" warning. If you see this warning, it means that the device giving the warning has detected another device on the network using the same address. The best way to avoid a problem like this is to use a service called DHCP, which almost all home routers provide. DHCP, or Dynamic Host Configuration Protocol, is a service that assigns addresses to devices and computers. You tell the DHCP server what range of IP addresses you would like it to assign, and then the DHCP server takes the responsibility


of assigning those IP addresses to the various devices and keeping track so those IP addresses are assigned only once.

Q.6. Define cryptography. Discuss two cryptography techniques.

ANS:
Cryptography is the science of information security. The word is derived from the Greek kryptos, meaning hidden. Cryptography is closely related to the disciplines of cryptology and cryptanalysis. Cryptography includes techniques such as microdots, merging words with images, and other ways to hide information in storage or transit. However, in today's computer-centric world, cryptography is most often associated with scrambling plaintext (ordinary text, sometimes referred to as cleartext) into ciphertext (a process called encryption), then back again (known as decryption). Individuals who practice this field are known as cryptographers.

Modern cryptography concerns itself with the following four objectives:
1) Confidentiality (the information cannot be understood by anyone for whom it was unintended)
2) Integrity (the information cannot be altered in storage or transit between sender and intended receiver without the alteration being detected)
3) Non-repudiation (the creator/sender of the information cannot deny at a later stage his or her intentions in the creation or transmission of the information)
4) Authentication (the sender and receiver can confirm each other's identity and the origin/destination of the information)

3. TYPES OF CRYPTOGRAPHIC ALGORITHMS

There are several ways of classifying cryptographic algorithms. For the purposes of this discussion, they will be categorized based on the number of keys that are employed for encryption and decryption, and further defined by their application and use. The three types of algorithms that will be discussed are (Figure 1):

Secret Key Cryptography (SKC): Uses a single key for both encryption and decryption.
Public Key Cryptography (PKC): Uses one key for encryption and another for decryption.


Hash Functions: Uses a mathematical transformation to irreversibly "encrypt" information.

[FIGURE 1: Three types of cryptography: secret-key, public-key, and hash function.]

3.1. Secret Key Cryptography

With secret key cryptography, a single key is used for both encryption and decryption. As shown in Figure 1A, the sender uses the key (or some set of rules) to encrypt the plaintext and sends the ciphertext to the receiver. The receiver applies the same key (or ruleset) to decrypt the message and recover the plaintext. Because a single key is used for both functions, secret key cryptography is also called symmetric encryption. With this form of cryptography, it is obvious that the key must be known to both the sender and the receiver; that, in fact, is the secret. The biggest difficulty with this approach, of course, is the distribution of the key.

Secret key cryptography schemes are generally categorized as being either stream ciphers or block ciphers. Stream ciphers operate on a single bit (byte or computer word) at a time and implement some form of feedback mechanism so that the key is constantly changing. A block cipher is so called because the scheme encrypts one block of data at a time using the same key on each block. In general, the same plaintext block will always encrypt to the same ciphertext when using the same key in a block cipher, whereas the same plaintext will encrypt to different ciphertext in a stream cipher.

Stream ciphers come in several flavors, but two are worth mentioning here. Self-synchronizing stream ciphers calculate each bit in the keystream as a function of the previous n bits in the keystream. They are termed "self-synchronizing" because the decryption process can stay synchronized with the encryption process merely by knowing how far into the n-bit keystream it is. One problem is error propagation; a garbled bit in transmission will result in n garbled bits at the receiving side. Synchronous stream ciphers generate the keystream in a fashion independent of the message stream but by using the same keystream generation function at sender and receiver. While stream ciphers do not propagate transmission errors, they are, by their nature, periodic, so the keystream will eventually repeat.

Block ciphers can operate in one of several modes; the following four are the most important:


Electronic Codebook (ECB) mode is the simplest, most obvious application: the secret key is used to encrypt the plaintext block to form a ciphertext block. Two identical plaintext blocks, then, will always generate the same ciphertext block. Although this is the most common mode of block ciphers, it is susceptible to a variety of brute-force attacks.

Cipher Block Chaining (CBC) mode adds a feedback mechanism to the encryption scheme. In CBC, the plaintext is exclusive-ORed (XORed) with the previous ciphertext block prior to encryption. In this mode, two identical blocks of plaintext never encrypt to the same ciphertext.

Cipher Feedback (CFB) mode is a block cipher implementation as a self-synchronizing stream cipher. CFB mode allows data to be encrypted in units smaller than the block size, which might be useful in some applications such as encrypting interactive terminal input. If we were using 1-byte CFB mode, for example, each incoming character is placed into a shift register the same size as the block, encrypted, and the block transmitted. At the receiving side, the ciphertext is decrypted and the extra bits in the block (i.e., everything above and beyond the one byte) are discarded.

Output Feedback (OFB) mode is a block cipher implementation conceptually similar to a synchronous stream cipher. OFB prevents the same plaintext block from generating the same ciphertext block by using an internal feedback mechanism that is independent of both the plaintext and ciphertext bitstreams.

A nice overview of these different modes can be found at progressivecoding.com. (A toy sketch contrasting ECB and CBC follows the DES description below.)

Secret key cryptography algorithms that are in use today include:

Data Encryption Standard (DES): The most common SKC scheme used today, DES was designed by IBM in the 1970s and adopted by the National Bureau of Standards (NBS) [now the National Institute of Standards and Technology (NIST)] in 1977 for commercial and unclassified government applications. DES is a block cipher employing a 56-bit key that operates on 64-bit blocks. DES has a complex set of rules and transformations that were designed specifically to yield fast hardware implementations and slow software implementations, although this latter point is becoming less significant today since the speed of computer processors is several orders of magnitude faster than twenty years ago. IBM also proposed a 112-bit key for DES, which was rejected at the time by the government; the use of 112-bit keys was considered in the 1990s; however, conversion was never seriously considered.
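To make the ECB-versus-CBC contrast concrete, here is a toy sketch that uses plain XOR as a stand-in "block cipher"; it is purely illustrative and offers no real security:

    def toy_encrypt(block: bytes, key: bytes) -> bytes:
        # Stand-in for a real block cipher: XOR the block with the key.
        return bytes(b ^ k for b, k in zip(block, key))

    key = b"8bytekey"
    blocks = [b"SAMEDATA", b"SAMEDATA"]    # two identical plaintext blocks

    # ECB: identical plaintext blocks give identical ciphertext blocks.
    ecb = [toy_encrypt(b, key) for b in blocks]
    print(ecb[0] == ecb[1])    # True: the repetition leaks

    # CBC: XOR each plaintext block with the previous ciphertext block
    # (starting from an initialization vector) before "encrypting".
    iv = bytes(8)
    cbc, prev = [], iv
    for b in blocks:
        c = toy_encrypt(toy_encrypt(b, prev), key)
        cbc.append(c)
        prev = c
    print(cbc[0] == cbc[1])    # False: chaining hides the repetition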


DES is defined in American National Standard X3.92 and three Federal Information Processing Standards (FIPS):

FIPS 46-3: DES
FIPS 74: Guidelines for Implementing and Using the NBS Data Encryption Standard
FIPS 81: DES Modes of Operation

Information about vulnerabilities of DES can be obtained from the Electronic Frontier Foundation. Two important variants that strengthen DES are:

Triple-DES (3DES): A variant of DES that employs up to three 56-bit keys and makes three encryption/decryption passes over the block; 3DES is also described in FIPS 46-3 and is the recommended replacement for DES.

DESX: A variant devised by Ron Rivest. By combining 64 additional key bits with the plaintext prior to encryption, DESX effectively increases the key length to 120 bits.

More detail about DES, 3DES, and DESX can be found below in Section 5.4.

Advanced Encryption Standard (AES): In 1997, NIST initiated a very public, 4-1/2 year process to develop a new secure cryptosystem for U.S. government applications. The result, the Advanced Encryption Standard, became the official successor to DES in December 2001. AES uses an SKC scheme called Rijndael, a block cipher designed by Belgian cryptographers Joan Daemen and Vincent Rijmen. The algorithm can use a variable block length and key length; the latest specification allowed any combination of key lengths of 128, 192, or 256 bits and blocks of length 128, 192, or 256 bits. NIST initially selected Rijndael in October 2000, and formal adoption as the AES standard came in December 2001. FIPS PUB 197 describes a 128-bit block cipher employing a 128-, 192-, or 256-bit key. The AES process and Rijndael algorithm are described in more detail below in Section 5.9.

CAST-128/256: CAST-128, described in Request for Comments (RFC) 2144, is a DES-like substitution-permutation crypto algorithm, employing a 128-bit key operating on a 64-bit block. CAST-256 (RFC 2612) is an extension of CAST-128, using a 128-bit block size and a variable-length (128, 160, 192, 224, or 256 bit) key. CAST is named for its developers, Carlisle Adams and Stafford Tavares, and is available internationally. CAST-256 was one of the Round 1 algorithms in the AES process.

International Data Encryption Algorithm (IDEA): Secret-key cryptosystem written by Xuejia Lai and James Massey in 1992 and


patented by Ascom; a 64-bit SKC block cipher using a 128-bit key. Also available internationally.

Rivest Ciphers (aka Ron's Code): Named for Ron Rivest, a series of SKC algorithms.

RC1: Designed on paper but never implemented.

RC2: A 64-bit block cipher using variable-sized keys, designed to replace DES. Its code has not been made public, although many companies have licensed RC2 for use in their products. Described in RFC 2268.

RC3: Found to be breakable during development.

RC4: A stream cipher using variable-sized keys; it is widely used in commercial cryptography products, although it can only be exported using keys that are 40 bits or less in length.

RC5: A block cipher supporting a variety of block sizes, key sizes, and numbers of encryption passes over the data. Described in RFC 2040.

RC6: An improvement over RC5, RC6 was one of the AES Round 2 algorithms.

Blowfish: A symmetric 64-bit block cipher invented by Bruce Schneier; optimized for 32-bit processors with large data caches, it is significantly faster than DES on a Pentium/PowerPC-class machine. Key lengths can vary from 32 to 448 bits. Blowfish, available freely and intended as a substitute for DES or IDEA, is in use in over 80 products.

Twofish: A 128-bit block cipher using 128-, 192-, or 256-bit keys. Designed to be highly secure and highly flexible, well-suited for large microprocessors, 8-bit smart card microprocessors, and dedicated hardware. Designed by a team led by Bruce Schneier, and was one of the Round 2 algorithms in the AES process.

Camellia: A secret-key block-cipher crypto algorithm developed jointly by Nippon Telegraph and Telephone (NTT) Corp. and Mitsubishi Electric Corporation (MEC) in 2000. Camellia has some characteristics in common with AES: a 128-bit block size, support for 128-, 192-, and 256-bit key lengths, and suitability for both software and hardware implementations on common 32-bit processors as well as 8-bit processors (e.g., smart cards, cryptographic hardware, and embedded systems). Also described in RFC 3713. Camellia's application in IPsec is described in RFC 4312 and its application in OpenPGP in RFC 5581.

MISTY1: Developed at Mitsubishi Electric Corp., a block cipher using a 128-bit key and 64-bit blocks, and a variable number of rounds.


Designed for hardware and software implementations, and resistant to differential and linear cryptanalysis. Described in RFC 2994.

Secure and Fast Encryption Routine (SAFER): Secret-key crypto scheme designed for implementation in software. Versions have been defined for 40-, 64-, and 128-bit keys.

KASUMI: A block cipher using a 128-bit key that is part of the Third-Generation Partnership Project (3GPP), formerly known as the Universal Mobile Telecommunications System (UMTS). KASUMI is the intended confidentiality and integrity algorithm for both message content and signaling data for emerging mobile communications systems.

SEED: A block cipher using 128-bit blocks and 128-bit keys. Developed by the Korea Information Security Agency (KISA) and adopted as a national standard encryption algorithm in South Korea. Also described in RFC 4269.

ARIA: A 128-bit block cipher employing 128-, 192-, and 256-bit keys. Developed by a large group of researchers from academic institutions, research institutes, and federal agencies in South Korea in 2003, and subsequently named a national standard. Described in RFC 5794.

CLEFIA: Described in RFC 6114, CLEFIA is a 128-bit block cipher employing key lengths of 128, 192, and 256 bits (which is compatible with AES). The CLEFIA algorithm was first published in 2007 by Sony Corporation. CLEFIA is one of the new-generation lightweight block-cipher algorithms designed after AES, offering high performance in software and hardware as well as a lightweight implementation in hardware.

SMS4: SMS4 is a 128-bit block cipher using 128-bit keys and 32 rounds to process a block. Declassified in 2006, SMS4 is used in the Chinese National Standard for Wireless Local Area Network (LAN) Authentication and Privacy Infrastructure (WAPI). SMS4 had been a proposed cipher for the Institute of Electrical and Electronics Engineers (IEEE) 802.11i standard on security mechanisms for wireless LANs, but has yet to be accepted by the IEEE or the International Organization for Standardization (ISO). SMS4 is described in SMS4 Encryption Algorithm for Wireless Networks (translated and typeset by Whitfield Diffie and George Ledin, 2008) or in the original Chinese.

Skipjack: SKC scheme proposed for Capstone. Although the details of the algorithm were never made public, Skipjack was a block cipher using an 80-bit key and 32 iteration cycles per 64-bit block.

3.2. Public-Key Cryptography

Public-key cryptography has been said to be the most significant new development in cryptography in the last 300-400 years. Modern PKC was


first described publicly by Stanford University professor Martin Hellman and graduate student Whitfield Diffie in 1976. Their paper described a two-key cryptosystem in which two parties could engage in secure communication over a non-secure communications channel without having to share a secret key.

PKC depends upon the existence of so-called one-way functions, or mathematical functions that are easy to compute whereas their inverse function is relatively difficult to compute. Let me give you two simple examples:

Multiplication vs. factorization: Suppose I tell you that I have two numbers, 9 and 16, and that I want to calculate the product; it should take almost no time to calculate the product, 144. Suppose instead that I tell you that I have a number, 144, and I need you to tell me which pair of integers I multiplied together to obtain that number. You will eventually come up with the solution, but whereas calculating the product took milliseconds, factoring will take longer because you first need to find the 8 pairs of integer factors and then determine which one is the correct pair.

Exponentiation vs. logarithms: Suppose I tell you that I want to take the number 3 to the 6th power; again, it is easy to calculate 3^6 = 729. But if I tell you that I have the number 729 and want you to tell me the two integers that I used, x and y, so that log_x 729 = y, it will take you longer to find all possible solutions and select the pair that I used.

While the examples above are trivial, they do represent two of the functional pairs that are used with PKC; namely, the ease of multiplication and exponentiation versus the relative difficulty of factoring and calculating logarithms, respectively. The mathematical "trick" in PKC is to find a trap door in the one-way function so that the inverse calculation becomes easy given knowledge of some item of information. (The problem is further exacerbated because the algorithms don't use just any old integers, but very large prime numbers.)

Generic PKC employs two keys that are mathematically related, although knowledge of one key does not allow someone to easily determine the other key. One key is used to encrypt the plaintext and the other key is used to decrypt the ciphertext. The important point here is that it does not matter which key is applied first, but that both keys are required for the process to work (Figure 1B). Because a pair of keys is required, this approach is also called asymmetric cryptography. In PKC, one of the keys is designated the public key and may be advertised as widely as the owner wants. The other key is designated the private key and is never revealed to another party. It is straightforward to send messages under this scheme. Suppose Alice wants to send Bob a message. Alice encrypts some information using Bob's public key; Bob decrypts the ciphertext using his private key. This method could also be


used to prove who sent a message; Alice, for example, could encrypt some plaintext with her private key; when Bob decrypts using Alice's public key, he knows that Alice sent the message, and Alice cannot deny having sent the message (non-repudiation).

Public-key cryptography algorithms that are in use today for key exchange or digital signatures include:

RSA: The first, and still most common, PKC implementation, named for the three MIT mathematicians who developed it: Ronald Rivest, Adi Shamir, and Leonard Adleman. RSA today is used in hundreds of software products and can be used for key exchange, digital signatures, or encryption of small blocks of data. RSA uses a variable-size encryption block and a variable-size key. The key pair is derived from a very large number, n, that is the product of two prime numbers chosen according to special rules; these primes may be 100 or more digits in length each, yielding an n with roughly twice as many digits as the prime factors. The public key information includes n and a derivative of one of the factors of n; an attacker cannot determine the prime factors of n (and, therefore, the private key) from this information alone, and that is what makes the RSA algorithm so secure. (Some descriptions of PKC erroneously state that RSA's safety is due to the difficulty in factoring large prime numbers. In fact, large prime numbers, like small prime numbers, only have two factors!) The ability of computers to factor large numbers, and therefore attack schemes such as RSA, is rapidly improving, and systems today can find the prime factors of numbers with more than 200 digits. Nevertheless, if a large number is created from two prime factors that are roughly the same size, there is no known factorization algorithm that will solve the problem in a reasonable amount of time; a 2005 test to factor a 200-digit number took 1.5 years and over 50 years of compute time (see the Wikipedia article on integer factorization). Regardless, one presumed protection of RSA is that users can easily increase the key size to always stay ahead of the computer processing curve. As an aside, the patent for RSA expired in September 2000, which does not appear to have affected RSA's popularity one way or the other. A detailed example of RSA is presented below in Section 5.3.

Diffie-Hellman: The key exchange algorithm described in Diffie and Hellman's 1976 paper. D-H is used for secret-key key exchange only, and not for authentication or digital signatures. More detail about Diffie-Hellman can be found below in Section 5.2.

Digital Signature Algorithm (DSA): The algorithm specified in NIST's Digital Signature Standard (DSS); provides digital signature capability for the authentication of messages.

ElGamal: Designed by Taher Elgamal, a PKC system similar to Diffie-Hellman and used for key exchange.


Elliptic Curve Cryptography (ECC): A PKC algorithm based upon elliptic curves. ECC can offer levels of security with small keys comparable to RSA and other PKC methods. It was designed for devices with limited compute power and/or memory, such as smartcards and PDAs. More detail about ECC can be found below in Section 5.8. Other references include "The Importance of ECC" Web page and the "Online Elliptic Curve Cryptography Tutorial", both from Certicom. See also RFC 6090 for a review of fundamental ECC algorithms.

Public-Key Cryptography Standards (PKCS): A set of interoperable standards and guidelines for public-key cryptography, designed by RSA Data Security Inc.

PKCS #1: RSA Cryptography Standard (also RFC 3447)
PKCS #2: Incorporated into PKCS #1.
PKCS #3: Diffie-Hellman Key-Agreement Standard
PKCS #4: Incorporated into PKCS #1.
PKCS #5: Password-Based Cryptography Standard (PKCS #5 v2.0 is also RFC 2898)
PKCS #6: Extended-Certificate Syntax Standard (being phased out in favor of X.509v3)
PKCS #7: Cryptographic Message Syntax Standard (also RFC 2315)
PKCS #8: Private-Key Information Syntax Standard (also RFC 5208)
PKCS #9: Selected Attribute Types (also RFC 2985)
PKCS #10: Certification Request Syntax Standard (also RFC 2986)
PKCS #11: Cryptographic Token Interface Standard
PKCS #12: Personal Information Exchange Syntax Standard
PKCS #13: Elliptic Curve Cryptography Standard
PKCS #14: Pseudorandom Number Generation Standard is no longer available
PKCS #15: Cryptographic Token Information Format Standard

Cramer-Shoup: A public-key cryptosystem proposed by R. Cramer and V. Shoup of IBM in 1998.


Key Exchange Algorithm (KEA): A variation on Diffie-Hellman; proposed as the key exchange method for Capstone.

LUC: A public-key cryptosystem designed by P.J. Smith and based on Lucas sequences. Can be used for encryption and signatures, using integer factoring.

For additional information on PKC algorithms, see "Public-Key Encryption", Chapter 8 in Handbook of Applied Cryptography, by A. Menezes, P. van Oorschot, and S. Vanstone (CRC Press, 1996).
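To illustrate the exponentiation-versus-logarithm asymmetry that PKC relies on, here is a toy Diffie-Hellman exchange with deliberately tiny numbers (real deployments use primes hundreds of digits long; all values here are illustrative):

    p, g = 23, 5          # small public prime and generator
    a, b = 6, 15          # Alice's and Bob's private values

    A = pow(g, a, p)      # Alice sends g^a mod p = 8
    B = pow(g, b, p)      # Bob sends g^b mod p = 19

    # Both sides derive the same shared secret without ever sending it:
    assert pow(B, a, p) == pow(A, b, p) == 2

    # An eavesdropper who sees p, g, A, and B must solve a discrete
    # logarithm to recover a or b, which is easy only for toy sizes.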

A digression: Who invented PKC? I tried to be careful in the first paragraph of this section to state that Diffie and Hellman "first described publicly" a PKC scheme. Although I have categorized PKC as a two-key system, that has been merely for convenience; the real criterion for a PKC scheme is that it allows two parties to exchange a secret even though the communication with the shared secret might be overheard. There seems to be no question that Diffie and Hellman were first to publish; their method is described in the classic paper, "New Directions in Cryptography," published in the November 1976 issue of IEEE Transactions on Information Theory. As shown below, Diffie-Hellman uses the idea that finding logarithms is relatively harder than exponentiation. And, indeed, it is the precursor to modern PKC, which does employ two keys. Rivest, Shamir, and Adleman described an implementation that extended this idea in their paper "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems," published in the February 1978 issue of the Communications of the ACM (CACM). Their method, of course, is based upon the relative ease of finding the product of two large prime numbers compared to finding the prime factors of a large number.

Some sources, though, credit Ralph Merkle with first describing a system that allows two parties to share a secret, although it was not a two-key system per se. A Merkle Puzzle works where Alice creates a large number of encrypted keys and sends them all to Bob, so that Bob chooses one at random and then lets Alice know which he has selected. An eavesdropper will see all of the keys but can't learn which key Bob has selected (because he has encrypted the response with the chosen key). In this case, Eve's effort to break in is the square of the effort of Bob to choose a key. While this difference may be small, it is often sufficient. Merkle apparently took a computer science course at UC Berkeley in 1974 and described his method, but had difficulty making people understand it; frustrated, he dropped the course. Meanwhile, he submitted the paper "Secure Communication Over Insecure Channels," which was published in the CACM in April 1978; Rivest et al.'s paper even makes reference to it. Merkle's method certainly wasn't published first, but did he have the idea first?


An interesting question, maybe, but who really knows? For some time, it was a quiet secret that a team at the UK's Government Communications Headquarters (GCHQ) had first developed PKC in the early 1970s. Because of the nature of the work, GCHQ kept the original memos classified. In 1997, however, the GCHQ changed their posture when they realized that there was nothing to gain by continued silence. Documents show that a GCHQ mathematician named James Ellis started research into the key distribution problem in 1969 and that by 1975, Ellis, Clifford Cocks, and Malcolm Williamson had worked out all of the fundamental details of PKC, yet couldn't talk about their work. (They were, of course, barred from challenging the RSA patent!) After more than 20 years, Ellis, Cocks, and Williamson have begun to get their due credit. And the National Security Agency (NSA) claims to have knowledge of this type of algorithm as early as 1966 but there is no supporting documentation... yet. So this really was a digression...

3.3. Hash Functions

Hash functions, also called message digests and one-way encryption, are algorithms that, in some sense, use no key (Figure 1C). Instead, a fixed-length hash value is computed based upon the plaintext, which makes it impossible for either the contents or length of the plaintext to be recovered. Hash algorithms are typically used to provide a digital fingerprint of a file's contents, often used to ensure that the file has not been altered by an intruder or virus. Hash functions are also commonly employed by many operating systems to encrypt passwords. Hash functions, then, provide a measure of the integrity of a file.

Hash algorithms that are in common use today include:

Message Digest (MD) algorithms: A series of byte-oriented algorithms that produce a 128-bit hash value from an arbitrary-length message.

MD2 (RFC 1319): Designed for systems with limited memory, such as smart cards. (MD2 has been relegated to historical status, per RFC 6149.)

MD4 (RFC 1320): Developed by Rivest, similar to MD2 but designed specifically for fast processing in software. (MD4 has been relegated to historical status, per RFC 6150.)

MD5 (RFC 1321): Also developed by Rivest after potential weaknesses were reported in MD4; this scheme is similar to MD4 but is slower because more manipulation is made to the original data. MD5 has been implemented in a large number of products, although several


weaknesses in the algorithm were demonstrated by German cryptographer Hans Dobbertin in 1996 ("Cryptanalysis of MD5 Compress").

Secure Hash Algorithm (SHA): Algorithm for NIST's Secure Hash Standard (SHS). SHA-1 produces a 160-bit hash value and was originally published as FIPS 180-1 and RFC 3174. FIPS 180-2 (aka SHA-2) describes five algorithms in the SHS: SHA-1 plus SHA-224, SHA-256, SHA-384, and SHA-512, which can produce hash values that are 224, 256, 384, or 512 bits in length, respectively. SHA-224, -256, -384, and -512 are also described in RFC 4634.

RIPEMD: A series of message digests that initially came from the RIPE (RACE Integrity Primitives Evaluation) project. RIPEMD-160 was designed by Hans Dobbertin, Antoon Bosselaers, and Bart Preneel, and optimized for 32-bit processors to replace the then-current 128-bit hash functions. Other versions include RIPEMD-256, RIPEMD-320, and RIPEMD-128.

HAVAL (HAsh of VAriable Length): Designed by Y. Zheng, J. Pieprzyk and J. Seberry, a hash algorithm with many levels of security. HAVAL can create hash values that are 128, 160, 192, 224, or 256 bits in length.

Whirlpool: A relatively new hash function, designed by V. Rijmen and P.S.L.M. Barreto. Whirlpool operates on messages less than 2^256 bits in length and produces a message digest of 512 bits. The design of this hash function is very different from that of MD5 and SHA-1, making it immune to the same attacks as on those hashes (see below).

Tiger: Designed by Ross Anderson and Eli Biham, Tiger is designed to be secure, run efficiently on 64-bit processors, and easily replace MD4, MD5, SHA and SHA-1 in other applications. Tiger/192 produces a 192-bit output and is compatible with 64-bit architectures; Tiger/128 and Tiger/160 produce hashes of length 128 and 160 bits, respectively, to provide compatibility with the other hash functions mentioned above.

(Readers might be interested in HashCalc, a Windows-based program that calculates hash values using a dozen algorithms, including MD5, SHA-1 and several variants, RIPEMD-160, and Tiger. Command line utilities that calculate hash values include sha_verify by Dan Mares [Windows; supports MD5, SHA-1, SHA-2] and md5deep [cross-platform; supports MD5, SHA-1, SHA-256, Tiger, and Whirlpool].)

Hash functions are sometimes misunderstood, and some sources claim that no two files can have the same hash value. This is, in fact, not correct. Consider a hash function that provides a 128-bit hash value. There are, obviously, 2^128 possible hash values. But there are a lot more than 2^128 possible files. Therefore, there have to be multiple files (in fact, an infinite number of files!) that can have the same 128-bit hash value.


The difficulty is finding two files with the same hash! What is, indeed, very hard to do is to create a file that has a given hash value so as to force a hash value collision, which is the reason that hash functions are used extensively for information security and computer forensics applications. Alas, researchers in 2004 found that practical collision attacks could be launched on MD5, SHA-1, and other hash algorithms. Readers interested in this problem should read the following:

AccessData. (2006, April). MD5 Collisions: The Effect on Computer Forensics. AccessData White Paper.
Burr, W. (2006, March/April). Cryptographic hash standards: Where do we go from here? IEEE Security & Privacy, 4(2), 88-91.
Dwyer, D. (2009, June 3). SHA-1 Collision Attacks Now 2^52. SecureWorks Research blog.
Gutman, P., Naccache, D., & Palmer, C.C. (2005, May/June). When hashes collide. IEEE Security & Privacy, 3(3), 68-71.
Klima, V. (2005, March). Finding MD5 Collisions - a Toy For a Notebook.
Lee, R. (2009, January 7). Law Is Not A Science: Admissibility of Computer Evidence and MD5 Hashes. SANS Computer Forensics blog.
Thompson, E. (2005, February). MD5 collisions and the impact on computer forensics. Digital Investigation, 2(1), 36-40.
Wang, X., Feng, D., Lai, X., & Yu, H. (2004, August). Collisions for Hash Functions MD4, MD5, HAVAL-128 and RIPEMD.
Wang, X., Yin, Y.L., & Yu, H. (2005, February 13). Collision Search Attacks on SHA1.

Readers are also referred to the Eindhoven University of Technology HashClash Project Web site. An excellent overview of the situation with hash collisions (circa 2005) can be found in RFC 4270 (by P. Hoffman and B. Schneier, November 2005). And for additional information on hash functions, see David Hopwood's MessageDigest Algorithms page.

At this time, there is no obvious successor to MD5 and SHA-1 that could be put into use quickly; there are so many products using these hash functions that it could take many years to flush out all use of 128- and 160-bit hashes. That said, NIST announced in 2007 their Cryptographic Hash Algorithm Competition to find the next-generation secure hashing method. Dubbed SHA-3, this new scheme will augment FIPS 180-2. A list of submissions can be found at The SHA-3 Zoo. The SHA-3 standard may not be available until 2011 or 2012.
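A quick demonstration of the fixed-length, one-way behaviour described above, using Python's standard hashlib module:

    import hashlib

    data = b"The quick brown fox jumps over the lazy dog"
    print(hashlib.md5(data).hexdigest())      # 128-bit digest, 32 hex chars
    print(hashlib.sha1(data).hexdigest())     # 160-bit digest, 40 hex chars
    print(hashlib.sha256(data).hexdigest())   # 256-bit digest, 64 hex chars

    # Changing a single character produces a completely different digest,
    # while the digest length stays fixed regardless of the input size:
    print(hashlib.sha256(b"The quick brown fox jumps over the lazy cog").hexdigest())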


Certain extensions of hash functions are used for a variety of information security and digital forensics applications, such as:

Fuzzy hashes are an area of intense research; they are hash values computed so that two similar inputs produce similar hashes. Fuzzy hashes are used to detect documents, images, or other files that are close to each other with respect to content. See "Fuzzy Hashing" (PDF | PPT) by Jesse Kornblum for a good treatment of this topic.
