
Chapter 13

TCP Implementation

High Performance TCP/IP Networking, Hassan-Jain Prentice Hall


Objectives
 Understand the structure of typical TCP implementation
 Outline the implementation of extended standards for TCP
over high-performance networks
 Understand the sources of end-system overhead in typical
TCP implementations, and techniques to minimize them
 Quantify the effect of end-system overhead and buffering
on TCP performance
 Understand the role of Remote Direct Memory Access
(RDMA) extensions for high-performance IP networking



Contents

 Overview of TCP implementation


 High-performance TCP
 End-system overhead
 Copy avoidance
 TCP offload



Implementation
Overview



Overall Structure (RFC 793)

 Internal structure specified in RFC 793


Fig. 13.1



Data Structure of TCP Endpoint
 Transmission control block (TCB): Stores the connection state
and related variables
 Transmit queue: Buffers containing outstanding (sent but
unacknowledged) data
 Receive queue: Buffers for received data not yet delivered to
the higher layer
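The per-connection state can be sketched as a minimal C structure. The field names follow the RFC 793 send/receive variables (SND.UNA, SND.NXT, and so on), but the struct layout itself is illustrative, not taken from any particular kernel:

```c
#include <stdint.h>

/* Illustrative transmission control block; field names follow the
   RFC 793 send/receive variables, the layout is an assumption. */
enum tcp_state { TCP_CLOSED, TCP_LISTEN, TCP_SYN_SENT, TCP_ESTABLISHED };

struct tcp_tcb {
    enum tcp_state state;  /* position in the RFC 793 state machine */
    uint32_t snd_una;      /* oldest unacknowledged sequence number */
    uint32_t snd_nxt;      /* next sequence number to send */
    uint32_t snd_wnd;      /* send window advertised by the peer */
    uint32_t rcv_nxt;      /* next sequence number expected */
    uint32_t rcv_wnd;      /* receive window we advertise */
    /* plus pointers to the transmit and receive buffer queues */
};

/* Bytes of new data the peer's advertised window currently permits. */
uint32_t tcp_usable_window(const struct tcp_tcb *tcb)
{
    uint32_t in_flight = tcb->snd_nxt - tcb->snd_una;
    return in_flight >= tcb->snd_wnd ? 0 : tcb->snd_wnd - in_flight;
}
```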



Buffering and Data Movement

 Buffer queues reside in the protocol-independent socket layer
within the operating system kernel
 TCP sender upcalls to the transmit queue to obtain data
 TCP receiver notifies the receive queue of correct arrival of
incoming data
 BSD-derived kernels implement buffers in mbufs
 Moves data by reference
 Reduces the need to copy
 Most implementations commit buffer space to the queues lazily
 Queues consume memory only when the network bandwidth does not
match the rate at which the TCP user produces or consumes data



User Memory Access

 Provides for movement of data to and from the memory of the
TCP user
 Copy semantics
 SEND and RECEIVE are defined with copy semantics
 The user may modify a send buffer as soon as SEND returns,
since the data has already been copied
 Direct access
 Allows TCP to access the user buffers directly
 Bypasses copying of data



TCP Data Exchange
 TCP endpoints cooperate by exchanging segments
 Each segment contains:
 Sequence number seg.seq, segment data length seg.len,
status bits, ack seq number seg.ack, advertised receive
window size seg.wnd
 Fig. 13.3
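The segment fields listed above map naturally onto a small struct. One subtlety every implementation must handle: sequence numbers wrap modulo 2^32, so ordering must be computed with a signed difference, not a plain comparison. A sketch (the struct is illustrative, not any kernel's actual layout):

```c
#include <stdint.h>

/* The per-segment fields listed above, as an illustrative struct. */
struct tcp_seg {
    uint32_t seq;   /* seg.seq: sequence number of the first data byte */
    uint32_t ack;   /* seg.ack: next sequence number expected by the peer */
    uint32_t len;   /* seg.len: bytes of data in this segment */
    uint16_t wnd;   /* seg.wnd: advertised receive window */
    uint8_t  flags; /* status bits: SYN, ACK, FIN, ... */
};

/* Sequence numbers wrap modulo 2^32, so "a before b" is decided by
   the sign of the 32-bit difference. */
int seq_lt(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}
```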



Data Retransmissions

 TCP sender uses a retransmission timer to drive
retransmission of unacknowledged data
 Retransmits a segment if the timer fires
 Retransmission timeout (RTO)
 RTO < RTT: Aggressive; too many spurious retransmissions
 RTO > RTT: Conservative; low utilisation while the connection
sits idle waiting for the timer
 In practice, an adaptive retransmission timer with exponential
back-off is used (specified in RFC 2988)
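The RFC 2988 timer keeps a smoothed RTT estimate (SRTT) and a variance estimate (RTTVAR), sets RTO = SRTT + 4·RTTVAR with a 1-second lower bound, and doubles the RTO on each timeout. A sketch of the update rules, ignoring clock granularity:

```c
#include <math.h>

struct rto_state { double srtt, rttvar, rto; int have_sample; };

/* Feed one RTT measurement r (seconds) into the RFC 2988 estimator. */
void rto_sample(struct rto_state *s, double r)
{
    const double alpha = 0.125, beta = 0.25;   /* RFC 2988 gains */
    if (!s->have_sample) {
        s->srtt = r;              /* first sample: SRTT = R, RTTVAR = R/2 */
        s->rttvar = r / 2.0;
        s->have_sample = 1;
    } else {
        s->rttvar = (1.0 - beta) * s->rttvar + beta * fabs(s->srtt - r);
        s->srtt  = (1.0 - alpha) * s->srtt + alpha * r;
    }
    s->rto = s->srtt + 4.0 * s->rttvar;
    if (s->rto < 1.0) s->rto = 1.0;   /* RFC 2988 lower bound of 1 s */
}

/* Exponential back-off: double the timeout on each expiry. */
void rto_backoff(struct rto_state *s) { s->rto *= 2.0; }
```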



Congestion Control

 A retransmission event indicates (to the TCP sender) that the
network is congested
 Congestion management is a function of the end-systems
 RFC 2581 requires that TCP end-systems respond to
congestion by reducing their sending rate
 AIMD: Additive Increase Multiplicative Decrease
 TCP sender probes for available bandwidth on the network path
 Upon detection of congestion, TCP sender multiplicatively reduces
cwnd
 Achieves fairness among TCP connections
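A minimal sketch of the AIMD window update in congestion avoidance, with a byte-counted cwnd (the 2·MSS floor on the decrease follows RFC 2581's ssthresh rule):

```c
#include <stdint.h>

/* Additive increase: in congestion avoidance the window grows by
   roughly one MSS per RTT, i.e. mss*mss/cwnd bytes per ACK. */
uint32_t aimd_on_ack(uint32_t cwnd, uint32_t mss)
{
    uint32_t inc = (uint32_t)(((uint64_t)mss * mss) / cwnd);
    return cwnd + (inc ? inc : 1);
}

/* Multiplicative decrease: halve the window on congestion,
   but never below 2*MSS (RFC 2581 ssthresh floor). */
uint32_t aimd_on_loss(uint32_t cwnd, uint32_t mss)
{
    uint32_t half = cwnd / 2;
    return half < 2 * mss ? 2 * mss : half;
}
```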



High Performance
TCP



TCP Implementation with High
Bandwidth-Delay Product

 High bandwidth-delay product:
 High-speed networks (e.g. optical networks)
 High-latency networks (e.g. satellite networks)
 Collectively called Long Fat Networks (LFNs)
 LFNs require a large window size (more than the 16
bits originally defined for TCP)
 Window scale option allows a TCP endpoint to
advertise a large window size (e.g. 1 Gbyte)
 Negotiated at connection setup
 Scales the advertised window in units of up to 16 Kbytes
(shift count limited to 14)
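The option carries a shift count applied to the 16-bit window field; at the maximum shift of 14 the effective window reaches 65535 × 2^14, roughly 1 Gbyte. A sketch:

```c
#include <stdint.h>

/* Effective window = 16-bit advertised window << negotiated shift.
   RFC 1323 limits the shift count to 14. */
uint32_t scaled_window(uint16_t wnd_field, uint8_t shift)
{
    if (shift > 14) shift = 14;
    return (uint32_t)wnd_field << shift;
}
```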



Round Trip Time Estimation

 Accuracy of RTT estimation depends on frequent
sample measurements of RTT
 Percentage of segments sampled decreases with larger
windows
 May be insufficient for LFNs
 Timestamp option
 Enables the sender to compute an RTT sample from every ACK
 Provides a safeguard against accepting old segments with
wrapped sequence numbers
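With timestamps, each ACK echoes the sender's clock value (TSecr), so every ACK yields an RTT sample regardless of window size; unsigned 32-bit subtraction handles timestamp wraparound. A sketch:

```c
#include <stdint.h>

/* RTT sample = current timestamp clock minus the echoed value;
   unsigned arithmetic is correct across clock wraparound. */
uint32_t rtt_from_timestamp(uint32_t now, uint32_t tsecr)
{
    return now - tsecr;
}
```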



Path MTU Discovery

 Transfer is most efficient when using the largest MSS that
does not cause fragmentation along the path
 Path MTU discovery enables the TCP sender to automatically
discover the largest acceptable MSS
 TCP implementation must correctly handle
dynamic changes to MSS
 Never leaves more than 2*MSS bytes of data
unacknowledged
 TCP sender may need to segment data for
retransmission



End-System
Overhead



Reduce End-System Overhead

 TCP imposes processing overhead in the
operating system
 Adds directly to latency
 Consumes a significant share of CPU cycles
and memory
 Reducing overhead can improve application
throughput



Relationship Between Bandwidth
and CPU Utilization



Achievable Throughput for Host-
Limited Systems



Sources of Overhead for TCP/IP
 Per-transfer overhead
 Per-packet overhead
 Per-byte overhead
 Fig. 13.5



Per-Packet Overhead

 Increasing packet size can mitigate the impact of
per-packet and per-segment overhead
 Fig. 13.6
 Increasing segment size S increases achievable
bandwidth
 As packet size grows, the effect of per-packet overhead
becomes less significant
 Interrupts
 A significant source of per-packet overhead
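The relationship can be captured by a simple cost model (the symbols c_p and c_b are illustrative assumptions, not from the text): if each packet costs a fixed c_p seconds plus c_b seconds per byte, then a segment of S bytes achieves B(S) = S / (c_p + c_b·S), which climbs toward the per-byte limit 1/c_b as S grows; larger segments amortize the fixed per-packet cost.

```c
/* Achievable bandwidth (bytes/s) for segment size s under an assumed
   per-packet cost c_p (seconds) and per-byte cost c_b (seconds/byte). */
double achievable_bw(double s, double c_p, double c_b)
{
    return s / (c_p + c_b * s);
}
```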



Relationship between Packet Size
and Achievable Bandwidth



Relationship between Packet
Overhead and Bandwidth



Checksum Overhead

 A source of per-byte overhead
 Ways to reduce checksum overhead:
 Complete multiple steps in a single traversal of the
data to reduce per-byte overhead
 Integrate checksumming with the data copy
 Compute the checksum in hardware
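Integrating the checksum with the copy means each byte is touched once instead of twice. A hedged sketch of the idea using the Internet one's-complement sum (simplified: it sums host-endian words, which affects the byte order of the result but not its error-detecting behavior):

```c
#include <stdint.h>
#include <string.h>

/* One pass over the data performs both the copy and the 16-bit
   one's-complement sum used by the Internet checksum. */
uint16_t copy_and_checksum(uint8_t *dst, const uint8_t *src, size_t len)
{
    uint32_t sum = 0;
    size_t i = 0;
    for (; i + 1 < len; i += 2) {
        uint16_t word;
        memcpy(&word, src + i, 2);   /* alignment-safe load */
        memcpy(dst + i, &word, 2);   /* the copy shares the same pass */
        sum += word;
    }
    if (i < len) {                   /* trailing odd byte */
        dst[i] = src[i];
        sum += src[i];
    }
    while (sum >> 16)                /* fold carries back into 16 bits */
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)~sum;
}
```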



Copy Avoidance



Copy Avoidance for High-
Performance TCP

 Page remapping
 Uses virtual memory to reduce copying across the TCP/user
interface
 Typically resides at the socket layer in the OS kernel
 Scatter/gather I/O
 Does not require copy semantics
 Entails a comprehensive restructuring of OS and I/O interfaces
 Remote Direct Memory Access (RDMA)
 Steers incoming data directly into user-specified buffers
 IETF standards under way



TCP Offload



TCP Offload

 Supports TCP/IP protocol functions directly
on the network adapter (NIC)
 Protocol processing
 TCP checksum offloading
 Significantly reduces per-packet overheads
for TCP/IP protocol processing
 Helps to avoid expensive copy operations

