BREAKING THE DATA TRANSFER BOTTLENECK

UDT: A High Performance Data Transport Protocol

Yunhong GU
gu@lac.uic.edu

National Center for Data Mining


University of Illinois at Chicago
October 10, 2005
Updated on August 8, 2009

udt.sourceforge.net
Outline

INTRODUCTION

PROTOCOL DESIGN & IMPLEMENTATION

CONGESTION CONTROL

PERFORMANCE EVALUATION

COMPOSABLE UDT

CONCLUSIONS

>> INTRODUCTION

PROTOCOL DESIGN & IMPLEMENTATION

CONGESTION CONTROL

PERFORMANCE EVALUATION

COMPOSABLE UDT

CONCLUSIONS

Motivations
 The widespread use of high-speed networks (1Gb/s, 10Gb/s, etc.)
has enabled many new distributed data intensive applications
 Inexpensive fibers and advanced optical networking technologies (e.g.,
DWDM - Dense Wavelength Division Multiplexing)
 10Gb/s is common in high speed network testbeds, 40 Gb/s is emerging

 Large volumetric datasets


 Satellite weather data
 Astronomy observation
 Network monitoring

 The Internet transport protocol (TCP) does NOT scale well as the
network bandwidth-delay product (BDP) increases

 A new transport protocol is needed!

Data Transport Protocol

 Functionalities
 Streaming, messaging
 Reliability
 Timeliness
 Unicast vs. multicast

 Congestion control
 Efficiency
 Fairness
 Convergence
 Distributedness

[Figure: protocol stack, top to bottom: Applications, Transport Layer, Network Layer, Data Link Layer, Physical Layer]

TCP

 Reliable, data streaming, unicast

 Congestion control
 Increase the congestion window size (cwnd) by one full-sized packet per RTT
 Halve the cwnd on each loss event

½ Bandwidth * RTT

 Poor efficiency in high bandwidth-delay product networks

 Bias against flows with larger RTT
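To make the efficiency problem concrete, the sketch below estimates how long one flow needs to climb back to full link rate after a single loss event under the halve-then-add-one rule. The link speeds, RTTs, and packet size are illustrative assumptions, not measurements from these slides.

```python
def tcp_recovery_time(bandwidth_bps, rtt_s, packet_bytes=1500):
    """Seconds for standard AIMD TCP to regain full rate after one loss.

    Assumes the window equals the bandwidth-delay product (BDP) just
    before the loss, is halved, then grows by one packet per RTT.
    """
    bdp_packets = bandwidth_bps * rtt_s / (packet_bytes * 8)
    rtts_needed = bdp_packets / 2  # packets to regain, one per RTT
    return rtts_needed * rtt_s


# Illustrative (assumed) paths: a LAN and a long-haul 10 Gb/s link.
lan = tcp_recovery_time(1e9, 0.001)
wan = tcp_recovery_time(10e9, 0.1)
print(f"1 Gb/s, 1 ms RTT:    {lan:.3f} s")
print(f"10 Gb/s, 100 ms RTT: {wan / 60:.1f} min")
```

Under these assumptions the LAN flow recovers in a few hundredths of a second, while the long-haul flow needs more than an hour, because the recovery time grows with both the bandwidth and the square of the RTT.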


TCP

[Figure: TCP throughput (Mb/s, 200 to 1000) versus round trip time (1 to 400 ms, spanning LAN, US, US-EU, and US-ASIA paths) and packet loss rate (0.01% to 0.5%)]

Related Work

 TCP variants
 HighSpeed, Scalable, BiC, FAST, H-TCP, L-TCP

 Parallel TCP
 PSockets, GridFTP

 Rate-based reliable UDP


 RBUDP, Tsunami, FOBS, FRTP (based on SABUL), Hurricane (based on
UDT)

 XCP

 SABUL

Problems of Existing Work

 Hard to deploy
 TCP variants and XCP
 Need modifications in OS kernel and/or routers

 Cannot be used in shared networks


 Most reliable UDP-based protocols

 Poor fairness
 Intra-protocol fairness
 RTT fairness

 Manual parameter tuning

A New Protocol

[Figure: throughput (Mb/s, 200 to 1000) versus round trip time (1 to 400 ms, spanning LAN, US, US-EU, and US-ASIA paths) and packet loss rate (0.01% to 0.5%)]

UDT (UDP-based Data Transfer Protocol)

 Application level, UDP-based

 Similar functionalities to TCP


 Connection-oriented reliable duplex unicast data streaming

 New protocol design and implementation

 New congestion control algorithm

 Configurable congestion control framework

Objective & Non-objective

 Objective
 For distributed data intensive applications in high speed networks
 A small number of flows share the abundant bandwidth
 Efficient, fair, and friendly
 Configurable
 Easily deployable and usable

 Non-objective
 Replace TCP on the Internet

UDT Project

 Open source (udt.sourceforge.net)

 Design and implement the UDT protocol

 Design the UDT congestion control algorithm

 Experimentally evaluate the performance of UDT

 Design and implement a configurable protocol framework based on
UDT (Composable UDT)

INTRODUCTION

>> PROTOCOL DESIGN & IMPLEMENTATION

CONGESTION CONTROL

PERFORMANCE EVALUATION

COMPOSABLE UDT

CONCLUSIONS

UDT Overview

 Two orthogonal elements


 The UDT protocol
 The UDT congestion control algorithm

 Protocol design & implementation


 Functionality
 Efficiency

 Congestion control algorithm


 Efficiency, fairness, friendliness, and stability

UDT Overview

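[Figure: layered architecture. Applications call the UDT socket interface; UDT sits on top of UDP through the socket API, next to applications that use TCP or UDP through the socket API directly.]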

Functionality

 Reliability
 Packet-based sequencing
 Acknowledgment and loss report from receiver
 ACK sub-sequencing
 Retransmission (based on loss report and timeout)

 Streaming and Messaging


 Buffer/memory management

 Connection maintenance
 Handshake, keep-alive message, teardown message

 Duplex
 Each UDT instance contains both a sender and a receiver

Protocol Architecture

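[Figure: hosts A and B each contain a sender and a receiver, connected by a UDP channel. Data packets carry a sequence number, a timestamp (TS), and the payload. The receiver returns ACKs (carrying a sequence number) and NAKs (carrying a loss list).]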
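The receiver side of this exchange can be sketched as follows: gaps in the arriving sequence numbers go into a loss list (the content of a NAK), and the acknowledgment number is the first sequence number still outstanding. This is a simplified illustration; the class and method names are hypothetical, and sequence number wraparound and timers are ignored.

```python
class LossDetector:
    """Tracks the next expected sequence number and records gaps."""

    def __init__(self, first_seq=0):
        self.next_expected = first_seq
        self.loss_list = []  # sequence numbers still missing, ascending

    def on_data(self, seq):
        """Process one arriving data packet; return newly detected losses."""
        new_losses = []
        if seq > self.next_expected:
            # Gap: every number between expected and received is missing.
            new_losses = list(range(self.next_expected, seq))
            self.loss_list.extend(new_losses)
            self.next_expected = seq + 1
        elif seq == self.next_expected:
            self.next_expected = seq + 1
        elif seq in self.loss_list:
            # A retransmitted or reordered packet fills a hole.
            self.loss_list.remove(seq)
        return new_losses  # would be reported to the sender in a NAK

    def ack_number(self):
        """First sequence number that has not yet arrived."""
        return self.loss_list[0] if self.loss_list else self.next_expected


receiver = LossDetector()
naks = [receiver.on_data(seq) for seq in [0, 1, 4, 5, 2]]
```

In this run, packet 4 arrives while 2 is expected, so 2 and 3 are reported as lost; the late arrival of 2 then leaves only 3 in the loss list.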

Software Architecture

[Figure: software architecture. Components: API, Sender, Receiver, Sender's Buffer, Sender's Loss List, Receiver's Buffer, Receiver's Loss List, Listener, CC (congestion control), and the UDP Channel.]

Efficiency Consideration

 Fewer packets
 Timer-based acknowledging

 Less CPU time
 Reduce per-packet processing time
 Reduce memory copies
 Reduce loss list processing time
 Light ACK vs. regular ACK

 Parallel processing
 Threading architecture

 Less bursty processing
 Evenly distribute the processing time

Application Programming Interface (API)

 Socket API

 New API
 sendfile/recvfile: efficient file transfer
 sendmsg/recvmsg: messaging with partial reliability
 selectEx: a more efficient version of “select”

 Rendezvous Connect
 Firewall traversal

INTRODUCTION

PROTOCOL DESIGN & IMPLEMENTATION

>> CONGESTION CONTROL

PERFORMANCE EVALUATION

COMPOSABLE UDT

CONCLUSIONS

Overview

 Congestion control vs. flow control


 Congestion control: effectively utilize the network bandwidth
 Flow control: prevent the receiver from being overwhelmed by incoming
packets

 Window-based vs. rate-based


 Window-based: tune the maximum number of in-flight packets (TCP)
 Rate-based: tune the inter-packet sending time (UDT)

 AIMD: additive increase, multiplicative decrease

 Feedback
 Packet loss (Most TCP variants, UDT)
 Delay (Vegas, FAST)
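The two control styles target the same sending rate through different knobs. The sketch below converts one target rate into each knob; the rate, RTT, and packet size are illustrative assumptions.

```python
def inter_packet_interval(rate_bps, packet_bytes=1500):
    """Rate-based control: seconds to wait between consecutive packets."""
    return packet_bytes * 8 / rate_bps


def window_for_rate(rate_bps, rtt_s, packet_bytes=1500):
    """Window-based control: in-flight packets needed to sustain a rate."""
    return rate_bps * rtt_s / (packet_bytes * 8)


rate = 1e9   # 1 Gb/s target (illustrative)
rtt = 0.1    # 100 ms round trip (illustrative)
interval_us = inter_packet_interval(rate) * 1e6
window = window_for_rate(rate, rtt)
print(f"send one packet every {interval_us:.0f} microseconds")
print(f"or keep about {window:.0f} packets in flight")
```

Note that the window depends on the RTT while the inter-packet interval does not, which is one reason a rate-based scheme can be tuned on a fixed interval.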

AIMD with Decreasing Increases

 AIMD
 $x = x + \alpha(x)$, for every constant interval (e.g., RTT)
 $x = (1 - \beta)\,x$, when there is a packet loss event
where x is the packet sending rate.

 TCP
 $\alpha(x) \equiv 1$, and the increase interval is RTT.
 $\beta = 0.5$

 AIMD with Decreasing Increase
 $\alpha(x)$ is non-increasing, and $\lim_{x \to +\infty} \alpha(x) = 0$.
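The update rule above can be written as one function with a pluggable increase term. In this minimal sketch, `tcp_alpha` is the constant increase of standard AIMD, and `decreasing_alpha` is a hypothetical example of a non-increasing function that tends to zero; it is not the UDT formula.

```python
def aimd_step(x, alpha, beta, loss):
    """One control interval: decrease on loss, otherwise add alpha(x)."""
    return (1 - beta) * x if loss else x + alpha(x)


def tcp_alpha(x):
    return 1.0  # constant increase, as in standard AIMD


def decreasing_alpha(x, capacity=1000.0):
    # Hypothetical non-increasing choice that tends to 0 as x grows;
    # shown only to illustrate the DAIMD condition.
    return capacity / (capacity + x)


rate = 100.0
for loss in [False, False, True, False]:
    rate = aimd_step(rate, tcp_alpha, beta=0.5, loss=loss)
```

With the constant increase, the rate goes 100, 101, 102, is halved to 51 on the loss, and ends at 52.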

AIMD with Decreasing Increases

[Figure: increase parameter $\alpha(x)$ versus sending rate x for UDT, Scalable TCP, HighSpeed TCP, and AIMD (TCP NewReno)]

UDT Control Algorithm

 Increase
 $\alpha(x) = f(B - x) \cdot c$, where B is the link capacity
(bandwidth) and c is a constant parameter
 $\alpha(x) = 10^{\lceil \log(B - x) \rceil} \cdot c$

 Constant rate control interval (SYN), independent of RTT
 SYN = 0.01 seconds

 Decrease
 Randomized decrease factor
 $\beta = 1 - (8/9)^n$

[Figure: $\alpha(x)$ versus x]

The Increase Formula: an Example

Bandwidth (B) = 10 Gbps, Packet size = 1500 bytes

x (Mbps)          B - x (Mbps)     Increment (pkts/SYN)
[0, 9000)         (1000, 10000]    10
[9000, 9900)      (100, 1000]      1
[9900, 9990)      (10, 100]        0.1
[9990, 9999)      (1, 10]          0.01
[9999, 9999.9)    (0.1, 1]         0.001
9999.9+           <0.1             0.00067
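The table can be reproduced with a few lines of code. In this sketch the headroom B - x is measured in bits per second, the constant c is taken as 10^-9, and the increment is floored at 1 / packet size; all three choices are assumptions picked so that the output matches the rows above, not values stated on the slide.

```python
import math


def increment_per_syn(capacity_bps, rate_bps, packet_bytes=1500):
    """Packets added to the sending rate per SYN interval.

    The power-of-ten step follows the ceil(log10) shape of the increase
    formula. The scaling (10**-9) and the floor (1 / packet_bytes) are
    assumptions chosen to match the example table.
    """
    floor = 1.0 / packet_bytes
    headroom = capacity_bps - rate_bps
    if headroom <= 0:
        return floor
    step = 10.0 ** (math.ceil(math.log10(headroom)) - 9)
    return max(step, floor)


B = 10e9  # 10 Gb/s link, as in the example
for x_mbps in [5000, 9500, 9950, 9995, 9999.5, 9999.95]:
    inc = increment_per_syn(B, x_mbps * 1e6)
    print(f"x = {x_mbps:>8} Mb/s -> {inc:.5f} pkts/SYN")
```

Each sample rate falls in a different row of the table, so the printed increments step down by a factor of ten until the floor takes over.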

Dealing with Packet Loss

 Loss synchronization
 Randomization method

 Non-congestion loss
 Do not decrease sending rate for the first packet loss

[Figure: two panels, M=5, N=2 and M=8, N=3]

 Packet reordering

Bandwidth Estimation

 Packet Pair

[Figure: packet pair P1, P2 shown at three points along the path]

Packet Size / Space → Bottleneck Bandwidth

 Filters
 Cross traffic
 Interrupt Coalescence
 Robust to estimation errors

 Randomized interval to send packet pair
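The packet-pair relation can be sketched as below. The median is used here as one simple filter against outliers from cross traffic or interrupt coalescence; the choice of filter and the sample spacings are assumptions made for illustration.

```python
from statistics import median


def estimate_bandwidth(pair_spacings_s, packet_bytes=1500):
    """Bottleneck bandwidth (bits/s) from packet-pair arrival spacings.

    The median discards spacings stretched by cross traffic or
    compressed by interrupt coalescence (an assumed filter choice).
    """
    spacing = median(pair_spacings_s)
    return packet_bytes * 8 / spacing


# Made-up spacings in seconds: most near 12 microseconds, two outliers.
samples = [12e-6, 12.1e-6, 11.9e-6, 40e-6, 12e-6, 1e-6, 12.2e-6]
bandwidth = estimate_bandwidth(samples)
print(f"estimated bottleneck: {bandwidth / 1e9:.2f} Gb/s")
```

The two outliers (40 and 1 microseconds) do not move the median, so the estimate stays at about 1 Gb/s.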

INTRODUCTION

PROTOCOL DESIGN & IMPLEMENTATION

CONGESTION CONTROL

>> PERFORMANCE EVALUATION

COMPOSABLE UDT

CONCLUSIONS

Performance Characteristics

 Efficiency
 Higher bandwidth utilization, less CPU usage

 Intra-protocol fairness
 Max-min fairness
 Jain's fairness index

 TCP friendliness
 Bulk TCP flow vs Bulk UDT flow
 Short-lived TCP flow (slow start phase) vs Bulk UDT flow

 Stability (oscillations)
 Stability index (standard deviation)
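Jain's fairness index for n flows with throughputs x_i is (sum of x_i)^2 / (n * sum of x_i^2). A minimal sketch, with made-up throughput values:

```python
def jain_index(throughputs):
    """Jain's fairness index: 1.0 when all flows get equal throughput."""
    n = len(throughputs)
    total = sum(throughputs)
    return total * total / (n * sum(t * t for t in throughputs))


equal = jain_index([300, 300, 300])
skewed = jain_index([800, 100, 100])
print(f"equal shares:  {equal:.3f}")
print(f"skewed shares: {skewed:.3f}")
```

The index falls toward 1/n as a single flow takes a larger share of the bandwidth.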

Evaluation Strategies

 Simulations vs. experiments


 NS2 network simulator, NCDM teraflow testbed

 Setup
 Network topology, bandwidth, distance, queuing, Link error rate, etc.
 Concurrency (number of parallel flows)

 Comparison (against TCP)

 Real world applications


 SDSS data transfer, high performance mining of streaming data, etc.

 Independent evaluation
 SLAC, JGN2, UvA, Unipmn (Italy), etc.

Efficiency, Fairness, & Stability

[Figure: testbed. Four hosts at StarLight, Chicago (206.220.241.13 to 206.220.241.16) and four hosts at SARA, Amsterdam (145.146.98.78 to 145.146.98.81), connected with 1 Gb/s bandwidth and 106 ms RTT. A timeline shows Flows 1 to 4 over 0 to 700 seconds.]

Efficiency, Fairness, & Stability

[Figure: throughput (Mbits/s) of each flow versus time (s), 0 to 700 s]

Interval (s)   0-100  100-200  200-300  300-400  400-500  500-600  600-700
Flow 1         902    466      313      215      301      452      885
Flow 2                446      308      216      310      452
Flow 3                         302      202      307
Flow 4                                  197
Efficiency     902    912      923      830      918      904      885
Fairness       1      0.999    0.999    0.998    0.999    1        1
Stability      0.11   0.11     0.08     0.16     0.04     0.02     0.04

TCP Friendliness

[Figure: TCP throughput (Mb/s) versus number of UDT flows (0 to 10)]

 500 TCP flows of 1 MB each vs. 0 to 10 bulk UDT flows
 1 Gb/s between Chicago and Amsterdam

INTRODUCTION

PROTOCOL DESIGN & IMPLEMENTATION

CONGESTION CONTROL

PERFORMANCE EVALUATION

>> COMPOSABLE UDT

CONCLUSIONS

Composable UDT - Objectives

 Easy implementation and deployment of new control algorithms

 Easy evaluation of new control algorithms

 Application awareness support and dynamic configuration

Composable UDT - Methodologies

 Packet sending control


 Window-based, rate-based, and hybrid

 Control event handling


 onACK, onLoss, onTimeout, onPktSent, onPktRecved, etc.

 Protocol parameters access


 RTT, loss rate, RTO, etc.

 Packet extension
 User-defined control packets
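The event-handler approach can be illustrated with a small Python analogue. The handler names follow the slide, but the classes below are hypothetical and are not the UDT C++ interface: a base class exposes the control variables and empty handlers, and a derived class overrides a few handlers to get a window-based AIMD scheme.

```python
class CongestionControl:
    """Hypothetical base class; handler names follow the slide."""

    def __init__(self):
        self.cwnd = 2.0           # congestion window, in packets
        self.pkt_interval = 0.0   # seconds between packets (0 = no pacing)

    def onACK(self, ack_seq):
        pass

    def onLoss(self, loss_list):
        pass

    def onTimeout(self):
        pass


class RenoLike(CongestionControl):
    """Window-based AIMD in the spirit of TCP congestion avoidance."""

    def onACK(self, ack_seq):
        self.cwnd += 1.0 / self.cwnd   # about one packet per RTT

    def onLoss(self, loss_list):
        self.cwnd = max(self.cwnd / 2, 1.0)

    def onTimeout(self):
        self.cwnd = 1.0


cc = RenoLike()
for seq in range(4):
    cc.onACK(seq)
after_acks = cc.cwnd
cc.onLoss([4])
```

A rate-based scheme would instead adjust `pkt_interval` in the same handlers, and a hybrid scheme would adjust both.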

Composable UDT - Evaluation

 Simplicity
 Can it be easily used?

 Expressiveness
 Can it be used to implement most control protocols?

 Similarity
 Can Composable UDT based implementations reproduce the
performance of their native implementations?

 Overhead
 Will the overhead added by Composable UDT be too large?

Simplicity & Expressiveness

 Eight event handlers, four protocol control functions, and one
performance monitoring function.

 Support for a large variety of protocols
 Reliable UDP blast
 TCP and its variants (both loss and delay based)
 Group transport protocols

Simplicity & Expressiveness

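[Figure: class hierarchy rooted at CCC, the base congestion control class. Each entry gives the class, the protocol it implements, and the numeric annotation shown on the slide.]

Row 1: CCC (Base Congestion Control Class)
Row 2: CTCP (TCP NewReno): 28
       CGTP (Group Transport Protocol)
       CUDPBlast (Reliable UDP Blast)
Row 3: CVegas (TCP Vegas): 73 / +132-6
       CScalable (Scalable TCP): 11 / +192-29
       CHS (HighSpeed TCP): 8 / +27-1
       CBiC (BiC TCP): 11 / +192-29
       CWestwood (TCP Westwood): 27 / +145-2
Row 4: CFAST (FAST TCP): 37 / +351-2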

Similarity and Overhead
 Similarity
 How closely Composable UDT based implementations reproduce their native
implementations

 CTCP vs. Linux TCP

Flow #   Throughput      Fairness        Stability
         TCP    CTCP     TCP    CTCP     TCP    CTCP
1        112    122      1      1        0.517  0.415
2        191    208      0.997  0.999    0.476  0.426
4        322    323      0.949  0.999    0.484  0.492
8        378    422      0.971  0.999    0.633  0.550
16       672    642      0.958  0.985    0.502  0.482
32       877    799      0.988  0.997    0.491  0.470
64       921    716      0.994  0.996    0.569  0.529

 CPU usage
 Sender: CTCP uses about 100% more CPU than Linux TCP
 Receiver: CTCP uses about 20% more CPU than Linux TCP

INTRODUCTION

PROTOCOL DESIGN & IMPLEMENTATION

CONGESTION CONTROL

PERFORMANCE EVALUATION

COMPOSABLE UDT

>> CONCLUSIONS

Contributions

 A high performance data transport protocol and associated
implementation
 The UDT protocol
 Open source UDT library (udt.sourceforge.net)
 Users include research institutes and industry

 An efficient and fair congestion control algorithm


 DAIMD & the UDT control algorithm
 Packet loss handling techniques
 Using bandwidth estimation technique in congestion control

 A configurable transport protocol framework


 Composable UDT

Selected Publications

 Papers on the UDT Protocol


 UDT: UDP-based Data Transfer for High-Speed Wide Area Networks,
Yunhong Gu and Robert L. Grossman, Computer Networks (Elsevier). Volume
51, Issue 7. May 2007.
 Supporting Configurable Congestion Control in Data Transport Services,
Yunhong Gu and Robert L. Grossman, SC 2005, Nov 12 - 18, Seattle, WA.
 Experiences in Design and Implementation of a High Performance Transport
Protocol, Yunhong Gu, Xinwei Hong, and Robert L. Grossman, SC 2004, Nov
6 - 12, Pittsburgh, PA.
 An Analysis of AIMD Algorithms with Decreasing Increases, Yunhong Gu,
Xinwei Hong and Robert L. Grossman, First Workshop on Networks for Grid
Applications (Gridnets 2004), Oct. 29, San Jose, CA.
 Internet Draft
 UDT: A Transport Protocol for Data Intensive Applications, Yunhong Gu and
Robert L. Grossman, draft-gg-udt-02.txt.

Commercialization

 Baidu Hi Messenger
 Maidsafe
 Movie2Me by Broadcasting Center Europe
 NiFTy TV
 Sterling File Accelerator (SFA) by Sterling Commerce
 Tideworks
 PowerFolder
 GridFTP
 etc.

Support and Services

 Online Forum
 https://sourceforge.net/forum/?group_id=115059
 Support both English and Chinese languages

 Consulting service
 Dedicated service to customers’ projects
 Provided by UDT developers

Achievements

 SC 2002 Bandwidth Challenge “Best Use of Emerging Network
Infrastructure” Award

 SC 2003 Bandwidth Challenge “Application Foundation” Award

 SC 2004 Bandwidth Challenge “Best Replacement for FedEx / UDP
Fairness” Award

 SC 2006 Bandwidth Challenge Winner

 SC 2008 Bandwidth Challenge Winner

Vision

 Short-term
 A practical solution for distributed data intensive applications in high
BDP environments

 Long-term
 Evolve with new technologies (open source & open standard)
 More functionalities and support for more use scenarios
 Network research platform (e.g., fast prototyping and evaluation of new
control algorithms)

The End

Thank You!

Yunhong Gu, October 10, 2005


Updated on August 8, 2009
