
Introduction to Algorithms

6.046J/18.401J/SMA5503
Lecture 1
Prof. Charles E. Leiserson
Day 1 Introduction to Algorithms L1.2
Welcome to Introduction to
Algorithms, Fall 2001
Handouts
1. Course Information
2. Calendar
3. Registration (MIT students only)
4. References
5. Objectives and Outcomes
6. Diagnostic Survey
Day 1 Introduction to Algorithms L1.3
Course information
1. Staff
2. Distance learning
3. Prerequisites
4. Lectures
5. Recitations
6. Handouts
7. Textbook (CLRS)
8. Website
9. Extra help
10.Registration (MIT only)
11.Problem sets
12.Describing algorithms
13.Grading policy
14.Collaboration policy
Course information handout
Day 1 Introduction to Algorithms L1.4
Analysis of algorithms
The theoretical study of computer-program
performance and resource usage.
What's more important than performance?
modularity
correctness
maintainability
functionality
robustness
user-friendliness
programmer time
simplicity
extensibility
reliability
Day 1 Introduction to Algorithms L1.5
Why study algorithms and
performance?
Algorithms help us to understand scalability.
Performance often draws the line between what
is feasible and what is impossible.
Algorithmic mathematics provides a language
for talking about program behavior.
The lessons of program performance generalize
to other computing resources.
Speed is fun!
Day 1 Introduction to Algorithms L1.6
The problem of sorting
Input: sequence ⟨a1, a2, …, an⟩ of numbers.
Output: permutation ⟨a'1, a'2, …, a'n⟩ such that a'1 ≤ a'2 ≤ … ≤ a'n.
Example:
Input: 8 2 4 9 3 6
Output: 2 3 4 6 8 9
Day 1 Introduction to Algorithms L1.7
Insertion sort
INSERTION-SORT (A, n)    ⊳ sorts A[1 . . n]
  for j ← 2 to n
    do key ← A[j]
       i ← j − 1
       while i > 0 and A[i] > key
         do A[i+1] ← A[i]
            i ← i − 1
       A[i+1] ← key
(In the accompanying picture, A[1 . . j−1] is the sorted prefix, key holds A[j], and i scans left from j−1.)
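A direct Python transcription of this pseudocode (0-indexed rather than 1-indexed; a sketch for illustration, not the lecture's own code):

```python
def insertion_sort(a):
    """Sort list a in place by inserting each element into the sorted prefix."""
    for j in range(1, len(a)):          # a[0 .. j-1] is already sorted
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:    # shift larger elements one slot right
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key                  # drop key into its place
    return a

assert insertion_sort([8, 2, 4, 9, 3, 6]) == [2, 3, 4, 6, 8, 9]
```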
Day 1 Introduction to Algorithms L1.8
Example of insertion sort
8 2 4 9 3 6
2 8 4 9 3 6
2 4 8 9 3 6
2 4 8 9 3 6
2 3 4 8 9 6
2 3 4 6 8 9    done
(The slides animate these steps, one insertion per frame.)
Day 1 Introduction to Algorithms L1.19
Running time
The running time depends on the input: an
already sorted sequence is easier to sort.
Parameterize the running time by the size of
the input, since short sequences are easier to
sort than long ones.
Generally, we seek upper bounds on the
running time, because everybody likes a
guarantee.
Day 1 Introduction to Algorithms L1.20
Kinds of analyses
Worst-case: (usually)
T(n) = maximum time of algorithm
on any input of size n.
Average-case: (sometimes)
T(n) = expected time of algorithm
over all inputs of size n.
Need assumption of statistical
distribution of inputs.
Best-case: (bogus)
Cheat with a slow algorithm that
works fast on some input.
Day 1 Introduction to Algorithms L1.21
Machine-independent time
What is insertion sort's worst-case time?
It depends on the speed of our computer:
relative speed (on the same machine),
absolute speed (on different machines).
BIG IDEA: Asymptotic Analysis
Ignore machine-dependent constants.
Look at growth of T(n) as n → ∞.
Day 1 Introduction to Algorithms L1.22
Θ-notation
Math:
Θ(g(n)) = { f(n) : there exist positive constants c1, c2, and n0 such that
            0 ≤ c1 g(n) ≤ f(n) ≤ c2 g(n) for all n ≥ n0 }
Engineering:
Drop low-order terms; ignore leading constants.
Example: 3n³ + 90n² − 5n + 6046 = Θ(n³)
Day 1 Introduction to Algorithms L1.23
Asymptotic performance
When n gets large enough, a Θ(n²) algorithm always beats a Θ(n³) algorithm.
(The slide plots T(n) against n, with the crossover at n0.)
We shouldn't ignore asymptotically slower algorithms, however.
Real-world design situations often call for a careful balancing of engineering objectives.
Asymptotic analysis is a useful tool to help to structure our thinking.
Day 1 Introduction to Algorithms L1.24
Insertion sort analysis
Worst case: Input reverse sorted.
  T(n) = Σ_{j=2..n} Θ(j) = Θ(n²)      [arithmetic series]
Average case: All permutations equally likely.
  T(n) = Σ_{j=2..n} Θ(j/2) = Θ(n²)
Is insertion sort a fast sorting algorithm?
  Moderately so, for small n.
  Not at all, for large n.
Day 1 Introduction to Algorithms L1.25
Merge sort
MERGE-SORT A[1 . . n]
1. If n = 1, done.
2. Recursively sort A[ 1 . . ⌈n/2⌉ ] and A[ ⌈n/2⌉+1 . . n ].
3. Merge the 2 sorted lists.
Key subroutine: MERGE
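For concreteness, here is one way to write MERGE-SORT and its MERGE subroutine in Python (0-indexed, allocating new lists rather than merging in place; a sketch only):

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list in linear time."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])        # one of these two tails is empty
    out.extend(right[j:])
    return out

def merge_sort(a):
    """Recursively sort the two halves, then merge them."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))

assert merge_sort([8, 2, 4, 9, 3, 6]) == [2, 3, 4, 6, 8, 9]
```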
Day 1 Introduction to Algorithms L1.26
Merging two sorted arrays
Example: merge the sorted arrays ⟨2, 7, 13, 20⟩ and ⟨1, 9, 11, 12⟩.
Repeatedly compare the smallest remaining element of each array and output the smaller:
1, 2, 7, 9, 11, 12, and then the rest of the first array: 13, 20.
(The slides animate this merge, one output element per frame.)
Time = Θ(n) to merge a total of n elements (linear time).
Day 1 Introduction to Algorithms L1.39
Analyzing merge sort
    MERGE-SORT A[1 . . n]                                          T(n)
    1. If n = 1, done.                                             Θ(1)
    2. Recursively sort A[ 1 . . ⌈n/2⌉ ] and A[ ⌈n/2⌉+1 . . n ].   2T(n/2)
    3. Merge the 2 sorted lists.                                   Θ(n)
Abuse/Sloppiness: Should be T(⌈n/2⌉) + T(⌊n/2⌋), but it turns out not to matter asymptotically.
Day 1 Introduction to Algorithms L1.40
Recurrence for merge sort
T(n) = Θ(1)              if n = 1;
       2T(n/2) + Θ(n)    if n > 1.
We shall usually omit stating the base
case when T(n) = (1) for sufficiently
small n, but only when it has no effect on
the asymptotic solution to the recurrence.
CLRS and Lecture 2 provide several ways
to find a good upper bound on T(n).
Day 1 Introduction to Algorithms L1.41
Recursion tree
Solve T(n) = 2T(n/2) + cn, where c > 0 is constant.
The root costs cn; its two children cost cn/2 each; their four children cost cn/4 each; and so on, down to Θ(1) leaves.
Height h = lg n.
Each level of the tree sums to cn, and #leaves = n ⇒ the leaf level contributes Θ(n).
Total = cn · lg n + Θ(n) = Θ(n lg n).
(The slides build this tree up one level per frame.)
Conclusions
Θ(n lg n) grows more slowly than Θ(n²).
Therefore, merge sort asymptotically beats insertion sort in the worst case.
In practice, merge sort beats insertion sort for n > 30 or so.
Go test it out for yourself!
Introduction to Algorithms
6.046J/18.401J/SMA5503
Lecture 2
Prof. Erik Demaine
Day 3 Introduction to Algorithms L2.2
Solving recurrences
The analysis of merge sort from
Lecture 1 required us to solve a
recurrence.
Recurrences are like solving integrals,
differential equations, etc.
o Learn a few tricks.
Lecture 3: Applications of recurrences.
Day 3 Introduction to Algorithms L2.3
Substitution method
1. Guess the form of the solution.
2. Verify by induction.
3. Solve for constants.
The most general method:
Example: T(n) = 4T(n/2) + n
[Assume that T(1) = Θ(1).]
Guess O(n³). (Prove O and Ω separately.)
Assume that T(k) ≤ ck³ for k < n.
Prove T(n) ≤ cn³ by induction.
Day 3 Introduction to Algorithms L2.4
Example of substitution
T(n) = 4T(n/2) + n
     ≤ 4c(n/2)³ + n
     = (c/2)n³ + n
     = cn³ − ((c/2)n³ − n)      ← desired − residual
     ≤ cn³
whenever (c/2)n³ − n ≥ 0, for example, if c ≥ 2 and n ≥ 1.
Day 3 Introduction to Algorithms L2.5
Example (continued)
We must also handle the initial conditions,
that is, ground the induction with base
cases.
Base: T(n) = Θ(1) for all n < n0, where n0 is a suitable constant.
For 1 ≤ n < n0, we have Θ(1) ≤ cn³, if we pick c big enough.
This bound is not tight!
Day 3 Introduction to Algorithms L2.6
A tighter upper bound?
We shall prove that T(n) = O(n²).
Assume that T(k) ≤ ck² for k < n:
T(n) = 4T(n/2) + n
     ≤ 4c(n/2)² + n
     = cn² + n
     = O(n²)      Wrong! We must prove the I.H.
     = cn² − (−n)      [ desired − residual ]
     ≤ cn²      for no choice of c > 0. Lose!
Day 3 Introduction to Algorithms L2.7
A tighter upper bound!
IDEA: Strengthen the inductive hypothesis.
Subtract a low-order term.
Inductive hypothesis: T(k) ≤ c1k² − c2k for k < n.
T(n) = 4T(n/2) + n
     ≤ 4(c1(n/2)² − c2(n/2)) + n
     = c1n² − 2c2n + n
     = c1n² − c2n − (c2n − n)
     ≤ c1n² − c2n      if c2 ≥ 1.
Pick c1 big enough to handle the initial conditions.
Day 3 Introduction to Algorithms L2.8
Recursion-tree method
A recursion tree models the costs (time) of a
recursive execution of an algorithm.
The recursion tree method is good for
generating guesses for the substitution method.
The recursion-tree method can be unreliable,
just like any method that uses ellipses ().
The recursion-tree method promotes intuition,
however.
Day 3 Introduction to Algorithms L2.9
Example of recursion tree
Solve T(n) = T(n/4) + T(n/2) + n²:
The root costs n²; its children cost (n/4)² and (n/2)²; their children cost (n/16)², (n/8)², (n/8)², (n/4)²; and so on, down to Θ(1) leaves.
The levels sum to n², (5/16)n², (25/256)n², …
Total = n² (1 + 5/16 + (5/16)² + (5/16)³ + ⋯)      [geometric series]
      = Θ(n²).
(The slides build this tree up one level per frame.)
Day 3 Introduction to Algorithms L2.18
The master method
The master method applies to recurrences of
the form
T(n) = aT(n/b) + f (n) ,
where a 1, b > 1, and f is asymptotically
positive.
Day 3 Introduction to Algorithms L2.19
Three common cases
Compare f(n) with n^(log_b a):
1. f(n) = O(n^(log_b a − ε)) for some constant ε > 0.
   f(n) grows polynomially slower than n^(log_b a) (by an n^ε factor).
   Solution: T(n) = Θ(n^(log_b a)).
2. f(n) = Θ(n^(log_b a) lg^k n) for some constant k ≥ 0.
   f(n) and n^(log_b a) grow at similar rates.
   Solution: T(n) = Θ(n^(log_b a) lg^(k+1) n).
Day 3 Introduction to Algorithms L2.20
Three common cases (cont.)
Compare f(n) with n^(log_b a):
3. f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0.
   f(n) grows polynomially faster than n^(log_b a) (by an n^ε factor),
   and f(n) satisfies the regularity condition that a f(n/b) ≤ c f(n) for some constant c < 1.
   Solution: T(n) = Θ(f(n)).
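The three cases can be checked mechanically when f(n) is a polynomial times a power of lg n. The Python sketch below is an illustration (not part of the lecture); it handles only f(n) = Θ(n^d lg^k n) and ignores the regularity condition, which that form always satisfies:

```python
import math

def master_method(a, b, d, k=0):
    """Classify T(n) = a*T(n/b) + Theta(n^d * lg^k n) by the master method."""
    crit = math.log(a, b)                     # critical exponent log_b a
    if d < crit:                              # Case 1: f grows polynomially slower
        return f"Theta(n^{crit:.3g})"
    if d == crit:                             # Case 2: same polynomial growth
        return f"Theta(n^{crit:.3g} * lg^{k + 1} n)"
    # Case 3: f grows polynomially faster
    return f"Theta(n^{d} * lg^{k} n)" if k else f"Theta(n^{d})"

print(master_method(4, 2, 1))    # Theta(n^2)        -- Case 1
print(master_method(4, 2, 2))    # Theta(n^2 lg n)   -- Case 2
print(master_method(4, 2, 3))    # Theta(n^3)        -- Case 3
```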
Day 3 Introduction to Algorithms L2.21
Examples
Ex. T(n) = 4T(n/2) + n
  a = 4, b = 2 ⇒ n^(log_b a) = n²; f(n) = n.
  CASE 1: f(n) = O(n^(2−ε)) for ε = 1.
  ∴ T(n) = Θ(n²).
Ex. T(n) = 4T(n/2) + n²
  a = 4, b = 2 ⇒ n^(log_b a) = n²; f(n) = n².
  CASE 2: f(n) = Θ(n² lg⁰ n), that is, k = 0.
  ∴ T(n) = Θ(n² lg n).
Day 3 Introduction to Algorithms L2.22
Examples
Ex. T(n) = 4T(n/2) + n³
  a = 4, b = 2 ⇒ n^(log_b a) = n²; f(n) = n³.
  CASE 3: f(n) = Ω(n^(2+ε)) for ε = 1, and 4(n/2)³ ≤ cn³ (reg. cond.) for c = 1/2.
  ∴ T(n) = Θ(n³).
Ex. T(n) = 4T(n/2) + n²/lg n
  a = 4, b = 2 ⇒ n^(log_b a) = n²; f(n) = n²/lg n.
  Master method does not apply. In particular, for every constant ε > 0, we have n^ε = ω(lg n).
Day 3 Introduction to Algorithms L2.23
General method (Akra–Bazzi)
T(n) = Σ_{i=1..k} a_i T(n/b_i) + f(n).
Let p be the unique solution to
  Σ_{i=1..k} a_i (1/b_i)^p = 1.
Then, the answers are the same as for the master method, but with n^p instead of n^(log_b a).
(Akra and Bazzi also prove an even more general result.)
Day 3 Introduction to Algorithms L2.24
Idea of master theorem
Recursion tree: the root costs f(n); its a children each cost f(n/b), for a level total of a f(n/b); the next level totals a² f(n/b²); and so on, down to Θ(1) leaves.
Height h = log_b n, and #leaves = a^h = a^(log_b n) = n^(log_b a), so the leaf level costs n^(log_b a) · Θ(1) = Θ(n^(log_b a)).
CASE 1: The weight increases geometrically from the root to the leaves. The leaves hold a constant fraction of the total weight. ⇒ Θ(n^(log_b a))
CASE 2: (k = 0) The weight is approximately the same on each of the log_b n levels. ⇒ Θ(n^(log_b a) lg n)
CASE 3: The weight decreases geometrically from the root to the leaves. The root holds a constant fraction of the total weight. ⇒ Θ(f(n))
Day 3 Introduction to Algorithms L2.28
Conclusion
Next time: applying the master method.
For proof of master theorem, see CLRS.
Day 3 Introduction to Algorithms L2.29
Appendix: geometric series
1 + x + x² + ⋯ = 1/(1 − x)      for |x| < 1
1 + x + x² + ⋯ + x^n = (1 − x^(n+1))/(1 − x)      for x ≠ 1
Introduction to Algorithms
6.046J/18.401J/SMA5503
Lecture 3
Prof. Erik Demaine
Day 4 Introduction to Algorithms L3.2
The divide-and-conquer
design paradigm
1. Divide the problem (instance)
into subproblems.
2. Conquer the subproblems by
solving them recursively.
3. Combine subproblem solutions.
Day 4 Introduction to Algorithms L3.3
Example: merge sort
1. Divide: Trivial.
2. Conquer: Recursively sort 2 subarrays.
3. Combine: Linear-time merge.
T(n) = 2T(n/2) + O(n)
# subproblems
subproblem size
work dividing
and combining
Day 4 Introduction to Algorithms L3.4
Master theorem (reprise)
T(n) = aT(n/b) + f (n)
CASE 1: f(n) = O(n^(log_b a − ε)) ⇒ T(n) = Θ(n^(log_b a)).
CASE 2: f(n) = Θ(n^(log_b a) lg^k n) ⇒ T(n) = Θ(n^(log_b a) lg^(k+1) n).
CASE 3: f(n) = Ω(n^(log_b a + ε)) and a f(n/b) ≤ c f(n) ⇒ T(n) = Θ(f(n)).
Merge sort: a = 2, b = 2 ⇒ n^(log_b a) = n^(log_2 2) = n
⇒ CASE 2 (k = 0) ⇒ T(n) = Θ(n lg n).
Day 4 Introduction to Algorithms L3.5
Binary search
Example: Find 9
3 5 7 8 9 12 15
Find an element in a sorted array:
1. Divide: Check middle element.
2. Conquer: Recursively search 1 subarray.
3. Combine: Trivial.
Day 4 Introduction to Algorithms L3.11
Recurrence for binary search
T(n) = 1 T(n/2) + Θ(1)
(# subproblems = 1, subproblem size = n/2, work dividing and combining = Θ(1))
n^(log_b a) = n^(log_2 1) = n⁰ = 1 ⇒ CASE 2 (k = 0)
⇒ T(n) = Θ(lg n).
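A Python version of this search (0-indexed and iterative rather than recursive; a sketch):

```python
def binary_search(a, x):
    """Return an index of x in sorted list a, or -1 if x is absent."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2         # divide: check the middle element
        if a[mid] == x:
            return mid
        if a[mid] < x:               # conquer: continue in one subarray
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

assert binary_search([3, 5, 7, 8, 9, 12, 15], 9) == 4
```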
Day 4 Introduction to Algorithms L3.12
Powering a number
Problem: Compute a^n, where n ∈ ℕ.
Naive algorithm: Θ(n).
Divide-and-conquer algorithm:
  a^n = a^(n/2) · a^(n/2)                 if n is even;
        a^((n−1)/2) · a^((n−1)/2) · a     if n is odd.
T(n) = T(n/2) + Θ(1) ⇒ T(n) = Θ(lg n).
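A recursive Python sketch of this divide-and-conquer powering:

```python
def power(a, n):
    """Compute a**n with Theta(lg n) multiplications."""
    if n == 0:
        return 1
    half = power(a, n // 2)          # one recursive call, reused twice
    if n % 2 == 0:
        return half * half
    return half * half * a

assert power(3, 13) == 3 ** 13
```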
Day 4 Introduction to Algorithms L3.13
Fibonacci numbers
Recursive definition:
  F_n = 0                    if n = 0;
        1                    if n = 1;
        F_(n−1) + F_(n−2)    if n ≥ 2.
0 1 1 2 3 5 8 13 21 34 …
Naive recursive algorithm: Ω(φ^n) (exponential time),
where φ = (1 + √5)/2 is the golden ratio.
Day 4 Introduction to Algorithms L3.14
Computing Fibonacci
numbers
Naive recursive squaring: F_n = φ^n / √5 rounded to the nearest integer.
  Recursive squaring: Θ(lg n) time.
  This method is unreliable, since floating-point arithmetic is prone to round-off errors.
Bottom-up:
  Compute F_0, F_1, F_2, …, F_n in order, forming each number by summing the two previous.
  Running time: Θ(n).
Day 4 Introduction to Algorithms L3.15
Recursive squaring
Theorem:  [ F_(n+1)  F_n ; F_n  F_(n−1) ] = [ 1 1 ; 1 0 ]^n.
Algorithm: Recursive squaring.
Time = Θ(lg n).
Proof of theorem. (Induction on n.)
Base (n = 1):  [ F_2  F_1 ; F_1  F_0 ] = [ 1 1 ; 1 0 ]^1.
Day 4 Introduction to Algorithms L3.16
Recursive squaring
Inductive step (n ≥ 2):
  [ F_(n+1)  F_n ; F_n  F_(n−1) ] = [ F_n  F_(n−1) ; F_(n−1)  F_(n−2) ] · [ 1 1 ; 1 0 ]
                                  = [ 1 1 ; 1 0 ]^(n−1) · [ 1 1 ; 1 0 ]
                                  = [ 1 1 ; 1 0 ]^n.  ∎
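Combining the theorem with divide-and-conquer powering gives a Θ(lg n) Fibonacci algorithm using only integer arithmetic. A Python sketch (the 2×2 matrices are plain tuples here, and the squaring is done iteratively):

```python
def mat_mul(x, y):
    """Multiply two 2x2 matrices given as ((a, b), (c, d))."""
    return ((x[0][0]*y[0][0] + x[0][1]*y[1][0], x[0][0]*y[0][1] + x[0][1]*y[1][1]),
            (x[1][0]*y[0][0] + x[1][1]*y[1][0], x[1][0]*y[0][1] + x[1][1]*y[1][1]))

def fib(n):
    """Return F_n by repeated squaring of [[1,1],[1,0]]."""
    if n == 0:
        return 0
    m, result = ((1, 1), (1, 0)), ((1, 0), (0, 1))   # matrix and identity accumulator
    while n:
        if n & 1:
            result = mat_mul(result, m)
        m = mat_mul(m, m)                            # square
        n >>= 1
    return result[0][1]                              # the F_n entry of the power

assert [fib(i) for i in range(10)] == [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```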
Day 4 Introduction to Algorithms L3.17
Matrix multiplication

Input:  A = [a_ij], B = [b_ij],      i, j = 1, 2, …, n.
Output: C = [c_ij] = A · B.
c_ij = Σ_{k=1..n} a_ik · b_kj
Day 4 Introduction to Algorithms L3.18
Standard algorithm
for i ← 1 to n
  do for j ← 1 to n
       do c_ij ← 0
          for k ← 1 to n
            do c_ij ← c_ij + a_ik · b_kj
Running time = Θ(n³)
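In Python (0-indexed lists of lists; a direct, unoptimized transcription for illustration):

```python
def matrix_multiply(A, B):
    """Multiply n x n matrices A and B with the standard triple loop: Theta(n^3)."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C

assert matrix_multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]]) == [[19, 22], [43, 50]]
```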
Day 4 Introduction to Algorithms L3.19
Divide-and-conquer algorithm
IDEA: An n×n matrix = a 2×2 matrix of (n/2)×(n/2) submatrices:
  [ r s ; t u ] = [ a b ; c d ] · [ e f ; g h ]        C = A · B
  r = ae + bg
  s = af + bh
  t = ce + dg
  u = cf + dh
8 mults of (n/2)×(n/2) submatrices
4 adds of (n/2)×(n/2) submatrices
Day 4 Introduction to Algorithms L3.20
Analysis of D&C algorithm
T(n) = 8 T(n/2) + Θ(n²)
(# submatrices = 8, submatrix size = n/2, work adding submatrices = Θ(n²))
n^(log_b a) = n^(log_2 8) = n³ ⇒ CASE 1 ⇒ T(n) = Θ(n³).
No better than the ordinary algorithm.
Day 4 Introduction to Algorithms L3.21
Strassen's idea
Multiply 2×2 matrices with only 7 recursive mults.
  P1 = a · (f − h)
  P2 = (a + b) · h
  P3 = (c + d) · e
  P4 = d · (g − e)
  P5 = (a + d) · (e + h)
  P6 = (b − d) · (g + h)
  P7 = (a − c) · (e + f)
  r = P5 + P4 − P2 + P6
  s = P1 + P2
  t = P3 + P4
  u = P5 + P1 − P3 − P7
7 mults, 18 adds/subs.
Note: No reliance on commutativity of mult!
Check of r:
  r = P5 + P4 − P2 + P6
    = (a + d)(e + h) + d(g − e) − (a + b)h + (b − d)(g + h)
    = ae + ah + de + dh + dg − de − ah − bh + bg + bh − dg − dh
    = ae + bg
Day 4 Introduction to Algorithms L3.23
Strassen's algorithm
1. Divide: Partition A and B into (n/2)×(n/2) submatrices. Form terms to be multiplied using + and −.
2. Conquer: Perform 7 multiplications of (n/2)×(n/2) submatrices recursively.
3. Combine: Form C using + and − on (n/2)×(n/2) submatrices.
T(n) = 7 T(n/2) + Θ(n²)
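A recursive Python sketch of this algorithm (it assumes n is a power of 2 and is meant for illustration only, not as tuned code):

```python
def add(A, B, sign=1):
    """Entrywise A + sign*B for square matrices given as lists of lists."""
    return [[a + sign * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def strassen(A, B):
    """Multiply n x n matrices (n a power of 2) with 7 recursive multiplications."""
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    split = lambda M: ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
                       [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])
    a, b, c, d = split(A)
    e, f, g, hh = split(B)
    P1 = strassen(a, add(f, hh, -1))
    P2 = strassen(add(a, b), hh)
    P3 = strassen(add(c, d), e)
    P4 = strassen(d, add(g, e, -1))
    P5 = strassen(add(a, d), add(e, hh))
    P6 = strassen(add(b, d, -1), add(g, hh))
    P7 = strassen(add(a, c, -1), add(e, f))
    r = add(add(P5, P4), add(P6, P2, -1))     # r = P5 + P4 - P2 + P6
    s = add(P1, P2)
    t = add(P3, P4)
    u = add(add(P5, P1), add(P3, P7), -1)     # u = P5 + P1 - P3 - P7
    top = [rr + ss for rr, ss in zip(r, s)]   # assemble C = [[r, s], [t, u]]
    bot = [tt + uu for tt, uu in zip(t, u)]
    return top + bot

assert strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]]) == [[19, 22], [43, 50]]
```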
Day 4 Introduction to Algorithms L3.24
Analysis of Strassen
T(n) = 7 T(n/2) + Θ(n²)
n^(log_b a) = n^(log_2 7) ≈ n^2.81 ⇒ CASE 1 ⇒ T(n) = Θ(n^(lg 7)).
Best to date (of theoretical interest only): Θ(n^2.376…).
The number 2.81 may not seem much smaller than 3, but because the difference is in the exponent, the impact on running time is significant. In fact, Strassen's algorithm beats the ordinary algorithm on today's machines for n ≥ 30 or so.
Day 4 Introduction to Algorithms L3.25
VLSI layout
Problem: Embed a complete binary tree
with n leaves in a grid using minimal area.
H(n) = H(n/2) + Θ(1) = Θ(lg n)
W(n) = 2 W(n/2) + Θ(1) = Θ(n)
Area = Θ(n lg n)
(The slide shows the naive embedding with height H(n) and width W(n).)
Day 4 Introduction to Algorithms L3.26
H-tree embedding
L(n) = 2 L(n/4) + Θ(1) = Θ(√n)
Area = Θ(n)
(The slide shows the H-tree: a layout of side L(n) built from four copies of side L(n/4) plus Θ(1) wiring.)
Day 4 Introduction to Algorithms L3.27
Conclusion
Divide and conquer is just one of several
powerful techniques for algorithm design.
Divide-and-conquer algorithms can be
analyzed using recurrences and the master
method (so practice this math).
Can lead to more efficient algorithms
Introduction to Algorithms
6.046J/18.401J/SMA5503
Lecture 4
Prof. Charles E. Leiserson
Day 6 Introduction to Algorithms L4.2
Quicksort
Proposed by C.A.R. Hoare in 1962.
Divide-and-conquer algorithm.
Sorts in place (like insertion sort, but not
like merge sort).
Very practical (with tuning).
Day 6 Introduction to Algorithms L4.3
Divide and conquer
Quicksort an n-element array:
1. Divide: Partition the array into two subarrays around a pivot x such that
   elements in the lower subarray ≤ x ≤ elements in the upper subarray.
2. Conquer: Recursively sort the two subarrays.
3. Combine: Trivial.
Key: Linear-time partitioning subroutine.
Day 6 Introduction to Algorithms L4.4
Partitioning subroutine
PARTITION(A, p, q)      ⊳ A[p . . q]
  x ← A[p]              ⊳ pivot = A[p]
  i ← p
  for j ← p + 1 to q
    do if A[j] ≤ x
         then i ← i + 1
              exchange A[i] ↔ A[j]
  exchange A[p] ↔ A[i]
  return i
Invariant: x = A[p]; A[p+1 . . i] ≤ x; A[i+1 . . j−1] > x; A[j . . q] not yet examined.
Running time = O(n) for n elements.
Day 6 Introduction to Algorithms L4.5
Example of partitioning
Input: 6 10 13 5 8 3 2 11      (pivot x = 6)
As j scans the array, each element ≤ 6 is swapped into the growing lower region:
  6 5 13 10 8 3 2 11
  6 5 3 10 8 13 2 11
  6 5 3 2 8 13 10 11
Finally the pivot is exchanged into place between the two regions:
  2 5 3 6 8 13 10 11
(The slides animate this partition, one comparison per frame; i ends at the pivot's final position.)
Day 6 Introduction to Algorithms L4.17
Pseudocode for quicksort
QUICKSORT(A, p, r)
  if p < r
    then q ← PARTITION(A, p, r)
         QUICKSORT(A, p, q−1)
         QUICKSORT(A, q+1, r)
Initial call: QUICKSORT(A, 1, n)
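In Python (0-indexed, with the pivot taken as the first element as in PARTITION above; a sketch only — practical quicksorts add the tuning mentioned later):

```python
def partition(a, p, q):
    """Partition a[p..q] around pivot a[p]; return the pivot's final index."""
    x, i = a[p], p
    for j in range(p + 1, q + 1):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]
    return i

def quicksort(a, p=0, r=None):
    """Sort a[p..r] in place."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q - 1)
        quicksort(a, q + 1, r)
    return a

assert quicksort([6, 10, 13, 5, 8, 3, 2, 11]) == [2, 3, 5, 6, 8, 10, 11, 13]
```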
Day 6 Introduction to Algorithms L4.18
Analysis of quicksort
Assume all input elements are distinct.
In practice, there are better partitioning
algorithms for when duplicate input
elements may exist.
Let T(n) = worst-case running time on
an array of n elements.
Day 6 Introduction to Algorithms L4.19
Worst-case of quicksort
Input sorted or reverse sorted.
Partition around min or max element.
One side of partition always has no elements.
T(n) = T(0) + T(n−1) + Θ(n)
     = Θ(1) + T(n−1) + Θ(n)
     = T(n−1) + Θ(n)
     = Θ(n²)      (arithmetic series)
Day 6 Introduction to Algorithms L4.20
Worst-case recursion tree
T(n) = T(0) + T(n−1) + cn
The tree is a path: the root costs cn, the next level costs Θ(1) and c(n−1), then Θ(1) and c(n−2), and so on down to Θ(1). Height h = n.
T(n) = Θ(n) + Θ( Σ_{k=1..n} k ) = Θ(n²)
(The slides expand this tree one level per frame.)
Day 6 Introduction to Algorithms L4.27
Best-case analysis
(For intuition only!)
If we're lucky, PARTITION splits the array evenly:
  T(n) = 2T(n/2) + Θ(n)
       = Θ(n lg n)      (same as merge sort)
What if the split is always 1/10 : 9/10?
  T(n) = T(n/10) + T(9n/10) + Θ(n)
What is the solution to this recurrence?
Day 6 Introduction to Algorithms L4.28
Analysis of almost-best case
The recursion tree for T(n) = T(n/10) + T(9n/10) + cn:
the root costs cn; its children cost (1/10)cn and (9/10)cn; the next level costs (1/100)cn, (9/100)cn, (9/100)cn, (81/100)cn; and so on.
Every full level sums to cn. The shallowest leaves appear at depth log_10 n and the deepest at depth log_{10/9} n, with O(n) leaves in all.
  cn log_10 n ≤ T(n) ≤ cn log_{10/9} n + O(n)
⇒ T(n) = Θ(n lg n).  Lucky!
Day 6 Introduction to Algorithms L4.33
More intuition
Suppose we alternate lucky, unlucky, lucky, unlucky, lucky, ….
  L(n) = 2U(n/2) + Θ(n)      lucky
  U(n) = L(n − 1) + Θ(n)      unlucky
Solving:
  L(n) = 2(L(n/2 − 1) + Θ(n/2)) + Θ(n)
       = 2L(n/2 − 1) + Θ(n)
       = Θ(n lg n)      Lucky!
How can we make sure we are usually lucky?
Day 6 Introduction to Algorithms L4.34
Randomized quicksort
IDEA: Partition around a random element.
Running time is independent of the input
order.
No assumptions need to be made about
the input distribution.
No specific input elicits the worst-case
behavior.
The worst case is determined only by the
output of a random-number generator.
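One common way to implement this idea (a sketch under the same assumptions as the quicksort code earlier) is to swap a uniformly random element into the pivot slot before partitioning:

```python
import random

def rand_partition(a, p, q):
    """Swap a uniformly random element of a[p..q] into the pivot slot, then partition."""
    k = random.randint(p, q)           # pivot index chosen uniformly at random
    a[p], a[k] = a[k], a[p]
    x, i = a[p], p
    for j in range(p + 1, q + 1):      # same scan as ordinary PARTITION
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]
    return i
```

Using rand_partition in place of partition in the earlier quicksort sketch gives randomized quicksort.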
Day 6 Introduction to Algorithms L4.35
Randomized quicksort
analysis
Let T(n) = the random variable for the running time of randomized quicksort on an input of size n, assuming random numbers are independent.
For k = 0, 1, …, n−1, define the indicator random variable
  X_k = 1 if PARTITION generates a k : n−k−1 split,
        0 otherwise.
E[X_k] = Pr{X_k = 1} = 1/n, since all splits are equally likely, assuming elements are distinct.
Day 6 Introduction to Algorithms L4.36
Analysis (continued)
T(n) = T(0) + T(n−1) + Θ(n)      if 0 : n−1 split,
       T(1) + T(n−2) + Θ(n)      if 1 : n−2 split,
       ⋮
       T(n−1) + T(0) + Θ(n)      if n−1 : 0 split
     = Σ_{k=0..n−1} X_k ( T(k) + T(n−k−1) + Θ(n) ).
Day 6 Introduction to Algorithms L4.37
Calculating expectation
E[T(n)] = E[ Σ_{k=0..n−1} X_k ( T(k) + T(n−k−1) + Θ(n) ) ]
            (take expectations of both sides)
        = Σ_{k=0..n−1} E[ X_k ( T(k) + T(n−k−1) + Θ(n) ) ]
            (linearity of expectation)
        = Σ_{k=0..n−1} E[X_k] · E[ T(k) + T(n−k−1) + Θ(n) ]
            (independence of X_k from other random choices)
        = (1/n) Σ_{k=0..n−1} E[T(k)] + (1/n) Σ_{k=0..n−1} E[T(n−k−1)] + (1/n) Σ_{k=0..n−1} Θ(n)
            (linearity of expectation; E[X_k] = 1/n)
        = (2/n) Σ_{k=1..n−1} E[T(k)] + Θ(n)
            (summations have identical terms)
Day 6 Introduction to Algorithms L4.42
Hairy recurrence
E[T(n)] = (2/n) Σ_{k=2..n−1} E[T(k)] + Θ(n)
(The k = 0, 1 terms can be absorbed in the Θ(n).)
Prove: E[T(n)] ≤ a n lg n for constant a > 0.
Use fact: Σ_{k=2..n−1} k lg k ≤ (1/2) n² lg n − (1/8) n²      (exercise).
Choose a large enough so that a n lg n dominates E[T(n)] for sufficiently small n ≥ 2.
Day 6 Introduction to Algorithms L4.43
Substitution method
E[T(n)] ≤ (2/n) Σ_{k=2..n−1} a k lg k + Θ(n)
            (substitute inductive hypothesis)
        ≤ (2a/n) ( (1/2) n² lg n − (1/8) n² ) + Θ(n)
            (use fact)
        = a n lg n − ( an/4 − Θ(n) )
            (express as desired − residual)
        ≤ a n lg n,
if a is chosen large enough so that an/4 dominates the Θ(n).
Day 6 Introduction to Algorithms L4.47
Quicksort in practice
Quicksort is a great general-purpose
sorting algorithm.
Quicksort is typically over twice as fast
as merge sort.
Quicksort can benefit substantially from
code tuning.
Quicksort behaves well even with
caching and virtual memory.
Introduction to Algorithms
6.046J/18.401J/SMA5503
Lecture 5
Prof. Erik Demaine
Introduction to Algorithms Day 8 L5.2 2001 by Charles E. Leiserson
How fast can we sort?
All the sorting algorithms we have seen so far
are comparison sorts: only use comparisons to
determine the relative order of elements.
E.g., insertion sort, merge sort, quicksort,
heapsort.
The best worst-case running time that we've seen for comparison sorting is O(n lg n).
Is O(n lg n) the best we can do?
Decision trees can help us answer this question.
Introduction to Algorithms Day 8 L5.3 2001 by Charles E. Leiserson
Decision-tree example
Sort ⟨a1, a2, …, an⟩.
Each internal node is labeled i:j for i, j ∈ {1, 2, …, n}.
The left subtree shows subsequent comparisons if ai ≤ aj.
The right subtree shows subsequent comparisons if ai ≥ aj.
Each leaf contains a permutation ⟨π(1), π(2), …, π(n)⟩ to indicate that the ordering a_π(1) ≤ a_π(2) ≤ ⋯ ≤ a_π(n) has been established.
The tree for n = 3: the root 1:2 has left child 2:3 (with leaf 123 and a subtree 1:3 over leaves 132 and 312) and right child 1:3 (with leaf 213 and a subtree 2:3 over leaves 231 and 321).
Example: sorting ⟨a1, a2, a3⟩ = ⟨9, 4, 6⟩ follows the path 1:2 (9 ≥ 4), then 1:3 (9 ≥ 6), then 2:3 (4 ≤ 6), reaching the leaf 231, i.e. a2 ≤ a3 ≤ a1, or 4 ≤ 6 ≤ 9.
(The slides trace this path through the tree, one comparison per frame.)
Introduction to Algorithms Day 8 L5.8 2001 by Charles E. Leiserson
Decision-tree model
A decision tree can model the execution of
any comparison sort:
One tree for each input size n.
View the algorithm as splitting whenever
it compares two elements.
The tree contains the comparisons along
all possible instruction traces.
The running time of the algorithm = the
length of the path taken.
Worst-case running time = height of tree.
Introduction to Algorithms Day 8 L5.9 2001 by Charles E. Leiserson
Lower bound for decision-
tree sorting
Theorem. Any decision tree that can sort n elements must have height Ω(n lg n).
Proof. The tree must contain ≥ n! leaves, since there are n! possible permutations. A height-h binary tree has ≤ 2^h leaves. Thus, n! ≤ 2^h.
  h ≥ lg(n!)            (lg is mono. increasing)
    ≥ lg((n/e)^n)       (Stirling's formula)
    = n lg n − n lg e
    = Ω(n lg n).
Introduction to Algorithms Day 8 L5.10 2001 by Charles E. Leiserson
Lower bound for comparison
sorting
Corollary. Heapsort and merge sort are
asymptotically optimal comparison sorting
algorithms.
Introduction to Algorithms Day 8 L5.11 2001 by Charles E. Leiserson
Sorting in linear time
Counting sort: No comparisons between elements.
Input: A[1 . . n], where A[j] ∈ {1, 2, …, k}.
Output: B[1 . . n], sorted.
Auxiliary storage: C[1 . . k] .
Introduction to Algorithms Day 8 L5.12 2001 by Charles E. Leiserson
Counting sort
for i ← 1 to k
  do C[i] ← 0
for j ← 1 to n
  do C[A[j]] ← C[A[j]] + 1      ⊳ C[i] = |{key = i}|
for i ← 2 to k
  do C[i] ← C[i] + C[i−1]       ⊳ C[i] = |{key ≤ i}|
for j ← n downto 1
  do B[C[A[j]]] ← A[j]
     C[A[j]] ← C[A[j]] − 1
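A Python transcription (keys assumed to be integers in 1..k; the output list B is 0-indexed, so the count is decremented before placement; a sketch):

```python
def counting_sort(A, k):
    """Stable sort of a list A whose keys lie in {1, ..., k}."""
    n = len(A)
    C = [0] * (k + 1)
    B = [0] * n
    for key in A:
        C[key] += 1                    # C[i] = |{key = i}|
    for i in range(2, k + 1):
        C[i] += C[i - 1]               # C[i] = |{key <= i}|
    for j in range(n - 1, -1, -1):     # right-to-left scan keeps the sort stable
        C[A[j]] -= 1
        B[C[A[j]]] = A[j]
    return B

assert counting_sort([4, 1, 3, 4, 3], 4) == [1, 3, 3, 4, 4]
```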
Introduction to Algorithms Day 8 L5.13 2001 by Charles E. Leiserson
Counting-sort example
A = ⟨4, 1, 3, 4, 3⟩, k = 4; B[1 . . 5] is the output and C[1 . . 4] the auxiliary counts.
Loop 1 initializes the counts:   C = ⟨0, 0, 0, 0⟩.
Loop 2 counts each key:          C = ⟨1, 0, 2, 2⟩.
Loop 3 forms the prefix sums:    C = ⟨1, 1, 3, 5⟩.
Loop 4 scans A from right to left, writing each A[j] into B[C[A[j]]] and then decrementing C[A[j]];
after all five placements, B = ⟨1, 3, 3, 4, 4⟩.
(The slides step through these loops one array update per frame.)
Introduction to Algorithms Day 8 L5.28 2001 by Charles E. Leiserson
Analysis
for i ← 1 to k                          Θ(k)
  do C[i] ← 0
for j ← 1 to n                          Θ(n)
  do C[A[j]] ← C[A[j]] + 1
for i ← 2 to k                          Θ(k)
  do C[i] ← C[i] + C[i−1]
for j ← n downto 1                      Θ(n)
  do B[C[A[j]]] ← A[j]
     C[A[j]] ← C[A[j]] − 1
Total: Θ(n + k)
Introduction to Algorithms Day 8 L5.29 2001 by Charles E. Leiserson
Running time
If k = O(n), then counting sort takes Θ(n) time.
But, sorting takes Ω(n lg n) time!
Where's the fallacy?
Answer:
Comparison sorting takes Ω(n lg n) time.
Counting sort is not a comparison sort.
In fact, not a single comparison between elements occurs!
Introduction to Algorithms Day 8 L5.30 2001 by Charles E. Leiserson
Stable sorting
Counting sort is a stable sort: it preserves
the input order among equal elements.
A = ⟨4, 1, 3, 4, 3⟩  →  B = ⟨1, 3, 3, 4, 4⟩
(Each 3 and each 4 appears in B in the same relative order as in A.)
Exercise: What other sorts have this property?
Introduction to Algorithms Day 8 L5.31 2001 by Charles E. Leiserson
Radix sort
Origin: Herman Hollerith's card-sorting machine for the 1890 U.S. Census. (See Appendix.)
Digit-by-digit sort.
Hollerith's original (bad) idea: sort on most-significant digit first.
Good idea: Sort on least-significant digit first with an auxiliary stable sort.
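A least-significant-digit-first radix sort in Python, using a stable bucket pass on one decimal digit at a time (a sketch; the lecture's analysis uses base-2^r digits instead of decimal ones):

```python
def radix_sort(a, digits=3):
    """LSD radix sort of non-negative integers with at most `digits` decimal digits."""
    for t in range(digits):                          # least-significant digit first
        buckets = [[] for _ in range(10)]
        for x in a:
            buckets[(x // 10 ** t) % 10].append(x)   # append keeps the pass stable
        a = [x for b in buckets for x in b]
    return a

assert radix_sort([329, 457, 657, 839, 436, 720, 355]) == [329, 355, 436, 457, 657, 720, 839]
```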
Introduction to Algorithms Day 8 L5.32 2001 by Charles E. Leiserson
Operation of radix sort
3 2 9
4 5 7
6 5 7
8 3 9
4 3 6
7 2 0
3 5 5
7 2 0
3 5 5
4 3 6
4 5 7
6 5 7
3 2 9
8 3 9
7 2 0
3 2 9
4 3 6
8 3 9
3 5 5
4 5 7
6 5 7
3 2 9
3 5 5
4 3 6
4 5 7
6 5 7
7 2 0
8 3 9
Introduction to Algorithms Day 8 L5.33 2001 by Charles E. Leiserson
Correctness of radix sort
Induction on digit position t.
Assume that the numbers are sorted by their low-order t − 1 digits, and sort on digit t:
Two numbers that differ in digit t are correctly sorted.
Two numbers equal in digit t are put in the same order as in the input ⇒ correct order.
(The slides illustrate this with the example numbers sorted on their tens digit.)
Introduction to Algorithms Day 8 L5.36 2001 by Charles E. Leiserson
Analysis of radix sort
Assume counting sort is the auxiliary stable sort.
Sort n computer words of b bits each.
Each word can be viewed as having b/r base-2^r digits.
Example: 32-bit word (four 8-bit pieces).
  r = 8 ⇒ b/r = 4 passes of counting sort on base-2^8 digits;
  or r = 16 ⇒ b/r = 2 passes of counting sort on base-2^16 digits.
How many passes should we make?
Introduction to Algorithms Day 8 L5.37 2001 by Charles E. Leiserson
Analysis (continued)
Recall: Counting sort takes Θ(n + k) time to sort n numbers in the range from 0 to k − 1.
If each b-bit word is broken into r-bit pieces, each pass of counting sort takes Θ(n + 2^r) time.
Since there are b/r passes, we have
  T(n, b) = Θ( (b/r) (n + 2^r) ).
Choose r to minimize T(n, b):
Increasing r means fewer passes, but as r ≫ lg n, the time grows exponentially.
Introduction to Algorithms Day 8 L5.38 2001 by Charles E. Leiserson
Choosing r
T(n, b) = Θ( (b/r) (n + 2^r) )
Minimize T(n, b) by differentiating and setting to 0.
Or, just observe that we don't want 2^r ≫ n, and there's no harm asymptotically in choosing r as large as possible subject to this constraint.
Choosing r = lg n implies T(n, b) = Θ(bn / lg n).
For numbers in the range from 0 to n^d − 1, we have b = d lg n ⇒ radix sort runs in Θ(dn) time.
Introduction to Algorithms Day 8 L5.39 2001 by Charles E. Leiserson
Conclusions
Example (32-bit numbers):
At most 3 passes when sorting 2000 numbers.
Merge sort and quicksort do at least ⌈lg 2000⌉ = 11 passes.
In practice, radix sort is fast for large inputs, as
well as simple to code and maintain.
Downside: Unlike quicksort, radix sort displays
little locality of reference, and thus a well-tuned
quicksort fares better on modern processors,
which feature steep memory hierarchies.
Introduction to Algorithms Day 8 L5.40 2001 by Charles E. Leiserson
Appendix: Punched-card
technology
Herman Hollerith (1860-1929)
Punched cards
Holleriths tabulating system
Operation of the sorter
Origin of radix sort
Modern IBM card
Web resources on punched-
card technology
Introduction to Algorithms Day 8 L5.41 2001 by Charles E. Leiserson
Herman Hollerith
(1860-1929)
The 1880 U.S. Census took almost
10 years to process.
While a lecturer at MIT, Hollerith
prototyped punched-card technology.
His machines, including a card sorter, allowed
the 1890 census total to be reported in 6 weeks.
He founded the Tabulating Machine Company in
1911, which merged with other companies in 1924
to form International Business Machines.
Introduction to Algorithms Day 8 L5.42 2001 by Charles E. Leiserson
Punched cards
Punched card = data record.
Hole = value.
Algorithm = machine + human operator.
Replica of punch
card from the
1900 U.S. census:
[Howells 2000]
Introduction to Algorithms Day 8 L5.43 2001 by Charles E. Leiserson
Holleriths
tabulating
system
Pantograph card
punch
Hand-press reader
Dial counters
Sorting box
See figure from
[Howells 2000].
Introduction to Algorithms Day 8 L5.44 2001 by Charles E. Leiserson
Operation of the sorter
An operator inserts a card into
the press.
Pins on the press reach through
the punched holes to make
electrical contact with mercury-
filled cups beneath the card.
Whenever a particular digit
value is punched, the lid of the
corresponding sorting bin lifts.
The operator deposits the card
into the bin and closes the lid.
When all cards have been processed, the front panel is opened, and
the cards are collected in order, yielding one pass of a stable sort.
Introduction to Algorithms Day 8 L5.45 2001 by Charles E. Leiserson
Origin of radix sort
Holleriths original 1889 patent alludes to a most-
significant-digit-first radix sort:
The most complicated combinations can readily be
counted with comparatively few counters or relays by first
assorting the cards according to the first items entering
into the combinations, then reassorting each group
according to the second item entering into the combination,
and so on, and finally counting on a few counters the last
item of the combination for each group of cards.
Least-significant-digit-first radix sort seems to be
a folk invention originated by machine operators.
Introduction to Algorithms Day 8 L5.46 2001 by Charles E. Leiserson
Modern IBM card
So, that's why text windows have 80 columns!
One character per column.
See examples on the WWW Virtual Punch-Card Server.
Introduction to Algorithms Day 8 L5.47 2001 by Charles E. Leiserson
Web resources on punched-
card technology
Doug Joness punched card index
Biography of Herman Hollerith
The 1890 U.S. Census
Early history of IBM
Pictures of Holleriths inventions
Holleriths patent application (borrowed
from Gordon Bells CyberMuseum)
Impact of punched cards on U.S. history
Introduction to Algorithms
6.046J/18.401J/SMA5503
Lecture 6
Prof. Erik Demaine
Introduction to Algorithms Day 9 L6.2 2001 by Charles E. Leiserson
Order statistics
Select the ith smallest of n elements (the element with rank i).
i = 1: minimum;
i = n: maximum;
i = ⌊(n+1)/2⌋ or ⌈(n+1)/2⌉: median.
Naive algorithm: Sort and index the ith element.
Worst-case running time = Θ(n lg n) + Θ(1) = Θ(n lg n),
using merge sort or heapsort (not quicksort).
Introduction to Algorithms Day 9 L6.3 2001 by Charles E. Leiserson
Randomized divide-and-
conquer algorithm
RAND-SELECT(A, p, q, i)      ⊳ ith smallest of A[p . . q]
  if p = q then return A[p]
  r ← RAND-PARTITION(A, p, q)
  k ← r − p + 1               ⊳ k = rank(A[r])
  if i = k then return A[r]
  if i < k
    then return RAND-SELECT(A, p, r − 1, i)
    else return RAND-SELECT(A, r + 1, q, i − k)
(After partitioning, elements ≤ A[r] occupy A[p . . r] and elements ≥ A[r] occupy A[r . . q]; the pivot has rank k within the subarray.)
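A self-contained Python sketch (0-indexed, with RAND-PARTITION written out inline as in Lecture 4):

```python
import random

def rand_partition(a, p, q):
    """Partition a[p..q] around a uniformly random pivot; return its final index."""
    k = random.randint(p, q)
    a[p], a[k] = a[k], a[p]
    x, i = a[p], p
    for j in range(p + 1, q + 1):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]
    return i

def rand_select(a, p, q, i):
    """Return the ith smallest (1-indexed) element of a[p..q] in expected linear time."""
    if p == q:
        return a[p]
    r = rand_partition(a, p, q)
    k = r - p + 1                      # k = rank(a[r]) within a[p..q]
    if i == k:
        return a[r]
    if i < k:
        return rand_select(a, p, r - 1, i)
    return rand_select(a, r + 1, q, i - k)

a = [6, 10, 13, 5, 8, 3, 2, 11]
assert rand_select(a, 0, len(a) - 1, 7) == 11   # 7th smallest, as in the example below
```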
Introduction to Algorithms Day 9 L6.4 2001 by Charles E. Leiserson
Example
Select the i = 7th smallest of ⟨6, 10, 13, 5, 8, 3, 2, 11⟩ (pivot = 6).
Partition: ⟨2, 5, 3, 6, 8, 13, 10, 11⟩ ⇒ k = 4.
Select the 7 − 4 = 3rd smallest element of the upper part recursively.
Introduction to Algorithms Day 9 L6.5 2001 by Charles E. Leiserson
Intuition for analysis
Lucky:
  T(n) = T(9n/10) + Θ(n)
       = Θ(n)      (CASE 3, since n^(log_{10/9} 1) = n⁰ = 1)
Unlucky:
  T(n) = T(n − 1) + Θ(n)
       = Θ(n²)      (arithmetic series)
Worse than sorting!
(All our analyses today assume that all elements are distinct.)
Introduction to Algorithms Day 9 L6.6 2001 by Charles E. Leiserson
Analysis of expected time
Let T(n) = the random variable for the running
time of RAND-SELECT on an input of size n,
assuming random numbers are independent.
For k = 0, 1, …, n−1, define the indicator random variable
  X_k = 1 if PARTITION generates a k : n−k−1 split,
        0 otherwise.
The analysis follows that of randomized quicksort, but it's a little different.
Introduction to Algorithms Day 9 L6.7 2001 by Charles E. Leiserson
Analysis (continued)
To obtain an upper bound, assume that the ith element always falls in the larger side of the partition:
T(n) = T(max{0, n−1}) + Θ(n)      if 0 : n−1 split,
       T(max{1, n−2}) + Θ(n)      if 1 : n−2 split,
       ⋮
       T(max{n−1, 0}) + Θ(n)      if n−1 : 0 split
     = Σ_{k=0..n−1} X_k ( T(max{k, n−k−1}) + Θ(n) ).
Introduction to Algorithms Day 9 L6.8 2001 by Charles E. Leiserson
Calculating expectation
E[T(n)] = E[ Σ_{k=0..n−1} X_k ( T(max{k, n−k−1}) + Θ(n) ) ]
            (take expectations of both sides)
        = Σ_{k=0..n−1} E[ X_k ( T(max{k, n−k−1}) + Θ(n) ) ]
            (linearity of expectation)
        = Σ_{k=0..n−1} E[X_k] · E[ T(max{k, n−k−1}) + Θ(n) ]
            (independence of X_k from other random choices)
        = (1/n) Σ_{k=0..n−1} E[ T(max{k, n−k−1}) ] + Θ(n)
            (linearity of expectation; E[X_k] = 1/n)
        ≤ (2/n) Σ_{k=⌊n/2⌋..n−1} E[T(k)] + Θ(n)
            (upper terms appear twice)
Introduction to Algorithms Day 9 L6.13 2001 by Charles E. Leiserson
Hairy recurrence
E[T(n)] ≤ (2/n) Σ_{k=⌊n/2⌋..n−1} E[T(k)] + Θ(n)
(But not quite as hairy as the quicksort one.)
Prove: E[T(n)] ≤ cn for constant c > 0.
Use fact: Σ_{k=⌊n/2⌋..n−1} k ≤ (3/8) n²      (exercise).
The constant c can be chosen large enough so that E[T(n)] ≤ cn for the base cases.
Introduction to Algorithms Day 9 L6.14 2001 by Charles E. Leiserson
Substitution method
E[T(n)] ≤ (2/n) Σ_{k=⌊n/2⌋..n−1} ck + Θ(n)
            (substitute inductive hypothesis)
        ≤ (2c/n) (3/8) n² + Θ(n)
            (use fact)
        = cn − ( cn/4 − Θ(n) )
            (express as desired − residual)
        ≤ cn,
if c is chosen large enough so that cn/4 dominates the Θ(n).
Introduction to Algorithms Day 9 L6.18 2001 by Charles E. Leiserson
Summary of randomized
order-statistic selection
Works fast: linear expected time.
Excellent algorithm in practice.
But, the worst case is very bad: Θ(n²).
Q. Is there an algorithm that runs in linear time in the worst case?
A. Yes, due to Blum, Floyd, Pratt, Rivest, and Tarjan [1973].
IDEA: Generate a good pivot recursively.
Introduction to Algorithms Day 9 L6.19 2001 by Charles E. Leiserson
Worst-case linear-time order
statistics
SELECT(i, n)
1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.
2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.
3. Partition around the pivot x. Let k = rank(x).
4. if i = k then return x
   elseif i < k
     then recursively SELECT the ith smallest element in the lower part
     else recursively SELECT the (i−k)th smallest element in the upper part
   (Step 4 is the same as in RAND-SELECT.)
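A compact Python sketch of this worst-case-linear SELECT (it sorts each 5-element group instead of finding its median "by rote", and partitions with list comprehensions rather than in place; an illustration under the distinct-elements assumption, not the lecture's code):

```python
def select(a, i):
    """Return the ith smallest (1-indexed) element of list a in worst-case linear time."""
    if len(a) <= 50:
        return sorted(a)[i - 1]                                 # small-n base case
    groups = [a[j:j + 5] for j in range(0, len(a), 5)]          # step 1: groups of 5
    medians = [sorted(g)[len(g) // 2] for g in groups]
    x = select(medians, (len(medians) + 1) // 2)                # step 2: median of medians
    lower = [y for y in a if y < x]                             # step 3: partition around x
    upper = [y for y in a if y > x]
    k = len(lower) + 1                                          # rank(x), elements distinct
    if i == k:                                                  # step 4
        return x
    if i < k:
        return select(lower, i)
    return select(upper, i - k)

assert select([6, 10, 13, 5, 8, 3, 2, 11], 7) == 11
```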
Introduction to Algorithms Day 9 L6.20 2001 by Charles E. Leiserson
Choosing the pivot
1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote (in the slide's columns, lesser elements are drawn above each median and greater elements below).
2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.
Introduction to Algorithms Day 9 L6.24 2001 by Charles E. Leiserson
Analysis
(Assume all elements are distinct.)
At least half the group medians are ≤ x, which is at least ⌊⌊n/5⌋/2⌋ = ⌊n/10⌋ group medians.
Therefore, at least 3 ⌊n/10⌋ elements are ≤ x.
Similarly, at least 3 ⌊n/10⌋ elements are ≥ x.
(The slides highlight these elements among the 5-element columns around the pivot x.)
Introduction to Algorithms Day 9 L6.27 2001 by Charles E. Leiserson
Minor simplification
For n ≥ 50, we have 3 ⌊n/10⌋ ≥ n/4.
Therefore, for n ≥ 50 the recursive call to SELECT in Step 4 is executed recursively on at most 3n/4 elements.
Thus, the recurrence for running time can assume that Step 4 takes time T(3n/4) in the worst case.
For n < 50, we know that the worst-case time is T(n) = Θ(1).
Introduction to Algorithms Day 9 L6.28 2001 by Charles E. Leiserson
Developing the recurrence
SELECT(i, n)                                                                                    T(n)
1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.    Θ(n)
2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.                 T(n/5)
3. Partition around the pivot x. Let k = rank(x).                                              Θ(n)
4. if i = k then return x
   elseif i < k
     then recursively SELECT the ith smallest element in the lower part
     else recursively SELECT the (i−k)th smallest element in the upper part                    T(3n/4)
Introduction to Algorithms Day 9 L6.29 2001 by Charles E. Leiserson
Solving the recurrence
T(n) = T(n/5) + T(3n/4) + Θ(n)
Substitution: T(n) ≤ cn.
T(n) ≤ c(n/5) + c(3n/4) + Θ(n)
     = (19/20) cn + Θ(n)
     = cn − ( cn/20 − Θ(n) )
     ≤ cn,
if c is chosen large enough to handle both the Θ(n) and the initial conditions.
Introduction to Algorithms Day 9 L6.30 2001 by Charles E. Leiserson
Conclusions
Since the work at each level of recursion
is a constant fraction (19/20) smaller, the
work per level is a geometric series
dominated by the linear work at the root.
In practice, this algorithm runs slowly,
because the constant in front of n is large.
The randomized algorithm is far more
practical.
Exercise: Why not divide into groups of 3?
Introduction to Algorithms
6.046J/18.401J/SMA5503
Lecture 7
Prof. Charles E. Leiserson
Introduction to Algorithms Day 11 L7.2 2001 by Charles E. Leiserson
Symbol-table problem
Symbol table T holding n records:
[Figure: a record x consists of a key key[x] plus other fields containing satellite data.]
Operations on T:
INSERT(T, x)
DELETE(T, x)
SEARCH(T, k)
How should the data structure T be organized?
Introduction to Algorithms Day 11 L7.3 2001 by Charles E. Leiserson
Direct-access table
IDEA: Suppose that the set of keys is K ⊆ {0,
1, …, m−1}, and keys are distinct. Set up an
array T[0 . . m−1]:
   T[k] = x    if k ∈ K and key[x] = k,
          NIL  otherwise.
Then, operations take Θ(1) time.
Problem: The range of keys can be large:
64-bit numbers (which represent
18,446,744,073,709,551,616 different keys),
character strings (even larger!).
Introduction to Algorithms Day 11 L7.4 2001 by Charles E. Leiserson
Hash functions
Solution: Use a hash function h to map the
universe U of all keys into {0, 1, …, m−1}.
As each key is inserted, h maps it to a slot of T.
[Figure: keys k1, …, k5 from the set K ⊆ U hashed into slots 0 … m−1 of table T; here h(k2) = h(k5).]
When a record to be inserted maps to an already
occupied slot in T, a collision occurs.
Introduction to Algorithms Day 11 L7.5 2001 by Charles E. Leiserson
Resolving collisions by
chaining
Records in the same slot are linked into a list.
h(49) = h(86) = h(52) = i
[Figure: table T with slot i pointing to the chain 49 → 86 → 52.]
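A minimal Python sketch of chaining, purely for illustration (the class and method names are mine, not the lecture's): each table slot holds a Python list that serves as the chain.

```python
class ChainedHashTable:
    def __init__(self, m):
        self.m = m
        self.slots = [[] for _ in range(m)]        # one chain (list) per slot

    def _h(self, k):                               # any hash function; division method here
        return k % self.m

    def insert(self, k, value):
        self.slots[self._h(k)].append((k, value))

    def search(self, k):
        for key, value in self.slots[self._h(k)]:  # walk the chain: expected Theta(1 + alpha)
            if key == k:
                return value
        return None

    def delete(self, k):
        chain = self.slots[self._h(k)]
        self.slots[self._h(k)] = [(key, v) for key, v in chain if key != k]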
Introduction to Algorithms Day 11 L7.6 2001 by Charles E. Leiserson
Analysis of chaining
We make the assumption of simple uniform
hashing:
Each key k ∈ K is equally likely to
be hashed to any slot of table T, independent
of where other keys are hashed.
Let n be the number of keys in the table, and
let m be the number of slots.
Define the load factor of T to be
α = n/m
= average number of keys per slot.
Introduction to Algorithms Day 11 L7.7 2001 by Charles E. Leiserson
Search cost
Expected time to search for a record with
a given key = Θ(1 + α):
   Θ(1) to apply the hash function and access the slot,
   Θ(α) to search the list.
Expected search time = Θ(1) if α = O(1),
or equivalently, if n = O(m).
Introduction to Algorithms Day 11 L7.8 2001 by Charles E. Leiserson
Choosing a hash function
The assumption of simple uniform hashing
is hard to guarantee, but several common
techniques tend to work well in practice as
long as their deficiencies can be avoided.
Desiderata:
A good hash function should distribute the
keys uniformly into the slots of the table.
Regularity in the key distribution should
not affect this uniformity.
Introduction to Algorithms Day 11 L7.9 2001 by Charles E. Leiserson
Division method
Assume all keys are integers, and define
h(k) = k mod m.
Extreme deficiency: If m = 2^r, then the hash
doesn't even depend on all the bits of k:
   If k = 1011000111011010₂ and r = 6, then
   h(k) = 011010₂.
Deficiency: Don't pick an m that has a small
divisor d. A preponderance of keys that are
congruent modulo d can adversely affect
uniformity.
Introduction to Algorithms Day 11 L7.10 2001 by Charles E. Leiserson
Division method (continued)
h(k) = k mod m.
Pick m to be a prime not too close to a power
of 2 or 10 and not otherwise used prominently
in the computing environment.
Annoyance:
Sometimes, making the table size a prime is
inconvenient.
But, this method is popular, although the next
method we'll see is usually superior.
Introduction to Algorithms Day 11 L7.11 2001 by Charles E. Leiserson
Multiplication method
Assume that all keys are integers, m = 2^r, and our
computer has w-bit words. Define
   h(k) = (A·k mod 2^w) rsh (w − r),
where rsh is the bitwise right-shift operator
and A is an odd integer in the range 2^(w−1) < A < 2^w.
Don't pick A too close to 2^w.
Multiplication modulo 2^w is fast.
The rsh operator is fast.
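A small Python sketch of the multiplication method under these assumptions; the particular A is just the lecture example's constant, and the masking simulates w-bit arithmetic.

```python
w, r = 7, 3                 # word size and table size m = 2^r = 8
A = 0b1011001               # an odd integer with 2^(w-1) < A < 2^w

def h(k):
    # multiply, keep the low w bits (mod 2^w), then shift right by w - r
    return ((A * k) & ((1 << w) - 1)) >> (w - r)

print(h(0b1101011))         # the example key from the next slide; prints 3 (= 011 in binary)
```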
Introduction to Algorithms Day 11 L7.12 2001 by Charles E. Leiserson
Multiplication method
example
h(k) = (A·k mod 2^w) rsh (w − r)
Suppose that m = 8 = 2³ and that our computer
has w = 7-bit words:
     A = 1011001₂
     k = 1101011₂
   A·k = 10010100110011₂
A·k mod 2^w keeps the low 7 bits, 0110011₂, and shifting
right by w − r = 4 bits gives h(k) = 011₂.
[Figure: "modular wheel" with positions 0–7, illustrating multiplication modulo 2^w.]
Introduction to Algorithms Day 11 L7.13 2001 by Charles E. Leiserson
Dot-product method
Randomized strategy:
Let m be prime. Decompose key k into r + 1
digits, each with value in the set {0, 1, …, m−1}.
That is, let k = ⟨k_0, k_1, …, k_r⟩, where 0 ≤ k_i < m.
Pick a = ⟨a_0, a_1, …, a_r⟩ where each a_i is chosen
randomly from {0, 1, …, m−1}.
Define h_a(k) = ( Σ_{i=0}^{r} a_i k_i ) mod m .
Excellent in practice, but expensive to compute.
Introduction to Algorithms Day 11 L7.14 2001 by Charles E. Leiserson
Resolving collisions by open
addressing
No storage is used outside of the hash table itself.
Insertion systematically probes the table until an
empty slot is found.
The hash function depends on both the key and
probe number:
   h : U × {0, 1, …, m−1} → {0, 1, …, m−1}.
The probe sequence ⟨h(k,0), h(k,1), …, h(k,m−1)⟩
should be a permutation of {0, 1, …, m−1}.
The table may fill up, and deletion is difficult (but
not impossible).
Introduction to Algorithms Day 11 L7.15 2001 by Charles E. Leiserson
Example of open addressing
Insert key k = 496:
0. Probe h(496,0) — collision.
[Figure: table T (slots 0 … m−1) with occupied slots holding 586, 133, 204, 481.]
Introduction to Algorithms Day 11 L7.16 2001 by Charles E. Leiserson
Example of open addressing
Insert key k = 496:
0. Probe h(496,0) — collision.
1. Probe h(496,1) — collision (the slot holding 586).
[Figure: same table.]
Introduction to Algorithms Day 11 L7.17 2001 by Charles E. Leiserson
Example of open addressing
Insert key k = 496:
0. Probe h(496,0) — collision.
1. Probe h(496,1) — collision.
2. Probe h(496,2) — insertion: 496 is placed in the empty slot found.
[Figure: same table with 496 now stored.]
Introduction to Algorithms Day 11 L7.18 2001 by Charles E. Leiserson
Example of open addressing
Search for key k = 496:
0. Probe h(496,0)
1. Probe h(496,1)
2. Probe h(496,2) — the key is found.
[Figure: same table; the search follows the same slots as the insertion.]
Search uses the same probe
sequence, terminating suc-
cessfully if it finds the key
and unsuccessfully if it encounters an empty slot.
Introduction to Algorithms Day 11 L7.19 2001 by Charles E. Leiserson
Probing strategies
Linear probing:
Given an ordinary hash function h(k), linear
probing uses the hash function
h(k,i) = (h(k) + i) mod m.
This method, though simple, suffers from primary
clustering, where long runs of occupied slots build
up, increasing the average search time. Moreover,
the long runs of occupied slots tend to get longer.
Introduction to Algorithms Day 11 L7.20 2001 by Charles E. Leiserson
Probing strategies
Double hashing
Given two ordinary hash functions h1(k) and h2(k),
double hashing uses the hash function
   h(k,i) = (h1(k) + i·h2(k)) mod m.
This method generally produces excellent results,
but h2(k) must be relatively prime to m. One way
is to make m a power of 2 and design h2(k) to
produce only odd numbers.
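A Python sketch of open-addressing insertion with double hashing. The particular h1 and h2 below are illustrative assumptions, chosen only so that h2(k) is odd and hence relatively prime to the power-of-2 table size.

```python
m = 16                                # table size: a power of 2
table = [None] * m

def h1(k): return k % m
def h2(k): return (k % (m - 1)) | 1   # always odd, hence relatively prime to m

def probe(k, i):                      # h(k, i) = (h1(k) + i*h2(k)) mod m
    return (h1(k) + i * h2(k)) % m

def insert(k):
    for i in range(m):                # the probe sequence visits every slot exactly once
        j = probe(k, i)
        if table[j] is None:
            table[j] = k
            return j
    raise RuntimeError("table is full")
```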
Introduction to Algorithms Day 11 L7.21 2001 by Charles E. Leiserson
Analysis of open addressing
We make the assumption of uniform hashing:
Each key is equally likely to have any one of
the m! permutations as its probe sequence.
Theorem. Given an open-addressed hash
table with load factor α = n/m < 1, the
expected number of probes in an unsuccessful
search is at most 1/(1 − α).
Introduction to Algorithms Day 11 L7.22 2001 by Charles E. Leiserson
Proof of the theorem
Proof.
At least one probe is always necessary.
With probability n/m, the first probe hits an
occupied slot, and a second probe is necessary.
With probability (n1)/(m1), the second probe
hits an occupied slot, and a third probe is
necessary.
With probability (n2)/(m2), the third probe
hits an occupied slot, etc.
Observe that
   (n − i)/(m − i) < n/m = α
for i = 1, 2, …, n.
Introduction to Algorithms Day 11 L7.23 2001 by Charles E. Leiserson
Proof (continued)
Therefore, the expected number of probes is
   1 + (n/m)(1 + ((n−1)/(m−1))(1 + ((n−2)/(m−2))(⋯(1 + 1/(m−n+1))⋯)))
   ≤ 1 + α(1 + α(1 + α(⋯(1 + α)⋯)))
   ≤ 1 + α + α² + α³ + ⋯
   = Σ_{i=0}^{∞} α^i
   = 1/(1 − α) .
The textbook has a
more rigorous proof.
Introduction to Algorithms Day 11 L7.24 2001 by Charles E. Leiserson
Implications of the theorem
If is constant, then accessing an open-
addressed hash table takes constant time.
If the table is half full, then the expected
number of probes is 1/(10.5) = 2.
If the table is 90% full, then the expected
number of probes is 1/(10.9) = 10.
Introduction to Algorithms
6.046J/18.401J/SMA5503
Lecture 8
Prof. Charles E. Leiserson
Introduction to Algorithms Day 12 L8.2 2001 by Charles E. Leiserson
A weakness of hashing
Problem: For any hash function h, a set
of keys exists that can cause the average
access time of a hash table to skyrocket.
An adversary can pick all keys from
{k ∈ U : h(k) = i} for some slot i.
IDEA: Choose the hash function at random,
independently of the keys.
Even if an adversary can see your code,
he or she cannot find a bad set of keys,
since he or she doesn't know exactly
which hash function will be chosen.
Introduction to Algorithms Day 12 L8.3 2001 by Charles E. Leiserson
Universal hashing
Definition. Let U be a universe of keys, and
let H be a finite collection of hash functions,
each mapping U to {0, 1, …, m−1}. We say
H is universal if for all x, y ∈ U, where x ≠ y,
we have |{h ∈ H : h(x) = h(y)}| = |H|/m.
That is, the chance
of a collision
between x and y is
1/m if we choose h
randomly from H.
[Figure: the set H, with the subset {h : h(x) = h(y)} of size |H|/m.]
Introduction to Algorithms Day 12 L8.4 2001 by Charles E. Leiserson
Universality is good
Theorem. Let h be a hash function chosen
(uniformly) at random from a universal set H
of hash functions. Suppose h is used to hash
n arbitrary keys into the m slots of a table T.
Then, for a given key x, we have
E[#collisions with x] < n/m.
Introduction to Algorithms Day 12 L8.5 2001 by Charles E. Leiserson
Proof of theorem
Proof. Let C_x be the random variable denoting
the total number of collisions of keys in T with
x, and let
   c_xy = 1 if h(x) = h(y), 0 otherwise.
Note: E[c_xy] = 1/m and C_x = Σ_{y ∈ T−{x}} c_xy .
Introduction to Algorithms Day 12 L8.6 2001 by Charles E. Leiserson
Proof (continued)
   E[C_x] = E[ Σ_{y ∈ T−{x}} c_xy ]     Take expectation of both sides.
          = Σ_{y ∈ T−{x}} E[c_xy]       Linearity of expectation.
          = Σ_{y ∈ T−{x}} 1/m           E[c_xy] = 1/m.
          = (n − 1)/m .                 Algebra.
Introduction to Algorithms Day 12 L8.10 2001 by Charles E. Leiserson
REMEMBER
THIS!
Constructing a set of
universal hash functions
Let m be prime. Decompose key k into r + 1
digits, each with value in the set {0, 1, …, m−1}.
That is, let k = ⟨k_0, k_1, …, k_r⟩, where 0 ≤ k_i < m.
Randomized strategy:
Pick a = ⟨a_0, a_1, …, a_r⟩ where each a_i is chosen
randomly from {0, 1, …, m−1}.
Define h_a(k) = ( Σ_{i=0}^{r} a_i k_i ) mod m .      (Dot product, modulo m)
How big is H = {h_a}?  |H| = m^(r+1).
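A Python sketch of drawing one h_a from this family; decomposing k into base-m digits and choosing a at random follow the slide, while the helper name is mine.

```python
import random

def make_universal_hash(m, r):
    """m prime; keys have r + 1 base-m digits. Returns one randomly chosen h_a."""
    a = [random.randrange(m) for _ in range(r + 1)]        # a = <a_0, ..., a_r>
    def h(k):
        digits = [(k // m**i) % m for i in range(r + 1)]   # k = <k_0, ..., k_r>, 0 <= k_i < m
        return sum(ai * ki for ai, ki in zip(a, digits)) % m
    return h

h = make_universal_hash(m=17, r=3)    # handles keys 0 .. 17^4 - 1
print(h(12345))
```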
Introduction to Algorithms Day 12 L8.11 2001 by Charles E. Leiserson
Universality of dot-product
hash functions
Theorem. The set H = {h_a} is universal.
Proof. Let x = ⟨x_0, x_1, …, x_r⟩ and y =
⟨y_0, y_1, …, y_r⟩ be distinct keys. Thus, they differ
in at least one digit position, wlog position 0.
For how many h_a ∈ H do x and y collide?
We must have h_a(x) = h_a(y), which implies that
   Σ_{i=0}^{r} a_i x_i ≡ Σ_{i=0}^{r} a_i y_i   (mod m) .
Introduction to Algorithms Day 12 L8.12 2001 by Charles E. Leiserson
Proof (continued)
Equivalently, we have
   Σ_{i=0}^{r} a_i (x_i − y_i) ≡ 0   (mod m) ,
or
   a_0 (x_0 − y_0) + Σ_{i=1}^{r} a_i (x_i − y_i) ≡ 0   (mod m) ,
which implies that
   a_0 (x_0 − y_0) ≡ − Σ_{i=1}^{r} a_i (x_i − y_i)   (mod m) .
Introduction to Algorithms Day 12 L8.13 2001 by Charles E. Leiserson
Fact from number theory
Theorem. Let m be prime. For any z ∈ Z_m
such that z ≠ 0, there exists a unique z⁻¹ ∈ Z_m
such that
   z · z⁻¹ ≡ 1 (mod m).
Example: m = 7.
   z  :  1 2 3 4 5 6
   z⁻¹:  1 4 5 2 3 6
Introduction to Algorithms Day 12 L8.14 2001 by Charles E. Leiserson
Back to the proof
We have
   a_0 (x_0 − y_0) ≡ − Σ_{i=1}^{r} a_i (x_i − y_i)   (mod m) ,
and since x_0 ≠ y_0, an inverse (x_0 − y_0)⁻¹ must exist,
which implies that
   a_0 ≡ ( − Σ_{i=1}^{r} a_i (x_i − y_i) ) · (x_0 − y_0)⁻¹   (mod m) .
Thus, for any choices of a_1, a_2, …, a_r, exactly
one choice of a_0 causes x and y to collide.
Introduction to Algorithms Day 12 L8.15 2001 by Charles E. Leiserson
Proof (completed)
Q. How many h_a's cause x and y to collide?
A. There are m choices for each of a_1, a_2, …, a_r,
but once these are chosen, exactly one choice
for a_0 causes x and y to collide, namely
   a_0 = ( ( − Σ_{i=1}^{r} a_i (x_i − y_i) ) · (x_0 − y_0)⁻¹ ) mod m .
Thus, the number of h_a's that cause x and y
to collide is m^r · 1 = m^r = |H|/m.
Introduction to Algorithms Day 12 L8.16 2001 by Charles E. Leiserson
Perfect hashing
Given a set of n keys, construct a static hash
table of size m = O(n) such that SEARCH takes
Θ(1) time in the worst case.
IDEA: Two-level scheme
with universal hashing at
both levels.
No collisions at level 2!
[Figure: a level-1 table T whose nonempty slots each store parameters (m, a) and a collision-free level-2 table S_i holding that slot's keys; for example, keys 14 and 27 both hash to slot 1 of T (h_31(14) = h_31(27) = 1) and are stored without collision in S_1.]
Introduction to Algorithms Day 12 L8.17 2001 by Charles E. Leiserson
Collisions at level 2
Theorem. Let H be a class of universal hash
functions for a table of size m = n². Then, if we
use a random h ∈ H to hash n keys into the table,
the expected number of collisions is at most 1/2.
Proof. By the definition of universality, the
probability that 2 given keys in the table collide
under h is 1/m = 1/n². Since there are (n choose 2)
pairs of keys that can possibly collide, the expected
number of collisions is
   (n choose 2) · 1/n² = (n(n−1)/2) · (1/n²) < 1/2 .
Introduction to Algorithms Day 12 L8.18 2001 by Charles E. Leiserson
No collisions at level 2
Corollary. The probability of no collisions
is at least 1/2.
Proof. Markov's inequality says that for any
nonnegative random variable X, we have
   Pr{X ≥ t} ≤ E[X]/t.
Applying this inequality with t = 1, we find
that the probability of 1 or more collisions is
at most 1/2.
Thus, just by testing random hash functions
in H, we'll quickly find one that works.
Introduction to Algorithms Day 12 L8.19 2001 by Charles E. Leiserson
Analysis of storage
For the level-1 hash table T, choose m = n, and
let n_i be the random variable for the number of keys
that hash to slot i in T. By using n_i² slots for the
level-2 hash table S_i, the expected total storage
required for the two-level scheme is therefore
   E[ Σ_{i=0}^{m−1} Θ(n_i²) ] = Θ(n) ,
since the analysis is identical to the analysis from
recitation of the expected running time of bucket
sort. (For a probability bound, apply Markov.)
Introduction to Algorithms
6.046J/18.401J/SMA5503
Lecture 9
Prof. Charles E. Leiserson
Introduction to Algorithms Day 17 L9.2 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Binary-search-tree sort
T ← an empty BST
for i = 1 to n
   do TREE-INSERT(T, A[i])
Perform an inorder tree walk of T.
Example:
A = [3 1 8 2 6 7 5]
[Figure: the resulting BST with root 3; its left subtree holds 1 and 2, and its right subtree holds 8 with left child 6, whose children are 5 and 7.]
Tree-walk time = O(n),
but how long does it
take to build the BST?
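A direct Python rendering of BST sort (unbalanced insert plus an inorder walk); the nested-list node encoding is just one convenient representation.

```python
def tree_insert(t, key):
    # t is None or [key, left, right]
    if t is None:
        return [key, None, None]
    if key < t[0]:
        t[1] = tree_insert(t[1], key)
    else:
        t[2] = tree_insert(t[2], key)
    return t

def inorder(t, out):
    if t is not None:
        inorder(t[1], out)
        out.append(t[0])
        inorder(t[2], out)

def bst_sort(A):
    t = None
    for x in A:                 # expected O(n lg n) comparisons on a random permutation
        t = tree_insert(t, x)
    out = []
    inorder(t, out)
    return out

print(bst_sort([3, 1, 8, 2, 6, 7, 5]))   # [1, 2, 3, 5, 6, 7, 8]
```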
Introduction to Algorithms Day 17 L9.3 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Analysis of BST sort
BST sort performs the same comparisons as
quicksort, but in a different order!
[Figure: 3 1 8 2 6 7 5 is split about 3 into {1, 2} and {8, 6, 7, 5}, then each part is split about its first key, and so on, mirroring quicksort's partitions.]
The expected time to build the tree is asymptot-
ically the same as the running time of quicksort.
Introduction to Algorithms Day 17 L9.4 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Node depth
The depth of a node = the number of comparisons
made during TREE-INSERT. Assuming all input
permutations are equally likely, we have
   Average node depth = (1/n) E[ Σ_{i=1}^{n} (#comparisons to insert node i) ]
                      = (1/n) O(n lg n)      (quicksort analysis)
                      = O(lg n) .
Introduction to Algorithms Day 17 L9.5 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Expected tree height
But, average node depth of a randomly built
BST = O(lg n) does not necessarily mean that its
expected height is also O(lg n) (although it is).
Example.
[Figure: most of the n nodes form a balanced tree of depth lg n, with a path of √n nodes hanging off it, so the height is h = √n.]
   Ave. depth ≤ (1/n) · (n · lg n + √n · √n/2) = O(lg n) .
Introduction to Algorithms Day 17 L9.6 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Height of a randomly built
binary search tree
Outline of the analysis:
Prove Jensen's inequality, which says that
f(E[X]) ≤ E[f(X)] for any convex function f and
random variable X.
Analyze the exponential height of a randomly
built BST on n nodes, which is the random
variable Y_n = 2^(X_n), where X_n is the random
variable denoting the height of the BST.
Prove that 2^(E[X_n]) ≤ E[2^(X_n)] = E[Y_n] = O(n³),
and hence that E[X_n] = O(lg n).
Introduction to Algorithms Day 17 L9.7 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Convex functions
A function f : R → R is convex if for all
α, β ≥ 0 such that α + β = 1, we have
   f(αx + βy) ≤ α f(x) + β f(y)
for all x, y ∈ R.
[Figure: the chord from (x, f(x)) to (y, f(y)) lies on or above the graph of f, so f(αx + βy) ≤ α f(x) + β f(y).]
Introduction to Algorithms Day 17 L9.8 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Convexity lemma
Lemma. Let f : R → R be a convex function,
and let {α_1, α_2, …, α_n} be a set of nonnegative
constants such that Σ_k α_k = 1. Then, for any set
{x_1, x_2, …, x_n} of real numbers, we have
   f( Σ_{k=1}^{n} α_k x_k ) ≤ Σ_{k=1}^{n} α_k f(x_k) .
Proof. By induction on n. For n = 1, we have
α_1 = 1, and hence f(α_1 x_1) ≤ α_1 f(x_1) trivially.
Introduction to Algorithms Day 17 L9.9 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Proof (continued)
Inductive step:
   f( Σ_{k=1}^{n} α_k x_k )
      = f( α_n x_n + (1 − α_n) Σ_{k=1}^{n−1} (α_k/(1 − α_n)) x_k )         Algebra.
      ≤ α_n f(x_n) + (1 − α_n) f( Σ_{k=1}^{n−1} (α_k/(1 − α_n)) x_k )      Convexity.
      ≤ α_n f(x_n) + (1 − α_n) Σ_{k=1}^{n−1} (α_k/(1 − α_n)) f(x_k)        Induction.
      = Σ_{k=1}^{n} α_k f(x_k) .                                           Algebra.
Introduction to Algorithms Day 17 L9.13 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Jensen's inequality
Lemma. Let f be a convex function, and let X
be a random variable. Then, f(E[X]) ≤ E[f(X)].
Proof.
   f(E[X]) = f( Σ_k k · Pr{X = k} )       Definition of expectation.
           ≤ Σ_k f(k) · Pr{X = k}         Convexity lemma (generalized).
           = E[f(X)] .                    Tricky step, but true: think about it.
The textbook has a more rigorous treatment.
Introduction to Algorithms Day 17 L9.16 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Analysis of BST height
Let X_n be the random variable denoting
the height of a randomly built binary
search tree on n nodes, and let Y_n = 2^(X_n)
be its exponential height.
If the root of the tree has rank k, then
   X_n = 1 + max{X_{k−1}, X_{n−k}} ,
since each of the left and right subtrees
of the root are randomly built. Hence,
we have
   Y_n = 2 · max{Y_{k−1}, Y_{n−k}} .
Introduction to Algorithms Day 17 L9.17 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Analysis (continued)
Define the indicator random variable Z_nk as
   Z_nk = 1 if the root has rank k, 0 otherwise.
Thus, Pr{Z_nk = 1} = E[Z_nk] = 1/n, and
   Y_n = Σ_{k=1}^{n} Z_nk ( 2 · max{Y_{k−1}, Y_{n−k}} ) .
Introduction to Algorithms Day 17 L9.18 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Exponential height recurrence
   E[Y_n] = E[ Σ_{k=1}^{n} Z_nk (2·max{Y_{k−1}, Y_{n−k}}) ]      Take expectation of both sides.
          = Σ_{k=1}^{n} E[ Z_nk (2·max{Y_{k−1}, Y_{n−k}}) ]      Linearity of expectation.
          = Σ_{k=1}^{n} E[Z_nk] · E[ 2·max{Y_{k−1}, Y_{n−k}} ]   Independence of the rank of the root
                                                                 from the ranks of subtree roots.
          ≤ (2/n) Σ_{k=1}^{n} E[ Y_{k−1} + Y_{n−k} ]             The max of two nonnegative numbers is
                                                                 at most their sum, and E[Z_nk] = 1/n.
          = (4/n) Σ_{k=0}^{n−1} E[Y_k] .                         Each term appears twice, and reindex.
Introduction to Algorithms Day 17 L9.23 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Solving the recurrence
Use substitution to show that E[Y_n] ≤ cn³ for some
positive constant c, which we can pick sufficiently
large to handle the initial conditions.
   E[Y_n] = (4/n) Σ_{k=0}^{n−1} E[Y_k]
          ≤ (4/n) Σ_{k=0}^{n−1} ck³           Substitution.
          ≤ (4c/n) ∫₀ⁿ x³ dx                  Integral method.
          = (4c/n)(n⁴/4)                      Solve the integral.
          = cn³ .                             Algebra.
Introduction to Algorithms Day 17 L9.28 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
The grand finale
Putting it all together, we have
   2^(E[X_n]) ≤ E[2^(X_n)]        Jensen's inequality, since f(x) = 2^x is convex.
             = E[Y_n]             Definition.
             ≤ cn³ .              What we just showed.
Taking the lg of both sides yields
   E[X_n] ≤ 3 lg n + O(1) .
Introduction to Algorithms Day 17 L9.32 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Post mortem
Q. Does the analysis have to be this hard?
Q. Why bother with analyzing exponential
height?
Q. Why not just develop the recurrence on
   X_n = 1 + max{X_{k−1}, X_{n−k}}
directly?
Introduction to Algorithms Day 17 L9.33 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Post mortem (continued)
A. The inequality
   max{a, b} ≤ a + b
provides a poor upper bound, since the RHS
approaches the LHS slowly as |a − b| increases.
The bound
   max{2^a, 2^b} ≤ 2^a + 2^b
allows the RHS to approach the LHS far more
quickly as |a − b| increases. By using the
convexity of f(x) = 2^x via Jensen's inequality,
we can manipulate the sum of exponentials,
resulting in a tight analysis.
Introduction to Algorithms Day 17 L9.34 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Thought exercises
See what happens when you try to do the
analysis on X
n
directly.
Try to understand better why the proof
uses an exponential. Will a quadratic do?
See if you can find a simpler argument.
(This argument is a little simpler than the
one in the bookI hope its correct!)
Introduction to Algorithms
6.046J/18.401J/SMA5503
Lecture 10
Prof. Erik Demaine
Introduction to Algorithms Day 18 L10.2 2001 by Charles E. Leiserson
Balanced search trees
Balanced search tree: A search-tree data
structure for which a height of O(lg n) is
guaranteed when implementing a dynamic
set of n items.
Examples:
AVL trees
2-3 trees
2-3-4 trees
B-trees
Red-black trees
Introduction to Algorithms Day 18 L10.3 2001 by Charles E. Leiserson
Red-black trees
This data structure requires an extra one-
bit color field in each node.
Red-black properties:
1. Every node is either red or black.
2. The root and leaves (NILs) are black.
3. If a node is red, then its parent is black.
4. All simple paths from any node x to a
descendant leaf have the same number
of black nodes = black-height(x).
Introduction to Algorithms Day 18 L10.4 2001 by Charles E. Leiserson
Example of a red-black tree
h = 4
[Figure: a red-black tree of height 4 with keys 3, 7, 8, 10, 11, 18, 22, 26 and NIL leaves.]
Introduction to Algorithms Day 18 L10.5 2001 by Charles E. Leiserson
Example of a red-black tree
[Figure: the same red-black tree.]
1. Every node is either red or black.
Introduction to Algorithms Day 18 L10.6 2001 by Charles E. Leiserson
Example of a red-black tree
[Figure: the same red-black tree.]
2. The root and leaves (NILs) are black.
Introduction to Algorithms Day 18 L10.7 2001 by Charles E. Leiserson
Example of a red-black tree
[Figure: the same red-black tree.]
3. If a node is red, then its parent is black.
Introduction to Algorithms Day 18 L10.8 2001 by Charles E. Leiserson
Example of a red-black tree
4. All simple paths from any node x to a
descendant leaf have the same number of
black nodes = black-height(x).
[Figure: the same red-black tree annotated with black-heights: bh = 2 at the root, bh = 1 or 2 at internal nodes, and bh = 0 at the NIL leaves.]
Introduction to Algorithms Day 18 L10.9 2001 by Charles E. Leiserson
Height of a red-black tree
Theorem. A red-black tree with n keys has height
h 2 lg(n + 1).
Proof. (The book uses induction. Read carefully.)
INTUITION:
Merge red nodes
into their black
parents.
Introduction to Algorithms Day 18 L10.14 2001 by Charles E. Leiserson
Height of a red-black tree
Theorem. A red-black tree with n keys has height
h 2 lg(n + 1).
Proof. (The book uses induction. Read carefully.)
INTUITION:
Merge red nodes
into their black
parents.
This process produces a tree in which each node
has 2, 3, or 4 children.
The 2-3-4 tree has uniform depth h′ of leaves.
Introduction to Algorithms Day 18 L10.15 2001 by Charles E. Leiserson
Proof (continued)
We have h′ ≥ h/2, since at most half the
nodes on any root-to-leaf path are red.
The number of leaves in each tree is n + 1, so
   n + 1 ≥ 2^(h′)
   ⇒ lg(n + 1) ≥ h′ ≥ h/2
   ⇒ h ≤ 2 lg(n + 1) .
Introduction to Algorithms Day 18 L10.16 2001 by Charles E. Leiserson
Query operations
Corollary. The queries SEARCH, MIN,
MAX, SUCCESSOR, and PREDECESSOR
all run in O(lg n) time on a red-black
tree with n nodes.
Introduction to Algorithms Day 18 L10.17 2001 by Charles E. Leiserson
Modifying operations
The operations INSERT and DELETE cause
modifications to the red-black tree:
the operation itself,
color changes,
restructuring the links of the tree via
rotations.
Introduction to Algorithms Day 18 L10.18 2001 by Charles E. Leiserson
Rotations
[Figure: LEFT-ROTATE(A) turns a tree rooted at A (left subtree α, right child B with subtrees β and γ) into a tree rooted at B (left child A with subtrees α and β, right subtree γ); RIGHT-ROTATE(B) undoes it.]
Rotations maintain the inorder ordering of keys:
   a ∈ α, b ∈ β, c ∈ γ  ⇒  a ≤ A ≤ b ≤ B ≤ c .
A rotation can be performed in O(1) time.
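For concreteness, a Python sketch of LEFT-ROTATE on nodes with left/right/parent fields; the Node class and the container object T with a .root attribute are assumed representations, not the textbook's exact fields.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

def left_rotate(T, x):
    """Rotate the edge (x, x.right) to the left; assumes x.right is not None."""
    y = x.right
    x.right = y.left                 # beta: y's left subtree becomes x's right subtree
    if y.left is not None:
        y.left.parent = x
    y.parent = x.parent              # link y where x used to hang
    if x.parent is None:
        T.root = y
    elif x is x.parent.left:
        x.parent.left = y
    else:
        x.parent.right = y
    y.left = x                       # put x beneath y
    x.parent = y
```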
Introduction to Algorithms Day 18 L10.19 2001 by Charles E. Leiserson
Insertion into a red-black tree
Example:
[Figure: a red-black tree containing 3, 7, 8, 10, 11, 18, 22, 26.]
IDEA: Insert x in tree. Color x red. Only red-
black property 3 might be violated. Move the
violation up the tree by recoloring until it can
be fixed with rotations and recoloring.
Introduction to Algorithms Day 18 L10.20 2001 by Charles E. Leiserson
Insertion into a red-black tree
Example:
Insert x = 15.
Recolor, moving the violation up the tree.
[Figure: 15 inserted as a red node; recoloring moves the red-red violation upward.]
IDEA: Insert x in tree. Color x red. Only red-
black property 3 might be violated. Move the
violation up the tree by recoloring until it can
be fixed with rotations and recoloring.
Introduction to Algorithms Day 18 L10.21 2001 by Charles E. Leiserson
Insertion into a red-black tree
Example:
Insert x = 15.
Recolor, moving the violation up the tree.
RIGHT-ROTATE(18).
[Figure: the tree after the right rotation about 18.]
IDEA: Insert x in tree. Color x red. Only red-
black property 3 might be violated. Move the
violation up the tree by recoloring until it can
be fixed with rotations and recoloring.
Introduction to Algorithms Day 18 L10.22 2001 by Charles E. Leiserson
Insertion into a red-black tree
Example:
Insert x = 15.
Recolor, moving the violation up the tree.
RIGHT-ROTATE(18).
LEFT-ROTATE(7) and recolor.
[Figure: the tree after the left rotation about 7 and recoloring.]
IDEA: Insert x in tree. Color x red. Only red-
black property 3 might be violated. Move the
violation up the tree by recoloring until it can
be fixed with rotations and recoloring.
Introduction to Algorithms Day 18 L10.23 2001 by Charles E. Leiserson
Insertion into a red-black tree
IDEA: Insert x in tree. Color x red. Only red-
black property 3 might be violated. Move the
violation up the tree by recoloring until it can
be fixed with rotations and recoloring.
Example:
Insert x = 15.
Recolor, moving the violation up the tree.
RIGHT-ROTATE(18).
LEFT-ROTATE(7) and recolor.
[Figure: the final red-black tree, with all properties restored.]
Introduction to Algorithms Day 18 L10.24 2001 by Charles E. Leiserson
Pseudocode
RB-INSERT(T, x)
   TREE-INSERT(T, x)
   color[x] ← RED                       only RB property 3 can be violated
   while x ≠ root[T] and color[p[x]] = RED
      do if p[x] = left[p[p[x]]]
            then y ← right[p[p[x]]]     y = aunt/uncle of x
                 if color[y] = RED
                    then ⟨Case 1⟩
                    else if x = right[p[x]]
                            then ⟨Case 2⟩   Case 2 falls into Case 3
                         ⟨Case 3⟩
            else ⟨"then" clause with "left" and "right" swapped⟩
   color[root[T]] ← BLACK
Introduction to Algorithms Day 18 L10.25 2001 by Charles E. Leiserson
Graphical notation
Let △ denote a subtree with a black root.
All △'s have the same black-height.
Introduction to Algorithms Day 18 L10.26 2001 by Charles E. Leiserson
Case 1
[Figure: black grandparent C with red children A and D; x is a red child of A (or the children of A are swapped), and y = D is x's uncle.]
Recolor: push C's black onto A and D, and recurse,
since C's parent may be red. C becomes the new x.
Introduction to Algorithms Day 18 L10.27 2001 by Charles E. Leiserson
Case 2
[Figure: black grandparent C with red child A; x is A's red right child, and y is the black uncle.]
LEFT-ROTATE(A).
Transform to Case 3.
Introduction to Algorithms Day 18 L10.28 2001 by Charles E. Leiserson
Case 3
[Figure: black grandparent C with red child B whose red left child is x; y is the black uncle.]
RIGHT-ROTATE(C).
Done! No more
violations of RB
property 3 are
possible.
Introduction to Algorithms Day 18 L10.29 2001 by Charles E. Leiserson
Analysis
Go up the tree performing Case 1, which only
recolors nodes.
If Case 2 or Case 3 occurs, perform 1 or 2
rotations, and terminate.
Running time: O(lg n) with O(1) rotations.
RB-DELETE same asymptotic running time
and number of rotations as RB-INSERT (see
textbook).
Introduction to Algorithms
6.046J/18.401J/SMA5503
Lecture 11
Prof. Erik Demaine
Introduction to Algorithms Day 20 L11.2 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Dynamic order statistics
OS-SELECT(i, S): returns the i th smallest element
in the dynamic set S.
OS-RANK(x, S): returns the rank of x S in the
sorted order of Ss elements.
IDEA: Use a red-black tree for the set S, but keep
subtree sizes in the nodes.
Notation for nodes: each node stores its key together with size, the number of nodes in its subtree.
Introduction to Algorithms Day 20 L11.3 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Example of an OS-tree
[Figure: an OS-tree with root M (size 9); M's children are C (size 5) and P (size 3); C's children are A (1) and F (3), F's children are D (1) and H (1); P's children are N (1) and Q (1).]
size[x] = size[left[x]] + size[right[x]] + 1
Introduction to Algorithms Day 20 L11.4 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Selection
OS-SELECT(x, i)            ith smallest element in the subtree rooted at x
   k ← size[left[x]] + 1   k = rank(x)
   if i = k then return x
   if i < k
      then return OS-SELECT(left[x], i)
      else return OS-SELECT(right[x], i − k)
Implementation trick: Use a sentinel
(dummy record) for NIL such that size[NIL] = 0.
(OS-RANK is in the textbook.)
Introduction to Algorithms Day 20 L11.5 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Example
OS-SELECT(root, 5)
[Figure: the search path in the OS-tree: at M, i = 5, k = 6; go left to C with i = 5, k = 2; go right to F with i = 3, k = 2; go right to H with i = 1, k = 1; return H.]
Running time = O(h) = O(lg n) for red-black trees.
Introduction to Algorithms Day 20 L11.6 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Data structure maintenance
Q. Why not keep the ranks themselves
in the nodes instead of subtree sizes?
A. They are hard to maintain when the
red-black tree is modified.
Modifying operations: INSERT and DELETE.
Strategy: Update subtree sizes when
inserting or deleting.
Introduction to Algorithms Day 20 L11.7 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Example of insertion
INSERT(K)
[Figure: the subtree sizes on the path from the root to the new node K are incremented: M 9→10, C 5→6, F 3→4, H 1→2; K gets size 1.]
Introduction to Algorithms Day 20 L11.8 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Handling rebalancing
Don't forget that RB-INSERT and RB-DELETE may
also need to modify the red-black tree in order to
maintain balance.
Recolorings: no effect on subtree sizes.
Rotations: fix up subtree sizes in O(1) time.
Example:
[Figure: a rotation between C and E; only their size fields change (E 16 → 8 while C 11 → 16), recomputed in O(1) time from the unchanged subtree sizes 7, 3, and 4.]
RB-INSERT and RB-DELETE still run in O(lg n) time.
Introduction to Algorithms Day 20 L11.9 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Data-structure augmentation
Methodology: (e.g., order-statistics trees)
1. Choose an underlying data structure (red-
black trees).
2. Determine additional information to be
stored in the data structure (subtree sizes).
3. Verify that this information can be
maintained for modifying operations (RB-
INSERT, RB-DELETE dont forget rotations).
4. Develop new dynamic-set operations that use
the information (OS-SELECT and OS-RANK).
These steps are guidelines, not rigid rules.
Introduction to Algorithms Day 20 L11.10 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Interval trees
Goal: To maintain a dynamic set of intervals,
such as time intervals.
Example: i = [7, 10], so low[i] = 7 ≤ 10 = high[i].
[Figure: a set of intervals on the number line, such as [4,8], [5,11], [7,10], [15,18], [17,19], [22,23].]
Query: For a given query interval i, find an
interval in the set that overlaps i.
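The overlap test that the query relies on is just two comparisons. A tiny Python sketch, with intervals represented as (low, high) tuples (an assumed encoding):

```python
def overlaps(i, j):
    # Closed intervals i = (low_i, high_i) and j = (low_j, high_j) overlap
    # iff low_i <= high_j and low_j <= high_i.
    return i[0] <= j[1] and j[0] <= i[1]

print(overlaps((7, 10), (5, 11)))    # True
print(overlaps((7, 10), (15, 18)))   # False
```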
Introduction to Algorithms Day 20 L11.11 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Following the methodology
1. Choose an underlying data structure.
Red-black tree keyed on low (left) endpoint.
[Node notation: each node stores its interval int and the value m.]
2. Determine additional information to be
stored in the data structure.
Store in each node x the largest value m[x]
in the subtree rooted at x, as well as the
interval int[x] corresponding to the key.
Introduction to Algorithms Day 20 L11.12 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Example interval tree
[Figure: root [17,19] with m = 23; its left child [5,11] (m = 18) has children [4,8] (m = 8) and [15,18] (m = 18), the latter with left child [7,10] (m = 10); its right child is [22,23] (m = 23).]
m[x] = max { high[int[x]], m[left[x]], m[right[x]] }
Introduction to Algorithms Day 20 L11.13 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Modifying operations
3. Verify that this information can be maintained
for modifying operations.
INSERT: Fix m's on the way down.
Rotations: fixup = O(1) time per rotation.
[Figure: a rotation between [6,20] and [11,15]; only the m fields of these two nodes change (here 19 and 30 become 30 and 30), computed from their children's m values.]
Total INSERT time = O(lg n); DELETE similar.
Introduction to Algorithms Day 20 L11.14 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
New operations
4. Develop new dynamic-set operations that use
the information.
INTERVAL-SEARCH(i)
   x ← root
   while x ≠ NIL and (low[i] > high[int[x]]
                      or low[int[x]] > high[i])
      do   i and int[x] don't overlap
         if left[x] ≠ NIL and low[i] ≤ m[left[x]]
            then x ← left[x]
            else x ← right[x]
   return x
Introduction to Algorithms Day 20 L11.15 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Example 1: INTERVAL-SEARCH([14,16])
[Figure: the example interval tree; x marks the current node, starting at the root.]
x ← root
[14,16] and [17,19] don't overlap
14 ≤ 18 ⇒ x ← left[x]
Introduction to Algorithms Day 20 L11.16 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Example 1: INTERVAL-SEARCH([14,16])
[Figure: same tree; x is now at [5,11].]
[14,16] and [5,11] don't overlap
14 > 8 ⇒ x ← right[x]
Introduction to Algorithms Day 20 L11.17 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Example 1: INTERVAL-SEARCH([14,16])
[Figure: same tree; x is now at [15,18].]
[14,16] and [15,18] overlap
return [15,18]
Introduction to Algorithms Day 20 L11.18 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Example 2: INTERVAL-SEARCH([12,14])
[Figure: the example interval tree; x starts at the root.]
x ← root
[12,14] and [17,19] don't overlap
12 ≤ 18 ⇒ x ← left[x]
Introduction to Algorithms Day 20 L11.19 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Example 2: INTERVAL-SEARCH([12,14])
[Figure: same tree; x is now at [5,11].]
[12,14] and [5,11] don't overlap
12 > 8 ⇒ x ← right[x]
Introduction to Algorithms Day 20 L11.20 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Example 2: INTERVAL-SEARCH([12,14])
[Figure: same tree; x is now at [15,18].]
[12,14] and [15,18] don't overlap
12 > 10 ⇒ x ← right[x]
Introduction to Algorithms Day 20 L11.21 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Example 2: INTERVAL-SEARCH([12,14])
[Figure: same tree; x has fallen off the bottom.]
x = NIL ⇒ no interval that
overlaps [12,14] exists
Introduction to Algorithms Day 20 L11.22 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Analysis
Time = O(h) = O(lg n), since INTERVAL-SEARCH
does constant work at each level as it follows a
simple path down the tree.
List all overlapping intervals:
   Search, list, delete, repeat.
   Insert them all again at the end.
Time = O(k lg n), where k is the total number of
overlapping intervals.
This is an output-sensitive bound.
Best algorithm to date: O(k + lg n).
Introduction to Algorithms Day 20 L11.23 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Correctness
Theorem. Let L be the set of intervals in the
left subtree of node x, and let R be the set of
intervals in x's right subtree.
If the search goes right, then
   { i′ ∈ L : i′ overlaps i } = ∅ .
If the search goes left, then
   { i′ ∈ L : i′ overlaps i } = ∅
   ⇒ { i′ ∈ R : i′ overlaps i } = ∅ .
In other words, it's always safe to take only 1
of the 2 children: we'll either find something,
or nothing was to be found.
Introduction to Algorithms Day 20 L11.24 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Correctness proof
Proof. Suppose first that the search goes right.
If left[x] = NIL, then we're done, since L = ∅.
Otherwise, the code dictates that we must have
low[i] > m[left[x]]. The value m[left[x]]
corresponds to the right endpoint of some
interval j ∈ L, and no other interval in L can
have a larger right endpoint than high(j).
[Figure: every interval in L ends at or before high(j) = m[left[x]], which is less than low(i).]
Therefore, { i′ ∈ L : i′ overlaps i } = ∅ .
Introduction to Algorithms Day 20 L11.25 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Proof (continued)
Suppose that the search goes left, and assume that
   { i′ ∈ L : i′ overlaps i } = ∅ .
Then, the code dictates that low[i] ≤ m[left[x]] =
high[j] for some j ∈ L.
Since j ∈ L, it does not overlap i, and hence
high[i] < low[j].
But, the binary-search-tree property implies that
for all i′ ∈ R, we have low[j] ≤ low[i′].
But then { i′ ∈ R : i′ overlaps i } = ∅ .
[Figure: interval i lies entirely to the left of j, and every interval in R starts at or after low[j].]
Introduction to Algorithms
6.046J/18.401J/SMA5503
Lecture 12
Prof. Erik Demaine
Introduction to Algorithms Day 21 L12.2 2001 by Erik D. Demaine
Computational geometry
Algorithms for solving geometric problems
in 2D and higher.
Fundamental objects:
point line segment line
Basic structures:
polygon point set
Introduction to Algorithms Day 21 L12.3 2001 by Erik D. Demaine
Computational geometry
Algorithms for solving geometric problems
in 2D and higher.
Fundamental objects:
point line segment line
Basic structures:
convex hull triangulation
Introduction to Algorithms Day 21 L12.4 2001 by Erik D. Demaine
Orthogonal range searching
Input: n points in d dimensions
E.g., representing a database of n records
each with d numeric fields
Query: Axis-aligned box (in 2D, a rectangle)
Report on the points inside the box:
Are there any points?
How many are there?
List the points.
Introduction to Algorithms Day 21 L12.5 2001 by Erik D. Demaine
Orthogonal range searching
Input: n points in d dimensions
Query: Axis-aligned box (in 2D, a rectangle)
Report on the points inside the box
Goal: Preprocess points into a data structure
to support fast queries
Primary goal: Static data structure
In 1D, we will also obtain a
dynamic data structure
supporting insert and delete
Introduction to Algorithms Day 21 L12.6 2001 by Erik D. Demaine
1D range searching
In 1D, the query is an interval:
First solution using ideas we know:
Interval trees
Represent each point x by the interval [x, x].
Obtain a dynamic structure that can list
k answers in a query in O(k lg n) time.
Introduction to Algorithms Day 21 L12.7 2001 by Erik D. Demaine
1D range searching
In 1D, the query is an interval:
Second solution using ideas we know:
Sort the points and store them in an array
Solve query by binary search on endpoints.
Obtain a static structure that can list
k answers in a query in O(k + lg n) time.
Goal: Obtain a dynamic structure that can list
k answers in a query in O(k + lg n) time.
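The static solution is just a sorted array with two binary searches. A Python sketch using the standard bisect module (the function name is mine):

```python
import bisect

def range_query(sorted_points, lo, hi):
    """Report all points p with lo <= p <= hi in O(k + lg n) time."""
    left = bisect.bisect_left(sorted_points, lo)     # first index with value >= lo
    right = bisect.bisect_right(sorted_points, hi)   # first index with value > hi
    return sorted_points[left:right]

pts = sorted([1, 6, 8, 12, 14, 17, 26, 35, 41, 42, 43, 59, 61])
print(range_query(pts, 7, 41))   # [8, 12, 14, 17, 26, 35, 41]
```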
Introduction to Algorithms Day 21 L12.8 2001 by Erik D. Demaine
1D range searching
In 1D, the query is an interval:
New solution that extends to higher dimensions:
Balanced binary search tree
New organization principle:
Store points in the leaves of the tree.
Internal nodes store copies of the leaves
to satisfy binary search property:
Node x stores in key[x] the maximum
key of any leaf in the left subtree of x.
Introduction to Algorithms Day 21 L12.9 2001 by Erik D. Demaine
Example of a 1D range tree
[Figure: the points 1, 6, 8, 12, 14, 17, 26, 35, 41, 42, 43, 59, 61 stored in sorted order at the leaves.]
Introduction to Algorithms Day 21 L12.10 2001 by Erik D. Demaine
Example of a 1D range tree
[Figure: a 1D range tree over the same leaves; the root stores 12, its children store 6 and 41, and so on, where each internal node x stores key[x] = the maximum leaf key in its left subtree. A search goes left for keys ≤ key[x] and right for keys > key[x].]
Introduction to Algorithms Day 21 L12.11 2001 by Erik D. Demaine
Example of a 1D range query
RANGE-QUERY([7, 41])
[Figure: the same range tree; the leaves 8, 12, 14, 17, 26, 35, 41 in the range are reported, covered by O(lg n) highlighted subtrees between the two search paths.]
Introduction to Algorithms Day 21 L12.12 2001 by Erik D. Demaine
General 1D range query
[Figure: the search paths from the root for the two endpoints diverge at the split node; the answer is the union of the subtrees hanging between the two paths.]
Introduction to Algorithms Day 21 L12.13 2001 by Erik D. Demaine
Pseudocode, part 1:
Find the split node
1D-RANGE-QUERY(T, [x1, x2])
   w ← root[T]
   while w is not a leaf and (x2 ≤ key[w] or key[w] < x1)
      do if x2 ≤ key[w]
            then w ← left[w]
            else w ← right[w]
   w is now the split node
   [traverse left and right from w and report relevant subtrees]
Introduction to Algorithms Day 21 L12.14 2001 by Erik D. Demaine
Pseudocode, part 2: Traverse
left and right from split node
1D-RANGE-QUERY(T, [x1, x2])
   [find the split node]
   w is now the split node
   if w is a leaf
      then output the leaf w if x1 ≤ key[w] ≤ x2
      else v ← left[w]                          Left traversal
           while v is not a leaf
              do if x1 ≤ key[v]
                    then output the subtree rooted at right[v]
                         v ← left[v]
                    else v ← right[v]
           output the leaf v if x1 ≤ key[v] ≤ x2
   [symmetrically for right traversal]
Introduction to Algorithms Day 21 L12.15 2001 by Erik D. Demaine
Analysis of 1D-RANGE-QUERY
Query time: Answer to range query represented
by O(lg n) subtrees found in O(lg n) time.
Thus:
Can test for points in interval in O(lg n) time.
Can count points in interval in O(lg n) time
if we augment the tree with subtree sizes.
Can report the first k points in
interval in O(k + lg n) time.
Space: O(n)
Preprocessing time: O(n lg n)
Introduction to Algorithms Day 21 L12.16 2001 by Erik D. Demaine
2D range trees
Store a primary 1D range tree for all the points
based on x-coordinate.
Thus in O(lg n) time we can find O(lg n) subtrees
representing the points with proper x-coordinate.
How to restrict to points with proper y-coordinate?
Introduction to Algorithms Day 21 L12.17 2001 by Erik D. Demaine
2D range trees
Idea: In primary 1D range tree of x-coordinate,
every node stores a secondary 1D range tree
based on y-coordinate for all points in the subtree
of the node. Recursively search within each.
Introduction to Algorithms Day 21 L12.18 2001 by Erik D. Demaine
Analysis of 2D range trees
Query time: In O(lg² n) = O((lg n)²) time, we can
represent the answer to a range query by O(lg² n) subtrees.
Total cost for reporting k points: O(k + (lg n)²).
Preprocessing time: O(n lg n)
Space: The secondary trees at each level of the
primary tree together store a copy of the points.
Also, each point is present in each secondary
tree along the path from the leaf to the root.
Either way, we obtain that the space is O(n lg n).
Introduction to Algorithms Day 21 L12.19 2001 by Erik D. Demaine
d-dimensional range trees
Query time: O(k + lg^d n) to report k points.
Space: O(n lg^(d−1) n)
Preprocessing time: O(n lg^(d−1) n)
Each node of the secondary y-structure stores
a tertiary z-structure representing the points
in the subtree rooted at the node, etc.
Best data structure to date:
   Query time: O(k + lg^(d−1) n) to report k points.
   Space: O(n (lg n / lg lg n)^(d−1))
   Preprocessing time: O(n lg^(d−1) n)
Introduction to Algorithms Day 21 L12.20 2001 by Erik D. Demaine
Primitive operations:
Crossproduct
Given two vectors v1 = (x1, y1) and v2 = (x2, y2),
is their counterclockwise angle θ
   convex (< 180°),
   reflex (> 180°), or
   borderline (0 or 180°)?
[Figure: a convex angle and a reflex angle between v1 and v2.]
Cross product v1 × v2 = x1·y2 − y1·x2
                      = |v1| |v2| sin θ .
Thus, sign(v1 × v2) = sign(sin θ):  > 0 if convex,
                                    < 0 if reflex,
                                    = 0 if borderline.
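One determinant answers the question; a Python sketch with vectors as (x, y) tuples (the helper names are mine):

```python
def cross(v1, v2):
    # v1 x v2 = x1*y2 - y1*x2 = |v1||v2| sin(theta)
    return v1[0] * v2[1] - v1[1] * v2[0]

def angle_type(v1, v2):
    s = cross(v1, v2)
    return "convex" if s > 0 else "reflex" if s < 0 else "borderline"

print(angle_type((1, 0), (0, 1)))   # convex: the ccw angle from v1 to v2 is 90 degrees
```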
Introduction to Algorithms Day 21 L12.21 2001 by Erik D. Demaine
Primitive operations:
Orientation test
Given three points p1, p2, p3, are they
   in clockwise (cw) order,
   in counterclockwise (ccw) order, or
   collinear?
(p2 − p1) × (p3 − p1):
   > 0 if ccw
   < 0 if cw
   = 0 if collinear
[Figure: three examples: points in cw order, in ccw order, and collinear.]
Introduction to Algorithms Day 21 L12.22 2001 by Erik D. Demaine
Primitive operations:
Sidedness test
Given three points p1, p2, p3, are they
   in clockwise (cw) order,
   in counterclockwise (ccw) order, or
   collinear?
Let L be the oriented line from p1 to p2.
Equivalently, is the point p3
   right of L,
   left of L, or
   on L?
[Figure: the same three examples, viewed relative to the oriented line L through p1 and p2.]
Introduction to Algorithms Day 21 L12.23 2001 by Erik D. Demaine
Line-segment intersection
Given n line segments, does any pair intersect?
Obvious algorithm: O(n²).
[Figure: six line segments a, b, c, d, e, f in the plane.]
Introduction to Algorithms Day 21 L12.24 2001 by Erik D. Demaine
Sweep-line algorithm
Sweep a vertical line from left to right
(conceptually replacing x-coordinate with time).
Maintain dynamic set S of segments
that intersect the sweep line, ordered
(tentatively) by y-coordinate of intersection.
Order changes when
new segment is encountered,
existing segment finishes, or
two segments cross
Key event points are therefore segment endpoints.
segment
endpoints
Introduction to Algorithms Day 21 L12.25 2001 by Erik D. Demaine
[Figure: the sweep line passing over segments a–f, showing the ordered set S of segments crossing it at each event point.]
Introduction to Algorithms Day 21 L12.26 2001 by Erik D. Demaine
Sweep-line algorithm
Process event points in order by sorting segment
endpoints by x-coordinate and looping through:
For a left endpoint of segment s:
Add segment s to dynamic set S.
Check for intersection between s
and its neighbors in S.
For a right endpoint of segment s:
Remove segment s from dynamic set S.
Check for intersection between
the neighbors of s in S.
Introduction to Algorithms Day 21 L12.27 2001 by Erik D. Demaine
Analysis
Use red-black tree to store dynamic set S.
Total running time: O(n lg n).
Introduction to Algorithms Day 21 L12.28 2001 by Erik D. Demaine
Correctness
Theorem: If there is an intersection,
the algorithm finds it.
Proof: Let X be the leftmost intersection point.
Assume for simplicity that
   only two segments s1, s2 pass through X, and
   no two points have the same x-coordinate.
At some point before we reach X,
s1 and s2 become consecutive in the order of S.
   Either they are initially consecutive when s1 or s2 is inserted,
   or they become consecutive when another segment is deleted.
Introduction to Algorithms
6.046J/18.401J/SMA5503
Lecture 13
Prof. Erik Demaine
Introduction to Algorithms Day 23 L12.2 2001 by Erik D. Demaine
Fixed-universe
successor problem
Goal: Maintain a dynamic subset S of size n
of the universe U = {0, 1, , u 1} of size u
subject to these operations:
INSERT(x ∈ U \ S): Add x to S.
DELETE(x ∈ S): Remove x from S.
SUCCESSOR(x ∈ U): Find the next element in S
   larger than x (x may be any element of the universe U).
PREDECESSOR(x ∈ U): Find the previous
   element in S smaller than x.
Introduction to Algorithms Day 23 L12.3 2001 by Erik D. Demaine
Solutions to fixed-universe
successor problem
Goal: Maintain a dynamic subset S of size n
of the universe U = {0, 1, , u 1} of size u
subject to INSERT, DELETE, SUCCESSOR, PREDECESSOR.
Balanced search trees can implement operations in
O(lg n) time, without fixed-universe assumption.
In 1975, Peter van Emde Boas solved this problem
in O(lg lg u) time per operation.
If u is only polynomial in n, that is, u = O(n^c),
then this is O(lg lg n) time per operation:
an exponential speedup!
Introduction to Algorithms Day 23 L12.4 2001 by Erik D. Demaine
O(lg lg u)?!
Where could a bound of O(lg lg u) arise?
Binary search over O(lg u) things:
   T(u) = T(√u) + O(1)
   T(lg u) = T((lg u)/2) + O(1)
           = O(lg lg u)
Introduction to Algorithms Day 23 L12.5 2001 by Erik D. Demaine
(1) Starting point: Bit vector
Bit vector v stores, for each x ∈ U,
   v_x = 1 if x ∈ S, 0 if x ∉ S.
Insert/Delete run in O(1) time.
Successor/Predecessor run in O(u) worst-case time.
Example: u = 16; n = 4; S = {1, 9, 10, 15}.
   index: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
   v:     0 1 0 0 0 0 0 0 0 1 1  0  0  0  0  1
Introduction to Algorithms Day 23 L12.6 2001 by Erik D. Demaine
(2) Split universe into widgets
Carve the universe of size u into √u widgets
W_0, W_1, …, W_{√u − 1}, each of size √u.
Example: u = 16, √u = 4.
   W_0 = 0 1 0 0   (elements 0–3)
   W_1 = 0 0 0 0   (elements 4–7)
   W_2 = 0 1 1 0   (elements 8–11)
   W_3 = 0 0 0 1   (elements 12–15)
Introduction to Algorithms Day 23 L12.7 2001 by Erik D. Demaine
(2) Split universe into widgets
Carve the universe of size u into √u widgets
W_0, W_1, …, W_{√u − 1}, each of size √u.
   W_0 represents 0, 1, …, √u − 1 ∈ U;
   W_1 represents √u, √u + 1, …, 2√u − 1 ∈ U;
      ⋮
   W_i represents i√u, i√u + 1, …, (i + 1)√u − 1 ∈ U;
      ⋮
   W_{√u − 1} represents u − √u, u − √u + 1, …, u − 1 ∈ U.
Introduction to Algorithms Day 23 L12.8 2001 by Erik D. Demaine
(2) Split universe into widgets
Define high(x) ≥ 0 and low(x) ≥ 0 so that
   x = high(x)·√u + low(x).
That is, if we write x ∈ U in binary,
high(x) is the high-order half of the bits,
and low(x) is the low-order half of the bits.
For x ∈ U, high(x) is the index of the widget containing x
and low(x) is the index of x within that widget.
Example: x = 9 = 1001₂, so high(x) = 10₂ = 2 and low(x) = 01₂ = 1.
[Figure: the four widgets W_0 = 0100, W_1 = 0000, W_2 = 0110, W_3 = 0001; x = 9 sits at position 1 of widget W_2.]
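A Python sketch of this index arithmetic, assuming u is a perfect square and passing √u explicitly (the parameter name sqrt_u is mine):

```python
def high(x, sqrt_u):
    return x // sqrt_u           # index of the widget containing x

def low(x, sqrt_u):
    return x % sqrt_u            # index of x within that widget

def index(h, l, sqrt_u):
    return h * sqrt_u + l        # inverse: x = high(x)*sqrt(u) + low(x)

print(high(9, 4), low(9, 4))     # 2 1  (x = 9 lives at position 1 of widget W_2)
```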
Introduction to Algorithms Day 23 L12.9 2001 by Erik D. Demaine
(2) Split universe into widgets
INSERT(x)
   insert x into widget W_high(x) at position low(x).
   mark W_high(x) as nonempty.
Running time: O(1).
Introduction to Algorithms Day 23 L12.10 2001 by Erik D. Demaine
(2) Split universe into widgets
SUCCESSOR(x)
   look for the successor of x within widget W_high(x),            O(√u)
      starting after position low(x).
   if successor found
      then return it
      else find the smallest i > high(x) for which W_i is nonempty.  O(√u)
           return the smallest element in W_i.                       O(√u)
Running time T(u) = O(√u).
Introduction to Algorithms Day 23 L12.11 2001 by Erik D. Demaine
Revelation
SUCCESSOR(x)
   look for the successor of x within widget W_high(x),            (recursive successor)
      starting after position low(x).
   if successor found
      then return it
      else find the smallest i > high(x) for which W_i is nonempty.  (recursive successor)
           return the smallest element in W_i.                       (recursive successor)
Introduction to Algorithms Day 23 L12.12 2001 by Erik D. Demaine
(3) Recursion
Represent the universe by a widget of size u.
Recursively split each widget W of size |W| into
√|W| subwidgets sub[W][0], sub[W][1], …, sub[W][√|W| − 1],
each of size √|W|.
Store a summary widget summary[W] of size √|W|
representing which subwidgets are nonempty.
[Figure: widget W consists of summary[W] plus the subwidgets sub[W][0], sub[W][1], …, sub[W][√|W| − 1].]
Introduction to Algorithms Day 23 L12.13 2001 by Erik D. Demaine
(3) Recursion
INSERT(x, W)
   if sub[W][high(x)] is empty
      then INSERT(high(x), summary[W])
   INSERT(low(x), sub[W][high(x)])
Running time T(u) = 2 T(√u) + O(1), so
   T(lg u) = 2 T((lg u)/2) + O(1) = O(lg u).
Define high(x) ≥ 0 and low(x) ≥ 0 so that
   x = high(x)·√|W| + low(x).
Introduction to Algorithms Day 23 L12.14 2001 by Erik D. Demaine
(3) Recursion
SUCCESSOR(x, W)
   j ← SUCCESSOR(low(x), sub[W][high(x)])          T(√u)
   if j < ∞
      then return high(x)·√|W| + j
      else i ← SUCCESSOR(high(x), summary[W])      T(√u)
           j ← SUCCESSOR(−∞, sub[W][i])            T(√u)
           return i·√|W| + j
Running time T(u) = 3 T(√u) + O(1), so
   T(lg u) = 3 T((lg u)/2) + O(1) = O((lg u)^(lg 3)).
Introduction to Algorithms Day 23 L12.15 2001 by Erik D. Demaine
Improvements
2 calls: T(u) = 2 T(√u) + O(1) = O(lg u)
3 calls: T(u) = 3 T(√u) + O(1) = O((lg u)^(lg 3))
1 call:  T(u) = 1 T(√u) + O(1) = O(lg lg u)
Need to reduce INSERT and SUCCESSOR
down to 1 recursive call each.
We're closer to this goal than it may seem!
Introduction to Algorithms Day 23 L12.16 2001 by Erik D. Demaine
Recursive calls in successor
If x has a successor within sub[W][high(x)],
then there is only 1 recursive call to SUCCESSOR.
Otherwise, there are 3 recursive calls:
SUCCESSOR(low(x), sub[W][high(x)])
   discovers that sub[W][high(x)] has no successor.
SUCCESSOR(high(x), summary[W])
   finds the next nonempty subwidget sub[W][i].
SUCCESSOR(−∞, sub[W][i])
   finds the smallest element in subwidget sub[W][i].
Introduction to Algorithms Day 23 L12.17 2001 by Erik D. Demaine
Reducing recursive calls
in successor
If x has no successor within sub[W][high(x)],
there are 3 recursive calls:
SUCCESSOR(low(x), sub[W][high(x)])
discovers that sub[W][high(x)] has no successor of x.
Could be determined using the maximum
value in the subwidget sub[W][high(x)].
SUCCESSOR(high(x), summary[W])
finds the next nonempty subwidget sub[W][i].
SUCCESSOR(−∞, sub[W][i])
finds the minimum element in subwidget sub[W][i].
Introduction to Algorithms Day 23 L12.18 2001 by Erik D. Demaine
(4) Improved successor
INSERT(x, W)
  if sub[W][high(x)] is empty
    then INSERT(high(x), summary[W])
  INSERT(low(x), sub[W][high(x)])
  if x < min[W] then min[W] ← x        (new — augmentation)
  if x > max[W] then max[W] ← x        (new — augmentation)
Running time T(u) = 2 T(√u) + O(1),
i.e., T(lg u) = 2 T((lg u) / 2) + O(1) = O(lg u).
Introduction to Algorithms Day 23 L12.19 2001 by Erik D. Demaine
(4) Improved successor
SUCCESSOR(x, W)
  if low(x) < max[sub[W][high(x)]]
    then j ← SUCCESSOR(low(x), sub[W][high(x)])      (T(√u))
         return high(x)·√|W| + j
    else i ← SUCCESSOR(high(x), summary[W])           (T(√u))
         j ← min[sub[W][i]]
         return i·√|W| + j
Running time T(u) = 1 T(√u) + O(1) = O(lg lg u).
Introduction to Algorithms Day 23 L12.20 2001 by Erik D. Demaine
Recursive calls in insert
If sub[W][high(x)] is already in summary[W],
then there is only 1 recursive call to INSERT.
Otherwise, there are 2 recursive calls:
INSERT(high(x), summary[W])
INSERT(low(x), sub[W][high(x)])
Idea: We know that sub[W][high(x)] is empty.
Avoid the second recursive call by specially
storing a widget containing just 1 element.
Specifically, do not store min recursively.
Introduction to Algorithms Day 23 L12.21 2001 by Erik D. Demaine
(5) Improved insert
INSERT(x, W)
  if x < min[W] then exchange x ↔ min[W]
  if sub[W][high(x)] is nonempty, that is,
     min[sub[W][high(x)]] ≠ NIL
    then INSERT(low(x), sub[W][high(x)])
    else min[sub[W][high(x)]] ← low(x)
         INSERT(high(x), summary[W])
  if x > max[W] then max[W] ← x
Running time T(u) = 1 T(√u) + O(1) = O(lg lg u).
Introduction to Algorithms Day 23 L12.22 2001 by Erik D. Demaine
(5) Improved insert
SUCCESSOR(x, W)
  if x < min[W] then return min[W]                    (new)
  if low(x) < max[sub[W][high(x)]]
    then j ← SUCCESSOR(low(x), sub[W][high(x)])       (T(√u))
         return high(x)·√|W| + j
    else i ← SUCCESSOR(high(x), summary[W])            (T(√u))
         j ← min[sub[W][i]]
         return i·√|W| + j
Running time T(u) = 1 T(√u) + O(1) = O(lg lg u).
Introduction to Algorithms Day 23 L12.23 2001 by Erik D. Demaine
Deletion
DELETE(x, W)
  if min[W] = NIL or x < min[W] then return
  if x = min[W]
    then i ← min[summary[W]]
         x ← i·√|W| + min[sub[W][i]]
         min[W] ← x
  DELETE(low(x), sub[W][high(x)])
  if sub[W][high(x)] is now empty, that is,
     min[sub[W][high(x)]] = NIL
    then DELETE(high(x), summary[W])
         (in this case, the first recursive call was cheap)
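To make the recursion concrete, here is a minimal Python sketch (my own, not from the slides) of the widget structure from steps (3)–(4): INSERT makes up to two recursive calls (O(lg u)) and SUCCESSOR makes one (O(lg lg u)) thanks to the min/max augmentation. The lazy-min trick of step (5) is omitted for brevity, and all class and method names are mine.

class Widget:
    def __init__(self, u):                 # u = universe size, a power of 2, u >= 2
        self.u = u
        self.min = None                    # smallest element present (None = empty)
        self.max = None                    # largest element present
        if u > 2:
            k = u.bit_length() - 1         # u = 2**k
            self.lo_bits = k // 2
            self.lo_size = 1 << self.lo_bits            # size of each subwidget
            self.hi_size = 1 << (k - self.lo_bits)      # number of subwidgets
            self.sub = [Widget(self.lo_size) for _ in range(self.hi_size)]
            self.summary = Widget(self.hi_size)

    def high(self, x): return x >> self.lo_bits
    def low(self, x):  return x & (self.lo_size - 1)

    def insert(self, x):                   # up to 2 recursive calls: O(lg u)
        if self.min is None:
            self.min = self.max = x
        self.min, self.max = min(self.min, x), max(self.max, x)
        if self.u > 2:
            h, l = self.high(x), self.low(x)
            if self.sub[h].min is None:
                self.summary.insert(h)     # mark subwidget h as nonempty
            self.sub[h].insert(l)

    def successor(self, x):                # smallest stored element > x, or None
        if self.u <= 2:
            return 1 if x < 1 and self.max == 1 else None
        if self.min is not None and x < self.min:
            return self.min
        h, l = self.high(x), self.low(x)
        if self.sub[h].max is not None and l < self.sub[h].max:
            return h * self.lo_size + self.sub[h].successor(l)
        i = self.summary.successor(h)      # next nonempty subwidget after h
        if i is None:
            return None
        return i * self.lo_size + self.sub[i].min

w = Widget(16)
for v in (1, 9, 10, 15):
    w.insert(v)
print(w.successor(9), w.successor(10))     # 10 15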
Introduction to Algorithms
6.046J/18.401J/SMA5503
Lecture 14
Prof. Charles E. Leiserson
Introduction to Algorithms Day 24 L14.2 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
How large should a hash
table be?
Problem: What if we don't know the proper size
in advance?
Goal: Make the table as small as possible, but
large enough so that it won't overflow (or
otherwise become inefficient).
IDEA: Whenever the table overflows, grow it
by allocating (via malloc or new) a new, larger
table. Move all items from the old table into the
new one, and free the storage for the old table.
Solution: Dynamic tables.
Introduction to Algorithms Day 24 L14.3 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Example of a dynamic table
(Slides step through INSERT operations 1–7: whenever the table overflows, a new
table of twice the size is allocated, all items are moved into it, and the old
table is freed. Over this sequence the table size grows 1 → 2 → 4 → 8.)
Introduction to Algorithms Day 24 L14.14 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Worst-case analysis
Consider a sequence of n insertions. The
worst-case time to execute one insertion is
Θ(n). Therefore, the worst-case time for n
insertions is n · Θ(n) = Θ(n²).
WRONG! In fact, the worst-case cost for
n insertions is only Θ(n) ≪ Θ(n²).
Let's see why.
Introduction to Algorithms Day 24 L14.15 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Tighter analysis
Let c_i = the cost of the i th insertion
        = i  if i − 1 is an exact power of 2,
          1  otherwise.

i       1  2  3  4  5  6  7  8  9  10
size_i  1  2  4  4  8  8  8  8  16 16
c_i     1  2  3  1  5  1  1  1  9  1
Introduction to Algorithms Day 24 L14.16 2001 by Charles E. Leiserson
Tighter analysis
Each c_i is 1 for the insertion itself, plus the cost of copying the old items
when the table doubles:

i       1  2  3  4  5  6  7  8  9  10
size_i  1  2  4  4  8  8  8  8  16 16
c_i     1  1  1  1  1  1  1  1  1  1
      +    1  2     4           8
Introduction to Algorithms Day 24 L14.17 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Tighter analysis (continued)
Cost of n insertions = Σ_{i=1}^{n} c_i
                     ≤ n + Σ_{j=0}^{⌊lg(n−1)⌋} 2^j
                     ≤ 3n
                     = Θ(n).
Thus, the average cost of each dynamic-table
operation is Θ(n)/n = Θ(1).
Introduction to Algorithms Day 24 L14.18 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Amortized analysis
An amortized analysis is any strategy for
analyzing a sequence of operations to
show that the average cost per operation is
small, even though a single operation
within the sequence might be expensive.
Even though we're taking averages, however,
probability is not involved!
An amortized analysis guarantees the
average performance of each operation in
the worst case.
Introduction to Algorithms Day 24 L14.19 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Types of amortized analyses
Three common amortization arguments:
the aggregate method,
the accounting method,
the potential method.
We've just seen an aggregate analysis.
The aggregate method, though simple, lacks the
precision of the other two methods. In particular,
the accounting and potential methods allow a
specific amortized cost to be allocated to each
operation.
Introduction to Algorithms Day 24 L14.20 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Accounting method
Charge the i th operation a fictitious amortized cost
ĉ_i, where $1 pays for 1 unit of work (i.e., time).
This fee is consumed to perform the operation.
Any amount not immediately consumed is stored
in the bank for use by subsequent operations.
The bank balance must not go negative! We
must ensure that
    Σ_{i=1}^{n} c_i ≤ Σ_{i=1}^{n} ĉ_i
for all n.
Thus, the total amortized costs provide an upper
bound on the total true costs.
Introduction to Algorithms Day 24 L14.21 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Example:
Accounting analysis of
dynamic tables
Charge an amortized cost of ĉ_i = $3 for the i th
insertion.
$1 pays for the immediate insertion.
$2 is stored for later table doubling.
When the table doubles, $1 pays to move a
recent item, and $1 pays to move an old item.
(Slides show the credits: just before a doubling, each item in the newer half of
the table carries $2; just after the doubling and the moves, every slot is back
to $0.)
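To make the accounting concrete, here is a small Python sketch (my own, not from the slides) of a doubling table that tracks the true cost of each insertion and the bank balance under a $3 amortized charge; the assertion checks that the balance never goes negative. (Because every operation is charged $3 here, the balance runs $1 ahead of the slide's table, which charges only $2 for the very first insertion.)

class DynamicTable:
    """Doubling array; tracks true cost c_i and the accounting-method bank."""
    def __init__(self):
        self.capacity = 1
        self.items = []
        self.bank = 0                          # stored credit ($)

    def insert(self, x):
        cost = 1                               # $1 for the immediate insertion
        if len(self.items) == self.capacity:   # overflow: double and move items
            cost += len(self.items)            # $1 to move each existing item
            self.capacity *= 2
        self.items.append(x)
        self.bank += 3 - cost                  # charge $3, consume the true cost
        assert self.bank >= 0, "bank balance went negative"
        return cost

table = DynamicTable()
costs = [table.insert(i) for i in range(1, 11)]
print(costs)      # [1, 2, 3, 1, 5, 1, 1, 1, 9, 1] — matches the slide's c_i row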
Introduction to Algorithms Day 24 L14.24 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Accounting analysis
(continued)
Key invariant: Bank balance never drops below 0.
Thus, the sum of the amortized costs provides an
upper bound on the sum of the true costs.

i       1  2  3  4  5  6  7  8  9  10
size_i  1  2  4  4  8  8  8  8  16 16
c_i     1  2  3  1  5  1  1  1  9  1
ĉ_i     2* 3  3  3  3  3  3  3  3  3
bank_i  1  2  2  4  2  4  6  8  2  4

*Okay, so I lied. The first operation costs only $2, not $3.
Introduction to Algorithms Day 24 L14.25 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Potential method
IDEA: View the bank account as the potential
energy (à la physics) of the dynamic set.
Framework:
Start with an initial data structure D_0.
Operation i transforms D_{i−1} to D_i.
The cost of operation i is c_i.
Define a potential function Φ : {D_i} → R,
such that Φ(D_0) = 0 and Φ(D_i) ≥ 0 for all i.
The amortized cost ĉ_i with respect to Φ is
defined to be ĉ_i = c_i + Φ(D_i) − Φ(D_{i−1}).
Introduction to Algorithms Day 24 L14.26 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Understanding potentials
ĉ_i = c_i + Φ(D_i) − Φ(D_{i−1})
      (the last two terms are the potential difference ΔΦ_i)
If ΔΦ_i > 0, then ĉ_i > c_i. Operation i stores
work in the data structure for later use.
If ΔΦ_i < 0, then ĉ_i < c_i. The data structure
delivers up stored work to help pay for
operation i.
Introduction to Algorithms Day 24 L14.27 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
The amortized costs bound
the true costs
The total amortized cost of n operations is
Σ_{i=1}^{n} ĉ_i = Σ_{i=1}^{n} ( c_i + Φ(D_i) − Φ(D_{i−1}) )
Summing both sides.
Introduction to Algorithms Day 24 L14.28 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
The amortized costs bound
the true costs
The total amortized cost of n operations is
Σ_{i=1}^{n} ĉ_i = Σ_{i=1}^{n} ( c_i + Φ(D_i) − Φ(D_{i−1}) )
              = Σ_{i=1}^{n} c_i + Φ(D_n) − Φ(D_0)
The series telescopes.
Introduction to Algorithms Day 24 L14.29 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
The amortized costs bound
the true costs
The total amortized cost of n operations is
Σ_{i=1}^{n} ĉ_i = Σ_{i=1}^{n} ( c_i + Φ(D_i) − Φ(D_{i−1}) )
              = Σ_{i=1}^{n} c_i + Φ(D_n) − Φ(D_0)
              ≥ Σ_{i=1}^{n} c_i
since Φ(D_n) ≥ 0 and Φ(D_0) = 0.
Introduction to Algorithms Day 24 L14.30 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Potential analysis of table
doubling
Define the potential of the table after the i th
insertion by Φ(D_i) = 2i − 2^⌈lg i⌉. (Assume that
2^⌈lg 0⌉ = 0.)
Note:
Φ(D_0) = 0,
Φ(D_i) ≥ 0 for all i.
Example: Φ = 2·6 − 2³ = 4
(cf. the $4 of stored credit in the accounting method).
Introduction to Algorithms Day 24 L14.31 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Calculation of amortized costs
The amortized cost of the i th insertion is
ĉ_i = c_i + Φ(D_i) − Φ(D_{i−1})
    = i + (2i − 2^⌈lg i⌉) − (2(i−1) − 2^⌈lg (i−1)⌉)
          if i − 1 is an exact power of 2,
      1 + (2i − 2^⌈lg i⌉) − (2(i−1) − 2^⌈lg (i−1)⌉)
          otherwise.
Introduction to Algorithms Day 24 L14.32 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Calculation (Case 1)
Case 1: i − 1 is an exact power of 2.
ĉ_i = i + (2i − 2^⌈lg i⌉) − (2(i−1) − 2^⌈lg (i−1)⌉)
    = i + 2 − (2^⌈lg i⌉ − 2^⌈lg (i−1)⌉)
    = i + 2 − (2(i − 1) − (i − 1))
    = i + 2 − 2i + 2 + i − 1
    = 3
Introduction to Algorithms Day 24 L14.33 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Calculation (Case 2)
Case 2: i − 1 is not an exact power of 2.
ĉ_i = 1 + (2i − 2^⌈lg i⌉) − (2(i−1) − 2^⌈lg (i−1)⌉)
    = 1 + 2 − (2^⌈lg i⌉ − 2^⌈lg (i−1)⌉)
    = 3
(since ⌈lg i⌉ = ⌈lg (i−1)⌉ when i − 1 is not a power of 2)
Therefore, n insertions cost Θ(n) in the worst case.
Exercise: Fix the bug in this analysis to show that
the amortized cost of the first insertion is only 2.
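A quick numeric check (my own, not from the slides) of the calculation: with Φ(i) = 2i − 2^⌈lg i⌉, the amortized cost c_i + Φ(i) − Φ(i−1) comes out to 3 for every i ≥ 2, and to 2 for i = 1 — the "bug" the exercise asks about.

from math import ceil, log2

def phi(i):
    return 0 if i == 0 else 2 * i - 2 ** ceil(log2(i))

def true_cost(i):
    # cost is i when i - 1 is an exact power of 2, else 1
    return i if i > 1 and (i - 1) & (i - 2) == 0 else 1

for i in range(1, 11):
    print(i, true_cost(i) + phi(i) - phi(i - 1))   # prints 2 for i=1, then 3's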
Introduction to Algorithms Day 24 L14.34 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Conclusions
Amortized costs can provide a clean abstraction
of data-structure performance.
Any of the analysis methods can be used when
an amortized analysis is called for, but each
method has some situations where it is arguably
the simplest.
Different schemes may work for assigning
amortized costs in the accounting method, or
potentials in the potential method, sometimes
yielding radically different bounds.
Introduction to Algorithms
6.046J/18.401J/SMA5503
Lecture 15
Prof. Charles E. Leiserson
Introduction to Algorithms Day 26 L15.2 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Dynamic programming
Design technique, like divide-and-conquer.
Example: Longest Common Subsequence (LCS)
Given two sequences x[1 . . m] and y[1 . . n], find
a longest subsequence common to them both.
x: A B C B D A B
y: B D C A B A
Example: BCBA = LCS(x, y) — "a" longest common subsequence, not "the".
(LCS(x, y) is functional notation, but not a function.)
Introduction to Algorithms Day 26 L15.3 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Brute-force LCS algorithm
Check every subsequence of x[1 . . m] to see
if it is also a subsequence of y[1 . . n].
Analysis
Checking = O(n) time per subsequence.
2^m subsequences of x (each bit-vector of
length m determines a distinct subsequence
of x).
Worst-case running time = O(n·2^m)
= exponential time.
Introduction to Algorithms Day 26 L15.4 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Towards a better algorithm
Simplification:
1. Look at the length of a longest-common
subsequence.
2. Extend the algorithm to find the LCS itself.
Strategy: Consider prefixes of x and y.
Define c[i, j] = | LCS(x[1 . . i], y[1 . . j]) |.
Then, c[m, n] = | LCS(x, y) |.
Notation: Denote the length of a sequence s
by | s |.
Introduction to Algorithms Day 26 L15.5 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Recursive formulation
Theorem.
c[i, j] = c[i−1, j−1] + 1                 if x[i] = y[j],
          max{ c[i−1, j], c[i, j−1] }      otherwise.
Proof. Case x[i] = y[j]:
Let z[1 . . k] = LCS(x[1 . . i], y[1 . . j]), where c[i, j]
= k. Then, z[k] = x[i], or else z could be extended.
Thus, z[1 . . k−1] is a CS of x[1 . . i−1] and y[1 . . j−1].
(Figure: x[1 . . m] and y[1 . . n] drawn with positions i and j aligned on the
common symbol x[i] = y[j].)
Introduction to Algorithms Day 26 L15.6 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Proof (continued)
Claim: z[1 . . k−1] = LCS(x[1 . . i−1], y[1 . . j−1]).
Suppose w is a longer CS of x[1 . . i−1] and
y[1 . . j−1], that is, |w| > k−1. Then, cut and
paste: w || z[k] (w concatenated with z[k]) is a
common subsequence of x[1 . . i] and y[1 . . j]
with |w || z[k]| > k. Contradiction, proving the
claim.
Thus, c[i−1, j−1] = k−1, which implies that c[i, j]
= c[i−1, j−1] + 1.
Other cases are similar.
Introduction to Algorithms Day 26 L15.7 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Dynamic-programming
hallmark #1
Optimal substructure
An optimal solution to a problem
(instance) contains optimal
solutions to subproblems.
If z = LCS(x, y), then any prefix of z is
an LCS of a prefix of x and a prefix of y.
Introduction to Algorithms Day 26 L15.8 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Recursive algorithm for LCS
LCS(x, y, i, j)
  if x[i] = y[j]
    then c[i, j] ← LCS(x, y, i−1, j−1) + 1
    else c[i, j] ← max{ LCS(x, y, i−1, j),
                         LCS(x, y, i, j−1) }
Worst case: x[i] ≠ y[j], in which case the
algorithm evaluates two subproblems, each
with only one parameter decremented.
Introduction to Algorithms Day 26 L15.9 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Recursion tree
m = 3, n = 4: the root (3,4) spawns (2,4) and (3,3); these spawn (1,4), (2,3),
(3,2), (2,3), and so on — the same subproblems, e.g. (2,3), (1,3), and (2,2),
appear over and over, but we're solving subproblems already solved!
Height = m + n ⇒ work potentially exponential.
Introduction to Algorithms Day 26 L15.10 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Dynamic-programming
hallmark #2
Overlapping subproblems
A recursive solution contains a
small number of distinct
subproblems repeated many times.
The number of distinct LCS subproblems for
two strings of lengths m and n is only mn.
Introduction to Algorithms Day 26 L15.11 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Memoization algorithm
Memoization: After computing a solution to a
subproblem, store it in a table. Subsequent calls
check the table to avoid redoing work.
LCS(x, y, i, j)
  if c[i, j] = NIL
    then if x[i] = y[j]
           then c[i, j] ← LCS(x, y, i−1, j−1) + 1     (same as before)
           else c[i, j] ← max{ LCS(x, y, i−1, j),
                                LCS(x, y, i, j−1) }
Time = Θ(mn) = constant work per table entry.
Space = Θ(mn).
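A minimal Python sketch (my own) of the memoized recurrence; lru_cache plays the role of the c[i, j] table.

from functools import lru_cache

def lcs_length(x, y):
    @lru_cache(maxsize=None)
    def c(i, j):                      # length of an LCS of x[:i] and y[:j]
        if i == 0 or j == 0:
            return 0
        if x[i - 1] == y[j - 1]:
            return c(i - 1, j - 1) + 1
        return max(c(i - 1, j), c(i, j - 1))
    return c(len(x), len(y))

print(lcs_length("ABCBDAB", "BDCABA"))   # 4, e.g. "BCBA"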
Introduction to Algorithms Day 26 L15.12 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Dynamic-programming
algorithm
IDEA:
Compute the
table bottom-up.
Time = Θ(mn).
(Slide shows the full c table for the example strings filled in bottom-up;
the bottom-right entry is c[m, n] = 4.)
Introduction to Algorithms Day 26 L15.13 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Dynamic-programming
algorithm
IDEA:
Compute the
table bottom-up.
Time = Θ(mn).
Reconstruct the
LCS by tracing
backwards from c[m, n]; the traced path in the example table spells out B, C, B, A.
Space = Θ(mn).
Exercise: O(min{m, n}) space.
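A bottom-up sketch (my own) of the Θ(mn)-time table computation plus the backward trace that reconstructs one LCS, as described on the slide.

def lcs(x, y):
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    # Trace backwards from c[m][n] to spell out one LCS.
    out, i, j = [], m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1]); i -= 1; j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))

print(lcs("ABCBDAB", "BDCABA"))   # one LCS of length 4, e.g. "BCBA"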
Introduction to Algorithms
6.046J/18.401J/SMA5503
Lecture 16
Prof. Charles E. Leiserson
Introduction to Algorithms Day 27 L16.2 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Graphs (review)
Definition. A directed graph (digraph)
G = (V, E) is an ordered pair consisting of
a set V of vertices (singular: vertex),
a set E ⊆ V × V of edges.
In an undirected graph G = (V, E), the edge
set E consists of unordered pairs of vertices.
In either case, we have |E| = O(V²). Moreover,
if G is connected, then |E| ≥ |V| − 1, which
implies that lg |E| = Θ(lg V).
(Review CLRS, Appendix B.)
Introduction to Algorithms Day 27 L16.3 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Adjacency-matrix
representation
The adjacency matrix of a graph G = (V, E), where
V = {1, 2, …, n}, is the matrix A[1 . . n, 1 . . n]
given by
A[i, j] = 1 if (i, j) ∈ E,
          0 if (i, j) ∉ E.
Example (vertices 1, 2, 3, 4 with edges 1→2, 1→3, 2→3, 4→3):

A   1 2 3 4
1   0 1 1 0
2   0 0 1 0
3   0 0 0 0
4   0 0 1 0

Θ(V²) storage — a dense representation.
Introduction to Algorithms Day 27 L16.4 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Adjacency-list representation
An adjacency list of a vertex v ∈ V is the list Adj[v]
of vertices adjacent to v.
Example (same digraph):
Adj[1] = {2, 3}
Adj[2] = {3}
Adj[3] = {}
Adj[4] = {3}
For undirected graphs, |Adj[v]| = degree(v).
For digraphs, |Adj[v]| = out-degree(v).
Handshaking Lemma: Σ_{v∈V} degree(v) = 2|E| for undirected
graphs ⇒ adjacency lists use Θ(V + E) storage —
a sparse representation (for either type of graph).
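A minimal sketch (my own) of the adjacency-list representation for the 4-vertex digraph on the slide; storage is Θ(V + E).

from collections import defaultdict

adj = defaultdict(list)
for u, v in [(1, 2), (1, 3), (2, 3), (4, 3)]:
    adj[u].append(v)
    # for an undirected graph we would also do: adj[v].append(u)

print(adj[1], adj[3])   # [2, 3] [] — out-degree of v is len(adj[v])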
Introduction to Algorithms Day 27 L16.5 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Minimum spanning trees
Input: A connected, undirected graph G = (V, E)
with weight function w : E → R.
For simplicity, assume that all edge weights are
distinct. (CLRS covers the general case.)
Output: A spanning tree T — a tree that connects
all vertices — of minimum weight:
w(T) = Σ_{(u,v)∈T} w(u, v).
Introduction to Algorithms Day 27 L16.6 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Example of MST
(Figure: an example graph with edge weights 3, 5, 6, 7, 8, 9, 10, 12, 14, 15;
the minimum spanning tree is highlighted.)
Introduction to Algorithms Day 27 L16.7 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Optimal substructure
MST T:
(Other edges of G are not shown.)
Remove any edge (u, v) ∈ T. Then, T is partitioned
into two subtrees T_1 and T_2.
Theorem. The subtree T_1 is an MST of G_1 = (V_1, E_1),
the subgraph of G induced by the vertices of T_1:
V_1 = vertices of T_1,
E_1 = { (x, y) ∈ E : x, y ∈ V_1 }.
Similarly for T_2.
Introduction to Algorithms Day 27 L16.8 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Proof of optimal substructure
Proof. Cut and paste:
w(T) = w(u, v) + w(T_1) + w(T_2).
If T_1′ were a lower-weight spanning tree than T_1 for
G_1, then T′ = {(u, v)} ∪ T_1′ ∪ T_2 would be a
lower-weight spanning tree than T for G.
Do we also have overlapping subproblems?
Yes.
Great, then dynamic programming may work!
Yes, but MST exhibits another powerful property
which leads to an even more efficient algorithm.
Introduction to Algorithms Day 27 L16.9 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Hallmark for greedy
algorithms
Greedy-choice property
A locally optimal choice
is globally optimal.
Theorem. Let T be the MST of G = (V, E),
and let A ⊆ V. Suppose that (u, v) ∈ E is the
least-weight edge connecting A to V − A.
Then, (u, v) ∈ T.
Introduction to Algorithms Day 27 L16.10 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Proof of theorem
Proof. Suppose (u, v) ∉ T. Cut and paste.
(Figure: the tree T drawn across the cut (A, V − A), with (u, v) the least-weight
edge connecting A to V − A.)
Consider the unique simple path from u to v in T.
Swap (u, v) with the first edge on this path that
connects a vertex in A to a vertex in V − A.
A lighter-weight spanning tree than T results.
Introduction to Algorithms Day 27 L16.14 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Prim's algorithm
IDEA: Maintain V − A as a priority queue Q. Key
each vertex in Q with the weight of the least-
weight edge connecting it to a vertex in A.
Q ← V
key[v] ← ∞ for all v ∈ V
key[s] ← 0 for some arbitrary s ∈ V
while Q ≠ ∅
  do u ← EXTRACT-MIN(Q)
     for each v ∈ Adj[u]
       do if v ∈ Q and w(u, v) < key[v]
            then key[v] ← w(u, v)        (DECREASE-KEY)
                 π[v] ← u
At the end, {(v, π[v])} forms the MST.
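A runnable sketch (my own) of Prim's algorithm using Python's heapq as the priority queue; a lazy-deletion heap stands in for DECREASE-KEY, and the small example graph is made up for illustration.

import heapq

def prim_mst(adj, s):
    """adj: {u: [(v, w), ...]} undirected; returns the MST edges (pi[v], v)."""
    key = {v: float("inf") for v in adj}
    pi = {v: None for v in adj}
    key[s] = 0
    in_tree = set()
    heap = [(0, s)]
    while heap:
        k, u = heapq.heappop(heap)
        if u in in_tree or k > key[u]:
            continue                       # stale heap entry (lazy deletion)
        in_tree.add(u)
        for v, w in adj[u]:
            if v not in in_tree and w < key[v]:
                key[v], pi[v] = w, u       # implicit DECREASE-KEY
                heapq.heappush(heap, (w, v))
    return [(pi[v], v) for v in adj if pi[v] is not None]

adj = {1: [(2, 4), (3, 1)], 2: [(1, 4), (3, 2)], 3: [(1, 1), (2, 2)]}
print(prim_mst(adj, 1))                    # [(3, 2), (1, 3)] — total weight 3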
Introduction to Algorithms Day 27 L16.15 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Example of Prim's algorithm
(Slides run the algorithm on the example graph: starting with key[s] = 0 and all
other keys ∞, each step extracts the minimum-key vertex, moves it from V − A into
A, and decreases the keys of its neighbors still in Q, until every vertex is in A
and the MST edges {(v, π[v])} are highlighted.)
Introduction to Algorithms Day 27 L16.28 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Analysis of Prim
Q ← V
key[v] ← ∞ for all v ∈ V
key[s] ← 0 for some arbitrary s ∈ V
while Q ≠ ∅                                   (|V| times)
  do u ← EXTRACT-MIN(Q)
     for each v ∈ Adj[u]                       (degree(u) times)
       do if v ∈ Q and w(u, v) < key[v]
            then key[v] ← w(u, v)
                 π[v] ← u
Handshaking Lemma ⇒ Θ(E) implicit DECREASE-KEYs.
Time = Θ(V)·T_EXTRACT-MIN + Θ(E)·T_DECREASE-KEY
Introduction to Algorithms Day 27 L16.29 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Analysis of Prim (continued)
Time = Θ(V)·T_EXTRACT-MIN + Θ(E)·T_DECREASE-KEY

Q               T_EXTRACT-MIN       T_DECREASE-KEY      Total
array           O(V)                O(1)                O(V²)
binary heap     O(lg V)             O(lg V)             O(E lg V)
Fibonacci heap  O(lg V) amortized   O(1) amortized      O(E + V lg V) worst case
Introduction to Algorithms Day 27 L16.30 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
MST algorithms
Kruskal's algorithm (see CLRS):
Uses the disjoint-set data structure (Lecture 20).
Running time = O(E lg V).
Best to date:
Karger, Klein, and Tarjan [1993].
Randomized algorithm.
O(V + E) expected time.
Introduction to Algorithms
6.046J/18.401J/SMA5503
Lecture 17
Prof. Erik Demaine
Introduction to Algorithms Day 29 L17.2 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Paths in graphs
Consider a digraph G = (V, E) with edge-weight
function w : E → R. The weight of path p = v_1 →
v_2 → ⋯ → v_k is defined to be
w(p) = Σ_{i=1}^{k−1} w(v_i, v_{i+1}).
Example: a path v_1 → v_2 → v_3 → v_4 → v_5 with edge
weights 4, 2, −5, 1 has w(p) = 2.
Introduction to Algorithms Day 29 L17.3 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Shortest paths
A shortest path from u to v is a path of
minimum weight from u to v. The shortest-
path weight from u to v is defined as
δ(u, v) = min{ w(p) : p is a path from u to v }.
Note: δ(u, v) = ∞ if no path from u to v exists.
Introduction to Algorithms Day 29 L17.4 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Optimal substructure
Theorem. A subpath of a shortest path is a
shortest path.
Proof. Cut and paste:
Introduction to Algorithms Day 29 L17.5 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Triangle inequality
Theorem. For all u, v, x ∈ V, we have
δ(u, v) ≤ δ(u, x) + δ(x, v).
Proof.
(Figure: the direct shortest path from u to v, weight δ(u, v), is no longer than
the route through x of weight δ(u, x) + δ(x, v).)
Introduction to Algorithms Day 29 L17.6 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Well-definedness of shortest
paths
If a graph G contains a negative-weight cycle,
then some shortest paths may not exist.
Example:
(Figure: a path from u to v passing through a cycle of total weight < 0.)
Introduction to Algorithms Day 29 L17.7 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Single-source shortest paths
Problem. From a given source vertex s ∈ V, find
the shortest-path weights δ(s, v) for all v ∈ V.
If all edge weights w(u, v) are nonnegative, all
shortest-path weights must exist.
IDEA: Greedy.
1. Maintain a set S of vertices whose shortest-
   path distances from s are known.
2. At each step add to S the vertex v ∈ V − S
   whose distance estimate from s is minimal.
3. Update the distance estimates of vertices
   adjacent to v.
Introduction to Algorithms Day 29 L17.8 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Dijkstra's algorithm
d[s] ← 0
for each v ∈ V − {s}
  do d[v] ← ∞
S ← ∅
Q ← V                         (Q is a priority queue maintaining V − S)
while Q ≠ ∅
  do u ← EXTRACT-MIN(Q)
     S ← S ∪ {u}
     for each v ∈ Adj[u]
       do if d[v] > d[u] + w(u, v)
            then d[v] ← d[u] + w(u, v)    (relaxation step; implicit DECREASE-KEY)
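A runnable sketch (my own) of Dijkstra's algorithm with Python's heapq; lazy deletion replaces DECREASE-KEY. The small test graph below is my reconstruction of a 5-vertex example resembling the one on the following slides.

import heapq

def dijkstra(adj, s):
    """adj: {u: [(v, w), ...]} with nonnegative weights; returns d[v] = delta(s, v)."""
    d = {v: float("inf") for v in adj}
    d[s] = 0
    heap = [(0, s)]
    done = set()                       # the set S of finished vertices
    while heap:
        du, u = heapq.heappop(heap)
        if u in done:
            continue                   # stale entry
        done.add(u)
        for v, w in adj[u]:
            if d[v] > du + w:          # relaxation step
                d[v] = du + w
                heapq.heappush(heap, (d[v], v))
    return d

adj = {"A": [("B", 10), ("C", 3)], "B": [("C", 1), ("D", 2)],
       "C": [("B", 4), ("D", 8), ("E", 2)], "D": [("E", 7)], "E": [("D", 9)]}
print(dijkstra(adj, "A"))   # {'A': 0, 'B': 7, 'C': 3, 'D': 9, 'E': 5}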
Introduction to Algorithms Day 29 L17.9 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Example of Dijkstra's
algorithm
(Slides run the algorithm on a 5-vertex graph with nonnegative edge weights,
starting from source A: initialize d[A] = 0 and the other estimates to ∞, then
repeatedly EXTRACT-MIN — here A, C, E, B, and finally D — add the vertex to S,
and relax all edges leaving it, updating the table of d values each time.)
Introduction to Algorithms Day 29 L17.20 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Correctness — Part I
Lemma. Initializing d[s] ← 0 and d[v] ← ∞ for all
v ∈ V − {s} establishes d[v] ≥ δ(s, v) for all v ∈ V,
and this invariant is maintained over any sequence
of relaxation steps.
Proof. Suppose not. Let v be the first vertex for
which d[v] < δ(s, v), and let u be the vertex that
caused d[v] to change: d[v] = d[u] + w(u, v). Then,
d[v] < δ(s, v)                    (supposition)
     ≤ δ(s, u) + δ(u, v)           (triangle inequality)
     ≤ δ(s, u) + w(u, v)           (shortest path ≤ specific path)
     ≤ d[u] + w(u, v)              (v is first violation)
Contradiction.
Introduction to Algorithms Day 29 L17.21 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Correctness — Part II
Theorem. Dijkstra's algorithm terminates with
d[v] = δ(s, v) for all v ∈ V.
Proof. It suffices to show that d[v] = δ(s, v) for every
v ∈ V when v is added to S. Suppose u is the first
vertex added to S for which d[u] ≠ δ(s, u). Let y be the
first vertex in V − S along a shortest path from s to u,
and let x be its predecessor:
(Figure: s and x lie inside S just before adding u; y is the first vertex of the
shortest path outside S.)
Introduction to Algorithms Day 29 L17.22 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Correctness — Part II
(continued)
Since u is the first vertex violating the claimed invariant,
we have d[x] = δ(s, x). Since subpaths of shortest paths
are shortest paths, it follows that d[y] was set to δ(s, x) +
w(x, y) = δ(s, y) when (x, y) was relaxed just after x was
added to S. Consequently, we have d[y] = δ(s, y) ≤ δ(s, u)
≤ d[u]. But, d[u] ≤ d[y] by our choice of u, and hence d[y]
= δ(s, y) = δ(s, u) = d[u]. Contradiction.
Introduction to Algorithms Day 29 L17.23 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Analysis of Dijkstra
while Q ≠ ∅                                   (|V| times)
  do u ← EXTRACT-MIN(Q)
     S ← S ∪ {u}
     for each v ∈ Adj[u]                       (degree(u) times)
       do if d[v] > d[u] + w(u, v)
            then d[v] ← d[u] + w(u, v)
Handshaking Lemma ⇒ Θ(E) implicit DECREASE-KEYs.
Time = Θ(V)·T_EXTRACT-MIN + Θ(E)·T_DECREASE-KEY
Note: Same formula as in the analysis of Prim's
minimum spanning tree algorithm.
Introduction to Algorithms Day 29 L17.24 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Analysis of Dijkstra
(continued)
Time = Θ(V)·T_EXTRACT-MIN + Θ(E)·T_DECREASE-KEY

Q               T_EXTRACT-MIN       T_DECREASE-KEY      Total
array           O(V)                O(1)                O(V²)
binary heap     O(lg V)             O(lg V)             O(E lg V)
Fibonacci heap  O(lg V) amortized   O(1) amortized      O(E + V lg V) worst case
Introduction to Algorithms Day 29 L17.25 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Unweighted graphs
Suppose w(u, v) = 1 for all (u, v) ∈ E. Can the
code for Dijkstra be improved?
Use a simple FIFO queue instead of a priority
queue.
while Q ≠ ∅
  do u ← DEQUEUE(Q)
     for each v ∈ Adj[u]
       do if d[v] = ∞
            then d[v] ← d[u] + 1
                 ENQUEUE(Q, v)
Breadth-first search
Analysis: Time = O(V + E).
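A runnable sketch (my own) of breadth-first search with a FIFO queue; the small undirected example graph is made up for illustration.

from collections import deque

def bfs_distances(adj, s):
    """Unit-weight shortest paths from s via breadth-first search; O(V + E)."""
    d = {v: float("inf") for v in adj}
    d[s] = 0
    q = deque([s])                      # simple FIFO queue replaces the heap
    while q:
        u = q.popleft()
        for v in adj[u]:
            if d[v] == float("inf"):    # first time v is discovered
                d[v] = d[u] + 1
                q.append(v)
    return d

adj = {"a": ["b", "d"], "b": ["a", "c", "e"], "c": ["b"],
       "d": ["a", "e"], "e": ["b", "d"]}
print(bfs_distances(adj, "a"))   # {'a': 0, 'b': 1, 'c': 2, 'd': 1, 'e': 2}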
Introduction to Algorithms Day 29 L17.26 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Example of breadth-first
search
(Slides run BFS from source a on a 9-vertex example graph: vertices are dequeued
in the order a, b, d, c, e, g, i, f, h, and each vertex is labeled with its
distance from a — a gets 0; b and d get 1; c and e get 2; g and i get 3; f and h
get 4.)
Introduction to Algorithms Day 29 L17.38 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Correctness of BFS
Key idea:
The FIFO Q in breadth-first search mimics
the priority queue Q in Dijkstra.
Invariant: v comes after u in Q implies that
d[v] = d[u] or d[v] = d[u] + 1.
while Q ≠ ∅
  do u ← DEQUEUE(Q)
     for each v ∈ Adj[u]
       do if d[v] = ∞
            then d[v] ← d[u] + 1
                 ENQUEUE(Q, v)
Introduction to Algorithms
6.046J/18.401J/SMA5503
Lecture 18
Prof. Erik Demaine
Introduction to Algorithms Day 31 L18.2 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Negative-weight cycles
Recall: If a graph G = (V, E) contains a negative-
weight cycle, then some shortest paths may not exist.
Example:
(Figure: a path from u to v passing through a cycle of total weight < 0.)
Bellman-Ford algorithm: Finds all shortest-path
lengths from a source s ∈ V to all v ∈ V or
determines that a negative-weight cycle exists.
Introduction to Algorithms Day 31 L18.3 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Bellman-Ford algorithm
d[s] ← 0
for each v ∈ V − {s}                 (initialization)
  do d[v] ← ∞
for i ← 1 to |V| − 1
  do for each edge (u, v) ∈ E
       do if d[v] > d[u] + w(u, v)            (relaxation step)
            then d[v] ← d[u] + w(u, v)
for each edge (u, v) ∈ E
  do if d[v] > d[u] + w(u, v)
       then report that a negative-weight cycle exists
At the end, d[v] = δ(s, v). Time = O(VE).
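A runnable sketch (my own) of Bellman-Ford: |V| − 1 rounds of relaxing every edge, followed by one checking round; the example edge list is made up for illustration.

def bellman_ford(vertices, edges, s):
    """edges: list of (u, v, w); returns d, or raises if a negative cycle exists."""
    d = {v: float("inf") for v in vertices}
    d[s] = 0
    for _ in range(len(vertices) - 1):
        for u, v, w in edges:
            if d[u] + w < d[v]:            # relaxation step
                d[v] = d[u] + w
    for u, v, w in edges:
        if d[u] + w < d[v]:
            raise ValueError("negative-weight cycle reachable from s")
    return d

edges = [("A", "B", -1), ("A", "C", 4), ("B", "C", 3),
         ("B", "D", 2), ("D", "B", 1), ("B", "E", 2), ("E", "D", -3)]
print(bellman_ford("ABCDE", edges, "A"))
# {'A': 0, 'B': -1, 'C': 2, 'D': -2, 'E': 1}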
Introduction to Algorithms Day 31 L18.4 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Example of Bellman-Ford
(Slides run the algorithm on a 5-vertex example graph from source A: after the
initialization d[A] = 0 and d[B] = ⋯ = d[E] = ∞, each pass relaxes every edge in
a fixed order, and the table of estimates for A, B, C, D, E converges to the
shortest-path weights.
Note: Values decrease monotonically.)
Introduction to Algorithms Day 31 L18.13 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Correctness
Theorem. If G = (V, E) contains no negative-
weight cycles, then after the Bellman-Ford
algorithm executes, d[v] = δ(s, v) for all v ∈ V.
Proof. Let v ∈ V be any vertex, and consider a shortest
path p from s to v with the minimum number of edges.
p: s = v_0 → v_1 → v_2 → ⋯ → v_k = v
Since p is a shortest path, we have
δ(s, v_i) = δ(s, v_{i−1}) + w(v_{i−1}, v_i).
Introduction to Algorithms Day 31 L18.14 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Correctness (continued)
p: s = v_0 → v_1 → v_2 → ⋯ → v_k = v
Initially, d[v_0] = 0 = δ(s, v_0), and d[s] is unchanged by
subsequent relaxations (because of the lemma from
Lecture 17 that d[v] ≥ δ(s, v)).
After 1 pass through E, we have d[v_1] = δ(s, v_1).
After 2 passes through E, we have d[v_2] = δ(s, v_2).
  ⋮
After k passes through E, we have d[v_k] = δ(s, v_k).
Since G contains no negative-weight cycles, p is simple.
The longest simple path has ≤ |V| − 1 edges.
Introduction to Algorithms Day 31 L18.15 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Detection of negative-weight
cycles
Corollary. If a value d[v] fails to converge after
|V| 1 passes, there exists a negative-weight
cycle in G reachable from s.
Introduction to Algorithms Day 31 L18.16 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
DAG shortest paths
If the graph is a directed acyclic graph (DAG), we first
topologically sort the vertices.
Determine f : V → {1, 2, …, |V|} such that (u, v) ∈ E
⇒ f(u) < f(v). This takes O(V + E) time using depth-first search.
Walk through the vertices u ∈ V in this order, relaxing
the edges in Adj[u], thereby obtaining the shortest paths
from s in a total of O(V + E) time.
(Figure: a DAG whose vertices are numbered 1–9 in topological order.)
Introduction to Algorithms Day 31 L18.17 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Linear programming
Let A be an m×n matrix, b be an m-vector, and c
be an n-vector. Find an n-vector x that maximizes
c^T x subject to Ax ≤ b, or determine that no such
solution exists.
(Figure: the m×n matrix A times the vector x compared componentwise with b,
while maximizing the objective c^T x.)
Introduction to Algorithms Day 31 L18.18 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Linear-programming
algorithms
Algorithms for the general problem
Simplex methods practical, but worst-case
exponential time.
Ellipsoid algorithm polynomial time, but
slow in practice.
Interior-point methods polynomial time and
competes with simplex.
Feasibility problem: No optimization criterion.
Just find x such that Ax ≤ b.
In general, just as hard as ordinary LP.
Introduction to Algorithms Day 31 L18.19 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Solving a system of difference
constraints
Linear programming where each row of A contains
exactly one 1, one −1, and the rest 0's.
Example:                          Solution:
x_1 − x_2 ≤ 3                      x_1 = 3
x_2 − x_3 ≤ −2                     x_2 = 0
x_1 − x_3 ≤ 2                      x_3 = 2
In general, each constraint has the form x_j − x_i ≤ w_ij.
Constraint graph: each constraint x_j − x_i ≤ w_ij becomes an
edge v_i → v_j of weight w_ij.
(The A matrix has dimensions |E| × |V|.)
Introduction to Algorithms Day 31 L18.20 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Unsatisfiable constraints
Theorem. If the constraint graph contains
a negative-weight cycle, then the system of
differences is unsatisfiable.
Proof. Suppose that the negative-weight cycle is
v_1 → v_2 → ⋯ → v_k → v_1. Then, we have
x_2 − x_1 ≤ w_12
x_3 − x_2 ≤ w_23
  ⋮
x_k − x_{k−1} ≤ w_{k−1,k}
x_1 − x_k ≤ w_{k1}
Summing these inequalities, the left-hand sides cancel:
0 ≤ weight of cycle < 0.
Therefore, no values for the x_i can satisfy the
constraints.
Introduction to Algorithms Day 31 L18.21 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Satisfying the constraints
Theorem. Suppose no negative-weight cycle
exists in the constraint graph. Then, the
constraints are satisfiable.
Proof. Add a new vertex s to V with a 0-weight edge
to each vertex v_i ∈ V.
(Figure: s with 0-weight edges to every vertex of the constraint graph.)
Note:
No negative-weight
cycles introduced ⇒
shortest paths exist.
Introduction to Algorithms Day 31 L18.22 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Proof (continued)
Claim: The assignment x_i = δ(s, v_i) solves the constraints.
Consider any constraint x_j − x_i ≤ w_ij, and consider the
shortest paths from s to v_j and v_i:
(Figure: the shortest paths from s to v_i and to v_j, together with the edge
v_i → v_j of weight w_ij.)
The triangle inequality gives us δ(s, v_j) ≤ δ(s, v_i) + w_ij.
Since x_i = δ(s, v_i) and x_j = δ(s, v_j), the constraint x_j − x_i ≤ w_ij
is satisfied.
Introduction to Algorithms Day 31 L18.23 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Bellman-Ford and linear
programming
Corollary. The Bellman-Ford algorithm can
solve a system of m difference constraints on n
variables in O(mn) time.
Single-source shortest paths is a simple LP
problem.
In fact, Bellman-Ford maximizes x_1 + x_2 + ⋯ + x_n
subject to the constraints x_j − x_i ≤ w_ij and x_i ≤ 0
(exercise).
Bellman-Ford also minimizes max_i{x_i} − min_i{x_i}
(exercise).
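A sketch (my own) of the corollary: solve x_j − x_i ≤ w_ij by running Bellman-Ford from an added source s with 0-weight edges to every variable vertex, then reading off x_i = δ(s, v_i). It reuses the bellman_ford() sketch given earlier, and the constraint triples below encode the slide's example.

def solve_difference_constraints(n, constraints):
    """constraints: list of (j, i, w) meaning x_j - x_i <= w, variables 1..n."""
    vertices = list(range(n + 1))                      # 0 is the added source s
    edges = [(0, v, 0) for v in range(1, n + 1)]       # 0-weight edges from s
    edges += [(i, j, w) for (j, i, w) in constraints]  # edge v_i -> v_j, weight w
    d = bellman_ford(vertices, edges, 0)
    return [d[v] for v in range(1, n + 1)]             # x_i = delta(s, v_i)

# x1 - x2 <= 3, x2 - x3 <= -2, x1 - x3 <= 2
print(solve_difference_constraints(3, [(1, 2, 3), (2, 3, -2), (1, 3, 2)]))
# [0, -2, 0] — satisfies all three constraints (feasible solutions are not unique)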
Introduction to Algorithms Day 31 L18.24 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Application to VLSI layout
compaction
Integrated-circuit features:
(Figure: two rectangular features with the required minimum separation marked
between them.)
Problem: Compact (in one dimension) the
space between the features of a VLSI layout
without bringing any features too close together.
Introduction to Algorithms Day 31 L18.25 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
VLSI layout compaction
(Figure: feature 1 at horizontal position x_1 with width d_1, and feature 2 at
position x_2 to its right.)
Constraint: x_2 − x_1 ≥ d_1 + the minimum separation.
Bellman-Ford minimizes max_i{x_i} − min_i{x_i},
which compacts the layout in the x-dimension.
Introduction to Algorithms
6.046J/18.401J/SMA5503
Lecture 19
Prof. Erik Demaine
Introduction to Algorithms Day 32 L19.2 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Shortest paths
Single-source shortest paths
Nonnegative edge weights:
  Dijkstra's algorithm: O(E + V lg V)
General:
  Bellman-Ford: O(VE)
DAG:
  One pass of Bellman-Ford: O(V + E)
All-pairs shortest paths
Nonnegative edge weights:
  Dijkstra's algorithm |V| times: O(VE + V² lg V)
General:
  Three algorithms today.
Introduction to Algorithms Day 32 L19.3 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
All-pairs shortest paths
Input: Digraph G = (V, E), where |V| = n, with
edge-weight function w : E → R.
Output: n × n matrix of shortest-path lengths
δ(i, j) for all i, j ∈ V.
IDEA #1:
Run Bellman-Ford once from each vertex.
Time = O(V²E).
Dense graph ⇒ O(V⁴) time.
Good first try!
Introduction to Algorithms Day 32 L19.4 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Dynamic programming
Consider the n × n adjacency matrix A = (a_ij)
of the digraph, and define
d_ij^(m) = weight of a shortest path from
           i to j that uses at most m edges.
Claim: We have
d_ij^(0) = 0 if i = j,
           ∞ if i ≠ j;
and for m = 1, 2, …, n − 1,
d_ij^(m) = min_k { d_ik^(m−1) + a_kj }.
Introduction to Algorithms Day 32 L19.5 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Proof of claim
d_ij^(m) = min_k { d_ik^(m−1) + a_kj }
(Figure: a shortest path from i to j using at most m edges decomposes into a path
from i to some vertex k using at most m − 1 edges, plus the edge (k, j).)
Relaxation!
for k ← 1 to n
  do if d_ij > d_ik + a_kj
       then d_ij ← d_ik + a_kj
Note: No negative-weight cycles implies
δ(i, j) = d_ij^(n−1) = d_ij^(n) = d_ij^(n+1) = ⋯
Introduction to Algorithms Day 32 L19.6 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Matrix multiplication
Compute C = A · B, where C, A, and B are n × n
matrices:
c_ij = Σ_{k=1}^{n} a_ik b_kj.
Time = Θ(n³) using the standard algorithm.
What if we map "+" → "min" and "·" → "+"?
c_ij = min_k { a_ik + b_kj }.
Thus, D^(m) = D^(m−1) "×" A.
Identity matrix = I = the matrix with 0's on the diagonal
and ∞ everywhere else = D^(0) = (d_ij^(0)).
Introduction to Algorithms Day 32 L19.7 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Matrix multiplication
(continued)
The (min, +) multiplication is associative, and
with the real numbers, it forms an algebraic
structure called a closed semiring.
Consequently, we can compute
D^(1) = D^(0) · A = A^1
D^(2) = D^(1) · A = A^2
  ⋮
D^(n−1) = D^(n−2) · A = A^(n−1),
yielding D^(n−1) = (δ(i, j)).
Time = Θ(n·n³) = Θ(n⁴). No better than n × B-F.
Introduction to Algorithms Day 32 L19.8 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Improved matrix
multiplication algorithm
Repeated squaring: A^(2k) = A^k × A^k.
Compute A², A⁴, …, A^(2^⌈lg(n−1)⌉).
O(lg n) squarings
Time = Θ(n³ lg n).
To detect negative-weight cycles, check the
diagonal for negative values in O(n) additional
time.
Note: A^(n−1) = A^n = A^(n+1) = ⋯
Introduction to Algorithms Day 32 L19.9 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Floyd-Warshall algorithm
Also dynamic programming, but faster!
Define c_ij^(k) = weight of a shortest path from i
                  to j with intermediate vertices
                  belonging to the set {1, 2, …, k}.
(Figure: a path from i to j whose intermediate vertices all lie in {1, …, k}.)
Thus, δ(i, j) = c_ij^(n). Also, c_ij^(0) = a_ij.
Introduction to Algorithms Day 32 L19.10 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Floyd-Warshall recurrence
c_ij^(k) = min { c_ij^(k−1), c_ik^(k−1) + c_kj^(k−1) }
(Figure: either the shortest path from i to j avoids vertex k, or it goes through
k, splitting into a shortest path from i to k and one from k to j, each with
intermediate vertices in {1, 2, …, k − 1}.)
Introduction to Algorithms Day 32 L19.11 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Pseudocode for Floyd-
Warshall
for k ← 1 to n
  do for i ← 1 to n
       do for j ← 1 to n
            do if c_ij > c_ik + c_kj
                 then c_ij ← c_ik + c_kj       (relaxation)
Notes:
Okay to omit superscripts, since extra relaxations
can't hurt.
Runs in Θ(n³) time.
Simple to code.
Efficient in practice.
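A compact sketch (my own) of Floyd-Warshall on an n × n weight matrix; INF marks missing edges, and superscripts are omitted as the slide notes. The 4-vertex example matrix is made up for illustration.

INF = float("inf")

def floyd_warshall(a):
    n = len(a)
    c = [row[:] for row in a]                   # c starts as the adjacency matrix
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if c[i][k] + c[k][j] < c[i][j]:
                    c[i][j] = c[i][k] + c[k][j]     # relaxation
    return c

a = [[0,   3,   8,   INF],
     [INF, 0,   INF, 1  ],
     [INF, 4,   0,   INF],
     [2,   INF, INF, 0  ]]
print(floyd_warshall(a))
# [[0, 3, 8, 4], [3, 0, 11, 1], [7, 4, 0, 5], [2, 5, 10, 0]]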
Introduction to Algorithms Day 32 L19.12 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Transitive closure of a
directed graph
Compute t_ij = 1 if there exists a path from i to j,
              0 otherwise.
IDEA: Use Floyd-Warshall, but with (∨, ∧) instead
of (min, +):
t_ij^(k) = t_ij^(k−1) ∨ ( t_ik^(k−1) ∧ t_kj^(k−1) ).
Time = Θ(n³).
Introduction to Algorithms Day 32 L19.13 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Graph reweighting
Theorem. Given a label h(v) for each v ∈ V, reweight
each edge (u, v) ∈ E by
ŵ(u, v) = w(u, v) + h(u) − h(v).
Then, all paths between the same two vertices are
reweighted by the same amount.
Proof. Let p = v_1 → v_2 → ⋯ → v_k be a path in the graph.
Then, we have
ŵ(p) = Σ_{i=1}^{k−1} ŵ(v_i, v_{i+1})
     = Σ_{i=1}^{k−1} ( w(v_i, v_{i+1}) + h(v_i) − h(v_{i+1}) )
     = Σ_{i=1}^{k−1} w(v_i, v_{i+1}) + h(v_1) − h(v_k)      (the sum telescopes)
     = w(p) + h(v_1) − h(v_k).
Then, we have
Introduction to Algorithms Day 32 L19.14 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Johnsons algorithm
1. Find a vertex labeling h such that (u, v) 0 for all
(u, v) E by using Bellman-Ford to solve the
difference constraints
h(v) h(u) w(u, v),
or determine that a negative-weight cycle exists.
Time = O(VE).
2. Run Dijkstras algorithm from each vertex using .
Time = O(VE + V
2
lg V).
3. Reweight each shortest-path length (p) to produce
the shortest-path lengths w(p) of the original graph.
Time = O(V
2
).
Total time = O(VE + V
2
lg V).
Introduction to Algorithms
6.046J/18.401J/SMA5503
Lecture 20
Prof. Erik Demaine
Introduction to Algorithms Day 33 L20.2 2001 by Erik D. Demaine
Disjoint-set data structure
(Union-Find)
Problem: Maintain a dynamic collection of
pairwise-disjoint sets S = {S_1, S_2, …, S_r}.
Each set S_i has one element distinguished as the
representative element, rep[S_i].
Must support 3 operations:
MAKE-SET(x): adds new set {x} to S
  with rep[{x}] = x (for any x ∉ S_i for all i).
UNION(x, y): replaces sets S_x, S_y with S_x ∪ S_y
  in S for any x, y in distinct sets S_x, S_y.
FIND-SET(x): returns representative rep[S_x]
  of set S_x containing element x.
Introduction to Algorithms Day 33 L20.3 2001 by Erik D. Demaine
Simple linked-list solution
Store each set S_i = {x_1, x_2, …, x_k} as an (unordered)
doubly linked list. Define representative element
rep[S_i] to be the front of the list, x_1.
MAKE-SET(x) initializes x as a lone node.             Θ(1)
FIND-SET(x) walks left in the list containing x
  until it reaches the front of the list.              Θ(n)
UNION(x, y) concatenates the lists containing
  x and y, leaving rep. as FIND-SET[x].                Θ(n)
Introduction to Algorithms Day 33 L20.4 2001 by Erik D. Demaine
Simple balanced-tree solution
Store each set S_i = {x_1, x_2, …, x_k} as a balanced tree
(ignoring keys). Define representative element
rep[S_i] to be the root of the tree.
(Figure: S_i = {x_1, x_2, x_3, x_4, x_5} stored as a balanced tree with root x_1 = rep[S_i].)
MAKE-SET(x) initializes x as a lone node.              Θ(1)
FIND-SET(x) walks up the tree containing x
  until it reaches the root.                            Θ(lg n)
UNION(x, y) concatenates the trees containing
  x and y, changing rep.                                Θ(lg n)
Introduction to Algorithms Day 33 L20.5 2001 by Erik D. Demaine
Plan of attack
We will build a simple disjoint-union data structure
that, in an amortized sense, performs significantly
better than Θ(lg n) per op., even better than
Θ(lg lg n), Θ(lg lg lg n), etc., but not quite Θ(1).
To reach this goal, we will introduce two key tricks.
Each trick converts a trivial Θ(n) solution into a
simple Θ(lg n) amortized solution. Together, the
two tricks yield a much better solution.
First trick arises in an augmented linked list.
Second trick arises in a tree structure.
Introduction to Algorithms Day 33 L20.6 2001 by Erik D. Demaine
Augmented linked-list solution
Store set S_i = {x_1, x_2, …, x_k} as an unordered doubly
linked list. Define rep[S_i] to be the front of the list, x_1.
Each element x_j also stores a pointer rep[x_j] to rep[S_i].
FIND-SET(x) returns rep[x].                            Θ(1)
UNION(x, y) concatenates the lists containing
  x and y, and updates the rep pointers for
  all elements in the list containing y.                Θ(n)
Introduction to Algorithms Day 33 L20.7 2001 by Erik D. Demaine
Example of
augmented linked-list solution
Each element x_j stores a pointer rep[x_j] to rep[S_i].
UNION(x, y)
concatenates the lists containing x and y, and
updates the rep pointers for all elements in the
list containing y.
(Slides show S_x = {x_1, x_2} and S_y = {y_1, y_2, y_3}: after UNION(x, y) the list
y_1, y_2, y_3 is appended to x_1, x_2 and the rep pointers of y_1, y_2, y_3 are
redirected to x_1 = rep[S_x ∪ S_y].)
Introduction to Algorithms Day 33 L20.10 2001 by Erik D. Demaine
Alternative concatenation
UNION(x, y) could instead
concatenate the lists containing y and x, and
update the rep pointers for all elements in the
list containing x.
(Slides show the symmetric case: x_1, x_2 are appended to y_1, y_2, y_3 and their
rep pointers are redirected to y_1 = rep[S_x ∪ S_y].)
Introduction to Algorithms Day 33 L20.13 2001 by Erik D. Demaine
Trick 1: Smaller into larger
To save work, concatenate the smaller list onto the end
of the larger list. Cost = Θ(length of smaller list).
Augment the list to store its weight (# elements).
Let n denote the overall number of elements
(equivalently, the number of MAKE-SET operations).
Let m denote the total number of operations.
Let f denote the number of FIND-SET operations.
Theorem: Cost of all UNIONs is O(n lg n).
Corollary: Total cost is O(m + n lg n).
Introduction to Algorithms Day 33 L20.14 2001 by Erik D. Demaine
Analysis of Trick 1
To save work, concatenate the smaller list onto the end
of the larger list. Cost = Θ(1 + length of smaller list).
Theorem: Total cost of UNIONs is O(n lg n).
Proof. Monitor an element x and the set S_x containing it.
After the initial MAKE-SET(x), weight[S_x] = 1. Each
time S_x is united with a set S_y with weight[S_y] ≥ weight[S_x],
we pay 1 to update rep[x], and weight[S_x] at least
doubles (increasing by weight[S_y]). Each time S_x is
united with a smaller set S_y, we pay nothing, and
weight[S_x] only increases. Thus we pay ≤ lg n for x.
Introduction to Algorithms Day 33 L20.15 2001 by Erik D. Demaine
Representing sets as trees
Store each set S_i = {x_1, x_2, …, x_k} as an unordered,
potentially unbalanced, not necessarily binary tree,
storing only parent pointers. rep[S_i] is the tree root.
(Figure: S_i = {x_1, x_2, x_3, x_4, x_5, x_6} stored as a tree with root x_1 = rep[S_i].)
MAKE-SET(x) initializes x as a lone node.              Θ(1)
FIND-SET(x) walks up the tree containing x
  until it reaches the root.                            Θ(depth[x])
UNION(x, y) concatenates the trees containing x and y.
Introduction to Algorithms Day 33 L20.16 2001 by Erik D. Demaine
Trick 1 adapted to trees
UNION(x, y) can use a simple concatenation strategy:
Make root FIND-SET(y) a child of root FIND-SET(x).
Then FIND-SET(y) = FIND-SET(x).
[Figure: the tree rooted at x_1 with the tree on {y_1, ..., y_5} hung below it.]
We can adapt Trick 1 to this context also:
Merge the tree with smaller weight into the tree
with larger weight.
Height of a tree increases only when its size
doubles, so height is logarithmic in weight.
Thus total cost is O(m + f lg n).
Introduction to Algorithms Day 33 L20.17 2001 by Erik D. Demaine
Trick 2: Path compression
When we execute a FIND-SET operation and walk
up a path p to the root, we know the representative
for all the nodes on path p.
[Figure: FIND-SET(y_2) walks up from y_2 through its ancestors to the root.]
Path compression makes
all of those nodes direct
children of the root.
Cost of FIND-SET(x)
is still Θ(depth[x]).
Introduction to Algorithms Day 33 L20.18 2001 by Erik D. Demaine
Trick 2: Path compression
When we execute a FIND-SET operation and walk
up a path p to the root, we know the representative
for all the nodes on path p.
[Figure: the nodes on the path from y_2 to the root are identified.]
Path compression makes
all of those nodes direct
children of the root.
Cost of FIND-SET(x)
is still Θ(depth[x]).
Introduction to Algorithms Day 33 L20.19 2001 by Erik D. Demaine
Trick 2: Path compression
When we execute a FIND-SET operation and walk
up a path p to the root, we know the representative
for all the nodes on path p.
[Figure: after FIND-SET(y_2), all nodes on the path are direct children
of the root.]
Path compression makes
all of those nodes direct
children of the root.
Cost of FIND-SET(x)
is still Θ(depth[x]).
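A minimal sketch (mine, not the lecture's pseudocode) of FIND-SET with path compression on a parent-pointer forest; parent is assumed to map each node to its parent, with roots mapping to themselves.

# Hypothetical sketch: FIND-SET with path compression on a parent-pointer
# forest. 'parent' maps each node to its parent; a root maps to itself.

parent = {}

def make_set(x):
    parent[x] = x

def find_set(x):
    # First pass: walk up to the root.
    root = x
    while parent[root] != root:
        root = parent[root]
    # Second pass: make every node on the path a direct child of the root.
    while parent[x] != root:
        parent[x], x = root, parent[x]
    return root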
Introduction to Algorithms Day 33 L20.20 2001 by Erik D. Demaine
Analysis of Trick 2 alone
Theorem: Total cost of FIND-SETs is O(m lg n).
Proof: Amortization by potential function.
The weight of a node x is the number of nodes in its subtree.
Define Φ(x_1, ..., x_n) = Σ_i lg weight[x_i].
UNION(x_i, x_j) increases the potential of root FIND-SET(x_i)
by at most lg weight[root FIND-SET(x_j)] ≤ lg n.
Each step down p → c made by FIND-SET(x_i),
except the first, moves c's subtree out of p's subtree.
Thus if weight[c] ≥ ½ weight[p], Φ decreases by ≥ 1,
paying for the step down. There can be at most lg n
steps p → c for which weight[c] < ½ weight[p].
Introduction to Algorithms Day 33 L20.21 2001 by Erik D. Demaine
Analysis of Trick 2 alone
Theorem: If all UNION operations occur before
all FIND-SET operations, then the total cost is O(m).
Proof: If a FIND-SET operation traverses a path
with k nodes, costing O(k) time, then k − 2 nodes
are made new children of the root. This change
can happen only once for each of the n elements,
so the total cost of FIND-SET is O(f + n).
Introduction to Algorithms Day 33 L20.22 2001 by Erik D. Demaine
Ackermann's function A
Define  A_k(j) = j + 1                  if k = 0,
        A_k(j) = A_{k-1}^(j+1)(j)       if k ≥ 1,
where A_{k-1}^(j+1) means A_{k-1} iterated j + 1 times.
Define α(n) = min {k : A_k(1) ≥ n} ≤ 4 for practical n.
A_0(j) = j + 1                              A_0(1) = 2
A_1(j) ~ 2j                                 A_1(1) = 3
A_2(j) ~ 2^j · j > 2^j                      A_2(1) = 7
A_3(j) > 2^2^...^2 (a tower of j 2's)       A_3(1) = 2047
A_4(j) is a lot bigger.                     A_4(1) > 2^2^...^2^2047 (a tower of height 2048)
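A small sketch (my own, for illustration) that evaluates the definition above for the tiny arguments that are still computable, reproducing A_0(1) = 2, A_1(1) = 3, A_2(1) = 7, A_3(1) = 2047, and the resulting α(n).

# Hypothetical sketch: evaluate A_k(j) directly from the definition above.
# Only tiny arguments are feasible; A_4(1) is already astronomically large.

def A(k, j):
    if k == 0:
        return j + 1
    result = j
    for _ in range(j + 1):          # iterate A_{k-1} exactly j+1 times
        result = A(k - 1, result)
    return result

def alpha(n):
    """Smallest k with A_k(1) >= n (only meaningful here for n <= 2047)."""
    k = 0
    while A(k, 1) < n:
        k += 1
    return k

print([A(k, 1) for k in range(4)])  # [2, 3, 7, 2047]
print(alpha(1000))                  # 3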
Introduction to Algorithms Day 33 L20.23 2001 by Erik D. Demaine
Analysis of Tricks 1 + 2
Theorem: In general, the total cost is O(m α(n)).
(Long, tricky proof; see Section 21.4 of CLRS.)
Introduction to Algorithms Day 33 L20.24 2001 by Erik D. Demaine
Application:
Dynamic connectivity
Suppose a graph is given to us incrementally by
ADD-VERTEX(v)
ADD-EDGE(u, v)
and we want to support connectivity queries:
CONNECTED(u, v):
Are u and v in the same connected component?
For example, we want to maintain a spanning forest,
so we check whether each new edge connects a
previously disconnected pair of vertices.
Introduction to Algorithms Day 33 L20.25 2001 by Erik D. Demaine
Application:
Dynamic connectivity
Sets of vertices represent connected components.
Suppose a graph is given to us incrementally by
ADD-VERTEX(v): MAKE-SET(v)
ADD-EDGE(u, v): if not CONNECTED(u, v)
then UNION(u, v)
and we want to support connectivity queries:
CONNECTED(u, v): FIND-SET(u) = FIND-SET(v)
Are u and v in the same connected component?
For example, we want to maintain a spanning forest,
so we check whether each new edge connects a
previously disconnected pair of vertices.
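For illustration, here is a sketch (my own; not the course's code) of this reduction using a disjoint-set forest with union by weight (Trick 1) and path compression (Trick 2).

# Hypothetical sketch: dynamic connectivity via a disjoint-set forest with
# union by weight (Trick 1) and path compression (Trick 2).

parent, weight = {}, {}

def add_vertex(v):                  # ADD-VERTEX(v) -> MAKE-SET(v)
    parent[v], weight[v] = v, 1

def find_set(x):
    if parent[x] != x:
        parent[x] = find_set(parent[x])   # path compression
    return parent[x]

def connected(u, v):                # CONNECTED(u, v)
    return find_set(u) == find_set(v)

def add_edge(u, v):                 # ADD-EDGE(u, v)
    ru, rv = find_set(u), find_set(v)
    if ru == rv:
        return False                # edge does not join the spanning forest
    if weight[ru] < weight[rv]:     # hang the lighter tree under the heavier
        ru, rv = rv, ru
    parent[rv] = ru
    weight[ru] += weight[rv]
    return True                     # edge connects two components

# Example usage
for v in range(5):
    add_vertex(v)
add_edge(0, 1); add_edge(2, 3); add_edge(1, 2)
assert connected(0, 3) and not connected(0, 4)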
Introduction to Algorithms
6.046J/18.401J/SMA5503
Lecture 21
Prof. Charles E. Leiserson
Introduction to Algorithms Day 35 L21.2 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Take-home quiz
No notes (except this one).
Introduction to Algorithms
6.046J/18.401J/SMA5503
Lecture 22
Prof. Charles E. Leiserson
Introduction to Algorithms Day 38 L22.2 2001 by Charles E. Leiserson
Flow networks
Definition. A flow network is a directed graph
G = (V, E) with two distinguished vertices: a
source s and a sink t. Each edge (u, v) ∈ E has
a nonnegative capacity c(u, v). If (u, v) ∉ E,
then c(u, v) = 0.
Example:
[Figure: a flow network from s to t with edge capacities 1, 2, and 3.]
Introduction to Algorithms Day 38 L22.3 2001 by Charles E. Leiserson
Flow networks
Definition. A positive flow on G is a function
p : V × V → R satisfying the following:
Capacity constraint: For all u, v ∈ V,
0 ≤ p(u, v) ≤ c(u, v).
Flow conservation: For all u ∈ V − {s, t},
    Σ_{v ∈ V} p(u, v) − Σ_{v ∈ V} p(v, u) = 0.
The value of a flow is the net flow out of the
source:
    Σ_{v ∈ V} p(s, v) − Σ_{v ∈ V} p(v, s).
Introduction to Algorithms Day 38 L22.4 2001 by Charles E. Leiserson
A flow on a network
[Figure: the example network with each edge labeled "positive flow : capacity".]
The value of this flow is 1 − 0 + 2 = 3.
Flow conservation (like Kirchhoff's current law):
Flow into u is 2 + 1 = 3.
Flow out of u is 0 + 1 + 2 = 3.
Introduction to Algorithms Day 38 L22.5 2001 by Charles E. Leiserson
The maximum-flow problem
Maximum-flow problem: Given a flow network
G, find a flow of maximum value on G.
[Figure: a maximum flow on the example network, edges labeled "flow : capacity".]
The value of the maximum flow is 4.
Introduction to Algorithms Day 38 L22.6 2001 by Charles E. Leiserson
Flow cancellation
Without loss of generality, positive flow goes
either from u to v, or from v to u, but not both.
[Figure: flows 2:3 from u to v and 1:2 from v to u cancel to
1:3 from u to v and 0:2 from v to u.]
Net flow from u to v in both cases is 1.
The capacity constraint and flow conservation
are preserved by this transformation.
INTUITION: View flow as a rate, not a quantity.
Introduction to Algorithms Day 38 L22.7 2001 by Charles E. Leiserson
A notational simplification
IDEA: Work with the net flow between two
vertices, rather than with the positive flow.
Definition. A (net) flow on G is a function
f : V × V → R satisfying the following:
Capacity constraint: For all u, v ∈ V,
f (u, v) ≤ c(u, v).
Flow conservation: For all u ∈ V − {s, t},
    Σ_{v ∈ V} f (u, v) = 0.     (One summation instead of two.)
Skew symmetry: For all u, v ∈ V,
f (u, v) = −f (v, u).
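As a concrete reading of these three conditions, here is a small checker (my own sketch; c and f are assumed to be dicts keyed by vertex pairs, with missing pairs treated as 0).

# Hypothetical sketch: check that f is a (net) flow on a network with
# capacities c. Both are dicts mapping (u, v) pairs to numbers; missing
# pairs are treated as 0.

def is_flow(V, c, f, s, t):
    cap = lambda u, v: c.get((u, v), 0)
    flo = lambda u, v: f.get((u, v), 0)
    for u in V:
        for v in V:
            if flo(u, v) > cap(u, v):            # capacity constraint
                return False
            if flo(u, v) != -flo(v, u):          # skew symmetry
                return False
    for u in V:
        if u not in (s, t) and sum(flo(u, v) for v in V) != 0:
            return False                          # flow conservation
    return True

def flow_value(V, f, s):
    return sum(f.get((s, v), 0) for v in V)      # | f | = f(s, V)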
Introduction to Algorithms Day 38 L22.8 2001 by Charles E. Leiserson
Equivalence of definitions
Theorem. The two definitions are equivalent.
Proof. (⇒) Let f (u, v) = p(u, v) − p(v, u).
Capacity constraint: Since p(u, v) ≤ c(u, v) and
p(v, u) ≥ 0, we have f (u, v) ≤ c(u, v).
Flow conservation:
    Σ_{v ∈ V} f (u, v) = Σ_{v ∈ V} ( p(u, v) − p(v, u) )
                       = Σ_{v ∈ V} p(u, v) − Σ_{v ∈ V} p(v, u)
                       = 0.
Skew symmetry:
f (u, v) = p(u, v) − p(v, u)
         = −( p(v, u) − p(u, v) )
         = −f (v, u).
Introduction to Algorithms Day 38 L22.9 2001 by Charles E. Leiserson
Proof (continued)
(⇐) Let
    p(u, v) = f (u, v)   if f (u, v) > 0,
    p(u, v) = 0          if f (u, v) ≤ 0.
Capacity constraint: By definition, p(u, v) ≥ 0. Since
f (u, v) ≤ c(u, v), it follows that p(u, v) ≤ c(u, v).
Flow conservation: If f (u, v) > 0, then p(u, v) − p(v, u)
= f (u, v). If f (u, v) ≤ 0, then p(u, v) − p(v, u) = −f (v, u)
= f (u, v) by skew symmetry. Therefore,
    Σ_{v ∈ V} p(u, v) − Σ_{v ∈ V} p(v, u) = Σ_{v ∈ V} f (u, v).
Introduction to Algorithms Day 38 L22.10 2001 by Charles E. Leiserson
Notation
Definition. The value of a flow f, denoted by | f |,
is given by
    | f | = Σ_{v ∈ V} f (s, v) = f (s, V).
Implicit summation notation: A set used in
an arithmetic formula represents a sum over
the elements of the set.
Example (flow conservation):
f (u, V) = 0 for all u ∈ V − {s, t}.
Introduction to Algorithms Day 38 L22.11 2001 by Charles E. Leiserson
Simple properties of flow
Lemma.
f (X, X) = 0,
f (X, Y) = −f (Y, X),
f (X ∪ Y, Z) = f (X, Z) + f (Y, Z) if X ∩ Y = ∅.
Theorem. | f | = f (V, t).
Proof.
| f | = f (s, V)
     = f (V, V) − f (V − s, V)     (Omit braces.)
     = f (V, V − s)
     = f (V, t) + f (V, V − s − t)
     = f (V, t).
Introduction to Algorithms Day 38 L22.12 2001 by Charles E. Leiserson
Flow into the sink
[Figure: the example network with edges labeled "flow : capacity".]
| f | = f (s, V) = 4          f (V, t) = 4
Introduction to Algorithms Day 38 L22.13 2001 by Charles E. Leiserson
Cuts
Definition. A cut (S, T) of a flow network G =
(V, E) is a partition of V such that s ∈ S and t ∈ T.
If f is a flow on G, then the flow across the cut is
f (S, T).
[Figure: the example network with a cut (S, T) drawn through it.]
f (S, T) = (2 + 2) + (−2 + 1 − 1 + 2)
         = 4
Introduction to Algorithms Day 38 L22.14 2001 by Charles E. Leiserson
Another characterization of flow value
Lemma. For any flow f and any cut (S, T), we
have | f | = f (S, T).
Proof. f (S, T) = f (S, V) − f (S, S)
                = f (S, V)
                = f (s, V) + f (S − s, V)
                = f (s, V)
                = | f |.
Introduction to Algorithms Day 38 L22.15 2001 by Charles E. Leiserson
Capacity of a cut
Definition. The capacity of a cut (S, T) is c(S, T).
[Figure: the example network with the same cut (S, T); only edges from
S to T count toward the capacity.]
c(S, T) = (3 + 2) + (1 + 2 + 3)
        = 11
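A short sketch (mine) of these two quantities for a cut, using the same dict conventions as the checker above: f(S, T) sums the net flow over all pairs crossing the cut, while c(S, T) sums capacities of edges leaving S.

# Hypothetical sketch: flow across a cut and capacity of a cut, with
# c and f as dicts keyed by (u, v) pairs (missing pairs count as 0).

def flow_across_cut(f, S, T):
    return sum(f.get((u, v), 0) for u in S for v in T)

def cut_capacity(c, S, T):
    return sum(c.get((u, v), 0) for u in S for v in T)

# By the lemma above, flow_across_cut(f, S, T) equals | f | for every cut,
# and it is bounded above by cut_capacity(c, S, T).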
Introduction to Algorithms Day 38 L22.16 2001 by Charles E. Leiserson
Upper bound on the maximum flow value
Theorem. The value of any flow is bounded
above by the capacity of any cut.
Proof.
| f | = f (S, T)
     = Σ_{u ∈ S} Σ_{v ∈ T} f (u, v)
     ≤ Σ_{u ∈ S} Σ_{v ∈ T} c(u, v)
     = c(S, T).
Introduction to Algorithms Day 38 L22.17 2001 by Charles E. Leiserson
Residual network
Definition. Let f be a flow on G = (V, E). The
residual network G_f = (V, E_f) is the graph with
strictly positive residual capacities
c_f (u, v) = c(u, v) − f (u, v) > 0.
Edges in E_f admit more flow.
Example:
[Figure: in G, edge (u, v) carries 3:5 and (v, u) carries 0:1;
in G_f, the residual capacities are c_f (u, v) = 2 and c_f (v, u) = 4.]
Lemma. | E_f | ≤ 2 | E |.
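A minimal sketch (my own) of building the residual capacities from c and f under the net-flow convention; skew symmetry is what makes a reverse edge pick up the flow already sent forward.

# Hypothetical sketch: residual capacities c_f(u, v) = c(u, v) - f(u, v),
# keeping only the strictly positive ones (the edges of E_f).

def residual(V, c, f):
    cf = {}
    for u in V:
        for v in V:
            r = c.get((u, v), 0) - f.get((u, v), 0)
            if r > 0:
                cf[(u, v)] = r
    return cf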
Introduction to Algorithms Day 38 L22.18 2001 by Charles E. Leiserson
Augmenting paths
Definition. Any path from s to t in G_f is an aug-
menting path in G with respect to f. The flow
value can be increased along an augmenting
path p by
    c_f (p) = min_{(u, v) ∈ p} { c_f (u, v) }.
Ex.:
[Figure: a network G with edges labeled "flow : capacity" and its residual
network G_f; the highlighted augmenting path has c_f (p) = 2.]
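Continuing the same conventions, the residual capacity of an augmenting path is simply the minimum residual capacity along it (a sketch of my own):

# Hypothetical sketch: residual capacity of a path, given as a vertex list.

def path_residual_capacity(cf, path):
    return min(cf[(u, v)] for u, v in zip(path, path[1:]))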
Introduction to Algorithms Day 38 L22.19 2001 by Charles E. Leiserson 2001 by Charles E. Leiserson
Max-flow, min-cut theorem
Theorem. The following are equivalent:
1. f is a maximum flow.
2. f admits no augmenting paths.
3. | f | = c(S, T) for some cut (S, T).
Proof (and algorithms). Next time.
Introduction to Algorithms
6.046J/18.401J/SMA5503
Lecture 23
Prof. Charles E. Leiserson
Introduction to Algorithms Day 40 L23.2 2001 by Charles E. Leiserson
Recall from Lecture 22
Flow value: | f | = f (s, V).
Cut: Any partition (S, T) of V such that s ∈ S
and t ∈ T.
Lemma. | f | = f (S, T) for any cut (S, T).
Corollary. | f | ≤ c(S, T) for any cut (S, T).
Residual graph: The graph G_f = (V, E_f) with
strictly positive residual capacities c_f (u, v) =
c(u, v) − f (u, v) > 0.
Augmenting path: Any path from s to t in G_f.
Residual capacity of an augmenting path:
    c_f (p) = min_{(u, v) ∈ p} { c_f (u, v) }.
Introduction to Algorithms Day 40 L23.3 2001 by Charles E. Leiserson
Max-flow, min-cut theorem
Theorem. The following are equivalent:
1. | f | = c(S, T) for some cut (S, T).
2. f is a maximum flow.
3. f admits no augmenting paths.
Proof.
(1) ⇒ (2): Since | f | ≤ c(S, T) for any cut (S, T) (by
the corollary from Lecture 22), the assumption that
| f | = c(S, T) implies that f is a maximum flow.
(2) ⇒ (3): If there were an augmenting path, the
flow value could be increased, contradicting the
maximality of f.
Introduction to Algorithms Day 40 L23.4 2001 by Charles E. Leiserson
Proof (continued)
(3) ⇒ (1): Suppose that f admits no augmenting paths.
Define S = {v ∈ V : there exists a path in G_f from s to v},
and let T = V − S. Observe that s ∈ S and t ∈ T, and thus
(S, T) is a cut. Consider any vertices u ∈ S and v ∈ T.
We must have c_f (u, v) = 0, since if c_f (u, v) > 0, then v ∈ S,
not v ∈ T as assumed. Thus, f (u, v) = c(u, v), since c_f (u, v)
= c(u, v) − f (u, v). Summing over all u ∈ S and v ∈ T
yields f (S, T) = c(S, T), and since | f | = f (S, T), the theorem
follows.
[Figure: s and u lie in S, v lies in T; a path in G_f runs from s to u.]
Introduction to Algorithms Day 40 L23.5 2001 by Charles E. Leiserson
Ford-Fulkerson max-flow algorithm
Algorithm:
    f [u, v] ← 0 for all u, v ∈ V
    while an augmenting path p in G wrt f exists
        do augment f by c_f (p)
Can be slow:
[Figure: a four-vertex network G in which four edges have capacity 10^9
and the middle edge has capacity 1.]
Introduction to Algorithms Day 40 L23.6-L23.12 2001 by Charles E. Leiserson
Ford-Fulkerson max-flow algorithm
Algorithm:
    f [u, v] ← 0 for all u, v ∈ V
    while an augmenting path p in G wrt f exists
        do augment f by c_f (p)
Can be slow:
[Figure sequence: starting from the all-zero flow, the chosen augmenting
paths alternately cross the capacity-1 middle edge in opposite directions,
so each augmentation increases | f | by only 1; the frames show the edge
labels advancing from 0:10^9 to 1:10^9 to 2:10^9, and so on.]
2 billion iterations on a graph with 4 vertices!
Introduction to Algorithms Day 40 L23.13 2001 by Charles E. Leiserson
Edmonds-Karp algorithm
Edmonds and Karp noticed that many people's
implementations of Ford-Fulkerson augment along
a breadth-first augmenting path: a shortest path in
G_f from s to t where each edge has weight 1. These
implementations would always run relatively fast.
Since a breadth-first augmenting path can be found
in O(E) time, their analysis, which provided the first
polynomial-time bound on maximum flow, focuses
on bounding the number of flow augmentations.
(In independent work, Dinic also gave polynomial-
time bounds.)
Introduction to Algorithms Day 40 L23.14 2001 by Charles E. Leiserson
Monotonicity lemma
Lemma. Let δ(v) = δ_f (s, v) be the breadth-first
distance from s to v in G_f. During the Edmonds-
Karp algorithm, δ(v) increases monotonically.
Proof. Suppose that f is a flow on G, and augmentation
produces a new flow f ′. Let δ′(v) = δ_{f ′}(s, v). We'll
show that δ(v) ≤ δ′(v) by induction on δ′(v). For the base
case, δ(s) = δ′(s) = 0.
For the inductive case, consider a breadth-first path
s → ... → u → v in G_{f ′}. We must have δ′(v) = δ′(u) + 1, since
subpaths of shortest paths are shortest paths. Certainly,
(u, v) ∈ E_{f ′}, and now consider two cases depending on
whether (u, v) ∈ E_f.
Introduction to Algorithms Day 40 L23.15 2001 by Charles E. Leiserson
Case 1
Case: (u, v) ∈ E_f.
We have
δ(v) ≤ δ(u) + 1      (triangle inequality)
     ≤ δ′(u) + 1     (induction)
     = δ′(v)         (breadth-first path),
and thus monotonicity of δ(v) is established.
Introduction to Algorithms Day 40 L23.16 2001 by Charles E. Leiserson
Case 2
Case: (u, v) ∉ E_f.
Since (u, v) ∈ E_{f ′}, the augmenting path p that produced
f ′ from f must have included (v, u). Moreover, p is a
breadth-first path in G_f:
    p = s → ... → v → u → ... → t.
Thus, we have
δ(v) = δ(u) − 1      (breadth-first path)
     ≤ δ′(u) − 1     (induction)
     = δ′(v) − 2     (breadth-first path)
     < δ′(v),
thereby establishing monotonicity for this case, too.
Introduction to Algorithms Day 40 L23.17 2001 by Charles E. Leiserson
Counting flow augmentations
Theorem. The number of flow augmentations
in the Edmonds-Karp algorithm (Ford-Fulkerson
with breadth-first augmenting paths) is O(VE).
Proof. Let p be an augmenting path, and suppose that
we have c_f (u, v) = c_f (p) for edge (u, v) ∈ p. Then we
say that (u, v) is critical, and it disappears from the
residual graph after flow augmentation.
Example:
[Figure: a residual network G_f with an augmenting path of residual
capacity c_f (p) = 2; the edge achieving the minimum is critical.]
Introduction to Algorithms Day 40 L23.18 2001 by Charles E. Leiserson
Counting flow augmentations
Theorem. The number of flow augmentations
in the Edmonds-Karp algorithm (Ford-Fulkerson
with breadth-first augmenting paths) is O(VE).
Proof. Let p be an augmenting path, and suppose that
the residual capacity of edge (u, v) ∈ p is c_f (u, v) = c_f (p).
Then we say (u, v) is critical, and it disappears from the
residual graph after flow augmentation.
Example:
[Figure: the residual network G_{f ′} after augmenting by 2; the critical
edge is gone.]
Introduction to Algorithms Day 40 L23.19 2001 by Charles E. Leiserson
Counting flow augmentations (continued)
The first time an edge (u, v) is critical, we have δ(v) =
δ(u) + 1, since p is a breadth-first path. We must wait
until (v, u) is on an augmenting path before (u, v) can
be critical again. Let δ′ be the distance function when
(v, u) is on an augmenting path. Then, we have
δ′(u) = δ′(v) + 1      (breadth-first path)
      ≥ δ(v) + 1       (monotonicity)
      = δ(u) + 2       (breadth-first path).
[Figure: an augmenting path from s to t through the edge (u, v).]
Introduction to Algorithms Day 40 L23.20-L23.24 2001 by Charles E. Leiserson
Counting flow augmentations (continued)
[Figure sequence illustrating the argument above: when (u, v) is first
critical, δ(u) = 5 and δ(v) = 6; by the time (v, u) lies on an augmenting
path, monotonicity gives δ′(v) ≥ 6 and hence δ′(u) ≥ 7; when (u, v) next
becomes critical, δ(v) ≥ 8.]
Introduction to Algorithms Day 40 L23.25 2001 by Charles E. Leiserson
Running time of Edmonds-Karp
Distances start out nonnegative, never decrease, and are
at most |V| − 1 until the vertex becomes unreachable.
Thus, (u, v) occurs as a critical edge O(V) times, because
δ(v) increases by at least 2 between occurrences. Since
the residual graph contains O(E) edges, the number of
flow augmentations is O(VE).
Corollary. The Edmonds-Karp maximum-flow
algorithm runs in O(VE^2) time.
Proof. Breadth-first search runs in O(E) time, and all
other bookkeeping is O(V) per augmentation.
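To tie the pieces together, here is a compact Edmonds-Karp sketch (my own; not the course's code). It augments along breadth-first paths in the residual network, using the same dict-of-capacities convention as above; capacities are assumed to be integers, and the small network in the example is hypothetical.

# Hypothetical sketch of Edmonds-Karp: Ford-Fulkerson with breadth-first
# augmenting paths. 'c' maps (u, v) pairs to integer capacities; missing
# pairs have capacity 0. Returns the flow dict f and its value |f|.

from collections import deque

def edmonds_karp(V, c, s, t):
    f = {}                                        # net flow; f[(u,v)] = -f[(v,u)]
    def cf(u, v):                                 # residual capacity
        return c.get((u, v), 0) - f.get((u, v), 0)

    value = 0
    while True:
        # Breadth-first search for an augmenting path in G_f.
        pred = {s: None}
        queue = deque([s])
        while queue and t not in pred:
            u = queue.popleft()
            for v in V:
                if v not in pred and cf(u, v) > 0:
                    pred[v] = u
                    queue.append(v)
        if t not in pred:                         # no augmenting path left
            return f, value
        # Collect the path edges, find the bottleneck, then augment.
        path = []
        v = t
        while pred[v] is not None:
            path.append((pred[v], v))
            v = pred[v]
        bottleneck = min(cf(u, v) for u, v in path)
        for u, v in path:
            f[(u, v)] = f.get((u, v), 0) + bottleneck
            f[(v, u)] = f.get((v, u), 0) - bottleneck   # skew symmetry
        value += bottleneck

# Example usage (hypothetical small network):
if __name__ == "__main__":
    V = ["s", "a", "b", "t"]
    c = {("s","a"): 3, ("s","b"): 2, ("a","b"): 1, ("a","t"): 2, ("b","t"): 3}
    print(edmonds_karp(V, c, "s", "t")[1])        # 5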
Introduction to Algorithms Day 40 L23.26 2001 by Charles E. Leiserson
Best to date
The asymptotically fastest algorithm to date for
maximum flow, due to King, Rao, and Tarjan,
runs in O(VE log_{E/(V lg V)} V) time.
If we allow running times as a function of edge
weights, the fastest algorithm for maximum
flow, due to Goldberg and Rao, runs in time
O(min{V^{2/3}, E^{1/2}} · E lg(V^2/E + 2) lg C),
where C is the maximum capacity of any edge
in the graph.