SORTS
Counting sort
Bucket sort
Radix sort
How fast can we sort?
All the sorting algorithms we have
seen so far are comparison sorts: they
use only comparisons to determine
the relative order of elements.
• E.g., insertion sort, merge sort,
quicksort, heapsort.
The best worst-case running time
that we’ve seen for comparison
sorting is O(n lg n).
Decision-tree example
[figure: decision tree for sorting 〈a1, a2, …, an〉; each internal node compares a pair of elements, e.g. 1:2, 2:3, 1:3]
Counting sort
Assumptions:
• n records, each consisting of a key and satellite data
• Keys are integers in the range 1 to k
Space:
• The unsorted list is stored in A; the sorted list will be stored in an additional array B
• Uses an additional array C of size k
Counting sort
Main idea:
1. For each key value i, i = 1, …, k, count the
number of times the key occurs in the
unsorted input array A. Store the results in an
auxiliary array C.
2. Use these counts to compute offsets.
Offset_i gives the location
where the record with key value i will be
stored in the sorted output list B.
Counting-sort example

A (indices 1–5): 4 1 3 4 3    C (indices 1–4, k = 4)

Loop 1 — initialize the counts:

for i ← 1 to k
    do C[i] ← 0

C: 0 0 0 0
Loop 2 — count the occurrences of each key:

for j ← 1 to n
    do C[A[j]] ← C[A[j]] + 1    ⊳ C[i] = |{key = i}|

Scanning A = 4 1 3 4 3 left to right, C evolves:
0 0 0 1 → 1 0 0 1 → 1 0 1 1 → 1 0 1 2 → 1 0 2 2
Loop 3 — running sums:

for i ← 2 to k
    do C[i] ← C[i] + C[i–1]    ⊳ C[i] = |{key ≤ i}|

Starting from C = 1 0 2 2:
1 1 2 2 → 1 1 3 2 → 1 1 3 5
Loop 4 — place each record, scanning A right to left and decrementing its count:

for j ← n downto 1
    do B[C[A[j]]] ← A[j]
       C[A[j]] ← C[A[j]] – 1

j = 5 (key 3): B[3] ← 3    C: 1 1 2 5
j = 4 (key 4): B[5] ← 4    C: 1 1 2 4
j = 3 (key 3): B[2] ← 3    C: 1 1 1 4
j = 2 (key 1): B[1] ← 1    C: 0 1 1 4
j = 1 (key 4): B[4] ← 4    C: 0 1 1 3

B: 1 3 3 4 4
Counting sort

for i ← 1 to k
    do C[i] ← 0
for j ← 1 to n
    do C[A[j]] ← C[A[j]] + 1    ⊳ C[i] = |{key = i}|
for i ← 2 to k
    do C[i] ← C[i] + C[i–1]    ⊳ C[i] = |{key ≤ i}|
for j ← n downto 1
    do B[C[A[j]]] ← A[j]
       C[A[j]] ← C[A[j]] – 1
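The four loops above can be collected into a short Python sketch (the function name and the 0-indexed output list are my own; the slides use 1-indexed arrays):

```python
def counting_sort(A, k):
    """Counting sort for integer keys in 1..k (sketch of the slides' pseudocode)."""
    n = len(A)
    C = [0] * (k + 1)              # C[0] unused, to mirror the 1-indexed slides
    for key in A:                  # loop 2: C[i] = |{key = i}|
        C[key] += 1
    for i in range(2, k + 1):      # loop 3: C[i] = |{key <= i}|
        C[i] += C[i - 1]
    B = [None] * n
    for key in reversed(A):        # loop 4: scan right to left (keeps ties stable)
        B[C[key] - 1] = key        # "- 1" converts the 1-indexed slot to 0-indexed
        C[key] -= 1
    return B
```

On the running example, counting_sort([4, 1, 3, 4, 3], 4) yields [1, 3, 3, 4, 4].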
Analysis

for i ← 1 to k                    Θ(k)
    do C[i] ← 0
for j ← 1 to n                    Θ(n)
    do C[A[j]] ← C[A[j]] + 1
for i ← 2 to k                    Θ(k)
    do C[i] ← C[i] + C[i–1]
for j ← n downto 1                Θ(n)
    do B[C[A[j]]] ← A[j]
       C[A[j]] ← C[A[j]] – 1

Total: Θ(n + k)
Running time
If k = O(n), then counting sort takes Θ(n) time.
• But, sorting takes Ω(n lg n) time!
• Where’s the fallacy?
Answer:
• Comparison sorting takes Ω(n lg n) time.
• Counting sort is not a comparison sort.
• In fact, not a single comparison between
elements occurs!
Stable sorting
Counting sort is a stable sort: it preserves
the input order among equal elements.
A: 4 1 3 4 3
B: 1 3 3 4 4
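Stability is easiest to see by sorting full records rather than bare keys. A minimal sketch (the record format and helper name are assumptions, not from the slides):

```python
def counting_sort_records(records, k, key):
    """Stable counting sort of records with integer keys in 1..k."""
    C = [0] * (k + 1)
    for r in records:
        C[key(r)] += 1
    for i in range(2, k + 1):
        C[i] += C[i - 1]
    B = [None] * len(records)
    for r in reversed(records):    # right-to-left scan preserves input order of ties
        B[C[key(r)] - 1] = r
        C[key(r)] -= 1
    return B

records = [(4, 'a'), (1, 'b'), (3, 'c'), (4, 'd'), (3, 'e')]
out = counting_sort_records(records, 4, key=lambda r: r[0])
# equal keys keep their input order: 'c' before 'e', 'a' before 'd'
```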
Radix sort
• Origin: Herman Hollerith’s card-sorting machine
for the 1890 U.S. Census.
• Digit-by-digit sort.
• Hollerith’s original idea: sort on the most-significant
digit first.
• The modern approach: sort on the least-significant digit
first, using an auxiliary stable sort.
Radix sort
Main idea
Break the key into a “digit” representation:
    key = i_d i_{d−1} … i_2 i_1
A “digit” can be a number in any base, a
character, etc.
Radix sort:
for i ← 1 to d
    sort “digit” i using a stable sort    ⊳ least-significant digit first
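The loop above can be sketched in Python, with a stable counting sort on each digit as the auxiliary sort (the function name and the base parameter are assumptions):

```python
def radix_sort(A, base=10):
    """LSD radix sort for non-negative integers; stable counting sort per digit."""
    if not A:
        return A
    exp = 1                                  # place value of the current digit
    while max(A) // exp > 0:
        C = [0] * base                       # counting sort on digit (x//exp) % base
        for x in A:
            C[(x // exp) % base] += 1
        for i in range(1, base):
            C[i] += C[i - 1]
        B = [None] * len(A)
        for x in reversed(A):                # right-to-left scan keeps the pass stable
            d = (x // exp) % base
            B[C[d] - 1] = x
            C[d] -= 1
        A = B
        exp *= base
    return A
```

Stability of each pass is what makes the least-significant-digit order survive later passes.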
Which stable sort? Counting sort is the natural
choice, since each “digit” takes only a small
number of distinct values.

Bucket sort

[figure: bucket-sort example — the ten keys .78, .17, .39, .26, .72, .94, .21, .12, .23, .68 are distributed into buckets 0–9 by their first digit (e.g., bucket 1: .12, .17; bucket 2: .21, .23, .26; bucket 7: .72, .78), each bucket is sorted, and the buckets are concatenated.]
BUCKET_SORT(A)
    n ← length[A]    ⊳ k = number of buckets
    for i ← 1 to n
        do insert A[i] into an appropriate bucket
    for i ← 1 to k
        do sort the ith bucket using any reasonable comparison sort
    concatenate the buckets together in order
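A Python sketch of BUCKET_SORT for keys uniformly distributed in [0, 1), matching the decimal-key example (the function name and the default of k = n buckets are assumptions):

```python
def bucket_sort(A, k=None):
    """Bucket sort for keys in [0, 1); k buckets, one per element by default."""
    n = len(A)
    if n == 0:
        return []
    k = k or n
    buckets = [[] for _ in range(k)]
    for x in A:
        buckets[int(x * k)].append(x)    # key x goes to bucket floor(x * k)
    out = []
    for b in buckets:                    # sort each bucket, then concatenate
        b.sort()                         # any reasonable comparison sort works here
        out.extend(b)
    return out
```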
How should we implement the
buckets?
Linked lists or arrays?
• A linked list saves space, since some buckets
may hold only a few entries while others hold
many.
• But with linked lists, array-based sorts such as
quicksort and heapsort cannot be used directly.
Analysis
Let S(m) denote the number of
comparisons for a bucket with m keys,
and let n_i be the number of keys in the ith bucket.

Total number of comparisons = ∑_{i=1}^{k} S(n_i)
Analysis contd.
Suppose each bucket is sorted with a Θ(m log m)
comparison sort, i.e., S(m) = Θ(m log m). If the keys
are uniformly distributed, each bucket receives
about n/k keys, so

∑_{i=1}^{k} S(n_i) ≈ k · (n/k) log(n/k) = n log(n/k)

If k = n/10, then about n log 10 comparisons
would be done, and the running time would
be linear in n.