Data Structures and Algorithms

Textbook:
Data Structures and Algorithm Analysis in C
by Mark Allen Weiss
Florida International University
http://www.cs.fiu.edu/~weiss
Source code in the textbook can be found by following the link.

Data structure: a method of organising (large amounts of) data.
Algorithm: a way to process data.

What is a good program?
Three criteria:
1. Correct - testing, verification
2. Efficient - this course is mainly concerned with efficiency.
3. Simple.
Programs, including the ones you are going to write, will be judged by all these criteria.

Question: why do we need to worry about efficiency, since computers have become more and more powerful?
Answer: some problems are difficult or very difficult, and even on the fastest computers they still take a lot of time or space to solve. Moreover, some problems can be solved efficiently, but bad designs will take too much time or space. Therefore, we must study methods that lead to good solutions.
Among decidable problems, there is no clear line separating hard from easy problems. Usually, we expect tractable problems to be solvable in polynomial time. Problems that have ...
However, in real life, polynomial is more likely $N^2$ ...

[Table: running time versus input size (N = 10, 20, 50, 100) for several growth rates including $2^N$, at a fixed speed given in million instructions per second; entries range from fractions of a second to "a 184-digit number of centuries".]
Find the shortest paths from one vertex to all other vertices.
This problem can be solved by an algorithm with $O(N^2)$ running time (we will define the O soon), where N is the number of vertices. Therefore, this is an easy problem. The algorithm is in Chapter 9.

An efficient implementation will use iteration instead.
When the input gets bigger, the difference between the running times of the two programs becomes significant.
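As a rough illustration of the recursive and iterative ways of computing Fibonacci numbers, here is a minimal sketch in C. The names FibRecursive and FibIterative are hypothetical, and the convention Fib(0) = Fib(1) = 1 is an assumption; the course's own programs may differ.

```c
/* Recursive version: follows the definition directly.
   Assumed convention (not stated in the slides): Fib(0) = Fib(1) = 1. */
unsigned long long FibRecursive(int N)
{
    if (N <= 1)
        return 1;
    return FibRecursive(N - 1) + FibRecursive(N - 2);
}

/* Iterative version: keeps only the last two values,
   so it performs about N additions. */
unsigned long long FibIterative(int N)
{
    unsigned long long Prev = 1, Curr = 1, Next;
    int i;

    for (i = 2; i <= N; i++) {
        Next = Prev + Curr;
        Prev = Curr;
        Curr = Next;
    }
    return Curr;
}
```

Both functions return the same values, but the recursive one recomputes the same subproblems many times, which is where the gap in running time comes from.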
Running time of the recursive Fibonacci program
Let the running time be $T(N)$, where $N$ is the input.
$T(0) = T(1) = 1$
$T(N) = T(N-1) + T(N-2) + 2$
When $N = 0$ or $N = 1$, the program just executes a return statement, and we assume this takes one unit of time. When $N > 1$, the program first calculates Fib(N-1) and Fib(N-2), which take $T(N-1)$ and $T(N-2)$ time respectively, then adds the two values and returns the result; we assume these two operations take two units of time.
We can easily prove (how?) that $T(N) \ge \mathrm{Fib}(N)$. It can be shown that $\mathrm{Fib}(N) \ge 1.5^N$ for sufficiently large $N$. Therefore, the program has exponential running time.

Algorithm Analysis Method
Use positive functions to denote the running time of an algorithm.
Definitions
• $T(N) = O(f(N))$ if there are positive constants $c$ and $n$ such that $T(N) \le c f(N)$ when $N \ge n$.
• $T(N) = \Omega(g(N))$ if there are positive constants $c$ and $n$ such that $T(N) \ge c g(N)$ when $N \ge n$.
• $T(N) = \Theta(h(N))$ if and only if $T(N) = O(h(N))$ and $T(N) = \Omega(h(N))$.
• $T(N) = o(p(N))$ if and only if $T(N) = O(p(N))$ and $T(N) \ne \Theta(p(N))$.
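To see how the definitions of O, Ω, Θ and o are applied, consider a worked example. The function $T(N) = 3N^2 + 5N$ is hypothetical, chosen only for illustration, and the constants are one possible choice among many.

```latex
\begin{align*}
T(N) &= 3N^2 + 5N \\
N \ge 5 &\;\Rightarrow\; 5N \le N^2 \;\Rightarrow\; T(N) \le 4N^2
  && \text{so } T(N) = O(N^2) \text{ with } c = 4,\ n = 5 \\
N \ge 1 &\;\Rightarrow\; T(N) \ge 3N^2
  && \text{so } T(N) = \Omega(N^2) \text{ with } c = 3,\ n = 1 \\
&\;\Rightarrow\; T(N) = \Theta(N^2) \\
N \ge 1 &\;\Rightarrow\; T(N) \le 3N^3 + 5N^3 = 8N^3
  && \text{so } T(N) = O(N^3) \text{ with } c = 8,\ n = 1 \\
\frac{T(N)}{N^3} &\to 0 \text{ as } N \to \infty
  && \text{so } T(N) \ne \Omega(N^3), \text{ hence } T(N) = o(N^3)
\end{align*}
```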
Computation Model
Assume any single statement takes one unit of time to execute. Let the input size be $N$, and let $T_{avg}(N)$ and $T_{worst}(N)$ be the average and worst-case running times. Obviously, $T_{avg}(N) \le T_{worst}(N)$.
Usually, we only study the worst-case running time, because
• it provides an upper bound
• it is easier to analyse than the average case
Sometimes, we also study the best-case running time.

Direct Calculation
A simple example: calculate $\sum_{i=1}^{N} i^3$
int Sum(int N)
{
    int i, PartialSum;
    PartialSum=0;            // 1
    for (i=1;i<=N;i++)       // 2
        PartialSum+=i*i*i;   // 3
    return PartialSum;       // 4
}
Lines 1 and 4: one unit each
Line 3: two *, one +, one assignment, 4 units per iteration, together $4N$
Line 2: one initialization, $N+1$ tests, $N$ increments, together $2N+2$
Total: $6N+4 = O(N)$
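The count of 6N+4 can be checked by instrumenting the routine. The sketch below is an illustration only: SumCounted and the global UnitCount are hypothetical names, and the unit costs are the ones assumed in the analysis above, not measurements of real machine time.

```c
/* Hypothetical counter of "unit time" steps; not part of the original Sum. */
static long UnitCount;

int SumCounted(int N)
{
    int i, PartialSum;

    UnitCount = 0;
    PartialSum = 0;                 /* line 1: 1 unit */
    UnitCount += 1;

    i = 1;                          /* line 2: one initialization */
    UnitCount += 1;
    for (;;) {
        UnitCount += 1;             /* line 2: one test, N+1 tests in total */
        if (!(i <= N))
            break;
        PartialSum += i * i * i;    /* line 3: two *, one +, one assignment */
        UnitCount += 4;
        i++;                        /* line 2: one increment, N in total */
        UnitCount += 1;
    }

    UnitCount += 1;                 /* line 4: 1 unit for the return */
    return PartialSum;
}
```

After a call such as SumCounted(10), UnitCount holds 6*10+4 = 64, matching the formula.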
Analysis of Recursion
Use recurrence equations.
long int Factorial(int N)
{
    if (N<=1)
        return 1;
    else
        return N*Factorial(N-1);
}
$T(1) = 1$
$T(N) = 2 + T(N-1)$
$T(N) = \underbrace{2 + 2 + \cdots + 2}_{N-1} + T(1) = 2(N-1) + 1 = O(N)$

More Examples:
Search
Given an array of integers A[0], A[1], ..., A[N-1] and an integer X, find i such that A[i]=X, or return i=-1 if X is not in the array.
Linear Search
Search the array one element at a time.
O(N)
Binary Search
If the array is sorted, A[0] ≤ A[1] ≤ ... ≤ A[N-1], then check the middle element. If it equals X, the job is done. If it does not equal X, search either the first half (when X is smaller) or the second half (when X is larger) of the array.
Code: Binary search, Figure 2.9
Running time $O(\log N)$. How to calculate this?
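The two search strategies can be sketched in C as follows. This is a minimal illustration, not the textbook's Figure 2.9; the function names and signatures are assumptions, and BinarySearch requires the array to be sorted in non-decreasing order.

```c
/* Linear search: examine the elements one by one, O(N) comparisons. */
int LinearSearch(const int A[], int N, int X)
{
    int i;

    for (i = 0; i < N; i++)
        if (A[i] == X)
            return i;
    return -1;                          /* X is not in the array */
}

/* Binary search on a sorted array: every iteration halves the range
   [Low, High] that can still contain X. */
int BinarySearch(const int A[], int N, int X)
{
    int Low = 0, High = N - 1, Mid;

    while (Low <= High) {
        Mid = Low + (High - Low) / 2;   /* avoids overflow of Low + High */
        if (A[Mid] < X)
            Low = Mid + 1;              /* X can only be in the second half */
        else if (A[Mid] > X)
            High = Mid - 1;             /* X can only be in the first half */
        else
            return Mid;
    }
    return -1;                          /* X is not in the array */
}
```

Each pass through the loop does a constant amount of work and halves the remaining range, so the loop runs about $\log_2 N$ times.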
An algorithm is
• $O(\log N)$ if it takes constant (i.e., $O(1)$) time to cut the problem size by a fraction (usually 1/2).
• $O(N)$ if it takes constant time to cut the problem size by a constant (e.g., $N \to N-1$).

Exponentiation: $X^N$
Obvious solution: $N-1$ multiplications, $O(N)$, inefficient when $N$ is large.
An $O(\log N)$ solution
$X^{62} = (X^2)^{31}$
$= (X^4)^{15} X^2$
$= (X^8)^7 X^4 X^2$
$= (X^{16})^3 X^8 X^4 X^2$
$= (X^{32}) X^{16} X^8 X^4 X^2$
Code: Efficient exponentiation, Figure 2.11.
Calculate Pow(X,62) following the execution of the algorithm.
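A repeated-squaring routine in this spirit might look like the sketch below. It is not the textbook's Figure 2.11: the signature, the use of long long, and the global MultCount are assumptions made for illustration, and overflow is ignored.

```c
/* Hypothetical counter, used only to observe how many multiplications occur. */
static int MultCount;

/* Computes X^N for N >= 0 by repeated squaring.
   Each call does at most two multiplications and halves N. */
long long Pow(long long X, unsigned int N)
{
    if (N == 0)
        return 1;
    if (N == 1)
        return X;
    MultCount++;                        /* the squaring X * X */
    if (N % 2 == 0)
        return Pow(X * X, N / 2);
    MultCount++;                        /* the extra factor X for odd N */
    return Pow(X * X, N / 2) * X;
}
```

With MultCount reset to 0, the call Pow(2, 62) follows the chain 62, 31, 15, 7, 3, 1 and performs nine multiplications, compared with 61 for the obvious method.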
Bad alternatives
return Pow(Pow(X,2),N/2);
return Pow(Pow(X,N/2),2);
Both: infinite loop when N=2, because a recursive call again has 2 as its second argument and makes no progress.
return Pow(X,N/2)*Pow(X,N/2);
less efficient, why (what is the running time)?
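One way to approach the last question is to compare recurrences. The sketch below assumes that the work outside the recursive calls costs a constant $c$ and, for simplicity, that $N = 2^k$; it is an outline of the argument rather than a full proof.

```latex
\begin{align*}
\text{one recursive call:}\quad
T(N) &= T(N/2) + c \\
     &= T(N/2^k) + kc = T(1) + c \log_2 N = O(\log N) \\[6pt]
\text{two recursive calls:}\quad
T(N) &= 2\,T(N/2) + c \\
     &= 2^k\,T(N/2^k) + c\,(2^k - 1) \\
     &= N\,T(1) + c\,(N - 1) = O(N)
\end{align*}
```

Under these assumptions, computing Pow(X,N/2) twice gives a linear running time, no better than the obvious solution.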