NUMERICAL ALGORITHMS
Dr Derek O’Connor
$$c_{ij} = \sum_{k=1}^{n} a_{ik} \times b_{kj}, \qquad i = 1, 2, \ldots, m, \quad j = 1, 2, \ldots, p. \qquad (1)$$
Do the following
1. Using this definition, derive an expression, Nops (m, n, p), for the number of floating-point additions
and multiplications needed to form the matrix product C = AB,
2. Derive a similar expression Nops (m, n, p, q) for D = ABC, where A is m × n, B is n × p, and C is
p × q , and D is m × q.
Although matrix multiplication is associative, i.e., A(BC) = (AB)C, show that, in general,
$$N_{ops}[A(BC)] \neq N_{ops}[(AB)C]. \qquad (2)$$
That is, Nops (m, n, p, q) depends on the order in which the product ABC is formed or parenthesized.
3. Write a Matlab function, function C = MatMult(A,B), that implements the definition above.
Test and compare this function with Matlab’s C = A*B for random square matrices of size
n = 250, 500, 1000.
4. Find 3 sets of values {m, n, p, q} that demonstrate clearly that the inequality (2) above is true
in general. Use both your function MatMult and Matlab’s C = A*B in this demonstration and
compare the results.
5. Calculate the Mflop/s rate for each of the tests above. Remember to give the machine parameters
with these rates.
note: When timing the operations above use Matlab’s cputime. Here is the help for this function :
CPUTIME returns the CPU time in seconds that has been used
by the MATLAB process since MATLAB started.
For example:
t=cputime; your_operation; cputime-t
returns the cpu time used to run your_operation.
Analysis
Standard matrix multiplication, C = AB, where A is m × n, B is n × p, and C is m × p is as follows :
$$c_{ij} = \sum_{k=1}^{n} a_{ik} \times b_{kj}, \qquad i = 1, 2, \ldots, m, \quad j = 1, 2, \ldots, p.$$
There are m × p elements cij and each requires the summation ai1 × b1j + ai2 × b2j + · · · + ain × bnj , which
requires n mults and n − 1 adds. Hence we get a total of 2mnp − mp = O(mnp) operations.
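As a quick check of this count, the per-element work can be multiplied out and then specialised to the square case used in the tests (m = n = p). The value for n = 1000 is plain arithmetic on the formula above, not a measured quantity.

```latex
N_{ops}(m,n,p) = mp\,\bigl[\,n + (n-1)\,\bigr] = 2mnp - mp,
\qquad
N_{ops}(n,n,n) = 2n^{3} - n^{2},
\qquad
N_{ops}(1000,1000,1000) = 2\times 10^{9} - 10^{6} = 1.999\times 10^{9}.
```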
The matrix triple multiplication operation D = ABC, where A is m × n, B is n × p, and C is p × q, and
D is m × q, is defined in terms of the matrix pair multiplication above. This gives two possible orders of
multiplication:
D1 = (AB)C or D2 = A(BC).
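Mathematically, D1 and D2 are identical, but computationally they are not. Using O(mnp) for matrix pair
multiplication, and writing N1 for the cost of A(BC) and N2 for the cost of (AB)C as in the program below, we have
N1 = npq + mnq and N2 = mnp + mpq.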
It is very difficult to say in general when these two functions have different or equal values. A crude way of
getting some idea is to run this program
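function [Less, Equal, Great] = MNPQ(low, high)
% Classifies every 4-tuple (m,n,p,q) with entries in low:high
% according to whether N1 < N2, N1 == N2, or N1 > N2.
kl = 0; ke = 0; kg = 0;                       % counters for the three cases
Less = zeros(0,4); Equal = zeros(0,4); Great = zeros(0,4);
for m = low:high
  for n = low:high
    for p = low:high
      for q = low:high
        N1 = n*p*q + m*n*q;                   % cost of A(BC)
        N2 = m*n*p + m*p*q;                   % cost of (AB)C
        if N1 < N2
          kl = kl + 1;
          Less(kl,:) = [m n p q];
        elseif N1 == N2
          ke = ke + 1;
          Equal(ke,:) = [m n p q];
        else
          kg = kg + 1;
          Great(kg,:) = [m n p q];
        end;
      end;
    end;
  end;
end;
%-------------------------- End of MNPQ(low, high) ------------------%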
Running this program for low = 1 and high = 5 gives a total of 5^4 = 625 4-tuples (m, n, p, q) of which 290
give N1 < N2 , 45 give N1 = N2 , and 290 give N1 > N2 . Here are the 45 for which N1 = N2 .
1 1 1 1    2 1 1 2    3 1 1 3    4 1 1 4    5 1 1 5
1 1 2 2    2 2 1 1    3 2 2 3    4 2 2 4    5 2 2 5
1 1 3 3    2 2 2 2    3 3 1 1    4 3 3 4    5 3 3 5
1 1 4 4    2 2 3 3    3 3 2 2    4 4 1 1    5 4 4 5
1 1 5 5    2 2 4 4    3 3 3 3    4 4 2 2    5 5 1 1
1 2 2 1    2 2 5 5    3 3 4 4    4 4 3 3    5 5 2 2
1 3 3 1    2 3 3 2    3 3 5 5    4 4 4 4    5 5 3 3
1 4 4 1    2 4 4 2    3 4 4 3    4 4 5 5    5 5 4 4
1 5 5 1    2 5 5 2    3 5 5 3    4 5 5 4    5 5 5 5
The main point here is that for most 4-tuples (m, n, p, q) we have N1 ≠ N2, which prompts the question:
How do we decide on A(BC) or (AB)C ? The obvious answer is to calculate N1 and N2 before we do the
computations.
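A minimal sketch of that idea, assuming the leading-order counts N1 = npq + mnq for A(BC) and N2 = mnp + mpq for (AB)C used in the program above. The function name TripleProduct and its second output are hypothetical, chosen only for illustration.

```matlab
function [D, order] = TripleProduct(A, B, C)
% Hypothetical helper: forms D = A*B*C in whichever order has the
% smaller leading-order operation count.
[m, n]  = size(A);
[n2, p] = size(B);
[p2, q] = size(C);
if n ~= n2 || p ~= p2
    error('Matrix sizes incompatible');
end
N1 = n*p*q + m*n*q;     % cost of A(BC): form BC first, then A times it
N2 = m*n*p + m*p*q;     % cost of (AB)C: form AB first, then times C
if N1 < N2
    D = A*(B*C);
    order = 'A(BC)';
else
    D = (A*B)*C;        % also used when the two counts are equal
    order = '(AB)C';
end
```

For example, with A of size 10 × 100, B of size 100 × 5 and C of size 5 × 50, the counts are N1 = 75000 and N2 = 7500, so the sketch would choose (AB)C.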
Exercise 6.0.2 : Re-write the first program to handle the general case, i.e., when k is not a power of 2. □
Matrix-Chain Multiplication.
Calculating A1 A2 · · · Ak, where Ai is an m_{i−1} × m_i matrix, is not an easy extension of the 3-matrix case.
Consider A1 A2 A3 A4. This can be parenthesized in 5 different ways:
((A1 A2)(A3 A4)), (A1((A2 A3)A4)), (A1(A2(A3 A4))), (((A1 A2)A3)A4), ((A1(A2 A3))A4).
Let C(k) be the number of ways to parenthesize the matrix-chain A1 A2 · · · Ak, with C(1) = 1. Let us put the
outermost split between Ai and Ai+1: (A1 A2 . . . Ai)(Ai+1 . . . Ak). The left part has i matrices and the right
part has k − i, so there are C(i) ways to parenthesize the left part and C(k − i) ways to parenthesize the right
part. Any parenthesization of the left part may be combined with any parenthesization of the right part, so
there are C(i)C(k − i) ways of doing this. Now i can have any value between 1 and k − 1. Hence we must sum
C(i)C(k − i) over all i to get
$$C(k) = \sum_{i=1}^{k-1} C(i)\,C(k-i).$$
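The recurrence can be evaluated directly. The sketch below assumes the starting value C(1) = 1 (a single matrix needs no parentheses); the function name NumParens is hypothetical.

```matlab
function C = NumParens(kmax)
% Hypothetical helper: C(k) is the number of ways to parenthesize a
% chain of k matrices, for k = 1, ..., kmax, using
% C(k) = sum_{i=1}^{k-1} C(i)*C(k-i) with C(1) = 1 (assumed base case).
C = zeros(1, kmax);
C(1) = 1;
for k = 2:kmax
    for i = 1:k-1
        C(k) = C(k) + C(i)*C(k-i);
    end
end
```

Calling NumParens(15) gives C(4) = 5 and C(6) = 42, the counts quoted in the text for four and six matrices.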
Catalan Numbers
k      1   2   3   4   5    6    7     8     9      10     15
C(k)   1   1   2   5   14   42   132   429   1430   4862   2674440
Consider the following example, taken from Cormen, Leiserson and Rivest, page 307,
Matrix-Chain Multiplication
Matrix       A1        A2        A3       A4       A5        A6
Dimensions   30 × 35   35 × 15   15 × 5   5 × 10   10 × 20   20 × 25
There are 42 different ways to parenthesize A1 A2 · · · A6. The optimum can be found by dynamic programming
in O(k^3) time. The optimum parenthesization is ((A1(A2 A3))((A4 A5)A6)), which needs 15125 scalar multiplications.
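The dynamic-programming search can be sketched as follows. This is an illustrative implementation of the standard textbook recurrence, not code from these notes. It assumes the cost of multiplying an a × b matrix by a b × c matrix is counted as abc scalar multiplications, and the names ChainOrder and BuildParen are hypothetical.

```matlab
function [cost, paren] = ChainOrder(d)
% Hypothetical helper: matrix A_i has size d(i) x d(i+1), so a chain of
% k matrices is described by a vector d with k+1 entries.
k = numel(d) - 1;
M = zeros(k, k);          % M(i,j) = least cost of forming A_i ... A_j
S = zeros(k, k);          % S(i,j) = split index that achieves M(i,j)
for len = 2:k             % length of the sub-chain
    for i = 1:k-len+1
        j = i + len - 1;
        M(i,j) = Inf;
        for s = i:j-1     % try (A_i ... A_s)(A_{s+1} ... A_j)
            c = M(i,s) + M(s+1,j) + d(i)*d(s+1)*d(j+1);
            if c < M(i,j)
                M(i,j) = c;
                S(i,j) = s;
            end
        end
    end
end
cost  = M(1,k);
paren = BuildParen(S, 1, k);

function str = BuildParen(S, i, j)
% Rebuilds the parenthesization of A_i ... A_j from the split table S.
if i == j
    str = sprintf('A%d', i);
else
    s = S(i,j);
    str = ['(' BuildParen(S, i, s) BuildParen(S, s+1, j) ')'];
end
```

For the six-matrix example, [cost, paren] = ChainOrder([30 35 15 5 10 20 25]) returns cost = 15125 and paren = '((A1(A2A3))((A4A5)A6))'. The three nested loops are what give the O(k^3) running time.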
Tests, Part 3
The test below gave the following results :

Matlab 6.5 vs MatMult. (P III Xeon 800MHz), CPU time in seconds

               n = 250   n = 500   n = 1000
Matlab           0.046     0.5       3.812
MatMult          1.843    16.235   130.125
ratio mm/ml     40.0      32.5      34.0

%====================== TestMult(sizes) ===========%
% Tests and compares Matlabs matrix multiplication
% against the loop implementation of the standard
% definition of matrix multiplication
%==================================================%
function [tmatlab, tmatmult] = TestMult(sizes)
%==================================================%
[m ndims] = size(sizes);
tmatlab = zeros(1,ndims);
tmatmult = zeros(1,ndims);
for n = 1:ndims
    A = rand(sizes(n),sizes(n));
    B = rand(sizes(n),sizes(n));
    C = zeros(sizes(n),sizes(n));
    tstart = cputime;
    C = A*B;
    tmatlab(n) = cputime - tstart;
end;
for n = 1:ndims
    A = rand(sizes(n),sizes(n));
    B = rand(sizes(n),sizes(n));
    C = zeros(sizes(n),sizes(n));
    tstart = cputime;
    C = MatMult(A,B);
    tmatmult(n) = cputime - tstart;
end;
%---------------- End of TestMult ----------------%

%============= C = MatMult 0,1,2(A,B) ============%
%
% Variations of the original are shown in comments
%
%==================================================%
function C = MatMult(A,B)
%==================================================%
[m,n] = size(A); [p,q] = size(B);
if n ~= p
    error('Matrix sizes incompatible');
end;
p = q; % makes parameters same as notes.
C = zeros(m,p);
for i = 1:m
    for j = 1:p
        sum = 0.0;
        for k = 1:n
            sum = sum + A(i,k)*B(k,j);
        end;
        C(i,j) = sum;
    end;
end;
%------------------ Version 1 --------------------%
% for i = 1:m
%     for j = 1:p
%         C(i,j) = C(i,j) + A(i,:)*B(:,j);
%     end;
% end;
%------------------ Version 2 --------------------%
% for i = 1:m
%     C(i,:) = C(i,:) + A(i,:)*B;
% end;
%---------------- End of MatMult1 ----------------%
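The rates asked for in Part 5 can be estimated from the timings reported above. The sketch below assumes the operation count 2n^3 − n^2 from the Analysis for both methods, although Matlab's built-in multiply may not perform exactly that many operations. The variable names are illustrative.

```matlab
% Timings (CPU seconds) copied from the results table above.
sizes    = [250 500 1000];
tmatlab  = [0.046 0.5 3.812];
tmatmult = [1.843 16.235 130.125];

% Assumed operation count for square matrices: 2n^3 - n^2.
nops = 2*sizes.^3 - sizes.^2;

% Rates in millions of floating-point operations per second.
mflops_matlab  = nops ./ tmatlab  / 1e6;   % roughly [678 500 524]
mflops_matmult = nops ./ tmatmult / 1e6;   % roughly [16.9 15.4 15.4]
```

The Matlab time for n = 250 is a very short interval, so its rate is probably the least reliable of the six.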