MATRIX COMPUTATIONS
1. DIRECT METHODS FOR SOLVING LINEAR SYSTEMS OF EQUATIONS
We shall assume that this system has a unique solution and proceed to describe the
simple "Gaussian Elimination Method" (from now on abbreviated as GEM) for finding
the solution. The method reduces the system to an upper triangular system using
elementary row operations (ERO).
Let

         | y1(1) |
         | y2(1) |
  y(1) = |  ...  |     where yi(1) = yi.
         | yn(1) |
We assume a11(1) ≠ 0.

Then by ERO applied to A(1) (that is, subtracting suitable multiples of the first
row from the remaining rows), reduce all entries below a11(1) to zero. Let the
resulting matrix be denoted by A(2):

          Ri + mi1(1) R1
  A(1) -----------------→ A(2)

where mi1(1) = - ai1(1) / a11(1) ;  i > 1.
         |   1      0  0 .... 0 |
         | m21(1)               |
  M(1) = | m31(1)     I n-1     |
         |   :                  |
         | mn1(1)               |

(In-1 being the (n-1) × (n-1) identity matrix).
i.e.
M(1) A(1) = A(2)
Let
y(2) = M(1) y(1)
i.e.

          Ri + mi1(1) R1
  y(1) -----------------→ y(2)
Then the system Ax = y is equivalent to
A(2)x = y(2)
Next we assume a22(2) ≠ 0, and set

          Ri + mi2(2) R2
  A(2) -----------------→ A(3)

where mi2(2) = - ai2(2) / a22(2) ;  i > 2.
Here

         | 1    0      0 0 .... 0 |
         | 0    1      0 0 .... 0 |
  M(2) = | 0  m32(2)              |
         | 0  m42(2)     I n-2    |
         | :    :                 |
         | 0  mn2(2)              |
and
M(2) A(2) = A(3) ; M(2) y(2) = y(3) ;
In general,

         | 1  0 .... 0                      |
         | 0  1 .... 0       0 r×(n-r)      |
         | :  :      :                      |
  M(r) = | 0  0 .... 1                      |
         | 0  0 .... mr+1,r(r)              |
         | 0  0 .... mr+2,r(r)     I n-r    |
         | :  :         :                   |
         | 0  0 .... mn,r(r)                |
                       | a11(1)  ....   ....      ....      ....  a1n(1)      |
                       |   0    a22(2)  ....      ....      ....  a2n(2)      |
                       |   :      0    arr(r)     ....      ....  arn(r)      |
  M(r) A(r) = A(r+1) = |   :      :      0   ar+1,r+1(r+1)  ....  ar+1,n(r+1) |
                       |   :      :      :        ....      ....   ....       |
                       |   0      0      0   an,r+1(r+1)    ....  ann(r+1)    |
M(r) y(r) = y(r+1)
M(n-1) M(n-2) …. M(1) A(1) = A(n) ; M(n-1) M(n-2) …. M(1) y(1) = y(n)
where

         | a11(1)  a12(1)  ....  a1n(1) |
         |   0     a22(2)  ....  a2n(2) |
  A(n) = |   :       :     ....    :    |
         |   0       0     ....  ann(n) |
which is an upper triangular matrix and the given system is equivalent to
A(n)x = y(n)
Since this is upper triangular, it can be solved by back substitution; and
hence the system can be solved easily.
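The two phases just described, elimination to triangular form and back substitution, can be put in code. The following is a minimal sketch in Python (the function name and list-of-lists matrix layout are ours, not from the text); it assumes every pivot arr(r) encountered is nonzero, exactly the condition the method requires:

```python
def gem_solve(A, y):
    """Solve Ax = y by simple Gaussian elimination (no pivoting),
    assuming every pivot arr(r) is nonzero, then back substitution."""
    n = len(A)
    A = [row[:] for row in A]   # work on copies
    y = y[:]
    for r in range(n - 1):
        for i in range(r + 1, n):
            m = -A[i][r] / A[r][r]          # multiplier m_ir(r)
            for j in range(r, n):
                A[i][j] += m * A[r][j]      # R_i <- R_i + m R_r
            y[i] += m * y[r]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):          # back substitution
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / A[i][i]
    return x
```

For instance, on the worked example below (A with rows (1,1,2), (2,-1,1), (1,2,0) and y = (4,2,3)) this returns x = (1, 1, 1).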
Note further that each M(r) is a lower triangular matrix with all diagonal entries
equal to 1. Thus det M(r) = 1 for every r. Now,

  det A(n) = det M(n-1) det M(n-2) .... det M(1) det A(1)

Now A(n) is an upper triangular matrix and hence its determinant is
a11(1) a22(2) .... ann(n). Thus det A is given by

  det A = a11(1) a22(2) .... ann(n).
Thus the simple GEM can be used to solve the system Ax = y and also to
evaluate det A provided aii( i ) ≠ 0 for each i.
Further note that M(1), M(2), ...., M(n-1) are lower triangular, and nonsingular,
as their determinants equal 1 and hence are not zero. They are therefore all
invertible and their inverses are all lower triangular; i.e. if
M = M(n-1) M(n-2) .... M(1), then M is lower triangular and nonsingular, and M-1
is also lower triangular. Now M-1 is lower triangular, and we denote it by L;
A(n) is upper triangular, and we denote it by U. Since MA = A(n), we get
A = M-1 A(n), and we thus get the so called LU decomposition

  A = LU
EXAMPLE:
x1 + x2 + 2x3 = 4
2x1 - x2 + x3 = 2
x1 + 2x2 =3
Here

      | 1   1  2 |        | 4 |
  A = | 2  -1  1 | ;  y = | 2 |
      | 1   2  0 |        | 3 |
         | 1   1  2 |  R2 - 2R1    | 1   1   2 |
  A(1) = | 2  -1  1 | ----------→  | 0  -3  -3 | = A(2)
         | 1   2  0 |  R3 - R1     | 0   1  -2 |

  a11(1) = 1 ≠ 0
  m21(1) = -2
  m31(1) = -1
  a22(2) = -3 ≠ 0
         |  1  0  0 |           | 4 |     |  4 |
  M(1) = | -2  1  0 | ;  y(1) = | 2 |  →  | -6 | = y(2) = M(1) y(1)
         | -1  0  1 |           | 3 |     | -1 |
         R3 + (1/3) R2   | 1   1   2 |
  A(2) ---------------→  | 0  -3  -3 | = A(3) ;   a33(3) = -3
                         | 0   0  -3 |

  m32(2) = 1/3
         | 1   0   0 |                        |  4 |
  M(2) = | 0   1   0 | ;  y(3) = M(2) y(2) =  | -6 |
         | 0  1/3  1 |                        | -3 |
x1 + x2 + 2x3 = 4
-3x2 - 3x3 = -6
- 3x3 = -3
Back Substitution
x3 = 1
-3x2 - 3 = - 6 ⇒ -3x2 = -3 ⇒ x2 = 1
x1 + 1 + 2 = 4 ⇒ x1 = 1
      | x1 |   | 1 |
  x = | x2 | = | 1 |
      | x3 |   | 1 |
The determinant of the given matrix A is

  det A = a11(1) a22(2) a33(3) = (1)(-3)(-3) = 9.

Now

             | 1  0  0 |
  (M(1))-1 = | 2  1  0 |
             | 1  0  1 |

             | 1   0   0 |
  (M(2))-1 = | 0   1   0 |
             | 0 -1/3  1 |
  L = (M(2) M(1))-1 = (M(1))-1 (M(2))-1

      | 1  0  0 | | 1   0   0 |   | 1   0   0 |
    = | 2  1  0 | | 0   1   0 | = | 2   1   0 |
      | 1  0  1 | | 0 -1/3  1 |   | 1 -1/3  1 |
                      | 1   1   2 |
  U = A(n) = A(3) =   | 0  -3  -3 |
                      | 0   0  -3 |
Therefore A = LU, i.e.,

      | 1   1  2 |   | 1   0   0 | | 1   1   2 |
  A = | 2  -1  1 | = | 2   1   0 | | 0  -3  -3 |
      | 1   2  0 |   | 1 -1/3  1 | | 0   0  -3 |
We observed that in order to apply simple GEM we need arr(r) ≠ 0 at each stage r.
This may not always be satisfied. So we have to modify the simple GEM in order to
overcome this situation. Further, even if the condition arr(r) ≠ 0 is satisfied
at each stage, simple GEM may not be a very accurate method to use. What do we
mean by this? Consider, as an example, the following system:
Here,

         | 0.235262 |
  y(1) = | 0.127653 | ;    a11(1) = 0.000003 ≠ 0
         | 0.285321 |

  m21(1) = - a21(1) / a11(1) = - 0.215512 / 0.000003 = -71837.3

  m31(1) = - a31(1) / a11(1) = - 0.173257 / 0.000003 = -57752.3
         |     1     0  0 |                          |  0.235262 |
  M(1) = | -71837.3  1  0 | ;  y(2) = M(1) y(1)  =   | -16900.5  |
         | -57752.3  0  1 |                          | -13586.6  |

  a22(2) = -15334.9 ≠ 0

  m32(2) = - a32(2) / a22(2) = - (-12327.8) / (-15334.9) = -0.803905

         | 1     0       0 |
  M(2) = | 0     1       0 |
         | 0  -0.803905  1 |

                        |  0.235262 |
  y(3) = M(2) y(2)  =   | -16900.5  |
                        | -0.20000  |
A(3)x = y(3)
  x3 = 0.400000
  x2 = 0.479723
  x1 = -1.33333

This compares poorly with the correct answers (to 10 digits) given by

  x1 = 0.6741214694
  x2 = 0.0532039339
  x3 = -0.9912894252
Thus we see that the simple Gaussian Elimination method needs modification in
order to handle the situations that may lead to arr( r ) = 0 for some r or situations as
arising in the above example. In order to do this we introduce the idea of Partial
Pivoting in the next section.
1.2 GAUSS ELIMINATION METHOD WITH PARTIAL PIVOTING
Note: The elementary row operations on the matrix A and the vector y can be
simultaneously carried out by introducing the "augmented matrix" Aaug, which is
obtained by appending y as an additional column at the end.
Example 1:
x1 + x2 + 2 x3 = 4
2x1 – x2 + x3 = 2
x1 + 2x2 =3
We have

         | 1   1  2  4 |
  Aaug = | 2  -1  1  2 |
         | 1   2  0  3 |
1st Stage: The pivot has to be chosen as 2 as this is the largest absolute valued
entry in the first column. Therefore we do
          R12    | 2  -1  1  2 |
  Aaug -------→  | 1   1  2  4 |
                 | 1   2  0  3 |

Therefore we have

         | 0  1  0 |                               | 2  -1  1 |
  M(1) = | 1  0  0 |   and   M(1) A(1) = A(2)  =   | 1   1  2 |
         | 0  0  1 |                               | 1   2  0 |

                        | 2 |
  M(1) y(1) = y(2)  =   | 4 |
                        | 3 |
Next we have

           R2 - (1/2) R1   | 2  -1    1    2 |
  Aaug(2) --------------→  | 0  3/2  3/2   3 |
           R3 - (1/2) R1   | 0  5/2  -1/2  2 |
Here

         |  1    0  0 |                             | 2  -1    1   |
  M(2) = | -1/2  1  0 | ;  M(2) A(2) = A(3)     =   | 0  3/2  3/2  |
         | -1/2  0  1 |                             | 0  5/2  -1/2 |

                        | 2 |
  M(2) y(2) = y(3)  =   | 3 |
                        | 2 |
Now at the next stage the pivot is 5/2, since this is the entry with the largest
absolute value in the first column of the remaining submatrix. So we have to do
another row interchange. Therefore

           R23    | 2  -1    1    2 |
  Aaug(3) -----→  | 0  5/2  -1/2  2 |
                  | 0  3/2  3/2   3 |
         | 1  0  0 |                             | 2  -1    1   |
  M(3) = | 0  0  1 | ;  M(3) A(3) = A(4)     =   | 0  5/2  -1/2 |
         | 0  1  0 |                             | 0  3/2  3/2  |

                        | 2 |
  M(3) y(3) = y(4)  =   | 2 |
                        | 3 |
Next we have

           R3 - (3/5) R2   | 2  -1    1    2  |
  Aaug(4) --------------→  | 0  5/2  -1/2  2  |
                           | 0   0   9/5  9/5 |

Here

         | 1   0    0 |                             | 2  -1    1   |
  M(4) = | 0   1    0 | ;  M(4) A(4) = A(5)     =   | 0  5/2  -1/2 |
         | 0  -3/5  1 |                             | 0   0   9/5  |

                        |  2  |
  M(4) y(4) = y(5)  =   |  2  |
                        | 9/5 |
This completes the reduction and we have that the given system is equivalent to
the system
A(5)x = y(5)
i.e.
2x1 – x2 + x3 = 2
  (5/2) x2 - (1/2) x3 = 2

  (9/5) x3 = 9/5

Back substitution gives x3 = 1;

  (5/2) x2 - 1/2 = 2  giving  (5/2) x2 = 5/2  and hence  x2 = 1;

  2x1 - 1 + 1 = 2  giving  x1 = 1.
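The stages just carried out, choosing the largest-magnitude entry in the working column as pivot, interchanging rows, and then eliminating, can be sketched in code as follows (a minimal illustration in Python; the function name and data layout are ours, not from the text):

```python
def gem_pp_solve(A, y):
    """Gaussian elimination with partial pivoting: at stage r bring the
    entry of largest absolute value in column r (rows r..n-1) to the
    pivot position by a row interchange, then eliminate below it;
    finish with back substitution."""
    n = len(A)
    A = [row[:] for row in A]
    y = y[:]
    for r in range(n - 1):
        p = max(range(r, n), key=lambda i: abs(A[i][r]))  # pivot row
        A[r], A[p] = A[p], A[r]
        y[r], y[p] = y[p], y[r]
        for i in range(r + 1, n):
            m = -A[i][r] / A[r][r]
            for j in range(r, n):
                A[i][j] += m * A[r][j]
            y[i] += m * y[r]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / A[i][i]
    return x
```

On the example above it reproduces x = (1, 1, 1); unlike simple GEM, it never divides by a tiny pivot such as 0.000003.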
Example 2:
Let us now apply the Gaussian elimination method with partial pivoting to the
system to which we had earlier applied the simple GEM and obtained solutions
which were far away from the correct solutions.
Note that

      | 0.235262 |
  y = | 0.127653 |
      | 0.285321 |
We observe that at the first stage we must choose 0.215512 as the pivot. So we
have

              R12          | 0.127653 |            | 0  1  0 |
  y(1) = y -------→ y(2) = | 0.235262 | ;  M(1) =  | 1  0  0 |
                           | 0.285321 |            | 0  0  1 |

where

  m21 = - a21/a11 = - 0.000003 / 0.215512 = -0.000014

  m31 = - a31/a11 = - 0.173257 / 0.215512 = -0.803932

         |     1      0  0 |                         | 0.127653 |
  M(2) = | -0.000014  1  0 | ;  y(3) = M(2) y(2)  =  | 0.235260 |
         | -0.803932  0  1 |                         | 0.182697 |
In the next stage we observe that we must choose 0.361282 as the pivot. Thus
we have to interchange 2nd and 3rd row. We get,
          R23           | 0.215512  0.375623  0.476625 |
  A(3) -------→ A(4) =  |    0      0.361282  0.242501 |
                        |    0      0.213467  0.332140 |

         | 1  0  0 |                         | 0.127653 |
  M(3) = | 0  0  1 | ;  y(4) = M(3) y(3)  =  | 0.182697 |
         | 0  1  0 |                         | 0.235260 |

  m32 = - 0.213467 / 0.361282 = -0.590860

         | 1     0      0 |                         | 0.127653 |
  M(4) = | 0     1      0 | ;  y(5) = M(4) y(4)  =  | 0.182697 |
         | 0  -0.59086  1 |                         | 0.127312 |
A(5) x = y(5)
which is an upper triangular system and can be solved by back substitution to get
  x3 = 0.674122
  x2 = 0.053205
  x1 = -0.991291
which compares well with the 10 decimal accurate solution given at the end of
section 1.1. Notice that we got very bad errors in the solutions while using
simple GEM, whereas we have got around this difficulty by using partial
pivoting.
1.3 DETERMINANT EVALUATION
If M(1), ....., M(k) are the elementary matrices effecting the reduction of A to
upper triangular form, then

  det M(k) det M(k-1) ..... det M(1) det A = product of the diagonal entries in
  the final upper triangular matrix.

Now det M(i) = 1 if it refers to the process of nullifying entries below the
diagonal; and det M(i) = -1 if it refers to a row interchange.

Therefore det M(k) ..... det M(1) = (-1)^m, where m is the number of row
interchanges effected in the reduction.

Therefore det A = (-1)^m × (product of the diagonal entries in the final upper
triangular matrix).
In our example 1 above, we had M(1), M(2), M(3), M(4), of which M(1) and M(3)
referred to row interchanges. Therefore there were two row interchanges and
hence

  det A = (-1)^2 (2) (5/2) (9/5) = 9.
In example 2 also we had M(1), M(3) as row interchange matrices and
therefore det A = (-1)2 (0.215512) (0.361282) (0.188856) = 0.013608
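This determinant rule, det A = (-1)^m times the product of the pivots, with m counting the row interchanges, can be sketched in code (an illustrative Python function of ours, not from the text):

```python
def det_via_elimination(A):
    """det A = (-1)^m * (product of pivots), where m counts the row
    interchanges made while reducing A to upper triangular form with
    partial pivoting."""
    n = len(A)
    A = [row[:] for row in A]
    sign = 1
    for r in range(n - 1):
        p = max(range(r, n), key=lambda i: abs(A[i][r]))
        if p != r:
            A[r], A[p] = A[p], A[r]   # row interchange flips the sign
            sign = -sign
        if A[r][r] == 0:
            return 0.0                # singular: a whole column is zero
        for i in range(r + 1, n):
            m = A[i][r] / A[r][r]
            for j in range(r, n):
                A[i][j] -= m * A[r][j]
    d = sign
    for i in range(n):
        d *= A[i][i]                  # product of diagonal entries
    return d
```

For the matrix of example 1 this returns 9, agreeing with the hand computation above.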
LU decomposition: When row interchanges are involved, the product
M(k) M(k-1) .... M(1) is not a lower triangular matrix in general, and hence
using partial pivoting we cannot get an LU decomposition of A in general.
1.4 GAUSS JORDAN METHOD
Remark:
1.5 L U DECOMPOSITIONS
If A = LU, the system Ax = y becomes

  LUx = y.

Writing

  Ux = z ..................(1)

this is

  Lz = y ..................(2)

Now (2) is a triangular system, in fact lower triangular, and hence we can solve
it by forward substitution to get z. Substituting this z in (1) we get an upper
triangular system for x, and this can be solved by back substitution.
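The forward-then-back substitution solve can be sketched as follows (a minimal Python illustration of ours, assuming the factors L and U are already available):

```python
def lu_solve(L, U, y):
    """Given A = LU, solve Ax = y via Lz = y (forward substitution)
    followed by Ux = z (back substitution)."""
    n = len(y)
    z = [0.0] * n
    for i in range(n):                 # forward substitution
        z[i] = (y[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):     # back substitution
        x[i] = (z[i] - sum(U[i][k] * x[k] for k in range(i + 1, n))) / U[i][i]
    return x
```

With the L and U computed for the example of section 1.1 (L with rows (1,0,0), (2,1,0), (1,-1/3,1); U with rows (1,1,2), (0,-3,-3), (0,0,-3)) and y = (4,2,3), this returns x = (1, 1, 1).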
Further, if A = LU is an LU decomposition, then det A can be calculated as

  det A = det L det U = l11 l22 .... lnn u11 u22 ..... unn

where lii are the diagonal entries of L, and uii are the diagonal entries of U.
I. TRIDIAGONAL MATRIX
Let

      | b1  a2  0   0   ....  0   |
      | c1  b2  a3  0   ....  0   |
      | 0   c2  b3  a4  ....  0   |
  A = | ..  ..  ..  ..  ....  ..  |
      | 0  ....  0  cn-2 bn-1 an  |
      | 0  .... ....  0  cn-1 bn  |
Let δi denote the determinant of the leading i × i principal submatrix:

       | b1  a2  0  ....   0  |
       | c1  b2  a3 ....   0  |
  δi = | ..  ..  ..  ..    .. |
       | 0  .... ci-2 bi-1 ai |
       | 0  ....  0  ci-1  bi |

We define δ0 = 1. Expanding along the last row,

  δi = bi δi-1 - ai ci-1 δi-2

i.e.

  δi / δi-1 = bi - ci-1 ai (δi-2 / δi-1).

Setting ki = δi / δi-1, this can be written as

  bi = ki + ci-1 ai / ki-1 ...............................( II )
k i −1
i.e. we need the lower triangular and upper triangular factors also to be
'tridiagonal' triangular, that is, bidiagonal.
Note that if A = (aij) then, because A is tridiagonal, aij is nonzero only when
i and j differ by at most 1; i.e. only ai-1,i , aii , ai+1,i are nonzero. In fact,

  ai-1,i = ai
  aii    = bi .................. (III)
  ai+1,i = ci

For L we take

  li+1,i = wi
  lii    = 1 .................. (IV)
  lij    = 0 if (i) j > i or (ii) j < i with i - j ≥ 2,

and for U,

  ui,i+1 = αi+1
  uii    = ui ................. (V)
Therefore,

  aij = Σ (k=1 to n) lik ukj .................. (VI)

Therefore

  ai-1,i = Σ (k=1 to n) li-1,k uki = li-1,i-1 ui-1,i = αi

Therefore

  αi = ai .................. (VII)
Next,

  aii = li,i-1 ui-1,i + lii uii

Therefore

  bi = wi-1 αi + ui .................. (VIII)

Also

  ai+1,i = Σ (k=1 to n) li+1,k uki = li+1,i uii , so

  ci = wi ui .................. (IX)

Substituting wi-1 = ci-1 / ui-1 and αi = ai in (VIII),

  bi = ui + ci-1 ai / ui-1 .................. (X)

Comparing (X) with (II) we get

  ui = ki = δi / δi-1 .................. (XI)

  wi = ci / ui = ci δi-1 / δi .................. (XII)
  αi = ai .................. (XIII)

(XI), (XII) and (XIII) completely determine the matrices L and U and hence
we get the LU decomposition.

Note: We can apply this method only when the δi are all nonzero, i.e. all the
leading principal submatrices have nonzero determinant.
Example:

Let

      |  2  -2  0   0   0 |
      | -2   1  1   0   0 |
  A = |  0  -2  5  -2   0 |
      |  0   0  9  -3   1 |
      |  0   0  0   3  -1 |

We have

  b1 = 2    b2 = 1    b3 = 5    b4 = -3    b5 = -1
  c1 = -2   c2 = -2   c3 = 9    c4 = 3
  a2 = -2   a3 = 1    a4 = -2   a5 = 1
We have

  δ0 = 1
  δ1 = 2
  δ2 = b2 δ1 - a2 c1 δ0 = 2 - 4 = -2
  δ3 = b3 δ2 - a3 c2 δ1 = (-10) - (-2)(2) = -6
  δ4 = b4 δ3 - a4 c3 δ2 = (-3)(-6) - (-18)(-2) = -18
  δ5 = b5 δ4 - a5 c4 δ3 = (-1)(-18) - (3)(-6) = 36.

Note δ1, δ2, δ3, δ4, δ5 are all nonzero, so we can apply the above method.
From (XI) we get

  u1 = δ1/δ0 = 2 ;  u2 = δ2/δ1 = -2/2 = -1 ;  u3 = δ3/δ2 = -6/-2 = 3 ;
  u4 = δ4/δ3 = -18/-6 = 3 ;  u5 = δ5/δ4 = 36/-18 = -2.

From (XII) we get

  w1 = c1/u1 = -2/2 = -1
  w2 = c2/u2 = -2/-1 = 2
  w3 = c3/u3 = 9/3 = 3
  w4 = c4/u4 = 3/3 = 1
From (XIII),

  α2 = a2 = -2 ;  α3 = a3 = 1 ;  α4 = a4 = -2 ;  α5 = a5 = 1.

Thus

      |  1  0  0  0  0 |        | 2  -2  0   0   0 |
      | -1  1  0  0  0 |        | 0  -1  1   0   0 |
  L = |  0  2  1  0  0 | ;  U = | 0   0  3  -2   0 |
      |  0  0  3  1  0 |        | 0   0  0   3   1 |
      |  0  0  0  1  1 |        | 0   0  0   0  -2 |
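The recurrences (XI)-(XIII) translate directly into code. The following is a sketch (the function name and argument conventions are ours: b holds the diagonal b1..bn, c the subdiagonal c1..c_{n-1}, and a the superdiagonal a2..an):

```python
def tridiag_lu(b, c, a):
    """LU factors of a tridiagonal matrix via the principal minors delta_i:
    u_i = delta_i / delta_{i-1}   (XI)
    w_i = c_i / u_i               (XII)
    alpha_i = a_i                 (XIII)
    Requires all delta_i nonzero."""
    n = len(b)
    delta = [1.0, float(b[0])]            # delta_0 = 1, delta_1 = b_1
    for i in range(2, n + 1):
        # delta_i = b_i delta_{i-1} - a_i c_{i-1} delta_{i-2}
        delta.append(b[i-1] * delta[i-1] - a[i-2] * c[i-2] * delta[i-2])
    u = [delta[i] / delta[i-1] for i in range(1, n + 1)]   # diagonal of U
    w = [c[i] / u[i] for i in range(n - 1)]                # subdiagonal of L
    alpha = a[:]                                           # superdiagonal of U
    return u, w, alpha
```

On the example matrix above (b = (2,1,5,-3,-1), c = (-2,-2,9,3), a = (-2,1,-2,1)) this gives u = (2,-1,3,3,-2), w = (-1,2,3,1), matching the hand computation.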
In the above method we made all the diagonal entries of L equal to 1. This
facilitates solving the triangular system Lz = y (equation (2) of section 1.5).
However, by choosing these diagonals as 1 it may be that the ui, the diagonal
entries of U, become small, thus creating problems in back substitution for the
system Ux = z (equation (1) of section 1.5). In order to avoid this situation
Wilkinson suggests that in any triangular decomposition one choose the diagonal
entries of L and U to be of the same magnitude. This can be achieved as follows:
We seek

  A = LU

where

      | l1                |        | u1  α2  0  ....   0  |
      | w1  l2            |        | 0   u2  α3 ....  ... |
  L = |     w2  l3        | ;  U = | ..  ..  ..  ..   ..  |
      |         ..  ..    |        | 0   0  .... un-1 αn  |
      |        wn-1  ln   |        | 0   0  ....  0   un  |

i.e. now

  lii = li ;  li+1,i = wi ;  uii = ui ;  ui,i+1 = αi+1.
As before,

  ai = ai-1,i = Σ (k=1 to n) li-1,k uki = li-1,i-1 ui-1,i = li-1 αi

Therefore

  ai = li-1 αi .................. (VII')

  bi = aii = Σ (k=1 to n) lik uki = li,i-1 ui-1,i + lii uii
     = wi-1 αi + li ui .................. (VIII')

  ci = ai+1,i = Σ (k=1 to n) li+1,k uki = li+1,i uii = wi ui ....... (IX')

Substituting wi-1 = ci-1 / ui-1 and αi = ai / li-1 in (VIII'),

  bi = ai ci-1 / (li-1 ui-1) + li ui

i.e.

  bi = ai ci-1 / pi-1 + pi .................. (X')
where

  pi = li ui.

Comparing (X') with (II),

  pi = ki = δi / δi-1

Therefore

  li ui = δi / δi-1.

We choose

  li = sqrt( |δi / δi-1| ) .................. (XIV)

  ui = sgn(δi / δi-1) sqrt( |δi / δi-1| ) .................. (XV)

Thus li and ui have the same magnitude. These can then be used to get wi and αi
from (VII') and (IX'). We get, finally,

  li = sqrt( |δi / δi-1| ) ;  ui = sgn(δi / δi-1) sqrt( |δi / δi-1| ) ..... (XI')

  wi = ci / ui .................. (XII')

  αi = ai / li-1 .................. (XIII')
For the matrix of the previous example,

  δ0 = 1   δ1 = 2   δ2 = -2   δ3 = -6   δ4 = -18   δ5 = 36

  b1 = 2    b2 = 1    b3 = 5    b4 = -3    b5 = -1
  c1 = -2   c2 = -2   c3 = 9    c4 = 3
  a2 = -2   a3 = 1    a4 = -2   a5 = 1

From (XI'),

  l1 = √2   u1 = √2
  l2 = 1    u2 = -1
  l3 = √3   u3 = √3
  l4 = √3   u4 = √3
  l5 = √2   u5 = -√2

From (XII'),

  w1 = c1/u1 = -2/√2 = -√2 ;   w2 = c2/u2 = -2/-1 = 2 ;
  w3 = c3/u3 = 9/√3 = 3√3 ;    w4 = c4/u4 = 3/√3 = √3.

From (XIII'),

  α2 = a2/l1 = -2/√2 = -√2 ;   α3 = a3/l2 = 1/1 = 1 ;
  α4 = a4/l3 = -2/√3 ;         α5 = a5/l4 = 1/√3.
Thus

      |  √2   0   0   0   0  |        | √2  -√2   0    0     0   |
      | -√2   1   0   0   0  |        | 0   -1    1    0     0   |
  L = |  0    2   √3  0   0  | ;  U = | 0    0    √3 -2/√3   0   |
      |  0    0  3√3  √3  0  |        | 0    0    0    √3  1/√3  |
      |  0    0   0   √3  √2 |        | 0    0    0    0   -√2   |

and A = LU, in which the L and U have corresponding diagonal elements of the
same magnitude.
1.6 DOOLITTLE’S LU DECOMPOSITION
In Doolittle's decomposition we take lii = 1 for all i. The first row of U is
then determined as follows:
  a11 = Σ (k=1 to n) l1k uk1 = u11   since l11 = 1.

Therefore

  u11 = a11

In general,

  a1j = Σ (k=1 to n) l1k ukj = u1j   since l11 = 1.

  ⇒ u1j = a1j ......... (I)

Thus the first row of U is the same as the first row of A. The first column of L
is determined as follows:

  aj1 = Σ (k=1 to n) ljk uk1 = lj1 u11   since uk1 = 0 if k > 1

  ⇒ lj1 = aj1 / u11 ......... (II)
Thus (I) and (II) determine respectively the first row of U and first column of
L. The other rows of U and columns of L are determined recursively as given
below. Suppose we have determined the first i-1 rows of U and the first i-1
columns of L. We now describe how one then determines the i-th row of U and the
i-th column of L. Since the first i-1 rows of U have been determined, the ukj
are all known for 1 ≤ k ≤ i-1 ; 1 ≤ j ≤ n. Similarly, since the first i-1
columns of L are known, the ljk are all known for 1 ≤ j ≤ n ; 1 ≤ k ≤ i-1.

Now

  aij = Σ (k=1 to n) lik ukj
      = Σ (k=1 to i) lik ukj             since lik = 0 for k > i
      = Σ (k=1 to i-1) lik ukj + lii uij
      = Σ (k=1 to i-1) lik ukj + uij     since lii = 1.

  ⇒ uij = aij - Σ (k=1 to i-1) lik ukj ..........(III)

Note that on the RHS we have aij, which is known from the given matrix. Also the
sum on the RHS involves lik for 1 ≤ k ≤ i-1, which are all known because they
involve entries in the first i-1 columns of L; and it involves ukj, 1 ≤ k ≤ i-1,
which are also known since they involve only the entries in the first i-1 rows
of U. Thus (III) determines the i-th row of U in terms of the given matrix and
quantities determined up to the previous stage. Now we describe how to get the
i-th column of L:

  aji = Σ (k=1 to n) ljk uki
      = Σ (k=1 to i) ljk uki             since uki = 0 if k > i
      = Σ (k=1 to i-1) ljk uki + lji uii

  ⇒ lji = ( aji - Σ (k=1 to i-1) ljk uki ) / uii ..........(IV)
Once again we note that the RHS involves uii, which has been determined using
(III); aji, which is from the given matrix; ljk, 1 ≤ k ≤ i-1, and hence only
entries in the first i-1 columns of L; and uki, 1 ≤ k ≤ i-1, and hence only
entries in the first i-1 rows of U. Thus the RHS in (IV) is completely known and
hence (IV) determines lji, the entries in the i-th column of L. To summarize:

  u1j = a1j ;  lj1 = aj1 / u11   give the 1st row of U and 1st column of L;

  uij = aij - Σ (k=1 to i-1) lik ukj ;   j = i, i+1, ....., n

  lji = ( aji - Σ (k=1 to i-1) ljk uki ) / uii ;   j = i, i+1, i+2, ....., n

(note that for j < i we have lji = 0).
Example:

Let

      |  2   1  -1   3 |
  A = | -2   2   6  -4 |
      |  4  14  19   4 |
      |  6   0  -6  12 |

Let us determine the Doolittle decomposition for this matrix.
First step:

1st row of U: same as 1st row of A. Therefore

  u11 = 2 ;  u12 = 1 ;  u13 = -1 ;  u14 = 3

1st column of L:

  l11 = 1
  l21 = a21/u11 = -2/2 = -1
  l31 = a31/u11 = 4/2 = 2
  l41 = a41/u11 = 6/2 = 3
Second step:

  u22 = a22 - l21 u12 = 2 - (-1)(1) = 3
  u23 = a23 - l21 u13 = 6 - (-1)(-1) = 5
  u24 = a24 - l21 u14 = -4 - (-1)(3) = -1

  l22 = 1
  l32 = (a32 - l31 u12) / u22 = [14 - (2)(1)]/3 = 4
  l42 = (a42 - l41 u12) / u22 = [0 - (3)(1)]/3 = -1
Third step:

3rd row of U: u31 = 0, u32 = 0 (because U is upper triangular), and

  u33 = a33 - l31 u13 - l32 u23 = 19 - (2)(-1) - (4)(5) = 1
  u34 = a34 - l31 u14 - l32 u24 = 4 - (2)(3) - (4)(-1) = 2

3rd column of L: l13 = 0, l23 = 0 (because L is lower triangular), and

  l33 = 1
  l43 = (a43 - l41 u13 - l42 u23) / u33 = [-6 - (3)(-1) - (-1)(5)]/1 = 2
Fourth step:

4th row of U: u41 = 0, u42 = 0, u43 = 0, and

  u44 = a44 - l41 u14 - l42 u24 - l43 u34 = 12 - (3)(3) - (-1)(-1) - (2)(2) = -2

  l44 = 1.

Thus

      |  1  0  0  0 |        | 2  1  -1   3 |
      | -1  1  0  0 |        | 0  3   5  -1 |
  L = |  2  4  1  0 | ;  U = | 0  0   1   2 | ...........(V)
      |  3 -1  2  1 |        | 0  0   0  -2 |

and

  A = LU.

This gives us the LU decomposition by Doolittle's method for the given A.
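Formulas (I)-(IV) give a direct implementation of Doolittle's method; a sketch in Python (the function name and layout are ours):

```python
def doolittle(A):
    """Doolittle factorization A = LU with l_ii = 1:
    row i of U:    u_ij = a_ij - sum_{k<i} l_ik u_kj
    column i of L: l_ji = (a_ji - sum_{k<i} l_jk u_ki) / u_ii
    Fails (division by zero) if some u_ii vanishes."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for i in range(n):
        L[i][i] = 1.0
        for j in range(i, n):        # i-th row of U, formula (III)
            U[i][j] = A[i][j] - sum(L[i][k] * U[k][j] for k in range(i))
        for j in range(i + 1, n):    # i-th column of L, formula (IV)
            L[j][i] = (A[j][i] - sum(L[j][k] * U[k][i] for k in range(i))) / U[i][i]
    return L, U
```

On the example matrix above this reproduces the L and U of (V).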
As we observed in the case of the LU decomposition of a tridiagonal matrix,
it is not advisable to choose the lii as 1, but to choose them in such a way
that the diagonal entries of L and the corresponding diagonal entries of U are
of the same magnitude. We describe this procedure as follows:

Once again the 1st row of U and 1st column of L are our first concern.

Step 1:

  a11 = l11 u11

Choosing l11 and u11 of equal magnitude with l11 u11 = a11, we get

  u1j = a1j / l11

Thus note that the u1j have been scaled now as compared to what we did earlier.
Similarly,
  lj1 = aj1 / u11
These determine the first row of U and first column of L. Suppose we have
determined the first i-1 rows of U and first i-1 columns of L. We determine the
i-th row of U and i-th column of L as follows:
  aii = Σ (k=1 to n) lik uki
      = Σ (k=1 to i) lik uki              for lik = 0 if k > i
      = Σ (k=1 to i-1) lik uki + lii uii

Therefore

  lii uii = aii - Σ (k=1 to i-1) lik uki = pi , say.

Choose

  lii = sqrt( |pi| ) ;   uii = sgn(pi) sqrt( |pi| ).

Next,

  aij = Σ (k=1 to n) lik ukj = Σ (k=1 to i) lik ukj     since lik = 0 for k > i
      = Σ (k=1 to i-1) lik ukj + lii uij

  ⇒ uij = ( aij - Σ (k=1 to i-1) lik ukj ) / lii

and

  aji = Σ (k=1 to n) ljk uki = Σ (k=1 to i) ljk uki     since uki = 0 if k > i
      = Σ (k=1 to i-1) ljk uki + lji uii

  ⇒ lji = ( aji - Σ (k=1 to i-1) ljk uki ) / uii .
For the matrix A of the previous example:

  l11 u11 = a11 = 2   ∴ l11 = √2 ;  u11 = √2

  u12 = a12/l11 = 1/√2 ;  u13 = a13/l11 = -1/√2 ;  u14 = a14/l11 = 3/√2

  l21 = a21/u11 = -2/√2 = -√2
  l31 = a31/u11 = 4/√2 = 2√2
  l41 = a41/u11 = 6/√2 = 3√2
Second step:

  l22 u22 = a22 - l21 u12 = 2 - (-√2)(1/√2) = 3

  ∴ l22 = √3 ;  u22 = √3

  u23 = (a23 - l21 u13)/l22 = [6 - (-√2)(-1/√2)]/√3 = 5/√3

  u24 = (a24 - l21 u14)/l22 = [(-4) - (-√2)(3/√2)]/√3 = -1/√3

Therefore

  u21 = 0 ;  u22 = √3 ;  u23 = 5/√3 ;  u24 = -1/√3

  l32 = (a32 - l31 u12)/u22 = [14 - (2√2)(1/√2)]/√3 = 12/√3 = 4√3

  l42 = (a42 - l41 u12)/u22 = [0 - (3√2)(1/√2)]/√3 = -3/√3 = -√3

Therefore

  l12 = 0 ;  l22 = √3 ;  l32 = 4√3 ;  l42 = -√3
Third step:

  l33 u33 = a33 - l31 u13 - l32 u23 = 19 - (2√2)(-1/√2) - (4√3)(5/√3) = 1

  ∴ l33 = 1 ;  u33 = 1

  u34 = (a34 - l31 u14 - l32 u24)/l33
      = [4 - (2√2)(3/√2) - (4√3)(-1/√3)]/1 = 2

  l43 = (a43 - l41 u13 - l42 u23)/u33
      = [-6 - (3√2)(-1/√2) - (-√3)(5/√3)]/1 = 2

Therefore

  l13 = 0 ;  l23 = 0 ;  l33 = 1 ;  l43 = 2.
Fourth step:

  l44 u44 = a44 - l41 u14 - l42 u24 - l43 u34
          = 12 - (3√2)(3/√2) - (-√3)(-1/√3) - (2)(2) = -2

  ∴ l44 = √2 ;  u44 = -√2

  ∴ u41 = 0 ;  u42 = 0 ;  u43 = 0 ;  u44 = -√2

  l14 = 0 ;  l24 = 0 ;  l34 = 0 ;  l44 = √2.
Note: Compare this with the L and U of (V) above. The difference:

(1) Replace in the earlier U the 1st diagonal 2 by √2, the 2nd diagonal 3 by √3,
the 3rd diagonal 1 by 1 and the 4th diagonal -2 by -√2. These then give the
diagonals of the U we have obtained above.

(2) Divide each entry to the right of a diagonal in the earlier U by these
replaced diagonals.
1.7 DOOLITTLE’S METHOD WITH ROW INTERCHANGES
We have seen that the Doolittle factorization of a matrix A may fail the moment
we encounter, at stage i, a uii which is zero. This occurrence corresponds to
the occurrence of a zero pivot at the i-th stage of the simple Gaussian
elimination method. Just as we avoided this problem in the Gaussian elimination
method by introducing partial pivoting, we can adopt that procedure in a
modified Doolittle's procedure. Doolittle's method, which factorizes A as LU,
is used from the point of view of reducing the system
from the point of view of reducing the system
Ax = y
to two triangular systems
Lz = y
Ux = z
as already mentioned in section 1.5.
Thus instead of actually looking for a factorization A = LU we shall be
looking for a system,
A*x = y*
and for which A* has LU decomposition.
We illustrate this by the following example: The basic idea is at each stage
calculate all the uii that one can get by the permutation of rows of the matrix and
choose that matrix which gives the maximum absolute value for uii.
As an example consider the system
Ax = y
where

      | 3   1  -2  -1 |        |  3 |
  A = | 2  -2   2   3 | ;  y = | -8 |
      | 1   5  -4  -1 |        |  3 |
      | 3   1   2   3 |        | -1 |
We keep lii = 1 for all i .
Stage 1:
1st diagonal of U. By Doolittle decomposition,
u11 = a11 = 3
If we interchange the 2nd or 3rd or 4th row with the 1st row and then find the
u11 for the new matrix, we get respectively u11 = 2 or 1 or 3. Thus interchange
of rows does not give any advantage at this stage, as we have already got 3,
without row interchange, for u11.
                    | 3   1   -2   -1  |
  U is of the form  | 0  u22  u23  u24 | ;  A and y remain unchanged.
                    | 0   0   u33  u34 |
                    | 0   0    0   u44 |
Stage 2:

  u22 = a22 - l21 u12 = -2 - (2/3)(1) = -8/3

Suppose we interchange the 2nd row with the 3rd row of A and calculate u22: our
new a22 is 5. But note that l21 and l31 get interchanged, so the new l21 is 1/3,
and the new u22 = 5 - (1/3)(1) = 14/3.

Suppose instead of the above we interchange the 2nd row with the 4th row of A:
the new a22 = 1 and the new l21 = 1, and therefore the new u22 = 1 - (1)(1) = 0.
Of these 14/3 has largest absolute value. So we prefer this. Therefore we
interchange 2nd and 3rd row.
          | 3   1  -2  -1 |            |  3 |
  New A = | 1   5  -4  -1 | ;  New y = |  3 |
          | 2  -2   2   3 |            | -8 |
          | 3   1   2   3 |            | -1 |

          | 1    0  0  0 |            | 3   1    -2  -1 |
  New L = | 1/3  1  0  0 | ;  New U = | 0  14/3   *   * |
          | 2/3  *  1  0 |            | 0   0     *   * |
          | 1    *  *  1 |            | 0   0     0   * |
Now we do the Doolittle calculation for this new matrix to get 2nd row of U and 2nd
column of L.
  u23 = a23 - l21 u13 = (-4) - (1/3)(-2) = -10/3

  u24 = a24 - l21 u14 = (-1) - (1/3)(-1) = -2/3

2nd column of L:

  l32 = [a32 - l31 u12] ÷ u22 = [(-2) - (2/3)(1)] ÷ (14/3) = -4/7

  l42 = [a42 - l41 u12] ÷ u22 = [1 - (1)(1)] ÷ (14/3) = 0
Therefore the new L has the form

  | 1     0    0  0 |
  | 1/3   1    0  0 |
  | 2/3  -4/7  1  0 |
  | 1     0    *  1 |

and the new U has the form

  | 3   1     -2     -1  |
  | 0  14/3  -10/3  -2/3 |
  | 0   0      *      *  |
  | 0   0      0      *  |
Note: We had three choices of u22, namely -8/3, 14/3, 0, before we chose 14/3.
It appears that we are doing more work than in Doolittle's method, but this is
not really so. For observe that the rejected u22 values, namely -8/3 and 0, when
divided by the chosen u22, namely 14/3, give the entries of L below the second
diagonal.
3rd Stage:

3rd diagonal of U:

  u33 = a33 - l31 u13 - l32 u23 = 2 - (2/3)(-2) - (-4/7)(-10/3) = 10/7

Suppose we interchange the 3rd row and 4th row of the new A obtained in the 2nd
stage. We get new a33 = 2. But in L also the second column gets its 3rd and 4th
rows interchanged, so the new l31 = 1 and the new l32 = 0. Therefore

  new u33 = a33 - l31 u13 - l32 u23 = 2 - (1)(-2) - (0)(-10/3) = 4.
Of these two choices of u33, 4 has the larger magnitude. So we interchange the
3rd and 4th rows of the matrix of the 2nd stage to get

          | 3   1  -2  -1 |            |  3 |
  New A = | 1   5  -4  -1 | ;  New y = |  3 |
          | 3   1   2   3 |            | -1 |
          | 2  -2   2   3 |            | -8 |

          | 1     0    0  0 |            | 3   1     -2     -1  |
  New L = | 1/3   1    0  0 | ;  New U = | 0  14/3  -10/3  -2/3 |
          | 1     0    1  0 |            | 0   0      4      *  |
          | 2/3  -4/7  *  1 |            | 0   0      0      *  |
Now for this set up we calculate the 3rd stage entries as in Doolittle’s method:
  u34 = a34 - l31 u14 - l32 u24 = 3 - (1)(-1) - (0)(-2/3) = 4

  l43 = (a43 - l41 u13 - l42 u23) ÷ u33
      = [2 - (2/3)(-2) - (-4/7)(-10/3)] ÷ 4 = (10/7) ÷ 4 = 5/14.

            | 1     0     0    0 |            | 3   1     -2     -1  |
  ∴ New L = | 1/3   1     0    0 | ;  New U = | 0  14/3  -10/3  -2/3 |
            | 1     0     1    0 |            | 0   0      4      4  |
            | 2/3  -4/7  5/14  1 |            | 0   0      0      *  |
4th Stage:

  u44 = a44 - l41 u14 - l42 u24 - l43 u34
      = 3 - (2/3)(-1) - (-4/7)(-2/3) - (5/14)(4) = 13/7.

                 | 3   1  -2  -1 |                  |  3 |
  ∴ New A = A* = | 1   5  -4  -1 | ;  New y = y* =  |  3 |
                 | 3   1   2   3 |                  | -1 |
                 | 2  -2   2   3 |                  | -8 |

New L = L*, New U = U*:

       | 1     0     0    0 |         | 3   1     -2     -1   |
  L* = | 1/3   1     0    0 | ;  U* = | 0  14/3  -10/3  -2/3  |
       | 1     0     1    0 |         | 0   0      4      4   |
       | 2/3  -4/7  5/14  1 |         | 0   0      0     13/7 |
and A* = L*U*. The system A*x = y* is now solved via the two triangular systems

  L*z = y* ;  U*x = z.
Forward substitution (L*z = y*):

  z1 = 3

  (1/3) z1 + z2 = 3  ⇒  z2 = 3 - 1 = 2

  z1 + z3 = -1  ⇒  z3 = -1 - z1 = -4

  (2/3) z1 - (4/7) z2 + (5/14) z3 + z4 = -8

  (2/3)(3) - (4/7)(2) + (5/14)(-4) + z4 = -8  ⇒  z4 = -52/7

          |   3   |
  ∴  z =  |   2   |
          |  -4   |
          | -52/7 |
Back substitution (U*x = z):

  (13/7) x4 = -52/7   therefore x4 = -4.

  4 x3 + 4 x4 = -4  ⇒  4 x3 = -4 + 16   therefore x3 = 3.

  (14/3) x2 - (10/3) x3 - (2/3) x4 = 2

  (14/3) x2 - (10/3)(3) - (2/3)(-4) = 2  ⇒  x2 = 2

  3 x1 + x2 - 2 x3 - x4 = 3  ⇒  3 x1 + 2 - 6 + 4 = 3  ⇒  x1 = 1

          |  1 |
  ∴  x =  |  2 |
          |  3 |
          | -4 |
Some Remarks:
Let A be symmetric (and positive definite), and let A = LU be an LU
decomposition of A in which L = UT, i.e. lik = uki. Then

  a11 = Σ (k=1 to n) l1k uk1 = u11²   ∴ u11 = sqrt(a11)

and

  a1i = Σ (k=1 to n) l1k uki = Σ (k=1 to n) uk1 uki = u11 u1i
  ⇒ u1i = a1i / u11 .
Having determined the first i-1 rows of U, we determine the i-th row of U as
follows:

  aii = Σ (k=1 to n) lik uki = Σ (k=1 to n) uki²    since lik = uki
      = Σ (k=1 to i) uki²                           since uki = 0 for k > i
      = Σ (k=1 to i-1) uki² + uii²

  ∴ uii² = aii - Σ (k=1 to i-1) uki²

  ∴ uii = sqrt( aii - Σ (k=1 to i-1) uki² )

(note that the uki are known for k ≤ i-1, because the first i-1 rows have
already been obtained). Now we need uij for j > i:

  aij = Σ (k=1 to n) lik ukj = Σ (k=1 to n) uki ukj
      = Σ (k=1 to i) uki ukj                        because uki = 0 for k > i
      = Σ (k=1 to i-1) uki ukj + uii uij

Therefore

  uij = ( aij - Σ (k=1 to i-1) uki ukj ) ÷ uii .

Thus

  uii = sqrt( aii - Σ (k=1 to i-1) uki² )

  uij = ( aij - Σ (k=1 to i-1) uki ukj ) ÷ uii ;  j > i

determine the i-th row of U in terms of the previous rows. Thus we get U, and L
is UT. This is called the CHOLESKY decomposition.
Example:

Let

      |  1  -1   1   1 |
      | -1   5  -3   3 |
  A = |  1  -3   3   1 |
      |  1   3   1  10 |

1st row of U:

  u11 = sqrt(a11) = 1
  u12 = a12 ÷ u11 = -1
  u13 = a13 ÷ u11 = 1
  u14 = a14 ÷ u11 = 1

2nd row of U:

  u22 = sqrt(a22 - u12²) = sqrt(5 - 1) = 2
  u23 = (a23 - u12 u13) ÷ u22 = (-3 + 1)/2 = -1
  u24 = (a24 - u12 u14) ÷ u22 = (3 + 1)/2 = 2

3rd row of U:

  u33 = sqrt(a33 - u13² - u23²) = sqrt(3 - 1 - 1) = 1
  u34 = (a34 - u13 u14 - u23 u24) ÷ u33 = (1 - 1 + 2)/1 = 2

4th row of U:

  u44 = sqrt(a44 - u14² - u24² - u34²) = sqrt(10 - 1 - 4 - 4) = 1

        | 1  -1   1  1 |               |  1   0  0  0 |
        | 0   2  -1  2 |               | -1   2  0  0 |
  ∴ U = | 0   0   1  2 |  ;  UT = L =  |  1  -1  1  0 |
        | 0   0   0  1 |               |  1   2  2  1 |

and

  A = LU = L LT = UT U.
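The row-by-row computation of the Cholesky factor U (so that A = UᵀU, L = Uᵀ) can be sketched as follows (an illustrative Python function of ours; it assumes A is symmetric positive definite, so the quantities under the square roots are positive):

```python
import math

def cholesky_upper(A):
    """Upper triangular Cholesky factor U with A = U^T U:
    u_ii = sqrt(a_ii - sum_{k<i} u_ki^2)
    u_ij = (a_ij - sum_{k<i} u_ki u_kj) / u_ii   for j > i."""
    n = len(A)
    U = [[0.0] * n for _ in range(n)]
    for i in range(n):
        U[i][i] = math.sqrt(A[i][i] - sum(U[k][i] ** 2 for k in range(i)))
        for j in range(i + 1, n):
            U[i][j] = (A[i][j] - sum(U[k][i] * U[k][j] for k in range(i))) / U[i][i]
    return U
```

On the example matrix above this reproduces the U with rows (1,-1,1,1), (0,2,-1,2), (0,0,1,2), (0,0,0,1).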
2. ITERATIVE METHODS FOR SOLVING LINEAR SYSTEMS OF EQUATIONS

Suppose we have an n×n matrix M and we want to get the solution of the system

  x = Mx + y ..........................(1)

We obtain the solution x as the limit of a sequence of vectors {x(k)}, which are
obtained as follows: we start with any initial vector x(0), and calculate x(k)
from

  x(k) = M x(k-1) + y ;  k = 1, 2, ..... ..........................(2)

We shall mention that a necessary and sufficient condition for the sequence of
vectors x(k) to converge to a solution x of (1) is that the spectral radius
||M||sp of the iterating matrix M is less than 1; convergence also follows if
||M|| < 1 for some matrix norm. (We shall introduce the notion of norm formally
in the next unit.)
We shall now consider some iterative schemes for solving systems of linear equations,
Ax = y …………….(3)
We denote by D, L, U the matrices

  D = diag( a11, a22, ....., ann ) ..................................(6)

      |  0    0   ....  ....    0 |
      | a21   0   ....  ....    0 |
  L = | a31  a32   0    ....    0 | ..................................(7)
      | ...  ...  ....  ....  ... |
      | an1  an2  .... an,n-1   0 |

      | 0  a12  a13  ....  a1n    |
      | 0   0   a23  ....  a2n    |
  U = | ..  ..   ..  ....  ..     | ..................................(8)
      | 0   0    0   .... an-1,n  |
      | 0   0    0   ....   0     |

Note that

  A = D + L + U ................................... (9)
We assume that aii ≠ 0 ; i = 1, 2, ……, n …………(10)
so that D-1 exists.
We now describe two important iterative schemes, in the next section, for solving the
system (3).
2.2 JACOBI ITERATION

We start with an initial vector

         | x1(0) |
  x(0) = | x2(0) | ......... (12)
         |  ...  |
         | xn(0) |

and substitute this vector for x in the RHS of (11) and calculate x1, x2, ...,
xn; this vector is called x(1). We now substitute this vector in the RHS of (11)
to calculate again x1, x2, ....., xn and call this new vector x(2), and continue
this procedure to calculate the sequence {x(k)}. Thus, in matrix form,
  Dx = -(L + U) x + y ..................... (13)

giving

  x = Jx + ŷ .................. (14)

where

  J = -D-1 (L + U) ;  ŷ = D-1 y ..................(15)

and we get

  x(k) = J x(k-1) + ŷ ;  k = 1, 2, .....

as the iterative scheme. This is similar to (2) in section 2.1 with the
iterating matrix M as J = -D-1 (L + U); J is called the Jacobi Iteration Matrix.
The scheme will converge to the solution x of our system if ||J||sp < 1. We
shall see an easier condition below:
We have

         | 1/a11                   |
  D-1 =  |        1/a22            |
         |               ....      |
         |                   1/ann |

and therefore

                      |    0       -a12/a11  -a13/a11  ....   -a1n/a11   |
  J = -D-1 (L + U) =  | -a21/a22      0      -a23/a22  ....   -a2n/a22   |
                      |   ....      ....       ....    ....     ....     |
                      | -an1/ann  -an2/ann     ....  -an,n-1/ann    0    |
If we let

  Ri = Σ (j≠i) |aij| / |aii|
     = ( |ai1| + |ai2| + .... + |ai,i-1| + |ai,i+1| + .... + |ain| ) / |aii|

then

  ||J||∞ = max{ R1, ....., Rn }

and ||J||∞ < 1 provided, for each i,
  |ai1| + |ai2| + ..... + |ai,i-1| + |ai,i+1| + ..... + |ain| < |aii| ;

i.e. in each row of A the sum of the absolute values of the non-diagonal entries
is dominated by the absolute value of the diagonal entry (in which case A is
called 'strictly row diagonally dominant'). Thus the Jacobi iteration scheme
for the system (3) converges if A is strictly row diagonally dominant. (Of
course, this condition may not be satisfied and the Jacobi iteration scheme may
still converge, provided ||J||sp < 1.)
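This sufficient condition is easy to test programmatically; a small sketch (function name ours):

```python
def strictly_row_diagonally_dominant(A):
    """True when, in every row, |a_ii| strictly exceeds the sum of the
    absolute values of the off-diagonal entries, the sufficient
    condition above for convergence of the Jacobi scheme."""
    return all(
        abs(row[i]) > sum(abs(v) for j, v in enumerate(row) if j != i)
        for i, row in enumerate(A)
    )
```

For the matrix A of example 1 below this returns False (and indeed the condition is only sufficient, not necessary, for convergence); for a hypothetical matrix with rows (8, 2, -2), (1, -8, 3), (2, 1, 9) it returns True.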
Example 1:
x1 + 2x2 – 2x3 = 1
x1 + x2 + x3 = 0 ………….(I)
2x1 + 2x2 + x3 = 0
Let us apply the Jacobi iteration scheme with the initial vector

         | 0 |
  x(0) = | 0 | ......... (II)
         | 0 |

We have

      | 1  2  -2 |        | 1  0  0 |           | 0  2  -2 |        | 1 |
  A = | 1  1   1 | ;  D = | 0  1  0 | ;  L+U =  | 1  0   1 | ;  y = | 0 |
      | 2  2   1 |        | 0  0  1 |           | 2  2   0 |        | 0 |

                       |  0  -2   2 |                    | 1 |
  J = -D-1 (L + U) =   | -1   0  -1 | ;   ŷ = D-1 y  =   | 0 |
                       | -2  -2   0 |                    | 0 |
  x(k) = J x(k-1) + ŷ ;  k = 1, 2, .....

                              | 1 |
  ∴ x(1) = J x(0) + ŷ = ŷ =   | 0 |    since x(0) is the zero vector.
                              | 0 |

                        |  0  -2   2 | | 1 |   | 1 |   |  1 |
  x(2) = J x(1) + ŷ =   | -1   0  -1 | | 0 | + | 0 | = | -1 |
                        | -2  -2   0 | | 0 |   | 0 |   | -2 |

                        |  0  -2   2 | |  1 |   | 1 |   | -1 |
  x(3) = J x(2) + ŷ =   | -1   0  -1 | | -1 | + | 0 | = |  1 |
                        | -2  -2   0 | | -2 |   | 0 |   |  0 |

                        |  0  -2   2 | | -1 |   | 1 |   | -1 |
  x(4) = J x(3) + ŷ =   | -1   0  -1 | |  1 | + | 0 | = |  1 | = x(3)
                        | -2  -2   0 | |  0 |   | 0 |   |  0 |
∴ The solution is

                        | -1 |
  x = lim x(k) = x(3) = |  1 |
      k→∞               |  0 |
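The componentwise form of the Jacobi sweep is straightforward to code; a sketch (names and stopping rule are ours: stop when successive iterates agree to within tol, or after max_iter sweeps):

```python
def jacobi(A, y, x0, tol=3e-5, max_iter=100):
    """Jacobi scheme x(k) = J x(k-1) + y_hat, written componentwise:
    every new component is computed from the PREVIOUS iterate only."""
    n = len(A)
    x = x0[:]
    for _ in range(max_iter):
        xn = [(y[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
              for i in range(n)]
        if max(abs(xn[i] - x[i]) for i in range(n)) <= tol:
            return xn
        x = xn
    return x
```

On example 1 (A with rows (1,2,-2), (1,1,1), (2,2,1), y = (1,0,0), x(0) = 0) this reproduces the exact solution (-1, 1, 0) found above in three iterations.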
Example 2:

We have

      | 8   0  0 |           | 1/8    0     0  |
  D = | 0  -8  0 | ;  D-1 =  |  0   -1/8    0  |
      | 0   0  9 |           |  0     0    1/9 |

                       |    0       -0.25    0.25 |
  J = -D-1 (L + U) =   |  0.125       0     0.375 |
                       | -0.22222  -0.11111   0   |

                 |    1     |
  ŷ = D-1 y  =   | -2.375   |
                 |  3.33333 |
For the third row,

  |a33| = 9  and  |a31| + |a32| = 2 + 1 = 3 ,  ∴ |a33| > |a31| + |a32| ,

and similarly for the other rows. Thus we have a strictly row diagonally
dominant matrix A. Hence the Jacobi iteration scheme will converge. The scheme
is:
         | 0 |
  x(0) = | 0 |
         | 0 |

                          |    0       -0.25    0.25 |
  x(k) = J x(k-1) + ŷ =   |  0.125       0     0.375 | x(k-1) + ŷ
                          | -0.22222  -0.11111   0   |

                |    1     |
  x(1) = ŷ  =   | -2.375   |
                |  3.33333 |
We continue the iteration until the components of x(k) and x(k+1) differ by at
most, say, ε = 3×10^-5, that is, until ||x(k+1) - x(k)||∞ ≤ 3×10^-5. We get
||x(1) - x(0)||∞ = 3.33333, so we continue, computing x(k) = J x(k-1) + ŷ each
time:

  x(2)  = ( 2.42708, -1.00000, 3.37500 ) ;  ||x(2) - x(1)||∞   = 1.42708 ≥ ε
  x(3)  = ( 2.09375, -0.80599, 2.90509 ) ;  ||x(3) - x(2)||∞   = 0.46991 ≥ ε
  x(4)  = ( 1.92777, -1.02387, 2.95761 ) ;  ||x(4) - x(3)||∞   = 0.21788 ≥ ε
  x(5)  = ( 1.99537, -1.02492, 3.01870 ) ;  ||x(5) - x(4)||∞   = 0.06760 ≥ ε
  x(6)  = ( 2.01091, -0.99356, 3.00380 ) ;  ||x(6) - x(5)||∞   = 0.03136 ≥ ε
  x(7)  = ( 1.99934, -0.99721, 2.99686 ) ;  ||x(7) - x(6)||∞   = 0.01157 ≥ ε
  x(8)  = ( 1.99852, -1.00126, 2.99984 ) ;  ||x(8) - x(7)||∞   = 0.00405 ≥ ε
  x(9)  = ( 2.00027, -1.00025, 3.00047 ) ;  ||x(9) - x(8)||∞   = 0.00176 ≥ ε
  x(10) = ( 2.00018, -0.99979, 2.99997 ) ;  ||x(10) - x(9)||∞  = 0.00050 ≥ ε
  x(11) = ( 1.99994, -0.99999, 2.99994 ) ;  ||x(11) - x(10)||∞ = 0.00024 ≥ ε
  x(12) = ( 1.99998, -1.00003, 3.00001 ) ;  ||x(12) - x(11)||∞ = 0.00008 ≥ ε
  x(13) = ( 2.00001, -1.00000, 3.00001 ) ;  ||x(13) - x(12)||∞ = 0.00003 = ε

and we stop at x(13).
64
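The iteration just carried out is easy to sketch in code. The following is a minimal illustration (not from the text) using the J and ŷ computed above for Example 2, with the same stopping rule ‖x^(k) - x^(k-1)‖∞ ≤ 3x10^(-5):

```python
# Jacobi iteration x^(k) = J x^(k-1) + y_hat for Example 2 above.
# J and y_hat are the values computed in the text; the stopping rule is
# ||x^(k) - x^(k-1)||_inf <= 3e-5, as in the text.

J = [[0.0, -0.25, 0.25],
     [0.125, 0.0, 0.375],
     [-0.22222, -0.11111, 0.0]]
y_hat = [1.0, -2.375, 3.33333]

def jacobi_step(x):
    """One sweep: returns J x + y_hat."""
    return [sum(J[i][j] * x[j] for j in range(3)) + y_hat[i] for i in range(3)]

def jacobi(tol=3e-5, max_iter=100):
    x = [0.0, 0.0, 0.0]              # x^(0) = zero vector
    for k in range(1, max_iter + 1):
        x_new = jacobi_step(x)
        if max(abs(a - b) for a, b in zip(x_new, x)) <= tol:
            return x_new, k
        x = x_new
    return x, max_iter

x, k = jacobi()
```

As the table above suggests, the iterates settle near ( 2, -1, 3 )^T after roughly a dozen sweeps.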
2.3 GAUSS – SEIDEL METHOD
Ax = y …………….. (I)
In the Jacobi scheme we used the values x2^(k), x3^(k), ....., xn^(k) obtained in the kth iteration in place of x2, x3, ....., xn in the first equation,
a11 x1 + a12 x2 + ..... + a1n xn = y1

a11 x1^(k+1) = - a12 x2^(k) - a13 x3^(k) - ..... - a1n xn^(k) + y1

aii xi^(k+1) = - ai1 x1^(k) - ai2 x2^(k) - ..... - a(i,i-1) x(i-1)^(k) - a(i,i+1) x(i+1)^(k) - ..... - ain xn^(k) + yi    ..... (*)
What Gauss - Seidel suggests is that, having obtained x1^(k+1) from the first equation, we use this value for x1 in the second equation to calculate x2^(k+1) from

a22 x2^(k+1) = - a21 x1^(k+1) - a23 x3^(k) - ..... - a2n xn^(k) + y2

and use these values x1^(k+1), x2^(k+1) in the 3rd equation to calculate x3^(k+1), and so on. Thus in the equation (*) we use x1^(k+1), ....., x(i-1)^(k+1) in place of x1^(k), x2^(k), ....., x(i-1)^(k) to get the following modification of the i-th equation to calculate xi^(k+1):
aii xi^(k+1) = - ai1 x1^(k+1) - ai2 x2^(k+1) - ..... - a(i,i-1) x(i-1)^(k+1) - a(i,i+1) x(i+1)^(k) - a(i,i+2) x(i+2)^(k) - ..... - ain xn^(k) + yi
In matrix notation we can write this as,

Dx^(k+1) = - Lx^(k+1) - Ux^(k) + y

Thus we get the Gauss - Seidel iteration scheme as,

x^(k+1) = Gx^(k) + ŷ    ........(II)

where,

G = -(D + L)^(-1) U
ŷ = (D + L)^(-1) y
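The component-wise sweep described above can be sketched as follows. This is a minimal illustration (not the text's own program); the small 2x2 test system at the end is an illustrative assumption, chosen strictly diagonally dominant so the sweep converges.

```python
# A minimal component-wise Gauss-Seidel sweep: each new component
# x_i^(k+1) immediately uses the already-updated components
# x_1^(k+1), ..., x_{i-1}^(k+1), exactly as in the modified equation above.

def gauss_seidel(A, y, tol=1e-10, max_iter=500):
    n = len(A)
    x = [0.0] * n                        # initial guess x^(0) = 0
    for _ in range(max_iter):
        diff = 0.0
        for i in range(n):
            # sum over j != i uses x[j]: already new for j < i, old for j > i
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x_new = (y[i] - s) / A[i][i]
            diff = max(diff, abs(x_new - x[i]))
            x[i] = x_new
        if diff <= tol:
            break
    return x

# Illustrative diagonally dominant system with exact solution (1, 1).
x = gauss_seidel([[4.0, 1.0], [1.0, 3.0]], [5.0, 4.0])
```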
Example 3:
x1 + 2x2 – 2x3 = 1
x1 + x2 + x3 = 0 ,
2x1 + 2x2 + x3 = 0
considered in example 1 on page 59; and for which the Jacobi scheme gave the
exact solution in the 3rd iteration. (see page 60). We shall now try to apply the
Gauss – Seidel scheme for this system. We have,
A = [ 1  2  -2 ]        [ 1 ]
    [ 1  1   1 ] ;  y = [ 0 ]
    [ 2  2   1 ]        [ 0 ]

D + L = [ 1  0  0 ]          [ 0  -2   2 ]
        [ 1  1  0 ] ;  -U =  [ 0   0  -1 ]
        [ 2  2  1 ]          [ 0   0   0 ]
(D + L)^(-1) = [  1   0  0 ]
               [ -1   1  0 ]
               [  0  -2  1 ]

Thus,

G = -(D + L)^(-1) U = [  1   0  0 ][ 0  -2   2 ]   [ 0  -2   2 ]
                      [ -1   1  0 ][ 0   0  -1 ] = [ 0   2  -3 ]
                      [  0  -2  1 ][ 0   0   0 ]   [ 0   0   2 ]

‖G‖sp = 2 > 1
Hence the Gauss - Seidel scheme for this system will not converge. Thus for this system the Jacobi scheme converges so rapidly that it gives the exact solution in the third iteration itself, whereas the Gauss - Seidel scheme does not converge at all.
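The contrast can be checked concretely in code. The observation below goes slightly beyond the text, but follows from it: the Jacobi matrix J of Example 1 is nilpotent (J³ = 0, which is why x^(3) was already exact), while the Gauss - Seidel matrix G computed above is upper triangular, so its spectral radius is simply its largest diagonal entry in absolute value, namely 2.

```python
# Jacobi matrix J (Example 1) is nilpotent: J^3 = 0, so the Jacobi
# iterates are exact after three sweeps.  The Gauss-Seidel matrix G is
# upper triangular with max |g_ii| = 2 > 1, so Gauss-Seidel diverges.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

J = [[0, -2, 2], [-1, 0, -1], [-2, -2, 0]]   # Jacobi matrix from Example 1
G = [[0, -2, 2], [0, 2, -3], [0, 0, 2]]      # Gauss-Seidel matrix from above

J3 = matmul(matmul(J, J), J)                 # should be the zero matrix
sp_G = max(abs(G[i][i]) for i in range(3))   # valid since G is triangular
```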
Example 4:

x1 - (1/2) x2 - (1/2) x3 = 1
x1 + x2 + x3 = 0
-(1/2) x1 - (1/2) x2 + x3 = 0

A = [  1    -1/2  -1/2 ]        [ 1 ]
    [  1     1     1   ] ;  y = [ 0 ]
    [ -1/2  -1/2   1   ]        [ 0 ]
D + L = [  1     0    0 ]                   [  1    0   0 ]
        [  1     1    0 ] ;  (D + L)^(-1) = [ -1    1   0 ] ,
        [ -1/2  -1/2  1 ]                   [  0   1/2  1 ]

-U = [ 0  1/2  1/2 ]
     [ 0   0   -1  ] .
     [ 0   0    0  ]

Thus,

G = -(D + L)^(-1) U = [  1    0   0 ][ 0  1/2  1/2 ]
                      [ -1    1   0 ][ 0   0   -1  ]
                      [  0   1/2  1 ][ 0   0    0  ]

∴ G = [ 0   1/2   1/2 ]
      [ 0  -1/2  -3/2 ]    ............(*)
      [ 0   0    -1/2 ]
x^(k+1) = Gx^(k) + ŷ

x^(0) = ( 0, 0, 0 )^T

where

ŷ = (D + L)^(-1) y = [  1    0   0 ][ 1 ]   [  1 ]
                     [ -1    1   0 ][ 0 ] = [ -1 ] ;
                     [  0   1/2  1 ][ 0 ]   [  0 ]

and where G is given by (*).

Notice that G is upper triangular and hence we readily get the eigenvalues of G as its diagonal entries. Thus the eigenvalues of G are λ1 = 0, λ2 = -1/2, λ3 = -1/2. Hence ‖G‖sp = 1/2 < 1. Hence in this example the Gauss - Seidel scheme will converge.
Let us now carry out a few steps of the Gauss – Seidel iteration, since we
have now been assured of convergence. (We shall first do some exact
calculations).
x^(1) = Gx^(0) + ŷ = G ( 0, 0, 0 )^T + ( 1, -1, 0 )^T = ( 1, -1, 0 )^T

x^(2) = Gx^(1) + ŷ = [ 0   1/2   1/2 ][  1 ]   [  1 ]   [ 1 - 1/2    ]
                     [ 0  -1/2  -3/2 ][ -1 ] + [ -1 ] = [ -(1 - 1/2) ]
                     [ 0   0    -1/2 ][  0 ]   [  0 ]   [ 0          ]

x^(3) = Gx^(2) + ŷ = ( 1 - 1/2 + 1/2², -(1 - 1/2 + 1/2²), 0 )^T

and in general

x^(k) = ( 1 - 1/2 + 1/2² - ..... + (-1)^(k-1)/2^(k-1),
          -(1 - 1/2 + 1/2² - ..... + (-1)^(k-1)/2^(k-1)),
          0 )^T

Clearly,

x^(k) → ( 1 - 1/2 + 1/2² - 1/2³ + ..... , -(1 - 1/2 + 1/2² - .....), 0 )^T

i.e.

x^(k) → ( 2/3, -2/3, 0 )^T
Of course, here ‘we’ knew ‘a priori’ that the sequence is going to sum up
neatly for each component and so we did exact calculation. If we had not noticed
this we still would have carried out the computations as follows:
x^(1) = Gx^(0) + ŷ = ( 1, -1, 0 )^T as before
x^(2) = ( 0.5, -0.5, 0 )^T
x^(3) = ( 0.625, -0.625, 0 )^T
x^(4) = ( 0.6875, -0.6875, 0 )^T
x^(5) = ( 0.65625, -0.65625, 0 )^T ;    ‖x^(5) - x^(4)‖∞ = 0.03125
x^(6) = ( 0.671875, -0.671875, 0 )^T ;  ‖x^(6) - x^(5)‖∞ = 0.015625
x^(7) = ( 0.664062, -0.664062, 0 )^T ;  ‖x^(7) - x^(6)‖∞ = 0.007813
x^(8) = ( 0.667969, -0.667969, 0 )^T ;  ‖x^(8) - x^(7)‖∞ = 0.003907
x^(9) = ( 0.666016, -0.666016, 0 )^T ;  ‖x^(9) - x^(8)‖∞ = 0.001953
x^(10) = ( 0.666504, -0.666504, 0 )^T ; ‖x^(10) - x^(9)‖∞ = 0.000488
Since the error is now < 10^(-3) we may stop here and take x^(10) as our solution for the system. Or we may improve our accuracy by doing more iterations, to get,

x^(14) = ( 0.666656, -0.666656, 0 )^T ;  ‖x^(14) - x^(13)‖∞ = 0.000031 < 10^(-4)

and hence we can take x^(14) as our solution within error 10^(-4).
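The closed form above can be checked in exact rational arithmetic. This sketch iterates x^(k+1) = Gx^(k) + ŷ with the G and ŷ computed for Example 4, and reproduces the partial sums of the geometric series 1 - 1/2 + 1/4 - ..... → 2/3:

```python
# Exact Gauss-Seidel iterates for Example 4 using rational arithmetic:
# after k sweeps, x1 equals the k-term partial sum of 1 - 1/2 + 1/4 - ...,
# x2 = -x1, and x3 stays 0, as derived in the text.

from fractions import Fraction as F

G = [[F(0), F(1, 2), F(1, 2)],
     [F(0), F(-1, 2), F(-3, 2)],
     [F(0), F(0), F(-1, 2)]]
y_hat = [F(1), F(-1), F(0)]

x = [F(0), F(0), F(0)]
for _ in range(10):
    x = [sum(G[i][j] * x[j] for j in range(3)) + y_hat[i] for i in range(3)]
```

After 10 sweeps x1 is the 10-term partial sum (1 - (-1/2)^10) · (2/3) = 341/512, already within 1/500 of the limit 2/3.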
Let us now try to apply the Jacobi scheme for this system. We have

A = [  1    -1/2  -1/2 ]
    [  1     1     1   ] ;  and therefore,
    [ -1/2  -1/2   1   ]

J = [  0    1/2   1/2 ]
    [ -1    0    -1   ]
    [ 1/2  1/2    0   ]
            | λ    -1/2  -1/2 |
|λI - J| =  | 1     λ     1   | = λ³ + (3/4)λ + 1/2 = (λ + 1/2)(λ² - (1/2)λ + 1)
            | -1/2 -1/2   λ   |

λ1 = -1/2 ;  λ2 = 1/4 + (√15/4) i ;  λ3 = 1/4 - (√15/4) i

∴ |λ1| = 1/2 ;  |λ2| = |λ3| = √(1/16 + 15/16) = 1

∴ ‖J‖sp = 1, which is not < 1. Thus the Jacobi scheme for this system will not converge.
Thus, in example 3 we had a system for which the Jacobi scheme converged
but Gauss – Seidel scheme did not converge; whereas in example 4 above we
have a system for which the Jacobi scheme does not converge, but the Gauss –
Seidel scheme converges. Thus, these two examples demonstrate that, in
general, it is not ‘correct’ to say that one scheme is better than the other.
Example 5:
2x1 – x2 =y1
-x1 + 2x2 – x3 = y2
-x2 + 2x3 –x4 =y3
-x3 + 2x4 = y4
Here

A = [  2  -1   0   0 ]          [  0   1/2   0    0  ]
    [ -1   2  -1   0 ]          [ 1/2   0   1/2   0  ]
    [  0  -1   2  -1 ] ,    J = [  0   1/2   0   1/2 ]
    [  0   0  -1   2 ]          [  0    0   1/2   0  ]
The characteristic equation of J is

16λ⁴ - 12λ² + 1 = 0    ..................(CJ)

Set λ² = α. Therefore 16α² - 12α + 1 = 0, giving α = (12 ± √80)/32 = 0.6545 or 0.0955, and hence λ = ±0.8090 or ±0.3090.

Hence ‖J‖sp = 0.8090 < 1 ; and the Jacobi scheme will converge.
(D + L) = [  2   0   0   0 ]          [ 0  1  0  0 ]
          [ -1   2   0   0 ]          [ 0  0  1  0 ]
          [  0  -1   2   0 ] ;  -U =  [ 0  0  0  1 ]
          [  0   0  -1   2 ]          [ 0  0  0  0 ]

(D + L)^(-1) = [ 1/2    0    0    0  ]
               [ 1/4   1/2   0    0  ]
               [ 1/8   1/4  1/2   0  ]
               [ 1/16  1/8  1/4  1/2 ]

G = -(D + L)^(-1) U = [ 1/2    0    0    0  ][ 0  1  0  0 ]   [ 0  1/2    0    0  ]
                      [ 1/4   1/2   0    0  ][ 0  0  1  0 ]   [ 0  1/4   1/2   0  ]
                      [ 1/8   1/4  1/2   0  ][ 0  0  0  1 ] = [ 0  1/8   1/4  1/2 ]
                      [ 1/16  1/8  1/4  1/2 ][ 0  0  0  0 ]   [ 0  1/16  1/8  1/4 ]
The characteristic equation of G is

λ² (16λ² - 12λ + 1) = 0

∴ λ = 0 (twice), together with the roots of 16λ² - 12λ + 1 = 0, namely 0.0955 and 0.6545.

Thus,

‖G‖sp = 0.6545 < 1

Note that

‖G‖sp = ‖J‖sp² ;  ‖G‖sp < ‖J‖sp

Thus the Gauss - Seidel scheme converges faster than the Jacobi scheme.
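The quoted spectral radii are easy to verify, since both characteristic equations reduce to the same quadratic 16a² - 12a + 1 = 0 (with a = λ² for the Jacobi matrix, and a = λ for the nonzero Gauss - Seidel eigenvalues). A quick numeric sketch:

```python
# Both spectral radii above come from the quadratic 16a^2 - 12a + 1 = 0:
# for Jacobi, lambda^2 = a; for Gauss-Seidel, lambda = a (plus a double
# zero eigenvalue).  Hence rho(G) = rho(J)^2.

import math

def roots_16_12_1():
    d = math.sqrt(12 * 12 - 4 * 16 * 1)      # discriminant of 16a^2 - 12a + 1
    return ((12 - d) / 32, (12 + d) / 32)

a_small, a_big = roots_16_12_1()
rho_jacobi = math.sqrt(a_big)       # largest |lambda| with lambda^2 = a_big
rho_gs = a_big                      # largest Gauss-Seidel eigenvalue
```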
2.4 SUCCESSIVE OVERRELAXATION (SOR) METHOD
Ax = y ………..(I)
We take a scalar parameter ω ≠ 0 and multiply both sides of (I) to get an equivalent
system,
ωAx = ωy ………………(II)
Writing A = D + L + U, (II) becomes

ω(D + L + U) x = ωy

i.e.    ωDx + ωLx + ωUx = ωy

i.e.    (D + ωL) x = Dx - ωDx - ωUx + ωy

i.e.    (D + ωL) x = -[(ω - 1) D + ωU] x + ωy

i.e.    x = -(D + ωL)^(-1) [(ω - 1) D + ωU] x + ω(D + ωL)^(-1) y.

This suggests the iteration scheme,

x^(k+1) = M_ω x^(k) + ŷ    ..............(III)

x^(0) = zero vector ; initial guess

where,

M_ω = -(D + ωL)^(-1) [(ω - 1) D + ωU]

and

ŷ = ω(D + ωL)^(-1) y
Notice that if ω = 1 we get the Gauss – Seidel scheme. The strategy is to choose ω such
that M ω sp is < 1, and is as small as possible so that the scheme converges as rapidly as
possible. This is easier said than achieved. How does one choose ω? It can be shown
that convergence cannot be achieved if ω ≥ 2. (We assume ω > 0). ‘Usually’ ω is chosen
between 1 and 2. Of course, one must analyse M ω sp as a function of ω and find that
value ω0 of ω for which this is minimum and work with this value of ω0.
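In component form the SOR sweep mixes the Gauss - Seidel update with the old value using the relaxation factor ω. The sketch below (not the text's own program) runs both ω = 1 (plain Gauss - Seidel) and the near-optimal ω = 1.2596 quoted later, on the tridiagonal matrix of Example 5; the right-hand side is an illustrative assumption chosen so the exact solution is ( 1, 1, 1, 1 )^T.

```python
# SOR in component form, equivalent to the matrix scheme (III):
# x_i <- (1 - omega) * x_i_old + omega * (Gauss-Seidel value).
# omega = 1 recovers Gauss-Seidel exactly.

def sor(A, y, omega, tol=1e-10, max_iter=1000):
    n = len(A)
    x = [0.0] * n
    for k in range(1, max_iter + 1):
        diff = 0.0
        for i in range(n):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            gs = (y[i] - s) / A[i][i]          # plain Gauss-Seidel value
            x_new = (1 - omega) * x[i] + omega * gs
            diff = max(diff, abs(x_new - x[i]))
            x[i] = x_new
        if diff <= tol:
            return x, k
    return x, max_iter

# Tridiagonal matrix of Example 5; y chosen so the solution is (1,1,1,1).
A = [[2, -1, 0, 0], [-1, 2, -1, 0], [0, -1, 2, -1], [0, 0, -1, 2]]
y = [1, 0, 0, 1]
x_gs, k_gs = sor(A, y, 1.0)       # omega = 1: Gauss-Seidel
x_sor, k_sor = sor(A, y, 1.2596)  # near-optimal omega from the text
```

With the spectral radii 0.6545 and 0.2596 quoted in the text, the relaxed sweep should need noticeably fewer iterations than Gauss - Seidel.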
Example 6:

For the matrix A of Example 5 one computes

M_ω = [ 1-ω                    (1/2)ω                        0                          0      ]
      [ (1/2)ω - (1/2)ω²       1 - ω + (1/4)ω²               (1/2)ω                     0      ]
      [ (1/4)ω² - (1/4)ω³      (1/2)ω - (1/2)ω² + (1/8)ω³    1 - ω + (1/4)ω²            (1/2)ω ]
      [ (1/8)ω³ - (1/8)ω⁴      (1/4)ω² - (1/4)ω³ + (1/16)ω⁴  (1/2)ω - (1/2)ω² + (1/8)ω³ 1 - ω + (1/4)ω² ]
Thus the eigenvalues of M_ω are the roots of the equation

16 [ (ω - 1 + λ)² / (ω²λ) ]² - 12 [ (ω - 1 + λ)² / (ω²λ) ] + 1 = 0    .........(CM_ω)

(Note that λ = 0 can be a root only when ω = 1, the Gauss - Seidel case, since det M_ω = (ω - 1)⁴.)

Setting

µ² = (ω - 1 + λ)² / (ω²λ)

we get

16µ⁴ - 12µ² + 1 = 0

µ = ± 0.3090 ; ± 0.8090.

Now

(ω - 1 + λ)² / (ω²λ) = µ² = 0.0955 or 0.6545    ..........(*)

This is a quadratic in λ; solving,

λ = (1/2)µ²ω² - (ω - 1) ± µω [ (1/4)µ²ω² - (ω - 1) ]^(1/2)

Thus ‖M_ω‖sp when ω = 1.2 is 0.4545.
We can show that in this example when ω = ω0 = 1.2596, the spectral radius ‖M_{ω0}‖sp is smaller than ‖M_ω‖sp for any other ω. We have

‖M_{1.2596}‖sp = 0.2596

Thus the SOR scheme with ω = 1.2596 will be the method which converges fastest.

Note:

We had ‖M_{1.2}‖sp = 0.4545 and ‖M_{1.2596}‖sp = 0.2596. Thus a small change in the value of ω brings about a significant change in the spectral radius ‖M_ω‖sp.
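The behaviour of ‖M_ω‖sp as a function of ω can be sketched from the closed-form eigenvalue formula above, using the dominant value µ² = 0.6545. One standard fact is used here beyond the text: when the square-root term becomes complex, the two eigenvalues are a conjugate pair whose product is (ω - 1)², so |λ| = ω - 1.

```python
# Spectral radius of M_omega as a function of omega, from the eigenvalue
# formula lambda = mu^2 w^2/2 - (w-1) +/- mu*w*sqrt(mu^2 w^2/4 - (w-1)),
# with mu^2 = 0.6545.  A coarse scan locates the minimizing omega.

import math

MU2 = 0.6545                       # largest root of 16 m^2 - 12 m + 1 = 0

def rho(omega):
    disc = 0.25 * MU2 * omega ** 2 - (omega - 1)
    if disc >= 0:
        return (0.5 * MU2 * omega ** 2 - (omega - 1)
                + math.sqrt(MU2) * omega * math.sqrt(disc))
    return omega - 1               # complex pair: |lambda| = omega - 1

omegas = [1 + i / 10000 for i in range(10000)]
w_best = min(omegas, key=rho)
```

The scan reproduces the values quoted above: ρ(1) = 0.6545 (Gauss - Seidel), ρ(1.2) ≈ 0.4545, and the minimum ≈ 0.2596 near ω = 1.2596.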
3. REVIEW OF PROPERTIES OF EIGENVALUES AND EIGENVECTORS
Ax = αx
Example:
Let

A = [ -9   4  4 ]
    [ -8   3  4 ]
    [ -16  8  7 ]

and α = -1. Consider x = ( 1, 2, 0 )^T. We have

Ax = [ -9   4  4 ][ 1 ]   [ -1 ]        [ 1 ]
     [ -8   3  4 ][ 2 ] = [ -2 ] = (-1) [ 2 ] = (-1) x = αx
     [ -16  8  7 ][ 0 ]   [  0 ]        [ 0 ]

Hence α = -1 is such that there exists a nonzero vector x such that Ax = αx. Thus α is an eigenvalue of A.

Similarly, if we take α = 3, x = ( 1, 1, 2 )^T we find that Ax = αx. Thus, α = 3 is also an eigenvalue of A.
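Both eigenpair claims are a one-line matrix-vector check:

```python
# Verify the two eigenpairs claimed above: Ax = -x and Ax = 3x.

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

A = [[-9, 4, 4], [-8, 3, 4], [-16, 8, 7]]

assert matvec(A, [1, 2, 0]) == [-1, -2, 0]     # = (-1) * (1, 2, 0)
assert matvec(A, [1, 1, 2]) == [3, 3, 6]       # = 3 * (1, 1, 2)
```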
Wα = { x ∈ Cⁿ : Ax = αx }

Then we have the following properties of Wα :

Example: Consider the A in the example on page 81. We have seen that α = -1 is an eigenvalue of A. What is W(-1), the eigensubspace corresponding to -1?

M = A + I = [ -8   4  4 ]
            [ -8   4  4 ]
            [ -16  8  8 ]

We now can use our row reduction to find the general solution of the system.

M  --R2 - R1, R3 - 2R1, -(1/8)R1-->  [ 1  -1/2  -1/2 ]
                                     [ 0   0     0   ]
                                     [ 0   0     0   ]
Thus, x1 = (1/2) x2 + (1/2) x3, and the general solution is

x = ( (1/2) x2 + (1/2) x3, x2, x3 )^T = (x2/2) ( 1, 2, 0 )^T + (x3/2) ( 1, 0, 2 )^T
  = A1 ( 1, 2, 0 )^T + A2 ( 1, 0, 2 )^T

Thus W(-1) consists of all vectors of the form A1 ( 1, 2, 0 )^T + A2 ( 1, 0, 2 )^T.

Note: The vectors ( 1, 2, 0 )^T, ( 1, 0, 2 )^T form a basis for W(-1) and therefore dim W(-1) = 2.
What is W(3) the eigensubspace corresponding to the eigenvalue 3 for the above
matrix?
N = A - 3I = [ -12  4  4 ]
             [ -8   0  4 ]
             [ -16  8  4 ]

N  --R2 - (2/3)R1, R3 - (4/3)R1, R3 + R2-->  [ -12   4    4  ]
                                             [  0  -8/3  4/3 ]
                                             [  0    0    0  ]

∴ 12 x1 = 4 x2 + 4 x3
(8/3) x2 = (4/3) x3   ∴ x3 = 2 x2
∴ 12 x1 = 4 x2 + 8 x2 = 12 x2
∴ x2 = x1 ; x3 = 2 x2 = 2 x1

∴ The general solution is

x = ( x1, x1, 2 x1 )^T = x1 ( 1, 1, 2 )^T

Thus W(3) consists of all multiples κ ( 1, 1, 2 )^T.
Note: The vector ( 1, 1, 2 )^T forms a basis for W(3) and hence dim W(3) = 1.
We have,

       | λ-a11   -a12   ...   -a1n  |
C(λ) = | -a21    λ-a22  ...   -a2n  |
       |  ...     ...   ...    ...  |
       | -an1    -an2   ...   λ-ann |

     = λⁿ - (a11 + ... + ann) λ^(n-1) + ... + (-1)ⁿ det A
For

A = [ -9   4  4 ]
    [ -8   3  4 ]
    [ -16  8  7 ]

∴ C(λ) = det(λI - A) = | λ+9  -4   -4  |
                       |  8   λ-3  -4  |
                       |  16  -8   λ-7 |

Adding columns (C1 → C1 + C2 + C3),

= | λ+1  -4   -4  |             | 1  -4   -4  |
  | λ+1  λ-3  -4  |  = (λ + 1)  | 1  λ-3  -4  |
  | λ+1  -8   λ-7 |             | 1  -8   λ-7 |

and then (R2 - R1, R3 - R1),

= (λ + 1) | 1  -4    -4  |
          | 0  λ+1    0  |
          | 0  -4   λ-3  |

= (λ + 1)(λ + 1)(λ - 3) = (λ + 1)² (λ - 3)
where λ1, λ2, . . . . . ., λk are the distinct roots; these distinct roots are the distinct
eigenvalues of A and the multiplicities of these roots are called the algebraic
multiplicities of these eigenvalues of A. Thus when C(λ) is as in (2), the distinct
eigenvalues are λ1, λ2, . . . . . ., λk and the algebraic multiplicities of these
eigenvalues are respectively, a1, a2, . . . . . , ak.
Thus the distinct eigenvalues of this matrix are λ1 = -1 ; and λ2 = 3 and their
algebraic multiplicities are respectively a1 = 2 ; a2 = 1.
and is defined as

Wλi = { x ∈ Cⁿ : Ax = λi x }

The dimension of Wλi is called the GEOMETRIC MULTIPLICITY of the eigenvalue λi.

Again for the matrix on page 81, we have found on pages 83 and 84 respectively that dim W(-1) = 2 and dim W(3) = 1. Thus the geometric multiplicities of the eigenvalues λ1 = -1 and λ2 = 3 are respectively g1 = 2 ; g2 = 1.
Notice that in this example it turns out that a1 = g1 = 2 ; and a2 = g2 = 1. In
general this may not be so. It can be shown that for any matrix A having C(λ) as
in (2),
1 ≤ gi ≤ ai ; 1 ≤ i ≤ k . . . . . . . . . . . .(3)
p_i(λ) = [ (λ - α1)(λ - α2) ..... (λ - α(i-1))(λ - α(i+1)) ..... (λ - αs) ] / [ (αi - α1)(αi - α2) ..... (αi - α(i-1))(αi - α(i+1)) ..... (αi - αs) ]

       =   ∏    (λ - αj) / (αi - αj)    for i = 1, 2, ....., s    ........(4)
         1≤j≤s
          j≠i

p_i(αj) = δij = { 1 if i = j ;  0 if i ≠ j }    ..........(5)
We call these the Lagrange Interpolation polynomials. If p(λ) is any polynomial
of degree ≤ s-1 then it can be written as a linear combination of p1(λ),p2(λ), . . .,
ps(λ) as follows:
p(λ) = p(α1) p1(λ) + p(α2) p2(λ) + ..... + p(αs) ps(λ) = Σ_{i=1}^{s} p(αi) pi(λ)    ....(6)
If Aφi = λi φi with φi ≠ θ, then

A²φi = A(Aφi) = A(λi φi) = λi Aφi = λi² φi
A³φi = A(A²φi) = A(λi² φi) = λi² Aφi = λi³ φi

and by induction we get A^m φi = λi^m φi. Now, if

p(λ) = a0 + a1 λ + ..... + as λ^s

then

p(A) φi = a0 φi + a1 A φi + ..... + as A^s φi
        = (a0 + a1 λi + ..... + as λi^s) φi
        = p(λi) φi.
Thus,
Now if in (4) & (5) we take s = k ; αi = λi, (i = 1, 2, ....., k) then we get the Lagrange Interpolation polynomials as

p_i(λ) =   ∏    (λ - λj) / (λi - λj) ;  i = 1, 2, ....., k    ............(9)
         1≤j≤k
          j≠i

and

p_i(λj) = δij    ............(10)
Now, suppose

C1 φ1 + C2 φ2 + .... + Ck φk = θn

For 1 ≤ i ≤ k, applying p_i(A) to both sides and using p_i(A) φj = p_i(λj) φj,

⇒ Ci φi = θ ; 1 ≤ i ≤ k ; by (10)
⇒ Ci = 0 ; 1 ≤ i ≤ k, since the φi are nonzero vectors.

Thus

C1 φ1 + C2 φ2 + .... + Ck φk = θn ⇒ C1 = C2 = .... = Ck = 0, proving (8). Thus we have

PROPERTY II
3.2 SIMILAR MATRICES
We shall now introduce the idea of similar matrices and study the properties
of similar matrices.
DEFINITION
A matrix A is said to be similar to a matrix B if there exists a nonsingular matrix P such that

P^(-1) A P = B

We then write, A ∼ B.

(1) A ∼ A (take P = I).
(2) A ∼ B ⇒ A = P B P^(-1) = (P^(-1))^(-1) B P^(-1), and thus A ∼ B ⇒ B ∼ A.
(3) A ∼ B, B ∼ C ⇒ A ∼ C.
(4) Properties (1), (2) and (3) above show that similarity is an equivalence
relation on the set of all nxn matrices.
(5) Let A and B be similar matrices. Then there exists a nonsingular matrix P
such that
A = P-1 B P.
Now, let CA(λ) and CB (λ) be the characteristic polynomials of A and B
respectively. We have,
C_A(λ) = |λI - A| = |λ P^(-1) P - P^(-1) B P|
       = |P^(-1) (λI - B) P|
       = |P^(-1)| |λI - B| |P|
       = |λI - B|    since |P^(-1)| |P| = 1
       = C_B(λ)
(6) Let A and B be similar matrices. Then there exists a nonsingular matrix P
such that
A = P-1 B P
A^k = (P^(-1) B P)(P^(-1) B P) ..... (P^(-1) B P)    (k times)
    = P^(-1) B^k P

Therefore,

A^k = On ⇔ P^(-1) B^k P = On ⇔ B^k = On
(7) Let A and B be similar matrices, say A = P^(-1) B P, and let p(λ) = a0 + a1 λ + ..... + ak λ^k be any polynomial. Then

p(A) = a0 I + a1 A + ..... + ak A^k
     = a0 I + a1 P^(-1) B P + a2 P^(-1) B² P + ..... + ak P^(-1) B^k P
     = P^(-1) [ a0 I + a1 B + a2 B² + ..... + ak B^k ] P
     = P^(-1) p(B) P

Thus

p(A) = On ⇔ P^(-1) p(B) P = On ⇔ p(B) = On
(8) Let A be any matrix. By A(A) we denote the set of all polynomials p(λ) such that p(A) = On.
The next simple matrix we know is the identity matrix In. Now A ∼ In ⇔
there is a nonsingular P such that A = P-1 In P ⇔ A = In.
Thus “THE ONLY MATRIX SIMILAR TO In IS ITSELF ”.
Similarly the only matrix similar to a scalar matrix kIn , (where k is a scalar),is kIn
itself.
D = [ λ1              ]
    [     λ2          ]
    [         ...     ]
    [             λn  ]

(the λi not necessarily distinct).

P^(-1) A P = D
i.e.  AP = PD    ...........(1)

Let Pi = ( p1i, p2i, ....., pni )^T denote the ith column of P.
Now the ith column of AP is A Pi, while the ith column of PD, the r.h.s. of (1), is

( λi p1i, λi p2i, ....., λi pni )^T = λi Pi

Since l.h.s. = r.h.s. by (1) we have A Pi = λi Pi ; 1 ≤ i ≤ n.

Note that since P is nonsingular no column of P can be the zero vector. Thus none of the column vectors Pi are zero. Thus we conclude that each column Pi of P is an eigenvector of A corresponding to the eigenvalue λi.
Note:
Conversely, it is now obvious that if A has n linearly independent eigenvectors then A is similar to a diagonal matrix D: if P is the matrix whose ith column is the ith eigenvector, then D = P^(-1) A P and the ith diagonal entry of D is the eigenvalue corresponding to the ith eigenvector.

If Wi = { x : Ax = λi x }, then we have,
Example:

A = [ -9   4  4 ]
    [ -8   3  4 ]
    [ -16  8  7 ]

C(λ) = (λ + 1)² (λ - 3)

Thus λ1 = -1 ; a1 = 2 and λ2 = 3 ; a2 = 1.

On pages 83 and 84 we found,

W1 = eigensubspace corresponding to λ = -1 = { x : x = A1 ( 1, 2, 0 )^T + A2 ( 1, 0, 2 )^T }

W2 = eigensubspace corresponding to λ = 3 = { x : x = k ( 1, 1, 2 )^T }

Thus dim W1 = 2 ∴ g1 = 2 ;  dim W2 = 1 ∴ g2 = 1.

Let

P = [ 1  1  1 ]
    [ 2  0  1 ]
    [ 0  2  2 ]

Then

P^(-1) = [  1   0  -1/2 ]
         [  2  -1  -1/2 ]
         [ -2   1   1   ]

and it can be verified that

P^(-1) A P = [  1   0  -1/2 ][ -9   4  4 ][ 1  1  1 ]   [ -1   0  0 ]
             [  2  -1  -1/2 ][ -8   3  4 ][ 2  0  1 ] = [  0  -1  0 ]
             [ -2   1   1   ][ -16  8  7 ][ 0  2  2 ]   [  0   0  3 ]

a diagonal matrix, whose diagonal entries are the eigenvalues of the matrix A.
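The verification claimed above is mechanical; a sketch in exact arithmetic:

```python
# Check that P^{-1} A P = diag(-1, -1, 3) for the example above,
# using exact rational arithmetic for the 1/2 entries of P^{-1}.

from fractions import Fraction as F

def matmul(X, Y):
    n, m, p = len(X), len(Y[0]), len(Y)
    return [[sum(X[i][k] * Y[k][j] for k in range(p)) for j in range(m)]
            for i in range(n)]

A = [[-9, 4, 4], [-8, 3, 4], [-16, 8, 7]]
P = [[1, 1, 1], [2, 0, 1], [0, 2, 2]]
P_inv = [[F(1), F(0), F(-1, 2)],
         [F(2), F(-1), F(-1, 2)],
         [F(-2), F(1), F(1)]]

D = matmul(matmul(P_inv, A), P)
```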
If x = ( x1, x2, ....., xn )^T and y = ( y1, y2, ....., yn )^T are any two vectors in Cⁿ, we define the INNER PRODUCT of x with y (denoted by (x, y)) as,

(x, y) = x1 y1* + x2 y2* + ..... + xn yn* = Σ_{i=1}^{n} xi yi*

(where c* denotes the complex conjugate of c).

Example 1:

If x = ( i, 2+i, -1 )^T ; y = ( 1, 1-i, i )^T ; then,

(x, y) = i(1) + (2+i)(1+i) + (-1)(-i) = i + (1 + 3i) + i = 1 + 5i

whereas (y, x) = 1(-i) + (1-i)(2-i) + i(-1) = -i + (1 - 3i) - i = 1 - 5i
(1) (x, x) = Σ_{i=1}^{n} xi xi* = Σ |xi|², which is real and ≥ 0 ; and

(x, x) = 0 ⇔ xi = 0 ; 1 ≤ i ≤ n ⇔ x = θn

Thus, (x, x) is real and ≥ 0, and (x, x) = 0 ⇔ x = θn.

(2) (x, y) = Σ_{i=1}^{n} xi yi* = ( Σ_{i=1}^{n} yi xi* )* = (y, x)*

Thus, (x, y) = (y, x)*.

(3) For any complex number α, we have,

(αx, y) = Σ_{i=1}^{n} (α xi) yi* = α Σ_{i=1}^{n} xi yi* = α (x, y)

Thus (αx, y) = α (x, y) for any complex number α. We note,

(x, αy) = (αy, x)* by (2) = ( α (y, x) )* = α* (y, x)* = α* (x, y)

(4) (x + y, z) = Σ_{i=1}^{n} (xi + yi) zi* = Σ xi zi* + Σ yi zi* = (x, z) + (y, z)

Thus
Example (1): If x = ( 1, i, -i )^T ; y = ( -1, i, 0 )^T, then,

(x, y) = 1(-1) + i(i)* + (-i)(0)* = -1 + i(-i) + 0 = -1 + 1 = 0

Thus x and y are orthogonal.

(2) If x = ( 1, i, -i )^T, y = ( -1, a, 1 )^T then

(x, y) = -1 + i a* - i

∴ x, y orthogonal
⇔ -(1 + i) + i a* = 0
⇔ a* = (1 + i)/i = -i(1 + i) = 1 - i
⇔ a = 1 + i
3.3 HERMITIAN MATRICES
Example 1:  A = [ 1   i ] ;   Transpose of A = [ 1  -i ] ;   ∴ A* = [  1   i ]
                [ -i  i ]                      [ i   i ]            [ -i  -i ]

so A* ≠ A, and this A is not Hermitian.

Example 2:  A = [ 1   i ] ;   Transpose of A = [ 1  -i ] ;   ∴ A* = [  1   i ] = A
                [ -i  2 ]                      [ i   2 ]            [ -i   2 ]

so this A is Hermitian.
(2) Let x = ( x1, ....., xn )^T ; y = ( y1, ....., yn )^T be any two vectors in Cⁿ and A a Hermitian matrix. We have

(Ax)i = Σ_{j=1}^{n} aij xj ;  (Ay)j = Σ_{i=1}^{n} aji yi.

Now

(Ax, y) = Σ_{i=1}^{n} (Ax)i yi*
        = Σ_i Σ_j aij xj yi*
        = Σ_j xj ( Σ_i aij yi* )
        = Σ_j xj ( Σ_i aij* yi )*
        = Σ_j xj ( Σ_i aji yi )*    (since aij* = aji, as A = A*)
        = Σ_j xj (Ay)j*
        = (x, Ay)
(3) Let λ be any eigenvalue of a Hermitian matrix A. Then there is an x ∈ Cn, x ≠
θn such that
Ax = λx.
Now, since A is Hermitian we have,
λ (x, x) = (λx, x) = (Ax, x) = (x, Ax) = (x, λx) = λ* (x, x)

∴ (λ - λ*)(x, x) = 0. But (x, x) ≠ 0 since x ≠ θn.

∴ λ - λ* = 0 ∴ λ = λ* ∴ λ is real.
Ax = λx and Ay = µy
and λ, µ are real by (3).
Now,
λ (x , y ) = (λ x , y )
= (Ax , y )
= (x , Ay ) by ( 2 )
= (x , µ y )
= µ (x , y )
= µ (x , y ) since µ is real .
Hence we get
(λ - µ)(x, y) = 0. But λ ≠ µ, so we get (x, y) = 0 ⇒ x and y are orthogonal.
3.4 GRAM - SCHMIDT ORTHONORMALIZATION
Let ψ1 = U1 ;

φ1 = ψ1/‖ψ1‖ = ψ1/√(ψ1, ψ1).    Note ‖φ1‖ = 1.

(We have used the symbol ‖x‖ to denote the norm √(x, x) of a vector x.)

Next, let,

ψ2 = U2 - (U2, φ1) φ1

Note that

(ψ2, φ1) = (U2, φ1) - ((U2, φ1) φ1, φ1)
         = (U2, φ1) - (U2, φ1)(φ1, φ1)
         = (U2, φ1) - (U2, φ1)    (since (φ1, φ1) = 1)
         = 0

Hence we get, ψ2 ⊥ φ1. Let

φ2 = ψ2/‖ψ2‖ ;  clearly ‖φ2‖ = 1, ‖φ1‖ = 1, (φ1, φ2) = 0.

Also, if x = α1 U1 + α2 U2 then

x = α1 ψ1 + α2 (ψ2 + (U2, φ1) φ1)
  = α1 ‖ψ1‖ φ1 + α2 [ ‖ψ2‖ φ2 + (U2, φ1) φ1 ]
  = β1 φ1 + β2 φ2 ,  where

β1 = α1 ‖ψ1‖ + α2 (U2, φ1) ;  β2 = α2 ‖ψ2‖

In general, having found φ1, ....., φ(i-1), we set

ψi = Ui - Σ_{p=1}^{i-1} (Ui, φp) φp ;    clearly (ψi, φp) = 0 for 1 ≤ p ≤ i-1 ;

and

φi = ψi/‖ψi‖
Example:

Let U1 = ( 1, 1, 1, 0 )^T ; U2 = ( 1, 1, -1, 0 )^T ; U3 = ( 2, 3, 1, 0 )^T be l.i. vectors in R⁴. Let us find an orthonormal basis for the subspace W spanned by U1, U2, U3 using the Gram - Schmidt process.

ψ1 = U1 = ( 1, 1, 1, 0 )^T ;  ‖ψ1‖ = √3 ;

∴ φ1 = ψ1/√(ψ1, ψ1) = ( 1/√3, 1/√3, 1/√3, 0 )^T

ψ2 = U2 - (U2, φ1) φ1 = ( 1, 1, -1, 0 )^T - (1/√3)( 1/√3, 1/√3, 1/√3, 0 )^T
   = ( 1, 1, -1, 0 )^T - ( 1/3, 1/3, 1/3, 0 )^T
   = ( 2/3, 2/3, -4/3, 0 )^T

and ‖ψ2‖ = √(4/9 + 4/9 + 16/9) = 2√6/3

∴ φ2 = ψ2/‖ψ2‖ = ( 1/√6, 1/√6, -2/√6, 0 )^T

Finally,

ψ3 = U3 - (U3, φ1) φ1 - (U3, φ2) φ2
   = ( 2, 3, 1, 0 )^T - (6/√3) φ1 - (3/√6) φ2
   = ( 2, 3, 1, 0 )^T - ( 2, 2, 2, 0 )^T - ( 1/2, 1/2, -1, 0 )^T
   = ( -1/2, 1/2, 0, 0 )^T

‖ψ3‖ = √(1/4 + 1/4) = 1/√2

∴ φ3 = ψ3/‖ψ3‖ = ( -1/√2, 1/√2, 0, 0 )^T

Thus the required orthonormal basis for W, the subspace spanned by U1, U2, U3, is φ1, φ2, φ3, where

φ1 = ( 1/√3, 1/√3, 1/√3, 0 )^T ;  φ2 = ( 1/√6, 1/√6, -2/√6, 0 )^T ;  φ3 = ( -1/√2, 1/√2, 0, 0 )^T

Note that these φi are mutually orthogonal and have, each, 'length' one.
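The process can be sketched in a few lines of code, checked against the example just computed (classical Gram - Schmidt, projecting each Ui onto the already-built φp exactly as in the general step above):

```python
# Classical Gram-Schmidt: psi_i = U_i - sum_p (U_i, phi_p) phi_p,
# phi_i = psi_i / ||psi_i||, for real vectors.

import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    basis = []
    for u in vectors:
        psi = list(u)
        for phi in basis:               # subtract projections on earlier phi's
            c = dot(u, phi)
            psi = [pj - c * fj for pj, fj in zip(psi, phi)]
        norm = math.sqrt(dot(psi, psi))
        basis.append([pj / norm for pj in psi])
    return basis

U1, U2, U3 = [1, 1, 1, 0], [1, 1, -1, 0], [2, 3, 1, 0]
phis = gram_schmidt([U1, U2, U3])
```

The last vector comes out as ( -1/√2, 1/√2, 0, 0 )^T, matching φ3 above.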
Now let A be an n×n Hermitian matrix, let λ1, ....., λk be its distinct eigenvalues and a1, ....., ak their algebraic multiplicities. If Wi is the characteristic subspace (eigensubspace) corresponding to the eigenvalue λi, that is,

Wi = { x : Ax = λi x }

we choose any basis for Wi and orthonormalize it by the G-S process to get an orthonormal basis for Wi. If we now take all these orthonormal basis vectors for W1, ....., Wk and write them as the columns of a matrix P, then

P*AP

will be a diagonal matrix.
Example :

A = [  6  -2   2 ]
    [ -2   3  -1 ]
    [  2  -1   3 ]

Notice A* = A^T = A.

Characteristic Polynomial of A:

|λI - A| = | λ-6   2    -2  |
           |  2   λ-3    1  |
           | -2    1   λ-3  |

--R1 + 2R2-->  | λ-2  2(λ-2)  0  |            | 1    2    0  |
               |  2    λ-3    1  |  = (λ - 2) | 2   λ-3   1  |
               | -2     1    λ-3 |            | -2   1   λ-3 |

--R2 - 2R1, R3 + 2R1-->  = (λ - 2) | 1   2     0  |
                                   | 0  λ-7    1  |
                                   | 0   5    λ-3 |

= (λ - 2)[ (λ - 7)(λ - 3) - 5 ] = (λ - 2)(λ² - 10λ + 16) = (λ - 2)²(λ - 8)

Thus

C(λ) = (λ - 2)²(λ - 8)

∴ λ1 = 2, a1 = 2 ;  λ2 = 8, a2 = 1.

W1 = { x : Ax = 2x } = { x : (A - 2I) x = θ }

(A - 2I) x = θ,  i.e.

[  4  -2   2 ][ x1 ]   [ 0 ]
[ -2   1  -1 ][ x2 ] = [ 0 ]
[  2  -1   1 ][ x3 ]   [ 0 ]

⇒ 2 x1 - x2 + x3 = 0 ⇒ x3 = -2 x1 + x2

∴ x = ( x1, x2, -2 x1 + x2 )^T ; x1, x2 arbitrary

∴ W1 = { x : x = ( α, β, -2α + β )^T ; α, β scalars }
∴ A basis for W1 is

U1 = ( 1, 0, -2 )^T ;  U2 = ( 0, 1, 1 )^T

ψ1 = U1 = ( 1, 0, -2 )^T ;  ‖ψ1‖ = √5 ;  φ1 = ψ1/‖ψ1‖ = ( 1/√5, 0, -2/√5 )^T

ψ2 = U2 - (U2, φ1) φ1 = ( 0, 1, 1 )^T - (-2/√5)( 1/√5, 0, -2/√5 )^T
   = ( 0, 1, 1 )^T + ( 2/5, 0, -4/5 )^T = ( 2/5, 1, 1/5 )^T

‖ψ2‖ = √(4/25 + 1 + 1/25) = √(30/25) = √30/5

∴ φ2 = ψ2/‖ψ2‖ = ( 2/√30, 5/√30, 1/√30 )^T
W2 = { x : Ax = 8x } = { x : (A - 8I) x = θ }

So we have to solve (A - 8I) x = θ, i.e.

[ -2  -2   2 ][ x1 ]   [ 0 ]
[ -2  -5  -1 ][ x2 ] = [ 0 ]
[  2  -1  -5 ][ x3 ]   [ 0 ]

The general solution is x = ( 2γ, -γ, γ )^T = γ ( 2, -1, 1 )^T.

∴ Basis : U3 = ( 2, -1, 1 )^T

ψ3 = U3 = ( 2, -1, 1 )^T ;  ‖ψ3‖ = √6 ;

φ3 = ψ3/‖ψ3‖ = ( 2/√6, -1/√6, 1/√6 )^T
∴ If

P = [ 1/√5   2/√30   2/√6 ]
    [ 0      5/√30  -1/√6 ]
    [ -2/√5  1/√30   1/√6 ]

then P* = P^T and

P*AP = P^T A P = [ 2  0  0 ]
                 [ 0  2  0 ] ;
                 [ 0  0  8 ]

a diagonal matrix.
3.5 VECTOR AND MATRIX NORMS
Consider

R² = { x = ( x1, x2 )^T ; x1, x2 ∈ R },

our 'usual' two-dimensional plane. If x = ( x1, x2 )^T is any vector in this space we define its 'usual' 'length' or 'norm' as

‖x‖ = √(x1² + x2²)

We observe that
Examples of Vector Norms on Cn and Rn
Let x = ( x1, x2, ....., xn )^T be any vector in Cⁿ (or Rⁿ).

(1) ‖x‖2 = ( |x1|² + |x2|² + ..... + |xn|² )^(1/2) = [ Σ_{i=1}^{n} |xi|² ]^(1/2)

(2) ‖x‖1 = |x1| + |x2| + .... + |xn| = Σ_{i=1}^{n} |xi|

(3) ‖x‖p = [ Σ_{i=1}^{n} |xi|^p ]^(1/p)

(4) ‖x‖∞ = max { |x1|, |x2|, ....., |xn| }
All these can be verified to satisfy the above mentioned properties (i), (ii) and (iii)
required of a norm. Thus these give several types of norms on Cn and Rn.
Example:

(1) Let x = ( 1, -2, -1 )^T in R³. Then

‖x‖1 = 1 + 2 + 1 = 4
‖x‖2 = (1 + 4 + 1)^(1/2) = √6
‖x‖∞ = max { 1, 2, 1 } = 2
‖x‖4 = ( 1 + 2⁴ + 1 )^(1/4) = 18^(1/4)

(2) Let x = ( 1, i, -2i )^T in C³. Then

‖x‖1 = 1 + 1 + 2 = 4
‖x‖2 = (1 + 1 + 4)^(1/2) = √6
‖x‖∞ = max { 1, 1, 2 } = 2
‖x‖3 = ( 1³ + 1³ + 2³ )^(1/3) = 10^(1/3)
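These norms are straightforward to compute; a sketch for example (1) above:

```python
# p-norms and the infinity norm of x = (1, -2, -1), as in example (1).

def norm_p(x, p):
    return sum(abs(v) ** p for v in x) ** (1 / p)

def norm_inf(x):
    return max(abs(v) for v in x)

x = [1, -2, -1]
n1, n2, n4 = norm_p(x, 1), norm_p(x, 2), norm_p(x, 4)
```

This reproduces ‖x‖1 = 4, ‖x‖2 = √6, ‖x‖4 = 18^(1/4), and ‖x‖∞ = 2.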
Consider a sequence { x^(k) }_{k=1}^{∞} of vectors in Cⁿ (or Rⁿ),

x^(k) = ( x1^(k), x2^(k), ....., xn^(k) )^T

and suppose x = ( x1, x2, ....., xn )^T ∈ Cⁿ (or Rⁿ).

DEFINITION: We say that x^(k) → x (componentwise) if xi^(k) → xi for each i = 1, 2, ....., n.

Example:

Let x^(k) = ( 1/k, 1 - 2/k, 1/(k² + 1) )^T be a sequence of vectors in R³, and let x = ( 0, 1, 0 )^T. Here

x1^(k) = 1/k → 0 = x1
x2^(k) = 1 - 2/k → 1 = x2
x3^(k) = 1/(k² + 1) → 0 = x3

∴ xi^(k) → xi for i = 1, 2, 3.  ∴ x^(k) → x.
Now take x^(k) = ( 1/k, 1 - 2/k, 1/(k² + 1) )^T in R³ as before, and x = ( 0, 1, 0 )^T. We have

x^(k) - x = ( 1/k, -2/k, 1/(k² + 1) )^T

Now

‖x^(k) - x‖1 = 1/k + 2/k + 1/(k² + 1) → 0    ∴ x^(k) → x in ‖·‖1

Similarly

‖x^(k) - x‖∞ = max { 1/k, 2/k, 1/(k² + 1) } = 2/k → 0    ∴ x^(k) → x in ‖·‖∞

‖x^(k) - x‖2 = ( 1/k² + 4/k² + 1/(k² + 1)² )^(1/2) → 0    ∴ x^(k) → x in ‖·‖2

Also,

‖x^(k) - x‖p = ( 1/k^p + 2^p/k^p + 1/(k² + 1)^p )^(1/p) → 0

∴ x^(k) → x in ‖·‖p for all p, 1 ≤ p ≤ ∞.
It can be shown that
“ IF A SEQUENCE {x (k ) }OF VECTORS IN Cn (or Rn) CONVERGES TO A
VECTOR x IN Cn (or Rn) WITH RESPECT TO ONE VECTOR NORM THEN THE
SEQUENCE CONVERGES TO x WITH RESPECT TO ALL VECTOR NORMS
AND ALSO THE SEQUENCE CONVERGES TO x ACCORDING TO
DEFINITION ON PAGE 113 . CONVERSELY IF A SEQUENCE CONVERGES
TO x AS PER DEFINITION ON PAGE 113 THEN IT CONVERGES WITH
RESPECT TO ALL VECTOR NORMS”.
MATRIX NORMS
Let M be the set of all n×n matrices (real or complex). A matrix norm is a function from the collection of matrices to the real numbers, whose value at any matrix A is denoted by ‖A‖, having the following properties:

(i) ‖A‖ ≥ 0 for all matrices A, and ‖A‖ = 0 if and only if A = On,
Suppose ‖·‖ is a vector norm. Then consider ‖Ax‖/‖x‖ (where A is an n×n matrix), for x ≠ θn. This gives us an idea of the proportion by which the matrix A has distorted the length of x. Suppose we take the maximum distortion as we vary x over all vectors. We get

‖A‖ = max_{x ≠ θn} ‖Ax‖/‖x‖

We can show this is a matrix norm; it is called the matrix norm subordinate to the vector norm ‖·‖. We can also show that

‖A‖ = max_{x ≠ θn} ‖Ax‖/‖x‖ = max_{‖x‖ = 1} ‖Ax‖
For example,

‖A‖1 = max_{‖x‖1 = 1} ‖Ax‖1
‖A‖2 = max_{‖x‖2 = 1} ‖Ax‖2
‖A‖∞ = max_{‖x‖∞ = 1} ‖Ax‖∞
‖A‖p = max_{‖x‖p = 1} ‖Ax‖p

How hard or easy is it to compute these matrix norms? We shall give some idea of computing ‖A‖1, ‖A‖∞ and ‖A‖2 for a matrix A.
How hard or easy is it to compute these matrix norms? We shall give some idea
of computing A 1 , A ∞ and A 2 for a matrix A.
Let
120
a 11 a 12 ..... a1n
a 21 a 22 ..... a2n
A=
..... ..... ..... .....
a a nn
n1 an2 .....
The sum of the absolute values of the entries in the ith column is called the
absolute column sum and is denoted by Ci. We have
n
C1 = a11 + a 21 + a 31 + ..... + a n1 = ∑ ai1
i =1
n
C 2 = a12 + a 22 + a32 + ..... + an 2 = ∑ ai 2
i =1
….. ….. ….. ….. ….. ….. …..
n
Cj = ∑
i =1
a ij ; 1≤j≤n
Let
C = max .{C1 , C2 ,.....,Cn }
This is called the maximum absolute column sum. We can show that,
= max n
∑ a
1 ≤ j ≤ n i = 1
ij
For example, if
1 2 − 3
A = −1 0 1 ,
− 3 − 4
2
then
C 1 = 1 + 1 + 3 = 5;
C 2 = 2 + 0 + 2 = 4; and C = max. {5, 4, 8} = 8
C3 = 3 + 1 + 4 = 8
121
∴ A1 =8
Similarly the absolute row sums are

Ri = |ai1| + |ai2| + ..... + |ain| = Σ_{j=1}^{n} |aij| ;  1 ≤ i ≤ n

and with R = max { R1, ....., Rn }, the maximum absolute row sum, we can show that

‖A‖∞ = R = max_{1 ≤ i ≤ n} Σ_{j=1}^{n} |aij|

For example, for the matrix

A = [  1   2  -3 ]
    [ -1   0   1 ] ,  we have
    [  3  -2  -4 ]

R1 = 1 + 2 + 3 = 6 ;  R2 = 1 + 0 + 1 = 2 ;  R3 = 3 + 2 + 4 = 9 ;  and R = max { 6, 2, 9 } = 9

∴ ‖A‖∞ = 9
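Both computations are one-liners over the rows and columns; a sketch for the example matrix:

```python
# ||A||_1 = maximum absolute column sum; ||A||_inf = maximum absolute
# row sum, for the example matrix above.

A = [[1, 2, -3], [-1, 0, 1], [3, -2, -4]]

norm_1 = max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))
norm_inf = max(sum(abs(v) for v in row) for row in A)
```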
The computation of ‖A‖1 and ‖A‖∞ for a matrix is thus fairly easy. However, the computation of ‖A‖2 is not very easy; but it is somewhat easier in the case of a Hermitian matrix.

Let λ1, ....., λk be the distinct eigenvalues of the Hermitian matrix A, and let

P = max { |λ1|, |λ2|, ....., |λk| }

This is called the spectral radius of A and is also denoted by ‖A‖sp. For a Hermitian matrix,

‖A‖2 = P = ‖A‖sp

For example, for

A = [  6  -2   2 ]
    [ -2   3  -1 ]
    [  2  -1   3 ]

∴ ‖A‖sp = P = max { 2, 8 } = 8 ;  ∴ ‖A‖2 = ‖A‖sp = 8
For a general matrix we can show that

‖A‖2 = µ = max { µ1, ....., µn }

It follows from the definition of the matrix norm subordinate to a vector norm, that for any x ≠ θn,

‖Ax‖/‖x‖ ≤ max_{x ≠ θn} ‖Ax‖/‖x‖ = ‖A‖

and therefore ‖Ax‖ ≤ ‖A‖ ‖x‖ for all x ≠ θn ; since this holds trivially for x = θn,

‖Ax‖ ≤ ‖A‖ ‖x‖ for all x.
UNIT 4
EIGENVALUE COMPUTATIONS
4.1 COMPUTATION OF EIGEN VALUES
In this section we shall discuss some standard methods for computing the
eigenvalues of an nxn matrix. We shall also briefly discuss some methods for
computing the eigenvectors corresponding to the eigenvalues.
We shall first discuss some results regarding the general location of the
eigenvalues.
Let A = (aij) be an n×n matrix; and let λ1, λ2, ....., λn be its eigenvalues (including multiplicities). We defined

P = ‖A‖sp = max { |λ1|, |λ2|, ....., |λn| }
Thus if we draw a circle of radius P about the origin in the complex plane, then
all the eigenvalues of A will lie on or inside this closed disc. Thus we have
(A) If A is an nxn matrix then all the eigenvalues of A lie in the closed disc
{λ : λ ≤ P}in the complex plane.
This result gives us a disc inside which all the eigenvalues of A are located. However, to locate this disc we need P, and to find P we need the eigenvalues.
Thus this result is not practically useful. However, from a theoretical point of
view, this suggests the possibility of locating all the eigenvalues in some disc.
We shall now look for other discs which can be easily located and inside which
the eigenvalues can all be trapped.
Recall that ‖A‖1 = max_{1 ≤ j ≤ n} Σ_{i=1}^{n} |aij|. Thus we have,

(B) If A is an n×n matrix then all its eigenvalues are trapped in the closed disc { λ : |λ| ≤ ‖A‖∞ } or the disc { λ : |λ| ≤ ‖A‖1 }. (The idea is to use ‖A‖∞ if it is smaller than ‖A‖1, and ‖A‖1 if it is smaller than ‖A‖∞.)
COROLLARY

(C) If A is Hermitian, all its eigenvalues are real and hence all the eigenvalues lie in the intervals,

{ λ : -P ≤ λ ≤ P }    by (A)
{ λ : -‖A‖∞ ≤ λ ≤ ‖A‖∞ } and { λ : -‖A‖1 ≤ λ ≤ ‖A‖1 }    by (B).
Example 1:
Let A = [  1  -1  2 ]
        [ -1   2  3 ]
        [  2   1  0 ]

∴ ‖A‖∞ = MARS (maximum absolute row sum) = 6
∴ ‖A‖1 = MACS (maximum absolute column sum) = 5
The above results locate all the eigenvalues in one disc. The next set of
results try to isolate these eigenvalues to some extent in smaller discs. These
results are due to GERSCHGORIN.
Now let ξi = aii denote the ith diagonal entry of A, and let Pi denote the sum of the absolute values of the off-diagonal entries of A in the ith row. The ith disc is

Gi : centre ξi ; radius Pi : { λ : |λ - ξi| ≤ Pi }
Thus we get n discs G1, G2, ….., Gn. These are called the GERSCHGORIN
DISCS of the matrix A.
(D) All eigenvalue of A must lie within the union of these Gerschgorin discs.
Example 2:

Let A = [ 1  -1   0 ]
        [ 0   4   1 ]
        [ 1   3  -5 ]

P1 = 1 ; P2 = 1 ; P3 = 4

so the discs are G1 with centre (1, 0) and radius 1, G2 with centre (4, 0) and radius 1, and G3 with centre (-5, 0) and radius 4.
Thus every eigenvalue of A must lie in one of these three discs.
Example 3:
10 4 1
Let A= 1 10 0 .5
1.5 − 3 20
(It can be shown that the eigenvalues are exactly λ1 = 8, λ2 = 12, λ3 = 20).
ξ1 = (10,0) ξ2 = (10,0) ξ3 = 20
P1 = 5 P2 = 1.5 P3 = 4.5
G 1 = {λ : λ − 10 ≤ 5}
G 2 = {λ : λ − 10 ≤ 1 . 5}
G 3 = {λ : λ − 20 ≤ 4 . 5}
GG3
G1 G3
Thus all the eigenvalues of A are in these discs. But notice that our exact eigenvalues are 8, 12 and 20. Thus no eigenvalue lies in G2; one eigenvalue lies in G3 (namely 20) and two lie in G1 (namely 8 and 12).
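Building the discs and checking membership is simple; a sketch for the Example 3 matrix:

```python
# Gerschgorin discs: centre a_ii, radius P_i = sum of |off-diagonal|
# entries in row i.  Every eigenvalue lies in the union of the discs.

def gerschgorin(A):
    discs = []
    for i, row in enumerate(A):
        radius = sum(abs(v) for j, v in enumerate(row) if j != i)
        discs.append((row[i], radius))
    return discs

A = [[10, 4, 1], [1, 10, 0.5], [1.5, -3, 20]]
discs = gerschgorin(A)

def in_union(lam):
    return any(abs(lam - c) <= r for c, r in discs)
```

The stated eigenvalues 8, 12, 20 all fall in the union, while a point such as 15.1 (between G1 and G3) does not.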
Example 4:

Let A = [ 1  0  1 ]
        [ 1  2  0 ]
        [ 0  1  5 ]

Now, P1 = 1 ; P2 = 1 ; P3 = 1, so the discs are G1 with centre (1, 0), G2 with centre (2, 0) and G3 with centre (5, 0), each of radius 1.

Thus every eigenvalue of A must lie in the union of these three discs.
In example 2, all the Gerschgorin discs were isolated; and in examples 3 and 4 some discs intersected and others were isolated. The next Gerschgorin result identifies the location of the eigenvalues in such cases.
Thus in example 2 we have all three isolated discs and thus each disc will
trap exactly one eigenvalue.
In the case of Hermitian matrices, since all the eigenvalues are real, the Gerschgorin discs Gi = { λ : |λ - aii| ≤ Pi } = { λ : |λ - ξi| ≤ Pi } can be replaced by the Gerschgorin intervals,

Gi = { λ : ξi - Pi ≤ λ ≤ ξi + Pi }
Example 5:

Let A = [  1  -1    1  ]
        [ -1   5    0  ]
        [  1   0  -1/2 ]

Here ξ1 = (1, 0), P1 = 2 ;  ξ2 = (5, 0), P2 = 1 ;  ξ3 = (-1/2, 0), P3 = 1.

G1 : -1 ≤ λ ≤ 3
G2 : 4 ≤ λ ≤ 6
G3 : -3/2 ≤ λ ≤ 1/2
Note that G1 and G3 intersect and give a connected region, -3/2 ≤ λ ≤ 3; and this
is isolated from G2 : 4 ≤ λ ≤ 6. Thus there will be two eigenvalues in –3/2 ≤ λ ≤
3 and one eigenvalue in 4 ≤ λ ≤ 6.
All the above results (A), (B), (C), (D), and (E) give us a location of the eigenvalues inside some discs, and if the radii of these discs are small then the centres of these discs give us good approximations to the eigenvalues. However, if these discs are of large radius then we have to improve these approximations substantially. We shall now discuss this aspect of computing the eigenvalues more accurately. We shall first discuss the problem of computing the eigenvalues of a real symmetric matrix.

We shall first discuss the method of reducing the given real symmetric matrix to a similar real symmetric tridiagonal matrix, and then computing the eigenvalues of a real symmetric tridiagonal matrix. Thus the process of determining the eigenvalues of a real symmetric matrix A = (aij) involves two steps:
STEP 1:
STEP 2:
4.3 EIGENVALUES OF A REAL SYMMETRIC TRIDIAGONAL MATRIX
Let

T = [ a1   b1   0    0      0      ....    0    ]
    [ b1   a2   b2   0      0      ....    0    ]
    [ 0    b2   a3   b3     0      ....    0    ]
    [ .... .... .... ....   ....   ....    .... ]
    [ 0    .... .... 0      b(n-2) a(n-1)  b(n-1) ]
    [ 0    .... .... ....   0      b(n-1)  an   ]

det(T - λI) = | a1-λ  b1    0      .....   .....     0      |
              | b1    a2-λ  b2     0       .....     0      |
              | ..... ..... .....  .....   .....     .....  |
              | 0     ..... 0      b(n-2)  a(n-1)-λ  b(n-1) |
              | 0     ..... ..... 0        b(n-1)    an-λ   |
(Without loss of generality we assume bi ≠ 0 for all i. For if bi = 0 for some i then
the above determinant reduces to two diagonal blocks of the same type and
thus the problem reduces to that of the same type involving smaller sized
matrices).
We define Pi(λ) to be the ith principal minor of the above determinant. We have

P0(λ) = 1
P1(λ) = a1 - λ                                         ........(I)
Pi(λ) = (ai - λ) P(i-1)(λ) - b(i-1)² P(i-2)(λ) ;  i = 2, ....., n
Example:

(For an 8×8 tridiagonal matrix T, with C = 1, suppose the sequence (I) gives)

P0(1) = 1, P1(1) = 2, P2(1) = -3, P3(1) = -2, P4(1) = 6, P5(1) = -1, P6(1) = 0, P7(1) = 4, P8(1) = -2

The consecutive pairs

P0(1), P1(1) ;  P2(1), P3(1) ;  P5(1), P6(1)

agree in sign. (Since P6(1) = 0 we assign it the same sign as P5(1).) Thus three pairs of sign agreements are achieved, so N(1) = 3; and there will be 3 eigenvalues of T greater than or equal to 1, while the remaining 5 eigenvalues are < 1.

It is this idea of result (F) that will be combined with (A), (B), (C), (D) and (E) of the previous section, and clever repeated applications of (F), that will locate the eigenvalues of T. We now explain this by means of an example.
It is this idea of result (F), combined with (A), (B), (C), (D) and (E) of the
previous section and clever repeated applications of (F), that will locate the
eigenvalues of T. We now explain this by means of an example.
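The computation of N(C) by the recurrence (I) and the sign-agreement count can be sketched in Python (a minimal sketch; the function name `sturm_count` and the list conventions for the diagonal a and off-diagonal b are ours):

```python
def sturm_count(a, b, c):
    """N(c): the number of sign agreements between consecutive members of
    P_0(c), P_1(c), ..., P_n(c), which equals the number of eigenvalues >= c.
    a = [a_1, ..., a_n] is the diagonal, b = [b_1, ..., b_{n-1}] the off-diagonal."""
    p_prev, p = 1.0, a[0] - c            # P_0(c) = 1, P_1(c) = a_1 - c
    seq = [p_prev, p]
    for i in range(1, len(a)):           # P_i = (a_i - c) P_{i-1} - b_{i-1}^2 P_{i-2}
        p_prev, p = p, (a[i] - c) * p - b[i - 1] ** 2 * p_prev
        seq.append(p)
    agree, last = 0, 1                   # P_0 = 1 is positive
    for v in seq[1:]:
        s = last if v == 0 else (1 if v > 0 else -1)   # a zero keeps the previous sign
        agree += (s == last)
        last = s
    return agree
```

For the matrix T of Example 7 below, `sturm_count([1, -1, 2, 3], [2, 4, -1], 0)` returns 3, matching N(0) = 3 computed in the text.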
Example 7:
1 2 0 0
2 −1 4 0
Let T =
0 4 2 − 1
0 0 − 1 3
Here we have MARS = 7 and therefore

‖T‖∞ = MARS = 7

(Note since T is symmetric we have MARS = MACS and therefore
‖T‖1 = ‖T‖∞ = 7). Thus by our result (C), we have that the eigenvalues are all in
the interval −7 ≤ λ ≤ 7.
We see that G1, G2, G3 and G4 all intersect to form one single connected region
[-7, 7]. Thus by (E) there will be 4 eigenvalues in [-7, 7]. This gives therefore
the same information as we obtained above using (C). Thus so far we know
only that all the eigenvalues are in [-7, 7]. Now we shall see how we use (F) to
locate the eigenvalues.
First of all let us see how many eigenvalues will be ≥ 0. Let C = 0. Find N(0);
the number of eigenvalues ≥ 0 will be N(0).
Now

T − λI = [ 1−λ   2     0     0   ]
         [ 2    −1−λ   4     0   ]
         [ 0     4     2−λ  −1   ]
         [ 0     0    −1    3−λ  ]

and the Sturm sequence is

P0(λ) = 1
P1(λ) = 1 − λ
P2(λ) = −(1 + λ) P1(λ) − 4 P0(λ)
P3(λ) = (2 − λ) P2(λ) − 16 P1(λ)
P4(λ) = (3 − λ) P3(λ) − P2(λ)
Now, we have,

P0(0) = 1
P1(0) = 1
P2(0) = −5
P3(0) = −26
P4(0) = −73

We have

P0(0), P1(0)
P2(0), P3(0)
P3(0), P4(0)

as three consecutive pairs having sign agreements.

∴ N(0) = 3

Thus three eigenvalues are ≥ 0 and one eigenvalue is negative, i.e. lies in [−7, 0).

(Fig. 1: number line from −7 to 7.)
Taking C = −1:

P0(−1) = 1
P1(−1) = 2
P2(−1) = −4
P3(−1) = −44
P4(−1) = −172

Again the pairs (P0, P1), (P2, P3), (P3, P4) agree in sign, so N(−1) = 3: three
eigenvalues are ≥ −1 and the negative eigenvalue lies in [−7, −1).

(Fig. 2: number line from −7 to 7.)

Let us take the mid point of [−7, −1], in which the negative eigenvalue lies.
So let C = −4. We have

P0(−4) = 1
P1(−4) = 5
P2(−4) = 11
P3(−4) = −14
P4(−4) = −109

The pairs (P0, P1), (P1, P2), (P3, P4) agree in sign, so N(−4) = 3: three
eigenvalues are ≥ −4 and the negative eigenvalue lies in [−7, −4) .......... (*)

Next take the mid point of [−7, −4], C = −5.5:
P0(−5.5) = 1
P1(−5.5) = 6.5
P2(−5.5) = 25.25
P3(−5.5) = 85.375
P4(−5.5) = 700.4375

All four consecutive pairs agree in sign. ∴ N(−5.5) = 4, so all 4 eigenvalues
are ≥ −5.5. Combining this with (*) and Fig. 2 we get that the negative
eigenvalue is in [−5.5, −4).
We again take the mid point C of this interval, calculate N(C), locate in which
half of the interval the negative eigenvalue lies, and continue this bisection
process until we trap this negative eigenvalue in as small an interval as necessary.
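This bisection can be sketched as follows (a sketch; `count_ge` re-implements the sign count N(c), and `kth_largest_eigenvalue` assumes the k-th largest eigenvalue is already known to lie in [lo, hi]):

```python
def count_ge(a, b, c):
    """N(c): number of eigenvalues >= c of the symmetric tridiagonal matrix
    with diagonal a and off-diagonal b, counted by sign agreements between
    consecutive members of the Sturm sequence P_0(c), ..., P_n(c)."""
    p_prev, p = 1.0, a[0] - c
    agree, last = 0, 1                      # sign of P_0 is +
    for i in range(len(a)):
        if i > 0:                           # P_i = (a_i - c) P_{i-1} - b_{i-1}^2 P_{i-2}
            p_prev, p = p, (a[i] - c) * p - b[i - 1] ** 2 * p_prev
        s = last if p == 0 else (1 if p > 0 else -1)
        agree += (s == last)
        last = s
    return agree

def kth_largest_eigenvalue(a, b, lo, hi, k, tol=1e-10):
    """Bisection: assumes N(lo) >= k and N(hi) < k, so the k-th largest
    eigenvalue lies in [lo, hi]; halve the interval until it is trapped."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_ge(a, b, mid) >= k:        # at least k eigenvalues >= mid
            lo = mid                        # so the k-th largest is >= mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For Example 7 below, `kth_largest_eigenvalue([1, -1, 2, 3], [2, 4, -1], 5.0, 6.0, 1)` traps the largest eigenvalue inside [5, 6), and k = 4 on [−5.5, −4] traps the negative eigenvalue.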
Next take C = 1:

P0(1) = 1
P1(1) = 0
P2(1) = −4
P3(1) = −4
P4(1) = −4

(P1(1) = 0 takes the sign of P0(1).) The pairs (P0, P1), (P2, P3), (P3, P4)
agree in sign, so N(1) = 3: all three nonnegative eigenvalues are ≥ 1 ........... (**)

C = 2:

P0(2) = 1
P1(2) = −1
P2(2) = −1
P3(2) = 16
P4(2) = 17

∴ N(2) = 2, so there are two eigenvalues ≥ 2. Combining this with (**) we get
one eigenvalue in [1, 2) and two in [2, 7].
C = 3:

P0(3) = 1
P1(3) = −2
P2(3) = 4
P3(3) = 28
P4(3) = −4

∴ N(3) = 1, so there is one eigenvalue ≥ 3: one eigenvalue in [2, 3) and one
eigenvalue in [3, 7).
Let us locate the eigenvalue in [3, 7] a little better. Take C = mid point = 5:

P0(5) = 1
P1(5) = −4
P2(5) = 20
P3(5) = 4
P4(5) = −28

∴ N(5) = 1, so this eigenvalue is ≥ 5.

C = 6:

P0(6) = 1
P1(6) = −5
P2(6) = 31
P3(6) = −44
P4(6) = 101

∴ N(6) = 0, so no eigenvalue is ≥ 6; the eigenvalue is in [5, 6).

Each one of these locations can be further narrowed down by the bisection
process applied to each of these intervals.
4.4 TRIDIAGONALIZATION OF A REAL SYMMETRIC MATRIX
Let A = (aij) be a real symmetric n×n matrix. Our aim is to get a real symmetric
tridiagonal matrix T such that T is similar to A. The process of obtaining this T is called
the Givens–Householder scheme. The idea is to first find a reduction process which
annihilates the off-tridiagonal entries in the first row and first column of A, and then
repeatedly use this idea. We shall first see some preliminaries.
Let U = ( U1, U2, ....., Un )T be a real n×1 nonzero vector.
Then H = UUT is an n×n real symmetric matrix. Let α be a real number (which we
shall suitably choose) and consider

P = I − αH = I − αUUT ........... (I)
We shall choose α such that P is its own inverse. (Note that PT = P.) So we need

P2 = I

i.e.

(I − αH)(I − αH) = I

i.e.

(I − αUUT)(I − αUUT) = I

i.e.

α2 UUT UUT = 2α UUT

Obviously, we choose α ≠ 0, because otherwise we get P = I and we don't get any new
transformation. Hence we need

α UUT UUT = 2 UUT

But

UTU = U1² + U2² + ..... + Un² = ‖U‖²

and hence UUT UUT = U (UTU) UT = ‖U‖² UUT. Thus

α = 2 / (UTU) .............. (II)

is such that

P = PT = P−1 .............. (IV)
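With α = 2/(UTU), the matrix P = I − αUUT is symmetric and involutory, which is easy to check numerically (a sketch; the helper names `reflector` and `matmul` are ours):

```python
def reflector(u):
    """P = I - alpha * u u^T with alpha = 2 / (u^T u), as a list of rows.
    P is symmetric and its own inverse."""
    n = len(u)
    alpha = 2.0 / sum(x * x for x in u)
    return [[(1.0 if i == j else 0.0) - alpha * u[i] * u[j] for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    # plain triple-loop matrix product
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]
```

For any nonzero u, P² = I and PT = P; also PU = U − αU(UTU) = −U, i.e. P reflects U to −U.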
Now let A be a real symmetric matrix. Set

s² = a21² + a31² + ..... + an1²

(the sum of squares of the entries below the diagonal in the first column of A),
and let s = the nonnegative square root of s². Let

U = ( 0, a21 + s sgn a21, a31, ....., an1 )T .......... (VI)

Thus U is the same as the 1st column of A except that the 1st component is taken as 0 and
the second component is a variation of the second component in the 1st column of A. All
other components of U are the same as the corresponding components of the 1st column
of A.
Then

1/α = UTU / 2

    = [ (a21 + s sgn a21)² + a31² + a41² + ..... + an1² ] / 2

    = [ a21² + s² + 2s|a21| + a31² + ..... + an1² ] / 2

    = [ (a21² + a31² + ..... + an1²) + s² + 2s|a21| ] / 2

    = [ 2s² + 2s|a21| ] / 2        (since s² = a21² + a31² + ..... + an1²)

    = s² + s|a21|

∴ α = 1 / (s² + s|a21|) .......... (VII)

With this α and U, the matrix

P = I − αUUT

is symmetric and its own inverse, and

A2 = PAP

is similar to A and has the off-tridiagonal entries in its 1st row and 1st column as 0.
Now we apply this procedure to the matrix obtained by ignoring the 1st column and 1st row
of A2. Thus we take

s² = a32² + a42² + ..... + an2²

(i.e. s² is the sum of squares of the entries below the second diagonal entry of A2),

U = ( 0, 0, a32 + s sgn a32, a42, ....., an2 )T

α = 1 / (s² + s|a32|)

P = I − αUUT
Then

A3 = PA2P

has the off-tridiagonal entries in the 1st and 2nd rows and columns as zero. We proceed
similarly and annihilate all off-tridiagonal entries, obtaining T, real symmetric,
tridiagonal, and similar to A.
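The whole reduction can be sketched as follows (a sketch under the conventions above; rather than forming P explicitly, each step applies the algebraically equivalent symmetric rank-2 update PAP = A − UwT − wUT with v = αAU, γ = (α/2)UTv, w = v − γU, a rearrangement we introduce for compactness):

```python
import math

def tridiagonalize(A):
    """Householder reduction of a real symmetric matrix (list of rows) to a
    similar tridiagonal matrix. At step k, U is the k-th column with its
    first k+1 entries replaced by 0 except that entry k+1 becomes
    a_{k+1,k} + s*sgn(a_{k+1,k}); alpha = 1/(s^2 + s|a_{k+1,k}|) as in (VII)."""
    n = len(A)
    A = [row[:] for row in A]                     # work on a copy
    for k in range(n - 2):
        s2 = sum(A[i][k] ** 2 for i in range(k + 1, n))
        s = math.sqrt(s2)
        if s == 0.0:
            continue                              # column already tridiagonal
        t = A[k + 1][k]
        u = [0.0] * n
        u[k + 1] = t + s * (1.0 if t >= 0 else -1.0)
        for i in range(k + 2, n):
            u[i] = A[i][k]
        alpha = 1.0 / (s2 + s * abs(t))           # = 2 / (U^T U)
        v = [alpha * sum(A[i][j] * u[j] for j in range(n)) for i in range(n)]
        g = 0.5 * alpha * sum(u[i] * v[i] for i in range(n))
        w = [v[i] - g * u[i] for i in range(n)]
        for i in range(n):                         # A <- A - U w^T - w U^T  (= PAP)
            for j in range(n):
                A[i][j] -= u[i] * w[j] + w[i] * u[j]
    return A
```

On the worked example below this reproduces the tridiagonal matrix computed by hand.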
Example:

Let

A = [ 5  4  1  1 ]
    [ 4  5  1  1 ]
    [ 1  1  4  2 ]
    [ 1  1  2  4 ]

A is a real symmetric 4 × 4 matrix. Thus we get tridiagonalization after (4 − 2),
i.e. 2 steps.
Step 1:

s² = 4² + 1² + 1² = 18
s = √18 = 4.24264

α = 1 / (s² + s|a21|) = 1 / (18 + (4.24264)(4)) = 1 / 34.97056 = 0.02860

U = ( 0, a21 + s sgn a21, a31, a41 )T = ( 0, 4 + 4.24264, 1, 1 )T = ( 0, 8.24264, 1, 1 )T

P = I − αUUT = [ 1   0         0         0        ]
               [ 0  −0.94281  −0.23570  −0.23570  ]
               [ 0  −0.23570   0.97140  −0.02860  ]
               [ 0  −0.23570  −0.02860   0.97140  ]
A2 = PAP = [  5        −4.24264   0     0   ]
           [ −4.24264   6        −1    −1   ]
           [  0        −1         3.5   1.5 ]
           [  0        −1         1.5   3.5 ]
Step 2:

s² = (−1)² + (−1)² = 2
s = √2 = 1.41421

α = 1 / (s² + s|a32|) = 1 / (2 + (1.41421)(1)) = 1 / 3.41421 = 0.29289

U = ( 0, 0, a32 + s sgn a32, a42 )T = ( 0, 0, −1 − 1.41421, −1 )T = ( 0, 0, −2.41421, −1 )T

P = I − αUUT = [ 1  0   0         0        ]
               [ 0  1   0         0        ]
               [ 0  0  −0.70711  −0.70711  ]
               [ 0  0  −0.70711   0.70711  ]
A3 = PA2P = [  5        −4.24264   0        0 ]
            [ −4.24264   6         1.41421  0 ]
            [  0         1.41421   5        0 ]
            [  0         0         0        2 ]

which is tridiagonal.
Thus the Givens–Householder scheme for finding the eigenvalues involves two steps,
namely,

STEP 1: Reduce the given real symmetric matrix A to a similar real symmetric
tridiagonal matrix T (by the Householder transformations described above).

STEP 2: Find the eigenvalues of T (by the method of Sturm sequences and
bisection described earlier).
However, it must be mentioned that this method is used mostly to calculate the
eigenvalue of the largest modulus or to sharpen the calculations done by some other
method.
If one wants to calculate all the eigenvalues at the same time then one uses the
Jacobi iteration which we now describe.
4.5 JACOBI ITERATION FOR FINDING EIGENVALUES OF A REAL
SYMMETRIC MATRIX
Some Preliminaries:
Let A = [ a11  a12 ]
        [ a12  a22 ]

be a real symmetric matrix. Let

P = [ cos θ  −sin θ ]
    [ sin θ   cos θ ]

(where we choose |θ| ≤ π/4 for purposes of convergence of the scheme). Note

PT = [  cos θ  sin θ ]      and  PT P = P PT = I.
     [ −sin θ  cos θ ]

Now

A1 = PT A P

   = [  cos θ  sin θ ] [ a11  a12 ] [ cos θ  −sin θ ]
     [ −sin θ  cos θ ] [ a12  a22 ] [ sin θ   cos θ ]

   = [  cos θ  sin θ ] [ a11 cos θ + a12 sin θ   −a11 sin θ + a12 cos θ ]
     [ −sin θ  cos θ ] [ a12 cos θ + a22 sin θ   −a12 sin θ + a22 cos θ ]

   = [ a11 cos²θ + 2a12 sin θ cos θ + a22 sin²θ        (−a11 + a22) sin θ cos θ + a12 (cos²θ − sin²θ) ]
     [ (−a11 + a22) sin θ cos θ + a12 (cos²θ − sin²θ)   a11 sin²θ − 2a12 sin θ cos θ + a22 cos²θ      ]

We choose θ so that the off-diagonal entries of A1 vanish:

(−a11 + a22) sin θ cos θ + a12 (cos²θ − sin²θ) = 0 .......... (I)
(I) gives

((−a11 + a22)/2) sin 2θ + a12 cos 2θ = 0

⇒ a12 cos 2θ = ((a11 − a22)/2) sin 2θ

⇒ tan 2θ = 2a12 / (a11 − a22) = α/β , say .......... (II)

where

α = 2 a12 sgn (a11 − a22) .......... (III)
β = |a11 − a22| .......... (IV)

∴ sec² 2θ = 1 + tan² 2θ

          = 1 + α²/β²   from (II)

          = (α² + β²) / β²

∴ cos² 2θ = β² / (α² + β²)

∴ cos 2θ = β / √(α² + β²)  ⇒  2 cos²θ − 1 = β / √(α² + β²)
⇒ cos θ = √( (1/2)(1 + β/√(α² + β²)) ) ........... (V)

and

2 sin θ cos θ = sin 2θ = √(1 − cos² 2θ) = √(1 − β²/(α² + β²))

              = √(α²/(α² + β²)) = α / √(α² + β²)

∴ sin θ = α / (2 cos θ √(α² + β²)) ........... (VI)

Let

P = [ cos θ  −sin θ ]
    [ sin θ   cos θ ]

with these values of cos θ, sin θ; then A1 = PT A P has zero off-diagonal entries.
Now consider an n×n real symmetric matrix A. Let 1 ≤ q < p ≤ n. (Instead of the
(1, 2) position above we choose the (q, p) position.) Consider

α = 2 a_qp sgn (a_qq − a_pp) .......... (A)
β = |a_qq − a_pp| .......... (B)
cos θ = √( (1/2)(1 + β/√(α² + β²)) ) .......... (C)
sin θ = α / (2 cos θ √(α² + β²)) .......... (D)

and let P be the n×n identity matrix modified to have cos θ in the (q, q) and
(p, p) positions, −sin θ in the (q, p) position, and sin θ in the (p, q) position;
then A1 = Pt AP has the entries in (q, p) position and (p, q) position as zero.
In fact A1 differs from A only in qth row, pth row and qth column and pth column
and it can be shown that these new entries are
a¹qi = aqi cos θ + api sin θ
a¹pi = −aqi sin θ + api cos θ        i ≠ q, p  (qth row, pth row) .......... (E)

a¹iq = aiq cos θ + aip sin θ
a¹ip = −aiq sin θ + aip cos θ        i ≠ q, p  (qth column, pth column) .......... (F)
The Jacobi scheme is: find 1 ≤ q < p ≤ n such that |a_qp| is largest among the
absolute values of all the off-diagonal entries in A, apply the above
transformation, and replace the entries in the qth row, pth row, qth column,
pth column by those obtained from (E), (F), (G).
Example:

A = [ 7   3   2  1 ]
    [ 3   9  −2  4 ]
    [ 2  −2  −4  2 ]
    [ 1   4   2  3 ]
The off-diagonal entry of largest absolute value is a24 = 4. ∴ q = 2, p = 4.

α = 2 a_qp sgn (a_qq − a_pp) = (2)(4)(1) = 8
β = |a_qq − a_pp| = |9 − 3| = 6

∴ α² + β² = 100 ; √(α² + β²) = 10

∴ cos θ = √( (1/2)(1 + β/√(α² + β²)) ) = √( (1/2)(1 + 6/10) ) = √0.8 = 0.89442

sin θ = α / (2 cos θ √(α² + β²)) = 8 / (2(0.89442)(10)) = 0.44721

∴ P = [ 1  0        0  0        ]
      [ 0  0.89442  0  −0.44721 ]
      [ 0  0        1  0        ]
      [ 0  0.44721  0  0.89442  ]
A1 = PT A P will have a¹24 = a¹42 = 0.
Other entries that differ from those of A are a¹21, a¹22, a¹23 ; a¹41, a¹43, a¹44
(of course by symmetry the corresponding reflected entries also change).
We have,

∴ A1 = [  7         3.1305    2         −0.44721 ]
       [  3.1305    11       −0.89443    0.00000 ]
       [  2        −0.89443  −4          2.68328 ]
       [ −0.44721   0.00000   2.68328    1.00000 ]
The off-diagonal entry of largest absolute value in A1 is a12 = 3.1305. ∴ q = 1, p = 2.

β = |a_qq − a_pp| = |a11 − a22| = |7 − 11| = 4
α = 2 a_qp sgn (a_qq − a_pp) = 2(3.1305)(−1) = −6.2610

α² + β² = 55.200121 ; √(α² + β²) = 7.42968

cos θ = √( (1/2)(1 + β/√(α² + β²)) ) = 0.87704

sin θ = α / (2 cos θ √(α² + β²)) = −0.48043

Applying the transformation, a¹12 = a¹21 = 0, and the new matrix is
A2 = [  5.28516   0         2.18378  −0.39222 ]
     [  0         12.71484  0.17641  −0.21485 ]
     [  2.18378   0.17641  −4         2.68328 ]
     [ −0.39222  −0.21485   2.68328   1       ]

Continuing these rotations, the off-diagonal entries tend to zero and the
iterates converge to the diagonal matrix

[ 5.78305  0         0         0       ]
[ 0        12.71986  0         0       ]
[ 0        0        −5.60024   0       ]
[ 0        0         0         2.09733 ]

whose diagonal entries are the eigenvalues of A.
Note: At each stage, when we choose the (q, p) position and apply the above
transformation to get the new matrix A1, the sum of squares of the off-diagonal
entries of A1 will be less than that of A by 2a²qp.
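The iteration can be sketched as follows (a sketch; the rotation is always applied at the off-diagonal entry of largest absolute value, and the diagonal updates are the 2×2 formulas derived in the preliminaries; function names are ours):

```python
import math

def jacobi_rotate(A, q, p):
    """Zero A[q][p] (q < p) with the rotation given by (A)-(D);
    rows and columns q, p are updated via (E) and (F)."""
    n = len(A)
    beta = abs(A[q][q] - A[p][p])
    if beta == 0.0:
        c = s = math.sqrt(0.5)                    # theta = pi/4 when a_qq = a_pp
    else:
        alpha = 2.0 * A[q][p] * (1.0 if A[q][q] >= A[p][p] else -1.0)  # (A)
        r = math.hypot(alpha, beta)               # sqrt(alpha^2 + beta^2)
        c = math.sqrt(0.5 * (1.0 + beta / r))     # (C)
        s = alpha / (2.0 * c * r)                 # (D)
    for i in range(n):
        if i != q and i != p:
            aiq, aip = A[i][q], A[i][p]
            A[i][q] = A[q][i] = aiq * c + aip * s
            A[i][p] = A[p][i] = -aiq * s + aip * c
    aqq, app, aqp = A[q][q], A[p][p], A[q][p]
    A[q][q] = aqq * c * c + 2.0 * aqp * s * c + app * s * s
    A[p][p] = aqq * s * s - 2.0 * aqp * s * c + app * c * c
    A[q][p] = A[p][q] = 0.0

def jacobi_eigenvalues(A, max_rotations=200, tol=1e-12):
    A = [row[:] for row in A]
    n = len(A)
    for _ in range(max_rotations):
        # rotate at the off-diagonal entry of largest absolute value
        q, p = max(((i, j) for i in range(n) for j in range(i + 1, n)),
                   key=lambda ij: abs(A[ij[0]][ij[1]]))
        if abs(A[q][p]) < tol:
            break
        jacobi_rotate(A, q, p)
    return sorted(A[i][i] for i in range(n))
```

On the example above, this gives eigenvalues approximately −5.60024, 2.09733, 5.78305, 12.71986.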
4.6 THE QR DECOMPOSITION

Let A be a real n×n matrix with columns a(1), a(2), ....., a(n). We seek a matrix Q
with columns q(1), ....., q(n) satisfying

‖q(i)‖ = 1 for every i ................ (A)
(q(i), q(j)) = 0 if i ≠ j ................ (B)

and an upper triangular matrix R with columns

r(i) = ( r1i, r2i, ....., rii, 0, ....., 0 )T ................ (C)

We want A = QR. Note that the ith column of QR is

Q r(i) = r1i q(1) + r2i q(2) + ..... + rii q(i) ................ (D)

Now, a(1) = QR's first column = Q r(1) = r11 q(1) by (D), and hence, taking norms,

r11 = ‖a(1)‖ and q(1) = (1/r11) a(1) ................ (E)
Similarly,

a(2) = Q r(2) = r12 q(1) + r22 q(2) ................ (*)

Taking the inner product with q(1),

(a(2), q(1)) = r12 (q(1), q(1)) + r22 (q(2), q(1)) = r12

since (q(1), q(1)) = ‖q(1)‖² = 1 by (A) and (q(2), q(1)) = 0 by (B).

∴ r12 = (a(2), q(1)) ................ (F)

∴ (*) gives

r22 q(2) = a(2) − r12 q(1)

and ∴

r22 = ‖a(2) − r12 q(1)‖ ................ (G)

and

q(2) = (1/r22) [ a(2) − r12 q(1) ] ................ (H)

(F), (G), (H) give the 2nd columns of Q and R. We can proceed: having got the first
i − 1 columns of Q and R, we get the ith columns of Q and R as follows:

r1i = (a(i), q(1)) ; r2i = (a(i), q(2)) ; ............ ; ri−1,i = (a(i), q(i−1))

rii = ‖ a(i) − r1i q(1) − r2i q(2) − ....... − ri−1,i q(i−1) ‖

q(i) = (1/rii) [ a(i) − r1i q(1) − r2i q(2) − .......... − ri−1,i q(i−1) ]
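These column-by-column formulas are classical Gram–Schmidt, and can be sketched as (a sketch; the matrix-as-list-of-rows convention and the function name are ours):

```python
import math

def qr_gram_schmidt(A):
    """Classical Gram-Schmidt QR factorization, following (F), (G), (H) above.
    Returns (Q, R) as lists of rows with A = Q R and the columns of Q orthonormal."""
    n = len(A)
    Q = [[0.0] * n for _ in range(n)]
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):                       # build the j-th columns of Q and R
        v = [A[i][j] for i in range(n)]      # v starts as a(j)
        for i in range(j):
            R[i][j] = sum(A[k][j] * Q[k][i] for k in range(n))   # r_ij = (a(j), q(i))
            v = [v[k] - R[i][j] * Q[k][i] for k in range(n)]
        R[j][j] = math.sqrt(sum(x * x for x in v))               # r_jj = ||v||
        for k in range(n):
            Q[k][j] = v[k] / R[j][j]
    return Q, R
```

On the example below this reproduces r11 = √2, r22 = √3, r33 = √(2/3) and the columns of Q computed by hand.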
Example:

A = [ 1  2  1 ]
    [ 1  0  1 ]
    [ 0  1  1 ]

r11 = ‖a(1)‖ = √(1² + 1²) = √2

q(1) = (1/r11) a(1) = ( 1/√2, 1/√2, 0 )T

r12 = (a(2), q(1)) = 2/√2 = √2

a(2) − r12 q(1) = ( 2, 0, 1 )T − ( 1, 1, 0 )T = ( 1, −1, 1 )T

r22 = ‖a(2) − r12 q(1)‖ = √3

q(2) = (1/r22) [ a(2) − r12 q(1) ] = ( 1/√3, −1/√3, 1/√3 )T
r13 = (a(3), q(1)) = 2/√2 = √2

r23 = (a(3), q(2)) = 1/√3

a(3) − r13 q(1) − r23 q(2) = ( 1, 1, 1 )T − ( 1, 1, 0 )T − ( 1/3, −1/3, 1/3 )T
                           = ( −1/3, 1/3, 2/3 )T

r33 = √( 1/9 + 1/9 + 4/9 ) = √(2/3) = √2/√3

and

q(3) = (1/r33) [ a(3) − r13 q(1) − r23 q(2) ] = ( −1/√6, 1/√6, 2/√6 )T
∴ Q = [ 1/√2   1/√3  −1/√6 ]        R = [ √2  √2  √2    ]
      [ 1/√2  −1/√3   1/√6 ]            [ 0   √3  1/√3  ]
      [ 0      1/√3   2/√6 ]            [ 0   0   √2/√3 ]

and

QR = [ 1  2  1 ]
     [ 1  0  1 ]  = A
     [ 0  1  1 ]

giving us the QR decomposition of A.
The QR algorithm

Set A1 = A and factorize A1 = Q1 R1 as above; then set

A2 = R1 Q1

and factorize A2 = Q2 R2; set A3 = R2 Q2; and in general the ith step is

Ai = Ri−1 Qi−1 ;   Ai = Qi Ri

Then Ai 'converges' to an upper triangular matrix exhibiting the eigenvalues of
A along the diagonal.
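The iteration can be sketched as follows (a sketch; it reuses a small Gram–Schmidt factorization, and the convergence claim assumes eigenvalues of distinct moduli, as for the unshifted QR algorithm generally):

```python
import math

def qr_decompose(A):
    # classical Gram-Schmidt on the columns of A, as in the previous section
    n = len(A)
    Q = [[0.0] * n for _ in range(n)]
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = [A[i][j] for i in range(n)]
        for i in range(j):
            R[i][j] = sum(A[k][j] * Q[k][i] for k in range(n))
            v = [v[k] - R[i][j] * Q[k][i] for k in range(n)]
        R[j][j] = math.sqrt(sum(x * x for x in v))
        for k in range(n):
            Q[k][j] = v[k] / R[j][j]
    return Q, R

def qr_algorithm(A, iters=200):
    """A_1 = A; factor A_i = Q_i R_i and set A_{i+1} = R_i Q_i.
    The diagonal of the final iterate approximates the eigenvalues."""
    A = [row[:] for row in A]
    n = len(A)
    for _ in range(iters):
        Q, R = qr_decompose(A)
        A = [[sum(R[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]                   # A <- R Q
    return [A[i][i] for i in range(n)]
```

For instance, on the symmetric matrix [[2, 1], [1, 2]] (eigenvalues 3 and 1) the iterates converge to diag(3, 1).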