
Block 5: Basics of Multivariate Normal (MVN)

UNIT 1 RECAP OF LINEAR ALGEBRA


Structure
1.1 Introduction
    Objectives
1.2 Real Symmetric Matrices
    Spectral Decomposition Theorem
1.3 Positive Definite and Nonnegative Definite Matrices
    Square Root Method
1.4 Idempotent Matrices
1.5 Cochran's Theorem
1.6 Singular Value Decomposition
1.7 Summary
1.8 Reference
1.9 Solutions / Answers

1.1 INTRODUCTION
In this unit, we recapitulate certain concepts and results which will be useful in the
study of multivariate statistical analysis. We start with the study of real symmetric
matrices and the associated quadratic forms. We define a classification for
quadratic forms and develop a method for determining the class to which a given
quadratic form belongs. Positive definite and nonnegative definite matrices (which
we shall meet in Unit 2 as dispersion matrices) are very important for the study of
multivariate distributions and, in particular, the multivariate normal distribution. In
Section 1.3 we obtain some characterizations of positive definite and nonnegative
definite matrices and study some of their useful properties. We give a method of
computing a square root of a matrix, which plays an important role in transforming
correlated random variables to uncorrelated random variables, as we shall see in
Unit 2. Idempotent matrices and Cochran's theorem play a key role in the distribution
of quadratic forms in independent standard normal variables, particularly in
establishing when quadratic forms are distributed as independent chi-squares. In
Sections 1.4 and 1.5 we study the properties of idempotent matrices and prove the
algebraic version of Cochran's theorem respectively. The singular value decomposition
plays a very important role in developing the theory and studying the properties of
canonical correlations between two random vectors. In Section 1.6 we study the
singular value decomposition.

We use the following notations. Matrices are denoted by boldface capital letters like
A, B, C. Vectors are denoted by boldface italic lower case letters like x, y, z. Scalars
are denoted by lower case letters like a, b, c. The transpose, rank and trace of a
matrix A are denoted by At, rank A and tr(A) respectively.
Rn denotes the n-dimensional Euclidean space.
Objectives:
After completing this unit, you should be able to:
- determine the definiteness of a given quadratic form;
- apply the spectral decomposition in the study of principal components;
- compute a triangular square root of a positive definite matrix;
- apply the properties of positive definite and nonnegative definite matrices to the
  problems that will be studied in Unit 2 on the multivariate normal distribution;
- apply Cochran's theorem to the distribution of quadratic forms in normal
  variables;
- apply the singular value decomposition in the development of canonical
  correlations.

1.2 REAL SYMMETRIC MATRICES


Real symmetric matrices play a very important role in the study of multivariate
statistical analysis. For example, variance-covariance matrices are real symmetric
matrices. They also play a crucial role in the distribution of quadratic forms in
correlated normal random variables. We shall denote the (i, j)th element of a matrix
A by aij. Then we write A = ((aij)).
Definition:
A square matrix A of order n x n is said to be a real symmetric matrix
if (i) all the elements of A are real and (ii) aij = aji for i, j = 1, …, n.
Example 1. Which of the following are real symmetric matrices?

(i)  [ 2  i ]      (ii)  [ 1  2  3 ]      (iii)  [ 3  3 ]      (iv)  [ 2  4 ]
     [ i  1 ]            [ 4  1  3 ]             [ 2  1 ]            [ 4  5 ]

Solution:
(i) is not a real symmetric matrix because not all of its elements are real.
(ii) is not a real symmetric matrix because it is not a square matrix.
(iii) is not a real symmetric matrix because the (1, 2)th element 3 ≠ 2 = the (2, 1)th element.
(iv) is a real symmetric matrix because (a) it is a square matrix (of order 2 x 2), (b) all
its elements are real and (c) the (1, 2)th element 4 = 4 = the (2, 1)th element.
In Section 14.3 of MTE-02 we noticed that there is a unique real symmetric matrix A
associated with a given real quadratic form Q(x) in the sense that Q(x) = xtAx. This
matrix is called the matrix of the quadratic form Q(x).
Example 2. Find the matrix of the quadratic form in three variables x1, x2 and x3:

Q(x) = 2x1x2 + 5x1x3 + 3x2x3.

Solution: Since there are three variables x1, x2 and x3, x = (x1, x2, x3)t. Let A be the
symmetric matrix such that Q(x) = xtAx. Then A is of order 3 x 3. Further, a21 = a12 =
(1/2)(coefficient of x1x2) = 1. In general, whenever i ≠ j, aij = aji = (1/2)(coefficient of
xixj). Also aii = the coefficient of xi², i = 1, 2, 3. Thus

A = [ 0    1    2.5 ]
    [ 1    0    1.5 ]
    [ 2.5  1.5  0   ]
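As a quick numerical illustration of this rule, here is a minimal sketch in Python with
numpy (the function name and the coefficient encoding are our own, chosen purely for
illustration; they are not part of the original unit):

    import numpy as np

    # Build the matrix of a quadratic form from its coefficients:
    # sq[i] is the coefficient of xi^2 and cross[(i, j)] (i < j,
    # 0-based indices) is the coefficient of xi*xj. Off-diagonal
    # entries get half the cross coefficient, as in Example 2.
    def quad_form_matrix(n, sq, cross):
        A = np.zeros((n, n))
        for i, c in sq.items():
            A[i, i] = c
        for (i, j), c in cross.items():
            A[i, j] = A[j, i] = c / 2.0
        return A

    # The form of Example 2: Q(x) = 2*x1*x2 + 5*x1*x3 + 3*x2*x3
    A = quad_form_matrix(3, {}, {(0, 1): 2, (0, 2): 5, (1, 2): 3})
    x = np.array([1.0, 2.0, 3.0])
    assert np.isclose(x @ A @ x, 2*1*2 + 5*1*3 + 3*2*3)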

We are now ready for an exercise.


E 1. Find the matrices of the following quadratic forms:
(i) x1² - x2²
(ii) 2x1² + 3x1x2 + 5x2²
(iii) 3x1x2 + 5x2x3 - 4x1x3
(iv) x1² + x2² + x4² (in four variables x1, x2, x3 and x4.)
Depending upon its range, every non-null quadratic form Q(x) can be classified into
one of the following mutually exclusive and collectively exhaustive classes:
(a) positive definite (pd) if Q(x) > 0 for all x ≠ 0
(b) positive semidefinite (psd) if (i) Q(x) ≥ 0 for all x and (ii) Q(x) = 0 for some x ≠ 0
(c) negative definite (nd) if Q(x) < 0 for all x ≠ 0 (i.e., if -Q(x) is positive definite)
(d) negative semidefinite (nsd) if (i) Q(x) ≤ 0 for all x and (ii) Q(x) = 0 for some
x ≠ 0 (i.e., if -Q(x) is positive semidefinite)
(e) indefinite if it does not belong to any one of the above classes (a)-(d) (i.e.,
there exist x and y such that Q(x) > 0 and Q(y) < 0).
The quadratic form Q(x) ≡ 0 can be classified into either of (b) and (d).
Example 3: Classify each of the following quadratic forms using the above
classification (this is also called identifying the definiteness of the quadratic
form). Also write down the matrices of the respective quadratic forms.
(i) x1² - x2²
(ii) x1² + x2²
(iii) x1² + x2² + 2x4² (in four variables x1, x2, x3 and x4)
(iv) -x1² - x2²
(v) -x1² - x2² - 2x4² (again in four variables)

Solution:
(i) x1² - x2² = 1 if x1 = 1 and x2 = 0. Again x1² - x2² = -1 if x1 = 0 and x2 = 1.
Thus it is indefinite. The matrix of the quadratic form is

[ 1   0 ]
[ 0  -1 ]

(ii) x1² + x2² > 0 whenever at least one of x1 and x2 is not zero. Hence this quadratic
form is positive definite. The matrix of the quadratic form is

[ 1  0 ]
[ 0  1 ]

the identity matrix.

(iii) Q(x) = x1² + x2² + 2x4² ≥ 0 for all values of x1, x2, x3 and x4. However for x3 = 1 and
x1 = x2 = x4 = 0, the value of x1² + x2² + 2x4² is 0. Thus there is a vector
x = (0, 0, 1, 0)t ≠ 0 such that Q(x) = 0. Hence this quadratic form is positive semi-definite.
The matrix of the quadratic form is

[ 1  0  0  0 ]
[ 0  1  0  0 ]
[ 0  0  0  0 ]
[ 0  0  0  2 ]

We leave it to you to show that the quadratic forms in (iv) and (v) are negative
definite and negative semi-definite respectively. (You can use the quadratic forms in
(ii) and (iii) to arrive at the above conclusions and in writing down the matrices of the
quadratic forms in (iv) and (v).)
In the above example, we considered quadratic forms whose matrices are diagonal
matrices. Here it is easy to identify the definiteness of the quadratic form. In fact, if

Q(x) = λ1x1² + λ2x2² + … + λnxn²

is a quadratic form in n variables x1, …, xn, then Q(x) is p.d., p.s.d., n.d., n.s.d. or
indefinite according as λi > 0 for all i; λi ≥ 0 for all i and λj = 0 for some j;
λi < 0 for all i; λi ≤ 0 for all i and λj = 0 for some j; and λi > 0 for some i and λj < 0
for some j, respectively.
What if we have a quadratic form Q(x) = 2x1² + 3x1x2 + x2² or Q(x) =
2x1² + x2² + x3² + 3x1x2 + 2x1x3 + 4x2x3? (Notice that the matrices of these quadratic
forms are

[ 2    1.5 ]        [ 2    1.5  1 ]
[ 1.5  1   ]   and  [ 1.5  1    2 ]
                    [ 1    2    1 ]

respectively, and are not diagonal matrices.)
In general, consider a quadratic form Q(x) = xtAx where A is not a diagonal matrix.
How do we determine the definiteness of the quadratic form in such a case? The
following results will be useful towards that end.

Theorem 1. Consider a quadratic form Q(x) = xtAx where A is symmetric. Make a
nonsingular linear transformation of the variables y = Tx (where T is nonsingular) and
call the transformed quadratic form Φ(y) (= yt(T⁻¹)tAT⁻¹y). Then the ranges of
Q(x) and Φ(y) are the same.
Proof: Let α belong to the range of Q(x). So there is a vector x0 such that α = Q(x0) =
x0tAx0. Write y0 = Tx0. Now α = x0tAx0 = x0tTt(T⁻¹)tAT⁻¹Tx0 = y0t(T⁻¹)tAT⁻¹y0 =
Φ(y0). Hence α belongs to the range of Φ(y). Thus the range of Q(x) is a subset of the
range of Φ(y). Since T is nonsingular, by reversing the arguments we can show that
the range of Φ(y) is a subset of the range of Q(x).
What we are saying through the Theorem 1 is that the range of a quadratic form is
invariant under nonsingular linear transformations. Thus the definiteness of a
quadratic form is invariant under nonsingular linear transformations. (Making a
nonsingular linear transformation can also be interpreted as changing the basis as was
done in section 14.4 of MTE-02.)
Recall that a real square matrix S is called an orthogonal matrix if St = S-1. It is easy
to see that if S and T are orthogonal matrices of the same order, then so is ST, for,
Tt St S T = Tt I T = I. Similarly S T Tt St = I. Hence (ST)t = Tt St is the inverse of
ST. Also, it is easy to verify that

[ 1  0t ]
[ 0  T  ]

is an orthogonal matrix if T is an orthogonal matrix.

Our object is to determine the definiteness of a quadratic form Q(x), the matrix of
which is not necessarily diagonal. We shall now show that we can make an
orthogonal transformation of the variables (i.e. we can make a transformation y = Px
where P is an orthogonal matrix) such that under this transformation, the quadratic
form is transformed into a quadratic form λ1y1² + … + λnyn². Since we know how to
determine the definiteness of λ1y1² + … + λnyn², and since its definiteness is the
same as that of Q(x), we have the definiteness of Q(x).
If A is a real matrix, then it is not necessary that its eigen values are real. For
example if

A = [ 0   1 ]
    [ -1  0 ]

then the eigen values are i and -i. However, if A is real
and symmetric then all its eigen values are real, as shown below.
Theorem 2. Let A be a real symmetric matrix. All the eigen values of A are real and
all the eigen vectors of A can be chosen to be real.
Proof: Let α + iβ be an eigen value of A and let the corresponding eigen vector be
x + iy, where α and β are real numbers and x and y are real vectors. Clearly at least one
of x and y is nonnull as x + iy, being an eigen vector, is nonnull. Now,
A(x + iy) = (α + iβ)(x + iy)
Equating the real parts on both sides and the imaginary parts on both sides of the
above equality we get

Ax = αx - βy    (2.1)
Ay = αy + βx    (2.2)

Premultiplying (2.1) by yt and (2.2) by xt, we get

ytAx = αytx - βyty    (2.3)
xtAy = αxty + βxtx    (2.4)

Since A is symmetric and ytAx is a scalar we have ytAx = (ytAx)t = xtAty = xtAy.

Similarly ytx = xty. Now subtracting (2.3) from (2.4) we get

β(xtx + yty) = 0. Since at least one of x and y is non-null, xtx + yty ≠ 0.

Hence β = 0, and all the eigen values of A are real. Further, A(x + iy) = α(x + iy) yields
Ax = αx and Ay = αy. Since at least one of x and y is non-null and x and y are real, we
can choose the non-null vector between x and y as an eigen vector of A corresponding
to α.
Now we are ready to prove an important result concerning the real symmetric
matrices, namely, the spectral decomposition theorem.
Theorem 3. Let A be a real symmetric matrix of order n x n. Then there exists a real
orthogonal matrix P of order n x n such that A = PΛPt where Λ is a real diagonal
matrix.
Proof: We shall prove the theorem by induction on n. Let A be a 1 x 1 real symmetric
matrix, i.e. A = a where a is a real number. Clearly 1t.a.1 = 1.a.1 = a. Also 1 is an
orthogonal matrix of order 1 x 1 since 1t.1 = 1.1 = 1. So the theorem is true for n = 1.
Let the theorem be true for n = m (a positive integer ≥ 1). Let A be a real symmetric
matrix of order (m+1) x (m+1). Let x1 be a normalized eigen vector of A corresponding
to an eigen value λ1. Then Ax1 = λ1x1. Now x1 can be extended to an orthonormal basis
x1, …, xm+1 of Rm+1 (see Unit 12 of MTE-02). Write R = (x1 : … : xm+1). Clearly R is an
orthogonal matrix. Now

AR = A(x1 : … : xm+1) = (x1 : … : xm+1) [ λ1  b12t ]  =  R [ λ1  b12t ]
                                        [ 0   B22  ]       [ 0   B22  ]

where 0, b12t and B22 are of orders m x 1, 1 x m and m x m respectively. [This is so
because Ax2, …, Axm+1, being vectors in Rm+1, and x1, …, xm+1 forming a basis of
Rm+1, each Axi is a linear combination of x1, …, xm+1.]

So RtAR = [ λ1  b12t ]
          [ 0   B22  ]

Since RtAR is real and symmetric, it follows that b12 = 0 and B22 is an m x m real
symmetric matrix. Thus

RtAR = [ λ1  0t  ]
       [ 0   B22 ]

By the induction hypothesis there exists an orthogonal matrix S1 of order m x m such
that B22 = S1Λ1S1t where Λ1 is a real diagonal matrix. Writing

S = [ 1  0t ]
    [ 0  S1 ]

we notice that S is an orthogonal matrix and

RtAR = S [ λ1  0t ] St    or    A = RS [ λ1  0t ] St Rt
         [ 0   Λ1 ]                    [ 0   Λ1 ]

Write P = RS. Since R and S are orthogonal matrices so is P, as noticed earlier.
Writing

Λ = [ λ1  0t ]
    [ 0   Λ1 ]

we observe that Λ is a real diagonal matrix and A = PΛPt. Thus the theorem is true for
n = m+1.

Hence the theorem follows by induction on n.
The beauty of theorem 3 lies in its interpretation. Let A be a real symmetric
matrix and let A = PΛPt where P is orthogonal and Λ is a real diagonal matrix. We
then have

AP = PΛ

or

A(p1 : … : pn) = (p1 : … : pn) diag(λ1, …, λn)

or

Api = λipi, i = 1, …, n.

Since pi is a vector in an orthonormal basis, pi is a non-null vector of unit norm.
Hence λi is an eigen value of A and pi is an eigen vector of A corresponding to the
eigen value λi. Thus the diagonal elements of Λ are the eigen values of A and the
columns of P are the orthonormal eigen vectors of A. Further

A = PΛPt = (p1 : … : pn) diag(λ1, …, λn) (p1 : … : pn)t = λ1p1p1t + … + λnpnpnt

Write Ei = pipit, i = 1, …, n. Then

EiEj = Ei if i = j and 0 otherwise,

and rank Ei = rank pipit = rank pi = 1.

Thus we are able to write A = λ1E1 + … + λnEn where E1, …, En are symmetric
idempotent matrices of rank 1 such that EiEj = 0 whenever i ≠ j. The set {λ1, …, λn} is
called the spectrum of A. Since the decomposition mentioned above involves the
spectrum and the eigen vectors it is called a spectral decomposition of A.
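The decomposition is easy to verify numerically. The following is a minimal sketch in
Python with numpy (numpy's eigh routine for symmetric matrices plays the role of
theorem 3 here; the test matrix is the one used in Example 4 below):

    import numpy as np

    A = np.array([[4.0, 1.0],
                  [1.0, 2.0]])      # a real symmetric matrix

    lam, P = np.linalg.eigh(A)      # eigenvalues and orthonormal eigenvectors
    assert np.allclose(P @ np.diag(lam) @ P.T, A)   # A = P Lambda P^t

    # Rebuild A from the rank-one terms E_i = p_i p_i^t.
    E = [np.outer(P[:, i], P[:, i]) for i in range(A.shape[0])]
    assert np.allclose(sum(l * Ei for l, Ei in zip(lam, E)), A)
    assert np.allclose(E[0] @ E[1], 0)              # E_i E_j = 0 for i != j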

Example 4. Obtain a spectral decomposition of the matrix

A = [ 4  1 ]
    [ 1  2 ]

Solution: The characteristic equation of the matrix is

| 4-λ   1   |
| 1     2-λ | = 0

or (4 - λ)(2 - λ) - 1 = 0
or λ² - 6λ + 7 = 0

The roots are λ = (6 ± √(36 - 28))/2 = 3 ± √2.

So the eigen values are 3 + √2 and 3 - √2.

Let u = (u1, u2)t be the eigen vector of the given matrix corresponding to the eigen
value 3 + √2. Then [A - (3 + √2)I]u = 0, or

[ 1-√2   1     ] [ u1 ]   [ 0 ]
[ 1      -1-√2 ] [ u2 ] = [ 0 ]

Notice that the second column of A - (3 + √2)I is -(1 + √2) times the first column.

So u1 = 1 + √2 and u2 = 1 satisfy the equation [A - (3 + √2)I]u = 0.

To normalize u, we divide it by its norm, namely √(4 + 2√2). Thus the normalized
eigen vector corresponding to the eigen value 3 + √2 is

(1/√(4 + 2√2)) (1 + √2, 1)t.

It can be shown similarly that the normalized eigen vector corresponding to 3 - √2 is

(1/√(4 - 2√2)) (1 - √2, 1)t.

Hence A = PΛPt where

P = [ (1+√2)/√(4+2√2)   (1-√2)/√(4-2√2) ]        Λ = [ 3+√2   0    ]
    [ 1/√(4+2√2)        1/√(4-2√2)      ]            [ 0      3-√2 ]

Using theorem 3, we can determine the definiteness of a quadratic form. Consider the
quadratic form Q(x) = xtAx. Let A = PΛPt be a spectral decomposition of A. Then
Q(x) = xtPΛPtx = ytΛy where y = Ptx. Since P is nonsingular (in fact, orthogonal) the
definiteness of Q(x) is the same as the definiteness of ytΛy. The definiteness of ytΛy
is determined by the diagonal elements λ1, …, λn of Λ, which are the eigen values of A.
Thus xtAx is

- positive definite if λi > 0 for all i
- positive semidefinite if λi ≥ 0 for all i and λj = 0 for some j
- negative definite if λi < 0 for all i
- negative semidefinite if λi ≤ 0 for all i and λj = 0 for some j
- indefinite if λi > 0 for some i and λj < 0 for some j

Because of the one-one correspondence between real symmetric matrices and
quadratic forms, we call a real symmetric matrix A positive definite, positive
semidefinite, negative definite, negative semidefinite or indefinite according as the
quadratic form xtAx is positive definite, positive semidefinite, negative definite,
negative semidefinite or indefinite respectively.
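The eigenvalue test above is straightforward to mechanize. Here is a minimal sketch in
Python with numpy (the tolerance is a practical choice of ours, not part of the theory):

    import numpy as np

    def definiteness(A, tol=1e-10):
        lam = np.linalg.eigvalsh(A)          # eigenvalues of a symmetric A
        if np.all(lam > tol):
            return "positive definite"
        if np.all(lam >= -tol):
            return "positive semidefinite"   # nonnegative, with some eigenvalue ~0
        if np.all(lam < -tol):
            return "negative definite"
        if np.all(lam <= tol):
            return "negative semidefinite"
        return "indefinite"

    print(definiteness(np.array([[2.0, 0.5], [0.5, 1.0]])))   # pd, as in Example 5(i)
    print(definiteness(np.array([[1.0, 0.0], [0.0, -1.0]])))  # indefinite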
Example 5. Determine the definiteness of the quadratic forms (i) 2x1² + x1x2 + x2² and
(ii) x1² + x2² + x3² - 3x1x2 - 3x1x3 - 3x2x3.

Solution: (i) The matrix of the quadratic form is

A = [ 2    0.5 ]
    [ 0.5  1   ]

We see that the characteristic equation |A - λI| = 0 is

(2 - λ)(1 - λ) - 0.25 = 0 or λ² - 3λ + 1.75 = 0

Hence the eigen values, which are the roots of the above equation, are
(3 ± √(9 - 7))/2, i.e. (3 + √2)/2 and (3 - √2)/2, which are both positive.

Hence the quadratic form is positive definite.

(ii) The matrix of the quadratic form is

A = [ 1     -1.5  -1.5 ]
    [ -1.5  1     -1.5 ]
    [ -1.5  -1.5  1    ]

It is easy to notice that the sum of each row in A is -2.

Hence A(1, 1, 1)t = -2(1, 1, 1)t. Thus -2 is an eigen value of A. Further, the sum of the
eigen values, which is the same as the trace of A, is 3. Hence there must be at least one
positive eigen value of A. So the quadratic form is indefinite.
E2. Let A be a real symmetric matrix, a diagonal element of which is negative.
Show that A cannot be positive definite or positive semidefinite.

E3. Determine the definiteness of the following quadratic forms:
(i) x1² + 5x1x2 + 7x2²
(ii) x1² - x2² + x3² + x1x2 + 10x1x3 + 2x2x3
(iii) 2x1² + 3x2² + 4x3² + 6x1x2

E4. Let A = [ 2  1 ]
            [ 1  2 ]

Obtain the spectral decomposition of A. Hence write down A^100.

1.3 POSITIVE DEFINITE AND NONNEGATIVE DEFINITE MATRICES


In the previous section, we noted that the definiteness of a quadratic form is also
attributed to the matrix of the quadratic form. Thus, if xtAx is positive definite where
A is a real symmetric matrix, then we call A as a positive definite (pd) matrix. A real
symmetrix matrix is called a nonnegative definite (nnd) matrix if it is either pd or psd
i.e., if xtAx 0 for all x. Positive definite (pd) and nonnegative definite (nnd)
matrices play a very important role in the multivariate analysis. For example, we
shall see in the next unit that the variance-covariance matrix of a random vector is
nnd. Unless stated otherwise, when we say a matrix is pd, psd, nnd, nd, nsd we mean
that the matrix is real and symmetric. We may not state this fact explicitly each time.
In this section we study several properties of positive definite and nonnegative
definite matrices. We shall also give an easy way to construct positive definite
matrices and orthogonal matrices of order n x n. Let us start with the following very
useful.
Theorem 4. (a) A matrix A is positive definite if and only if A = BBt for some
nonsingular matrix B.
(b) A matrix A is nonnegative definite if and only if A = CCt for some
matrix C.
Proof. (a) If part. Let A = BBt for some nonsingular matrix B. Let x be a nonnull
vector. Then

xtAx = xtBBtx = yty = y1² + … + yn² ≥ 0 where y = Btx.

Since B is nonsingular, so is Bt.
Let, if possible, y = Btx = 0. Then x = (Bt)⁻¹y = 0.
Since x ≠ 0, there is a contradiction. So y ≠ 0.
Hence xtAx = yty > 0.
The choice of x being arbitrary, it follows that A is positive definite.
Only if part. Let A be positive definite. Then all its eigen values are strictly positive.
Let A = PΛPt be a spectral decomposition of A. Let √λ1, …, √λn be the positive square
roots of λ1, …, λn respectively. Write Λ^(1/2) = diag(√λ1, …, √λn).

Then B = PΛ^(1/2)Pt is symmetric and BBt = PΛ^(1/2)PtPΛ^(1/2)Pt = PΛ^(1/2)Λ^(1/2)Pt = PΛPt = A.

Further, since λ1, …, λn are strictly positive, so are √λ1, …, √λn. Now |B| = |PΛ^(1/2)Pt|
= |P||Λ^(1/2)||Pt| = |PPt||Λ^(1/2)| = |I||Λ^(1/2)| = √λ1 ⋯ √λn > 0. Hence B is nonsingular. Notice
that B = PΛ^(1/2)Pt is a spectral decomposition of B. Now since the diagonal elements of
Λ^(1/2) are strictly positive, it follows that B is pd. In fact, we have proved that if A is pd
then A = BBt for some symmetric pd matrix B.
(b) It suffices to prove the statement for positive semidefinite matrices, as an nnd
matrix is either pd or psd. (For pd matrices we already proved the statement in
(a).)
If part. Let A = CCt. Then xtAx = xtCCtx = utu ≥ 0 where u = Ctx. Hence A is
nnd.
Only if part. Notice that since A is psd all its eigen values are nonnegative. Let λ1 ≥
λ2 ≥ … ≥ λr > λr+1 = … = λn = 0 be the eigen values of A. We can write a spectral
decomposition of A as

A = P [ Λ1  0 ] Pt
      [ 0   0 ]

where P is an orthogonal matrix and Λ1 is a diagonal pd matrix of order r x r. Write
Λ1^(1/2) = diag(√λ1, …, √λr). Then

C = P [ Λ1^(1/2)  0 ] Pt
      [ 0         0 ]

is symmetric and CCt = A.

A matrix B such that A = BBt is called a square root of A. Given A, B is not unique,
since A = (BP)(BP)t for any orthogonal matrix P. In theorem 4 we gave a method
of computing a square root if we know the spectral decomposition of A. However,
obtaining a spectral decomposition is not easy in general. We give below a method of
obtaining a square root of a positive definite matrix. Let us start with an example.
Example 6. Obtain a square root of the positive definite matrix

A = [ 4  1  2 ]
    [ 1  3  1 ]
    [ 2  1  5 ]

Solution. We shall obtain a lower triangular matrix B such that A = BBt. Write

B = [ b11  0    0   ]
    [ b21  b22  0   ]
    [ b31  b32  b33 ]

We shall solve for bij, j = 1, …, i, i = 1, 2, 3 such that A = BBt. Write

[ 4  1  2 ]   [ b11  0    0   ] [ b11  b21  b31 ]
[ 1  3  1 ] = [ b21  b22  0   ] [ 0    b22  b32 ]
[ 2  1  5 ]   [ b31  b32  b33 ] [ 0    0    b33 ]

Equating the elements on both sides of the equality sign,

a11 = 4 = b11², or b11 = 2 (you can choose either +2 or -2, but choose and fix one of them);

a12 = 1 = b11b21, so b21 = 1/b11 = 1/2;

a13 = 2 = b11b31, so b31 = 1;

a22 = 3 = b21² + b22², or b22² = 3 - 1/4 = 11/4, so b22 = √11/2;

a23 = 1 = b21b31 + b22b32, or b22b32 = 1 - 1/2 = 1/2, so b32 = 1/√11;

a33 = 5 = b31² + b32² + b33², or b33² = 5 - 1 - 1/11 = 43/11, so b33 = √(43/11).

Thus

B = [ 2    0      0        ]
    [ 1/2  √11/2  0        ]
    [ 1    1/√11  √(43/11) ]

is a square root of A.

For the given matrix A in example 6, we could obtain a lower triangular matrix B
such that A = BBt. Can we always do this? Let us examine how we went about
solving for the elements of B. First we solved for the first column of B, then for the
second column and so on. Also observe that each time we just had to solve one
equation in one unknown to obtain the elements of B. Could there have been some
hitch? What if the computed value for b22², or later for b33², turns out to be negative?
If it happens to be so, we would not be able to solve for B. It can be shown
(which is beyond the scope of the present notes) that if A is positive definite then the
above situation never arises. (For a proof see Rao and Bhimasankaram (2000), pages
358-359.) There is also another method of obtaining a triangular square root using
elementary row operations (see Rao and Bhimasankaram (2000), pages 361-363).
A square root of a positive definite matrix is useful in transforming correlated random
variables to uncorrelated random variables, as we shall see in unit 2.
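The column-by-column procedure of Example 6 is exactly the Cholesky factorization,
which numpy provides directly. Here is a minimal sketch checking the factor computed
above:

    import numpy as np

    A = np.array([[4.0, 1.0, 2.0],
                  [1.0, 3.0, 1.0],
                  [2.0, 1.0, 5.0]])   # the pd matrix of Example 6

    B = np.linalg.cholesky(A)         # lower triangular, A = B B^t
    assert np.allclose(B @ B.T, A)
    assert np.isclose(B[1, 1], np.sqrt(11) / 2)    # matches b22 above
    assert np.isclose(B[2, 2], np.sqrt(43 / 11))   # matches b33 above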
E 5. Compute a lower triangular square root in each of the following cases.

(i) [ 4  1 ]        (ii) [ 9  3  3 ]
    [ 1  2 ]             [ 3  5  1 ]
                         [ 3  1  6 ]

Example 7. Let

A = [ A11   A12 ]
    [ A12t  A22 ]

be an n x n positive definite matrix where A11 and A22 are square matrices of order
r x r and (n-r) x (n-r) respectively, for some r (1 ≤ r ≤ n-1). Show that A11 is
positive definite.
Solution: Let x be a nonnull r x 1 vector. Then

xtA11x = (xt : 0t) [ A11   A12 ] [ x ]
                   [ A12t  A22 ] [ 0 ]

which is positive since A is pd and (xt : 0t)t is a nonnull vector. Hence A11 is pd.


Example 8. Let A be a positive definite matrix. Then show that |A| > 0.
Solution. Since A is positive definite, by theorem 4(a), A = BBt for some nonsingular
matrix B. So
|A| = |BBt| = |B|.|Bt| = |B|² > 0.
Let A = ((aij)) be a pd matrix of order n x n. Write

Ai = [ a11  a12  …  a1i ]
     [ a12  a22  …  a2i ]
     [ …    …    …  …   ]
     [ a1i  a2i  …  aii ],   i = 1, …, n.

The matrices Ai, i = 1, …, n are called the leading principal submatrices of A. Combining
examples 7 and 8 we have |Ai| > 0 for i = 1, …, n if A is p.d. Is the converse true? This
leads us to the following theorem.
Theorem 5. Let A be a real symmetric matrix of order n x n. Let Ai, i = 1, …, n be
as defined above. Then A is positive definite if and only if |Ai| > 0 for i = 1, …, n.

Proof: The only if part follows from examples 7 and 8. For the proof of the if part see
Rao and Bhimasankaram (2000), page 341.
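Theorem 5 also yields a simple computational test. A minimal sketch in Python with
numpy (the function name is ours, chosen for illustration):

    import numpy as np

    # Sylvester-type test: A (symmetric) is pd iff every leading
    # principal minor is positive (theorem 5).
    def is_pd_by_minors(A):
        n = A.shape[0]
        return all(np.linalg.det(A[:i, :i]) > 0 for i in range(1, n + 1))

    A = np.array([[4.0, 1.0, 2.0],
                  [1.0, 3.0, 1.0],
                  [2.0, 1.0, 5.0]])   # the pd matrix of Example 6
    print(is_pd_by_minors(A))         # True: |A1| = 4, |A2| = 11, |A3| = 43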
Example 9. Let A be symmetric positive definite. Show that RARt is pd where R is
any nonsingular matrix.
Solution. Let A = BBt where B is nonsingular. Then RARt = RBBtRt = CCt where C
= RB. Further, C is nonsingular since R and B are nonsingular. Hence RARt is pd.
Example 10. A symmetric matrix A is positive definite if RARt is pd for some
nonsingular matrix R.
Solution. Let x ≠ 0. Consider xtAx = xtR⁻¹(RARt)(R⁻¹)tx = yt(RARt)y where
y = (R⁻¹)tx ≠ 0 since x ≠ 0. Hence xtAx = yt(RARt)y > 0 since y ≠ 0 and RARt is pd.
Hence A is pd.

E6. Let C and D be symmetric matrices of order r x r and (n-r) x (n-r)
respectively. Show that

[ C  0 ]
[ 0  D ]

is pd if and only if C and D are pd.

E7. Let A = [ A11   A12 ]
            [ A12t  A22 ]

where A11 and A22 are square. Show that if A is pd, then A22 is pd.

E8. Let A be as in E7. Show that if A is nnd, then A11 and A22 are nnd.

Theorem 6. Let

A = [ A11   A12 ]
    [ A12t  A22 ]

be a partition of A where A11 and A22 are square. A is pd if and only if A11 and
A22 - A12tA11⁻¹A12 are pd.
Proof: For the if part, A11 is pd by hypothesis. For the only if part, A is pd and hence
A11 is pd by example 7. It is easy to see that

A = [ A11   A12 ]  =  [ I          0 ] [ A11  0                   ] [ I  A11⁻¹A12 ]
    [ A12t  A22 ]     [ A12tA11⁻¹  I ] [ 0    A22 - A12tA11⁻¹A12  ] [ 0  I        ]

Hence

A = R [ A11  0                  ] Rt     where R = [ I          0 ]
      [ 0    A22 - A12tA11⁻¹A12 ]                  [ A12tA11⁻¹  I ]

Notice that |R| = |I|.|I| = 1.

Hence R is nonsingular. By examples 9 and 10, and E6, it follows that A is pd if and
only if A11 and A22 - A12tA11⁻¹A12 are pd.

We promised in the beginning that we shall give an easy way to construct pd matrices
and orthogonal matrices. We shall do so now.
Theorem 7. Let A be a symmetric matrix of order n x n with positive diagonal
elements such that

aii > Σj≠i |aij|,  i = 1, …, n.

Then A is positive definite.

The proof is beyond the scope of these notes.
Using the above theorem, it is easy to see that the matrix A in example 6 and those in
E5 are pd.
Theorem 8. Let u be a vector with unit norm. Then I - 2uut is a symmetric
orthogonal matrix.
Proof: (I - 2uut)t = It - 2(ut)tut = I - 2uut
Hence I - 2uut is symmetric.
Further, (I - 2uut)(I - 2uut) = I - 2uut - 2uut + 4uutuut = I - 4uut + 4uut = I since utu
=1. So I - 2uut is orthogonal.
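Both constructions are easy to check numerically; a minimal sketch in Python with
numpy (the particular u and A below are our own choices):

    import numpy as np

    # Theorem 8: I - 2uu^t is symmetric orthogonal for unit-norm u.
    u = np.array([3.0, 4.0]) / 5.0        # unit-norm vector
    H = np.eye(2) - 2 * np.outer(u, u)
    assert np.allclose(H @ H.T, np.eye(2))    # orthogonal
    assert np.allclose(H, H.T)                # symmetric

    # Theorem 7: strict diagonal dominance with positive diagonal gives pd.
    A = np.array([[3.0, 1.0, 1.0],
                  [1.0, 4.0, 2.0],
                  [1.0, 2.0, 5.0]])       # a_ii > sum of |a_ij|, j != i
    assert np.all(np.linalg.eigvalsh(A) > 0)  # pd, as theorem 7 asserts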
Example 11. Let A be a positive semidefinite matrix of order n x n and let aii = 0.
Then show that aij = 0 for j = 1, …, n.
Solution: Consider x = λei + ej where ei and ej are the ith and jth columns of the identity
matrix of order n x n.
Now

xtAx = λ²eitAei + ejtAej + 2λeitAej = λ²aii + ajj + 2λaij = ajj + 2λaij

Let, if possible, aij ≠ 0. Choose λ = -(ajj + 1)/(2aij). Then xtAx = ajj - (ajj + 1) = -1 < 0, which
is a contradiction since A is psd. Hence aij = 0. The choice of j being arbitrary, the result
follows.
E9. Let A be an nnd matrix. Show that xtAx = 0 if and only if Ax = 0.

E10. Show that (1-ρ)I + ρ11t is pd if and only if -1/(n-1) < ρ < 1, where n is the order
of the matrix and 1t = (1, …, 1).

E11. Construct a 3 x 3 symmetric nondiagonal positive definite matrix A such that
a11 = 2, a22 = 5, a33 = 4.

E12. Let A and B be nnd matrices of the same order. Show that (i) A + B is nnd;
(ii) the column space of A is a subspace of the column space of A+B.

1.4 IDEMPOTENT MATRICES


A square matrix A is said to be idempotent if A² = A. Can you quickly come up with
some examples of idempotent matrices? Yes, you are right! O and I are idempotent
matrices. In fact, the only nonsingular idempotent matrix is I. Why? This is so
because A² = A and A nonsingular imply A = I (premultiply both sides of A² = A
by A⁻¹). Similarly the only square matrix of rank 0, namely O, is idempotent. What
about idempotent matrices of order n x n with rank r (1 ≤ r ≤ n-1)?

[ Ir  0 ]
[ 0   0 ]

is an example of an idempotent matrix of rank r.
Further, if A is an idempotent matrix and P is a nonsingular matrix of the same order,
then PAP⁻¹.PAP⁻¹ = PA²P⁻¹ = PAP⁻¹.
Thus PAP⁻¹ is an idempotent matrix.
Hence

P [ Ir  0 ] P⁻¹
  [ 0   0 ]

is an idempotent matrix of rank r for every nonsingular matrix P.
We shall now show that every idempotent matrix of rank r is of this form for some
nonsingular matrix P.
Theorem 9. Let A be an n x n matrix of rank r (1 ≤ r ≤ n-1). Then A is idempotent if
and only if

A = P [ Ir  0 ] P⁻¹
      [ 0   0 ]

for some nonsingular matrix P.
Proof: The if part has already been proved above. For the only if part, let A be an
n x n idempotent matrix of rank r (1 ≤ r ≤ n-1). Let A = (a1 : a2 : … : an). We have

(a1 : … : an) = A = A² = A(a1 : … : an)

Hence Aai = ai, i = 1, …, n. Since rank A = r, there exist r linearly independent
columns ai1, …, air of A. Thus

Aaij = aij, j = 1, …, r.    (1.4.1)

Again, A(I-A) = 0. Hence the column space of (I-A) is a subspace of the null space N(A) of
A. We know that the dimension of N(A) = n - rank A = n - r. So the rank of I-A is at most
n - r. On the other hand, since I = A + (I-A), n = rank I ≤ rank A + rank(I-A).
Hence rank(I-A) ≥ n - rank A. Thus rank(I-A) = n - rank A. This, coupled with the
fact that the column space of (I-A) is contained in N(A), shows that the column space of
(I-A) is the same as the null space of A. Let el1 - al1, …, el(n-r) - al(n-r) be linearly
independent columns of I-A. Then

A(elk - alk) = 0, k = 1, …, n-r.    (1.4.2)

Consider P = (ai1 : … : air : el1 - al1 : … : el(n-r) - al(n-r)). Clearly, P is an n x n matrix. Let
Px = 0. Then

x1ai1 + x2ai2 + … + xrair + xr+1(el1 - al1) + … + xn(el(n-r) - al(n-r)) = 0

Now, 0 = APx = x1Aai1 + x2Aai2 + … + xrAair + xr+1A(el1 - al1) + … + xnA(el(n-r) - al(n-r))

= x1ai1 + … + xrair in view of (1.4.1) and (1.4.2).

This implies x1 = x2 = … = xr = 0 since ai1, …, air are linearly independent. This in turn
implies xr+1 = … = xn = 0 since el1 - al1, …, el(n-r) - al(n-r) are linearly independent.

Thus Px = 0 implies x = 0, so the columns of P are linearly independent. Hence rank
P = n, or in other words P is nonsingular.

Further,

AP = A(ai1 : … : air : el1 - al1 : … : el(n-r) - al(n-r)) = (ai1 : … : air : 0 : … : 0) = P [ Ir  0 ]
                                                                                         [ 0   0 ]

Thus we have A = P [ Ir  0 ] P⁻¹.
                   [ 0   0 ]

Let A be an idempotent matrix of order n x n with rank r. From theorem 9, the
following statements are clear.
(a) A is similar to a diagonal matrix.
(b) A has at most two distinct eigen values, 1 and 0; 1 with algebraic
multiplicity r and 0 with algebraic multiplicity n-r.

Finding the rank of a matrix in general is not very easy. However it is quite easy for
idempotent matrices. We start with a definition.
Definition. The trace of a square matrix A of order n x n is defined as the sum of
its diagonal elements and is denoted by tr(A). Thus tr(A) = a11 + a22 + … + ann.

Example 12. Let A and B be square matrices of the same order. Show that (i)
tr(c.A) = c.tr(A) where c is a real number; (ii) tr(A+B) = tr(A) + tr(B).
Solution: Left as an exercise.
Example 13. Let A and B be matrices of order m x n and n x m respectively. Show
that tr(AB) = tr(BA).
Solution: The (i, i)th element of AB is Σj aijbji, the sum running over j = 1, …, n. Hence

tr(AB) = Σi Σj aijbji = Σj Σi bjiaij = tr(BA),

since Σi bjiaij is the (j, j)th element of BA.
We are now ready to prove
Theorem 10. Let A be an idempotent matrix of order n x n. Then rank A = tr(A).
Proof: The proof is trivial if rank A is 0 or n. Let rank A = r where 1 ≤ r ≤ n-1. Then
by theorem 9, there exists a nonsingular matrix P such that

A = P [ Ir  0 ] P⁻¹
      [ 0   0 ]

Now, using tr(AB) = tr(BA) from example 13,

tr(A) = tr( P [ Ir  0 ] P⁻¹ ) = tr( [ Ir  0 ] P⁻¹P ) = tr [ Ir  0 ] = r = rank A.
              [ 0   0 ]             [ 0   0 ]             [ 0   0 ]
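A quick numerical illustration of theorem 10, sketched in Python with numpy: an
orthogonal projector is idempotent, and its trace equals its rank (the test matrix is a
random choice of ours):

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.standard_normal((6, 2))            # full column rank (rank 2)
    P = X @ np.linalg.inv(X.T @ X) @ X.T       # projector onto col(X), P^2 = P

    assert np.allclose(P @ P, P)               # idempotent
    print(np.linalg.matrix_rank(P), np.trace(P))   # rank 2, trace ~2.0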

We state below another result on idempotent matrices.

Theorem 11. A square matrix A of order n x n is idempotent if and only if rank(I-A)
= n - rank A.
For a proof one may refer to Rao and Bhimasankaram (2000), page 134.
Theorem 12. Let A be a real symmetric and idempotent matrix of rank r. Then there
exists an orthogonal matrix S such that

A = S [ Ir  0 ] St
      [ 0   0 ]

Hence A is nonnegative definite.
Proof: Left as an exercise.
Example 14. Let A and B be idempotent matrices of the same order. Then show that
A+B is idempotent if and only if AB = BA = 0.
Solution: The if part is trivial. For the only if part, let A, B and A+B be idempotent.
Then A+B = (A+B)(A+B) = A² + B² + AB + BA = A + B + AB + BA. So AB + BA =
0. Premultiplying by A, we get AB + ABA = 0. Now postmultiplying the previous
equality by A, we get ABA + ABA = 0, or ABA = 0. Hence AB = 0 and, as a
consequence, BA = 0.
E13. Let A be a 2 x 2 idempotent matrix. Can a11 be equal to 2?

E14. Let A and B be idempotent matrices. Then show that

[ B  0 ]
[ 0  A ]

is also idempotent.

E15. Show that if A and B are idempotent and the column space of A is contained
in the column space of B, then BA = A.

1.5 COCHRAN'S THEOREM


Cochran's theorem concerns the distributions of quadratic forms in independent
standard normal variables. Let x = (x1, …, xn)t be a vector of independent standard
normal variables. Let A1, A2, …, Ak be real symmetric (nonrandom) matrices such that
xtx = xtA1x + xtA2x + … + xtAkx. We know that xtx is distributed as chi-square with
n degrees of freedom. Cochran's theorem asserts that xtAix, i = 1, …, k are distributed
as independent chi-squares if and only if

rank A1 + … + rank Ak = n.

In this section we prove an
algebraic version of this result. In the next unit we shall prove the statistical version.
Theorem 13. Let A1, A2, …, Ak be real symmetric matrices such that A1 + A2 + … + Ak
= I. The following are equivalent:
(a) Ai is idempotent, i = 1, …, k;
(b) rank A1 + … + rank Ak = n;
(c) AiAj = 0 whenever i ≠ j.

Proof: (a) ⇒ (b): n = rank I = tr(I) = tr(A1 + … + Ak) = Σ tr(Ai) = Σ rank Ai (since
A1, …, Ak are idempotent, rank Ai = tr(Ai) by theorem 10).

(b) ⇒ (c): Let rank Ai = ri. Then by (b), r1 + … + rk = n. Since Ai is a real symmetric
matrix, there exists a matrix Pi of order n x ri such that Ai = PiΔiPit, PitPi = I (of order
ri x ri) and Δi is a real nonsingular diagonal matrix, i = 1, …, k.

So, I = A1 + … + Ak = Σ PiΔiPit = PΔPt

where P = (P1 : P2 : … : Pk) and

Δ = [ Δ1  0   …  0  ]
    [ 0   Δ2  …  0  ]
    [ …   …   …  …  ]
    [ 0   0   …  Δk ]

Notice that the number of columns in P is r1 + … + rk, which equals n by hypothesis.
Hence P is a square matrix of order n x n. So, n ≥ rank P ≥ rank(PΔPt) = rank I = n.
Hence P is a nonsingular matrix. Similarly Δ is also nonsingular. Since PΔPt = I and
P is a square matrix, ΔPt = P⁻¹. In other words, ΔPtP = I, or PtP = Δ⁻¹, which is a
diagonal matrix. So PitPj = 0 whenever i ≠ j. Hence AiAj = PiΔiPitPjΔjPjt = 0 whenever i ≠ j.

(c) ⇒ (a): For each i, Ai = AiI = Ai(A1 + … + Ak) = Ai², since AiAj = 0 whenever i ≠ j.
Thus Ai is idempotent for each i.

We now prove an algebraic version of another useful result in connection with the
distribution of quadratic forms in normal variables.
Theorem 14. Let A and B be symmetric idempotent matrices and let B-A be
nonnegative definite. Then B-A is also a symmetric idempotent matrix.
Proof: Since A is symmetric idempotent, it is nnd. Since A and B-A are nnd with
A + (B-A) = B, the column space of A is contained in the column space of B (by E12).
So BA = A (by E15), and hence also AB = A. Then (B-A)A = 0 = A(B-A). Since
B(I-B) = (I-B)B = 0, we get A(I-B) = (I-B)A = 0 and (I-B)(B-A) = 0 = (B-A)(I-B).
Now A + (B-A) + (I-B) = I. By (c) ⇒ (a) of theorem 13, it follows that B-A is
idempotent.
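The classical statistical instance of theorem 13 splits xtx into the squared-mean part
and the sum of squared deviations. A minimal numerical sketch in Python with numpy
(the choice n = 5 is arbitrary):

    import numpy as np

    n = 5
    A1 = np.ones((n, n)) / n      # projects onto the mean direction
    A2 = np.eye(n) - A1           # projects onto deviations from the mean

    assert np.allclose(A1 @ A1, A1) and np.allclose(A2 @ A2, A2)   # (a)
    assert np.linalg.matrix_rank(A1) + np.linalg.matrix_rank(A2) == n  # (b)
    assert np.allclose(A1 @ A2, 0)                                 # (c)
    # Statistically: x^t A1 x = n*xbar^2 and x^t A2 x = sum (xi - xbar)^2
    # are independent chi-squares when x is standard normal (see unit 2).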

1.6 SINGULAR VALUE DECOMPOSITION


In theorem 3, we showed that if A is a real symmetric matrix, there exists an
orthogonal matrix P such that A = PΛPt. We also showed that the diagonal elements
of Λ are the eigen values and the columns of P are the orthonormal eigen vectors of A.
What about a real m x n matrix A? We know that every m x n matrix A of rank r
(1 ≤ r ≤ min{m, n}) can be written as

A = R [ Ir  0 ] S
      [ 0   0 ]

where R and S are nonsingular.
Can we replace the nonsingular matrices by orthogonal matrices if we relax Ir to a
positive definite diagonal matrix? If so, what interpretation can we give to the
orthogonal matrices and the diagonal elements of the diagonal matrix? We study
these details in this section.
Theorem 15. Let A be a real matrix of order m x n with rank r (1 ≤ r ≤ min{m, n}).
Then there exist orthogonal matrices U and V of orders m x m and n x n
respectively such that

A = U [ Δ  0 ] Vt
      [ 0  0 ]

where Δ is a positive definite diagonal matrix.
Proof: Notice that AAt and AtA are nonnegative definite matrices (why? See theorem
4). Let u1, u2, …, um be orthonormal eigen vectors of AAt corresponding to the eigen
values λ1 ≥ λ2 ≥ … ≥ λr > λr+1 = … = λm = 0. So AAtui = λiui, i = 1, …, m. Write

vi = (1/δi) Atui, i = 1, …, r, where δi is the positive square root of λi. Then for i, j
= 1, …, r

vitvj = (1/(δiδj)) uitAAtuj = 1 if i = j and 0 if i ≠ j.

Thus v1, …, vr are orthonormal vectors. Extend v1, …, vr to an orthonormal basis
v1, …, vn of Rn. Write U = (u1 : … : um) and V = (v1 : … : vn). Clearly U and V are
orthogonal matrices.
Also AAtui = 0 for i = r+1, …, m. Hence uitAAtui = 0, or Atui = 0, for i = r+1, …, m.
Since u1u1t + … + umumt = I, we have

A = (u1u1t + … + umumt)A

= (u1u1t + … + ururt)A, since uitA = 0 for i = r+1, …, m.

Denote δi = √λi, i = 1, …, r. It follows that

A = δ1u1v1t + … + δrurvrt = U [ Δ  0 ] Vt
                              [ 0  0 ]

where Δ = diag(δ1, …, δr).

We shall now interpret the columns of U and V and the diagonal elements of Δ in the
above form. Let

A = U [ Δ  0 ] Vt
      [ 0  0 ]

where U and V are orthogonal and Δ is a positive definite diagonal matrix.
Then the rank of A is the same as the rank of Δ, which in turn is the number of rows in Δ.
Now

AAt = U [ Δ²  0 ] Ut
        [ 0   0 ]

Thus this is a spectral decomposition of AAt. Hence the diagonal elements
of Δ² are the nonzero eigen values of AAt and the columns of U are the orthonormal
eigen vectors of AAt. To be more specific,

AAtui = δi²ui for i = 1, …, r and AAtui = 0 for i = r+1, …, m.

Again,

AtA = V [ Δ²  0 ] Vt
        [ 0   0 ]

which is a spectral decomposition of AtA. Hence

AtAvi = δi²vi for i = 1, …, r and AtAvi = 0 for i = r+1, …, n.

Thus the diagonal elements of Δ and the columns of U and V relate to the eigen
values and eigen vectors of AAt and AtA. The diagonal elements of Δ are called the
singular values and the columns of U and V are called the singular vectors of A. The
decomposition

A = U [ Δ  0 ] Vt
      [ 0  0 ]

is called the singular value decomposition of A.

Example: Let

A = (1/3) [ 1  2   2  ]   [ 2  0  0  0 ]   (1/2) [ 1  1   1   1  ]
          [ 2  1   -2 ] . [ 0  1  0  0 ] .       [ 1  -1  1   -1 ]
          [ 2  -2  1  ]   [ 0  0  0  0 ]         [ 1  1   -1  -1 ]
                                                 [ 1  -1  -1  1  ]

be a singular value decomposition of A. What are the eigen values of AAt and AtA?
Identify the corresponding eigen vectors. What is the rank of A?

Solution: A = U [ Δ  0 ] Vt
                [ 0  0 ]

where

U = (1/3) [ 1  2   2  ]        V = (1/2) [ 1  1   1   1  ]
          [ 2  1   -2 ]                  [ 1  -1  1   -1 ]
          [ 2  -2  1  ]                  [ 1  1   -1  -1 ]
                                         [ 1  -1  -1  1  ]

are orthogonal matrices, and Δ = [ 2  0 ]
                                 [ 0  1 ]

The eigen values of AAt are δ1² = 4, δ2² = 1 and 0. The corresponding eigen vectors
are the first, second and third columns respectively of U, namely

(1/3)(1, 2, 2)t, (1/3)(2, 1, -2)t and (1/3)(2, -2, 1)t.

We leave it as an exercise to identify the eigen values and eigen vectors of AtA.

The rank of A is the same as the number of rows of Δ, namely 2.


We remark here that the nonzero eigen values of AAt and AtA are the same. In fact,
for any two matrices A and B of orders m x n and n x m respectively, the nonzero eigen
values of AB and BA are the same. (The proof is beyond the scope of these notes. For
a proof we refer the reader to Rao and Bhimasankaram (2000), page 282.)
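These relations are easy to confirm numerically; a minimal sketch in Python with
numpy (the random test matrix is our own choice):

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((3, 4))

    U, s, Vt = np.linalg.svd(A)         # A = U diag(s) V^t
    lam = np.linalg.eigvalsh(A @ A.T)   # eigenvalues of AA^t, ascending

    assert np.allclose(np.sort(s**2), lam)   # squared singular values = eigenvalues
    assert np.allclose(A @ A.T @ U[:, 0], s[0]**2 * U[:, 0])  # left singular vector
    assert np.allclose(A.T @ A @ Vt[0], s[0]**2 * Vt[0])      # right singular vector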

1.7 SUMMARY
In this unit we have covered the following points:
1. Definition of a real symmetric matrix
2. Classification of quadratic forms
3. Spectral decomposition of a real symmetric matrix
4. A method of determining the definiteness of a quadratic form
5. Properties of positive definite and nonnegative definite matrices
6. A method of finding a triangular square root of a pd matrix
7. Properties of idempotent matrices
8. Cochran's Theorem
9. Singular Value Decomposition.

1.8 REFERENCE
Ramachandra Rao, A. and Bhimasankaram, P. (2000). Linear Algebra, 2nd Edition,
Hindustan Book Agency, New Delhi.


1.9 SOLUTIONS TO EXERCISES


E1. (i) The coefficients of x1², x2² and x1x2 are respectively 1, -1 and 0. So the matrix
of the quadratic form x1² - x2² is

[ 1  0  ]
[ 0  -1 ]

(ii) The coefficients of x1², x2² and x1x2 are respectively 2, 5 and 3. So the
matrix of the quadratic form is

[ 2    1.5 ]
[ 1.5  5   ]

(iii) The coefficients of x1², x2², x3², x1x2, x1x3 and x2x3 are respectively 0, 0, 0, 3,
-4, 5. So the matrix of the quadratic form 3x1x2 + 5x2x3 - 4x1x3 is

[ 0    1.5  -2  ]
[ 1.5  0    2.5 ]
[ -2   2.5  0   ]

(iv) The coefficients of x1², x2², x3², x4², x1x2, x1x3, x1x4, x2x3, x2x4 and x3x4 are
respectively 1, 1, 0, 1, 0, 0, 0, 0, 0, 0. So the matrix of the quadratic form
x1² + x2² + x4² is

[ 1  0  0  0 ]
[ 0  1  0  0 ]
[ 0  0  0  0 ]
[ 0  0  0  1 ]

E2.

Suppose aii < 0. Let ei denote the ith column of the identity matrix. Then
eit Aei = aii < 0. Hence A cannot be pd or psd.

E3. (i) The matrix of the quadratic form x1² + 5x1x2 + 7x2² is

A = [ 1    2.5 ]
    [ 2.5  7   ]

The eigen values of A are the roots of the characteristic equation |A - λI| = 0, or
(1-λ)(7-λ) - 6.25 = 0.

The characteristic equation can be rewritten as λ² - 8λ + 0.75 = 0.

Hence the roots are

(8 + √(64 - 3))/2 and (8 - √(64 - 3))/2, which are both positive.

Hence the quadratic form x1² + 5x1x2 + 7x2² is positive definite.

(ii) For x1 = 1 and x2 = x3 = 0, the value of the quadratic form is 1. Again for x2
= 1, x1 = x3 = 0, the value of the quadratic form is -1. Hence the quadratic
form is indefinite.

(iii) The matrix of the quadratic form is

A = [ 2  3  0 ]
    [ 3  3  0 ]
    [ 0  0  4 ]

The characteristic equation of A is (4-λ)((2-λ)(3-λ) - 9) = 0. So 4 is a root of the above
equation. The remaining two roots are the roots of the equation (2-λ)(3-λ) - 9 = 0, or
λ² - 5λ - 3 = 0. So the roots are

(5 ± √(25 + 12))/2 = (5 ± √37)/2.

Since √37 > 5, the root (5 - √37)/2 is negative while the other two roots are
positive. Hence the given quadratic form is indefinite.


E4. The eigen values of A = [ 2  1 ]
                            [ 1  2 ]

are the roots of the characteristic equation
(2-λ)² - 1 = 0, or λ² - 4λ + 3 = 0, or (λ-3)(λ-1) = 0. So the eigen values are λ1
= 3 and λ2 = 1. Let (x1, x2)t be an eigen vector corresponding to λ1 = 3. Then

2x1 + x2 = 3x1
x1 + 2x2 = 3x2

or

-x1 + x2 = 0
x1 - x2 = 0

Thus x1 = x2. So the normalized eigen vector corresponding to the eigen value
3 is (1/√2)(1, 1)t.

It can similarly be shown that (1/√2)(1, -1)t is the normalized eigen vector
corresponding to the eigen value 1. So the spectral decomposition of A is

A = 3.(1/2) [ 1  1 ]  +  1.(1/2) [ 1   -1 ]
            [ 1  1 ]             [ -1  1  ]

If A (of order n x n) = PΛPt is a spectral decomposition of A, then A² = PΛPtPΛPt = PΛ²Pt.

By induction it can be shown that A^k = PΛ^kPt for k = 1, 2, ….
Thus if λ1, …, λn are the eigen values of A, then λ1^k, …, λn^k are the eigen values
of A^k. The eigen vectors of A^k can be taken to be the same as the eigen
vectors of A.

Hence A^100 = 3^100.(1/2) [ 1  1 ]  +  (1/2) [ 1   -1 ]
                          [ 1  1 ]           [ -1  1  ]

E5. (i) Let

[ 4  1 ]   [ b11  0   ] [ b11  b21 ]
[ 1  2 ] = [ b21  b22 ] [ 0    b22 ]

So b11² = 4, or b11 = 2;
b11b21 = 1, or b21 = 1/2;
b21² + b22² = 2, or b22² = 2 - 1/4 = 7/4, so b22 = √7/2.

Thus the required lower triangular square root is

[ 2    0    ]
[ 1/2  √7/2 ]

(ii) Let

[ 9  3  3 ]   [ b11  0    0   ] [ b11  b21  b31 ]
[ 3  5  1 ] = [ b21  b22  0   ] [ 0    b22  b32 ]
[ 3  1  6 ]   [ b31  b32  b33 ] [ 0    0    b33 ]

b11² = 9, or b11 = 3;
b11b21 = 3, or b21 = 1;
b11b31 = 3, or b31 = 1;
b21² + b22² = 5, or b22² = 5 - 1 = 4, so b22 = 2;
b21b31 + b22b32 = 1, or b22b32 = 1 - 1 = 0, so b32 = 0;
b31² + b32² + b33² = 6, or b33² = 6 - 0 - 1 = 5, so b33 = √5.

Thus the required triangular square root is

[ 3  0  0  ]
[ 1  2  0  ]
[ 1  0  √5 ]

E6. If part: Let C and D be positive definite. Then

(xt : yt) [ C  0 ] [ x ]
          [ 0  D ] [ y ]  =  xtCx + ytDy > 0

whenever at least one of x and y is non-null. Hence

[ C  0 ]
[ 0  D ]

is pd.

Only if part: Let

[ C  0 ]
[ 0  D ]

be pd. Consider

(xt : 0t) [ C  0 ] [ x ]
          [ 0  D ] [ 0 ]  =  xtCx > 0

whenever x ≠ 0. Hence C is pd. Similarly D is pd.

E7. We know that

(0t : yt) [ A11   A12 ] [ 0 ]
          [ A12t  A22 ] [ y ]  =  ytA22y > 0

whenever y ≠ 0 (y being chosen so that the product is conformable).

Hence A22 is pd.

E8. Use the same procedure as in E7.

E9.

Let A be an nnd matrix. Then there exists B such that A = BBt. Hence 0 =
xtAx = xtBBtx implies that Btx = 0. So Ax = BBtx = 0.

E10. As will be shown in Section 1.6, the nonzero eigen values of AAt and AtA are
the same. Hence the nonzero eigen values of 11t are the same as the eigen values
of the 1 x 1 matrix 1t1 = n. So the eigen values of 11t are n and 0, the latter
repeated n-1 times.
Let μ be an eigen value of 11t and let the corresponding eigen vector be x.
Then ((1-ρ)I + ρ11t)x = (1-ρ)x + ρμx = (1-ρ+ρμ)x.
Thus the eigen values of (1-ρ)I + ρ11t are (1-ρ) + nρ and (1-ρ) (the latter repeated
n-1 times).
Hence (1-ρ)I + ρ11t is pd if and only if 1 + (n-1)ρ > 0 and 1 - ρ > 0, or

-1/(n-1) < ρ < 1.
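A quick numerical check of this conclusion, sketched in Python with numpy (the
values of n and ρ are arbitrary choices of ours):

    import numpy as np

    # Eigenvalues of (1-rho)I + rho*11^t should be 1+(n-1)*rho (once)
    # and 1-rho (n-1 times), per E10.
    n, rho = 4, 0.3
    M = (1 - rho) * np.eye(n) + rho * np.ones((n, n))
    print(np.round(np.linalg.eigvalsh(M), 6))   # [0.7 0.7 0.7 1.9]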

E11. We use theorem 7 for this purpose. Thus

A = [ 2    0.5  0.5 ]
    [ 0.5  5    3   ]
    [ 0.5  3    4   ]

is a symmetric nondiagonal pd matrix with diagonal elements 2, 5 and 4 respectively
(each diagonal element exceeds the sum of the absolute values of the other entries
in its row).


E12. (i) xt(A+B)x = xtAx + xtBx ≥ 0 since xtAx and xtBx are nonnegative. Hence
A+B is nnd.
(ii) Write A = CCt and B = DDt. Hence

A + B = CCt + DDt = (C : D) [ Ct ]
                            [ Dt ]

Hence the column space of A = the column space of C ⊆ the column space of (C : D) =
the column space of A+B.
E13. Let

[ 2  b ]
[ c  d ]

be an idempotent matrix. Then

[ 2  b ] [ 2  b ]   [ 2  b ]
[ c  d ] [ c  d ] = [ c  d ]

Equating elements on both sides:
4 + bc = 2, so bc = -2; in particular b cannot be 0.
2b + bd = b, so 2 + d = 1, or d = -1.
2c + cd = c, which is consistent with d = -1.
bc + d² = d, i.e. bc + 1 = -1, or bc = -2, again consistent.
Choose b = -2 and c = 1 (in fact, choose b to be any nonzero number and c = -2/b).
Thus

[ 2  -2 ]
[ 1  -1 ]

is idempotent. Hence there is an idempotent matrix with a11 = 2.

E14.

[ B  0 ] [ B  0 ]   [ B²  0  ]   [ B  0 ]
[ 0  A ] [ 0  A ] = [ 0   A² ] = [ 0  A ]

So

[ B  0 ]
[ 0  A ]

is idempotent if A and B are idempotent.

E15. Since the column space of A is contained in the column space of B, we get
A = BD for some matrix D.
Now BA = B.BD = B²D = BD = A, using B² = B.

Prepared by Prof. P Bhimasankaram, Indian Statistical Institute, Hyderabad.

