B. V. Limaye
Murali K. Srinivasan
Jugal K. Verma
January 7, 2014
Chapter 1

Matrices, Linear Equations and Determinants
Convention 1.1.1. We shall write F to mean either the real numbers R or the complex numbers
C. Elements of F will be called scalars.
An m × n matrix A is a rectangular array of mn scalars arranged in m rows and n columns. The entry in row i and column j is aij . We also write A = (aij ) to denote the entries. When all the entries are in R we say that A is a real matrix. Similarly, we define complex matrices. For example,

[ 1    −1   3/2  ]
[ 5/2   6   11.2 ]

is a 2 × 3 real matrix.
A 1 × n matrix [a1 a2 · · · an ] is called a row vector and an m × 1 matrix

[ b1 ]
[ b2 ]
[ ·  ]
[ ·  ]
[ bm ]

is called a column vector.
Matrix multiplication
First we define the product of a row vector a = [a1 a2 · · · an ] and a column vector

    [ b1 ]
b = [ ·  ]
    [ ·  ]
    [ bn ] ,

both with n components. Define ab to be the scalar a1 b1 + a2 b2 + · · · + an bn .
The product of two matrices A = (aij ) and B = (bij ), denoted AB, is defined only when the number of columns of A is equal to the number of rows of B. So let A be an m × n matrix and let B be an n × p matrix. Let the row vectors of A be A1 , A2 , . . . , Am and let the column vectors of B
be B1 , B2 , . . . , Bp . We write

    [ — A1 — ]
A = [ — A2 — ]  ,   B = [B1 B2 · · · Bp ] .
    [    ·    ]
    [ — Am — ]

Define AB to be the m × p matrix whose entry in row i and column j is the scalar Ai Bj . Now let x = (x1 , . . . , xp )t be a column vector with p components. Then

Bx = x1 B1 + x2 B2 + · · · + xp Bp .
So, Bx can be thought of as a linear combination of the columns of B, with column l having coefficient xl . This way of thinking about Bx is very important.
Example 1.1.3. Let e1 , e2 , . . . , ep denote the standard column vectors with p components, i.e., ei
denotes the p × 1 column vector with 1 in component i and all other components 0. Then Bei = Bi ,
column i of B.
Since (AB)ej = A(Bej ) = ABj , the jth column of AB is ABj : a linear combination of the columns of A, the coefficients coming from the jth column Bj of B. For example,

[ 1 3 1 ] [ 2 0 ]   [ 5 4 ]
[ 2 4 2 ] [ 1 1 ] = [ 8 6 ] .
          [ 0 1 ]
Similarly, the ith row of AB is Ai B, a linear combination of the rows of B, the coefficients coming from the ith row Ai of A.
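As a quick numerical check of the column and row interpretations above (a NumPy sketch, not part of the original text), using the 2 × 3 and 3 × 2 matrices from the example:

```python
import numpy as np

A = np.array([[1, 3, 1],
              [2, 4, 2]])
B = np.array([[2, 0],
              [1, 1],
              [0, 1]])

AB = A @ B  # matrix product; equals [[5, 4], [8, 6]]

# Column j of AB is A times column j of B: a linear combination of columns of A.
col0 = A @ B[:, 0]
# Row i of AB is row i of A times B: a linear combination of rows of B.
row0 = A[0, :] @ B

print(AB)
```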
Properties of Matrix Operations
Theorem 1.1.4. The following identities hold for matrix sum and product, whenever the sizes of
the matrices involved are compatible (for the stated operations).
(i) A(B + C) = AB + AC.
(ii) (P + Q)R = P R + QR.
(iii) A(BC) = (AB)C.
(iv) c(AB) = (cA)B = A(cB).
Proof. We prove item (iii) (leaving the others as exercises). Let A = (aij ) have p columns,
B = (bkl ) have p rows and q columns, and C = (crs ) have q rows. Then the entry in row i and
column s of A(BC) is

Σ_{m=1}^{p} a(i, m) {entry in row m, column s of BC}
  = Σ_{m=1}^{p} a(i, m) { Σ_{n=1}^{q} b(m, n) c(n, s) }
  = Σ_{n=1}^{q} { Σ_{m=1}^{p} a(i, m) b(m, n) } c(n, s),

which is precisely the entry in row i and column s of (AB)C.
Note that matrix multiplication is not commutative; indeed, the product of two nonzero matrices can even be the zero matrix:

[ 0 1 ] [ 1 0 ]   [ 0 0 ]
[ 0 0 ] [ 0 0 ] = [ 0 0 ] ,

while the product in the reverse order is nonzero.
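The identities of the theorem, and the failure of commutativity, are easy to check numerically (a NumPy sketch with randomly chosen compatible matrices, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 5, (2, 3))
B = rng.integers(-5, 5, (3, 4))
C = rng.integers(-5, 5, (4, 2))

# Associativity: A(BC) = (AB)C.
assert (A @ (B @ C) == (A @ B) @ C).all()

# Two nonzero matrices whose product is the zero matrix.
X = np.array([[0, 1],
              [0, 0]])
Y = np.array([[1, 0],
              [0, 0]])
print(X @ Y)   # the 2x2 zero matrix
print(Y @ X)   # nonzero: multiplication is not commutative
```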
Definition 1.1.5. A matrix all of whose entries are zero is called the zero matrix. The entries
aii of a square matrix A = (aij ) are called the diagonal entries. If the only nonzero entries of a
square matrix A are the diagonal entries then A is called a diagonal matrix. An n × n diagonal
matrix whose diagonal entries are 1 is called the n × n identity matrix. It is denoted by In . A
square matrix A = (aij ) is called upper triangular if all the entries below the diagonal are zero,
i.e., aij = 0 for i > j. Similarly we define lower triangular matrices.
A square matrix A is called nilpotent if Ar = 0 for some r ≥ 1.
Example 1.1.6. Let A = (aij ) be an upper triangular n × n matrix with diagonal entries zero.
Then A is nilpotent. In fact An = 0.
Since column j of An is An ej , it is enough to show that An ej = 0 for j = 1, . . . , n. Denote
column j of A by Aj .
First, Ae1 = A1 = 0, and A2 e2 = A(Ae2 ) = A(a12 e1 ) = a12 Ae1 = 0. Similarly

A3 e3 = A2 (Ae3 ) = A2 (a13 e1 + a23 e2 ) = a13 A2 e1 + a23 A2 e2 = 0.
Continuing in this fashion we see that all columns of An are zero.
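A strictly upper triangular matrix can be checked for nilpotency directly (a NumPy sketch; the 4 × 4 test matrix is ours, not from the text):

```python
import numpy as np

n = 4
# Strictly upper triangular: zero diagonal, arbitrary entries above it.
A = np.triu(np.arange(1, n * n + 1).reshape(n, n), k=1)

# Each power of A pushes the nonzero band further above the diagonal,
# so the nth power vanishes.
P = np.linalg.matrix_power(A, n)
print(P)   # the zero matrix: A is nilpotent with A^n = 0
```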
Inverse of a Matrix
Definition 1.1.7. Let A be an n×n matrix. If there is an n×n matrix B such that AB = In = BA
then we say A is invertible and B is the inverse of A. The inverse of A is denoted by A−1 .
Remark 1.1.8. (1) The inverse of a matrix is uniquely determined. Indeed, if B and C are inverses of
A then
B = BI = B(AC) = (BA)C = IC = C.
(2) If A and B are invertible n × n matrices, then AB is also invertible and (AB)−1 = B −1 A−1 . Indeed,

(AB)(B −1 A−1 ) = A(BB −1 )A−1 = AA−1 = I,

and similarly (B −1 A−1 )(AB) = I.
Transpose of a Matrix

The transpose of an m × n matrix A = (aij ), denoted At , is the n × m matrix whose (i, j)th entry is aji . We show that (i) (AB)t = B t At and (ii) if A is invertible then so is At , with (At )−1 = (A−1 )t .
Proof. For any matrix C, let Cij denote its (i, j)th entry.
(i) Let A = (aij ), B = (bij ). Then, for all i, j,

((AB)t )ij = (AB)ji = Σ_k ajk bki = Σ_k (B t )ik (At )kj = (B t At )ij ,

so (AB)t = B t At .
(ii) Since AA−1 = I = A−1 A, we have (AA−1 )t = I = (A−1 A)t . By (i), (A−1 )t At = I =
At (A−1 )t . Thus (At )−1 = (A−1 )t .
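Both transpose identities are easy to verify numerically (a NumPy sketch; the random test matrices are ours, with A shifted to be safely invertible):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 3 * np.eye(3)  # invertible for this seed
B = rng.standard_normal((3, 4))

# (AB)^t = B^t A^t
assert np.allclose((A @ B).T, B.T @ A.T)

# (A^t)^{-1} = (A^{-1})^t
assert np.allclose(np.linalg.inv(A.T), np.linalg.inv(A).T)
```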
Recall that a square matrix A is symmetric if At = A and skew symmetric if At = −A.
Lemma 1.1.12. (i) If A is an invertible symmetric matrix then so is A−1 . (ii) Every square matrix A is a sum of a symmetric and a skew symmetric matrix in a unique way.
Proof of (ii). Put P = (A + At )/2 and Q = (A − At )/2. Then P t = P , Qt = −Q, and

A = P + Q.
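The symmetric/skew symmetric decomposition of the lemma can be sketched as follows (NumPy; the 3 × 3 test matrix is ours):

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 10.]])

P = (A + A.T) / 2   # symmetric part
Q = (A - A.T) / 2   # skew symmetric part

assert np.allclose(P, P.T)     # P is symmetric
assert np.allclose(Q, -Q.T)    # Q is skew symmetric
assert np.allclose(P + Q, A)   # they sum back to A
print(P, Q, sep="\n")
```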
We discuss a widely used method, called the Gauss elimination method, to solve a system of m linear equations in n unknowns x1 , . . . , xn :

a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2
  · · ·
am1 x1 + am2 x2 + · · · + amn xn = bm ,

where the aij ’s and the bi ’s are known scalars in F. If each bi = 0 then the system above is called a homogeneous system. Otherwise, we say it is inhomogeneous.
Set A = (aij ), b = (b1 , . . . , bm )t , and x = (x1 , . . . , xn )t . We can write the system above in the matrix form

Ax = b.
The matrix A is called the coefficient matrix. By a solution, we mean any choice of the unknowns
x1 , . . . , xn which satisfies all the equations.
Definition 1.2.2. An m × n matrix M is said to be in row echelon form (ref) if it satisfies the following conditions:
(a) By a zero row of M we mean a row with all entries zero. Suppose M has k nonzero rows
and m − k zero rows. Then the last m − k rows of M are the zero rows.
(b) The first nonzero entry in a nonzero row is called a pivot. For i = 1, 2, . . . , k, suppose that
the pivot in row i occurs in column ji . Then we have j1 < j2 < · · · < jk . The columns {j1 , . . . , jk }
are called the set of pivotal columns of M . Columns {1, . . . , n} \ {j1 , . . . , jk } are the nonpivotal
or free columns.
Note that a matrix in row canonical form (rcf) is automatically in row echelon form. Also note that, in both the definitions above, the number of pivots k satisfies k ≤ m and k ≤ n.
where the aij ’s are arbitrary scalars. It may be checked that U is in rcf with pivotal columns 2, 5, 7.
Example 1.2.6. Let U be the matrix from the example above. Let c = (c1 , c2 , c3 , c4 )t . We want to write down all solutions to the system U x = c.
(i) If c4 ≠ 0 then clearly there is no solution.
(ii) Now assume that c4 = 0. Call the variables x2 , x5 , x7 pivotal and the variables x1 , x3 , x4 , x6 , x8 nonpivotal or free.
Give arbitrary values x1 = s, x3 = t, x4 = u, x6 = v, x8 = w to the free variables. These values can be extended to values of the pivotal variables in one and only one way to get a solution to the system U x = c:
The process above is called back substitution. Given arbitrary values for the free variables,
we first solve for the value of the largest pivotal variable, then using this value (and the values of
the free variables) we get the value of the second largest pivotal variable, and so on.
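Back substitution is easiest to sketch in the special case of a square upper triangular system with nonzero diagonal (every variable pivotal); the function name and the test system below are illustrative, not from the example:

```python
import numpy as np

def back_substitute(U, c):
    """Solve Ux = c for an n x n upper triangular U with nonzero diagonal,
    solving for the largest pivotal variable first and working upward."""
    n = U.shape[0]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # U[i, i+1:] @ x[i+1:] uses the already-computed later variables.
        x[i] = (c[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

U = np.array([[2., 1., 1.],
              [0., 3., 1.],
              [0., 0., 4.]])
c = np.array([5., 7., 8.])
x = back_substitute(U, c)
print(x)
```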
We extract the following lemma from the examples above; its proof is left as an exercise.
Lemma 1.2.7. Let U be an m × n matrix in ref. Then the only solution to the homogeneous system U x = 0 which is zero in all free variables is the zero solution.
Note that a matrix in rcf is also in ref, so the lemma above also applies to such matrices.
Theorem 1.2.8. Let Ax = b, with A an m × n matrix. Let c be a solution of Ax = b and S the
set of all solutions of the associated homogeneous system Ax = 0. Then the set of all solutions to
Ax = b is
c + S = {c + v : v ∈ S}.
(iv) Let p be the unique solution of U x = c having all free variables zero. Then every solution of U x = c is of the form

p + Σ_{i∈F} ai si ,
Example 1.2.10. In our previous two examples P = {2, 5, 7} and F = {1, 3, 4, 6, 8}. To make sure the notation of the theorem is understood, write down p and si for i = 1, 3, 4, 6, 8.
We now discuss the first step in Gauss elimination, namely, how to reduce a matrix to ref or
rcf. We define a set of elementary row operations to be performed on the equations of a system.
These operations transform a system of equations into another system with the same solution set.
Performing an elementary row operation on Ax = b is equivalent to replacing this system by the
system EAx = Eb, where E is an invertible elementary matrix.
Let eij denote the m × n matrix with 1 in the ith row and jth column and zeros elsewhere. Any matrix A = (aij ) of size m × n can be written as

A = Σ_{i=1}^{m} Σ_{j=1}^{n} aij eij .
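The decomposition of A into matrix units can be checked directly (a NumPy sketch; the helper e(i, j) and the test matrix are ours, with 0-based indices):

```python
import numpy as np

m, n = 2, 3
A = np.arange(1, m * n + 1).reshape(m, n)

def e(i, j):
    """Matrix unit e_ij: 1 in row i, column j (0-based), zeros elsewhere."""
    M = np.zeros((m, n), dtype=int)
    M[i, j] = 1
    return M

# A equals the sum of a_ij * e_ij over all entries.
S = sum(A[i, j] * e(i, j) for i in range(m) for j in range(n))
print(S)
```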
For this reason the eij ’s are called the matrix units. Let us see the effect of multiplying e13 (here of size m × m) with a matrix A written in terms of its row vectors R1 , . . . , Rm : the first row of e13 A is R3 , the third row of A, and all other rows are zero:

        [ 0 0 1 · · · 0 ]  [ — R1 — ]     [ — R3 — ]
e13 A = [ 0 0 0 · · · 0 ]  [ — R2 — ]  =  [ —  0 — ]
        [ ·   ·   · · ·  ]  [    ·    ]     [    ·   ]
        [ 0 0 0 · · · 0 ]  [ — Rm — ]     [ —  0 — ]
We now define three kinds of elementary row operations and elementary matrices. Consider the
system Ax = b, where A is m × n, b is m × 1, and x is an n × 1 unknown vector.
(i) Elementary row operation of type I: For i ≠ j and a scalar a, add a times equation j to equation i in the system Ax = b.
What effect does this operation have on A and b? Consider the m × m matrix

E = I + a eij ,   i ≠ j.

This matrix has 1’s on the diagonal, the scalar a in row i and column j, and zeros elsewhere. By the above observation
             [ — R1 — ]   [ — R1 — ]       [ — 0  — ]
(I + a eij ) [    ·    ] = [    ·    ] + a  [ — Rj — ]   ← ith row
             [    ·    ]   [    ·    ]     [ — 0  — ]
             [ — Rm — ]   [ — Rm — ]       [ — 0  — ]

               [ — R1 —      ]
               [    ·        ]
             = [ — Ri + aRj — ]   ← ith row
               [    ·        ]
               [ — Rm —      ]
It is now clear that performing an elementary row operation of type I on the system Ax = b yields the new system EAx = Eb.
Suppose we perform an elementary row operation of type I as above. Then perform the same
elementary row operation of type I but with the scalar a replaced by the scalar −a. It is clear that
we get back the original system Ax = b. It follows (why?) that E −1 = I − aeij .
(ii) Elementary row operation of type II: For i ≠ j, interchange equations i and j in the system Ax = b.
What effect does this operation have on A and b? Consider the matrix

F = I + eij + eji − eii − ejj ,

the matrix obtained from the identity matrix by interchanging its ith and jth rows.
Premultiplication by this matrix has the effect of interchanging the ith and jth rows. Performing this operation twice in succession gives back the original system. Thus F 2 = I, i.e., F −1 = F .
(iii) Elementary row operation of type III: Multiply equation i in the system Ax = b by a
nonzero scalar c.
What effect does this operation have on A and b? Consider the matrix

G = I + (c − 1) eii ,   c ≠ 0,

the matrix obtained from the identity matrix by replacing the ith diagonal entry by c.
Premultiplication by G has the effect of multiplying the ith row by c. Doing this operation twice in succession, first with the scalar c and then with the scalar 1/c, yields the original system. It follows that G−1 = I + (c−1 − 1)eii .
The matrices E, F, G above are called elementary matrices of types I, II, and III respectively. We summarize the above discussion in the following result.
Theorem 1.2.11. Performing an elementary row operation (of a certain type) on the system
Ax = b is equivalent to premultiplying A and b by an elementary matrix E (of the same type),
yielding the system EAx = Eb.
Elementary matrices are invertible and the inverse of an elementary matrix is an elementary
matrix of the same type.
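The three elementary matrices and their inverses can be verified numerically (a NumPy sketch; the helper eij and the sample values a, c, i, j are ours, with 0-based rows):

```python
import numpy as np

m = 4
I = np.eye(m)

def eij(i, j):
    """Matrix unit: 1 in row i, column j, zeros elsewhere."""
    M = np.zeros((m, m))
    M[i, j] = 1.0
    return M

a, c, i, j = 2.5, 3.0, 0, 2   # sample parameters

E = I + a * eij(i, j)                                   # type I
F = I + eij(i, j) + eij(j, i) - eij(i, i) - eij(j, j)   # type II
G = I + (c - 1) * eij(i, i)                             # type III

# Inverses derived in the text.
assert np.allclose(np.linalg.inv(E), I - a * eij(i, j))
assert np.allclose(F @ F, I)            # F is its own inverse
assert np.allclose(np.linalg.inv(G), I + (1 / c - 1) * eij(i, i))

A = np.arange(16.0).reshape(m, m)
print(E @ A)   # row i of A replaced by R_i + a R_j
```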
Since elementary matrices are invertible, it follows that performing elementary row operations does not change the solution set of the system. We now show how to reduce a matrix to row canonical form using a sequence of elementary row operations.
Theorem 1.2.12. Every matrix can be reduced to a matrix in rcf by a sequence of elementary row
operations.
Proof. We apply induction on the number of rows. If the matrix A is a row vector, the conclusion is obvious. Now suppose that A is m × n, where m ≥ 2. If A = 0 then we are done. If A is not the zero matrix then there is a nonzero column in A. Find the first nonzero column, say column j1 , from the left. Interchange rows to move the first nonzero entry in column j1 to the top row. Now multiply the first row by a nonzero scalar to make this entry (in row 1 and column j1 ) 1. Now add suitable multiples of the first row to the remaining rows so that all entries in column j1 , except the entry in row 1, become zero. The resulting matrix looks like
in row 1, become zero. The resulting matrix looks like
0 ··· 0 1 ∗ ··· ∗
0 ··· 0 0 ∗ ··· ∗
· · · · ·
A1 =
·
· · · ·
· · · · ·
0 ··· 0 0 ∗ ··· ∗
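The proof above is constructive; the algorithm it describes can be sketched directly in Python (the function name rcf and the test matrix are ours, not from the text):

```python
import numpy as np

def rcf(A, tol=1e-12):
    """Reduce A to row canonical form using the elementary row operations
    of the proof: swap a nonzero entry up, scale the pivot to 1, then
    clear the rest of the pivotal column."""
    U = A.astype(float).copy()
    m, n = U.shape
    r = 0                                   # index of the next pivot row
    for j in range(n):
        # Find a row at or below r with a nonzero entry in column j.
        rows = [i for i in range(r, m) if abs(U[i, j]) > tol]
        if not rows:
            continue                        # column j is nonpivotal
        U[[r, rows[0]]] = U[[rows[0], r]]   # type II: interchange rows
        U[r] = U[r] / U[r, j]               # type III: make the pivot 1
        for i in range(m):                  # type I: clear the column
            if i != r:
                U[i] = U[i] - U[i, j] * U[r]
        r += 1
    return U

A = np.array([[2., 1., 1.],
              [4., -6., 0.],
              [-2., 7., 2.]])
print(rcf(A))   # the 3x3 identity: this A is invertible
```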
Proof. (i) Reduce A to rcf U by Gauss elimination. Since m < n there is at least one free variable. It follows that there is a nontrivial solution.
(ii) Reduce Ax = b to EAx = Eb using Gauss elimination, where U = EA is in rcf. Put c = Eb = (c1 , . . . , cm )t . Suppose U has k nonzero rows. Three cases arise:
(a) at least one of ck+1 , . . . , cm is nonzero: in this case there is no solution.
(b) ck+1 = · · · = cm = 0 and k = n: there is a unique solution (why?).
(c) ck+1 = · · · = cm = 0 and k < n: there are infinitely many solutions (why?).
No other cases are possible (why?). That completes the proof.
In the following examples an elementary row operation of type I is indicated by Ri + aRj , of
type II is indicated by Ri ↔ Rj , and of type III is indicated by aRi .
Example 1.2.14. Consider the system

     [  2  1 1 ] [ x1 ]   [  5 ]
Ax = [  4 −6 0 ] [ x2 ] = [ −2 ] = b.
     [ −2  7 2 ] [ x3 ]   [  9 ]
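This small system can be solved directly (a NumPy sketch confirming the coefficient matrix is invertible):

```python
import numpy as np

A = np.array([[2., 1., 1.],
              [4., -6., 0.],
              [-2., 7., 2.]])
b = np.array([5., -2., 9.])

x = np.linalg.solve(A, b)
print(x)   # x = [1., 1., 2.]
```

Indeed, 2·1 + 1 + 2 = 5, 4 − 6 = −2, and −2 + 7 + 4 = 9.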
(d) ⇒ (a) First observe that a square matrix in rcf is either the identity matrix or its bottom row is zero. If A cannot be reduced to I by elementary row operations then U , the rcf of A, has a zero row at the bottom. Hence U x = 0 has at most n − 1 nontrivial equations in n unknowns, and these have a nontrivial solution. This contradicts (d).
This proposition provides us with an algorithm to calculate the inverse of a matrix if it exists. If A is invertible then there exist elementary matrices E1 , E2 , . . . , Ek such that Ek · · · E1 A = I. Multiply both sides on the right by A−1 to get Ek · · · E1 I = A−1 .
Lemma 1.2.18. (Gauss-Jordan Algorithm) Let A be an invertible matrix. To compute A−1 , apply
elementary row operations to A to reduce it to an identity matrix. The same operations when
applied to I, produce A−1 .
For example, a reduction whose final operation is R3 − R2 ends as

R3 − R2   [ 1 0 0    1  0 0 ]
  −→      [ 0 1 0   −1  1 0 ]
          [ 0 0 1    0 −1 1 ] .

            [  1  0 0 ]
Hence A−1 = [ −1  1 0 ] .
            [  0 −1 1 ]
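The Gauss-Jordan algorithm can be sketched as row reduction of the augmented matrix [A | I]. The matrix A below is our assumption: it is chosen so that its inverse agrees with the display above, since the original example's matrix itself is not reproduced here.

```python
import numpy as np

def inverse_gauss_jordan(A):
    """Compute A^{-1} by reducing the augmented matrix [A | I] to [I | A^{-1}]
    with elementary row operations (partial pivoting added for stability)."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])
    for j in range(n):
        p = j + np.argmax(np.abs(M[j:, j]))   # pivot row
        M[[j, p]] = M[[p, j]]                 # type II
        M[j] = M[j] / M[j, j]                 # type III
        for i in range(n):                    # type I
            if i != j:
                M[i] = M[i] - M[i, j] * M[j]
    return M[:, n:]

A = np.array([[1., 0., 0.],
              [1., 1., 0.],
              [1., 1., 1.]])
print(inverse_gauss_jordan(A))   # [[1, 0, 0], [-1, 1, 0], [0, -1, 1]]
```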