
Math 2163

Lecture Notes on Elementary Linear Algebra

John Patrick B. Sta. Maria


Polytechnic University of the Philippines

2011
Chapter 1

Linear Equations and Matrices

1.1 Systems of Linear Equations


In these notes, we will reserve the symbol R to denote the set of all real numbers.
We are assuming that the student is familiar with this set as an algebraic structure.
We will sometimes be interested in complex numbers, which are numbers of the form
a + bi where a, b ∈ R and i² = −1. The set of all complex numbers will be denoted
by C. What makes these sets of numbers special in our course is that one can do
arithmetic on them “naturally”. The need for the complex numbers will become apparent
in the chapter involving eigenvalues and matrix polynomials.

A linear equation over F = R or C in the unknowns x1 , x2 , . . . , xn is any
equation of the form

a1 x1 + a2 x2 + · · · + an xn = b,                                        (1.1)

where a1 , a2 , . . . , an , b ∈ F . We say that equation (1.1) expresses b as a linear
combination of x1 , x2 , . . . , xn .

An n-tuple of numbers (s1 , s2 , . . . , sn ) ∈ F n is said to solve (be a solution of)
equation (1.1) if and only if

a1 s1 + a2 s2 + · · · + an sn = b.

This simply means that the linear equation is satisfied if we make the substitution
x1 = s1 , x2 = s2 , . . . , xn = sn .
Examples 1.1.1.

1. The equation 2x1 − 3x2 + x3 = 1 is linear and (1, 1, 2) is a solution. Notice that if one
assigns values to two of the unknowns, the third unknown is uniquely determined.
Thus, the equation has infinitely many solutions.

2. x1 x2 + 3x3 = 4 is not linear. Neither are the equations x1² + 2x2 − x3 = 5 and
sin x1 + cos x2 = π.

Notice that if n = 1, equation (1.1) reduces to the familiar linear equation a1 x1 = b
studied in elementary algebra. We know that this equation has a unique solution if
and only if a1 ≠ 0, in which case x1 = b/a1 .

In this course, we will be interested in describing solution(s) not just to one linear
equation but to systems of linear equations. Let m and n be positive integers. An m × n
linear system over F = R or C is a set of m linear equations in n unknowns. Such a
system may be expressed by writing

a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2
        ...                                                              (1.2)
am1 x1 + am2 x2 + · · · + amn xn = bm

where aij , bi ∈ F for i = 1, . . . , m and j = 1, . . . , n. A solution to the linear system
(1.2) is an n-tuple (s1 , s2 , . . . , sn ) ∈ F n which simultaneously solves the given m linear
equations in n unknowns, i.e. ai1 s1 + ai2 s2 + · · · + ain sn = bi for i = 1, 2, . . . , m.

One of the major problems in linear algebra is the problem of existence and
uniqueness of solutions to a given linear system. Given a system such as (1.2), we
would like to answer the following questions:

1. When does (1.2) have a solution?

2. If (1.2) has a solution, how many?

3. If (1.2) has a solution, how stable is it? That is, how is the solution affected by a
“small” change in the data (i.e. the given coefficients aij )?

Questions (1) and (2) will be answered in this chapter, but it requires an
advanced course in linear algebra or matrix analysis to answer the third.

If a system of linear equations has a solution, the system is said to be consistent;
otherwise it is inconsistent. If b1 = b2 = · · · = bm = 0, we say that the system is
homogeneous. Notice that setting xi = 0 for i = 1, . . . , n determines a solution
to a homogeneous system. This solution is called the trivial solution. This means
that a homogeneous system is always consistent, with the trivial solution as the “auto-
matic” solution. It is natural to ask when a homogeneous system can have a non-trivial
solution.

Examples 1.1.2.

1. Consider a 2 × 2 system of linear equations. Such a system will be of the form

a11 x1 + a12 x2 = b1
a21 x1 + a22 x2 = b2                                                     (1.3)

Geometrically, the system may be represented by the following pair of lines:

[Figure 1.1: 2 × 2 Linear System. Three panels of line pairs in the xy-plane:
intersecting lines (unique solution), parallel lines (no solution), and coincident
lines (infinitely many solutions).]

In elementary algebra, you were taught how to interpret solution(s) of a 2 × 2
linear system geometrically. Each linear equation in this system is represented by
a straight line in the cartesian plane. Solution(s) to the system is(are) represented
by the points of intersection of the lines. If the given lines are parallel, then there is
no solution and thus the system is inconsistent. It is easy to check that the system
has a unique solution if and only if a11 a22 − a21 a12 ≠ 0. (Prove this.)

2. Solve the following system of linear equations:

5x1 − 2x2 = 1
3x1 − 4x2 = −5

We solve this by elimination. We label the first equation by (1) and the second by
(2). First, we multiply equation (1) by 2 and subtract equation (2) from the result.
We get

7x1 = 7

which implies that x1 = 1. Now that we have x1 , we substitute it directly into the
first equation to get x2 = 2. We see that the given system has a unique solution
(1, 2). A system of linear equations having a unique solution is referred to as an
independent system. The use of the term will become apparent later.

3. While a system can have a unique solution, it can happen that a system has not
just one solution, but infinitely many solutions. Consider

3x1 + 3x2 − x3 = 8
6x1 + 8x2 − 2x3 = 3

This is a 2 × 3 system. Multiplying the first equation by 2 and subtracting the
result from the second equation gives us

2x2 = −13

or x2 = −13/2. Substituting this into our system gives

6x1 − 2x3 = 55
6x1 − 2x3 = 55

Or simply 6x1 − 2x3 = 55. Just what do we mean by this? It means that the
solutions are triples of the form (x1 , −13/2, x3 ) where 6x1 − 2x3 = 55. Notice that
once we assign an arbitrary value to either of x1 and x3 , the value of the other
variable is uniquely determined. Thus, if we let x1 = t be any real (complex)
number, then x3 = (6t − 55)/2. Thus, the solution set for the system is described
by

S = {(t, −13/2, (6t − 55)/2) : t ∈ R (or C)}.

In particular, if we set t = 0, the solution thus generated is (0, −13/2, −55/2).

4. A linear system can possibly have no solution at all. For example, consider the
3 × 2 linear system

2x1 + 3x2 = 5
x1 − x2 = 0
x1 + x2 = 3
Note that there is only one way to satisfy the first and the second equations: by
setting x1 = x2 = 1. But with these values, x1 + x2 = 2 ≠ 3, so the third equation
is not satisfied. We conclude that there cannot be an ordered pair (x1 , x2 ) that
satisfies all three equations.

As we can see, solving a system of linear equations becomes increasingly difficult
as the size of the system grows. We are therefore in need of a good algorithm for
this, and we will in fact develop some in the next section.

We will now formalize the methods in the previous examples. Two linear systems
are said to be equivalent if they have the same set of solutions. We will see in the
next theorem how one can construct a linear system equivalent to a given linear system.

Theorem 1.1.1 (Method of Elimination). If a linear system is changed to another by
one of these operations:

(1) an equation is swapped with another

(2) an equation has both sides multiplied by a non-zero constant

(3) an equation is replaced by the sum of itself and a multiple of another,

then the two systems are equivalent.

Proof. This is easy and is left for the student.
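To make the method concrete, here is a minimal computational sketch of the three
operations above in Python, with a system stored as an augmented list of rows; the
function name eliminate and the storage convention are illustrative choices, not part
of the notes.

def eliminate(aug):
    """Reduce an augmented matrix [A | b] using the three operations of
    Theorem 1.1.1: (1) swap two rows, (2) scale a row by a nonzero
    constant, (3) add a multiple of one row to another."""
    rows, cols = len(aug), len(aug[0])
    pivot_row = 0
    for col in range(cols - 1):
        # Operation (1): swap a row with a nonzero entry into pivot position.
        pivot = next((r for r in range(pivot_row, rows) if aug[r][col] != 0), None)
        if pivot is None:
            continue
        aug[pivot_row], aug[pivot] = aug[pivot], aug[pivot_row]
        # Operation (2): scale the pivot row so the pivot entry becomes 1.
        p = aug[pivot_row][col]
        aug[pivot_row] = [x / p for x in aug[pivot_row]]
        # Operation (3): subtract multiples of the pivot row from the rest.
        for r in range(rows):
            if r != pivot_row and aug[r][col] != 0:
                m = aug[r][col]
                aug[r] = [x - m * y for x, y in zip(aug[r], aug[pivot_row])]
        pivot_row += 1
    return aug

# The system of Example 1.1.2(2): 5x1 - 2x2 = 1, 3x1 - 4x2 = -5.
print(eliminate([[5, -2, 1], [3, -4, -5]]))
# -> [[1.0, 0.0, 1.0], [0.0, 1.0, 2.0]] (up to rounding), i.e. x1 = 1, x2 = 2.

By Theorem 1.1.1, each step preserves the solution set, so the final rows describe a
system equivalent to the original one.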

Problem Set 1.1.

1. Show that the system

a11 x1 + a12 x2 = b1
a21 x1 + a22 x2 = b2 (1.4)

has a unique solution if and only if ∆ = a11 a22 − a21 a12 ≠ 0. The number ∆ will
appear again in Chapter 4.

2. Describe the solution set of the homogeneous system

3x1 − 2x2 − x3 − 4x4 = 0
x1 + x2 − 2x3 − 3x4 = 0.

3. Find all the solutions of the system

7x1 + 3x2 + 21x3 − 13x4 + x5 = −14
10x1 + 3x2 + 30x3 − 16x4 + x5 = −23
7x1 + 2x2 + 21x3 − 11x4 + x5 = −16
9x1 + 3x2 + 27x3 − 15x4 + x5 = −20

1.2 Matrices
Let F = R or C. An m × n matrix over F is a rectangular array of mn elements of F ,
called entries, arranged in m rows and n columns. Such a matrix may take the form
 
    [ a11  a12  · · ·  a1n ]
A = [ a21  a22  · · ·  a2n ]                                             (1.5)
    [  .    .            . ]
    [ am1  am2  · · ·  amn ]

where aij ∈ F for i ∈ {1, 2, . . . , m} and j ∈ {1, 2, . . . , n}. We will abbreviate the matrix
above as A = [aij ]. Take note that we use the indexing aij to denote the entry of the
matrix A in the (i, j)th position, i.e. the entry on the ith row and jth column. The
set of all m × n matrices over F is denoted Mm×n (F ). If F = C, we refer to a matrix
over F as a complex matrix and analogously, if F = R, a matrix over F is called a real
matrix. If m = n, we denote Mm×n (F ) =: Mn (F ). If A ∈ Mn (F ), we say A is a square
matrix.

Definition 1.2.1. We say that the matrices A = [aij ] and B = [bij ] are equal, written
A = B, if and only if A and B have the same number of rows and columns and aij = bij
for all i and for all j.

The definition above simply says that for two matrices to be equal, their corre-
sponding entries should be equal.

Examples 1.2.1.

1. The following are examples of matrices:


 
" # 4 " # " #
2 −3  √  5i −12 0 x+y −3
A= B= 2  C= D= x, y ∈ R
1 1 3 0 1+i 1 2x − y
−1/2

All the matrices above are complex matrices. However, only A, B and D are real
matrices. A and D are both 2 × 2, so they are square matrices. Notice that A = D
if and only if x and y satisfy the system

x+y =2
2x − y = 1

This system is satisfied only when x = y = 1.

2. B ≠ C since B has only one column while C has 3.

Just like numbers, we can define operations on matrices.

Definition 1.2.2. Let A, B ∈ Mm×n (F ). If A = [aij ] and B = [bij ], we define the sum
A + B of A and B to be the matrix C = [cij ] where

cij = aij + bij .

In other words, if two matrices are of the same size, we can add them to get
a matrix whose entries are obtained by adding the corresponding entries of the given
matrices.

Theorem 1.2.1 (Properties of Matrix Addition).

1. A + B = B + A for all A, B ∈ Mm×n (F ).

2. (A + B) + C = A + (B + C) for all A, B, C ∈ Mm×n (F )

3. There is a matrix Omn such that A + Omn = Omn + A = A for all A ∈ Mm×n (F ).

4. For all A ∈ Mm×n (F ), there is a matrix B ∈ Mm×n (F ) such that A + B = B + A = Omn .

Proof.

1. To see this, we first note that addition of numbers is commutative. If we let A = [aij ]
and B = [bij ], then

A + B = [aij ] + [bij ] = [aij + bij ] = [bij + aij ] = [bij ] + [aij ] = B + A

since aij , bij ∈ F are numbers.

2. The proof of this is similar to 1.



3. Simply take Omn = [ωij ] to be the m × n matrix with all entries 0, i.e. let ωij = 0
for all i and for all j. Then if A = [aij ], then

A + Omn = [aij ] + [ωij ] = [aij + ωij ] = [aij + 0] = [aij ] = A.

The other equality holds because of (1).

4. Let A = [aij ] be an m×n matrix. Consider the matrix B = [bij ] obtained by taking
the negatives of the entries of A, i.e. let bij = −aij . Then

A + B = [aij ] + [bij ] = [aij + bij ] = [aij + (−aij )] = [ωij ] = Omn .

Remarks 1.2.1.

1. Let A, X ∈ Mm×n (F ) be such that A + X = A. By (4) there exists B ∈ Mm×n (F )
such that A + B = Omn . Now, using (1), (2) and (3),

X = X + Omn = X + (A + B) = (X + A) + B = A + B = Omn .

This simply means that no other matrix can satisfy property (3) of theorem 1.2.1
aside from Omn . We call the matrix Omn the zero m × n matrix.

2. If A is an m × n matrix, then by property (4) of theorem 1.2.1, there is an m × n
matrix B such that A + B = Omn = B + A. Now, suppose that there is another
m × n matrix B′ such that A + B′ = Omn = B′ + A. Then

B = B + Omn = B + (A + B′) = (B + A) + B′ = (A + B) + B′ = Omn + B′ = B′.

Thus, the matrix B in property (4) is unique. Having said that, we will call the
unique matrix B the negative of A, and we shall denote it by −A. From now on,
the expression A − B, where A and B are matrices of the same size, shall mean the
sum A + (−B).

In what follows, we shall call the elements of F scalars.

Definition 1.2.3 (Scalar Multiplication). Let c ∈ F be a scalar and A = [aij ] ∈
Mm×n (F ). We define the matrix cA as the matrix B = [bij ] ∈ Mm×n (F ) such that
bij = caij . In other words, cA is the matrix obtained by multiplying each entry of A
by the scalar c. More precisely,

       [ a11  a12  · · ·  a1n ]   [ ca11  ca12  · · ·  ca1n ]
cA = c [ a21  a22  · · ·  a2n ] = [ ca21  ca22  · · ·  ca2n ]
       [  .    .            . ]   [   .     .             . ]
       [ am1  am2  · · ·  amn ]   [ cam1  cam2  · · ·  camn ]

Multiplication defined above is called scalar multiplication. The following prop-


erties of scalar multiplication can easily be verified.

Theorem 1.2.2 (Properties of Scalar Multiplication). If A and B are m × n matrices
and c, d are scalars, then

1. c(A + B) = cA + cB.

2. (c + d)A = cA + dA.

3. c(dA) = (cd)A.

4. 1A = A.

5. 0A = Omn .
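As a brief aside, the entrywise definitions translate directly into code. The sketch
below (Python, with a matrix stored as a list of rows; the helper names mat_add and
scal_mul are illustrative, not from the notes) checks property 1 of Theorem 1.2.2 on
a small example.

def mat_add(A, B):
    """Entrywise sum of Definition 1.2.2: c_ij = a_ij + b_ij."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scal_mul(c, A):
    """Scalar multiple of Definition 1.2.3: (cA)_ij = c * a_ij."""
    return [[c * a for a in row] for row in A]

A = [[2, -3], [1, 1]]
B = [[0, 1], [4, -2]]
# Property 1: c(A + B) = cA + cB.
assert scal_mul(3, mat_add(A, B)) == mat_add(scal_mul(3, A), scal_mul(3, B))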

Examples 1.2.2.

1. If

   A = [ 1 ]      B = [ 0 ]      C = [  2 ]
       [ 0 ]          [ 1 ]          [ −1 ]
       [ 1 ]          [ 0 ]          [  2 ]

   then 2A − B − C = O31 .

2. Let

   A = [ 2  −3   0 ]      and      B = [ 1  4  −2 ]
       [ 1   0  −2 ]                   [ 0  2   5 ]

   Consider the matrix equation A + 2X = B. Find X.

Solution. To solve for the matrix X, we first note that X must be a 2 × 3 matrix.
Let X = [xij ] where i ∈ {1, 2} and j ∈ {1, 2, 3}. The equation may be written as

[ 2  −3   0 ]     [ x11  x12  x13 ]   [ 1  4  −2 ]
[ 1   0  −2 ] + 2 [ x21  x22  x23 ] = [ 0  2   5 ]

which simplifies to

[ 2 + 2x11   −3 + 2x12        2x13 ]   [ 1  4  −2 ]
[ 1 + 2x21        2x22   −2 + 2x23 ] = [ 0  2   5 ]

Using the definition of matrix equality, we equate corresponding entries and solve
for the variables, giving x11 = −1/2, x12 = 7/2, x13 = −1, x21 = −1/2, x22 = 1
and x23 = 7/2. Therefore,

X = [ −1/2  7/2   −1 ]
    [ −1/2    1  7/2 ]
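Alternatively, the matrix algebra developed above gives X at once: adding −A to both
sides of A + 2X = B yields 2X = B − A, so

X = (1/2)(B − A) = (1/2) [ −1  7  −2 ]  =  [ −1/2  7/2   −1 ]
                         [ −1  2   7 ]     [ −1/2    1  7/2 ]

which agrees with the entrywise computation.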

Let A1 , A2 , . . . , Ak be m × n matrices. An m × n matrix B is said to be a linear
combination of the Ai 's if there exist scalars a1 , a2 , . . . , ak such that

B = a1 A1 + a2 A2 + · · · + ak Ak .

For example, if E1 = [ 1 0 0 ], E2 = [ 0 1 0 ] and E3 = [ 0 0 1 ], note
that any matrix in M1×3 (C) is a linear combination of E1 , E2 and E3 . To see this, let
[ x y z ] be an arbitrary element of M1×3 (C). Then

[ x y z ] = xE1 + yE2 + zE3 .

We will now study how we can multiply matrices. Unlike in the real (complex)
number system, we will see that not all pairs of matrices can be multiplied and matrix
multiplication is much more complicated than just addition.

Definition 1.2.4 (Matrix Multiplication). Let A = [aij ] be an m × p matrix and
B = [bij ] be a p × n matrix. We define the product AB of A and B as the m × n matrix
C = [cij ] where

cij = Σ_{k=1..p} aik bkj = ai1 b1j + ai2 b2j + · · · + aip bpj ,

for i ∈ {1, 2, . . . , m} and j ∈ {1, 2, . . . , n}.

The definition above says that for the matrix A to be multiplied by a matrix B, the
number of columns of A should agree with the number of rows of B. In effect, the
resulting product AB has as many rows as A and as many columns as B. To illustrate
the actual multiplication, refer to the schematic below:
[ a11  a12  · · ·  a1p ]                                   [ c11  · · ·  c1j  · · ·  c1n ]
[  .    .            . ]  [ b11  · · ·  b1j  · · ·  b1n ]  [  .          .            . ]
[ ai1  ai2  · · ·  aip ]  [ b21  · · ·  b2j  · · ·  b2n ] = [ ci1  · · ·  cij  · · ·  cin ]
[  .    .            . ]  [  .          .            . ]  [  .          .            . ]
[ am1  am2  · · ·  amp ]  [ bp1  · · ·  bpj  · · ·  bpn ]  [ cm1  · · ·  cmj  · · ·  cmn ]

                cij = ai1 b1j + ai2 b2j + · · · + aip bpj

To get the (i, j)-th entry of the matrix product AB, one needs the ith row of
A and the jth column of B. We multiply the corresponding entries and add the
resulting products.
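The triple-indexed sum in Definition 1.2.4 is exactly a triple loop in code. Below is a
brief Python sketch (the name mat_mul and the list-of-rows storage are illustrative
assumptions, not from the notes), applied to the matrices of Example 1.2.3(1), which
follows below.

def mat_mul(A, B):
    """Product of an m x p matrix A and a p x n matrix B, per Definition
    1.2.4: c_ij pairs row i of A with column j of B."""
    m, p, n = len(A), len(B), len(B[0])
    assert len(A[0]) == p, "columns of A must equal rows of B"
    return [[sum(A[i][k] * B[k][j] for k in range(p)) for j in range(n)]
            for i in range(m)]

A = [[2, 0], [1, -1]]
B = [[4, 1], [-1, 4]]
print(mat_mul(A, B))  # [[8, 2], [5, -3]]
print(mat_mul(B, A))  # [[9, -1], [2, -4]], so AB and BA differ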

Multiplication of matrices might not be as easy as matrix addition, but one can
see in Chapter 3 that this operation is very natural. Though the definition above is
natural, matrix multiplication has some strange properties. For example, let
A be a 3 × 4 matrix and B be a 4 × 2 matrix. While we can execute the product AB,
one cannot multiply B by A since the number of columns of B is 2 while the number of
rows of A is 3, and 2 ≠ 3. Thus, BA is not defined. We will see more peculiarities of
matrix multiplication in the following examples.

Examples 1.2.3.

" # " #
2 0 4 1
1. Let A = and B = . Note that both AB and BA are defined.
1 −1 −1 4
" # " #
(2)(4) + (0)(−1) (2)(1) + (0)(4) 8 2
AB = =
(1)(4) + (−1)(−1) (1)(1) + (−1)(4) 5 −3

and " # " #


(4)(2) + (1)(1) (4)(0) + (1)(−1) 9 −1
BA = = .
(−1)(2) + (4)(1) (−1)(0) + (4)(−1) 2 −4
Here, while both AB and BA are defined, AB 6= BA. This illustrates that matrix
multiplication is not in general commutative!
12 CHAPTER 1. LINEAR EQUATIONS AND MATRICES

   
2. If U = [ 2  1  −1 ] and

   V = [ 1 ]
       [ 3 ]
       [ 5 ]

   then U V = [0] while

   V U = [  2  1  −1 ]
         [  6  3  −3 ]
         [ 10  5  −5 ]

   Notice that neither U nor V is a zero matrix but U V is.

3. Consider the linear system

   a11 x1 + a12 x2 + · · · + a1n xn = b1
   a21 x1 + a22 x2 + · · · + a2n xn = b2
           ...                                                           (1.6)
   am1 x1 + am2 x2 + · · · + amn xn = bm

   If we let A = [aij ] for i = 1, 2, . . . , m and j = 1, 2, . . . , n,

       [ x1 ]                [ b1 ]
   X = [ x2 ]    and     B = [ b2 ]
       [  . ]                [  . ]
       [ xn ]                [ bm ]

   then (1.6) may actually be written as

   AX = B.

   In this case, A is called the coefficient matrix. Thus, the solvability of (1.6) is
   equivalent to the solvability of the linear matrix equation AX = B. We will be
   more precise about this in the next section.

4. Consider the m × n system AX = B as in (1.6). Notice that the system may be
   written as

      [ a11 ]      [ a12 ]               [ a1n ]   [ b1 ]
   x1 [ a21 ] + x2 [ a22 ] + · · · + xn  [ a2n ] = [ b2 ]
      [  .  ]      [  .  ]               [  .  ]   [  . ]
      [ am1 ]      [ am2 ]               [ amn ]   [ bm ]

   Thus, the solvability of the system AX = B depends on whether or not B may
   be written as a linear combination of the columns of A.

5. Let In = [δij ] be the n × n matrix defined by

   δij = 1 if i = j and δij = 0 if i ≠ j.

   Pictorially, In takes the form

   [ 1  0  0  · · ·  0 ]
   [ 0  1  0  · · ·  0 ]
   [ 0  0  1  · · ·  0 ]
   [ .  .  .    .    . ]
   [ 0  0  0  · · ·  1 ]

   If A = [aij ] ∈ Mn (F ), one can check that AIn = In A = A. The matrix In will play
   an important part in our study, so it deserves a special name. We will call In the
   n × n identity matrix.

6. For a positive integer k, let F k denote the set of all columns of the form

   [ a1 ]
   [ a2 ]
   [  . ]
   [ ak ]

   where ai ∈ F for i = 1, 2, . . . , k. If A is an m × n matrix over F , then A may be
   thought of as a function from F n to F m in the following sense: if x ∈ F n , then
   Ax ∈ F m . The map

   x ↦ Ax

   gives a well-defined map from F n to F m . Such a function plays a central role in
   linear algebra. These maps are examples of linear transformations, which are stud-
   ied in Chapter 3. One of the major cornerstones of linear algebra is the discovery
   that any linear transformation on finite dimensional vector spaces arises
   in this way! This therefore means that matrices and linear algebra are inseparable.
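Tying examples 3 and 6 together, here is a small Python sketch of the map x ↦ Ax
(the helper name apply_matrix and the list storage are illustrative assumptions, not
from the notes): a column x solves AX = B exactly when applying A to x reproduces B.

def apply_matrix(A, x):
    """Return Ax, the image of the column vector x (stored as a plain
    list) under the m x n matrix A; this is the map x |-> Ax."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Coefficient matrix of the 2 x 2 system in Example 1.1.2(2).
A = [[5, -2], [3, -4]]
print(apply_matrix(A, [1, 2]))  # [1, -5]: (1, 2) solves AX = B for B = (1, -5)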

Like matrix addition, matrix multiplication obeys some algebraic rules.

Theorem 1.2.3 (Properties of Matrix Multiplication). Assuming that the multiplications
below are possible, we have

1. A(BC) = (AB)C for any A, B and C.

2. A(B + C) = AB + AC and (A + B)C = AC + BC for any A, B and C.

3. (rA)(sB) = (rs)(AB) for any matrices A and B and for any scalars r and s.

Proof. We will only prove (1); the proofs of (2) and (3) are analogous. Let
A = [aij ] ∈ Mm×n , B = [bij ] ∈ Mn×p and C = [cij ] ∈ Mp×q . If AB = [dij ], then

dij = Σ_{k=1..n} aik bkj .

In a similar manner, if BC = [fij ], then

fij = Σ_{r=1..p} bir crj .

Thus, the (i, j) entry of (AB)C is

[(AB)C]ij = Σ_{s=1..p} dis csj = Σ_{s=1..p} (Σ_{k=1..n} aik bks) csj .

Now,

[A(BC)]ij = Σ_{t=1..n} ait ftj
          = Σ_{t=1..n} ait (Σ_{r=1..p} btr crj)
          = Σ_{t=1..n} (ait bt1 c1j + ait bt2 c2j + · · · + ait btp cpj)
          = (Σ_{t=1..n} ait bt1) c1j + (Σ_{t=1..n} ait bt2) c2j + · · · + (Σ_{t=1..n} ait btp) cpj
          = Σ_{r=1..p} (Σ_{t=1..n} ait btr) crj
          = [(AB)C]ij .

Therefore, A(BC) = (AB)C.

From now on, when we wish to write a 1 × 1 matrix, we shall identify it as a scalar,
i.e. we will write a in place of [a]. There is no harm in doing this since the elements of
M1×1 (F ) “act” like the elements of F in the sense of the following:

[a] + [b] = [a + b] and [a][b] = [ab].

So in this case, as in the case of example 1.2.3 (2), we will write U V = 0.

Some classes of matrices have special characteristics and thus deserve special
names. A matrix having an equal number of rows and columns is called a square matrix.
The set of all square n × n matrices will be denoted by Mn . Note that Mn is closed
under the fundamental matrix operations, i.e. sums, products and scalar multiples
of matrices in Mn are again in Mn .

While matrix multiplication is not commutative, it is associative. This property
of matrix multiplication is oftentimes taken for granted. Notice that in the spirit of
associativity, one may write ABC unambiguously since (AB)C = A(BC). Inductively,
we may write A1 A2 A3 · · · An with no confusion. This leads us to the definition of
exponents.

If n is a positive integer and A is a square matrix, we define

An = AA · · · A   (n factors).

For n ≤ 0, we defer the definition of An until the section on non-singular matrices.
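As a quick illustration, the Python sketch below (mat_pow is an illustrative name,
not from the notes) computes An by repeated multiplication; associativity is what
makes the result independent of how the products are grouped.

def mat_pow(A, n):
    """Compute A^n for a square matrix A and a positive integer n by
    repeated multiplication; the grouping is irrelevant by associativity."""
    assert n >= 1
    size = len(A)
    result = A
    for _ in range(n - 1):
        result = [[sum(result[i][k] * A[k][j] for k in range(size))
                   for j in range(size)] for i in range(size)]
    return result

D = [[2, 0], [0, 3]]
print(mat_pow(D, 3))  # [[8, 0], [0, 27]]: powers act entrywise on the diagonal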

A square matrix T = [tij ] is called an upper triangular matrix if tij = 0 for
i > j, i.e. all entries below the main diagonal are 0. If tij = 0 for all i < j, T is
called a lower triangular matrix. For example, if

L = [ 1   0   0 ]      and      U = [ 1  1  −2 ]
    [ 2  −3   0 ]                   [ 0  2   0 ]
    [ 0  −1  −1 ]                   [ 0  0   4 ]

then U is upper triangular while L is lower triangular. If a matrix is either lower
triangular or upper triangular, we say that it is triangular.

Let λ1 , λ2 , . . . , λn be scalars. By the expression diag(λ1 , λ2 , . . . , λn ) we mean the
square n × n matrix

             [ λ1   0    0   · · ·   0 ]
             [  0   λ2   0   · · ·   0 ]
D = [dij ] = [  0   0    λ3  · · ·   0 ]
             [  .   .    .     .     . ]
             [  0   0    0   · · ·  λn ]

Such a matrix is called a diagonal matrix. Note that D has the property that
dij = 0 for i ≠ j. Moreover, D is both upper triangular and lower triangular. For
diagonal matrices, the matrix operations are more natural. It can be shown that

diag(λ1 , λ2 , . . . , λn ) + diag(σ1 , σ2 , . . . , σn ) = diag(λ1 + σ1 , λ2 + σ2 , . . . , λn + σn ),

r diag(λ1 , λ2 , . . . , λn ) = diag(rλ1 , rλ2 , . . . , rλn )

for all scalars r, and

diag(λ1 , λ2 , . . . , λn ) diag(σ1 , σ2 , . . . , σn ) = diag(λ1 σ1 , λ2 σ2 , . . . , λn σn ).
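These identities are easy to test computationally; the sketch below (Python, with an
illustrative helper diag, not notation from the notes beyond diag(·)) checks the product
rule on a 3 × 3 example.

def diag(*entries):
    """Build the square diagonal matrix diag(entries)."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    """Matrix product as in Definition 1.2.4."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# diag(l1, ..., ln) diag(s1, ..., sn) = diag(l1 s1, ..., ln sn).
assert mat_mul(diag(1, 2, 3), diag(4, 5, 6)) == diag(4, 10, 18)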

Some diagonal matrices behave even better. A diagonal matrix of the form D =
diag(λ, λ, . . . , λ) is called a scalar matrix. In particular, In and On are scalar matrices.
We now introduce another matrix operation which is of paramount importance.

Definition 1.2.5 (Matrix Transpose). Let A = [aij ] ∈ Mm×n . The transpose of A is
the matrix AT = [bij ] ∈ Mn×m where bij = aji .

This definition simply tells us that the transpose of a matrix A is the matrix
whose rows are the columns of A. Consequently, the rows of A are the columns of its
transpose. The following properties of matrix transpose are not surprising.

Theorem 1.2.4 (Properties of Matrix Transpose). For any matrices where the following
operations are possible,

1. (AT )T = A,

2. (A + B)T = AT + B T ,

3. (rA)T = rAT ,

4. (AB)T = B T AT ,

5. (An )T = (AT )n if n is a positive integer.

Proof. (1), (2) and (3) are easy. We will only prove (4); (5) then follows from (4) by
induction. Let A = [aij ] ∈ Mm×n and B = [bij ] ∈ Mn×m . Then AT is n × m and BT
is m × n, so BT AT is m × m. Moreover, (AB)T is also m × m. We now verify the
equality. Let C = [cij ] = AB. If D = [dij ] = CT , then

dij = cji = Σ_{k=1..n} ajk bki .

If E = [eij ] = BT and F = [fij ] = AT , and if we let G = [gij ] = BT AT , then

gij = Σ_{k=1..n} eik fkj = Σ_{k=1..n} bki ajk = Σ_{k=1..n} ajk bki = cji = dij .

Therefore, (AB)T = [dij ] = [gij ] = BT AT .
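Property (4), which reverses the order of the factors, is easy to confirm numerically;
the Python sketch below (helper names illustrative, not from the notes) does so for the
matrices of Example 1.2.3(1).

def transpose(A):
    """Return AT: entry (i, j) of AT is entry (j, i) of A."""
    return [[A[j][i] for j in range(len(A))] for i in range(len(A[0]))]

def mat_mul(A, B):
    """Matrix product as in Definition 1.2.4."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[2, 0], [1, -1]]
B = [[4, 1], [-1, 4]]
# (AB)T = BT AT, as proved above; note the reversed order on the right.
assert transpose(mat_mul(A, B)) == mat_mul(transpose(B), transpose(A))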

We now explore some classes of square matrices which are invariant under matrix
transpose.

Definition 1.2.6. A square matrix A is said to be

1. symmetric if and only if AT = A;

2. skew-symmetric if and only if AT = −A.

We first give some general remarks on symmetric and skew-symmetric matrices.
If A = [aij ] is symmetric, then by definition AT = A and so aij = aji for all i and for
all j. Conversely, if a square matrix A = [aij ] satisfies aij = aji for all i and for all j,
then A is symmetric. A symmetric matrix, therefore, may be viewed pictorially as one
whose entries are symmetric about the main diagonal of A. For example, the matrix

[ 2   6 ]
[ 6  −1 ]

is symmetric. Note that if A is symmetric, then so are the matrices AT and −A.

Now suppose that A = [aij ] is skew-symmetric. By definition, AT = −A and so
aji = −aij , or aij + aji = 0, for all i and for all j. This means that each entry is the
negative of the entry directly opposite it with respect to the main diagonal. What is
interesting about skew-symmetric matrices is the fact that the diagonal entries are all
0.¹ This is easy to show. Since aij + aji = 0 for all i and for all j, then in particular,
for i = j (i.e. on the main diagonal), aii + aii = 0, implying that 2aii = 0 and so aii = 0
for all i. For instance, the matrix below is skew-symmetric.

[  0  4  −3 ]
[ −4  0   0 ]
[  3  0   0 ]

¹This is true provided that the field of scalars is of characteristic not equal to 2, i.e. if 1 + 1 ≠ 0.

The following facts are easy to establish.

1. If A is any matrix of any size, then AT A and AAT are square matrices. Moreover,
these matrices are symmetric.

2. If A is a square matrix, then A + AT is symmetric while A − AT is skew-symmetric.
Moreover,

A = (1/2)(A + AT ) + (1/2)(A − AT ),

i.e. any square matrix A may be written as

A = S + S′

where S is symmetric and S′ is skew-symmetric.

Examples 1.2.4.

1. Given

   A = [ 1  −3  0 ]
       [ 2   1  1 ]

   verify by direct computation that AT A and AAT are symmetric matrices.

   Sol’n. Note that

          [  1  2 ]                  [  5  −1  2 ]
   AT A = [ −3  1 ] [ 1  −3  0 ]  =  [ −1  10  1 ]
          [  0  1 ] [ 2   1  1 ]     [  2   1  1 ]

   and

                         [  1  2 ]
   AAT = [ 1  −3  0 ]    [ −3  1 ]  =  [ 10  −1 ]
         [ 2   1  1 ]    [  0  1 ]     [ −1   6 ]

   which are symmetric matrices.
" #
2 4
2. Let A = . Write A as a sum of a symmetric and skew-symmetric matrix.
0 −2
Sol’n. Let S = 21 (A + AT ) and S 0 = 12 (A − AT ). Then
" # " #! " #
1 2 4 2 0 2 2
S= + =
2 0 −2 4 −2 2 −2

which is symmetric and


" # " #! " #
0 1 2 4 2 0 0 2
S = − =
2 0 −2 4 −2 −2 0
1.2. MATRICES 19

which is skew-symmetric. Notice that A = S + S 0 .
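The computation in Example 1.2.4(2) can be verified mechanically; the Python sketch
below (helper names illustrative, not from the notes) checks that S is symmetric, S′ is
skew-symmetric, and S + S′ recovers A.

def transpose(A):
    """Return AT."""
    return [[A[j][i] for j in range(len(A))] for i in range(len(A[0]))]

def combine(c, A, d, B):
    """Return the entrywise combination cA + dB."""
    return [[c * a + d * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

A = [[2, 4], [0, -2]]
S = combine(0.5, A, 0.5, transpose(A))        # symmetric part (A + AT)/2
K = combine(0.5, A, -0.5, transpose(A))       # skew-symmetric part (A - AT)/2
assert S == transpose(S)                      # ST = S
assert transpose(K) == combine(-1, K, 0, K)   # KT = -K
assert combine(1, S, 1, K) == A               # A = S + S'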
