Chapter 1

Linear Equations and Matrices
a1 s1 + a2 s2 + · · · + an sn = b.
This simply means that the linear equation is satisfied if we make the following substitution: x1 = s1 , x2 = s2 , . . . , xn = sn .
Examples 1.1.1.
1. The equation 2x1 − 3x2 + x3 = 1 is linear and (1, 1, 2) is a solution. Notice that if one assigns values to two of the unknowns, the third unknown is uniquely determined. Thus, the equation has infinitely many solutions.
2. The equation x1 x2 + 3x3 = 4 is not linear. Neither are the equations x1^2 + 2x2 − x3 = 5 and sin x1 + cos x2 = π.
In this course, we will be interested in describing solutions not just to one linear equation but to systems of linear equations. Let m and n be positive integers. An m × n linear system over F = R or C is a set of m linear equations in n unknowns. Such a system may be expressed by writing

a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2
        · · ·
am1 x1 + am2 x2 + · · · + amn xn = bm        (1.2)
One of the major problems in linear algebra is the problem of existence and uniqueness of solutions to a given linear system. Given a system such as (1.2), we would like to answer the following questions:

1. Does (1.2) have a solution? That is, is the system consistent?

2. If (1.2) has a solution, is it unique?

3. If (1.2) has a solution, how stable is it, in the following sense: how is this solution affected by a “small” change in the data (i.e. the given coefficients aij )?

Questions (1) and (2) will be answered in this chapter, but answering the third requires an advanced course in linear algebra or matrix analysis.
For instance, when m = n = 2, such a system takes the form

a11 x1 + a12 x2 = b1
a21 x1 + a22 x2 = b2        (1.3)
[Figure: two lines in the plane (axes x and y); the solutions of a 2 × 2 system such as (1.3) correspond to the points where the lines intersect.]

2. Consider the system
5x1 − 2x2 = 1
3x1 − 4x2 = −5
We solve this by elimination. We label the first equation by (1) and the second by (2). First, we multiply equation (1) by 2 and subtract equation (2) from the result. We get

7x1 = 7,

which implies that x1 = 1. Now that we have x1 , we substitute it directly into the first equation to get x2 = 2. We see that the given system has a unique solution, (1, 2). A system of linear equations having a unique solution is referred to as an independent system. The use of the term will become apparent later.
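The elimination above can be sketched in code. The following is a minimal illustration (not from the text), and it uses the quantity a11 a22 − a21 a12 that reappears later as ∆; the function name is my own.

```python
def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x1 + a12*x2 = b1, a21*x1 + a22*x2 = b2 by elimination,
    assuming the system has a unique solution."""
    delta = a11 * a22 - a21 * a12      # zero exactly when elimination breaks down
    if delta == 0:
        raise ValueError("the system does not have a unique solution")
    # Eliminate x2 and solve for x1, then do the same for x2.
    x1 = (b1 * a22 - b2 * a12) / delta
    x2 = (a11 * b2 - a21 * b1) / delta
    return x1, x2

# The system 5x1 - 2x2 = 1, 3x1 - 4x2 = -5 from the text:
print(solve_2x2(5, -2, 1, 3, -4, -5))  # (1.0, 2.0)
```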
3. While a system can have a unique solution, it can also happen that a system has infinitely many solutions. Consider

3x1 + 3x2 − x3 = 8
6x1 + 8x2 − 2x3 = 3

Subtracting 2 times the first equation from the second gives 2x2 = −13, so x2 = −13/2. Substituting this value into the first and the second equations then yields the same equation twice:

6x1 − 2x3 = 55
6x1 − 2x3 = 55

Or simply 6x1 − 2x3 = 55. Just what do we mean by this? This means that the solutions are triples of the form (x1 , −13/2, x3 ) where 6x1 − 2x3 = 55. Notice that once we assign an arbitrary value to either one of x1 and x3 , the value of the other variable is uniquely determined. Thus, if we let x1 = t be any real (complex) number, then x3 = (6t − 55)/2. Thus, the solution set for the system is described by

S = { (t, −13/2, (6t − 55)/2) : t ∈ R (or C) }.
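As a quick sanity check (a sketch of my own, not part of the text), one can verify numerically that triples of the form (t, −13/2, (6t − 55)/2) satisfy 3x1 + 3x2 − x3 = 8 and 6x1 + 8x2 − 2x3 = 3, and hence also 6x1 − 2x3 = 55:

```python
def is_solution(x1, x2, x3):
    """True when (x1, x2, x3) satisfies all the equations of the system."""
    return (3*x1 + 3*x2 - x3 == 8
            and 6*x1 + 8*x2 - 2*x3 == 3
            and 6*x1 - 2*x3 == 55)

# Every value of the parameter t gives a solution of the system.
for t in (0, 1, 2.5, -7):
    assert is_solution(t, -13/2, (6*t - 55)/2)
```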
4. A linear system can possibly have no solution at all. For example, consider the
3 × 2 linear system
2x1 + 3x2 = 5
x1 − x2 = 0
x1 + x2 = 3
1.1. SYSTEMS OF LINEAR EQUATIONS 5
Note that there is only one way to satisfy the first and the second equations: by setting x1 = x2 = 1. But with these values, x1 + x2 = 2 ≠ 3, so the third equation can never be satisfied. We conclude that there is no ordered pair (x1 , x2 ) that satisfies all three equations.
We will now formalize the methods in the previous examples. Two linear systems are said to be equivalent if they have the same set of solutions. We will see in the next theorem how one can construct a linear system equivalent to a given linear system.
a11 x1 + a12 x2 = b1
a21 x1 + a22 x2 = b2        (1.4)

has a unique solution if and only if ∆ = a11 a22 − a21 a12 ≠ 0. The number ∆ will appear again in Chapter 4.
1.2 Matrices
Let F = R or C. An m × n matrix over F is a rectangular array of mn elements of F ,
called entries, arranged in m rows and n columns. Such a matrix may take the form
    [ a11  a12  · · ·  a1n ]
A = [ a21  a22  · · ·  a2n ]        (1.5)
    [  ..   ..   ..     .. ]
    [ am1  am2  · · ·  amn ]
where aij ∈ F for i ∈ {1, 2, . . . , m} and j ∈ {1, 2, . . . , n}. We will abbreviate the matrix
above as A = [aij ]. Take note that we use the indexing aij to denote the entry of the
matrix A in the (i, j)th position, i.e. the entry on the ith row and jth column. The
set of all m × n matrices over F is denoted Mm×n (F ). If F = C, we refer to a matrix
over F as a complex matrix and analogously, if F = R, a matrix over F is called a real
matrix. If m = n, we denote Mm×n (F ) =: Mn (F ). If A ∈ Mn (F ), we say A is a square
matrix.
Definition 1.2.1. We say that the matrices A = [aij ] and B = [bij ] are equal, written
A = B, if and only if A and B have the same number of rows and columns and aij = bij
for all i and for all j.
The definition above simply says that for two matrices to be equal, their corresponding entries should be equal.
Examples 1.2.1.
All the matrices above are complex matrices. However, only A, B and D are real matrices. A and D are both 2 × 2, so they are square matrices. Notice that A = B if and only if x and y satisfy the system
x+y =2
2x − y = 1
Definition 1.2.2. Let A, B ∈ Mm×n (F ). If A = [aij ] and B = [bij ], we define the sum A + B of A and B to be the matrix C = [cij ] where cij = aij + bij for all i and for all j.
In other words, if two matrices are of the same size, we can add them to get
a matrix whose entries are obtained by adding the corresponding entries of the given
matrices.
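Definition 1.2.2 can be sketched in a few lines of code. This is an illustration of my own (matrices stored as lists of rows; the helper name is not from the text):

```python
def mat_add(A, B):
    """Entrywise sum of two matrices of the same size (lists of rows)."""
    assert len(A) == len(B) and len(A[0]) == len(B[0]), "sizes must agree"
    return [[a + b for a, b in zip(rowA, rowB)]
            for rowA, rowB in zip(A, B)]

print(mat_add([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[6, 8], [10, 12]]
```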
Theorem 1.2.1 (Properties of Matrix Addition). Let A, B, C ∈ Mm×n (F ). Then

1. A + B = B + A.

2. (A + B) + C = A + (B + C).

3. There is a matrix Omn such that A + Omn = Omn + A = A for all A ∈ Mm×n (F ).

4. For each A ∈ Mm×n (F ) there is a matrix B ∈ Mm×n (F ) such that A + B = B + A = Omn .
Proof.
1. To see this, we first note that addition of numbers is commutative. If we let A = [aij ] and B = [bij ], then the (i, j)-entry of A + B is aij + bij = bij + aij , which is the (i, j)-entry of B + A. Thus A + B = B + A.
3. Simply take Omn = [ωij ] to be the m × n matrix with all entries 0, i.e. let ωij = 0 for all i and for all j. Then if A = [aij ], the (i, j)-entry of A + Omn is aij + 0 = aij , so A + Omn = A, and similarly Omn + A = A.
4. Let A = [aij ] be an m × n matrix. Consider the matrix B = [bij ] obtained by taking the negatives of the entries of A, i.e. let bij = −aij . Then the (i, j)-entry of A + B is aij + (−aij ) = 0, so A + B = Omn , and similarly B + A = Omn .
Remarks 1.2.1.
This simply means that no other matrix can satisfy property (3) of theorem 1.2.1 aside from Omn , i.e. the matrix in property (3) is unique. We call the matrix Omn the zero m × n matrix.
Suppose B and B′ both satisfy property (4), i.e. A + B = Omn = A + B′. Then

B = B + Omn = B + (A + B′) = (B + A) + B′ = (A + B) + B′ = Omn + B′ = B′.

Thus, the matrix B in property (4) is unique. Having said that, we will call the unique matrix B the negative of A, and we shall denote it by −A. From now on, the expression A − B, where A and B are matrices of the same size, shall mean the sum A + (−B).
For any A, B ∈ Mm×n (F ) and any scalars c, d ∈ F , the following hold:

1. c(A + B) = cA + cB.
2. (c + d)A = cA + dA.
3. c(dA) = (cd)A.
4. 1A = A.
5. 0A = Omn .
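The rules above are easy to spot-check on concrete matrices. The sketch below is my own (lists of rows, helper names are assumptions, not the text's notation):

```python
def scal(c, A):
    """Scalar multiple cA, entry by entry."""
    return [[c * a for a in row] for row in A]

def mat_add(A, B):
    """Entrywise sum of two same-size matrices."""
    return [[a + b for a, b in zip(r, s)] for r, s in zip(A, B)]

A = [[2, -3], [0, 1]]
B = [[1, 4], [-2, 5]]
c, d = 3, -2
assert scal(c, mat_add(A, B)) == mat_add(scal(c, A), scal(c, B))  # rule 1
assert scal(c + d, A) == mat_add(scal(c, A), scal(d, A))          # rule 2
assert scal(c, scal(d, A)) == scal(c * d, A)                      # rule 3
assert scal(1, A) == A                                            # rule 4
```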
Examples 1.2.2.
1. If A, B and C are the 3 × 1 columns with entries (1, 0, 1), (0, 1, 0) and (2, −1, 2), respectively, then 2A − B − C = O31 .
2. Let

A = [ 2  −3   0 ]        B = [ 1  4  −2 ]
    [ 1   0  −2 ]   and      [ 0  2   5 ] .

Find the matrix X such that A + 2X = B.
Solution. To solve for the matrix X, we first note that X must be a 2 × 3 matrix.
Let X = [xij ] where i ∈ {1, 2} and j ∈ {1, 2, 3}. The equation may be written as
[ 2  −3   0 ]       [ x11  x12  x13 ]   [ 1  4  −2 ]
[ 1   0  −2 ]  + 2  [ x21  x22  x23 ] = [ 0  2   5 ] ,
which simplifies to
[ 2 + 2x11   −3 + 2x12    2x13      ]   [ 1  4  −2 ]
[ 1 + 2x21    2x22       −2 + 2x23  ] = [ 0  2   5 ] .
Using the definition of matrix equality, we equate corresponding entries and solve
for the variables giving us x11 = −1/2, x12 = 7/2, x13 = −1, x21 = −1/2, x22 = 1
and x23 = 7/2. Therefore,
X = [ −1/2  7/2  −1  ]
    [ −1/2   1   7/2 ] .
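The entrywise computation above can be done in one line: from A + 2X = B we get X = (B − A)/2 entry by entry. A small sketch of my own (lists of rows):

```python
A = [[2, -3, 0], [1, 0, -2]]
B = [[1, 4, -2], [0, 2, 5]]
# Solve A + 2X = B entry by entry: x_ij = (b_ij - a_ij) / 2.
X = [[(b - a) / 2 for a, b in zip(rowA, rowB)]
     for rowA, rowB in zip(A, B)]
print(X)  # [[-0.5, 3.5, -1.0], [-0.5, 1.0, 3.5]]
```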
We say that a matrix B is a linear combination of the matrices A1 , A2 , . . . , Ak (all of the same size) if there are scalars a1 , a2 , . . . , ak such that

B = a1 A1 + a2 A2 + · · · + ak Ak .
For example, if E1 = [ 1 0 0 ], E2 = [ 0 1 0 ] and E3 = [ 0 0 1 ], note that any matrix in M1×3 (C) is a linear combination of E1 , E2 and E3 . To see this, let [ x y z ] be an arbitrary element of M1×3 (C). Then

[ x y z ] = xE1 + yE2 + zE3 .
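The identity above is easy to verify entrywise; here is a small sketch of my own (rows as Python lists, the helper name is an assumption):

```python
E1, E2, E3 = [1, 0, 0], [0, 1, 0], [0, 0, 1]

def comb(x, y, z):
    """The linear combination x*E1 + y*E2 + z*E3, computed entrywise."""
    return [x*e1 + y*e2 + z*e3 for e1, e2, e3 in zip(E1, E2, E3)]

assert comb(7, -2, 3) == [7, -2, 3]   # recovers the row [x y z]
```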
We will now study how we can multiply matrices. Unlike in the real (complex)
number system, we will see that not all pairs of matrices can be multiplied and matrix
multiplication is much more complicated than just addition.
Given A = [aij ] ∈ Mm×p (F ) and B = [bij ] ∈ Mp×n (F ), the product AB is defined to be the m × n matrix C = [cij ] where cij = ai1 b1j + ai2 b2j + · · · + aip bpj for each i and j.

This definition says that for a matrix A to be multiplied by a matrix B, the number of columns of A should agree with the number of rows of B. In effect, the resulting product AB has as many rows as A and as many columns as B. To illustrate the actual multiplication, refer to the schematic below:
[Schematic: in the product C = AB, the (i, j)-entry cij is obtained from the ith row (ai1 , ai2 , . . . , aip ) of A and the jth column (b1j , b2j , . . . , bpj ) of B.]
To get the (i, j)-th entry of the matrix product AB, one needs the ith row of A and the jth column of B. We multiply the corresponding entries and add the resulting products.
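This entry rule translates directly into code. A minimal sketch of my own (matrices as lists of rows; the helper name is not from the text):

```python
def mat_mul(A, B):
    """Product of an m x p matrix A and a p x n matrix B (lists of rows)."""
    p = len(B)
    assert len(A[0]) == p, "columns of A must equal rows of B"
    # c_ij = sum over k of a_ik * b_kj  -- ith row of A against jth column of B.
    return [[sum(A[i][k] * B[k][j] for k in range(p))
             for j in range(len(B[0]))] for i in range(len(A))]

print(mat_mul([[2, 0], [1, -1]], [[4, 1], [-1, 4]]))  # [[8, 2], [5, -3]]
```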
Multiplication of matrices might not be as easy as matrix addition, but one can see in Chapter 3 that this operation is very natural. Though the definition above is natural, some strange properties of matrix multiplication are present. For example, let A be a 3 × 4 matrix and B be a 4 × 2 matrix. While we can execute the product AB, one cannot multiply B by A, since the number of columns of B is 2 while the number of rows of A is 3, and 2 ≠ 3. Thus, BA is not defined. We will see more peculiarities of matrix multiplication in the following examples.
Examples 1.2.3.
1. Let

A = [ 2   0 ]        B = [  4  1 ]
    [ 1  −1 ]   and      [ −1  4 ] .

Note that both AB and BA are defined. We compute

AB = [ (2)(4) + (0)(−1)    (2)(1) + (0)(4)  ]   [ 8   2 ]
     [ (1)(4) + (−1)(−1)   (1)(1) + (−1)(4) ] = [ 5  −3 ] .

A similar computation gives

BA = [ 9  −1 ]
     [ 2  −4 ] ,

so AB ≠ BA even though both products are defined.
2. If U = [ 2 1 −1 ] and V is the 3 × 1 column with entries (1, 3, 5), then U V = [0], while

V U = [  2   1  −1 ]
      [  6   3  −3 ]
      [ 10   5  −5 ] .

Notice that neither U nor V is a zero matrix, but U V is.
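This peculiarity is easy to reproduce; the sketch below is my own (lists of rows, generic multiplication helper assumed, not the text's notation):

```python
def mat_mul(A, B):
    """Product of matrices given as lists of rows."""
    p = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(p))
             for j in range(len(B[0]))] for i in range(len(A))]

U = [[2, 1, -1]]        # 1 x 3 row
V = [[1], [3], [5]]     # 3 x 1 column
# Neither factor is zero, yet UV is the 1 x 1 zero matrix.
assert mat_mul(U, V) == [[0]]
assert mat_mul(V, U) == [[2, 1, -1], [6, 3, -3], [10, 5, -5]]
```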
If we let A = [aij ] for i = 1, 2, . . . , m and j = 1, 2, . . . , n, X be the n × 1 column with entries x1 , x2 , . . . , xn , and B be the m × 1 column with entries b1 , b2 , . . . , bm , then (1.6) may actually be written as

AX = B.
In this case, A is called the coefficient matrix. Thus, the solvability of (1.6) is equivalent to the solvability of the linear matrix equation AX = B. We will be more precise about this in the next section.
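As a concrete illustration (a sketch of my own, not from the text), the 2 × 2 system 5x1 − 2x2 = 1, 3x1 − 4x2 = −5 solved earlier becomes a single matrix equation:

```python
def mat_mul(A, B):
    """Product of matrices given as lists of rows."""
    p = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(p))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[5, -2], [3, -4]]   # coefficient matrix
X = [[1], [2]]           # the solution, written as an n x 1 column
B = [[1], [-5]]          # right-hand sides as an m x 1 column
assert mat_mul(A, X) == B   # one multiplication checks both equations
```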
If A = [aij ] ∈ Mn (F ) and In denotes the n × n matrix whose (i, i) entries are all 1 and whose other entries are all 0, one can check that AIn = In A = A. The matrix In will play an important part in our study, so it deserves a special name: we will call In the n × n identity matrix.
6. For a positive integer k, let F k denote the set of all k × 1 columns with entries a1 , a2 , . . . , ak ∈ F .
So in this case, as in the case of example 1.2.3 (2), we will write U V = 0. Like
matrix addition, matrix multiplication obeys some algebraic rules.
1. A(BC) = (AB)C whenever the products are defined.

2. A(B + C) = AB + AC and (B + C)A = BA + CA whenever the sums and products are defined.

3. (rA)(sB) = (rs)(AB) for any matrices A and B and for any scalars r and s.
Proof. We will only prove (1) and the proofs of (2) and (3) are analogous. Let
A = [aij ] ∈ Mm×n , B = [bij ] ∈ Mn×p and C = [cij ] ∈ Mp×q . If AB = [dij ], then
dij = Σ_{k=1}^n aik bkj .

Similarly, if BC = [fij ], then

fij = Σ_{r=1}^p bir crj .
Thus, the (i, j)-entry of (AB)C is

Σ_{s=1}^p dis csj = Σ_{s=1}^p ( Σ_{k=1}^n aik bks ) csj .
Now, the (i, j)-entry of A(BC) is

Σ_{t=1}^n ait ftj = Σ_{t=1}^n ait ( Σ_{r=1}^p btr crj )
                 = Σ_{t=1}^n ( ait bt1 c1j + ait bt2 c2j + · · · + ait btp cpj )
                 = ( Σ_{t=1}^n ait bt1 ) c1j + ( Σ_{t=1}^n ait bt2 ) c2j + · · · + ( Σ_{t=1}^n ait btp ) cpj
                 = Σ_{r=1}^p ( Σ_{t=1}^n ait btr ) crj ,

which is the same as the (i, j)-entry of (AB)C obtained above. Hence A(BC) = (AB)C.
From now on, when we wish to write a 1 × 1 matrix, we shall identify it with a scalar, i.e. we will write a in place of [a]. There is no harm in doing this since the elements of M1 (F ) add and multiply exactly as the elements of F do.

Some classes of matrices have special characteristics and thus deserve special names. A matrix having an equal number of rows and columns is called a square matrix.
The set of all square n × n matrices will be denoted by Mn . Note that Mn is closed under the fundamental matrix operations, i.e. sums, products, and scalar multiples of matrices in Mn are again in Mn . In particular, for A ∈ Mn we may form powers of A:

A^n = A A · · · A   (n factors).
[Display: a 3 × 3 diagonal matrix D.]

Such a matrix is called a diagonal matrix: D has the property that dij = 0 for i ≠ j. Moreover, D is both upper triangular and lower triangular. For diagonal matrices, the matrix operations are more natural. It can be shown that sums and products of diagonal matrices of the same size are again diagonal, with diagonal entries obtained by adding or multiplying the corresponding diagonal entries.
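The claim about products of diagonal matrices can be spot-checked; the sketch below is my own (lists of rows, helper names assumed):

```python
def mat_mul(A, B):
    """Product of matrices given as lists of rows."""
    p = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(p))
             for j in range(len(B[0]))] for i in range(len(A))]

def diag(entries):
    """Square diagonal matrix with the given diagonal entries."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

D1 = diag([-1, 0, 4])
D2 = diag([3, 2, 5])
# The product is again diagonal, with diagonal entries multiplied entrywise.
assert mat_mul(D1, D2) == diag([-3, 0, 20])
```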
The transpose of a matrix A = [aij ] ∈ Mm×n (F ), written AT , is the n × m matrix whose (i, j)-entry is aji . This simply tells us that the transpose of a matrix A is the matrix whose rows are the columns of A. Consequently, the rows of A are the columns of its transpose. The following properties of the matrix transpose are not surprising.
Theorem 1.2.4 (Properties of Matrix Transpose). For any matrices A and B, and any scalar r, for which the following operations are possible,

1. (AT )T = A,

2. (A + B)T = AT + B T ,

3. (rA)T = rAT ,

4. (AB)T = B T AT ,

5. (A1 A2 · · · Ak )T = ATk · · · AT2 AT1 .
Proof. (1), (2) and (3) are easy. We will only prove (4); (5) then follows from (4) by induction. Let A = [aij ] ∈ Mm×n and B = [bij ] ∈ Mn×m . Then AT is n × m and B T is m × n, so B T AT is m × m. Moreover, (AB)T is also m × m. We now verify the equality. Let C = [cij ] = AB. If D = [dij ] = C T , then

dij = cji = Σ_{k=1}^n ajk bki .
Now write B T = [eij ] and AT = [fij ], so that eij = bji and fij = aji , and let B T AT = [gij ]. Then

gij = Σ_{k=1}^n eik fkj = Σ_{k=1}^n bki ajk = Σ_{k=1}^n ajk bki = cji = dij .

Hence (AB)T = B T AT .
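A numerical spot-check of (AB)T = B T AT (an illustration of my own, not a proof; matrices as lists of rows):

```python
def transpose(A):
    """Rows of the result are the columns of A."""
    return [list(col) for col in zip(*A)]

def mat_mul(A, B):
    """Product of matrices given as lists of rows."""
    p = len(B)
    return [[sum(A[i][k] * B[k][j] for k in range(p))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[2, 0], [1, -1], [3, 4]]   # 3 x 2
B = [[1, 2, 0], [-1, 0, 5]]     # 2 x 3
assert transpose(mat_mul(A, B)) == mat_mul(transpose(B), transpose(A))
```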
We now explore some classes of square matrices which are invariant under matrix transpose. A matrix A ∈ Mn (F ) is called symmetric if AT = A, and skew-symmetric if AT = −A. Note that if A is symmetric, then so are the matrices AT and −A.
1 This is true provided that the field of scalars is of characteristic not equal to 2, i.e. if 1 + 1 ≠ 0.
1. If A is any matrix of any size, then AT A and AAT are square matrices. Moreover,
these matrices are symmetric.
2. Every square matrix A can be written as a sum A = S + S′ of a symmetric matrix S and a skew-symmetric matrix S′.
Examples 1.2.4.
1. Given

A = [ 1  −3  0 ]
    [ 2   1  1 ] ,

verify that AT A and AAT are symmetric matrices using direct computation.
Sol’n. Note that

        [  1   2 ]                   [  5  −1   2 ]
AT A =  [ −3   1 ]  [ 1  −3  0 ]  =  [ −1  10   1 ]
        [  0   1 ]  [ 2   1  1 ]     [  2   1   1 ]

and

       [ 1  −3  0 ]  [  1   2 ]     [ 10  −1 ]
AAT =  [ 2   1  1 ]  [ −3   1 ]  =  [ −1   6 ]
                     [  0   1 ]

which are symmetric matrices.
2. Let

A = [ 2   4 ]
    [ 0  −2 ] .

Write A as a sum of a symmetric and a skew-symmetric matrix.
Sol’n. Let S = (1/2)(A + AT ) and S′ = (1/2)(A − AT ). Then
S = 1/2 ( [ 2   4 ]     [ 2   0 ] )     [ 2   2 ]
        ( [ 0  −2 ]  +  [ 4  −2 ] )  =  [ 2  −2 ] .
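The decomposition can also be checked numerically. The sketch below is my own (plain Python, matrices as lists of rows); it verifies that S is symmetric, S′ is skew-symmetric, and that they sum back to A:

```python
def transpose(A):
    """Rows of the result are the columns of A."""
    return [list(col) for col in zip(*A)]

A = [[2, 4], [0, -2]]
AT = transpose(A)
S  = [[(A[i][j] + AT[i][j]) / 2 for j in range(2)] for i in range(2)]
Sp = [[(A[i][j] - AT[i][j]) / 2 for j in range(2)] for i in range(2)]

assert S == transpose(S)                                    # S is symmetric
assert Sp == [[-x for x in row] for row in transpose(Sp)]   # Sp is skew-symmetric
assert [[S[i][j] + Sp[i][j] for j in range(2)] for i in range(2)] == A
```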