Contents

Chapter 1. Introduction
Chapter 2. Formal definitions and notation
Chapter 3. Basic operations
  3.1. Transpose of a matrix
  3.2. Matrix addition
  3.3. Partitioned matrices
  3.4. Scalar multiplication
  3.5. Vector multiplication
  3.6. Matrix multiplication
  3.7. Trace of a matrix
Chapter 4. Determinant of a matrix
  4.1. Definition
  4.2. The determinant of matrices of order 1, 2 and 3
  4.3. The cofactor expansion
  4.4. Properties
Chapter 5. Linearly independent vectors
  5.1. Linear combinations of vectors
  5.2. Linearly independent vectors
Chapter 6. Rank
  6.1. Definition
  6.2. Maximal number of linearly independent vectors
  6.3. Row rank
  6.4. The rank of a product matrix
  6.5. Systems of linear equations
Chapter 7. Inverse matrices
  7.1. Existence and uniqueness of the inverse matrix
  7.2. Computation of the inverse
  7.3. Ordinary Least Squares
Chapter 8. Vector spaces, spanning sets and projection matrices
CHAPTER 1

Introduction
These notes cover matrix algebra results that are useful to econometricians. They are based on Greene (2008), Searle (1982), Rao (1973) and Harville (1997). Exercises are given throughout, and I recommend that the reader work through all of them, refraining from looking at the solution that follows each exercise until he or she has produced a solution or, at least, made enough of an effort.
Consider Table 1, showing disposable income and aggregate consumption for the U.S. over the years 1940-1950.
The terms of a square matrix that lie on the line parallel to and just below the main diagonal are referred to as the first lower subdiagonal of $A$, the terms lying on the line parallel to and just below the first lower subdiagonal are referred to as the second lower subdiagonal, and so on. Upper subdiagonals are defined similarly. In the foregoing matrix $A$ the main diagonal is given by the terms $(2.1, 5.1, 2.1, 7.2)$, and the first lower and upper subdiagonals are $(3.5, 0, 0.4)$ and $(1.5, 3.4, 6.7)$, respectively. The elements of a square matrix other than the main diagonal are referred to as the off-diagonal or non-diagonal terms.
A square matrix whose off-diagonal terms are all zero is called a diagonal matrix, for example
$$A = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 7 & 0 \\ 0 & 0 & 5 \end{pmatrix}.$$
A diagonal matrix whose main diagonal has all unity elements is called an identity matrix and is denoted by $I$. If $I$ is of a given order $n$, then it is usually denoted by $I_n$, for example
$$I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
A matrix consisting of only a single column is called a column vector,
$$x = \begin{pmatrix} x_{11} \\ x_{21} \\ \vdots \\ x_{n1} \end{pmatrix},$$
the order of which is $(n \times 1)$. It is denoted by lower-case letters in bold. A matrix consisting of only a single row is called a row vector,
$$x' = (x_{11} \;\; x_{12} \;\; \dots \;\; x_{1n}).$$
CHAPTER 3

Basic operations
$$x = \begin{pmatrix} 1 \\ 6 \\ 5 \\ 9 \end{pmatrix}, \qquad x' = (1 \;\; 6 \;\; 5 \;\; 9).$$
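The notes are purely pen-and-paper; as an aside, the snippet below reproduces a few of these basic objects numerically. It assumes Python with numpy, which is not part of the original notes, and is only a minimal sketch.

```python
import numpy as np

# A column vector x of order (4 x 1) and its transpose x', a (1 x 4) row vector.
x = np.array([[1], [6], [5], [9]])
print(x.T)               # [[1 6 5 9]]

# A diagonal matrix and the identity matrix I_3, as in the examples above.
A = np.diag([2, 7, 5])
I3 = np.eye(3)
```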
Notice that $A_{11}$ and $A_{12}$ have the same number of columns as $A_{21}$ and $A_{22}$, respectively, and that $A_{11}$ and $A_{21}$ have the same number of rows as $A_{12}$ and $A_{22}$, respectively. This is no accident: a proper partitioning requires that the horizontal and vertical dashed lines go the full length of the matrix. In general terms, the partitioning of a generic matrix $A_{n\times k}$ into four submatrices as in (3.3.2) is called a $2 \times 2$ partitioning and can always be represented as
$$A = \begin{pmatrix} K_{n_1\times k_1} & L_{n_1\times k_2} \\ M_{n_2\times k_1} & N_{n_2\times k_2} \end{pmatrix}$$
where $n_1 + n_2 = n$ and $k_1 + k_2 = k$. Partitioning can involve fewer or more than four submatrices; for example, both
$$(3.3.3)\qquad A = \left(\begin{array}{cc|cccc} 3 & 4 & 5 & 6 & 8 & 3 \\ 4 & 7 & 9 & 1 & 4 & 2 \\ 2 & 8 & 8 & 6 & 7 & 1 \\ 4 & 3 & 0 & 7 & 0 & 9 \\ 4 & 7 & 5 & 3 & 3 & 5 \end{array}\right)$$
and
$$(3.3.4)\qquad A = \left(\begin{array}{cc|cccc} 3 & 4 & 5 & 6 & 8 & 3 \\ 4 & 7 & 9 & 1 & 4 & 2 \\ \hline 2 & 8 & 8 & 6 & 7 & 1 \\ 4 & 3 & 0 & 7 & 0 & 9 \\ \hline 4 & 7 & 5 & 3 & 3 & 5 \end{array}\right)$$
are legitimate partitioned matrices of $A$. Partitioning a generic matrix $A_{n\times k}$ into two submatrices as in (3.3.3) is called a $1 \times 2$ partitioning and can always be represented, in general terms, as
$$A = \begin{pmatrix} K_{n\times k_1} & L_{n\times k_2} \end{pmatrix}$$
where $k_1 + k_2 = k$. Partitioning a matrix $A_{n\times k}$ into six submatrices as in (3.3.4) is called a $3 \times 2$ partitioning.
Exercise 3. 1) Formulate the $3 \times 2$ partitioning of a matrix $A_{n\times k}$ in general terms. 2) Find a $2 \times 1$ partitioning of $A$ of (3.3.1). 3) Formulate the $2 \times 1$ partitioning of a matrix $A_{n\times k}$ in general terms.
Solution: 1)
$$A_{n\times k} = \begin{pmatrix} K_{n_1\times k_1} & L_{n_1\times k_2} \\ M_{n_2\times k_1} & N_{n_2\times k_2} \\ O_{n_3\times k_1} & P_{n_3\times k_2} \end{pmatrix}$$
with $n = n_1 + n_2 + n_3$ and $k = k_1 + k_2$.
2)
$$A = \left(\begin{array}{cccccc} 3 & 4 & 5 & 6 & 8 & 3 \\ 4 & 7 & 9 & 1 & 4 & 2 \\ 2 & 8 & 8 & 6 & 7 & 1 \\ \hline 4 & 3 & 0 & 7 & 0 & 9 \\ 4 & 7 & 5 & 3 & 3 & 5 \end{array}\right)$$
3)
$$A_{n\times k} = \begin{pmatrix} K_{n_1\times k} \\ L_{n_2\times k} \end{pmatrix}$$
with $n = n_1 + n_2$.
Example 4. Partitioning is important in regression analysis, for instance when we wish to keep the variables of interest distinct from the other regressors in the sample regressor matrix $X$. So, if in
$$X = \begin{pmatrix} 0.1 & 1.2 & 1 \\ 0.4 & 1.8 & 2 \\ 0.6 & 1.8 & 3 \\ 0.1 & 1.9 & 4 \\ 0.3 & 1.7 & 5 \\ 0.1 & 1.3 & 6 \end{pmatrix}$$
the first two columns are the explanatory variables of interest, whereas the last is only a control variable, we may find it convenient to represent $X$ as a partitioned matrix $X = (X_1 \;\; X_2)$, where
$$X_1 = \begin{pmatrix} 0.1 & 1.2 \\ 0.4 & 1.8 \\ 0.6 & 1.8 \\ 0.1 & 1.9 \\ 0.3 & 1.7 \\ 0.1 & 1.3 \end{pmatrix} \quad\text{and}\quad X_2 = \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \\ 5 \\ 6 \end{pmatrix}.$$
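As a quick numerical aside (not part of the original notes, and assuming numpy is available), the $1 \times 2$ column partitioning of Example 4 amounts to slicing and reassembling:

```python
import numpy as np

X = np.array([[0.1, 1.2, 1],
              [0.4, 1.8, 2],
              [0.6, 1.8, 3],
              [0.1, 1.9, 4],
              [0.3, 1.7, 5],
              [0.1, 1.3, 6]])

X1, X2 = X[:, :2], X[:, 2:]                    # a 1 x 2 partitioning by columns
assert np.array_equal(np.hstack([X1, X2]), X)  # reassembling the blocks recovers X
```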
Remark 1. It is worth noting that $A$ of (3.3.1) and $A$ of (3.3.2) (or $A$ of (3.3.1) and $A$ of (3.3.3), for that matter) are the same matrix of order $(5 \times 6)$ from a mathematical point of view. Partitioning just makes explicit some qualitative difference among the entries of the matrix, as in Example 4, where the difference is between the entries of the first two columns, observations on the variables of interest, and the entries of the last column, observations on the control variable.
The transpose of a partitioned matrix is the transpose of the matrix of submatrices. It can be carried out in two logical steps. First, transpose the matrix of submatrices as if you did not know that its elements are matrices; second, transpose the submatrices. For example, let
$$A = \begin{pmatrix} B & C \end{pmatrix},$$
then
$$A' = \begin{pmatrix} B' \\ C' \end{pmatrix};$$
or let
$$A = \begin{pmatrix} B & C & D \\ E & F & G \end{pmatrix},$$
then
$$A' = \begin{pmatrix} B' & E' \\ C' & F' \\ D' & G' \end{pmatrix}.$$
The transpose of an $l \times m$ partitioned matrix is always an $m \times l$ partitioned matrix.
Exercise 4. Transpose the partitioned matrix of (3.3.4) and verify that it is a $2 \times 3$ partitioning of the transpose of $A$ in (3.3.1).
Solution:
$$A' = \left(\begin{array}{cc|cc|c} 3 & 4 & 2 & 4 & 4 \\ 4 & 7 & 8 & 3 & 7 \\ \hline 5 & 9 & 8 & 0 & 5 \\ 6 & 1 & 6 & 7 & 3 \\ 8 & 4 & 7 & 0 & 3 \\ 3 & 2 & 1 & 9 & 5 \end{array}\right),$$
which is indeed a $2 \times 3$ partitioning of
$$A' = \begin{pmatrix} 3 & 4 & 2 & 4 & 4 \\ 4 & 7 & 8 & 3 & 7 \\ 5 & 9 & 8 & 0 & 5 \\ 6 & 1 & 6 & 7 & 3 \\ 8 & 4 & 7 & 0 & 3 \\ 3 & 2 & 1 & 9 & 5 \end{pmatrix}.$$
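The two-step rule for transposing a partitioned matrix can be checked numerically. The sketch below (my addition, assuming numpy) transposes the grid of blocks of (3.3.4) and then each block, and compares the result with the ordinary transpose:

```python
import numpy as np

A = np.array([[3, 4, 5, 6, 8, 3],
              [4, 7, 9, 1, 4, 2],
              [2, 8, 8, 6, 7, 1],
              [4, 3, 0, 7, 0, 9],
              [4, 7, 5, 3, 3, 5]])

# The 3 x 2 partitioning of (3.3.4): row blocks {1,2}, {3,4}, {5}; column blocks {1,2}, {3..6}.
blocks = [[A[:2, :2], A[:2, 2:]],
          [A[2:4, :2], A[2:4, 2:]],
          [A[4:, :2], A[4:, 2:]]]

# Step 1: transpose the grid of blocks; step 2: transpose each block.
At_blocks = [[blocks[i][j].T for i in range(3)] for j in range(2)]
At = np.block(At_blocks)
assert np.array_equal(At, A.T)   # identical to the ordinary transpose
```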
Example 5. Let
$$A = \begin{pmatrix} 3 & 4 \\ 1 & 7 \end{pmatrix},$$
then
$$3.5A = \begin{pmatrix} 10.5 & 14 \\ 3.5 & 24.5 \end{pmatrix}.$$
$$A_{n\times k} = \begin{pmatrix} a_1' \\ a_2' \\ \vdots \\ a_n' \end{pmatrix},$$
where
$$a_i' = (a_{i1} \;\; a_{i2} \;\; a_{i3} \;\; \dots \;\; a_{ik}),$$
and
$$A_{n\times k}\,0_k = 0_n\,A_{n\times k} = 0_{n\times k},$$
where $0_k$ and $0_n$ denote square zero matrices of order $k$ and $n$, and $0_{n\times k}$ the $(n \times k)$ zero matrix.
Exercise 6. Given
$$A = \begin{pmatrix} 1 & 2 & 3 & 7 \\ 5 & 6 & 0 & 2 \\ 0 & 1 & 3 & 8 \\ 1 & 2 & 3 & 3 \\ 9 & 3 & 6 & 5 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 0 & 0 \\ 1 & 0 \\ 6 & 5 \\ 0 & 0 \end{pmatrix},$$
is the matrix product $AB$ defined? If yes, compute it. Is the matrix product $BA$ defined? If yes, compute it.
Solution: $AB$ is defined,
$$AB = \begin{pmatrix} 20 & 15 \\ 6 & 0 \\ 19 & 15 \\ 20 & 15 \\ 39 & 30 \end{pmatrix};$$
$BA$ is not defined.
Exercise 7. Given
$$A = \begin{pmatrix} 1 \\ 5 \\ 0 \\ 1 \\ 9 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 0 & 2 & 8 \end{pmatrix},$$
is the matrix product $AB$ defined? If yes, compute it. Is the matrix product $BA$ defined? If yes, compute it.
Solution: $AB$ is defined,
$$AB = \begin{pmatrix} 0 & 2 & 8 \\ 0 & 10 & 40 \\ 0 & 0 & 0 \\ 0 & 2 & 8 \\ 0 & 18 & 72 \end{pmatrix};$$
$BA$ is not defined.
Exercise 8. Given
$$A = \begin{pmatrix} 1 \\ 5 \\ 0 \\ 1 \\ 9 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 0 & 2 & 8 & 5 & 1 \end{pmatrix},$$
is the matrix product $AB$ defined? If yes, compute it. Is the matrix product $BA$ defined? If yes, compute it.
Solution: $AB$ is defined,
$$AB = \begin{pmatrix} 0 & 2 & 8 & 5 & 1 \\ 0 & 10 & 40 & 25 & 5 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 2 & 8 & 5 & 1 \\ 0 & 18 & 72 & 45 & 9 \end{pmatrix};$$
$BA$ is defined, $BA = 24$.
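The contrast between Exercises 7 and 8 (an outer product versus a scalar inner product) is easy to verify numerically; the check below is my addition, assuming numpy:

```python
import numpy as np

a = np.array([[1], [5], [0], [1], [9]])   # (5 x 1) column vector
b = np.array([[0, 2, 8, 5, 1]])           # (1 x 5) row vector

AB = a @ b     # (5 x 5): each row of AB is a scalar multiple of b
BA = b @ a     # (1 x 1): the inner product
print(BA)      # [[24]]
```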
3.6.1. The classical regression model in matrix form. Consider the classical regression model for a data set with $n$ observations,
$$(3.6.1)\qquad y_i = \beta_1 x_{i,1} + \dots + \beta_k x_{i,k} + \varepsilon_i,$$
where $y_i$ denotes the value of the dependent variable, $x_{i,h}$ the value of the $h$-th regressor, $h = 1, \dots, k$, and $\varepsilon_i$ the value of the random shock, all of them at the $i$-th observation, $i = 1, \dots, n$.
The $n$ equations (3.6.1) can be expressed compactly in matrix form as
$$y = X\beta + \varepsilon,$$
where
$$y = \begin{pmatrix} y_1 \\ \vdots \\ y_i \\ \vdots \\ y_n \end{pmatrix}, \quad X = \begin{pmatrix} x_{1,1} & \cdots & x_{1,h} & \cdots & x_{1,k} \\ \vdots & & \vdots & & \vdots \\ x_{i,1} & \cdots & x_{i,h} & \cdots & x_{i,k} \\ \vdots & & \vdots & & \vdots \\ x_{n,1} & \cdots & x_{n,h} & \cdots & x_{n,k} \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_h \\ \vdots \\ \beta_k \end{pmatrix}, \quad \varepsilon = \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_i \\ \vdots \\ \varepsilon_n \end{pmatrix}.$$
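The compact form $y = X\beta + \varepsilon$ is exactly one matrix product plus one vector sum. The sketch below (my addition, with illustrative simulated values and assuming numpy) builds the $n$ equations in a single line:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 3
X = rng.normal(size=(n, k))           # (n x k) regressor matrix
beta = np.array([1.0, -2.0, 0.5])     # coefficients beta_1, ..., beta_k (values are mine)
eps = rng.normal(size=n)              # length-n vector of random shocks

# All n equations y_i = beta_1 x_{i,1} + ... + beta_k x_{i,k} + eps_i at once:
y = X @ beta + eps
```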
3.6.3. The laws of algebra. The associative law holds for matrix multiplication, provided that the matrices are conformable. Given $A_{n\times k}$, $B_{k\times l}$ and $C_{l\times m}$, then
$$A(BC) = (AB)C.$$
The distributive law also holds:
$$A(B + C) = AB + AC,$$
provided that, on the one hand, $A$ and $B$ are conformable for matrix multiplication and, on the other, $B$ and $C$ are conformable for matrix addition.
As already seen, the commutative law does not hold, since it is not generally true that $AB = BA$, even if both products are defined. In special cases it may happen that $AB = BA$, as for the following matrices:
$$(3.6.2)\qquad A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 0 & 2 \\ 3 & 3 \end{pmatrix}.$$
Finally, by the associative law for matrix multiplication and property (3),
$$\operatorname{tr}\left[A_{n\times k}(B_{k\times m}C_{m\times n})\right] = \operatorname{tr}\left[B_{k\times m}(C_{m\times n}A_{n\times k})\right] = \operatorname{tr}\left[C_{m\times n}A_{n\times k}B_{k\times m}\right].$$
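Both the commuting pair in (3.6.2) and the cyclic invariance of the trace can be confirmed numerically; a minimal check, assuming numpy (not part of the original notes):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 2], [3, 3]])
assert np.array_equal(A @ B, B @ A)      # the special commuting pair in (3.6.2)

# Cyclic invariance of the trace with conformable factors (2x3)(3x4)(4x2):
A2 = np.arange(6).reshape(2, 3)
B2 = np.arange(12).reshape(3, 4)
C2 = np.arange(8).reshape(4, 2)
assert np.trace(A2 @ B2 @ C2) == np.trace(B2 @ C2 @ A2) == np.trace(C2 @ A2 @ B2)
```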
CHAPTER 4
Determinant of a matrix
4.1. Definition
Loosely speaking, the determinant is a function of the elements of a matrix and is defined only for square matrices. While the formal definition is cumbersome (readers are referred to Rao (1973), p. 22, or Searle (1982), p. 90), it is also not strictly needed for our econometric purposes. Below, I show how to obtain the determinant of square matrices of order 1, 2 and 3 and then provide a general computational rule that applies to any square matrix.
4.2. The determinant of matrices of order 1, 2 and 3

For a matrix of order 3, the determinant can be obtained by considering either:
(1) its five diagonals (running from upper left to lower right): the main diagonal, the first lower sub-diagonal, given by the elements $a_{21}$ and $a_{32}$, the second lower sub-diagonal, given by $a_{31}$, and the first and second upper sub-diagonals, with elements symmetrical to those in the corresponding lower ones,
$$A_{3\times 3} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix};$$
(2) or, alternatively, its five anti-diagonals (running from upper right to lower left), the main anti-diagonal being $(a_{13}, a_{22}, a_{31})$.
Then, $\det(A_{3\times 3})$ is the sum of the product of the main-diagonal terms, $a_{11}a_{22}a_{33}$, the product of the first lower sub-diagonal terms and the second upper sub-diagonal term, $a_{21}a_{32}a_{13}$, and the product of the first upper sub-diagonal terms and the second lower sub-diagonal term, $a_{12}a_{23}a_{31}$, minus the sum of the products taken in the same way along the anti-diagonals: the product of the main anti-diagonal terms, $a_{13}a_{22}a_{31}$, the product of the first lower sub-anti-diagonal terms and the second upper sub-anti-diagonal term, $a_{23}a_{32}a_{11}$, and the product of the first upper sub-anti-diagonal terms and the second lower sub-anti-diagonal term, $a_{12}a_{21}a_{33}$. In compact form,
$$\det(A_{3\times 3}) = a_{11}a_{22}a_{33} + a_{21}a_{32}a_{13} + a_{12}a_{23}a_{31} - a_{13}a_{22}a_{31} - a_{23}a_{32}a_{11} - a_{12}a_{21}a_{33}.$$
Exercise 11. Compute the determinant of the matrix
$$A = \begin{pmatrix} 1 & 1 & -2 \\ 3 & 5 & 4 \\ 1 & 0 & 2 \end{pmatrix}.$$
Solution: $\det(A) = 18$.
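A one-line numerical check of Exercise 11 (my addition, assuming numpy):

```python
import numpy as np

A = np.array([[1, 1, -2],
              [3, 5, 4],
              [1, 0, 2]])
print(np.linalg.det(A))   # 18.0, up to floating-point error
```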
4.3. The cofactor expansion

Given a square matrix $A$ of order $n$, select any row $i$ and, for each element $a_{ij}$, let $A_{ij}$ denote the submatrix obtained by deleting the $i$-th row and $j$-th column of $A$ (that is, the row and column containing $a_{ij}$). Then, multiply each term $a_{ij}\det(A_{ij})$ by $(-1)^{i+j}$. The sum of all resulting products is $\det(A)$:
$$\det(A) = \sum_{j=1}^{n} (-1)^{i+j}\, a_{ij} \det(A_{ij}).$$
For example, for a matrix of order 3, expanding along the first row gives
$$\begin{aligned} \det(A_{3\times 3}) &= (-1)^2 a_{11}(a_{22}a_{33} - a_{23}a_{32}) + (-1)^3 a_{12}(a_{21}a_{33} - a_{23}a_{31}) + (-1)^4 a_{13}(a_{21}a_{32} - a_{22}a_{31}) \\ &= a_{11}a_{22}a_{33} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31}. \end{aligned}$$
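The cofactor expansion translates directly into a short recursive routine. The sketch below is my own illustration (assuming numpy), expanding along the first row exactly as in the formula above:

```python
import numpy as np

def det_cofactor(A):
    """Determinant by cofactor expansion along the first row."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        # Delete row 1 and column j+1 (0-based: row 0, column j) to get the minor A_1j.
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        # With 1-based indices the sign is (-1)^(1+(j+1)) = (-1)^j in 0-based terms.
        total += (-1) ** j * A[0, j] * det_cofactor(minor)
    return total

# det_cofactor([[1, 1, -2], [3, 5, 4], [1, 0, 2]]) returns 18.0, as in Exercise 11.
```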
Exercise 13. Consider
$$A_{2\times 2} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}.$$
Obtain $\det(A_{2\times 2})$ by applying a cofactor expansion and formula (4.2.1), and verify that it is identical to that in (4.2.2).
Solution:
$$\det(A_{2\times 2}) = (-1)^2 a_{11}a_{22} + (-1)^3 a_{12}a_{21} = a_{11}a_{22} - a_{12}a_{21}.$$
CHAPTER 5

Linearly independent vectors

5.1. Linear combinations of vectors

Given a $(k \times 1)$ vector
$$c = \begin{pmatrix} c_1 \\ \vdots \\ c_k \end{pmatrix},$$
a linear combination of the columns of $A_{n\times k}$ is given by the $(n \times 1)$ column vector
$$A_{n\times k}\,c_{k\times 1} = c_1 a_1 + c_2 a_2 + \dots + c_k a_k.$$
Example 7. Let
$$A = \begin{pmatrix} 2 & 1 \\ 4 & 3 \\ 6 & 5 \end{pmatrix} \quad\text{and}\quad c = \begin{pmatrix} 0.5 \\ 2 \end{pmatrix},$$
then
$$Ac = 0.5\begin{pmatrix} 2 \\ 4 \\ 6 \end{pmatrix} + 2\begin{pmatrix} 1 \\ 3 \\ 5 \end{pmatrix}.$$
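That $Ac$ equals the weighted sum of the columns of $A$ is worth verifying once; a minimal check of Example 7 (my addition, assuming numpy):

```python
import numpy as np

A = np.array([[2, 1], [4, 3], [6, 5]])
c = np.array([0.5, 2.0])

lhs = A @ c                               # the matrix-vector product Ac
rhs = 0.5 * A[:, 0] + 2.0 * A[:, 1]       # c1 * a1 + c2 * a2
assert np.allclose(lhs, rhs)
```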
Example 8. Suppose you have data on gross income, $y$, and income tax payments, $t$, for a cross-section of $n$ households, so that
$$y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}, \qquad t = \begin{pmatrix} t_1 \\ \vdots \\ t_n \end{pmatrix}.$$
Then, net income, $x$, is a linear combination of $y$ and $t$:
$$x = 1\cdot\begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} + (-1)\cdot\begin{pmatrix} t_1 \\ \vdots \\ t_n \end{pmatrix} = y - t.$$
where $C_{k\times m} = (c_1 \;\; c_2 \;\; \dots \;\; c_m)$ and each $c_i$ is a $(k \times 1)$ column vector. This can be seen more clearly by expanding $A_{n\times k}C_{k\times m}$ as follows:
$$A_{n\times k}C_{k\times m} = \begin{pmatrix} A_{n\times k}c_1 & A_{n\times k}c_2 & \dots & A_{n\times k}c_m \end{pmatrix}.$$
Exercise 15. Given a matrix $A_{n\times k}$, represent the two cases of 1 and $m$ generic linear combinations of the rows of $A_{n\times k}$ in matrix form.
Solution: Let
$$A_{n\times k} = \begin{pmatrix} a_1' \\ a_2' \\ \vdots \\ a_n' \end{pmatrix},$$
where $a_i' = (a_{i1} \;\dots\; a_{ij} \;\dots\; a_{ik})$, $i = 1, \dots, n$, and let
$$c' = (c_1 \;\dots\; c_n);$$
then one generic linear combination of the rows is the $(1 \times k)$ row vector $c'A_{n\times k} = c_1 a_1' + \dots + c_n a_n'$, and $m$ such combinations are the rows of $C'A_{n\times k}$, where $C'$ is an $(m \times n)$ matrix whose rows are vectors like $c'$.
The columns of $A_{n\times k}$ are linearly dependent when some non-trivial linear combination of them equals the zero vector, that is, when there exists a vector $c$ of coefficients so that $c$ is non-null and $A_{n\times k}c = 0_{n\times 1}$.
Exercise 18. Consider a matrix
$$A_{n\times k} = \begin{pmatrix} a_1 & a_2 & \dots & a_k \end{pmatrix}$$
whose columns are non-zero and linearly dependent. Prove that at least one column of $A$ can be obtained as a linear combination of its predecessors.
Solution: There exists a non-zero $x_{k\times 1}$ such that $A_{n\times k}x_{k\times 1} = 0_{n\times 1}$, or
$$(5.2.1)\qquad x_1 a_1 + x_2 a_2 + \dots + x_k a_k = 0_{n\times 1},$$
and since all vectors are non-zero, there exist at least two non-zero components of $x_{k\times 1}$. Among all non-zero components of $x_{k\times 1}$, take the one with the highest subscript, say $x_i$, and solve system (5.2.1) for $a_i$:
$$a_i = -\frac{1}{x_i}\left(x_1 a_1 + \dots + x_{i-1} a_{i-1}\right),$$
which expresses $a_i$ as a linear combination of its predecessors.
CHAPTER 6

Rank

6.1. Definition
Given a matrix $A$, the rank of $A$, $\operatorname{rank}(A)$, is defined as the maximal number of linearly independent columns of $A$. $A_{n\times k}$ is said to be of full column rank (f.c.r.) if and only if $\operatorname{rank}(A_{n\times k}) = k$.
It is obvious that if $A$ has f.c.r. and $c$ is a conformable non-null column vector, then $Ac = b \neq 0$ (otherwise the columns of $A$ would be linearly dependent).
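Both the definition and the observation above are easy to probe numerically. A minimal sketch (my addition, assuming numpy):

```python
import numpy as np

A = np.array([[1., 0.], [0., 1.], [1., 1.]])   # full column rank: rank = k = 2
print(np.linalg.matrix_rank(A))                 # 2

c = np.array([3., -1.])                         # any non-null conformable vector
assert np.any(A @ c != 0)                       # Ac = b != 0 when A has f.c.r.
```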
$\{a_1, e_1, \dots, e_n\}$ is a set of linearly dependent non-null vectors (see Exercise 19), and so there will be an $e_i$ that can be written as a linear combination of its predecessors, as shown in Exercise 18. Then, $e_i$ can be replaced out of equation (6.2.1) to get
$$(6.2.2)\qquad a_0 = b_{1,0}\,a_1 + c_{1,0}\,e_1 + \dots + c_{i-1,0}\,e_{i-1} + c_{i+1,0}\,e_{i+1} + \dots + c_{n,0}\,e_n.$$
Consider the vectors
$$a_2,\; a_1,\; e_1,\; \dots,\; e_{i-1},\; e_{i+1},\; \dots,\; e_n.$$
Since $a_2$ is a linear combination of $e_1, \dots, e_i, \dots, e_n$, and $e_i$ in turn is a linear combination of $a_1, e_1, \dots, e_{i-1}$, then $a_2$ can be expressed as a linear combination of $a_1, e_1, \dots, e_{i-1}, e_{i+1}, \dots, e_n$, which makes $\{a_2, a_1, e_1, \dots, e_{i-1}, e_{i+1}, \dots, e_n\}$ a set of linearly dependent vectors. So, one more vector $e_h$ can be replaced out of equation (6.2.2). The process of adding $a$'s and excluding $e$'s (once included, an $a$ will never be excluded, since its predecessors are only $a$'s and all $a$'s are linearly independent) continues until $a_0$ is written as a linear combination of $a_1, \dots, a_n$, which shows the result that the maximal number of linearly independent vectors of order $n$ is $n$.
As an immediate implication, if $A_{n\times k}$ is of f.c.r., then $k \le n$.
and
$$c' = (1 \;\; 0 \;\; \dots \;\; 0)$$
(that both $\bar{x}$ and $c$ are non-null follows from the fact that $b$ is non-null and $A$ has f.r.). Hence, taking $x = (1/c)\,\bar{x}$ establishes existence.
Uniqueness is easily proved as follows. Let
$$b = Ax_1 = Ax_2;$$
then $A(x_1 - x_2) = 0$, and since $A$ has f.r. it follows that $x_1 = x_2$.
To prove necessity, maintain that there exists a unique $(n \times 1)$ vector $x_1 \neq 0$, solution to the system in (6.5.1). Suppose, by contradiction, that $A$ is not of f.r. Then, there exists a vector $x_2 \neq 0$ such that $Ax_2 = 0$. Hence, all $x_1 + ax_2$, for any real scalar $a$, are solutions to (6.5.1), contradicting the uniqueness of $x_1$.
From Section 6.3 it follows that an equivalent statement can be made in terms of row vectors. Given any non-null $(1 \times n)$ vector $b'$, there exists a unique $(1 \times n)$ vector $x' \neq 0$, solution to the system $x'A = b'$, if and only if $\operatorname{rank}(A) = n$.
6.5.2. Homogeneous systems. A homogeneous system of $n$ linear equations in $n$ unknowns can always be represented as
$$Ax = 0,$$
where $A$ is any given non-null square matrix of order $n$. It is immediate that there exists a non-zero solution to the system if and only if $A$ is not of f.r. Otherwise, the only solution would be the trivial one, $x = 0$.
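The dichotomy between singular and full-rank coefficient matrices can be illustrated with a small numerical sketch (my addition, with illustrative matrices, assuming numpy):

```python
import numpy as np

A = np.array([[1., 2.], [2., 4.]])    # singular: rank 1 < n = 2
x = np.array([2., -1.])               # a non-trivial solution
assert np.allclose(A @ x, 0)          # Ax = 0 with x != 0

B = np.array([[1., 2.], [0., 3.]])    # full rank (det = 3)
# For a full-rank B, Bx = 0 forces the trivial solution x = 0:
assert np.allclose(np.linalg.solve(B, np.zeros(2)), 0)
```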
CHAPTER 7
Inverse matrices
CHAPTER 8

Vector spaces, spanning sets and projection matrices

Hence, given inequality (6.4.1), $\operatorname{rank}(AB_{k\times m}) \le \operatorname{rank}(A)$ for any $B_{k\times m}$ and any $m$, with equality holding if $m = k$ and $B_{k\times m}$ is non-singular, which proves that $\dim[R(A)] = \operatorname{rank}(A)$. Obviously, if $A$ is of f.c.r., then $\dim[R(A)] = k$.
Exercise 23. Verify that $P_{[A]}' = P_{[A]}$ and $P_{[A]}P_{[A]} = P_{[A]}$, where $P_{[A]} = A(A'A)^{-1}A'$.
Solution:
$$\left[A(A'A)^{-1}A'\right]' = A(A'A)^{-1}A'$$
and
$$A(A'A)^{-1}A'\,A(A'A)^{-1}A' = A(A'A)^{-1}A'.$$
Any matrix with the two foregoing properties is said to be an orthogonal projector, and so is $P_{[A]}$. In geometrical terms, $P_{[A]}$ projects vectors onto $R(A)$ along a direction that is parallel to the space orthogonal to $R(A)$, denoted $A^{\perp}$. Symmetrically,
$$M_{[A]} = I - P_{[A]}$$
is the orthogonal projector that projects vectors onto $A^{\perp}$ along a direction that is parallel to the space orthogonal to $A^{\perp}$, $R(A)$.
Exercise 24. Prove that $M_{[A]}$ is an orthogonal projector (hint: just verify that $M_{[A]}$ is symmetric and idempotent).
Solution:
$$\left(I - P_{[A]}\right)' = I' - P_{[A]}' = I - P_{[A]}$$
and
$$\left(I - P_{[A]}\right)\left(I - P_{[A]}\right) = I - P_{[A]} - P_{[A]} + P_{[A]}P_{[A]} = I - P_{[A]} = M_{[A]}.$$
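The properties verified in Exercises 23 and 24 (and the orthogonality of Exercise 26 below) can be confirmed on a random full-column-rank matrix. A minimal sketch, my addition, assuming numpy:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(6, 2))            # f.c.r. with probability one

P = A @ np.linalg.inv(A.T @ A) @ A.T   # P_[A] = A(A'A)^{-1}A'
M = np.eye(6) - P                      # M_[A] = I - P_[A]

assert np.allclose(P, P.T) and np.allclose(P @ P, P)   # symmetric and idempotent
assert np.allclose(M, M.T) and np.allclose(M @ M, M)
assert np.allclose(P @ M, 0)                            # P_[A] and M_[A] are orthogonal
```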
The properties of orthogonal projectors are readily understood once their geometrical meaning is grasped. Of course, they can also be demonstrated algebraically, which is what the following exercises are concerned with.
Exercise 25. (This is a little bit hard, but instructive.) Given two $(n \times k)$ real matrices $A$ and $B$, both of f.c.r., prove that if $A$ and $B$ span the same space then $P_{[A]} = P_{[B]}$ (hint: prove that $A$ can always be expressed as $A = BK$, where $K$ is a non-singular $(k \times k)$ matrix).
Solution: If $R(A)$ coincides with $R(B)$, then every column of $A$ belongs to $R(B)$, and as such every column of $A$ can be expressed as a linear combination of the columns of $B$: $A = BK$, where $K$ is $(k \times k)$. Therefore, $P_{[A]} = BK(K'B'BK)^{-1}K'B'$. Since both $A$ and $B$ have rank equal to $k$, in the light of inequality (6.4.1), $k \le \min[k, \operatorname{rank}(K)]$, which implies that $\operatorname{rank}(K) \ge k$, and since $\operatorname{rank}(K) > k$ is not possible (see Section 6.2), then $\operatorname{rank}(K) = k$ and $K$ is non-singular. Finally, by the property of the inverse of the product of square matrices,
$$P_{[A]} = BK(K'B'BK)^{-1}K'B' = BKK^{-1}(B'B)^{-1}(K')^{-1}K'B' = B(B'B)^{-1}B' = P_{[B]}.$$
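A quick numerical illustration of Exercise 25 (my addition, with an arbitrary non-singular $K$, assuming numpy):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.normal(size=(6, 2))
K = np.array([[1., 2.], [0., 3.]])   # any non-singular (k x k) matrix
A = B @ K                            # A and B then span the same space

P = lambda X: X @ np.linalg.inv(X.T @ X) @ X.T
assert np.allclose(P(A), P(B))       # identical orthogonal projectors
```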
Exercise 26. Prove that $P_{[A]}$ and $M_{[A]}$ are orthogonal, that is, $P_{[A]}M_{[A]} = 0$.
Solution: $P_{[A]}M_{[A]} = P_{[A]}\left(I - P_{[A]}\right) = P_{[A]} - P_{[A]}P_{[A]} = P_{[A]} - P_{[A]} = 0$.