
Econometrics II Heij et al. Chapter 7.

Panel Data, SUR and GLS


Marius Ooms
Tinbergen Institute Amsterdam

TI Econometrics II 2006/2007,

7.7.1-7.7.3 p. 1/22

Heij et al. (2004) 7.7.1-7.7.3


Panel data
Seemingly unrelated regression model (SUR)
Generalized least squares (GLS)
Feasible GLS
Panel data with fixed effects
Panel data with random effects


Panel data
Panel data consist of cross-section observations for different time points.

We observe one dependent variable $y_{it}$ for individual $i$ at time $t$, where $i = 1, \ldots, m$ and $t = 1, \ldots, n$.

In most applications $m$ is (much) larger than $n$: $m \gg n$.
We have $k$ strongly exogenous explanatory variables in a vector $x_{it}$ and $n \gg k$.


In the literature one often uses $N$ (big N) for $m$ and $T$ (big T) for $n$. As Heij et al. is mostly time series oriented, they use $n$ as the time series dimension and $m$ as the cross-section dimension (or number of equations in multivariate time series).
For $n = 1$ and large $m$, we have simple cross-section data.
When $m = 1$ and $n$ is large, we have a univariate time series.

General Panel data Model


The considered models are of the following regression form:
$$y_{it} = \alpha_{it} + x_{it}'\beta_{it} + \varepsilon_{it}, \qquad \mathrm{Var}(\varepsilon) = \Omega,$$
where $\Omega$ is the $mn \times mn$ covariance matrix of the $mn \times 1$ disturbance vector $\varepsilon$ and $x_{it}$ is a $(k-1) \times 1$ vector of explanatory variables. Parameters depend on time $t$ and on individual $i$.
This general model is not empirically identified since it contains more parameters than observations!
Therefore, we have to impose restrictions on the regression parameters $(\alpha_{it}, \beta_{it})$ and on the covariance matrix $\Omega$ before we can estimate parameters.


Seemingly unrelated regression model (SUR)


The SUR model is given by
$$y_{it} = \alpha_i + x_{it}'\beta_i + \varepsilon_{it},$$
where $E[\varepsilon_{it}\varepsilon_{jt}] = \sigma_{ij}$ and $E[\varepsilon_{it}\varepsilon_{js}] = 0$ for all $i, j$ and $t \neq s$.

All individuals have their own regression parameters, but these are restricted to be constant over time.
The regression relations for the different individuals are only related via the correlation of the error terms, but the error covariance across individuals is unrestricted.
No error covariance across time: no serial correlation and no serial cross correlation.


SUR model specification and estimation per individual


Denote the observations for unit $i$ by the $n \times 1$ vector $y_i$ and by the $n \times k$ matrix $X_i$, with corresponding $k \times 1$ parameter vector $\beta_i$ (here including the intercept $\alpha_i$) and $n \times 1$ vector of disturbances $\varepsilon_i$. The model for SUR unit $i$ can be written as
$$y_i = X_i\beta_i + \varepsilon_i.$$

Estimating the parameters $\beta_i$ by OLS per equation is consistent, but it is inefficient if the disturbances for the different individuals display contemporaneous correlation and the regressor sets differ from individual (equation) to individual (equation).
This is easily shown if we combine the data for all units in one big transformed regression model, to which we can apply the OLS theory of Section 3.1.4. See the next slides.


Complete SUR model in matrix notation


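Combining the models for the $m$ units gives
$$\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix} = \begin{pmatrix} X_1 & 0 & \cdots & 0 \\ 0 & X_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & X_m \end{pmatrix} \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_m \end{pmatrix} + \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_m \end{pmatrix},$$
$$E[\varepsilon] = 0, \qquad \mathrm{var}(\varepsilon) = \Omega = \begin{pmatrix} \sigma_{11} I & \sigma_{12} I & \cdots & \sigma_{1m} I \\ \sigma_{12} I & \sigma_{22} I & \cdots & \sigma_{2m} I \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{1m} I & \sigma_{2m} I & \cdots & \sigma_{mm} I \end{pmatrix},$$
where $I$ is the $n \times n$ identity matrix.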


Inefficiency of OLS for SUR model


Simultaneous OLS estimation of the $mk \times 1$ stacked parameter vector $(\beta_1', \ldots, \beta_m')'$ in the above model is equivalent to applying OLS per unit.
Exercise (1): Prove this proposition. Hint: consider the method-of-moments equations for OLS.
This simultaneous OLS estimator is not BLUE since the covariance matrix $\Omega$ is not of the form $\sigma^2 I$. Next we will discuss a general method to deal with this problem.
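
The equivalence of stacked OLS and OLS per unit can be checked numerically (this is an illustration, not the proof asked for in Exercise (1)). The sketch below uses simulated data; the sizes `m`, `n`, `k`, the seed and all variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 3, 50, 2  # hypothetical: units, periods, regressors incl. constant

# Unit-specific regressor matrices (constant plus one variable) and responses.
X_list = [np.column_stack([np.ones(n), rng.normal(size=n)]) for _ in range(m)]
y_list = [X @ rng.normal(size=k) + rng.normal(size=n) for X in X_list]

# OLS per unit, coefficients concatenated into one mk-vector.
b_per_unit = np.concatenate(
    [np.linalg.lstsq(X, y, rcond=None)[0] for X, y in zip(X_list, y_list)]
)

# Stacked system with a block-diagonal regressor matrix.
X_big = np.zeros((m * n, m * k))
for i, X in enumerate(X_list):
    X_big[i * n:(i + 1) * n, i * k:(i + 1) * k] = X
y_big = np.concatenate(y_list)
b_stacked = np.linalg.lstsq(X_big, y_big, rcond=None)[0]

print(np.allclose(b_per_unit, b_stacked))
```

Because the stacked regressor matrix is block diagonal, its normal equations split into one block per unit, which is what the comparison reflects.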


Generalized least squares (GLS) idea


First suppose we know $\Omega$ up to one constant term.
The idea is to transform the model so that the error covariance matrix becomes scalar, $\sigma^2 I$. This idea is similar to WLS.
Transform the joint model by a (big) square, invertible, but nondiagonal weighting matrix $A$: transform
$$y = X\beta + \varepsilon$$
into
$$Ay = AX\beta + A\varepsilon.$$
As the variance matrix of $A\varepsilon$ is $A\Omega A'$, we choose $A$ such that
$$A\Omega A' = I, \quad \text{or} \quad A^{-1}(A')^{-1} = (A'A)^{-1} = \Omega.$$
$A$ is a square root of $\Omega^{-1}$. The decomposition of $\Omega$ is standard in matrix algebra, see Section A.6; think of $A$ as $A = \Omega^{-1/2}$. Note that $A$ is not uniquely defined.

GLS, theoretical infeasible version


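Assume the following notation for the transformed variables:
$$y^* = Ay, \quad X^* = AX, \quad \varepsilon^* = A\varepsilon,$$
so that
$$y^* = X^*\beta + \varepsilon^*$$
with $E[\varepsilon^*] = 0$ and $\mathrm{Var}(\varepsilon^*) = I_{nm}$. Now the BLUE estimator of $\beta$ is given by
$$b_{GLS} = (X^{*\prime}X^*)^{-1}X^{*\prime}y^* = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y.$$
This is called the Generalized Least Squares estimator. In practice this estimator is infeasible as we do not know $\Omega$.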
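
As a minimal sketch of the GLS idea with a known covariance matrix, the code below builds one possible transformation matrix from a Cholesky factor (one of many valid square roots) and compares OLS on the transformed data with the direct GLS formula. The data, the dimension `N`, the seed and the particular covariance matrix `Omega` are assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
N, k = 40, 3  # hypothetical number of observations and regressors
X = np.column_stack([np.ones(N), rng.normal(size=(N, k - 1))])
beta = np.array([1.0, 2.0, -0.5])

# A known, non-scalar, positive definite covariance matrix Omega.
B = rng.normal(size=(N, N))
Omega = B @ B.T + N * np.eye(N)
L = np.linalg.cholesky(Omega)           # Omega = L L'
y = X @ beta + L @ rng.normal(size=N)   # disturbances with covariance Omega

# One valid choice of A: A = L^{-1}, so that A Omega A' = I.
A = np.linalg.inv(L)
y_star, X_star = A @ y, A @ X
b_transformed = np.linalg.lstsq(X_star, y_star, rcond=None)[0]

# Direct formula (X' Omega^{-1} X)^{-1} X' Omega^{-1} y.
Oinv = np.linalg.inv(Omega)
b_direct = np.linalg.solve(X.T @ Oinv @ X, X.T @ Oinv @ y)

print(np.allclose(b_transformed, b_direct))
```

Any other matrix `A` with `A.T @ A` equal to the inverse of `Omega` would give the same estimate, which is why the non-uniqueness of the square root does not matter.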


Two step Feasible GLS in SUR model


When $\Omega$ is unknown, we estimate it from OLS residuals and perform the so-called Feasible GLS in two steps:
1. Estimate $\Omega$.
Do $m$ regressions, one per unit, to estimate $\beta_j$ by OLS, $j = 1, \ldots, m$. Let $e_i$ be the $n \times 1$ vector of OLS residuals for unit $i$. The unknown (co)variances $\sigma_{ij}$ are then estimated by
$$s_{ij} = \frac{1}{n}\sum_{t=1}^{n} e_{it}e_{jt}.$$
Then obtain $\hat\Omega$ by replacing $\sigma_{ij}$ by $s_{ij}$.
2. Estimate the parameters $\beta_j$ jointly by GLS.
That is,
$$b_{FGLS} = (X'\hat\Omega^{-1}X)^{-1}X'\hat\Omega^{-1}y.$$
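
The two FGLS steps for SUR can be sketched as follows, under assumed simulated data. The cross-unit error covariance `Sigma`, the true coefficients `betas`, the sizes and the seed are hypothetical; the observations are stacked unit by unit, so the estimated covariance matrix is the Kronecker product of the residual covariance matrix `S` and an identity matrix.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, k = 3, 200, 2  # hypothetical: units, periods, regressors incl. constant

# Contemporaneously correlated errors across units (assumed values).
Sigma = np.array([[1.0, 0.8, 0.5],
                  [0.8, 1.0, 0.6],
                  [0.5, 0.6, 1.0]])
eps = rng.multivariate_normal(np.zeros(m), Sigma, size=n)  # n x m
X_list = [np.column_stack([np.ones(n), rng.normal(size=n)]) for _ in range(m)]
betas = [np.array([1.0, 2.0]), np.array([0.0, -1.0]), np.array([0.5, 0.5])]
y_list = [X_list[i] @ betas[i] + eps[:, i] for i in range(m)]

# Step 1: OLS per unit, residual matrix E (n x m), S = E'E / n.
E = np.column_stack([
    y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    for X, y in zip(X_list, y_list)
])
S = E.T @ E / n

# Step 2: GLS on the stacked system with Omega_hat = S kron I_n.
X_big = np.zeros((m * n, m * k))
for i, X in enumerate(X_list):
    X_big[i * n:(i + 1) * n, i * k:(i + 1) * k] = X
y_big = np.concatenate(y_list)
Oinv = np.kron(np.linalg.inv(S), np.eye(n))   # inverse of Omega_hat
XtOi = X_big.T @ Oinv
b_fgls = np.linalg.solve(XtOi @ X_big, XtOi @ y_big)

print(b_fgls.round(2))
```

Forming the full inverse is only sensible for small problems like this one; it is done here to stay close to the formula.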


Asymptotic distribution of FGLS in SUR application


Under the assumption of correct specification and normally distributed error terms, it can be shown that the FGLS estimator in the SUR model has the same asymptotic properties as ML, i.e. for $n \to \infty$ and $m$ fixed one can derive
$$b_{FGLS} \approx N\big(\beta, (X^{*\prime}X^*)^{-1}\big) \approx N\big(\beta, (X'\hat\Omega^{-1}X)^{-1}\big).$$
We can use this result to perform asymptotic $t$- and $F$-tests. In practice $n$ is finite and one has to take extra precautions to make sure that $X'\hat\Omega^{-1}X$ is a full rank positive definite matrix, see also the next slide.
NB: if the number of unknown parameters in $\Omega$ increases linearly with $n$, FGLS does not work. Compare the GMM standard error derivation for general $\Omega$ in Section 5.5.2.


Finite Sample Rank condition SUR, efficiency SUR


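The estimated SUR covariance matrix $\hat\Omega$ is an $mn \times mn$ matrix which, with the observations ordered by time period, is block diagonal with the $m \times m$ matrix $S$ on the diagonal blocks (in the unit-by-unit stacking used before, $\hat\Omega = S \otimes I_n$). $S$ has elements $s_{ij} = n^{-1}e_i'e_j$, with $e_i$ the $n \times 1$ vector of OLS residuals of unit $i$. This means that $\hat\Omega$ is invertible, and 2-step FGLS for SUR is possible, if and only if the $m \times m$ matrix $S$ is invertible, i.e. if and only if $\mathrm{rank}(S) = m$.

Necessary condition for Feasible GLS in SUR: define the $n \times m$ matrix $E = (e_1, \ldots, e_m)$, then $S = n^{-1}E'E$, and invertibility requires $m = \mathrm{rank}(S) = \mathrm{rank}(E) \leq n$. Therefore $S$ and $\hat\Omega$ can be invertible, and we can estimate with FGLS, only if $m \leq n$.
There are special cases in which OLS is efficient for SUR models. Exercise (2): 7.10, page 715.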
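
The rank condition can be illustrated with a small numerical sketch. Random numbers stand in for the OLS residuals here (an assumption: real residuals satisfy additional orthogonality restrictions and can have even lower rank); the helper name `residual_cov_rank` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

def residual_cov_rank(m, n):
    """Rank of S = E'E/n for an n x m matrix E of stand-in residuals."""
    E = rng.normal(size=(n, m))   # placeholder for the OLS residual matrix
    S = E.T @ E / n               # m x m
    return np.linalg.matrix_rank(S)

rank_ok = residual_cov_rank(m=3, n=10)    # m <= n: S can have full rank m
rank_bad = residual_cov_rank(m=10, n=3)   # m > n: rank(S) <= n < m, S singular

print(rank_ok, rank_bad)
```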


Panel data with fixed effects


When $m \gg n$ the data are typical panel data or longitudinal data. We cannot apply the SUR model, as this requires $m \leq n$.
The model has to be simplified by parameter restrictions.
E.g., the coefficients of the explanatory variables are assumed to be the same for all units: we impose (pooling) restrictions on the slope parameters $\beta_i$: $\beta_i = \beta$, $i = 1, \ldots, m$.
We then obtain the panel data model with fixed effects:
$$y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}, \qquad \varepsilon_{it} \sim IID(0, \sigma^2).$$
The constant terms $\alpha_i$ are fixed unknown parameters, but they differ from unit to unit (not pooled). The errors are independent and homoskedastic in time and across units.
NB: the number of parameters increases linearly in $m$, so standard asymptotic theory still requires $n \to \infty$, although nearly all parameters are pooled in the cross section.

Fixed effects model in matrix notation I


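We can rewrite the model in standard regression form using unit dummy variables
$$D_{it}(j) = \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases}, \qquad i = 1, \ldots, m, \quad j = 1, \ldots, m,$$
to get
$$y_{it} = \sum_{j=1}^{m} \alpha_j D_{it}(j) + x_{it}'\beta + \varepsilon_{it}, \qquad \varepsilon_{it} \sim IID(0, \sigma^2).$$
Next, define the $n \times 1$ vector $y_i$ with elements $y_{it}$, define $\varepsilon_i$ accordingly, define the $n \times (k-1)$ matrix $X_i$ with $t$th row $x_{it}'$, $t = 1, \ldots, n$, and let $\iota$ be an $n \times 1$ vector of ones.
For the $i$th unit we obtain the matrix notation
$$y_i = \alpha_i\iota + X_i\beta + \varepsilon_i.$$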


Fixed effects model in matrix notation II


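Now stack the equations $y_i = \alpha_i\iota + X_i\beta + \varepsilon_i$ for the $m$ time series.
Next, define the $mn \times 1$ vector $y$ consisting of the stacked $y_i$'s, define $\varepsilon$ accordingly, define the $mn \times (k-1)$ matrix $X$ as the matrix of stacked $X_i$ and define the stacked $mn \times m$ matrix $D$ as
$$D = \begin{pmatrix} \iota & 0 & \cdots & 0 \\ 0 & \iota & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \iota \end{pmatrix}.$$
If $\alpha = (\alpha_1, \ldots, \alpha_m)'$, then the following single regression model arises:
$$y = X\beta + D\alpha + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I).$$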

Fast Fixed Effects estimation in regression form I


Efficient estimators of $\alpha$ and $\beta$ can be obtained by OLS. When $m$ is large, direct OLS is computationally unattractive as it requires the inverse of the $(m+k-1) \times (m+k-1)$ matrix $(X \; D)'(X \; D)$. An intuitive and easier method applies partial regression, following the Frisch-Waugh theorem (Section 3.2.5), in matrix notation:
1. Regress $y$ and (all columns of) $X$ on $D$ and save the residuals, $M_D y$ and $M_D X$, with $M_D = I - D(D'D)^{-1}D'$. Since $(D'D)^{-1} = n^{-1}I$, $M_D y$ and $M_D X$ have elements $y_{it} - \bar y_i$ and $x_{it} - \bar x_i$: just removing individual sample means!
2. Regress $M_D y$ on $M_D X$ to obtain
$$\hat\beta_{OLS} = (X'M_D X)^{-1}X'M_D y.$$
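
That the projection on the unit dummies amounts to subtracting unit means can be verified on a tiny example. The sketch below assumes data stacked unit by unit; the sizes, seed and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 4, 3                                   # hypothetical: units, periods
y = rng.normal(size=m * n)                    # stacked by unit
D = np.kron(np.eye(m), np.ones((n, 1)))       # mn x m matrix of unit dummies
M_D = np.eye(m * n) - D @ np.linalg.inv(D.T @ D) @ D.T

# Same result without forming the mn x mn matrix: subtract unit means.
y_units = y.reshape(m, n)
y_demeaned = (y_units - y_units.mean(axis=1, keepdims=True)).ravel()

print(np.allclose(M_D @ y, y_demeaned))
```

For large $m$ the demeaning route avoids building and multiplying the $mn \times mn$ projection matrix.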


Interpretation Fixed effects estimation in regression II


The Fixed Effect estimator or Least Squares Dummy Variable Estimator (LSDVE) of $\beta$ is therefore obtained by regressing unit-mean adjusted $y$ on unit-mean adjusted $X$. The fixed effect OLS estimates $\hat\alpha$ follow from the last $m$ OLS normal equations (3.41). In matrix regression form:
$$D'X\hat\beta + D'D\hat\alpha = D'y,$$
so that
$$\hat\alpha = (D'D)^{-1}(D'y - D'X\hat\beta),$$
which has the familiar interpretation of the estimates of constant terms in regressions per individual (but here with a given common $\hat\beta$):
$$\hat\alpha_i = \bar y_i - \bar x_i'\hat\beta.$$
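
A minimal numerical sketch, under assumed simulated data, comparing direct least squares with dummies to the two-stage route (demeaning, then recovering the intercepts from unit means). Sizes, seed, true parameter values and the helper `demean` are hypothetical choices for the illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n = 30, 5                        # hypothetical: many units, few periods
alpha = rng.normal(size=m)          # unit-specific intercepts
beta = np.array([1.5, -0.7])
X = rng.normal(size=(m * n, 2))     # stacked by unit
unit = np.repeat(np.arange(m), n)   # unit index of each stacked row
y = alpha[unit] + X @ beta + rng.normal(size=m * n)

# Direct LSDV: regress y on [X, D] with D the mn x m dummy matrix.
D = np.kron(np.eye(m), np.ones((n, 1)))
coef = np.linalg.lstsq(np.hstack([X, D]), y, rcond=None)[0]
beta_lsdv, alpha_lsdv = coef[:2], coef[2:]

def demean(Z):
    """Subtract unit means from each column of a stacked array."""
    Z2 = Z.reshape(m * n, -1)
    means = Z2.reshape(m, n, -1).mean(axis=1)   # m x columns
    return Z2 - means[unit]

# Within estimator: OLS of demeaned y on demeaned X.
beta_within = np.linalg.lstsq(demean(X), demean(y).ravel(), rcond=None)[0]

# Intercepts recovered as alpha_i = ybar_i - xbar_i' beta_hat.
ybar = y.reshape(m, n).mean(axis=1)
xbar = X.reshape(m, n, 2).mean(axis=1)
alpha_within = ybar - xbar @ beta_within

print(np.allclose(beta_lsdv, beta_within), np.allclose(alpha_lsdv, alpha_within))
```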


Panel data model with random effects


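The model with fixed effects cannot be consistently estimated in full if $n$ is fixed and $m \to \infty$ (the number of intercepts $\alpha_i$ grows with $m$), and it cannot be used to forecast a new unit $y_{m+1}$ given $x_{m+1}$: $\alpha_i$ is not modelled.
The simplest model for this purpose is the random effects model, which has a random intercept with a common mean $\alpha$ for all units. In the social sciences (SPSS) this specification is called a mixed model (a mix of random and fixed coefficients).
$$\alpha_i = \alpha + \eta_i, \qquad \eta_i \sim IID(0, \sigma_\eta^2),$$
$$y_{it} = \alpha + x_{it}'\beta + \omega_{it}, \qquad \omega_{it} = \varepsilon_{it} + \eta_i, \qquad \varepsilon_{it} \sim IID(0, \sigma_\varepsilon^2),$$
with $\eta_i$ and $\varepsilon_{it}$ independent. The disturbances $\omega_{it}$ are correlated with their own past because of the $\eta_i$. The properties of $\omega_{it}$ are:
$$E[\omega_{it}] = 0, \qquad E[\omega_{it}^2] = \sigma_\eta^2 + \sigma_\varepsilon^2, \qquad E[\omega_{it}\omega_{is}] = \sigma_\eta^2 \text{ for } t \neq s,$$
$$E[\omega_{it}\omega_{js}] = 0 \text{ for all } t, s \text{ and } i \neq j.$$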
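
The moments above imply a block structure for the covariance matrix of the stacked composite disturbances: each unit has an $n \times n$ block with the sum of both variances on the diagonal and the random-effect variance off the diagonal, and blocks of different units are uncorrelated. A short sketch with hypothetical variance values (`sigma_eps2`, `sigma_eta2`) and sizes:

```python
import numpy as np

sigma_eps2, sigma_eta2 = 1.0, 0.5   # hypothetical variance components
n, m = 4, 3                         # hypothetical: periods, units
iota = np.ones((n, 1))

# Covariance of the n composite disturbances of one unit.
Omega_i = sigma_eps2 * np.eye(n) + sigma_eta2 * (iota @ iota.T)

# Stacked by unit: block diagonal, no correlation across units.
Omega = np.kron(np.eye(m), Omega_i)

print(Omega_i)
```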



Random effects models FGLS, I


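We can estimate the parameters $\alpha$ and $\beta$ by OLS, but this estimator is not BLUE since the disturbances $\omega_{it}$ are correlated over time within each unit. An efficient estimator can be obtained by feasible GLS.
In the first step of FGLS we need to estimate $\sigma_\varepsilon^2$ and $\sigma_\eta^2$.
Since $\eta_i$ is fixed in the $i$th unit, it can be removed from the model by taking the unit de-meaned variables. Consider
$$y_{it} - \bar y_i = (x_{it} - \bar x_i)'\beta + (\varepsilon_{it} - \bar\varepsilon_i), \qquad i = 1, \ldots, m, \quad t = 1, \ldots, n.$$
Let $\hat\beta$ be the OLS estimate of $\beta$ for the above model. Then the within variance, $\sigma_\varepsilon^2 = E(\varepsilon_{it}^2)$, is estimated by
$$\hat\sigma_\varepsilon^2 = \frac{1}{m(n-1)}\sum_{i=1}^{m}\sum_{t=1}^{n}\big(y_{it} - \bar y_i - (x_{it} - \bar x_i)'\hat\beta\big)^2.$$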


Random effects models FGLS, II


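To estimate $\sigma_\eta^2$ we combine the within variance estimator $\hat\sigma_\varepsilon^2$ and the between variance estimator, which estimates the unexplained variance between unit means in:
$$\bar y_i = \alpha + \bar x_i'\beta + (\eta_i + \bar\varepsilon_i), \qquad i = 1, \ldots, m.$$
The variance estimate of this regression, denoted by $\hat\sigma_B^2$, estimates $\mathrm{var}(\eta_i + \bar\varepsilon_i) = n^{-1}\sigma_\varepsilon^2 + \sigma_\eta^2$. Combining $\hat\sigma_\varepsilon^2$ and $\hat\sigma_B^2$ one derives the estimator
$$\hat\sigma_\eta^2 = \hat\sigma_B^2 - n^{-1}\hat\sigma_\varepsilon^2.$$
Given $\hat\sigma_\varepsilon^2$ and $\hat\sigma_\eta^2$ one can do the second step of FGLS to reestimate $\alpha$ and $\beta$. The resulting estimator is also known as the EGLS (Estimated GLS) estimator of $\beta$.
Exercise (3): Check the derivation on pages 695-696 for $m = 3$, $n = 2$.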
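
The two FGLS steps for the random effects model can be sketched as follows on simulated data. All sizes, the seed and the true parameter values are assumptions; the divisor `m - k` used for the between variance is also an assumption (the text does not specify it), and the estimated random-effect variance is not guaranteed to be positive in small samples.

```python
import numpy as np

rng = np.random.default_rng(6)
m, n = 200, 5                               # hypothetical: units, periods
alpha, beta = 2.0, np.array([1.0])          # assumed true parameters
sig_eps2, sig_eta2 = 1.0, 1.0               # assumed variance components
unit = np.repeat(np.arange(m), n)           # unit index of each stacked row
x = rng.normal(size=(m * n, 1))
eta = rng.normal(scale=np.sqrt(sig_eta2), size=m)
y = alpha + x @ beta + eta[unit] + rng.normal(scale=np.sqrt(sig_eps2), size=m * n)

# Step 1a: within regression gives the estimate of the within variance.
xbar = x.reshape(m, n, -1).mean(axis=1)
ybar = y.reshape(m, n).mean(axis=1)
xw, yw = x - xbar[unit], y - ybar[unit]
b_within = np.linalg.lstsq(xw, yw, rcond=None)[0]
s_eps2 = np.sum((yw - xw @ b_within) ** 2) / (m * (n - 1))

# Step 1b: between regression on unit means gives the between variance.
Zb = np.column_stack([np.ones(m), xbar])
res_b = ybar - Zb @ np.linalg.lstsq(Zb, ybar, rcond=None)[0]
s_B2 = res_b @ res_b / (m - Zb.shape[1])    # divisor m - k: an assumption
s_eta2 = s_B2 - s_eps2 / n                  # can be negative in small samples

# Step 2: GLS with block-diagonal covariance I_m kron (s_eps2 I + s_eta2 iota iota').
Omega_i = s_eps2 * np.eye(n) + s_eta2 * np.ones((n, n))
Oinv = np.kron(np.eye(m), np.linalg.inv(Omega_i))
Z = np.column_stack([np.ones(m * n), x])
b_re = np.linalg.solve(Z.T @ Oinv @ Z, Z.T @ Oinv @ y)   # (alpha, beta)

print(s_eps2, s_eta2, b_re)
```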

Conclusion
The courses Econometrics I and II have introduced you to
the main ideas of parametric econometric modelling (data analysis, parsimonious specification, consequences of modelling errors, diagnostic checking, testing) and
the basics of econometric (asymptotic) inference (exact statistical inference, likelihood based inference, moment based inference, stationarity, rate of convergence)
in
Static linear and nonlinear single equation models
Binary discrete choice models
Dynamic linear single- and multiple equation models
Panel data models

Many different parametric models and methods exist, but these are (all) based on (combinations of) the ideas mentioned above.
Not discussed: Bayesian and nonparametric estimation and inference.