Introduction To Econometrics: Brandon Lee

Introduction to Econometrics
Brandon Lee
15.450 Recitation 9

Law of Large Numbers
Suppose $x_t$ are IID and $E[x_t] = \mu$. Then the Law of Large
Numbers states that
$$\operatorname{plim} \frac{1}{T} \sum_{t=1}^{T} x_t = \mu$$
Intuitively, the LLN says that as the sample gets larger, the
sample average approaches the true mean.
The LLN is often the basis for establishing consistency of
statistical estimators.
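A minimal simulation sketch of the LLN, using only the Python standard library. The normal distribution, the values `mu=0.05` and `sigma=0.2`, and the helper name `sample_mean` are illustrative assumptions, not part of the recitation.

```python
import random

def sample_mean(T, mu=0.05, sigma=0.2, seed=0):
    """Average of T IID N(mu, sigma^2) draws (hypothetical helper)."""
    rng = random.Random(seed)
    return sum(rng.gauss(mu, sigma) for _ in range(T)) / T

# The absolute error of the sample average tends to shrink as T grows.
for T in (10, 1_000, 100_000):
    print(T, abs(sample_mean(T) - 0.05))
```

The error need not fall monotonically for every seed; the LLN is a statement about the limit as T grows.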

Consistency of OLS Estimator
Suppose
$$y_i = x_i \beta + \varepsilon_i$$
where $E[\varepsilon_i | x_i] = 0$ (which implies that the error term is
uncorrelated with the regressors: $E[x_i \varepsilon_i] = 0$).
The OLS estimator is given by
$$\hat{\beta} = (X'X)^{-1} X'y$$
Let's verify that $\hat{\beta}$ is consistent: that is, $\operatorname{plim} \hat{\beta} = \beta$.

Continued
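Note
$$\begin{aligned}
\hat{\beta} &= (X'X)^{-1} X'y \\
&= (X'X)^{-1} X'(X\beta + \varepsilon) \\
&= \beta + (X'X)^{-1} X'\varepsilon
\end{aligned}$$
Therefore,
$$\begin{aligned}
\operatorname{plim} \hat{\beta} &= \beta + \operatorname{plim} (X'X)^{-1} X'\varepsilon \\
&= \beta + \operatorname{plim} \left( \frac{X'X}{N} \right)^{-1} \left( \frac{X'\varepsilon}{N} \right)
\end{aligned}$$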

Continued
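The key here is that $\left( \frac{X'X}{N} \right)^{-1}$ will converge to some limit and
so will $\frac{X'\varepsilon}{N}$. But by the Law of Large Numbers, we know
that
$$\operatorname{plim} \frac{X'\varepsilon}{N} = E[x_i \varepsilon_i] = 0$$
Therefore,
$$\operatorname{plim} \hat{\beta} = \beta$$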
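The consistency argument can be checked numerically. The sketch below assumes a single regressor with no intercept, standard normal `x` and `eps` drawn independently (so that $E[x_i \varepsilon_i] = 0$), and a true `beta` of 2.0; the function name `ols_decomposition` is hypothetical.

```python
import random

def ols_decomposition(N, beta=2.0, seed=0):
    """Return (beta_hat, X'X/N, X'eps/N) for a scalar-regressor model."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(N)]
    eps = [rng.gauss(0.0, 1.0) for _ in range(N)]  # independent of x
    y = [beta * xi + ei for xi, ei in zip(x, eps)]
    sxx = sum(xi * xi for xi in x) / N                 # X'X / N
    sxe = sum(xi * ei for xi, ei in zip(x, eps)) / N   # X'eps / N
    sxy = sum(xi * yi for xi, yi in zip(x, y)) / N     # X'y / N
    beta_hat = sxy / sxx                               # (X'X)^{-1} X'y
    return beta_hat, sxx, sxe

for N in (100, 10_000, 100_000):
    beta_hat, sxx, sxe = ols_decomposition(N)
    # beta_hat equals beta + sxe / sxx, and sxe tends toward 0 as N grows.
    print(N, beta_hat, sxe)
```

In this setup `sxx` settles near 1 (the variance of `x`) while `sxe` settles near 0, which is the mechanism the derivation relies on.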

Central Limit Theorem
Suppose that $x_t$ are IID random vectors such that $E[x_t] = \mu$ and
$\operatorname{Var}(x_t) = \Omega$. The Central Limit Theorem states that
$$\frac{1}{\sqrt{T}} \sum_{t=1}^{T} (x_t - \mu) \Rightarrow N(0, \Omega)$$
Here, $\Rightarrow$ denotes convergence in distribution.
The CLT is often used to derive the asymptotic distribution of
statistical estimators.
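A small simulation sketch of the CLT in the scalar case. It assumes Exponential(1) draws, which are clearly non-normal and have mean 1 and variance 1; the helper name `scaled_sums` and the values of `T` and `reps` are illustrative choices.

```python
import math
import random
import statistics

def scaled_sums(T, reps, seed=0):
    """Simulate (1/sqrt(T)) * sum_t (x_t - mu) for Exponential(1) draws."""
    rng = random.Random(seed)
    mu = 1.0  # mean of Exponential(1); its variance is also 1
    out = []
    for _ in range(reps):
        s = sum(rng.expovariate(1.0) - mu for _ in range(T))
        out.append(s / math.sqrt(T))
    return out

z = scaled_sums(T=200, reps=2000)
print("mean:", statistics.mean(z))          # should be close to 0
print("variance:", statistics.variance(z))  # should be close to Omega = 1
# Share of draws inside +/-1.96, which is about 0.95 under N(0, 1).
print("coverage:", sum(abs(v) < 1.96 for v in z) / len(z))
```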

Maximum Likelihood Estimator
Having observed the sample $x_1, \ldots, x_T$, we want to estimate
the unknown true parameter $\theta_0$ of the data generating process
$f(x; \theta)$.
Maximum likelihood estimation is an intuitive procedure: the maximum
likelihood estimate $\hat{\theta}_{MLE}$ is the parameter value at which the
probability of observing our sample is maximized.
Likelihood function (a function of the parameter $\theta$,
taking the sample as given): $L(\theta | x_1, \ldots, x_T)$
Log-likelihood function (this is typically what we work with):
$\mathcal{L}(\theta | x_1, \ldots, x_T) = \log L(\theta | x_1, \ldots, x_T)$
The goal is to find the $\hat{\theta}$ that maximizes our (log-)likelihood
function. Sometimes we can do this by solving the first-order
condition, but in other situations we may have
to resort to numerical optimization routines.
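To illustrate both routes, the sketch below assumes an Exponential model with rate $\theta$, $f(x; \theta) = \theta e^{-\theta x}$ (an example chosen here, not one from the recitation). Its first-order condition gives $\hat{\theta} = T / \sum_t x_t$, and a golden-section search on the log-likelihood should land on the same value. The helper names are hypothetical.

```python
import math
import random

def log_likelihood(theta, xs):
    """Log-likelihood of IID Exponential(rate=theta) data."""
    return len(xs) * math.log(theta) - theta * sum(xs)

def maximize_golden(f, lo, hi, tol=1e-6):
    """Golden-section search for the maximizer of a unimodal f on [lo, hi]."""
    invphi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - invphi * (b - a)
        d = a + invphi * (b - a)
        if f(c) > f(d):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
    return (a + b) / 2.0

rng = random.Random(0)
xs = [rng.expovariate(2.0) for _ in range(5_000)]  # true rate is 2.0

theta_foc = len(xs) / sum(xs)  # closed form from the first-order condition
theta_num = maximize_golden(lambda th: log_likelihood(th, xs), 1e-3, 50.0)
print(theta_foc, theta_num)
```

A one-dimensional search is enough here because the log-likelihood is concave in $\theta$; models with several parameters generally call for a multivariate optimizer.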

Example: Mixture of Normals
Assume that asset returns are IID and normally distributed,
$N(\mu, \sigma^2)$. We've seen in the lectures that the MLEs of $\mu$ and
$\sigma^2$ are simply given by the sample mean and sample variance,
respectively.
Let's assume instead that returns are IID over time, but now
drawn from a mixture of normal distributions: that is, with
probability $\lambda$, a return is drawn from $N(\mu_1, \sigma_1^2)$ and with probability
$1 - \lambda$, it is drawn from $N(\mu_2, \sigma_2^2)$. This is one of the popular
approaches to modelling fat-tailed distributions.
Now the parameters of the model are $(\lambda, \mu_1, \sigma_1^2, \mu_2, \sigma_2^2)$.


Continued
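Note that
$$f(R_t | \lambda, \mu_1, \sigma_1^2, \mu_2, \sigma_2^2) = \lambda \cdot \frac{1}{\sqrt{2\pi\sigma_1^2}} e^{-\frac{(R_t - \mu_1)^2}{2\sigma_1^2}} + (1 - \lambda) \cdot \frac{1}{\sqrt{2\pi\sigma_2^2}} e^{-\frac{(R_t - \mu_2)^2}{2\sigma_2^2}}$$
and since we have an IID sample,
$$L(\lambda, \mu_1, \sigma_1^2, \mu_2, \sigma_2^2 | R_1, \ldots, R_T) = \prod_{t=1}^{T} f(R_t | \lambda, \mu_1, \sigma_1^2, \mu_2, \sigma_2^2)$$
The log-likelihood function is given by
$$\mathcal{L}(\lambda, \mu_1, \sigma_1^2, \mu_2, \sigma_2^2 | R_1, \ldots, R_T) = \sum_{t=1}^{T} \log \left( \lambda \cdot \frac{1}{\sqrt{2\pi\sigma_1^2}} e^{-\frac{(R_t - \mu_1)^2}{2\sigma_1^2}} + (1 - \lambda) \cdot \frac{1}{\sqrt{2\pi\sigma_2^2}} e^{-\frac{(R_t - \mu_2)^2}{2\sigma_2^2}} \right)$$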
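The mixture log-likelihood translates directly into code. The following is a minimal sketch, assuming the parameters are passed as a tuple in the order $(\lambda, \mu_1, \sigma_1^2, \mu_2, \sigma_2^2)$; the function names and the sample returns are illustrative, not taken from the recitation.

```python
import math

def normal_pdf(r, mu, var):
    """Density of N(mu, var) evaluated at r."""
    return math.exp(-(r - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def mixture_loglik(params, returns):
    """Log-likelihood of IID returns under a two-component normal mixture."""
    lam, mu1, var1, mu2, var2 = params
    return sum(
        math.log(lam * normal_pdf(r, mu1, var1)
                 + (1.0 - lam) * normal_pdf(r, mu2, var2))
        for r in returns
    )

# Made-up returns, purely for illustration.
returns = [0.01, -0.02, 0.015, -0.08, 0.005, 0.03]
print(mixture_loglik((0.9, 0.01, 0.02 ** 2, -0.05, 0.08 ** 2), returns))
```

Because the sum of two densities sits inside the logarithm, the first-order conditions have no closed-form solution, so in practice one would hand the negative of `mixture_loglik` to a numerical optimizer while keeping $0 \le \lambda \le 1$ and both variances positive.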

Example: GARCH
Suppose $R_t \sim N(\mu, \sigma_t^2)$. The interesting aspect of this
specification is time-varying volatility. In particular, we assume a
GARCH(1,1) structure:
$$\sigma_t^2 = \alpha + \beta (R_{t-1} - \mu)^2 + \gamma \sigma_{t-1}^2$$
We have in mind $\beta > 0$ and $\gamma > 0$ so that past realized and
latent volatility carry over to the current period. These kinds
of specifications can capture the volatility clustering we see in
the data.
The parameters of our model are $(\mu, \alpha, \beta, \gamma, \sigma_0^2)$.


Continued
The likelihood function is given by
$$\begin{aligned}
L(\mu, \alpha, \beta, \gamma, \sigma_0^2 | R_1, \ldots, R_T) &= \prod_{t=1}^{T} f(R_t | \mu, \alpha, \beta, \gamma, \sigma_0^2; R_1, \ldots, R_{t-1}) \\
&= \prod_{t=1}^{T} \frac{1}{\sqrt{2\pi\sigma_t^2}} e^{-\frac{(R_t - \mu)^2}{2\sigma_t^2}}
\end{aligned}$$
Note that $\sigma_t^2$ is known given the parameters and the information set $(R_1, \ldots, R_{t-1})$.
Optimizing this objective function cannot be done analytically
because the evolution of $\sigma_t^2$ depends on all the parameters in a
non-trivial manner. We have to resort to numerical methods
to find the optimum.
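A minimal sketch of how the GARCH(1,1) log-likelihood can be evaluated by running the variance recursion forward. One detail is an assumption made here rather than something stated in the recitation: $\sigma_1^2$ depends on $(R_0 - \mu)^2$, which is not observed, so this sketch substitutes $\sigma_0^2$ for it. The function name and the parameter ordering are also illustrative.

```python
import math

def garch_loglik(params, returns):
    """Log-likelihood of returns under R_t ~ N(mu, sigma_t^2) with
    sigma_t^2 = alpha + beta * (R_{t-1} - mu)^2 + gamma * sigma_{t-1}^2."""
    mu, alpha, beta, gamma, var0 = params
    loglik = 0.0
    var_prev = var0  # sigma_0^2
    sq_prev = var0   # assumption: stand-in for the unobserved (R_0 - mu)^2
    for r in returns:
        var_t = alpha + beta * sq_prev + gamma * var_prev
        loglik += -0.5 * (math.log(2.0 * math.pi * var_t) + (r - mu) ** 2 / var_t)
        sq_prev = (r - mu) ** 2  # realized squared deviation feeds next period
        var_prev = var_t         # latent variance feeds next period
    return loglik

# Made-up returns and parameter values, purely for illustration.
returns = [0.01, -0.03, 0.02, 0.05, -0.04]
print(garch_loglik((0.0, 1e-5, 0.1, 0.85, 4e-4), returns))
```

Each $\sigma_t^2$ is built from earlier returns and the parameters, which is why the function can be evaluated in a single pass but not maximized in closed form. A numerical optimizer would minimize the negative of this function over the five parameters.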

MIT OpenCourseWare
http://ocw.mit.edu
15.450 Analytics of Finance

Fall 2010
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms .
