
Introduction to Econometrics

Brandon Lee

15.450 Recitation 9

Brandon Lee Introduction to Econometrics


Law of Large Numbers

Suppose $x_t$ are IID and $E[x_t] = \mu$. Then the Law of Large
Numbers states that
$$\operatorname{plim}_{T \to \infty} \frac{1}{T} \sum_{t=1}^{T} x_t = \mu$$

Intuitively, the LLN says that as the sample gets larger, the
sample average approaches the true mean.
The LLN is often the basis for establishing consistency of
statistical estimators.
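
The LLN statement can be checked with a small simulation. This is a minimal sketch, not part of the original recitation: the exponential distribution, the true mean of 0.5, and the helper name `sample_mean` are all arbitrary choices made for illustration.

```python
import random


def sample_mean(T, mu=0.5, seed=0):
    """Average of T IID draws whose true mean is mu.

    ASSUMPTION: draws are exponential; any IID distribution with a
    finite mean would illustrate the same point.
    """
    rng = random.Random(seed)
    # expovariate(rate) has mean 1 / rate, so rate = 1 / mu gives mean mu
    return sum(rng.expovariate(1.0 / mu) for _ in range(T)) / T


# The sample average should settle near mu = 0.5 as T grows.
for T in (10, 1_000, 100_000):
    print(T, sample_mean(T))
```

Each run prints the sample average for an increasing sample size; the gap to 0.5 should shrink roughly at the rate 1/sqrt(T).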



Consistency of OLS Estimator

Suppose
$$y_i = x_i \beta + \varepsilon_i$$
where $E[\varepsilon_i \mid x_i] = 0$ (which then implies that the error term is
uncorrelated with the regressors: $E[x_i \varepsilon_i] = 0$).
The OLS estimator is given by
$$\hat{\beta} = (X'X)^{-1} X'y$$
Let’s verify that $\hat{\beta}$ is consistent: that is, $\operatorname{plim} \hat{\beta} = \beta$.



Continued

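Note
$$\begin{aligned}
\hat{\beta} &= (X'X)^{-1} X'y \\
&= (X'X)^{-1} X'(X\beta + \varepsilon) \\
&= \beta + (X'X)^{-1} X'\varepsilon
\end{aligned}$$
Therefore,
$$\begin{aligned}
\operatorname{plim} \hat{\beta} &= \beta + \operatorname{plim} (X'X)^{-1} X'\varepsilon \\
&= \beta + \operatorname{plim} \left( \frac{X'X}{N} \right)^{-1} \left( \frac{X'\varepsilon}{N} \right)
\end{aligned}$$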



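Continued

The key here is that $\left( \frac{X'X}{N} \right)^{-1}$ will converge to some finite limit and
so will $\frac{X'\varepsilon}{N}$. By the Law of Large Numbers, we know
that
$$\operatorname{plim} \frac{X'\varepsilon}{N} = E[x_i \varepsilon_i] = 0$$
Therefore, the second term vanishes in the limit and
$$\operatorname{plim} \hat{\beta} = \beta$$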
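
The consistency argument can be illustrated numerically. The sketch below is an illustration under assumptions that are not in the recitation: a single scalar regressor with no intercept, standard normal regressor and error, a true coefficient of 2.0, and a hypothetical helper name `ols_slope`.

```python
import random


def ols_slope(N, beta=2.0, seed=0):
    """OLS estimate of beta in y_i = x_i * beta + eps_i from N simulated draws.

    ASSUMPTION: scalar x_i and eps_i, both standard normal and independent,
    so that E[eps_i | x_i] = 0 holds by construction.
    """
    rng = random.Random(seed)
    sxx = 0.0  # running sum of x_i^2, the scalar version of X'X
    sxy = 0.0  # running sum of x_i * y_i, the scalar version of X'y
    for _ in range(N):
        x = rng.gauss(0.0, 1.0)
        eps = rng.gauss(0.0, 1.0)
        y = x * beta + eps
        sxx += x * x
        sxy += x * y
    return sxy / sxx  # (X'X)^{-1} X'y in the scalar case


# The estimate should move toward the true beta = 2.0 as N grows.
for N in (20, 2_000, 200_000):
    print(N, ols_slope(N))
```

In the scalar case the estimation error equals (sum of x_i eps_i / N) divided by (sum of x_i^2 / N), which is exactly the ratio whose numerator the LLN sends to zero.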



Central Limit Theorem

Suppose that $x_t$ are IID random vectors such that $E[x_t] = \mu$ and
$\operatorname{Var}(x_t) = \Omega$. The Central Limit Theorem states that
$$\frac{1}{\sqrt{T}} \sum_{t=1}^{T} (x_t - \mu) \Rightarrow N(0, \Omega)$$
Here, $\Rightarrow$ denotes convergence in distribution.
The CLT is often used to derive the asymptotic distribution of
statistical estimators.
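
A quick simulation gives a feel for the CLT in the scalar case. This is a rough sketch under assumptions chosen only for illustration: the draws are Uniform(0, 1), so the mean is 1/2 and the variance is 1/12, and the name `scaled_sums` is hypothetical.

```python
import math
import random
import statistics


def scaled_sums(T, reps, seed=0):
    """Return `reps` realizations of (1 / sqrt(T)) * sum_t (x_t - mu).

    ASSUMPTION: x_t ~ Uniform(0, 1), so mu = 1/2 and Var(x_t) = 1/12.
    """
    rng = random.Random(seed)
    mu = 0.5
    out = []
    for _ in range(reps):
        s = sum(rng.random() - mu for _ in range(T))
        out.append(s / math.sqrt(T))
    return out


draws = scaled_sums(T=200, reps=5000)
omega = 1.0 / 12.0
# Share of draws within 1.96 standard deviations, which is about 0.95 for a normal
inside = sum(abs(d) <= 1.96 * math.sqrt(omega) for d in draws) / len(draws)
print(statistics.mean(draws), statistics.variance(draws), inside)
```

The printed mean and variance can be compared with 0 and 1/12, and the coverage share with the normal benchmark of roughly 0.95.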



Maximum Likelihood Estimator

Having observed the sample $x_1, \ldots, x_T$, we want to estimate
the unknown true parameter $\theta_0$ of the data generating process
$f(x; \theta)$.
Maximum likelihood estimation is an intuitive procedure: the
maximum likelihood estimate $\hat{\theta}_{MLE}$ is the parameter value at which
the probability of observing our sample is maximized.
Likelihood function (it is a function of the parameter $\theta$,
taking the sample as given): $L(\theta \mid x_1, \ldots, x_T)$
Log-likelihood function (this is typically what we work with):
$\mathcal{L}(\theta \mid x_1, \ldots, x_T) = \log L(\theta \mid x_1, \ldots, x_T)$
The goal is to find the $\hat{\theta}$ that maximizes our (log-)likelihood
function. Sometimes we can do this by finding a solution to
the first order condition, but in other situations we may have
to resort to numerical optimization routines.
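
As an example of the first-order-condition route, consider an IID normal sample, where the solution is available in closed form. The sketch below is an illustration only: the function names `normal_loglik` and `normal_mle` and the simulated data are assumptions, not part of the recitation.

```python
import math
import random


def normal_loglik(mu, sigma2, xs):
    """Log-likelihood of an IID N(mu, sigma2) sample xs."""
    T = len(xs)
    sq = sum((x - mu) ** 2 for x in xs)
    return -0.5 * T * math.log(2 * math.pi * sigma2) - sq / (2 * sigma2)


def normal_mle(xs):
    """Closed-form solution of the first order conditions."""
    T = len(xs)
    mu_hat = sum(xs) / T
    # The MLE of the variance divides by T rather than T - 1
    sigma2_hat = sum((x - mu_hat) ** 2 for x in xs) / T
    return mu_hat, sigma2_hat


# Simulated data with hypothetical true values mu = 1 and sigma = 2
rng = random.Random(1)
xs = [rng.gauss(1.0, 2.0) for _ in range(500)]
mu_hat, sigma2_hat = normal_mle(xs)
print(mu_hat, sigma2_hat, normal_loglik(mu_hat, sigma2_hat, xs))
```

Evaluating `normal_loglik` at nearby parameter values should give a lower log-likelihood than at `(mu_hat, sigma2_hat)`, which is a simple sanity check on the closed form.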



Example: Mixture of Normals

Assume that asset returns are IID and normally distributed,
$N(\mu, \sigma^2)$. We’ve seen in the lectures that the MLEs of $\mu$ and
$\sigma^2$ are simply given by the sample mean and sample variance,
respectively.
Let’s assume instead that returns are IID over time, but now
drawn from a mixture of normal distributions: that is, with
probability $\lambda$, a return is drawn from $N(\mu_1, \sigma_1^2)$, and with probability
$1 - \lambda$, it is drawn from $N(\mu_2, \sigma_2^2)$. This is one of the popular
approaches to modelling fat-tailed distributions.
Now the parameters of the model are $(\lambda, \mu_1, \sigma_1^2, \mu_2, \sigma_2^2)$.




Continued
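Note that
$$f(R_t \mid \lambda, \mu_1, \sigma_1^2, \mu_2, \sigma_2^2) = \lambda \cdot \frac{1}{\sqrt{2\pi\sigma_1^2}} e^{-\frac{(R_t - \mu_1)^2}{2\sigma_1^2}} + (1 - \lambda) \cdot \frac{1}{\sqrt{2\pi\sigma_2^2}} e^{-\frac{(R_t - \mu_2)^2}{2\sigma_2^2}}$$
and since we have an IID sample,
$$L(\lambda, \mu_1, \sigma_1^2, \mu_2, \sigma_2^2 \mid R_1, \ldots, R_T) = \prod_{t=1}^{T} f(R_t \mid \lambda, \mu_1, \sigma_1^2, \mu_2, \sigma_2^2)$$
The log-likelihood function is given by
$$\mathcal{L}(\lambda, \mu_1, \sigma_1^2, \mu_2, \sigma_2^2 \mid R_1, \ldots, R_T) = \sum_{t=1}^{T} \log \left( \lambda \cdot \frac{1}{\sqrt{2\pi\sigma_1^2}} e^{-\frac{(R_t - \mu_1)^2}{2\sigma_1^2}} + (1 - \lambda) \cdot \frac{1}{\sqrt{2\pi\sigma_2^2}} e^{-\frac{(R_t - \mu_2)^2}{2\sigma_2^2}} \right)$$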
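
The mixture log-likelihood translates almost line by line into code. The following is a minimal sketch: the function names, the parameter ordering `(lam, mu1, sigma2_1, mu2, sigma2_2)`, and the sample returns are assumptions made for illustration.

```python
import math


def normal_pdf(r, mu, sigma2):
    """Density of N(mu, sigma2) evaluated at r."""
    return math.exp(-((r - mu) ** 2) / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)


def mixture_loglik(params, returns):
    """Log-likelihood of IID returns under a two-component normal mixture.

    ASSUMPTION: params is ordered (lam, mu1, sigma2_1, mu2, sigma2_2),
    with 0 <= lam <= 1 and both variances strictly positive.
    """
    lam, mu1, sigma2_1, mu2, sigma2_2 = params
    total = 0.0
    for r in returns:
        density = lam * normal_pdf(r, mu1, sigma2_1) + (1 - lam) * normal_pdf(r, mu2, sigma2_2)
        total += math.log(density)
    return total


# Hypothetical returns: a calm regime mixed with a rare high-variance regime
sample = [0.01, -0.02, 0.015, -0.08, 0.005]
print(mixture_loglik((0.9, 0.005, 0.0004, -0.02, 0.0036), sample))
```

Because the log sits outside a sum of two densities, the first order conditions have no closed-form solution, so in practice a function like this would be passed (negated) to a numerical optimizer with the constraints on `lam` and the variances enforced.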



Example: GARCH

Suppose $R_t \sim N(\mu, \sigma_t^2)$. The interesting aspect of this
specification is time-varying volatility. In particular, we assume
a GARCH(1,1) structure:
$$\sigma_t^2 = \alpha + \beta (R_{t-1} - \mu)^2 + \gamma \sigma_{t-1}^2$$
We have in mind $\beta > 0$ and $\gamma > 0$ so that past realized and
latent volatility carry over to the current period. These kinds
of specifications can capture the volatility clustering we see in
the data.
The parameters of our model are $(\mu, \alpha, \beta, \gamma, \sigma_0^2)$.




Continued

The likelihood function is given by
$$\begin{aligned}
L(\mu, \alpha, \beta, \gamma, \sigma_0^2 \mid R_1, \ldots, R_T) &= \prod_{t=1}^{T} f(R_t \mid \mu, \alpha, \beta, \gamma, \sigma_0^2; R_1, \ldots, R_{t-1}) \\
&= \prod_{t=1}^{T} \frac{1}{\sqrt{2\pi\sigma_t^2}} e^{-\frac{(R_t - \mu)^2}{2\sigma_t^2}}
\end{aligned}$$
Note that, given the parameters, $\sigma_t^2$ is determined by the information set $(R_1, \ldots, R_{t-1})$.
Optimizing this objective function cannot be done analytically
because the evolution of $\sigma_t^2$ depends on all the parameters in a
non-trivial manner. We have to resort to numerical methods
to find the optimum.
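
The recursive structure of the GARCH likelihood is easiest to see in code. The sketch below is one possible implementation under a loudly stated assumption: the recitation lists the initial variance as a parameter but does not spell out how the recursion starts, so here the value `sigma2_init` is used directly as the conditional variance of the first observation, and the recursion supplies the variance for every later period. The function name and parameter ordering are also hypothetical.

```python
import math


def garch_loglik(params, returns):
    """Log-likelihood of returns under R_t ~ N(mu, sigma_t^2) with GARCH(1,1) variance.

    ASSUMPTION: params is ordered (mu, alpha, beta, gamma, sigma2_init), and
    sigma2_init is taken as the conditional variance of the first observation.
    """
    mu, alpha, beta, gamma, sigma2_init = params
    sigma2 = sigma2_init
    loglik = 0.0
    for r in returns:
        # Conditional normal log-density of R_t given the past
        loglik += -0.5 * math.log(2 * math.pi * sigma2) - (r - mu) ** 2 / (2 * sigma2)
        # GARCH(1,1) update: variance for the next period
        sigma2 = alpha + beta * (r - mu) ** 2 + gamma * sigma2
    return loglik


# Hypothetical daily returns and parameter values, for illustration only
sample = [0.004, -0.012, 0.031, -0.027, 0.002]
print(garch_loglik((0.0, 0.00002, 0.1, 0.85, 0.0004), sample))
```

Each evaluation makes one pass through the sample, so the function is cheap to hand (negated) to a numerical optimizer, with positivity of the variance path enforced through constraints on the parameters.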



MIT OpenCourseWare
http://ocw.mit.edu

15.450 Analytics of Finance


Fall 2010

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms .
