
TIME SERIES ECONOMETRICS:

SOME BASIC CONCEPTS


Reference : Gujarati, Chapters 21, 22
1. Classical regression analysis assumes that the underlying time series data are stationary.
2. Sometimes autocorrelation arises because the underlying time series data are non-stationary.
3. Sometimes one obtains a very high R² and significant regression coefficients even though there is no meaningful relationship between the two variables: the problem of spurious, or nonsense, regression.

7-1

Stochastic Processes
Let Z_t be the observation made at time t. The units of time vary with application; they could be years, quarters, months, days, ... We assume that the observations are equally spaced in time. The sequence of random variables {Z_1, Z_2, ..., Z_T} is called a stochastic process. Its mean function is:

    μ_t = E(Z_t),    t = 0, ±1, ±2, ...

μ_t is the expected value of the process at time t. The autocovariance function is:

    γ_{t,s} = Cov(Z_t, Z_s) = E[(Z_t - μ_t)(Z_s - μ_s)],    t, s = 0, ±1, ±2, ...

The variance function is:

    Var(Z_t) = γ_{t,t} = Cov(Z_t, Z_t) = E[(Z_t - μ_t)²]

The autocorrelation function is:

    ρ_{t,s} = Corr(Z_t, Z_s) = Cov(Z_t, Z_s) / [Var(Z_t) Var(Z_s)]^{1/2},    t, s = 0, ±1, ±2, ...

7-2

STATIONARITY
The time series Z_t is weakly stationary if

    μ_t = E(Z_t) = μ

and

    γ_{t,s} = Cov(Z_t, Z_s) = Cov(Z_{t-l}, Z_{s-l})        (1)

for any integer l. Equation (1) implies:

    γ_{t,s} = γ_{0,k}

where k = |t - s|. Thus, for a stationary process we can simply write

    γ_k = Cov(Z_t, Z_{t-k})    and    ρ_k = Corr(Z_t, Z_{t-k})

Note that ρ_k = γ_k / γ_0.

7-3

WHITE NOISE
Let {ε_t} be a sequence of independent random variables with mean 0 and variance σ², and let

    Y_t = μ + ε_t

Then

    E(Y_t) = μ

    γ_k = Cov(Y_t, Y_{t-k}) = Cov(ε_t, ε_{t-k}) = { σ²  if k = 0
                                                    0   if k ≠ 0 }

and

    ρ_k = { 1  if k = 0
            0  if k ≠ 0 }

Such a sequence is called a purely random sequence or white noise sequence.

7-4

Example of Stationary Series

Let {ε_t} be a white noise sequence which is distributed as N(0, σ²). Define a new process {Y_t} by

    Y_t = μ + ε_t + ε_{t-1}

Then

    E(Y_t) = μ

    γ_0 = Var(Y_t) = Var(ε_t + ε_{t-1}) = Var(ε_t) + Var(ε_{t-1}) = 2σ²

    γ_1 = Cov(Y_t, Y_{t-1}) = Cov(ε_t + ε_{t-1}, ε_{t-1} + ε_{t-2}) = σ²

    γ_k = Cov(Y_t, Y_{t-k}) = Cov(ε_t + ε_{t-1}, ε_{t-k} + ε_{t-k-1}) = 0    for k > 1

Hence

    γ_k = { 2σ²  if k = 0              ρ_k = { 1    if k = 0
            σ²   if |k| = 1                    1/2  if |k| = 1
            0    if |k| > 1 }                  0    if |k| > 1 }

7-5

Example of Nonstationary Series


In practice, we often find series which are not stationary. For example, economic or business series may show a trend or a change in mean level over time, reflecting growth, or may be nonstationary because of seasonal features in the series.
An important practical matter in time series analysis is how to transform a nonstationary series into a stationary one, or how to model the nonstationarity. Two fundamental approaches for dealing with nonstationarity are:
1. Work with the changes or differences of the series, since these may be stationary.
2. Remove nonstationary components, e.g. a nonconstant mean, by linear regression techniques.

7-6

RANDOM WALK
Let a_t be iid N(0, σ²) and let

    Z_t = Z_{t-1} + a_t,    t = 1, 2, ...

and Z_0 = 0. Then

    Z_t = a_1 + a_2 + ... + a_t

Z_t is called a random walk, with mean μ_t = 0, variance Var(Z_t) = t σ², and autocovariance γ_{t,s} = t σ² for 1 ≤ t ≤ s. Since Var(Z_t) and γ_{t,s} depend on t, Z_t
is not stationary.

7-7
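The growth of Var(Z_t) with t can be seen by simulation. Below is a minimal sketch in Python (the notes' own examples use SAS); the seed, number of paths and σ are arbitrary illustrative choices.

    # Illustrative sketch: simulate many random walk paths and check that the
    # sample variance at time t is close to t * sigma^2.
    import numpy as np

    rng = np.random.default_rng(0)
    T, n_paths, sigma = 200, 2000, 1.0
    a = rng.normal(0.0, sigma, size=(n_paths, T))   # iid N(0, sigma^2) shocks
    Z = a.cumsum(axis=1)                            # Z_t = a_1 + ... + a_t, with Z_0 = 0
    for t in (50, 100, 200):
        print(t, round(Z[:, t - 1].var(), 1))       # roughly 50, 100, 200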

Example: Random Walk with Drift

Let ε_t be iid N(0, σ²) and let

    Y_t = Y_{t-1} + δ + ε_t,    t = 1, 2, ...

and Y_0 = 0, where δ is a constant. Such a series is called a random walk with drift. We have

    Y_t = Y_0 + δ t + Σ_{j=1}^{t} ε_j

Its mean is μ_t = E(Y_t) = δ t and its variance is Var(Y_t) = t σ². Thus {Y_t} is not stationary, with both mean and variance depending on t.
Note that the series of changes, or first differences, of {Y_t}, defined by

    Z_t = Y_t - Y_{t-1} = δ + ε_t

is a white noise series.

7-8

ESTIMATION OF MEAN, AUTOCOVARIANCES, AND AUTOCORRELATIONS FOR STATIONARY SERIES
Suppose Y_1, Y_2, ..., Y_T is a sample realization of a stationary time series {Y_t} with mean

    μ = E(Y_t)

autocovariance function

    γ_k = Cov(Y_t, Y_{t+k}) = Cov(Y_t, Y_{t-k})

and autocorrelation function

    ρ_k = Corr(Y_t, Y_{t+k}) = γ_k / γ_0

7-9

The estimator for μ is the sample mean

    Ȳ = (1/T) Σ_{t=1}^{T} Y_t

The estimator for γ_k is

    c_k = (1/T) Σ_{t=1}^{T-k} (Y_t - Ȳ)(Y_{t+k} - Ȳ),    k = 0, 1, 2, ...

where k is small relative to T. Note that

    c_0 = (1/T) Σ_{t=1}^{T} (Y_t - Ȳ)²

is the sample variance. The estimator for ρ_k is the sample ACF

    r_k = c_k / c_0 = Σ_{t=1}^{T-k} (Y_t - Ȳ)(Y_{t+k} - Ȳ) / Σ_{t=1}^{T} (Y_t - Ȳ)²,    k = 0, 1, 2, ...

A plot of r_k versus k is called a correlogram.

7-10
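As a quick illustration of these estimators, the following Python sketch computes c_k and r_k exactly as defined above; the function name and the divisor-T convention follow the slide, not any particular package (the notes themselves compute the ACF with SAS).

    # Illustrative sketch: sample autocovariances c_k and sample ACF r_k.
    import numpy as np

    def sample_acf(y, max_lag):
        y = np.asarray(y, dtype=float)
        T = len(y)
        d = y - y.mean()                 # deviations from the sample mean
        c = np.array([np.sum(d[: T - k] * d[k:]) / T for k in range(max_lag + 1)])
        return c / c[0]                  # r_0 = 1, r_1, ..., r_max_lag

    # plotting sample_acf(data, 20) against k = 0, 1, ..., 20 gives the correlogram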

Sampling Properties of Estimators

1. Ȳ is an unbiased estimator of μ. That is:

    E(Ȳ) = μ

2.

    Var(Ȳ) = Var( (1/T) Σ_{t=1}^{T} Y_t )
           = (γ_0 / T) [ 1 + 2 Σ_{k=1}^{T-1} ((T - k)/T) ρ_k ]

If the Y_t are independent, then ρ_k = 0 for all k ≠ 0 and so Var(Ȳ) = γ_0 / T.
When T is large, then

3. r_k is approximately normally distributed.

4.

    E(r_k) ≈ ρ_k

and

5.

    Var(r_k) ≈ (1/T) Σ_{s=-∞}^{∞} ( ρ_s² + ρ_{s+k} ρ_{s-k} - 4 ρ_k ρ_s ρ_{s-k} + 2 ρ_s² ρ_k² )

7-11

Special case
When the series is white noise, so ρ_s = 0 for s ≠ 0, then

    Var(r_k) ≈ 1/T    for k ≠ 0

In fact, r_k is approximately distributed as N(0, 1/T) for k = 1, 2, ...
This property will be applied to check whether the model is appropriate or not. If the model fits the data, the residuals will follow a white noise series and hence 95% of its ACF will lie between -2/√T and 2/√T.

7-12

General Characteristics of Sample ACF


1. Stationary
(a) The sample ACF tends to damp out to 0 fairly rapidly as the lag k increases:
(b) it cuts off, or
(c) damps out exponentially or sinusoidally.
2. Non-stationary
(a) The sample ACF tends to damp out very slowly, linearly, or
(b) sinusoidally but damping out very slowly, indicating a strong seasonal component.

7-13

PARTIAL AUTOCORRELATION FUNCTION
For a stationary and normally distributed time series {Z_t}, the partial autocorrelation function (PACF) at lag k is defined as:

    φ_{kk} = Corr(Z_t, Z_{t-k} | Z_{t-1}, Z_{t-2}, ..., Z_{t-k+1})

which is the correlation between Z_t and Z_{t-k} after removing the effect of the intervening variables Z_{t-1}, Z_{t-2}, ..., Z_{t-k+1}. Its estimator is the sample partial autocorrelation, r_{kk}.
Property: If {Z_t}, t = 1, 2, ..., T, is white noise, then its sample partial autocorrelation function r_{kk} is approximately distributed as N(0, 1/T) for k = 1, 2, ...
This property will be applied to check whether the model is appropriate or not. If the model fits the data, the residuals will follow a white noise series and hence 95% of its PACF will lie between -2/√T and 2/√T.

7-14

Tests of Stationarity
1. Sample ACF tends to damp out to 0 as lag k
increases fairly rapidly
(a) cut off
(b) damp out exponentially or sinusoidally
2. Sample PACF tends to damp out to 0 as lag k
increases fairly rapidly
(a) cut off
(b) damp out exponentially or sinusoidally

7-15

Tests of White Noise

If the time series is white noise, we have
1. its ACF r_k is approximately distributed as N(0, 1/T) for k = 1, 2, ..., and
2. its PACF r_{kk} is approximately distributed as N(0, 1/T) for k = 1, 2, ....
Hence, if the time series is white noise, we have
1. its ACF r_k lies between -2/√T and 2/√T;
2. its PACF r_{kk} lies between -2/√T and 2/√T.
3. In addition, we can apply the Box-Pierce Q statistic

    Q = n Σ_{k=1}^{m} r_k²

or the Ljung-Box Q statistic

    LB = n(n + 2) Σ_{k=1}^{m} r_k² / (n - k)

where n is the sample size and m is the lag length used to test for white noise.

If the time series is white noise, Q ~ χ²_m and LB ~ χ²_m.
7-16
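A hedged sketch of the Ljung-Box statistic, computed directly from the definition above (illustrative Python code, not the SAS routine used elsewhere in these notes):

    # Illustrative sketch: Ljung-Box statistic for lags 1..m with its chi-square(m) p-value.
    import numpy as np
    from scipy import stats

    def ljung_box(y, m):
        y = np.asarray(y, dtype=float)
        n = len(y)
        d = y - y.mean()
        c0 = np.sum(d * d) / n
        r = np.array([np.sum(d[: n - k] * d[k:]) / n / c0 for k in range(1, m + 1)])
        lb = n * (n + 2) * np.sum(r ** 2 / (n - np.arange(1, m + 1)))
        return lb, 1 - stats.chi2.cdf(lb, df=m)   # large p-value: consistent with white noise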

Notation:
The backward shift operator B is defined by

    B Y_t = Y_{t-1}

and hence

    B^i Y_t = Y_{t-i}

The forward shift operator F = B^{-1} is defined by

    F Y_t = Y_{t+1}

and hence

    F^i Y_t = Y_{t+i}

Example 1:

    Y_t = ε_t - θ ε_{t-1} = (1 - θB) ε_t

Example 2:

    Y_t = φ Y_{t-1} + ε_t

implies

    Y_t - φ Y_{t-1} = ε_t    or    (1 - φB) Y_t = ε_t

7-17

If |φ| < 1, then

    Y_t = (1 - φB)^{-1} ε_t

We have

    Y_t = (1 + φB + φ²B² + φ³B³ + ...) ε_t

and hence

    Y_t = ε_t + φ ε_{t-1} + φ² ε_{t-2} + φ³ ε_{t-3} + ...

Similarly, in Example 1,

    ε_t = (1 - θB)^{-1} Y_t

and hence

    ε_t = (1 + θB + θ²B² + θ³B³ + ...) Y_t
        = Y_t + θ Y_{t-1} + θ² Y_{t-2} + θ³ Y_{t-3} + ...

Remark: In Example 2, when φ = 1, we have

    Y_t = Y_{t-1} + ε_t    or    Y_t - Y_{t-1} = ε_t

which is a random walk series.
7-18

LINEAR MODELS FOR STATIONARY SERIES
The properties of a series are exhibited by its ACF. Hence, we build models which reflect the ACF structure.
Linear Filters
Often we deal with the formation of a new series {Y_t} by a linear operation applied to a given series {X_t}: X_t is the input and Y_t is the output which results from a linear operation on X_t.
A linear time-invariant filter applied to a series {X_t} produces a new series {Y_t} such that

    Y_t = Σ_{j=-∞}^{∞} ψ_j X_{t-j}

If ψ_j satisfies ψ_j = 0 for j < 0, then

    Y_t = Σ_{j=0}^{∞} ψ_j X_{t-j}

and the filter is one-sided. It is time-invariant because the coefficients ψ_j do not depend on t.

7-19

Note
1. X_t may be controllable, e.g. in a production process, {X_t} is the input of raw material and Y_t is the output of product or by-product.
2. Differencing operators are linear filters, e.g.

    Y_t = ΔX_t = X_t - X_{t-1}

and

    Y_t = Δ²X_t = Δ(X_t - X_{t-1}) = X_t - 2X_{t-1} + X_{t-2}

3. Moving averages are linear filters, e.g.

    Y_t = (1 / (2m + 1)) Σ_{j=-m}^{m} X_{t-j}

If {X_t} is stationary with mean μ_x and autocovariance γ_k, then

    Y_t = Σ_{j=-∞}^{∞} ψ_j X_{t-j}

has mean

    μ_Y = Σ_{j=-∞}^{∞} ψ_j E(X_{t-j}) = μ_x Σ_{j=-∞}^{∞} ψ_j

7-20

and autocovariance

    γ_Y(s) = Cov(Y_t, Y_{t+s})
           = Cov( Σ_{j=-∞}^{∞} ψ_j X_{t-j}, Σ_{k=-∞}^{∞} ψ_k X_{t+s-k} )
           = Σ_{j=-∞}^{∞} Σ_{k=-∞}^{∞} ψ_j ψ_k Cov(X_{t-j}, X_{t+s-k})
           = Σ_{j=-∞}^{∞} Σ_{k=-∞}^{∞} ψ_j ψ_k γ_{s+j-k}

7-21

Linear Process
{Y_t} is a linear process if it can be represented as the output of a one-sided linear filter applied to white noise {ε_t}. That is:

    Y_t = μ + Σ_{j=0}^{∞} ψ_j ε_{t-j}

where the ε_t are independent random variables with mean 0 and variance σ². In this situation,

    μ_Y = μ

and the autocovariance is

    γ_Y(s) = Cov(Y_t, Y_{t+s})
           = Cov( Σ_{j=0}^{∞} ψ_j ε_{t-j}, Σ_{k=0}^{∞} ψ_k ε_{t+s-k} )
           = Σ_{j=0}^{∞} Σ_{k=0}^{∞} ψ_j ψ_k Cov(ε_{t-j}, ε_{t+s-k})
           = σ² Σ_{j=0}^{∞} ψ_j ψ_{j+s}

because Cov(ε_{t-j}, ε_{t+s-k}) = σ² when k = j + s and equal to 0 when k ≠ j + s.

7-22

Wold's Representation Theorem: If {Y_t} is a weakly stationary nondeterministic series with mean μ, then Y_t can always be expressed as:

    Y_t = μ + Σ_{j=0}^{∞} ψ_j ε_{t-j}

with ψ_0 = 1 and Σ_{j=0}^{∞} ψ_j² < ∞, where the ε_t are uncorrelated random variables with mean 0 and variance σ².

This result supports the use of model representations of the form:

    Y_t = μ + Σ_{j=0}^{∞} ψ_j ε_{t-j},    ψ_0 = 1

as a class of models for stationary series.

7-23

FINITE MOVING AVERAGE MODEL

A simple class of models is obtained by setting ψ_j = 0 for j > q. {Y_t} is said to be a moving average process of order q (MA(q)) if it satisfies:

    Y_t = μ + ε_t - Σ_{j=1}^{q} θ_j ε_{t-j}

where the ε_t are independent white noise with mean 0 and variance σ². We write

    Y_t = μ + ε_t - Σ_{j=1}^{q} θ_j ε_{t-j} = μ + θ(B) ε_t

where θ(B) = 1 - Σ_{j=1}^{q} θ_j B^j is the MA operator.

7-24

MA(1)
When q = 1, Y_t = μ + ε_t - θ ε_{t-1}, and we have

    E(Y_t) = μ
    Var(Y_t) = γ_0 = σ²(1 + θ²)
    γ_1 = -θ σ²

and

    γ_k = 0    for |k| > 1

Hence

    ρ_1 = -θ / (1 + θ²)

7-25
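For instance (an illustrative numerical check, not taken from the notes): with θ = 0.5 and σ² = 1,

    γ_0 = σ²(1 + θ²) = 1.25,    γ_1 = -θσ² = -0.5,
    ρ_1 = -θ / (1 + θ²) = -0.5 / 1.25 = -0.4,    and ρ_k = 0 for |k| > 1.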

MA(2)
When q = 2,

    Y_t = μ + ε_t - θ_1 ε_{t-1} - θ_2 ε_{t-2}

    E(Y_t) = μ
    Var(Y_t) = γ_0 = σ²(1 + θ_1² + θ_2²)
    γ_1 = σ²(-θ_1 + θ_1 θ_2)
    γ_2 = σ²(-θ_2)

and

    γ_k = 0    for |k| > 2

Hence

    ρ_1 = (-θ_1 + θ_1 θ_2) / (1 + θ_1² + θ_2²)

and

    ρ_2 = -θ_2 / (1 + θ_1² + θ_2²)

7-26

MA(q)
The model is

    Y_t = μ + ε_t - Σ_{j=1}^{q} θ_j ε_{t-j} = μ + θ(B) ε_t

where θ(B) = 1 - Σ_{j=1}^{q} θ_j B^j.

    E(Y_t) = μ
    Var(Y_t) = γ_0 = σ²(1 + θ_1² + θ_2² + ... + θ_q²)
    γ_k = σ²(-θ_k + θ_1 θ_{k+1} + ... + θ_{q-k} θ_q)    for k = 1, 2, ..., q

and γ_k = 0 for |k| > q. Hence, the ACF is

    ρ_k = (-θ_k + θ_1 θ_{k+1} + ... + θ_{q-k} θ_q) / (1 + θ_1² + θ_2² + ... + θ_q²)

for k = 1, 2, ..., q and ρ_k = 0 for |k| > q.

7-27

AUTOREGRESSIVE MODELS
The autoregressive model of order p, AR(p), is

    Y_t = φ_1 Y_{t-1} + φ_2 Y_{t-2} + ... + φ_p Y_{t-p} + δ + ε_t

where the ε_t are independent white noise with mean 0 and variance σ². We can re-write it as

    Y_t - φ_1 Y_{t-1} - φ_2 Y_{t-2} - ... - φ_p Y_{t-p} = δ + ε_t

or

    φ(B) Y_t = δ + ε_t

where φ(B) = 1 - Σ_{j=1}^{p} φ_j B^j is the AR operator.
The AR(p) model resembles a multiple linear regression in which Y_{t-1}, ..., Y_{t-p} are the independent variables. It is called an autoregression because Y_t is regressed on its own past values.

7-28

AR(1)
When p = 1,

    Y_t = φ Y_{t-1} + δ + ε_t

Is it stationary? By successive substitution,

    Y_t = φ(φ Y_{t-2} + δ + ε_{t-1}) + δ + ε_t
        = ...
        = φ^n Y_{t-n} + δ Σ_{j=0}^{n-1} φ^j + Σ_{j=0}^{n-1} φ^j ε_{t-j}

Under the assumption that |φ| < 1, as n → ∞ we get

    Y_t = δ / (1 - φ) + Σ_{j=0}^{∞} φ^j ε_{t-j}

which is stationary. Note that

    Σ_{j=0}^{∞} |φ|^j < ∞

So, if |φ| < 1, {Y_t} represents a stationary series with the infinite MA representation above, with ψ_j = φ^j and

    E(Y_t) = μ = δ / (1 - φ)

7-29

    γ_k = Cov(Y_t, Y_{t+k}) = φ^k σ² Σ_{j=0}^{∞} φ^{2j} = φ^k σ² / (1 - φ²)

If k = 0, we have

    γ_0 = σ² / (1 - φ²)

and hence

    ρ_k = φ^k    for k ≥ 0

7-30
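Before looking at the SAS output on the following pages, here is a minimal Python sketch of the same idea: simulate an AR(1) and check that the sample ACF decays roughly like φ^k. The parameter values and seed are illustrative only and are not meant to reproduce the SAS run.

    # Illustrative sketch: simulate an AR(1) with phi = 0.5 and mean 10, then
    # compare the sample ACF with the theoretical values phi^k.
    import numpy as np

    rng = np.random.default_rng(42)
    phi, mu, sigma, T = 0.5, 10.0, 1.0, 500
    y = np.empty(T)
    y[0] = mu
    for t in range(1, T):
        y[t] = mu * (1 - phi) + phi * y[t - 1] + rng.normal(0.0, sigma)

    d = y - y.mean()
    r = [np.sum(d[: T - k] * d[k:]) / np.sum(d * d) for k in range(1, 6)]
    print([round(v, 3) for v in r])      # roughly 0.5, 0.25, 0.125, 0.06, 0.03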

Name of variable = Y.
Mean of working series  =   9.97686
Standard deviation      =   1.141318
Number of observations  =   500

Autocorrelations
Lag   Covariance   Correlation
 0     1.302606     1.00000
 1     0.617829     0.47430
 2     0.352030     0.27025
 3     0.156284     0.11998
 4     0.058037     0.04455
 5     0.057984     0.04451
 6     0.027178     0.02086
 7    -0.087968    -0.06753
 8    -0.141527    -0.10865
 9    -0.117101    -0.08990
10    -0.152208    -0.11685
(correlogram bars not reproduced)

Partial Autocorrelations
Lag   Correlation
 1     0.47430
 2     0.05843
 3    -0.03681
 4    -0.01564
 5     0.04006
 6    -0.01109
 7    -0.10630
 8    -0.05613
 9     0.00991
10    -0.06858

7-31

Autocorrelation Check for White Noise

To    Chi-                  Autocorrelations
Lag   Square   DF   Prob
  6   159.47    6   0.000   0.474  0.270  0.120  0.045  0.045  0.021
 12   187.08   12   0.000  -0.068 -0.109 -0.090 -0.117 -0.089 -0.089
 18   197.51   18   0.000  -0.077 -0.067  0.002  0.044  0.077  0.042
 24   223.09   24   0.000  -0.017 -0.057 -0.092 -0.093 -0.136 -0.098

Maximum Likelihood Estimation

                             Approx.
Parameter   Estimate   Std Error   T Ratio   Lag
MU           9.97880     0.08538    116.88     0
AR1,1        0.47378     0.03944     12.01     1

Constant Estimate   = 5.25101414
Variance Estimate   = 1.01335786
Std Error Estimate  = 1.00665677
AIC                 = 1427.82345
SBC                 = 1436.25266
Number of Residuals = 500

Autocorrelation Check of Residuals

To    Chi-                  Autocorrelations
Lag   Square   DF   Prob
  6     4.53    5   0.476  -0.027  0.064 -0.003 -0.031  0.030  0.048
 12    12.62   11   0.319  -0.054 -0.076 -0.004 -0.075 -0.013 -0.039
 18    18.16   17   0.379  -0.027 -0.061  0.017  0.022  0.068  0.030
 24    25.19   23   0.340  -0.018 -0.024 -0.054 -0.007 -0.097 -0.009

7-32

Autocorrelation Plot of Residuals

Lag   Covariance    Correlation
 0     1.013358      1.00000
 1    -0.027426     -0.02706
 2     0.064949      0.06409
 3    -0.0029458    -0.00291
 4    -0.031095     -0.03068
 5     0.030587      0.03018
 6     0.048235      0.04760
 7    -0.054253     -0.05354
 8    -0.076521     -0.07551
(correlogram bars not reproduced)

Partial Autocorrelations
Lag   Correlation
 1    -0.02706
 2     0.06341
 3     0.00044
 4    -0.03498
 5     0.02885
 6     0.05368
 7    -0.05554
 8    -0.08692

Model for variable Y

Estimated Mean = 9.97879948
Autoregressive Factors
Factor 1: 1 - 0.47378 B**(1)

7-33

OPTIONS NOCENTER PS = 35 LS = 72 ;

DATA A ;                            /* read the simulated AR(1) series */
  INFILE 'c:\AR1.DATA' ;
  INPUT Y ;
DATA A ; SET A ;
  T + 1 ;                           /* time index */
PROC PLOT ;
  PLOT Y*T ;                        /* time series plot */
PROC ARIMA DATA=A ;
  IDENTIFY VAR=Y ;                  /* sample ACF, PACF, white noise check */

PROC ARIMA DATA=A ;
  IDENTIFY VAR=Y NOPRINT ;
  ESTIMATE P = 1 METHOD=ML PLOT ;   /* fit AR(1) by maximum likelihood */

7-34

AR(2)
The autoregressive model of order 2, AR(2), is

    Y_t = φ_1 Y_{t-1} + φ_2 Y_{t-2} + δ + ε_t

where the ε_t are independent white noise with mean 0 and variance σ². We can re-write it as

    Y_t - φ_1 Y_{t-1} - φ_2 Y_{t-2} = δ + ε_t

or

    φ(B) Y_t = δ + ε_t

where φ(B) = 1 - φ_1 B - φ_2 B².
The AR(2) model resembles a multiple linear regression where Y_{t-1} and Y_{t-2} are the independent variables. As in the AR(1) case, we can use successive substitution to eventually express Y_t as an infinite MA model such that

    Y_t = μ + Σ_{j=0}^{∞} ψ_j ε_{t-j}

The infinite MA will be absolutely summable (this means that Σ_{j=0}^{∞} |ψ_j| < ∞).

7-35

Another method to express Y_t in terms of the noises is:

    Y_t = (1 - φ_1 B - φ_2 B²)^{-1} δ + (1 - φ_1 B - φ_2 B²)^{-1} ε_t
        = δ / (1 - φ_1 - φ_2) + Σ_{j=0}^{∞} ψ_j ε_{t-j}
        = μ + ψ(B) ε_t

where

    μ = δ / (1 - φ_1 - φ_2)

and

    ψ(B) = (1 - φ_1 B - φ_2 B²)^{-1}

The {ψ_j} can be determined from ψ(B) = φ(B)^{-1} = Σ_{j=0}^{∞} ψ_j B^j, which implies

    φ(B) ψ(B) = 1

Hence, we have

    (1 - φ_1 B - φ_2 B²)(ψ_0 + ψ_1 B + ψ_2 B² + ...) = 1

and therefore

    ψ_0 = 1
    ψ_1 - φ_1 ψ_0 = 0

and

    ψ_j - φ_1 ψ_{j-1} - φ_2 ψ_{j-2} = 0    for j ≥ 1

where ψ_0 = 1 and ψ_j = 0 for j < 0.

Thus {ψ_j} satisfies a second order difference equation.

7-36

Condition for Stationarity

The condition for stationarity is that the roots of

    φ(z) = 1 - φ_1 z - φ_2 z² = 0        (2)

are greater than 1 in absolute value, or the roots of

    m² - φ_1 m - φ_2 = 0                 (3)

are less than 1 in absolute value. This condition makes the infinite MA absolutely summable, Σ_{j=0}^{∞} |ψ_j| < ∞.

Note that the roots of (3) are the reciprocals of the roots of (2). That is, if m_1 and m_2 are the roots of (3) and z_1 and z_2 are the roots of (2), then

    m_1 = 1/z_1    and    m_2 = 1/z_2

This condition will lead us to the formula

    ψ_j = c_1 m_1^j + c_2 m_2^j    for any j ≥ 0

7-37

Mean of AR(2)
For the model

    Y_t - φ_1 Y_{t-1} - φ_2 Y_{t-2} = δ + ε_t

we have

    E(Y_t) - φ_1 E(Y_{t-1}) - φ_2 E(Y_{t-2}) = δ + E(ε_t)

and this implies

    μ = δ / (1 - φ_1 - φ_2)

7-38

Variance, Autocovariance and Autocorrelation of AR(2)

Note that

    Cov(ε_t, Y_{t-k}) = { σ²  if k = 0
                          0   if k > 0 }

    γ_0 = Var(Y_t) = Var(φ_1 Y_{t-1} + φ_2 Y_{t-2} + ε_t)
        = φ_1² γ_0 + φ_2² γ_0 + 2 φ_1 φ_2 γ_1 + σ²

    γ_1 = Cov(Y_t, Y_{t-1})
        = Cov(φ_1 Y_{t-1} + φ_2 Y_{t-2} + ε_t, Y_{t-1})
        = φ_1 γ_0 + φ_2 γ_1

This implies

    γ_1 = φ_1 γ_0 / (1 - φ_2)

For k > 0,

    γ_k = Cov(Y_t, Y_{t-k})
        = Cov(φ_1 Y_{t-1} + φ_2 Y_{t-2} + ε_t, Y_{t-k})
        = φ_1 γ_{k-1} + φ_2 γ_{k-2}

This implies

    ρ_k = φ_1 ρ_{k-1} + φ_2 ρ_{k-2}

for k > 0. This is called the Yule-Walker Equation.
7-39

In the Y-W Equation, k = 1 and k = 2 are very important for AR(2). They are

    ρ_1 = φ_1 ρ_0 + φ_2 ρ_1 = φ_1 + φ_2 ρ_1        (4)

and

    ρ_2 = φ_1 ρ_1 + φ_2 ρ_0 = φ_1 ρ_1 + φ_2        (5)

Solving these two equations, we have

    ρ_1 = φ_1 / (1 - φ_2)

and

    ρ_2 = φ_2 + φ_1² / (1 - φ_2)

Higher lag values of ρ_k can then be computed recursively by the difference equation. For example:

    ρ_3 = φ_1 ρ_2 + φ_2 ρ_1

Equations (4) and (5) can also be used to solve for φ_1 and φ_2, such that

    φ_1 = ρ_1 (1 - ρ_2) / (1 - ρ_1²)
    φ_2 = (ρ_2 - ρ_1²) / (1 - ρ_1²)

7-40
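A small numerical check of equations (4) and (5) (my own illustrative numbers, not from the notes): starting from φ_1 = 0.5 and φ_2 = 0.3 we can compute ρ_1, ρ_2 and then recover the φ's by solving the 2x2 Yule-Walker system.

    # Illustrative sketch: Yule-Walker relations for an AR(2) in both directions.
    import numpy as np

    phi1, phi2 = 0.5, 0.3                      # a stationary AR(2), chosen for illustration
    rho1 = phi1 / (1 - phi2)                   # about 0.714
    rho2 = phi2 + phi1 ** 2 / (1 - phi2)       # about 0.657

    P = np.array([[1.0, rho1], [rho1, 1.0]])   # 2x2 Yule-Walker matrix
    phi_hat = np.linalg.solve(P, np.array([rho1, rho2]))
    print(phi_hat)                             # approximately [0.5, 0.3]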

The ACF ρ_k satisfies the second order difference equation (Y-W Equation):

    ρ_k = φ_1 ρ_{k-1} + φ_2 ρ_{k-2}

From difference equation theory, the solution ρ_k has the form:

    ρ_k = c_1 m_1^k + c_2 m_2^k    for any k ≥ 0

(if m_1 and m_2 are distinct and real) where m_1, m_2 are the roots of

    m² - φ_1 m - φ_2 = 0

c_1 and c_2 can be determined from the initial conditions

    ρ_0 = 1    and    ρ_1 = φ_1 / (1 - φ_2)

In this situation, ρ_k declines exponentially as k increases.
When m_1 and m_2 are complex, say

    m_1, m_2 = R(cos ω ± i sin ω)

c_1 and c_2 will be complex also, say

    c_1, c_2 = a ± bi

so that

    ρ_k = c_1 m_1^k + c_2 m_2^k
        = (a + bi) R^k (cos ω + i sin ω)^k + (a - bi) R^k (cos ω - i sin ω)^k
        = R^k (a_1 cos(kω) + a_2 sin(kω))

7-41

where

    R = |m_1| = |m_2| = (-φ_2)^{1/2} < 1

and ω satisfies

    cos ω = φ_1 / (2 (-φ_2)^{1/2})

In this situation, ρ_k is a damped sinusoid with damping factor R, period 2π/ω and frequency ω.

7-42

PACF of AR(2)
The PACF of AR(2) is

    φ_{11} = ρ_1

and

    φ_{22} = (ρ_2 - ρ_1²) / (1 - ρ_1²)    ( = φ_2 )

and

    φ_{kk} = 0    for k > 2

Hence, the ACF of an AR(2) damps off exponentially or sinusoidally, while the PACF cuts off after lag 2.

7-43

/*-------------------------------------------------------*/
/*----                  EXAMPLE                       ----*/
/*-------------------------------------------------------*/

                             Approx.
Parameter   Estimate   Std Error   T Ratio   Lag
MU           6.96407     0.20628     33.76     0
AR1,1        0.51108     0.08640      5.92     1

Constant Estimate   = 3.40488718
Variance Estimate   = 1.04185881
Std Error Estimate  = 1.02071485
AIC                 = 290.170809
SBC                 = 295.38115
Number of Residuals = 100

Autocorrelation Check of Residuals

To    Chi-                  Autocorrelations
Lag   Square   DF   Prob
  6     6.91    5   0.228  -0.005  0.030  0.106 -0.221 -0.014  0.063
 12    13.53   11   0.260  -0.191  0.024  0.022  0.060 -0.127 -0.046
 18    17.83   17   0.399  -0.162 -0.058  0.004  0.036  0.010  0.072
 24    20.85   23   0.590  -0.086 -0.031 -0.032 -0.055 -0.048  0.092

Autoregressive Factors
Factor 1: 1 - 0.51108 B**(1)

7-44

                             Approx.
Parameter   Estimate   Std Error   T Ratio   Lag
MU           6.96184     0.16749     41.57     0
MA1,1       -0.45979     0.10020     -4.59     1
MA1,2       -0.15518     0.10022     -1.55     2

Constant Estimate   = 6.96184387
Variance Estimate   = 1.08819896
Std Error Estimate  = 1.04316775
AIC                 = 295.415416
SBC                 = 303.230926
Number of Residuals = 100

Correlations of the Estimates

Parameter      MU    MA1,1   MA1,2
MA1,1       0.001    1.000   0.394
MA1,2      -0.001    0.394   1.000

Autocorrelation Check of Residuals

To    Chi-                  Autocorrelations
Lag   Square   DF   Prob
  6     9.45    4   0.051   0.049  0.100  0.182 -0.203 -0.012  0.008
 12    16.43   10   0.088  -0.195  0.018 -0.142 -0.063 -0.004  0.051
 18    21.02   16   0.178  -0.170 -0.078 -0.013  0.012 -0.005  0.059
 24    24.00   22   0.347  -0.087 -0.037 -0.032 -0.052 -0.038  0.094

Moving Average Factors

Factor 1: 1 + 0.45979 B**(1) + 0.15518 B**(2)

7-45

                             Approx.
Parameter   Estimate   Std Error   T Ratio   Lag
MU           6.97335     0.21712     32.12     0
MA1,1       -0.54556     0.09870     -5.53     1
MA1,2       -0.36785     0.10771     -3.42     2
MA1,3       -0.27311     0.09959     -2.74     3

Constant Estimate   = 6.97334862
Variance Estimate   = 1.00988046
Std Error Estimate  = 1.00492809
AIC                 = 289.200417
SBC                 = 299.621098
Number of Residuals = 100

Correlations of the Estimates

Parameter      MU    MA1,1   MA1,2   MA1,3
MA1,1      -0.006    1.000   0.442   0.239
MA1,2      -0.012    0.442   1.000   0.448
MA1,3      -0.015    0.239   0.448   1.000

Autocorrelation Check of Residuals

To    Chi-                  Autocorrelations
Lag   Square   DF   Prob
  6     1.60    3   0.661  -0.023 -0.031 -0.004 -0.078  0.009  0.024
 12    11.02    9   0.275  -0.222  0.003  0.086  0.110 -0.147 -0.031

Moving Average Factors

Factor 1: 1 + 0.54556 B**(1) + 0.36785 B**(2) + 0.27311 B**(3)

7-46

GENERAL ORDER AUTOREGRESSIVE MODELS

The autoregressive model of order p, AR(p), is

    Y_t = φ_1 Y_{t-1} + φ_2 Y_{t-2} + ... + φ_p Y_{t-p} + δ + ε_t

where the ε_t are independent white noise with mean 0 and variance σ², or

    φ(B) Y_t = δ + ε_t

where φ(B) = 1 - Σ_{j=1}^{p} φ_j B^j is the AR operator.

If all roots of

    φ(z) = 1 - φ_1 z - φ_2 z² - ... - φ_p z^p = 0

are larger than one in absolute value, or all roots of

    m^p - φ_1 m^{p-1} - φ_2 m^{p-2} - ... - φ_p = 0

are smaller than one in absolute value, then the process is stationary and has a convergent infinite MA representation.

7-47

That is

    Y_t = φ(B)^{-1} δ + φ(B)^{-1} ε_t = μ + ψ(B) ε_t

where

    μ = E(Y_t) = φ(B)^{-1} δ = δ / (1 - φ_1 - φ_2 - ... - φ_p)

    ψ(B) = Σ_{j=0}^{∞} ψ_j B^j = φ(B)^{-1}

and Σ_{j=0}^{∞} |ψ_j| < ∞.

ψ_j is determined from the relation:

    φ(B) ψ(B) = 1

This implies that ψ_j satisfies

    ψ_j - φ_1 ψ_{j-1} - φ_2 ψ_{j-2} - ... - φ_p ψ_{j-p} = 0

for j > 0. Note that ψ_0 = 1 and ψ_j = 0 for j < 0.
The solution of the difference equation implies that ψ_j satisfies

    ψ_j = Σ_{i=1}^{p} c_i m_i^j

where the m_i are the roots of

    m^p - φ_1 m^{p-1} - φ_2 m^{p-2} - ... - φ_p = 0

7-48

Autocovariance and Autocorrelation

The autocovariances γ_s of AR(p) satisfy the Yule-Walker Equation:

    γ_s = φ_1 γ_{s-1} + φ_2 γ_{s-2} + ... + φ_p γ_{s-p}        (6)

Dividing (6) by γ_0, we get the Yule-Walker Equation for the ACF ρ_s:

    ρ_s = φ_1 ρ_{s-1} + φ_2 ρ_{s-2} + ... + φ_p ρ_{s-p}

The ACF satisfies the same difference equation as the γ_s and ψ_s, but with different initial conditions.
The general solution to the above difference equation is

    ρ_s = c_1 m_1^s + c_2 m_2^s + ... + c_p m_p^s

where the m_i are the roots of

    m^p - φ_1 m^{p-1} - φ_2 m^{p-2} - ... - φ_p = 0

The Yule-Walker equations are useful for determining the AR parameters φ_1, ..., φ_p. The equations can be expressed in matrix form as

    P φ = ρ

where

    P = [ 1         ρ_1        ρ_2      ...  ρ_{p-1} ]
        [ ρ_1       1          ρ_1      ...  ρ_{p-2} ]
        [ ...       ...        ...      ...  ...     ]
        [ ρ_{p-1}   ρ_{p-2}    ρ_{p-3}  ...  1       ]

7-49

and φ = (φ_1, ..., φ_p)' and ρ = (ρ_1, ..., ρ_p)'.
The equations are used to solve for φ in terms of the ACF; the solution is:

    φ = P^{-1} ρ

The sample version of this solution replaces ρ_s by the sample ACF r_s, and the resulting estimate of φ (which is called the Yule-Walker estimate of the AR parameters) is

    φ̂ = R^{-1} r

where

    R = [ 1         r_1        r_2      ...  r_{p-1} ]
        [ r_1       1          r_1      ...  r_{p-2} ]
        [ ...       ...        ...      ...  ...     ]
        [ r_{p-1}   r_{p-2}    r_{p-3}  ...  1       ]

and r = (r_1, ..., r_p)'.

The variance γ_0 = Var(Y_t) can be expressed as

    γ_0 = φ_1 γ_1 + φ_2 γ_2 + ... + φ_p γ_p + σ²

Hence,

    σ² = γ_0 - φ_1 γ_1 - φ_2 γ_2 - ... - φ_p γ_p
       = γ_0 (1 - φ_1 ρ_1 - φ_2 ρ_2 - ... - φ_p ρ_p)

7-50

Partial Autocorrelation Function

When fitting AR models to data, we need to choose an appropriate order p for the model. The PACF is useful here.
Suppose Y_t is stationary with ACF ρ_s. For k ≥ 1, consider the first k Yule-Walker equations corresponding to an AR(k) model:

    ρ_s = φ_1 ρ_{s-1} + φ_2 ρ_{s-2} + ... + φ_k ρ_{s-k},    s = 1, ..., k        (7)

and let φ_{1k}, φ_{2k}, ..., φ_{kk} denote the solution to the Yule-Walker equations for φ_1, φ_2, ..., φ_k. This system can be solved for each order k = 1, 2, ..., and the quantity φ_{kk} is the PACF at lag k.
k = 1:

    ρ_1 = φ_{11} ρ_0    so    φ_{11} = ρ_1

k = 2 implies

    [ 1    ρ_1 ] [ φ_{12} ]   [ ρ_1 ]
    [ ρ_1  1   ] [ φ_{22} ] = [ ρ_2 ]

This implies

    φ_{12} = ρ_1 (1 - ρ_2) / (1 - ρ_1²)

and

    φ_{22} = (ρ_2 - ρ_1²) / (1 - ρ_1²)

7-51

When we actually have an AR(p) process and we set k = p in equation (7), the solution is (φ_{1p}, ..., φ_{pp})' = (φ_1, ..., φ_p)', and hence φ_{pp} = φ_p. When k > p, we get φ_{kk} = 0.

The PACF φ_{kk} at lag k is actually equal to the partial correlation between Y_t and Y_{t-k}, when we adjust for the intermediate values Y_{t-1}, Y_{t-2}, ..., Y_{t-k+1}.

7-52
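The successive Yule-Walker systems in (7) can be solved mechanically. A hedged Python sketch (assuming the ACF values are given; the helper name is my own):

    # Illustrative sketch: phi_kk obtained by solving the k-th order Yule-Walker system.
    import numpy as np
    from scipy.linalg import toeplitz

    def pacf_from_acf(rho):
        """rho[k-1] holds rho_k; returns [phi_11, phi_22, ...]."""
        rho = np.asarray(rho, dtype=float)
        out = []
        for k in range(1, len(rho) + 1):
            P = toeplitz(np.r_[1.0, rho[: k - 1]])   # k x k matrix of ACF values
            phi = np.linalg.solve(P, rho[:k])
            out.append(phi[-1])                      # the last coefficient is phi_kk
        return out

    # e.g. the ACF of an AR(1) with phi = 0.6: phi_11 = 0.6 and phi_kk near 0 for k > 1
    print(pacf_from_acf([0.6 ** k for k in range(1, 6)]))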

INVERTIBILITY OF MA MODELS

    Y_t = μ + θ(B) ε_t

If all roots of

    θ(z) = 1 - θ_1 z - θ_2 z² - ... - θ_q z^q = 0

are larger than one in absolute value, or all roots of

    m^q - θ_1 m^{q-1} - θ_2 m^{q-2} - ... - θ_q = 0

are smaller than one in absolute value, then the MA process can be expressed in the form of an infinite AR model. That is:

    θ(B)^{-1} Y_t = θ(B)^{-1} μ + ε_t

or

    π(B) Y_t = δ + ε_t

where

    π(B) = 1 - π_1 B - π_2 B² - ... = θ(B)^{-1}

with Σ_{j=1}^{∞} |π_j| < ∞. That is:

    Y_t = Σ_{j=1}^{∞} π_j Y_{t-j} + δ + ε_t

Then the MA process is said to be invertible.


7-53

MIXED AUTOREGRESSIVE MOVING AVERAGE (ARMA) MODEL
Y_t follows an ARMA(p, q) model if it satisfies:

    Y_t = φ_1 Y_{t-1} + φ_2 Y_{t-2} + ... + φ_p Y_{t-p} + δ + ε_t - θ_1 ε_{t-1} - ... - θ_q ε_{t-q}

where the ε_t are independent white noise with mean 0 and variance σ², or

    φ(B) Y_t = δ + θ(B) ε_t

where φ(B) = 1 - Σ_{j=1}^{p} φ_j B^j is the AR operator and θ(B) = 1 - Σ_{j=1}^{q} θ_j B^j is the MA operator.

If all roots of

    φ(z) = 1 - φ_1 z - φ_2 z² - ... - φ_p z^p = 0

are larger than one in absolute value, then the process is stationary and has the convergent infinite MA representation:

    Y_t = φ(B)^{-1} δ + φ(B)^{-1} θ(B) ε_t = μ + ψ(B) ε_t

where

    μ = E(Y_t) = φ(B)^{-1} δ = δ / (1 - φ_1 - φ_2 - ... - φ_p)

7-54

    ψ(B) = Σ_{j=0}^{∞} ψ_j B^j = φ(B)^{-1} θ(B)

and Σ_{j=0}^{∞} |ψ_j| < ∞.

If all roots of

    θ(z) = 1 - θ_1 z - θ_2 z² - ... - θ_q z^q = 0

are larger than one in absolute value, then the process is invertible and has a convergent infinite AR representation. That is:

    θ(B)^{-1} φ(B) Y_t = θ(B)^{-1} δ + ε_t

or

    π(B) Y_t = δ + ε_t

where

    π(B) = 1 - π_1 B - π_2 B² - ... = θ(B)^{-1} φ(B)

with Σ_{j=1}^{∞} |π_j| < ∞.

7-55
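A hedged sketch of fitting an ARMA(1,1) in Python with statsmodels (the notes use SAS PROC ARIMA for this); the simulated parameter values are illustrative. Note that statsmodels writes the MA part with a plus sign, opposite to the θ convention above.

    # Illustrative sketch: simulate and fit an ARMA(1,1).
    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    rng = np.random.default_rng(7)
    T, phi, theta = 300, 0.6, 0.4
    eps = rng.normal(size=T)
    y = np.zeros(T)
    for t in range(1, T):
        y[t] = phi * y[t - 1] + eps[t] - theta * eps[t - 1]

    res = ARIMA(y, order=(1, 0, 1)).fit()
    print(res.params)   # ar.L1 near 0.6; ma.L1 near -0.4 because of the sign convention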

/*-------------------------------------------------------*/
/*----   EXAMPLE : FORECASTING STOCK MARKET PRICES   ----*/
/*-------------------------------------------------------*/
TITLE 'AT&T STOCK PRICES';
PROC ARIMA DATA=ATTSTOCK;
  /*----  First Analysis  ----*/
  IDENTIFY VAR=X CENTER NLAG=13;
  ESTIMATE P=1 METHOD=CLS NOCONSTANT;
  ESTIMATE P=1 METHOD=ML  NOCONSTANT;
  ESTIMATE P=1 METHOD=ULS NOCONSTANT;
  FORECAST OUT=B1 LEAD=12 ID=N;
PROC ARIMA DATA=ATTSTOCK;
  /*----  Second Analysis  ----*/
  IDENTIFY VAR=X(1) CENTER NLAG=13;
  ESTIMATE METHOD=ULS NOCONSTANT;
  FORECAST OUT=B2 LEAD=12 ID=N;
PROC ARIMA DATA=ATTSTOCK;
  /*----  Third Analysis  ----*/
  IDENTIFY VAR=X(1) NLAG=13;
  ESTIMATE METHOD=ULS NOCONSTANT;
  FORECAST LEAD=12 ID=N;
PROC PLOT DATA=B2(FIRSTOBS=40);
  PLOT FORECAST*N=F X*N=* L95*N=L U95*N=U
       / OVERLAY VAXIS=46 TO 60;
  TITLE2 'ARIMA(0,1,0) FORECAST LEAD=12';

7-56

Name of variable = X.
Mean of working series  =
Standard deviation      =   3.4136
Number of observations  =   52

Autocorrelations
Lag   Covariance   Correlation      Std
 0    11.652662     1.00000         0
 1    10.890128     0.93456      0.138675
 2    10.046985     0.86221      0.229833
 3     9.490750     0.81447      0.285334
 4     8.717965     0.74815      0.327001
 5     7.973288     0.68425      0.358410
 6     7.242184     0.62150      0.382707
 7     6.447407     0.55330      0.401648
 8     5.745524     0.49307      0.416048
 9     5.150995     0.44204      0.427138
10     4.394067     0.37709      0.435846
11     3.370612     0.28926      0.442076
12     2.532964     0.21737      0.445701
13     2.054507     0.17631      0.447735
(correlogram bars not reproduced)
"." marks two standard errors

7-57

Partial Autocorrelations
Lag   Correlation
 1     0.93456
 2    -0.08847
 3     0.15987
 4    -0.20256
 5     0.04819
 6    -0.10464
 7    -0.03046
 8    -0.00025
 9     0.02826
10    -0.14244
11    -0.21337
12     0.05050
13     0.15800
(correlogram bars not reproduced)

Autocorrelation Check for White Noise

To    Chi-                  Autocorrelations
Lag   Square   DF   Prob
  6   212.15    6   0.000   0.935  0.862  0.814  0.748  0.684  0.622
 12   278.08   12   0.000   0.553  0.493  0.442  0.377  0.289  0.217

                             Approx.
Parameter   Estimate   Std Error   T Ratio   Lag
AR1,1        0.98453     0.04062     24.24     1

Variance Estimate   = 0.94924184
Std Error Estimate  = 0.97429043
AIC                 = 145.8511*
SBC                 = 147.802344*
Number of Residuals = 52
* Does not include log determinant.

7-58

Autocorrelation Check of Residuals

To    Chi-
Lag   Square   DF   Prob
  6     8.41    5   0.135
 12    16.17   11   0.135
 18    23.05   17   0.148
 24    34.06   23   0.063
(individual residual autocorrelations not reproduced)

Data have been centered by subtracting the value 57.795673077.
No mean term in this model.
Autoregressive Factors
Factor 1: 1 - 0.98453 B**(1)

7-59

Forecasts for variable X

Obs   Forecast   Std Error   Lower 95%   Upper 95%
 53    52.2534     0.8664      50.5553     53.9515
 54    52.2568     1.2249      49.8560     54.6575
 55    52.2601     1.4997      49.3207     55.1995
 56    52.2635     1.7312      48.8704     55.6566
 57    52.2669     1.9350      48.4744     56.0593
 58    52.2702     2.1190      48.1170     56.4234
 59    52.2736     2.2881      47.7890     56.7582
 60    52.2770     2.4453      47.4842     57.0697
 61    52.2803     2.5929      47.1984     57.3623
 62    52.2837     2.7323      46.9285     57.6389
 63    52.2870     2.8648      46.6721     57.9019
 64    52.2904     2.9913      46.4276     58.1532

7-60

[Plot of FORECAST*N (symbol F), X*N (symbol *), L95*N (symbol L) and U95*N (symbol U), overlaid, for N = 40 to 64 with vertical axis from 46 to 60; the character plot is not reproduced.]

7-61

MODELS FOR NONSTATIONARY TIME SERIES

For many nonstationary series which exhibit homogeneous behavior, the first difference of the series:

    W_t = Y_t - Y_{t-1} = (1 - B) Y_t

may be a stationary series; or, if the first difference is not stationary, its second difference

    W_t = (1 - B)² Y_t = (1 - 2B + B²) Y_t = Y_t - 2Y_{t-1} + Y_{t-2}

may be stationary.
So a useful class of models for nonstationary series are models such that the dth difference

    W_t = (1 - B)^d Y_t

is a stationary series and W_t follows an ARMA(p, q) model. So

    φ(B) W_t = δ + θ(B) ε_t

or

    φ(B)(1 - B)^d Y_t = δ + θ(B) ε_t

7-62

The model for Y_t is called an Autoregressive Integrated Moving Average model of order (p, d, q), or ARIMA(p, d, q).
Generally, when d > 0, it is often the case that d = 1 (occasionally d = 2), i.e.

    W_t = (1 - B) Y_t is ARMA(p, q)

or

    Y_t is ARIMA(p, 1, q)

To get Y_t from W_t, we must sum or integrate W_t, i.e.

    Y_t = (1 - B)^{-1} W_t = (1 + B + B² + ...) W_t = W_t + W_{t-1} + W_{t-2} + ...

7-63
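A minimal Python sketch of differencing and of the summation ("integration") that reverses it (illustrative; the simulated series and seed are my own):

    # Illustrative sketch: W_t = (1 - B) Y_t and recovering Y_t by cumulative summation.
    import numpy as np

    rng = np.random.default_rng(3)
    y = 10 + np.cumsum(0.2 + rng.normal(size=100))       # a random walk with drift
    w = np.diff(y)                                       # W_t = Y_t - Y_{t-1}
    y_back = y[0] + np.concatenate(([0.0], np.cumsum(w)))
    print(np.allclose(y, y_back))                        # True: summing W_t restores Y_t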

ARIMA(p, d, q) MODEL

Y_t is non-stationary such that

    W_t = (1 - B)^d Y_t

is a stationary ARMA(p, q), i.e.

    φ(B) W_t = δ + θ(B) ε_t
    φ(B)(1 - B)^d Y_t = δ + θ(B) ε_t

Writing

    φ*(B) = φ(B)(1 - B)^d = 1 - φ*_1 B - ... - φ*_{p+d} B^{p+d}

Y_t has the form of an ARMA(p + d, q) model, but it is non-stationary, with d roots of φ*(B) = 0 equal to 1.

7-64

Unit Root Stochastic Process

For the AR(1) model:

    Y_t = φ Y_{t-1} + δ + ε_t

successive substitution gives

    Y_t = φ^t Y_0 + δ Σ_{j=0}^{t-1} φ^j + Σ_{j=0}^{t-1} φ^j ε_{t-j}

which is non-stationary when φ = 1: the unit root problem.

7-65

Trend Stationary (TS) and Difference Stationary (DS)

Trend Stationary: the trend is completely predictable.
Difference Stationary: the trend is stochastic but becomes stationary after differencing.
Consider

    Y_t = β_1 + β_2 t + β_3 Y_{t-1} + u_t

1. Random Walk Model [RWM] without drift (β_1 = β_2 = 0, β_3 = 1):

    Y_t = Y_{t-1} + u_t

is non-stationary, but

    ΔY_t = (Y_t - Y_{t-1}) = u_t

is stationary.
2. Random Walk Model with drift (β_1 ≠ 0, β_2 = 0, β_3 = 1):

    Y_t = β_1 + Y_{t-1} + u_t

is non-stationary, but

    ΔY_t = (Y_t - Y_{t-1}) = β_1 + u_t

is stationary, and Y_t exhibits a positive (β_1 > 0) or negative (β_1 < 0) trend.

7-66

3. Deterministic Trend Model (β_1 ≠ 0, β_2 ≠ 0, β_3 = 0):

    Y_t = β_1 + β_2 t + u_t

is non-stationary but stationary after detrending.
4. Random Walk with Drift and Deterministic Trend (β_1 ≠ 0, β_2 ≠ 0, β_3 = 1):

    Y_t = β_1 + β_2 t + Y_{t-1} + u_t

is non-stationary, and

    ΔY_t = β_1 + β_2 t + u_t

is still non-stationary.
5. Deterministic Trend with Stationary AR(1) Component (β_1 ≠ 0, β_2 ≠ 0, β_3 < 1):

    Y_t = β_1 + β_2 t + β_3 Y_{t-1} + u_t

which is stationary around the deterministic trend.

7-67

Integrated Stochastic Process

A time series Y_t is said to be integrated of order d, denoted Y_t ~ I(d), if Y_t becomes stationary after differencing d times. Hence, the random walk model without drift, the random walk model with drift, the deterministic trend model and the random walk with a stationary AR(1) component are I(1), while the random walk with drift and deterministic trend is I(2).
Properties of Integrated Series
1. If X_t ~ I(0) and Y_t ~ I(1), then Z_t = X_t + Y_t ~ I(1).
2. If X_t ~ I(d), then Z_t = a + bX_t ~ I(d).
3. If X_t ~ I(d_1) and Y_t ~ I(d_2), then Z_t = aX_t + bY_t ~ I(d_2), where d_1 < d_2.
4. If X_t ~ I(d) and Y_t ~ I(d), then Z_t = aX_t + bY_t ~ I(d*), where d* is generally equal to d but sometimes d* < d (when the series are cointegrated).

7-68

Problems
1. Consider

    Y_t = β_1 + β_2 X_t + u_t

The OLS estimate is

    β̂_2 = Σ x_t y_t / Σ x_t²

where x_t and y_t denote deviations from the sample means. If X_t ~ I(1) and Y_t ~ I(0), then X_t is non-stationary and its variance will increase indefinitely, so the denominator dominates the numerator, with the result that β̂_2 will converge to zero asymptotically, and it will not even have an asymptotic distribution.

2. Spurious Regression
Consider

    Y_t = Y_{t-1} + u_t
    X_t = X_{t-1} + v_t

with Y_0 = 0 and X_0 = 0, where u_t and v_t are independent.

However, when we simulate independent u_t and v_t from N(0, 1) and fit

    Y_t = β_1 + β_2 X_t + e_t

we find that the estimate of β_2 is significantly different from zero and R² is also significantly different from zero.
7-69
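The spurious regression phenomenon is easy to reproduce. A hedged Python sketch (sample size and seed are illustrative choices of mine):

    # Illustrative sketch: regress one random walk on an independent one.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(123)
    T = 500
    y = np.cumsum(rng.normal(size=T))        # Y_t = Y_{t-1} + u_t
    x = np.cumsum(rng.normal(size=T))        # X_t = X_{t-1} + v_t, independent of u_t

    res = sm.OLS(y, sm.add_constant(x)).fit()
    print(res.tvalues[1], res.rsquared)      # t ratio and R^2 are typically "significant"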

The Unit Root Test

Consider

    Y_t = ρ Y_{t-1} + u_t,    -1 ≤ ρ ≤ 1

where u_t is an error term. Then

    ΔY_t = (Y_t - Y_{t-1}) = ρ Y_{t-1} - Y_{t-1} + u_t
         = (ρ - 1) Y_{t-1} + u_t
         = δ Y_{t-1} + u_t                                      (8)

where δ = ρ - 1.

    H_0: ρ = 1    vs    H_1: ρ < 1

is equivalent to

    H_0: δ = 0    vs    H_1: δ < 0                              (9)

If H_0 is true, then ΔY_t = u_t is white noise. To test (9), we simply regress ΔY_t on Y_{t-1} and obtain the estimated slope coefficient δ̂. Unfortunately, however, the estimate does not follow the t distribution even in large samples.
Dickey and Fuller show that the estimate follows the τ (tau) statistic. The test is known as the Dickey-Fuller (DF) test.
If the hypothesis H_0: δ = 0 is rejected, we can use the usual (Student's) t test.

7-70

The DF test is estimated in three different forms:

1. Y_t is a random walk:

    ΔY_t = δ Y_{t-1} + u_t                                      (10)

2. Y_t is a random walk with drift:

    ΔY_t = β_1 + δ Y_{t-1} + u_t                                (11)

3. Y_t is a random walk with drift around a deterministic trend:

    ΔY_t = β_1 + β_2 t + δ Y_{t-1} + u_t                        (12)

If the hypothesis H_0: δ = 0 is rejected, then Y_t is a stationary time series with zero mean in the case of (10), is stationary with a nonzero mean in the case of (11), and is stationary around a deterministic trend in the case of (12).

7-71

The Augmented Dickey-Fuller (ADF) test

In the DF test for (10), (11) and (12), it is assumed that u_t is uncorrelated. If the u_t are correlated, we use the ADF test based on:

    ΔY_t = β_1 + β_2 t + δ Y_{t-1} + Σ_{i=1}^{m} α_i ΔY_{t-i} + ε_t        (13)

where ε_t is white noise and ΔY_{t-i} = Y_{t-i} - Y_{t-i-1}. The number of lagged difference terms is chosen so that ε_t is white noise. The ADF test follows the same asymptotic distribution as the DF test, and so the same critical values can be used.

7-72
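In Python, the DF/ADF regressions (10)-(13) are available through statsmodels' adfuller; a hedged sketch (regression='c' corresponds to (11) and 'ct' to (12); the simulated series is illustrative):

    # Illustrative sketch: ADF unit root test on a simulated random walk.
    import numpy as np
    from statsmodels.tsa.stattools import adfuller

    rng = np.random.default_rng(5)
    y = np.cumsum(rng.normal(size=300))      # a pure random walk, so H0 should not be rejected

    stat, pvalue, usedlag, nobs, crit, icbest = adfuller(y, regression='c', autolag='AIC')
    print(stat, pvalue, crit)                # tau statistic, p-value and critical values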

Cointegration
If Y_t ~ I(1) and X_t ~ I(1), but u_t ~ I(0), where

    Y_t = β_1 + β_2 X_t + u_t                                   (14)

then Y_t and X_t are said to be cointegrated.

As both Y_t and X_t are I(1), they have stochastic trends, yet their linear combination u_t ~ I(0) cancels out the stochastic trends. As a result, the cointegrating regression (14) is meaningful, and β_2 is called the cointegrating parameter.
Economically speaking, two variables will be cointegrated if they have a long-term, or equilibrium, relationship between them.

7-73

Testing for Cointegration

A number of tests have been proposed; we consider two simple methods:
1. Engle-Granger (EG) or Augmented Engle-Granger (AEG) test
Apply the DF or ADF unit root test to the residuals û_t estimated from the cointegrating regression.
Since the estimated û_t are based on the estimated cointegrating parameter β̂_2, the DF and ADF critical values are not appropriate. Engle and Granger have calculated the appropriate critical values; the resulting tests are known as the Engle-Granger (EG) and Augmented Engle-Granger (AEG) tests.
2. Cointegrating regression Durbin-Watson (CRDW) test
Use the Durbin-Watson d statistic obtained from the cointegrating regression, but now test

    H_0: d = 0    vs    H_1: d > 0

since d ≈ 2(1 - ρ̂).
Examples: Refer to Gujarati, pp. 825-829.
7-74
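A hedged Python sketch of the Engle-Granger idea: estimate the cointegrating regression, then test the residuals for a unit root. The statsmodels coint() function reports the Engle-Granger statistic with the appropriate critical values; the simulated data are illustrative.

    # Illustrative sketch: Engle-Granger cointegration test on simulated data.
    import numpy as np
    import statsmodels.api as sm
    from statsmodels.tsa.stattools import coint

    rng = np.random.default_rng(11)
    T = 500
    x = np.cumsum(rng.normal(size=T))              # X_t is I(1)
    y = 2.0 + 0.5 * x + rng.normal(size=T)         # cointegrated with X_t by construction

    stat, pvalue, crit = coint(y, x)               # Engle-Granger test (null: no cointegration)
    print(stat, pvalue)

    resid = sm.OLS(y, sm.add_constant(x)).fit().resid   # step 1 of the EG procedure, by hand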
