
TIME SERIES ECONOMETRICS:

SOME BASIC CONCEPTS


Reference : Gujarati, Chapters 21, 22
1. Classical regression analysis assumes that the underlying time series data are stationary.
2. Sometimes autocorrelation arises because the underlying time series data are non-stationary.
3. Sometimes one obtains a very high R² and significant regression coefficients even though there is no meaningful relationship between the two variables: the problem of spurious, or nonsense, regression.

7-1

Stochastic Processes
Let Z_t be the observation made at time t. The units of time vary with application; they could be years, quarters, months, days, ... We assume that the observations are equally spaced in time. The sequence of random variables {Z_1, Z_2, ..., Z_T} is called a stochastic process. Its mean function is:

    μ_t = E(Z_t),    t = 0, ±1, ±2, ...

μ_t is the expected value of the process at time t. The autocovariance function is:

    γ_{t,s} = Cov(Z_t, Z_s) = E[(Z_t - μ_t)(Z_s - μ_s)],    t, s = 0, ±1, ±2, ...

The variance function is:

    Var(Z_t) = γ_{t,t} = Cov(Z_t, Z_t) = E[(Z_t - μ_t)²]

The autocorrelation function is:

    ρ_{t,s} = Corr(Z_t, Z_s) = Cov(Z_t, Z_s) / [Var(Z_t) Var(Z_s)]^{1/2},    t, s = 0, ±1, ±2, ...

7-2

STATIONARITY
The time series Z_t is weakly stationary if

    μ_t = E(Z_t) = μ

and

    γ_{t,s} = Cov(Z_t, Z_s) = Cov(Z_{t-l}, Z_{s-l})        (1)

for any integer l. Equation (1) implies:

    γ_{t,s} = γ_{0,k}

where k = |t - s|. Thus, for a stationary process we can simply write

    γ_k = Cov(Z_t, Z_{t-k})    and    ρ_k = Corr(Z_t, Z_{t-k})

Note that ρ_k = γ_k / γ_0.

7-3

WHITE NOISE
Let {ε_t} be a sequence of independent random variables with mean 0 and variance σ², and let

    Y_t = μ + ε_t

Then

    E(Y_t) = μ

    γ_k = Cov(Y_t, Y_{t-k}) = Cov(ε_t, ε_{t-k}) = { σ²  if k = 0
                                                    0   if k ≠ 0 }

and

    ρ_k = { 1  if k = 0
            0  if k ≠ 0 }

Such a sequence is called a purely random sequence or white noise sequence.

7-4

Example of Stationary Series

Let {ε_t} be a white noise sequence which is distributed as N(0, σ²). Define a new process {Y_t} by

    Y_t = μ + ε_t + ε_{t-1}

Then

    E(Y_t) = μ

    γ_0 = Var(Y_t) = Var(ε_t + ε_{t-1}) = Var(ε_t) + Var(ε_{t-1}) = 2σ²

    γ_1 = Cov(Y_t, Y_{t-1}) = Cov(ε_t + ε_{t-1}, ε_{t-1} + ε_{t-2}) = σ²

    γ_k = Cov(Y_t, Y_{t-k}) = Cov(ε_t + ε_{t-1}, ε_{t-k} + ε_{t-k-1}) = 0    for k > 1

Hence

    γ_k = { 2σ²  if k = 0              ρ_k = { 1    if k = 0
            σ²   if |k| = 1                    1/2  if |k| = 1
            0    if |k| > 1 }                  0    if |k| > 1 }

7-5

Example of Nonstationary Series


In practice, we often find series which are not stationary. For example, economic or business series may show a trend or a change in mean level over time, reflecting growth, or may be nonstationary because of seasonal features in the series.
An important practical matter in time series analysis is how to transform a nonstationary series into a stationary one, or how to model the nonstationarity. Two fundamental approaches for dealing with nonstationarity are:
1. Work with the changes or differences of the series, since these may be stationary.
2. Remove nonstationary components, e.g. a nonconstant mean, by linear regression techniques.

7-6

RANDOM WALK
Let a_t be iid N(0, σ²) and let

    Z_t = Z_{t-1} + a_t,    t = 1, 2, ...

and Z_0 = 0. Then

    Z_t = a_1 + a_2 + ... + a_t

Z_t is called a random walk, with mean μ_t = 0, variance Var(Z_t) = t σ², and autocovariance γ_{t,s} = t σ² for 1 ≤ t ≤ s. Since Var(Z_t) and γ_{t,s} depend on t, Z_t
is not stationary.

7-7
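The growth of Var(Z_t) with t can be seen by simulation. Below is a minimal sketch in Python (the notes' own examples use SAS); the seed, number of paths and σ are arbitrary illustrative choices.

    # Illustrative sketch: simulate many random walk paths and check that the
    # sample variance at time t is close to t * sigma^2.
    import numpy as np

    rng = np.random.default_rng(0)
    T, n_paths, sigma = 200, 2000, 1.0
    a = rng.normal(0.0, sigma, size=(n_paths, T))   # iid N(0, sigma^2) shocks
    Z = a.cumsum(axis=1)                            # Z_t = a_1 + ... + a_t, with Z_0 = 0
    for t in (50, 100, 200):
        print(t, round(Z[:, t - 1].var(), 1))       # roughly 50, 100, 200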

Example: Random Walk with Drift

Let ε_t be iid N(0, σ²) and let

    Y_t = Y_{t-1} + δ + ε_t,    t = 1, 2, ...

and Y_0 = 0, where δ is a constant. Such a series is called a random walk with drift. We have

    Y_t = Y_0 + δ t + Σ_{j=1}^{t} ε_j

Its mean is μ_t = E(Y_t) = δ t and its variance is Var(Y_t) = t σ². Thus {Y_t} is not stationary, with both mean and variance depending on t.
Note that the series of changes, or first differences, of {Y_t}, defined by

    Z_t = Y_t - Y_{t-1} = δ + ε_t

is a white noise series.

7-8

ESTIMATION OF MEAN, AUTOCOVARIANCES, AND AUTOCORRELATIONS FOR STATIONARY SERIES
Suppose Y_1, Y_2, ..., Y_T is a sample realization of a stationary time series {Y_t} with mean

    μ = E(Y_t)

autocovariance function

    γ_k = Cov(Y_t, Y_{t+k}) = Cov(Y_t, Y_{t-k})

and autocorrelation function

    ρ_k = Corr(Y_t, Y_{t+k}) = γ_k / γ_0

7-9

The estimator for μ is the sample mean

    Ȳ = (1/T) Σ_{t=1}^{T} Y_t

The estimator for γ_k is

    c_k = (1/T) Σ_{t=1}^{T-k} (Y_t - Ȳ)(Y_{t+k} - Ȳ),    k = 0, 1, 2, ...

where k is small relative to T. Note that

    c_0 = (1/T) Σ_{t=1}^{T} (Y_t - Ȳ)²

is the sample variance. The estimator for ρ_k is the sample ACF

    r_k = c_k / c_0 = Σ_{t=1}^{T-k} (Y_t - Ȳ)(Y_{t+k} - Ȳ) / Σ_{t=1}^{T} (Y_t - Ȳ)²,    k = 0, 1, 2, ...

A plot of r_k versus k is called a correlogram.

7-10
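As a quick illustration of these estimators, the following Python sketch computes c_k and r_k exactly as defined above; the function name and the divisor-T convention follow the slide, not any particular package (the notes themselves compute the ACF with SAS).

    # Illustrative sketch: sample autocovariances c_k and sample ACF r_k.
    import numpy as np

    def sample_acf(y, max_lag):
        y = np.asarray(y, dtype=float)
        T = len(y)
        d = y - y.mean()                 # deviations from the sample mean
        c = np.array([np.sum(d[: T - k] * d[k:]) / T for k in range(max_lag + 1)])
        return c / c[0]                  # r_0 = 1, r_1, ..., r_max_lag

    # plotting sample_acf(data, 20) against k = 0, 1, ..., 20 gives the correlogram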

Sampling Properties of Estimators

1. Ȳ is an unbiased estimator of μ. That is:

    E(Ȳ) = μ

2.

    Var(Ȳ) = Var( (1/T) Σ_{t=1}^{T} Y_t )
           = (γ_0 / T) [ 1 + 2 Σ_{k=1}^{T-1} ((T - k)/T) ρ_k ]

If the Y_t are independent, then ρ_k = 0 for all k ≠ 0 and so Var(Ȳ) = γ_0 / T.
When T is large, then

3. r_k is approximately normally distributed.

4.

    E(r_k) ≈ ρ_k

and

5.

    Var(r_k) ≈ (1/T) Σ_{s=-∞}^{∞} ( ρ_s² + ρ_{s+k} ρ_{s-k} - 4 ρ_k ρ_s ρ_{s-k} + 2 ρ_s² ρ_k² )

7-11

Special case
When the series is white noise, so ρ_s = 0 for s ≠ 0, then

    Var(r_k) ≈ 1/T    for k ≠ 0

In fact, r_k is approximately distributed as N(0, 1/T) for k = 1, 2, ...
This property will be applied to check whether the model is appropriate or not. If the model fits the data, the residuals will follow a white noise series and hence 95% of its ACF will lie between -2/√T and 2/√T.

7-12

General Characteristics of Sample ACF


1. Stationary
(a) The sample ACF tends to damp out to 0 fairly rapidly as the lag k increases:
(b) it cuts off, or
(c) damps out exponentially or sinusoidally.
2. Non-stationary
(a) The sample ACF tends to damp out very slowly, linearly, or
(b) sinusoidally but damping out very slowly, indicating a strong seasonal component.

7-13

PARTIAL AUTOCORRELATION FUNCTION
For a stationary and normally distributed time series {Z_t}, the partial autocorrelation function (PACF) at lag k is defined as:

    φ_{kk} = Corr(Z_t, Z_{t-k} | Z_{t-1}, Z_{t-2}, ..., Z_{t-k+1})

which is the correlation between Z_t and Z_{t-k} after removing the effect of the intervening variables Z_{t-1}, Z_{t-2}, ..., Z_{t-k+1}. Its estimator is the sample partial autocorrelation, r_{kk}.
Property: If {Z_t}, t = 1, 2, ..., T, is white noise, then its sample partial autocorrelation function r_{kk} is approximately distributed as N(0, 1/T) for k = 1, 2, ...
This property will be applied to check whether the model is appropriate or not. If the model fits the data, the residuals will follow a white noise series and hence 95% of its PACF will lie between -2/√T and 2/√T.

7-14

Tests of Stationarity
1. Sample ACF tends to damp out to 0 as lag k
increases fairly rapidly
(a) cut off
(b) damp out exponentially or sinusoidally
2. Sample PACF tends to damp out to 0 as lag k
increases fairly rapidly
(a) cut off
(b) damp out exponentially or sinusoidally

7-15

Tests of White Noise

If the time series is white noise, we have
1. its ACF r_k is approximately distributed as N(0, 1/T) for k = 1, 2, ..., and
2. its PACF r_{kk} is approximately distributed as N(0, 1/T) for k = 1, 2, ....
Hence, if the time series is white noise, we have
1. its ACF r_k lies between -2/√T and 2/√T;
2. its PACF r_{kk} lies between -2/√T and 2/√T.
3. In addition, we can apply the Box-Pierce Q statistic

    Q = n Σ_{k=1}^{m} r_k²

or the Ljung-Box Q statistic

    LB = n(n + 2) Σ_{k=1}^{m} r_k² / (n - k)

where n is the sample size and m is the lag length used to test for white noise.

If the time series is white noise, Q ~ χ²_m and LB ~ χ²_m.
7-16
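A hedged sketch of the Ljung-Box statistic, computed directly from the definition above (illustrative Python code, not the SAS routine used elsewhere in these notes):

    # Illustrative sketch: Ljung-Box statistic for lags 1..m with its chi-square(m) p-value.
    import numpy as np
    from scipy import stats

    def ljung_box(y, m):
        y = np.asarray(y, dtype=float)
        n = len(y)
        d = y - y.mean()
        c0 = np.sum(d * d) / n
        r = np.array([np.sum(d[: n - k] * d[k:]) / n / c0 for k in range(1, m + 1)])
        lb = n * (n + 2) * np.sum(r ** 2 / (n - np.arange(1, m + 1)))
        return lb, 1 - stats.chi2.cdf(lb, df=m)   # large p-value: consistent with white noise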

Notation:
The backward shift operator B is defined by

    B Y_t = Y_{t-1}

and hence

    B^i Y_t = Y_{t-i}

The forward shift operator F = B^{-1} is defined by

    F Y_t = Y_{t+1}

and hence

    F^i Y_t = Y_{t+i}

Example 1:

    Y_t = ε_t - θ ε_{t-1} = (1 - θB) ε_t

Example 2:

    Y_t = φ Y_{t-1} + ε_t

implies

    Y_t - φ Y_{t-1} = ε_t    or    (1 - φB) Y_t = ε_t

7-17

If |φ| < 1, then

    Y_t = (1 - φB)^{-1} ε_t

We have

    Y_t = (1 + φB + φ²B² + φ³B³ + ...) ε_t

and hence

    Y_t = ε_t + φ ε_{t-1} + φ² ε_{t-2} + φ³ ε_{t-3} + ...

Similarly, in Example 1,

    ε_t = (1 - θB)^{-1} Y_t

and hence

    ε_t = (1 + θB + θ²B² + θ³B³ + ...) Y_t
        = Y_t + θ Y_{t-1} + θ² Y_{t-2} + θ³ Y_{t-3} + ...

Remark: In Example 2, when φ = 1, we have

    Y_t = Y_{t-1} + ε_t    or    Y_t - Y_{t-1} = ε_t

which is a random walk series.
7-18

LINEAR MODELS FOR STATIONARY SERIES
The properties of a series are exhibited by its ACF. Hence, we build models which reflect the ACF structure.
Linear Filters
Often we deal with the formation of a new series {Y_t} by a linear operation applied to a given series {X_t}: X_t is the input and Y_t is the output which results from a linear operation on X_t.
A linear time-invariant filter applied to a series {X_t} produces a new series {Y_t} such that

    Y_t = Σ_{j=-∞}^{∞} ψ_j X_{t-j}

If ψ_j satisfies ψ_j = 0 for j < 0, then

    Y_t = Σ_{j=0}^{∞} ψ_j X_{t-j}

and the filter is one-sided. It is time-invariant because the coefficients ψ_j do not depend on t.

7-19

Note
1. X_t may be controllable, e.g. in a production process, {X_t} is the input of raw material and Y_t is the output of product or by-product.
2. Differencing operators are linear filters, e.g.

    Y_t = ΔX_t = X_t - X_{t-1}

and

    Y_t = Δ²X_t = Δ(X_t - X_{t-1}) = X_t - 2X_{t-1} + X_{t-2}

3. Moving averages are linear filters, e.g.

    Y_t = (1 / (2m + 1)) Σ_{j=-m}^{m} X_{t-j}

If {X_t} is stationary with mean μ_x and autocovariance γ_k, then

    Y_t = Σ_{j=-∞}^{∞} ψ_j X_{t-j}

has mean

    μ_Y = Σ_{j=-∞}^{∞} ψ_j E(X_{t-j}) = μ_x Σ_{j=-∞}^{∞} ψ_j

7-20

and autocovariance

    γ_Y(s) = Cov(Y_t, Y_{t+s})
           = Cov( Σ_{j=-∞}^{∞} ψ_j X_{t-j}, Σ_{k=-∞}^{∞} ψ_k X_{t+s-k} )
           = Σ_{j=-∞}^{∞} Σ_{k=-∞}^{∞} ψ_j ψ_k Cov(X_{t-j}, X_{t+s-k})
           = Σ_{j=-∞}^{∞} Σ_{k=-∞}^{∞} ψ_j ψ_k γ_{s+j-k}

7-21

Linear Process
{Y_t} is a linear process if it can be represented as the output of a one-sided linear filter applied to white noise {ε_t}. That is:

    Y_t = μ + Σ_{j=0}^{∞} ψ_j ε_{t-j}

where the ε_t are independent random variables with mean 0 and variance σ². In this situation,

    μ_Y = μ

and the autocovariance is

    γ_Y(s) = Cov(Y_t, Y_{t+s})
           = Cov( Σ_{j=0}^{∞} ψ_j ε_{t-j}, Σ_{k=0}^{∞} ψ_k ε_{t+s-k} )
           = Σ_{j=0}^{∞} Σ_{k=0}^{∞} ψ_j ψ_k Cov(ε_{t-j}, ε_{t+s-k})
           = σ² Σ_{j=0}^{∞} ψ_j ψ_{j+s}

because Cov(ε_{t-j}, ε_{t+s-k}) = σ² when k = j + s and equal to 0 when k ≠ j + s.

7-22

Wold's Representation Theorem: If {Y_t} is a weakly stationary nondeterministic series with mean μ, then Y_t can always be expressed as:

    Y_t = μ + Σ_{j=0}^{∞} ψ_j ε_{t-j}

with ψ_0 = 1 and Σ_{j=0}^{∞} ψ_j² < ∞, where the ε_t are uncorrelated random variables with mean 0 and variance σ².

This result supports the use of model representations of the form:

    Y_t = μ + Σ_{j=0}^{∞} ψ_j ε_{t-j},    ψ_0 = 1

as a class of models for stationary series.

7-23

FINITE MOVING AVERAGE MODEL

A simple class of models is obtained by setting ψ_j = 0 for j > q. {Y_t} is said to be a moving average process of order q (MA(q)) if it satisfies:

    Y_t = μ + ε_t - Σ_{j=1}^{q} θ_j ε_{t-j}

where the ε_t are independent white noise with mean 0 and variance σ². We write

    Y_t = μ + ε_t - Σ_{j=1}^{q} θ_j ε_{t-j} = μ + θ(B) ε_t

where θ(B) = 1 - Σ_{j=1}^{q} θ_j B^j is the MA operator.

7-24

MA(1)
When q = 1, Y_t = μ + ε_t - θ ε_{t-1}, and we have

    E(Y_t) = μ
    Var(Y_t) = γ_0 = σ²(1 + θ²)
    γ_1 = -θ σ²

and

    γ_k = 0    for |k| > 1

Hence

    ρ_1 = -θ / (1 + θ²)

7-25
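For instance (an illustrative numerical check, not taken from the notes): with θ = 0.5 and σ² = 1,

    γ_0 = σ²(1 + θ²) = 1.25,    γ_1 = -θσ² = -0.5,
    ρ_1 = -θ / (1 + θ²) = -0.5 / 1.25 = -0.4,    and ρ_k = 0 for |k| > 1.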

MA(2)
When q = 2,

    Y_t = μ + ε_t - θ_1 ε_{t-1} - θ_2 ε_{t-2}

    E(Y_t) = μ
    Var(Y_t) = γ_0 = σ²(1 + θ_1² + θ_2²)
    γ_1 = σ²(-θ_1 + θ_1 θ_2)
    γ_2 = σ²(-θ_2)

and

    γ_k = 0    for |k| > 2

Hence

    ρ_1 = (-θ_1 + θ_1 θ_2) / (1 + θ_1² + θ_2²)

and

    ρ_2 = -θ_2 / (1 + θ_1² + θ_2²)

7-26

MA(q)
The model is

    Y_t = μ + ε_t - Σ_{j=1}^{q} θ_j ε_{t-j} = μ + θ(B) ε_t

where θ(B) = 1 - Σ_{j=1}^{q} θ_j B^j.

    E(Y_t) = μ
    Var(Y_t) = γ_0 = σ²(1 + θ_1² + θ_2² + ... + θ_q²)
    γ_k = σ²(-θ_k + θ_1 θ_{k+1} + ... + θ_{q-k} θ_q)    for k = 1, 2, ..., q

and γ_k = 0 for |k| > q. Hence, the ACF is

    ρ_k = (-θ_k + θ_1 θ_{k+1} + ... + θ_{q-k} θ_q) / (1 + θ_1² + θ_2² + ... + θ_q²)

for k = 1, 2, ..., q and ρ_k = 0 for |k| > q.

7-27

AUTOREGRESSIVE MODELS
The autoregressive model of order p, AR(p), is

    Y_t = φ_1 Y_{t-1} + φ_2 Y_{t-2} + ... + φ_p Y_{t-p} + δ + ε_t

where the ε_t are independent white noise with mean 0 and variance σ². We can re-write it as

    Y_t - φ_1 Y_{t-1} - φ_2 Y_{t-2} - ... - φ_p Y_{t-p} = δ + ε_t

or

    φ(B) Y_t = δ + ε_t

where φ(B) = 1 - Σ_{j=1}^{p} φ_j B^j is the AR operator.
The AR(p) model resembles a multiple linear regression in which Y_{t-1}, ..., Y_{t-p} are the independent variables. It is called an autoregression because Y_t is regressed on its own past values.

7-28

AR(1)
When p = 1,

    Y_t = φ Y_{t-1} + δ + ε_t

Is it stationary? By successive substitution,

    Y_t = φ(φ Y_{t-2} + δ + ε_{t-1}) + δ + ε_t
        = ...
        = φ^n Y_{t-n} + δ Σ_{j=0}^{n-1} φ^j + Σ_{j=0}^{n-1} φ^j ε_{t-j}

Under the assumption that |φ| < 1, as n → ∞ we get

    Y_t = δ / (1 - φ) + Σ_{j=0}^{∞} φ^j ε_{t-j}

which is stationary. Note that

    Σ_{j=0}^{∞} |φ|^j < ∞

So, if |φ| < 1, {Y_t} represents a stationary series with the infinite MA representation above, with ψ_j = φ^j and

    E(Y_t) = μ = δ / (1 - φ)

7-29

    γ_k = Cov(Y_t, Y_{t+k}) = φ^k σ² Σ_{j=0}^{∞} φ^{2j} = φ^k σ² / (1 - φ²)

If k = 0, we have

    γ_0 = σ² / (1 - φ²)

and hence

    ρ_k = φ^k    for k ≥ 0

7-30
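Before looking at the SAS output on the following pages, here is a minimal Python sketch of the same idea: simulate an AR(1) and check that the sample ACF decays roughly like φ^k. The parameter values and seed are illustrative only and are not meant to reproduce the SAS run.

    # Illustrative sketch: simulate an AR(1) with phi = 0.5 and mean 10, then
    # compare the sample ACF with the theoretical values phi^k.
    import numpy as np

    rng = np.random.default_rng(42)
    phi, mu, sigma, T = 0.5, 10.0, 1.0, 500
    y = np.empty(T)
    y[0] = mu
    for t in range(1, T):
        y[t] = mu * (1 - phi) + phi * y[t - 1] + rng.normal(0.0, sigma)

    d = y - y.mean()
    r = [np.sum(d[: T - k] * d[k:]) / np.sum(d * d) for k in range(1, 6)]
    print([round(v, 3) for v in r])      # roughly 0.5, 0.25, 0.125, 0.06, 0.03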

Name of variable = Y.
Mean of working series  =   9.97686
Standard deviation      =   1.141318
Number of observations  =   500

Autocorrelations
Lag   Covariance   Correlation
 0     1.302606     1.00000
 1     0.617829     0.47430
 2     0.352030     0.27025
 3     0.156284     0.11998
 4     0.058037     0.04455
 5     0.057984     0.04451
 6     0.027178     0.02086
 7    -0.087968    -0.06753
 8    -0.141527    -0.10865
 9    -0.117101    -0.08990
10    -0.152208    -0.11685
(correlogram bars not reproduced)

Partial Autocorrelations
Lag   Correlation
 1     0.47430
 2     0.05843
 3    -0.03681
 4    -0.01564
 5     0.04006
 6    -0.01109
 7    -0.10630
 8    -0.05613
 9     0.00991
10    -0.06858

7-31

Autocorrelation Check for White Noise

To    Chi-                  Autocorrelations
Lag   Square   DF   Prob
  6   159.47    6   0.000   0.474  0.270  0.120  0.045  0.045  0.021
 12   187.08   12   0.000  -0.068 -0.109 -0.090 -0.117 -0.089 -0.089
 18   197.51   18   0.000  -0.077 -0.067  0.002  0.044  0.077  0.042
 24   223.09   24   0.000  -0.017 -0.057 -0.092 -0.093 -0.136 -0.098

Maximum Likelihood Estimation

                             Approx.
Parameter   Estimate   Std Error   T Ratio   Lag
MU           9.97880     0.08538    116.88     0
AR1,1        0.47378     0.03944     12.01     1

Constant Estimate   = 5.25101414
Variance Estimate   = 1.01335786
Std Error Estimate  = 1.00665677
AIC                 = 1427.82345
SBC                 = 1436.25266
Number of Residuals = 500

Autocorrelation Check of Residuals

To    Chi-                  Autocorrelations
Lag   Square   DF   Prob
  6     4.53    5   0.476  -0.027  0.064 -0.003 -0.031  0.030  0.048
 12    12.62   11   0.319  -0.054 -0.076 -0.004 -0.075 -0.013 -0.039
 18    18.16   17   0.379  -0.027 -0.061  0.017  0.022  0.068  0.030
 24    25.19   23   0.340  -0.018 -0.024 -0.054 -0.007 -0.097 -0.009

7-32

Autocorrelation Plot of Residuals

Lag   Covariance    Correlation
 0     1.013358      1.00000
 1    -0.027426     -0.02706
 2     0.064949      0.06409
 3    -0.0029458    -0.00291
 4    -0.031095     -0.03068
 5     0.030587      0.03018
 6     0.048235      0.04760
 7    -0.054253     -0.05354
 8    -0.076521     -0.07551
(correlogram bars not reproduced)

Partial Autocorrelations
Lag   Correlation
 1    -0.02706
 2     0.06341
 3     0.00044
 4    -0.03498
 5     0.02885
 6     0.05368
 7    -0.05554
 8    -0.08692

Model for variable Y

Estimated Mean = 9.97879948
Autoregressive Factors
Factor 1: 1 - 0.47378 B**(1)

7-33

OPTIONS NOCENTER PS = 35 LS = 72 ;

DATA A ;                            /* read the simulated AR(1) series */
  INFILE 'c:\AR1.DATA' ;
  INPUT Y ;
DATA A ; SET A ;
  T + 1 ;                           /* time index */
PROC PLOT ;
  PLOT Y*T ;                        /* time series plot */
PROC ARIMA DATA=A ;
  IDENTIFY VAR=Y ;                  /* sample ACF, PACF, white noise check */

PROC ARIMA DATA=A ;
  IDENTIFY VAR=Y NOPRINT ;
  ESTIMATE P = 1 METHOD=ML PLOT ;   /* fit AR(1) by maximum likelihood */

7-34

AR(2)
The autoregressive model of order 2, AR(2), is

    Y_t = φ_1 Y_{t-1} + φ_2 Y_{t-2} + δ + ε_t

where the ε_t are independent white noise with mean 0 and variance σ². We can re-write it as

    Y_t - φ_1 Y_{t-1} - φ_2 Y_{t-2} = δ + ε_t

or

    φ(B) Y_t = δ + ε_t

where φ(B) = 1 - φ_1 B - φ_2 B².
The AR(2) model resembles a multiple linear regression where Y_{t-1} and Y_{t-2} are the independent variables. As in the AR(1) case, we can use successive substitution to eventually express Y_t as an infinite MA model such that

    Y_t = μ + Σ_{j=0}^{∞} ψ_j ε_{t-j}

The infinite MA will be absolutely summable (this means that Σ_{j=0}^{∞} |ψ_j| < ∞).

7-35

Another method to express Y_t in terms of the noises is:

    Y_t = (1 - φ_1 B - φ_2 B²)^{-1} δ + (1 - φ_1 B - φ_2 B²)^{-1} ε_t
        = δ / (1 - φ_1 - φ_2) + Σ_{j=0}^{∞} ψ_j ε_{t-j}
        = μ + ψ(B) ε_t

where

    μ = δ / (1 - φ_1 - φ_2)

and

    ψ(B) = (1 - φ_1 B - φ_2 B²)^{-1}

The {ψ_j} can be determined from ψ(B) = φ(B)^{-1} = Σ_{j=0}^{∞} ψ_j B^j, which implies

    φ(B) ψ(B) = 1

Hence, we have

    (1 - φ_1 B - φ_2 B²)(ψ_0 + ψ_1 B + ψ_2 B² + ...) = 1

and therefore

    ψ_0 = 1
    ψ_1 - φ_1 ψ_0 = 0

and

    ψ_j - φ_1 ψ_{j-1} - φ_2 ψ_{j-2} = 0    for j ≥ 1

where ψ_0 = 1 and ψ_j = 0 for j < 0.

Thus {ψ_j} satisfies a second order difference equation.

7-36

Condition for Stationarity

The condition for stationarity is that the roots of

    φ(z) = 1 - φ_1 z - φ_2 z² = 0        (2)

are greater than 1 in absolute value, or the roots of

    m² - φ_1 m - φ_2 = 0                 (3)

are less than 1 in absolute value. This condition makes the infinite MA absolutely summable, Σ_{j=0}^{∞} |ψ_j| < ∞.

Note that the roots of (3) are the reciprocals of the roots of (2). That is, if m_1 and m_2 are the roots of (3) and z_1 and z_2 are the roots of (2), then

    m_1 = 1/z_1    and    m_2 = 1/z_2

This condition will lead us to the formula

    ψ_j = c_1 m_1^j + c_2 m_2^j    for any j ≥ 0

7-37

Mean of AR(2)
For the model

    Y_t - φ_1 Y_{t-1} - φ_2 Y_{t-2} = δ + ε_t

we have

    E(Y_t) - φ_1 E(Y_{t-1}) - φ_2 E(Y_{t-2}) = δ + E(ε_t)

and this implies

    μ = δ / (1 - φ_1 - φ_2)

7-38

Variance, Autocovariance and Autocorrelation of AR(2)

Note that

    Cov(ε_t, Y_{t-k}) = { σ²  if k = 0
                          0   if k > 0 }

    γ_0 = Var(Y_t) = Var(φ_1 Y_{t-1} + φ_2 Y_{t-2} + ε_t)
        = φ_1² γ_0 + φ_2² γ_0 + 2 φ_1 φ_2 γ_1 + σ²

    γ_1 = Cov(Y_t, Y_{t-1})
        = Cov(φ_1 Y_{t-1} + φ_2 Y_{t-2} + ε_t, Y_{t-1})
        = φ_1 γ_0 + φ_2 γ_1

This implies

    γ_1 = φ_1 γ_0 / (1 - φ_2)

For k > 0,

    γ_k = Cov(Y_t, Y_{t-k})
        = Cov(φ_1 Y_{t-1} + φ_2 Y_{t-2} + ε_t, Y_{t-k})
        = φ_1 γ_{k-1} + φ_2 γ_{k-2}

This implies

    ρ_k = φ_1 ρ_{k-1} + φ_2 ρ_{k-2}

for k > 0. This is called the Yule-Walker Equation.
7-39

In the Y-W Equation, k = 1 and k = 2 are very important for AR(2). They are

    ρ_1 = φ_1 ρ_0 + φ_2 ρ_1 = φ_1 + φ_2 ρ_1        (4)

and

    ρ_2 = φ_1 ρ_1 + φ_2 ρ_0 = φ_1 ρ_1 + φ_2        (5)

Solving these two equations, we have

    ρ_1 = φ_1 / (1 - φ_2)

and

    ρ_2 = φ_2 + φ_1² / (1 - φ_2)

Higher lag values of ρ_k can then be computed recursively by the difference equation. For example:

    ρ_3 = φ_1 ρ_2 + φ_2 ρ_1

Equations (4) and (5) can also be used to solve for φ_1 and φ_2, such that

    φ_1 = ρ_1 (1 - ρ_2) / (1 - ρ_1²)
    φ_2 = (ρ_2 - ρ_1²) / (1 - ρ_1²)

7-40
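A small numerical check of equations (4) and (5) (my own illustrative numbers, not from the notes): starting from φ_1 = 0.5 and φ_2 = 0.3 we can compute ρ_1, ρ_2 and then recover the φ's by solving the 2x2 Yule-Walker system.

    # Illustrative sketch: Yule-Walker relations for an AR(2) in both directions.
    import numpy as np

    phi1, phi2 = 0.5, 0.3                      # a stationary AR(2), chosen for illustration
    rho1 = phi1 / (1 - phi2)                   # about 0.714
    rho2 = phi2 + phi1 ** 2 / (1 - phi2)       # about 0.657

    P = np.array([[1.0, rho1], [rho1, 1.0]])   # 2x2 Yule-Walker matrix
    phi_hat = np.linalg.solve(P, np.array([rho1, rho2]))
    print(phi_hat)                             # approximately [0.5, 0.3]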

The ACF ρ_k satisfies the second order difference equation (Y-W Equation):

    ρ_k = φ_1 ρ_{k-1} + φ_2 ρ_{k-2}

From difference equation theory, the solution ρ_k has the form:

    ρ_k = c_1 m_1^k + c_2 m_2^k    for any k ≥ 0

(if m_1 and m_2 are distinct and real) where m_1, m_2 are the roots of

    m² - φ_1 m - φ_2 = 0

c_1 and c_2 can be determined from the initial conditions

    ρ_0 = 1    and    ρ_1 = φ_1 / (1 - φ_2)

In this situation, ρ_k declines exponentially as k increases.
When m_1 and m_2 are complex, say

    m_1, m_2 = R(cos ω ± i sin ω)

c_1 and c_2 will be complex also, say

    c_1, c_2 = a ± bi

so that

    ρ_k = c_1 m_1^k + c_2 m_2^k
        = (a + bi) R^k (cos ω + i sin ω)^k + (a - bi) R^k (cos ω - i sin ω)^k
        = R^k (a_1 cos(kω) + a_2 sin(kω))

7-41

where

    R = |m_1| = |m_2| = (-φ_2)^{1/2} < 1

and ω satisfies

    cos ω = φ_1 / (2 (-φ_2)^{1/2})

In this situation, ρ_k is a damped sinusoid with damping factor R, period 2π/ω and frequency ω.

7-42

PACF of AR(2)
The PACF of AR(2) is

    φ_{11} = ρ_1

and

    φ_{22} = (ρ_2 - ρ_1²) / (1 - ρ_1²)    ( = φ_2 )

and

    φ_{kk} = 0    for k > 2

Hence, the ACF of an AR(2) damps off exponentially or sinusoidally, while the PACF cuts off after lag 2.

7-43

/*-------------------------------------------------------*/
/*----                  EXAMPLE                       ----*/
/*-------------------------------------------------------*/

                             Approx.
Parameter   Estimate   Std Error   T Ratio   Lag
MU           6.96407     0.20628     33.76     0
AR1,1        0.51108     0.08640      5.92     1

Constant Estimate   = 3.40488718
Variance Estimate   = 1.04185881
Std Error Estimate  = 1.02071485
AIC                 = 290.170809
SBC                 = 295.38115
Number of Residuals = 100

Autocorrelation Check of Residuals

To    Chi-                  Autocorrelations
Lag   Square   DF   Prob
  6     6.91    5   0.228  -0.005  0.030  0.106 -0.221 -0.014  0.063
 12    13.53   11   0.260  -0.191  0.024  0.022  0.060 -0.127 -0.046
 18    17.83   17   0.399  -0.162 -0.058  0.004  0.036  0.010  0.072
 24    20.85   23   0.590  -0.086 -0.031 -0.032 -0.055 -0.048  0.092

Autoregressive Factors
Factor 1: 1 - 0.51108 B**(1)

7-44

                             Approx.
Parameter   Estimate   Std Error   T Ratio   Lag
MU           6.96184     0.16749     41.57     0
MA1,1       -0.45979     0.10020     -4.59     1
MA1,2       -0.15518     0.10022     -1.55     2

Constant Estimate   = 6.96184387
Variance Estimate   = 1.08819896
Std Error Estimate  = 1.04316775
AIC                 = 295.415416
SBC                 = 303.230926
Number of Residuals = 100

Correlations of the Estimates

Parameter      MU    MA1,1   MA1,2
MA1,1       0.001    1.000   0.394
MA1,2      -0.001    0.394   1.000

Autocorrelation Check of Residuals

To    Chi-                  Autocorrelations
Lag   Square   DF   Prob
  6     9.45    4   0.051   0.049  0.100  0.182 -0.203 -0.012  0.008
 12    16.43   10   0.088  -0.195  0.018 -0.142 -0.063 -0.004  0.051
 18    21.02   16   0.178  -0.170 -0.078 -0.013  0.012 -0.005  0.059
 24    24.00   22   0.347  -0.087 -0.037 -0.032 -0.052 -0.038  0.094

Moving Average Factors

Factor 1: 1 + 0.45979 B**(1) + 0.15518 B**(2)

7-45

                             Approx.
Parameter   Estimate   Std Error   T Ratio   Lag
MU           6.97335     0.21712     32.12     0
MA1,1       -0.54556     0.09870     -5.53     1
MA1,2       -0.36785     0.10771     -3.42     2
MA1,3       -0.27311     0.09959     -2.74     3

Constant Estimate   = 6.97334862
Variance Estimate   = 1.00988046
Std Error Estimate  = 1.00492809
AIC                 = 289.200417
SBC                 = 299.621098
Number of Residuals = 100

Correlations of the Estimates

Parameter      MU    MA1,1   MA1,2   MA1,3
MA1,1      -0.006    1.000   0.442   0.239
MA1,2      -0.012    0.442   1.000   0.448
MA1,3      -0.015    0.239   0.448   1.000

Autocorrelation Check of Residuals

To    Chi-                  Autocorrelations
Lag   Square   DF   Prob
  6     1.60    3   0.661  -0.023 -0.031 -0.004 -0.078  0.009  0.024
 12    11.02    9   0.275  -0.222  0.003  0.086  0.110 -0.147 -0.031

Moving Average Factors

Factor 1: 1 + 0.54556 B**(1) + 0.36785 B**(2) + 0.27311 B**(3)

7-46

GENERAL ORDER AUTOREGRESSIVE MODELS

The autoregressive model of order p, AR(p), is

    Y_t = φ_1 Y_{t-1} + φ_2 Y_{t-2} + ... + φ_p Y_{t-p} + δ + ε_t

where the ε_t are independent white noise with mean 0 and variance σ², or

    φ(B) Y_t = δ + ε_t

where φ(B) = 1 - Σ_{j=1}^{p} φ_j B^j is the AR operator.

If all roots of

    φ(z) = 1 - φ_1 z - φ_2 z² - ... - φ_p z^p = 0

are larger than one in absolute value, or all roots of

    m^p - φ_1 m^{p-1} - φ_2 m^{p-2} - ... - φ_p = 0

are smaller than one in absolute value, then the process is stationary and has a convergent infinite MA representation.

7-47

That is

    Y_t = φ(B)^{-1} δ + φ(B)^{-1} ε_t = μ + ψ(B) ε_t

where

    μ = E(Y_t) = φ(B)^{-1} δ = δ / (1 - φ_1 - φ_2 - ... - φ_p)

    ψ(B) = Σ_{j=0}^{∞} ψ_j B^j = φ(B)^{-1}

and Σ_{j=0}^{∞} |ψ_j| < ∞.

ψ_j is determined from the relation:

    φ(B) ψ(B) = 1

This implies that ψ_j satisfies

    ψ_j - φ_1 ψ_{j-1} - φ_2 ψ_{j-2} - ... - φ_p ψ_{j-p} = 0

for j > 0. Note that ψ_0 = 1 and ψ_j = 0 for j < 0.
The solution of the difference equation implies that ψ_j satisfies

    ψ_j = Σ_{i=1}^{p} c_i m_i^j

where the m_i are the roots of

    m^p - φ_1 m^{p-1} - φ_2 m^{p-2} - ... - φ_p = 0

7-48

Autocovariance and Autocorrelation

The autocovariances γ_s of AR(p) satisfy the Yule-Walker Equation:

    γ_s = φ_1 γ_{s-1} + φ_2 γ_{s-2} + ... + φ_p γ_{s-p}        (6)

Dividing (6) by γ_0, we get the Yule-Walker Equation for the ACF ρ_s:

    ρ_s = φ_1 ρ_{s-1} + φ_2 ρ_{s-2} + ... + φ_p ρ_{s-p}

The ACF satisfies the same difference equation as the γ_s and ψ_s, but with different initial conditions.
The general solution to the above difference equation is

    ρ_s = c_1 m_1^s + c_2 m_2^s + ... + c_p m_p^s

where the m_i are the roots of

    m^p - φ_1 m^{p-1} - φ_2 m^{p-2} - ... - φ_p = 0

The Yule-Walker equations are useful for determining the AR parameters φ_1, ..., φ_p. The equations can be expressed in matrix form as

    P φ = ρ

where

    P = [ 1         ρ_1        ρ_2      ...  ρ_{p-1} ]
        [ ρ_1       1          ρ_1      ...  ρ_{p-2} ]
        [ ...       ...        ...      ...  ...     ]
        [ ρ_{p-1}   ρ_{p-2}    ρ_{p-3}  ...  1       ]

7-49

and φ = (φ_1, ..., φ_p)' and ρ = (ρ_1, ..., ρ_p)'.
The equations are used to solve for φ in terms of the ACF; the solution is:

    φ = P^{-1} ρ

The sample version of this solution replaces ρ_s by the sample ACF r_s, and the resulting estimate of φ (which is called the Yule-Walker estimate of the AR parameters) is

    φ̂ = R^{-1} r

where

    R = [ 1         r_1        r_2      ...  r_{p-1} ]
        [ r_1       1          r_1      ...  r_{p-2} ]
        [ ...       ...        ...      ...  ...     ]
        [ r_{p-1}   r_{p-2}    r_{p-3}  ...  1       ]

and r = (r_1, ..., r_p)'.

The variance γ_0 = Var(Y_t) can be expressed as

    γ_0 = φ_1 γ_1 + φ_2 γ_2 + ... + φ_p γ_p + σ²

Hence,

    σ² = γ_0 - φ_1 γ_1 - φ_2 γ_2 - ... - φ_p γ_p
       = γ_0 (1 - φ_1 ρ_1 - φ_2 ρ_2 - ... - φ_p ρ_p)

7-50

Partial Autocorrelation Function

When fitting AR models to data, we need to choose an appropriate order p for the model. The PACF is useful here.
Suppose Y_t is stationary with ACF ρ_s. For k ≥ 1, consider the first k Yule-Walker equations corresponding to an AR(k) model:

    ρ_s = φ_1 ρ_{s-1} + φ_2 ρ_{s-2} + ... + φ_k ρ_{s-k},    s = 1, ..., k        (7)

and let φ_{1k}, φ_{2k}, ..., φ_{kk} denote the solution to the Yule-Walker equations for φ_1, φ_2, ..., φ_k. This system can be solved for each order k = 1, 2, ..., and the quantity φ_{kk} is the PACF at lag k.
k = 1:

    ρ_1 = φ_{11} ρ_0    so    φ_{11} = ρ_1

k = 2 implies

    [ 1    ρ_1 ] [ φ_{12} ]   [ ρ_1 ]
    [ ρ_1  1   ] [ φ_{22} ] = [ ρ_2 ]

This implies

    φ_{12} = ρ_1 (1 - ρ_2) / (1 - ρ_1²)

and

    φ_{22} = (ρ_2 - ρ_1²) / (1 - ρ_1²)

7-51

When we actually have an AR(p) process and we set k = p in equation (7), the solution is (φ_{1p}, ..., φ_{pp})' = (φ_1, ..., φ_p)', and hence φ_{pp} = φ_p. When k > p, we get φ_{kk} = 0.

The PACF φ_{kk} at lag k is actually equal to the partial correlation between Y_t and Y_{t-k}, when we adjust for the intermediate values Y_{t-1}, Y_{t-2}, ..., Y_{t-k+1}.

7-52
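The successive Yule-Walker systems in (7) can be solved mechanically. A hedged Python sketch (assuming the ACF values are given; the helper name is my own):

    # Illustrative sketch: phi_kk obtained by solving the k-th order Yule-Walker system.
    import numpy as np
    from scipy.linalg import toeplitz

    def pacf_from_acf(rho):
        """rho[k-1] holds rho_k; returns [phi_11, phi_22, ...]."""
        rho = np.asarray(rho, dtype=float)
        out = []
        for k in range(1, len(rho) + 1):
            P = toeplitz(np.r_[1.0, rho[: k - 1]])   # k x k matrix of ACF values
            phi = np.linalg.solve(P, rho[:k])
            out.append(phi[-1])                      # the last coefficient is phi_kk
        return out

    # e.g. the ACF of an AR(1) with phi = 0.6: phi_11 = 0.6 and phi_kk near 0 for k > 1
    print(pacf_from_acf([0.6 ** k for k in range(1, 6)]))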

INVERTIBILITY OF MA MODELS

    Y_t = μ + θ(B) ε_t

If all roots of

    θ(z) = 1 - θ_1 z - θ_2 z² - ... - θ_q z^q = 0

are larger than one in absolute value, or all roots of

    m^q - θ_1 m^{q-1} - θ_2 m^{q-2} - ... - θ_q = 0

are smaller than one in absolute value, then the MA process can be expressed in the form of an infinite AR model. That is:

    θ(B)^{-1} Y_t = θ(B)^{-1} μ + ε_t

or

    π(B) Y_t = δ + ε_t

where

    π(B) = 1 - π_1 B - π_2 B² - ... = θ(B)^{-1}

with Σ_{j=1}^{∞} |π_j| < ∞. That is:

    Y_t = Σ_{j=1}^{∞} π_j Y_{t-j} + δ + ε_t

Then the MA process is said to be invertible.


7-53

MIXED AUTOREGRESSIVE MOVING AVERAGE (ARMA) MODEL
Y_t follows an ARMA(p, q) model if it satisfies:

    Y_t = φ_1 Y_{t-1} + φ_2 Y_{t-2} + ... + φ_p Y_{t-p} + δ + ε_t - θ_1 ε_{t-1} - ... - θ_q ε_{t-q}

where the ε_t are independent white noise with mean 0 and variance σ², or

    φ(B) Y_t = δ + θ(B) ε_t

where φ(B) = 1 - Σ_{j=1}^{p} φ_j B^j is the AR operator and θ(B) = 1 - Σ_{j=1}^{q} θ_j B^j is the MA operator.

If all roots of

    φ(z) = 1 - φ_1 z - φ_2 z² - ... - φ_p z^p = 0

are larger than one in absolute value, then the process is stationary and has the convergent infinite MA representation:

    Y_t = φ(B)^{-1} δ + φ(B)^{-1} θ(B) ε_t = μ + ψ(B) ε_t

where

    μ = E(Y_t) = φ(B)^{-1} δ = δ / (1 - φ_1 - φ_2 - ... - φ_p)

7-54

    ψ(B) = Σ_{j=0}^{∞} ψ_j B^j = φ(B)^{-1} θ(B)

and Σ_{j=0}^{∞} |ψ_j| < ∞.

If all roots of

    θ(z) = 1 - θ_1 z - θ_2 z² - ... - θ_q z^q = 0

are larger than one in absolute value, then the process is invertible and has a convergent infinite AR representation. That is:

    θ(B)^{-1} φ(B) Y_t = θ(B)^{-1} δ + ε_t

or

    π(B) Y_t = δ + ε_t

where

    π(B) = 1 - π_1 B - π_2 B² - ... = θ(B)^{-1} φ(B)

with Σ_{j=1}^{∞} |π_j| < ∞.

7-55
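A hedged sketch of fitting an ARMA(1,1) in Python with statsmodels (the notes use SAS PROC ARIMA for this); the simulated parameter values are illustrative. Note that statsmodels writes the MA part with a plus sign, opposite to the θ convention above.

    # Illustrative sketch: simulate and fit an ARMA(1,1).
    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    rng = np.random.default_rng(7)
    T, phi, theta = 300, 0.6, 0.4
    eps = rng.normal(size=T)
    y = np.zeros(T)
    for t in range(1, T):
        y[t] = phi * y[t - 1] + eps[t] - theta * eps[t - 1]

    res = ARIMA(y, order=(1, 0, 1)).fit()
    print(res.params)   # ar.L1 near 0.6; ma.L1 near -0.4 because of the sign convention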

/*-------------------------------------------------------*/
/*----   EXAMPLE : FORECASTING STOCK MARKET PRICES   ----*/
/*-------------------------------------------------------*/
TITLE 'AT&T STOCK PRICES';
PROC ARIMA DATA=ATTSTOCK;
  /*----  First Analysis  ----*/
  IDENTIFY VAR=X CENTER NLAG=13;
  ESTIMATE P=1 METHOD=CLS NOCONSTANT;
  ESTIMATE P=1 METHOD=ML  NOCONSTANT;
  ESTIMATE P=1 METHOD=ULS NOCONSTANT;
  FORECAST OUT=B1 LEAD=12 ID=N;
PROC ARIMA DATA=ATTSTOCK;
  /*----  Second Analysis  ----*/
  IDENTIFY VAR=X(1) CENTER NLAG=13;
  ESTIMATE METHOD=ULS NOCONSTANT;
  FORECAST OUT=B2 LEAD=12 ID=N;
PROC ARIMA DATA=ATTSTOCK;
  /*----  Third Analysis  ----*/
  IDENTIFY VAR=X(1) NLAG=13;
  ESTIMATE METHOD=ULS NOCONSTANT;
  FORECAST LEAD=12 ID=N;
PROC PLOT DATA=B2(FIRSTOBS=40);
  PLOT FORECAST*N=F X*N=* L95*N=L U95*N=U
       / OVERLAY VAXIS=46 TO 60;
  TITLE2 'ARIMA(0,1,0) FORECAST LEAD=12';

7-56

Name of variable = X.
Mean of working series  =
Standard deviation      =   3.4136
Number of observations  =   52

Autocorrelations
Lag   Covariance   Correlation      Std
 0    11.652662     1.00000         0
 1    10.890128     0.93456      0.138675
 2    10.046985     0.86221      0.229833
 3     9.490750     0.81447      0.285334
 4     8.717965     0.74815      0.327001
 5     7.973288     0.68425      0.358410
 6     7.242184     0.62150      0.382707
 7     6.447407     0.55330      0.401648
 8     5.745524     0.49307      0.416048
 9     5.150995     0.44204      0.427138
10     4.394067     0.37709      0.435846
11     3.370612     0.28926      0.442076
12     2.532964     0.21737      0.445701
13     2.054507     0.17631      0.447735
(correlogram bars not reproduced)
"." marks two standard errors

7-57

Partial Autocorrelations
Lag   Correlation
 1     0.93456
 2    -0.08847
 3     0.15987
 4    -0.20256
 5     0.04819
 6    -0.10464
 7    -0.03046
 8    -0.00025
 9     0.02826
10    -0.14244
11    -0.21337
12     0.05050
13     0.15800
(correlogram bars not reproduced)

Autocorrelation Check for White Noise

To    Chi-                  Autocorrelations
Lag   Square   DF   Prob
  6   212.15    6   0.000   0.935  0.862  0.814  0.748  0.684  0.622
 12   278.08   12   0.000   0.553  0.493  0.442  0.377  0.289  0.217

                             Approx.
Parameter   Estimate   Std Error   T Ratio   Lag
AR1,1        0.98453     0.04062     24.24     1

Variance Estimate   = 0.94924184
Std Error Estimate  = 0.97429043
AIC                 = 145.8511*
SBC                 = 147.802344*
Number of Residuals = 52
* Does not include log determinant.

7-58

Autocorrelation Check of Residuals

To    Chi-
Lag   Square   DF   Prob
  6     8.41    5   0.135
 12    16.17   11   0.135
 18    23.05   17   0.148
 24    34.06   23   0.063
(individual residual autocorrelations not reproduced)

Data have been centered by subtracting the value 57.795673077.
No mean term in this model.
Autoregressive Factors
Factor 1: 1 - 0.98453 B**(1)

7-59

Forecasts for variable X

Obs   Forecast   Std Error   Lower 95%   Upper 95%
 53    52.2534     0.8664      50.5553     53.9515
 54    52.2568     1.2249      49.8560     54.6575
 55    52.2601     1.4997      49.3207     55.1995
 56    52.2635     1.7312      48.8704     55.6566
 57    52.2669     1.9350      48.4744     56.0593
 58    52.2702     2.1190      48.1170     56.4234
 59    52.2736     2.2881      47.7890     56.7582
 60    52.2770     2.4453      47.4842     57.0697
 61    52.2803     2.5929      47.1984     57.3623
 62    52.2837     2.7323      46.9285     57.6389
 63    52.2870     2.8648      46.6721     57.9019
 64    52.2904     2.9913      46.4276     58.1532

7-60

[Plot of FORECAST*N (symbol F), X*N (symbol *), L95*N (symbol L) and U95*N (symbol U), overlaid, for N = 40 to 64 with vertical axis from 46 to 60; the character plot is not reproduced.]

7-61

MODELS FOR NONSTATIONARY TIME SERIES

For many nonstationary series which exhibit homogeneous behavior, the first difference of the series:

    W_t = Y_t - Y_{t-1} = (1 - B) Y_t

may be a stationary series; or, if the first difference is not stationary, its second difference

    W_t = (1 - B)² Y_t = (1 - 2B + B²) Y_t = Y_t - 2Y_{t-1} + Y_{t-2}

may be stationary.
So a useful class of models for nonstationary series are models such that the dth difference

    W_t = (1 - B)^d Y_t

is a stationary series and W_t follows an ARMA(p, q) model. So

    φ(B) W_t = δ + θ(B) ε_t

or

    φ(B)(1 - B)^d Y_t = δ + θ(B) ε_t

7-62

The model for Y_t is called an Autoregressive Integrated Moving Average model of order (p, d, q), or ARIMA(p, d, q).
Generally, when d > 0, it is often the case that d = 1 (occasionally d = 2), i.e.

    W_t = (1 - B) Y_t is ARMA(p, q)

or

    Y_t is ARIMA(p, 1, q)

To get Y_t from W_t, we must sum or integrate W_t, i.e.

    Y_t = (1 - B)^{-1} W_t = (1 + B + B² + ...) W_t = W_t + W_{t-1} + W_{t-2} + ...

7-63
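A minimal Python sketch of differencing and of the summation ("integration") that reverses it (illustrative; the simulated series and seed are my own):

    # Illustrative sketch: W_t = (1 - B) Y_t and recovering Y_t by cumulative summation.
    import numpy as np

    rng = np.random.default_rng(3)
    y = 10 + np.cumsum(0.2 + rng.normal(size=100))       # a random walk with drift
    w = np.diff(y)                                       # W_t = Y_t - Y_{t-1}
    y_back = y[0] + np.concatenate(([0.0], np.cumsum(w)))
    print(np.allclose(y, y_back))                        # True: summing W_t restores Y_t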

ARIMA(p, d, q) MODEL

Y_t is non-stationary such that

    W_t = (1 - B)^d Y_t

is a stationary ARMA(p, q), i.e.

    φ(B) W_t = δ + θ(B) ε_t
    φ(B)(1 - B)^d Y_t = δ + θ(B) ε_t

Writing

    φ*(B) = φ(B)(1 - B)^d = 1 - φ*_1 B - ... - φ*_{p+d} B^{p+d}

Y_t has the form of an ARMA(p + d, q) model, but it is non-stationary, with d roots of φ*(B) = 0 equal to 1.

7-64

Unit Root Stochastic Process

For the AR(1) model:

    Y_t = φ Y_{t-1} + δ + ε_t

successive substitution gives

    Y_t = φ^t Y_0 + δ Σ_{j=0}^{t-1} φ^j + Σ_{j=0}^{t-1} φ^j ε_{t-j}

which is non-stationary when φ = 1: the unit root problem.

7-65

Trend Stationary (TS) and Difference Stationary (DS)

Trend Stationary: the trend is completely predictable.
Difference Stationary: the trend is stochastic but becomes stationary after differencing.
Consider

    Y_t = β_1 + β_2 t + β_3 Y_{t-1} + u_t

1. Random Walk Model [RWM] without drift (β_1 = β_2 = 0, β_3 = 1):

    Y_t = Y_{t-1} + u_t

is non-stationary, but

    ΔY_t = (Y_t - Y_{t-1}) = u_t

is stationary.
2. Random Walk Model with drift (β_1 ≠ 0, β_2 = 0, β_3 = 1):

    Y_t = β_1 + Y_{t-1} + u_t

is non-stationary, but

    ΔY_t = (Y_t - Y_{t-1}) = β_1 + u_t

is stationary, and Y_t exhibits a positive (β_1 > 0) or negative (β_1 < 0) trend.

7-66

3. Deterministic Trend Model (β_1 ≠ 0, β_2 ≠ 0, β_3 = 0):

    Y_t = β_1 + β_2 t + u_t

is non-stationary but stationary after detrending.
4. Random Walk with Drift and Deterministic Trend (β_1 ≠ 0, β_2 ≠ 0, β_3 = 1):

    Y_t = β_1 + β_2 t + Y_{t-1} + u_t

is non-stationary, and

    ΔY_t = β_1 + β_2 t + u_t

is still non-stationary.
5. Deterministic Trend with Stationary AR(1) Component (β_1 ≠ 0, β_2 ≠ 0, β_3 < 1):

    Y_t = β_1 + β_2 t + β_3 Y_{t-1} + u_t

which is stationary around the deterministic trend.

7-67

Integrated Stochastic Process

A time series Y_t is said to be integrated of order d, denoted Y_t ~ I(d), if Y_t becomes stationary after differencing d times. Hence, the random walk model without drift, the random walk model with drift, the deterministic trend model and the random walk with a stationary AR(1) component are I(1), while the random walk with drift and deterministic trend is I(2).
Properties of Integrated Series
1. If X_t ~ I(0) and Y_t ~ I(1), then Z_t = X_t + Y_t ~ I(1).
2. If X_t ~ I(d), then Z_t = a + bX_t ~ I(d).
3. If X_t ~ I(d_1) and Y_t ~ I(d_2), then Z_t = aX_t + bY_t ~ I(d_2), where d_1 < d_2.
4. If X_t ~ I(d) and Y_t ~ I(d), then Z_t = aX_t + bY_t ~ I(d*), where d* is generally equal to d but sometimes d* < d (when the series are cointegrated).

7-68

Problems
1. Consider

    Y_t = β_1 + β_2 X_t + u_t

The OLS estimate is

    β̂_2 = Σ x_t y_t / Σ x_t²

where x_t and y_t denote deviations from the sample means. If X_t ~ I(1) and Y_t ~ I(0), then X_t is non-stationary and its variance will increase indefinitely, so the denominator dominates the numerator, with the result that β̂_2 will converge to zero asymptotically, and it will not even have an asymptotic distribution.

2. Spurious Regression
Consider

    Y_t = Y_{t-1} + u_t
    X_t = X_{t-1} + v_t

with Y_0 = 0 and X_0 = 0, where u_t and v_t are independent.

However, when we simulate independent u_t and v_t from N(0, 1) and fit

    Y_t = β_1 + β_2 X_t + e_t

we find that the estimate of β_2 is significantly different from zero and R² is also significantly different from zero.
7-69
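The spurious regression phenomenon is easy to reproduce. A hedged Python sketch (sample size and seed are illustrative choices of mine):

    # Illustrative sketch: regress one random walk on an independent one.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(123)
    T = 500
    y = np.cumsum(rng.normal(size=T))        # Y_t = Y_{t-1} + u_t
    x = np.cumsum(rng.normal(size=T))        # X_t = X_{t-1} + v_t, independent of u_t

    res = sm.OLS(y, sm.add_constant(x)).fit()
    print(res.tvalues[1], res.rsquared)      # t ratio and R^2 are typically "significant"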

The Unit Root Test

Consider

    Y_t = ρ Y_{t-1} + u_t,    -1 ≤ ρ ≤ 1

where u_t is an error term. Then

    ΔY_t = (Y_t - Y_{t-1}) = ρ Y_{t-1} - Y_{t-1} + u_t
         = (ρ - 1) Y_{t-1} + u_t
         = δ Y_{t-1} + u_t                                      (8)

where δ = ρ - 1.

    H_0: ρ = 1    vs    H_1: ρ < 1

is equivalent to

    H_0: δ = 0    vs    H_1: δ < 0                              (9)

If H_0 is true, then ΔY_t = u_t is white noise. To test (9), we simply regress ΔY_t on Y_{t-1} and obtain the estimated slope coefficient δ̂. Unfortunately, however, the estimate does not follow the t distribution even in large samples.
Dickey and Fuller show that the estimate follows the τ (tau) statistic. The test is known as the Dickey-Fuller (DF) test.
If the hypothesis H_0: δ = 0 is rejected, we can use the usual (Student's) t test.

7-70

The DF test is estimated in three different forms:

1. Y_t is a random walk:

    ΔY_t = δ Y_{t-1} + u_t                                      (10)

2. Y_t is a random walk with drift:

    ΔY_t = β_1 + δ Y_{t-1} + u_t                                (11)

3. Y_t is a random walk with drift around a deterministic trend:

    ΔY_t = β_1 + β_2 t + δ Y_{t-1} + u_t                        (12)

If the hypothesis H_0: δ = 0 is rejected, then Y_t is a stationary time series with zero mean in the case of (10), is stationary with a nonzero mean in the case of (11), and is stationary around a deterministic trend in the case of (12).

7-71

The Augmented Dickey-Fuller (ADF) test

In the DF test for (10), (11) and (12), it is assumed that u_t is uncorrelated. If the u_t are correlated, we use the ADF test based on:

    ΔY_t = β_1 + β_2 t + δ Y_{t-1} + Σ_{i=1}^{m} α_i ΔY_{t-i} + ε_t        (13)

where ε_t is white noise and ΔY_{t-i} = Y_{t-i} - Y_{t-i-1}. The number of lagged difference terms is chosen so that ε_t is white noise. The ADF test follows the same asymptotic distribution as the DF test, and so the same critical values can be used.

7-72
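In Python, the DF/ADF regressions (10)-(13) are available through statsmodels' adfuller; a hedged sketch (regression='c' corresponds to (11) and 'ct' to (12); the simulated series is illustrative):

    # Illustrative sketch: ADF unit root test on a simulated random walk.
    import numpy as np
    from statsmodels.tsa.stattools import adfuller

    rng = np.random.default_rng(5)
    y = np.cumsum(rng.normal(size=300))      # a pure random walk, so H0 should not be rejected

    stat, pvalue, usedlag, nobs, crit, icbest = adfuller(y, regression='c', autolag='AIC')
    print(stat, pvalue, crit)                # tau statistic, p-value and critical values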

Cointegration
If Y_t ~ I(1) and X_t ~ I(1), but u_t ~ I(0), where

    Y_t = β_1 + β_2 X_t + u_t                                   (14)

then Y_t and X_t are said to be cointegrated.

As both Y_t and X_t are I(1), they have stochastic trends, yet their linear combination u_t ~ I(0) cancels out the stochastic trends. As a result, the cointegrating regression (14) is meaningful, and β_2 is called the cointegrating parameter.
Economically speaking, two variables will be cointegrated if they have a long-term, or equilibrium, relationship between them.

7-73

Testing for Cointegration

A number of tests have been proposed; we consider two simple methods:
1. Engle-Granger (EG) or Augmented Engle-Granger (AEG) test
Apply the DF or ADF unit root test to the residuals û_t estimated from the cointegrating regression.
Since the estimated û_t are based on the estimated cointegrating parameter β̂_2, the DF and ADF critical values are not appropriate. Engle and Granger have calculated the appropriate critical values; the resulting tests are known as the Engle-Granger (EG) and Augmented Engle-Granger (AEG) tests.
2. Cointegrating regression Durbin-Watson (CRDW) test
Use the Durbin-Watson d statistic obtained from the cointegrating regression, but now test

    H_0: d = 0    vs    H_1: d > 0

since d ≈ 2(1 - ρ̂).
Examples: Refer to Gujarati, pp. 825-829.
7-74
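A hedged Python sketch of the Engle-Granger idea: estimate the cointegrating regression, then test the residuals for a unit root. The statsmodels coint() function reports the Engle-Granger statistic with the appropriate critical values; the simulated data are illustrative.

    # Illustrative sketch: Engle-Granger cointegration test on simulated data.
    import numpy as np
    import statsmodels.api as sm
    from statsmodels.tsa.stattools import coint

    rng = np.random.default_rng(11)
    T = 500
    x = np.cumsum(rng.normal(size=T))              # X_t is I(1)
    y = 2.0 + 0.5 * x + rng.normal(size=T)         # cointegrated with X_t by construction

    stat, pvalue, crit = coint(y, x)               # Engle-Granger test (null: no cointegration)
    print(stat, pvalue)

    resid = sm.OLS(y, sm.add_constant(x)).fit().resid   # step 1 of the EG procedure, by hand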
