
Online Appendix to

Improving Markov Switching Models


using Realized Variance
Jia Liu∗ John M. Maheu†

1 Estimation Steps
1.1 MS-RV Model

(1) $s_{1:T} \mid y_{1:T}, \theta, \phi, P$
The latent state variable $s_{1:T}$ is sampled using the forward filter backward sampler (FFBS) of Chib (1996). The forward filter consists of the following steps.

i. Set the initial value of the filter $p(s_1 = j \mid y_1, \theta, \phi, P) = \pi_j$, for $j = 1, \ldots, K$, where $\pi$ is the stationary distribution, which can be computed by solving $\pi = P^{\top}\pi$.

ii. Prediction step:
\[ p(s_t \mid y_{1:t-1}, \theta, \phi, P) \propto \sum_{j=1}^{K} P_{j,s_t} \cdot p(s_{t-1} = j \mid y_{1:t-1}, \theta, \phi, P). \]

iii. Update step:
\[ p(s_t \mid y_{1:t}, \theta, \phi, P) \propto f(r_t \mid \mu_{s_t}, \sigma_{s_t}^2) \cdot g(RV_t \mid \nu + 1, \nu\sigma_{s_t}^2) \cdot p(s_t \mid y_{1:t-1}, \theta, \phi, P), \]
where $f(\cdot)$ and $g(\cdot)$ denote the normal and inverse-gamma densities, respectively.

The underlying states are drawn using the backward sampler as follows.



i. For $t = T$, draw $s_T$ from $p(s_T \mid y_{1:T}, \theta, \phi, P)$.

ii. For $t = T-1, \ldots, 1$, draw $s_t$ from $P_{s_t, s_{t+1}} \cdot p(s_t \mid y_{1:t}, \theta, \phi, P)$.
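Putting the forward filter and backward sampler together, a minimal Python sketch of this FFBS step is given below (function and variable names are illustrative, not from the paper; scipy's parameterizations of the normal and inverse-gamma densities are assumed):

```python
import numpy as np
from scipy import stats

def ffbs(r, RV, mu, sigma2, nu, P, rng=np.random.default_rng()):
    """Draw s_{1:T} by forward filtering and backward sampling (sketch)."""
    T, K = len(r), len(mu)
    # stationary distribution: solve pi = P' pi subject to sum(pi) = 1
    A = np.vstack([np.eye(K) - P.T, np.ones(K)])
    b = np.append(np.zeros(K), 1.0)
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)

    # f: normal density for r_t; g: inverse-gamma density for RV_t
    like = (stats.norm.pdf(r[:, None], mu[None, :], np.sqrt(sigma2)[None, :])
            * stats.invgamma.pdf(RV[:, None], nu + 1, scale=nu * sigma2[None, :]))

    filt = np.zeros((T, K))
    filt[0] = pi * like[0]                   # initialize at pi, then update with y_1
    filt[0] /= filt[0].sum()
    for t in range(1, T):
        pred = filt[t - 1] @ P               # ii. prediction step
        filt[t] = pred * like[t]             # iii. update step
        filt[t] /= filt[t].sum()

    s = np.zeros(T, dtype=int)
    s[T - 1] = rng.choice(K, p=filt[T - 1])  # backward sampler
    for t in range(T - 2, -1, -1):
        probs = filt[t] * P[:, s[t + 1]]
        s[t] = rng.choice(K, p=probs / probs.sum())
    return s
```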

Let $n_j = \sum_{t=1}^{T} \mathbf{1}(s_t = j)$ denote the number of observations belonging to state $j$.


(2) $\mu_j \mid r_{1:T}, s_{1:T}, \sigma_j^2$ for $j = 1, \ldots, K$

∗ DeGroote School of Business, McMaster University, Canada, liuj46@mcmaster.ca
† DeGroote School of Business, McMaster University, Canada and RCEA, Italy, maheujm@mcmaster.ca

$\mu_j$ is sampled using a standard Gibbs step for the linear regression model. Given the prior $\mu_j \sim N(m_j, v_j^2)$, $\mu_j$ is drawn from the conditional posterior $N(\bar{m}_j, \bar{v}_j^2)$, where
\[ \bar{m}_j = \frac{v_j^2 \sum_{s_t = j} r_t + m_j \sigma_j^2}{\sigma_j^2 + n_j v_j^2}, \quad \text{and} \quad \bar{v}_j^2 = \frac{\sigma_j^2 v_j^2}{\sigma_j^2 + n_j v_j^2}. \]
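This conjugate update can be coded directly; a minimal sketch with illustrative names, assuming `r` and `s` are numpy arrays of returns and sampled states:

```python
import numpy as np

def draw_mu_j(r, s, j, sigma2_j, m_j, v2_j, rng=np.random.default_rng()):
    """Draw mu_j from its conditional posterior N(m_bar, v2_bar)."""
    r_j = r[s == j]
    n_j = r_j.size
    denom = sigma2_j + n_j * v2_j
    m_bar = (v2_j * r_j.sum() + m_j * sigma2_j) / denom
    v2_bar = sigma2_j * v2_j / denom
    return rng.normal(m_bar, np.sqrt(v2_bar))
```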

(3) $\sigma_j^2 \mid y_{1:T}, \mu_j, \nu, s_{1:T}$ for $j = 1, \ldots, K$
The prior of $\sigma_j^2$ is assumed to be $\sigma_j^2 \sim G(v_0, s_0)$. The conditional posterior of $\sigma_j^2$ is given as follows,
\[ p(\sigma_j^2 \mid y_{1:T}, \mu_j, \nu, s_{1:T}) \propto \prod_{s_t = j} \left[ \frac{1}{\sigma_j} \exp\left(-\frac{(r_t - \mu_j)^2}{2\sigma_j^2}\right) (\nu\sigma_j^2)^{(\nu+1)} \exp\left(-\frac{\nu\sigma_j^2}{RV_t}\right) \right] \cdot (\sigma_j^2)^{v_0 - 1} \exp\left(-s_0 \sigma_j^2\right). \]

The conditional posterior of $\sigma_j^2$ is not of any known form, so a Metropolis-Hastings algorithm is applied to sample $\sigma_j^2$. Combining the RV likelihood function and the prior provides a good proposal density $q(\cdot)$ for sampling $\sigma_j^2$, which follows a gamma distribution and is derived as follows,
\[ p(\sigma_j^2 \mid y_{1:T}, \mu_j, \nu, s_{1:T}) \propto \prod_{s_t = j} (\nu\sigma_j^2)^{(\nu+1)} \exp\left(-\frac{\nu\sigma_j^2}{RV_t}\right) \cdot (\sigma_j^2)^{v_0-1} \exp\left(-s_0\sigma_j^2\right) \sim G\left(n_j(\nu+1) + v_0, \; \nu \sum_{s_t = j} \frac{1}{RV_t} + s_0\right) \equiv q(\sigma_j^2). \]
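Because the proposal absorbs the RV likelihood and the prior, the MH acceptance ratio reduces to the ratio of the return likelihoods. A sketch of this independence MH step, with illustrative names (`r_j` and `RV_j` hold the observations assigned to state $j$):

```python
import numpy as np
from scipy import stats

def draw_sigma2_j(r_j, RV_j, sigma2_j, mu_j, nu, v0, s0,
                  rng=np.random.default_rng()):
    """Independence MH step for sigma2_j with the gamma proposal q."""
    n_j = r_j.size
    shape = n_j * (nu + 1) + v0
    rate = nu * np.sum(1.0 / RV_j) + s0
    prop = rng.gamma(shape, 1.0 / rate)      # candidate sigma2_j'

    # q already contains the RV likelihood and the prior, so the MH
    # acceptance ratio reduces to the ratio of the return likelihoods
    def ret_loglik(sig2):
        return stats.norm.logpdf(r_j, mu_j, np.sqrt(sig2)).sum()

    log_acc = ret_loglik(prop) - ret_loglik(sigma2_j)
    return prop if np.log(rng.uniform()) < log_acc else sigma2_j
```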


(4) $\nu \mid y_{1:T}, \{\sigma_j^2\}_{j=1}^{K}, s_{1:T}$
The prior of $\nu$ is assumed to be $\nu \sim IG(a, b)$. The posterior of $\nu$ is given as follows,
\[ p(\nu \mid y_{1:T}, \{\sigma_j^2\}_{j=1}^{K}, s_{1:T}) \propto \prod_{t=1}^{T} \left\{ \frac{(\nu\sigma_{s_t}^2)^{(\nu+1)}}{\Gamma(\nu+1)} RV_t^{-\nu-2} \exp\left(-\frac{\nu\sigma_{s_t}^2}{RV_t}\right) \right\} \cdot \nu^{-a-1} \exp\left(-\frac{b}{\nu}\right). \]

$\nu$ is drawn from a random walk proposal, and negative draws are discarded.
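A sketch of the random walk step (the step size `step` is an illustrative tuning parameter; `sigma2_s` holds $\sigma_{s_t}^2$ for each observation):

```python
import numpy as np
from scipy import stats

def draw_nu(RV, sigma2_s, nu, a, b, step=0.1, rng=np.random.default_rng()):
    """Random-walk MH step for nu; negative proposals are rejected."""
    prop = nu + step * rng.standard_normal()
    if prop <= 0:
        return nu                            # drop negative draws
    def log_post(v):
        return (stats.invgamma.logpdf(RV, v + 1, scale=v * sigma2_s).sum()
                + stats.invgamma.logpdf(v, a, scale=b))
    if np.log(rng.uniform()) < log_post(prop) - log_post(nu):
        return prop
    return nu
```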

(5) $P \mid s_{1:T}$
Using the conjugate prior for the rows of the transition matrix $P$, $P_j \sim \text{Dir}(\alpha_{j1}, \ldots, \alpha_{jK})$, the posterior is given by $\text{Dir}(\alpha_{j1} + n_{j1}, \ldots, \alpha_{jK} + n_{jK})$, where the vector $(n_{j1}, n_{j2}, \ldots, n_{jK})$ records the number of switches from state $j$ to each state.
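A sketch of this Dirichlet update, with `alpha` an illustrative $K \times K$ array of prior parameters:

```python
import numpy as np

def draw_P(s, K, alpha, rng=np.random.default_rng()):
    """Draw each row of the transition matrix from its Dirichlet posterior."""
    n = np.zeros((K, K))
    for t in range(1, len(s)):
        n[s[t - 1], s[t]] += 1               # count switches j -> i
    return np.vstack([rng.dirichlet(alpha[j] + n[j]) for j in range(K)])
```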

1.2 MS-logRV Model

The forward filter backward sampler is used to sample $s_{1:T}$. The sampling of $P$ is the same as step (5) in the MS-RV model estimation. The sampling of $\mu_j$ is the same as step (2) in the MS-RV model, except that $\sigma_j^2$ is replaced with $\exp(\zeta_j)$.

Let $n_j = \sum_{t=1}^{T} \mathbf{1}(s_t = j)$ denote the number of observations belonging to state $j$. Assuming the prior of $\zeta_j$ is $\zeta_j \sim N(m_{\zeta,j}, v_{\zeta,j}^2)$, the conditional posterior of $\zeta_j$ is given as follows,
\[ p(\zeta_j \mid y_{1:T}, \mu_j, \delta_j^2, s_{1:T}) \propto \prod_{s_t = j} \exp\left\{-\frac{\zeta_j}{2} - \frac{(r_t - \mu_j)^2}{2\exp(\zeta_j)} - \frac{(\log RV_t - \zeta_j + \frac{\delta_j^2}{2})^2}{2\delta_j^2}\right\} \cdot \exp\left[-\frac{(\zeta_j - m_{\zeta,j})^2}{2 v_{\zeta,j}^2}\right]. \]

A Metropolis-Hastings algorithm is applied to sample $\zeta_j$. The proposal density is formed as follows,
\begin{align*}
p(\zeta_j \mid y_{1:T}, \mu_j, \delta_j^2, s_{1:T}) &\propto \prod_{s_t = j} \exp\left\{-\frac{\zeta_j}{2} - \frac{(r_t - \mu_j)^2}{2\exp(\zeta_j)}\right\} \cdot \exp\left[-\frac{(\zeta_j - \mu_j^*)^2}{2\sigma_j^{*2}}\right] \\
&\propto \exp\left\{-\frac{n_j \zeta_j}{2} - \frac{\sum_{s_t = j}(r_t - \mu_j)^2 \exp(-\mu_j^*)\left(1 - \zeta_j + \mu_j^*\right)}{2} - \frac{(\zeta_j - \mu_j^*)^2}{2\sigma_j^{*2}}\right\} \\
&\sim N(\mu_j^{**}, \sigma_j^{*2}) \equiv q(\zeta_j),
\end{align*}
where the second line uses the linearization $\exp(-\zeta_j) \approx \exp(-\mu_j^*)\left(1 - (\zeta_j - \mu_j^*)\right)$, and
\[ \mu_j^* = \frac{v_{\zeta,j}^2 \sum_{s_t = j} \log(RV_t) + \frac{1}{2} n_j v_{\zeta,j}^2 \delta_j^2 + \delta_j^2 m_{\zeta,j}}{n_j v_{\zeta,j}^2 + \delta_j^2}, \quad \sigma_j^{*2} = \frac{\delta_j^2 v_{\zeta,j}^2}{n_j v_{\zeta,j}^2 + \delta_j^2}, \]
\[ \mu_j^{**} = \mu_j^* + \sigma_j^{*2} \cdot \frac{1}{2}\left[\sum_{s_t = j}(r_t - \mu_j)^2 \exp(-\mu_j^*) - n_j\right]. \]
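A sketch of this MH step; since $q(\zeta_j)$ is an independence proposal, its density enters the acceptance ratio (names are illustrative):

```python
import numpy as np

def draw_zeta_j(r_j, logRV_j, zeta_j, mu_j, delta2_j, m_z, v2_z,
                rng=np.random.default_rng()):
    """MH step for zeta_j with the Taylor-expanded normal proposal."""
    n_j = r_j.size
    ssr = np.sum((r_j - mu_j) ** 2)
    # normal proposal built from the log(RV) likelihood and the prior
    mu_star = (v2_z * np.sum(logRV_j + 0.5 * delta2_j) + delta2_j * m_z) \
              / (n_j * v2_z + delta2_j)
    s2_star = delta2_j * v2_z / (n_j * v2_z + delta2_j)
    # mean shift from linearizing exp(-zeta) around mu_star
    mu_2star = mu_star + s2_star * 0.5 * (ssr * np.exp(-mu_star) - n_j)
    prop = rng.normal(mu_2star, np.sqrt(s2_star))

    def log_target(z):
        return (-0.5 * n_j * z - 0.5 * ssr * np.exp(-z)
                - np.sum((logRV_j - z + 0.5 * delta2_j) ** 2) / (2 * delta2_j)
                - (z - m_z) ** 2 / (2 * v2_z))

    def log_q(z):
        return -(z - mu_2star) ** 2 / (2 * s2_star)

    log_acc = (log_target(prop) - log_target(zeta_j)
               + log_q(zeta_j) - log_q(prop))
    return prop if np.log(rng.uniform()) < log_acc else zeta_j
```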

Using the conjugate prior $\delta_j^2 \sim IG(v_0, s_0)$, the posterior density of $\delta_j^2$ is given by
\[ p(\delta_j^2 \mid y_{1:T}, \mu_j, \zeta_j, s_{1:T}) \propto \prod_{s_t = j} \frac{1}{\delta_j} \exp\left[-\frac{(\log RV_t - \zeta_j + \frac{\delta_j^2}{2})^2}{2\delta_j^2}\right] \cdot (\delta_j^2)^{-v_0-1} \exp\left(-\frac{s_0}{\delta_j^2}\right). \]

Metropolis-Hastings is used to sample $\delta_j^2$ with the following proposal,
\[ q(\delta_j^2) \equiv IG\left(\frac{n_j}{2} + v_0, \; \frac{\sum_{s_t = j}(\log RV_t - \zeta_j)^2}{2} + s_0\right). \]

1.3 MS-RAV Model


The sampling steps for $s_{1:T}$, $\{\mu_j\}_{j=1}^{K}$ and $P$ are the same as in the MS-RV model estimation. Let $y'_{1:t} = \{y'_1, \ldots, y'_t\}$, where $y'_t = \{r_t, RAV_t\}$. $\{\sigma_j^2\}_{j=1}^{K}$ and $\nu$ are sampled as follows.
The prior of $\sigma_j^2$ is assumed to be $\sigma_j^2 \sim G(v_0, s_0)$. The conditional posterior of $\sigma_j^2$ is given as follows,
\[ p(\sigma_j^2 \mid y'_{1:T}, \mu_j, \nu, s_{1:T}) \propto \prod_{s_t = j} \frac{1}{\sigma_j} \exp\left[-\frac{(r_t - \mu_j)^2}{2\sigma_j^2}\right] \cdot \sigma_j^{2\nu} \exp\left[-\left(\frac{\sigma_j \Gamma(\nu)}{\Gamma(\nu - \frac{1}{2})}\right)^2 \frac{1}{RAV_t^2}\right] \cdot (\sigma_j^2)^{v_0-1} \exp\left(-s_0\sigma_j^2\right). \]


$\sigma_j^2$ is drawn from a random walk proposal, and negative draws are discarded.

The prior for $\nu$ is assumed to be $\nu \sim IG(a, b)$. The posterior of $\nu$ is given as follows,
\[ p(\nu \mid RAV_{1:T}, \{\sigma_j^2\}_{j=1}^{K}, s_{1:T}) \propto \prod_{t=1}^{T} \left\{ \left(\frac{\sigma_{s_t} \Gamma(\nu)}{\Gamma(\nu - \frac{1}{2})}\right)^{2\nu} \frac{RAV_t^{-2\nu-1}}{\Gamma(\nu)} \exp\left[-\left(\frac{\sigma_{s_t} \Gamma(\nu)}{\Gamma(\nu - \frac{1}{2})}\right)^2 \frac{1}{RAV_t^2}\right] \right\} \cdot \nu^{-a-1} \exp\left(-\frac{b}{\nu}\right). \]
A random walk proposal is used to sample $\nu$, and negative values are discarded.

1.4 MS-logRAV Model

The estimation of the MS-logRAV model is very similar to that of the MS-logRV model, except that the return variance $\exp(\zeta_{s_t})$ is replaced with $\exp(2\zeta_{s_t})$.

1.5 MS-RCOV
See steps (1) and (5) in Section 1.1 for the estimation of $s_{1:T}$ and $P$. Let $n_j = \sum_{t=1}^{T} \mathbf{1}(s_t = j)$ denote the number of observations belonging to state $j$.
Given the conjugate prior $M_j \sim N(G_j, V_j)$, the posterior density of $M_j$ is given by
\[ M_j \mid R_{1:T}, s_{1:T}, \Sigma_j \sim N(\bar{M}, \bar{V}), \quad \text{where} \]
\[ \bar{V} = \left(\Sigma_j^{-1} n_j + V_j^{-1}\right)^{-1}, \quad \bar{M} = \bar{V}\left(\Sigma_j^{-1} \sum_{s_t = j} R_t + V_j^{-1} G_j\right). \]

The prior of $\Sigma_j$ is assumed to be $\Sigma_j \sim W(\Psi, \tau)$. The conditional posterior of $\Sigma_j$ is given as follows,
\begin{align*}
p(\Sigma_j \mid Y_{1:T}, M_j, \kappa, s_{1:T}) \propto &\prod_{s_t = j} |\Sigma_j|^{-\frac{1}{2}} \exp\left\{-\frac{1}{2}(R_t - M_j)^{\top} \Sigma_j^{-1} (R_t - M_j)\right\} \\
&\cdot \prod_{s_t = j} |\Sigma_j|^{\frac{\kappa}{2}} |RCOV_t|^{-\frac{\kappa+d+1}{2}} \exp\left\{-\frac{1}{2}\mathrm{tr}\left[(\kappa - d - 1)\Sigma_j RCOV_t^{-1}\right]\right\} \\
&\cdot |\Sigma_j|^{\frac{\tau-d-1}{2}} \exp\left\{-\frac{1}{2}\mathrm{tr}\left(\Psi^{-1}\Sigma_j\right)\right\}.
\end{align*}
A Metropolis-Hastings algorithm is applied to sample $\Sigma_j$. The proposal density $q_j(\cdot)$ is formed as follows,
\begin{align*}
p(\Sigma_j \mid Y_{1:T}, M_j, \kappa, s_{1:T}) &\propto \prod_{s_t = j} |\Sigma_j|^{\frac{\kappa}{2}} \exp\left\{-\frac{1}{2}\mathrm{tr}\left[(\kappa - d - 1)\Sigma_j RCOV_t^{-1}\right]\right\} \cdot |\Sigma_j|^{\frac{\tau-d-1}{2}} \exp\left\{-\frac{1}{2}\mathrm{tr}\left(\Psi^{-1}\Sigma_j\right)\right\} \\
&\sim W\left(\left[(\kappa - 1 - d)\sum_{s_t = j} RCOV_t^{-1} + \Psi^{-1}\right]^{-1}, \; n_j\kappa + \tau\right) \equiv q_j(\cdot).
\end{align*}
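As in the univariate case, the proposal absorbs the RCOV likelihood and the prior, so the acceptance ratio involves only the return likelihoods. A sketch with illustrative names (`RCOV_j` stacks the $d \times d$ matrices for observations in state $j$):

```python
import numpy as np
from scipy import stats

def draw_Sigma_j(R_j, RCOV_j, Sigma_j, M_j, kappa, Psi, tau,
                 rng=np.random.default_rng()):
    """Independence MH step for Sigma_j with the Wishart proposal q_j."""
    n_j, d = R_j.shape
    S_inv = ((kappa - 1 - d) * np.sum(np.linalg.inv(RCOV_j), axis=0)
             + np.linalg.inv(Psi))
    q = stats.wishart(df=n_j * kappa + tau, scale=np.linalg.inv(S_inv))
    prop = q.rvs(random_state=rng)

    # q absorbs the RCOV likelihood and the prior, so the acceptance
    # ratio reduces to the ratio of the return likelihoods
    def ret_loglik(Sig):
        return stats.multivariate_normal.logpdf(R_j, M_j, Sig).sum()

    log_acc = ret_loglik(prop) - ret_loglik(Sigma_j)
    return prop if np.log(rng.uniform()) < log_acc else Sigma_j
```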

Assuming the prior of $\kappa$ is $\kappa \sim G(a, b)$, the posterior density of $\kappa$ is given as follows,
\[ p(\kappa \mid Y_{1:T}, \{M_j\}, \{\Sigma_j\}, s_{1:T}) \propto \prod_{t=1}^{T} \left\{ \frac{|(\kappa - d - 1)\Sigma_{s_t}|^{\frac{\kappa}{2}}}{2^{\frac{\kappa d}{2}} \Gamma_d(\frac{\kappa}{2})} |RCOV_t|^{-\frac{\kappa+d+1}{2}} \exp\left[-\frac{1}{2}\mathrm{tr}\left((\kappa - d - 1)\Sigma_{s_t} RCOV_t^{-1}\right)\right] \right\} \cdot \kappa^{a-1} \exp(-b\kappa). \]

A Metropolis-Hastings algorithm with a random walk proposal is used to sample $\kappa$.

1.6 Univariate Joint IHMM-RV

Define the vector $C = \{c_1, \ldots, c_K\}$ and a $K \times K$ matrix $A$, which are used in sampling $\Gamma$ and $\eta$. The MCMC steps are as follows. Several estimation steps are based on Maheu and Yang (2016) and Song (2014).

(1) $u_{1:T} \mid s_{1:T}, \Gamma, P$
Draw $u_1 \sim \text{Uniform}(0, \gamma_{s_1})$ and draw $u_t \sim \text{Uniform}(0, P_{s_{t-1}, s_t})$ for $t = 2, \ldots, T$.

(2) Adjust the Number of States $K$

i. Check if $\max\{P^r_{1,K+1}, \ldots, P^r_{K,K+1}\} > \min\{u_{1:T}\}$. If yes, expand the number of clusters by making the following adjustments (ii)-(vi); otherwise, move to step (3).

ii. Set $K = K + 1$.

iii. Draw $u_\beta \sim \text{Beta}(1, \eta)$, set $\gamma_K = u_\beta \gamma^r_K$, and set the new residual probability to $\gamma^r_{K+1} = (1 - u_\beta)\gamma^r_K$.

iv. For $j = 1, \ldots, K$, draw $u_\beta \sim \text{Beta}(\gamma_K, \gamma^r_{K+1})$, set $P_{j,K} = u_\beta P^r_{j,K}$ and $P^r_{j,K+1} = (1 - u_\beta)P^r_{j,K}$. Also, add an additional row to the transition matrix $P$: $P_{K+1} \sim \text{Dir}(\alpha\gamma_1, \ldots, \alpha\gamma_K, \alpha\gamma^r_{K+1})$.

v. Expand the parameter size by one by drawing $\mu_{K+1} \sim N(m, v^2)$ and $\sigma^2_{K+1} \sim G(v_0, s_0)$.

vi. Go back to step i. A sketch of this expansion loop follows the list.
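Assuming `gamma_r` and `P_r` store the residual probabilities $\gamma^r_{K+1}$ and $P^r_{j,K+1}$ (an illustrative representation, not from the paper), the loop can be sketched as:

```python
import numpy as np

def expand_states(gamma, gamma_r, P, P_r, u_min, eta, alpha,
                  rng=np.random.default_rng()):
    """Split residual mass until every residual transition probability
    is below the smallest slice variable u_min (sketch)."""
    while P_r.max() > u_min:
        K = len(gamma)                       # current number of states
        # iii. split the residual of Gamma
        ub = rng.beta(1.0, eta)
        gamma = np.append(gamma, ub * gamma_r)
        gamma_r = (1.0 - ub) * gamma_r
        # iv. split the residual of each existing row of P
        ub_rows = rng.beta(gamma[K], gamma_r, size=K)
        P = np.hstack([P, (ub_rows * P_r)[:, None]])
        P_r = (1.0 - ub_rows) * P_r
        # add a row for the new state from Dir(alpha*gamma, alpha*gamma_r)
        new_row = rng.dirichlet(alpha * np.append(gamma, gamma_r))
        P = np.vstack([P, new_row[None, :-1]])
        P_r = np.append(P_r, new_row[-1])
        # v. mu_{K+1} and sigma2_{K+1} would be drawn from their priors here
    return gamma, gamma_r, P, P_r
```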

(3) $s_{1:T} \mid y_{1:T}, u_{1:T}, \theta, \phi, P, \Gamma$

In this step, the latent state variable is sampled using the forward filter backward sampler of Chib (1996). The forward filter part contains the following steps:

i. Set the initial value of the filter $p(s_1 = j \mid y_1, u_1, \theta, \phi, P) = \mathbf{1}(u_1 < \gamma_j)$ and normalize it.

ii. Prediction step:
\[ p(s_t \mid y_{1:t-1}, u_{1:t-1}, \theta, \phi, P) \propto \sum_{j=1}^{K} \mathbf{1}(u_t < P_{j,s_t}) \cdot p(s_{t-1} = j \mid y_{1:t-1}, u_{1:t-1}, \theta, \phi, P). \]

iii. Update step:
\[ p(s_t \mid y_{1:t}, u_{1:t}, \theta, \phi, P) \propto f(r_t \mid \mu_{s_t}, \sigma_{s_t}^2) \cdot g(RV_t \mid \nu + 1, \nu\sigma_{s_t}^2) \cdot p(s_t \mid y_{1:t-1}, u_{1:t-1}, \theta, \phi, P). \]

The underlying states are drawn using the backward sampler as follows.



i. For $t = T$, draw $s_T$ from $p(s_T \mid y_{1:T}, u_{1:T}, \theta, \phi, P)$.

ii. For $t = T-1, \ldots, 1$, draw $s_t$ from $\mathbf{1}(u_{t+1} < P_{s_t, s_{t+1}}) \cdot p(s_t \mid y_{1:t}, u_{1:t}, \theta, \phi, P)$.
Then we count the number of active clusters and remove inactive states by making the following adjustments.

i. Calculate the number of active states (states with at least one observation assigned to them), denoted by $L$. If $L < K$, remove the inactive states by relabeling the states.

ii. Adjust the order of the state-dependent parameters $\mu$, $\sigma^2$ and $\Gamma$ according to the relabeled states $s_{1:T}$.

iii. Set $K = L$. Recalculate the residual probabilities $\gamma^r_{K+1}$ and $P^r_{j,K+1}$ for $j = 1, \ldots, K$. Then set the values of the parameters $\mu_j$, $\sigma_j^2$ and $\gamma_j$ to zero for $j > K$.

(4) $\Gamma \mid s_{1:T}, \eta, \alpha$

i. Let $n_{j,i}$ denote the number of moves from state $j$ to state $i$. Calculate $n_{j,i}$ for $i = 1, \ldots, K$ and $j = 1, \ldots, K$.

ii. For $i = 1, \ldots, K$ and $j = 1, \ldots, K$, if $n_{j,i} > 0$, then for $l = 1, \ldots, n_{j,i}$, draw $x_l \sim \text{Bernoulli}\left(\frac{\alpha\gamma_i}{l - 1 + \alpha\gamma_i}\right)$. If $x_l = 1$, set $A_{j,i} = A_{j,i} + 1$.

iii. Draw $\Gamma \sim \text{Dir}(c_1, \ldots, c_K, \eta)$, where $c_i = \sum_{j=1}^{K} A_{j,i}$.

(5) $P \mid s_{1:T}, \Gamma, \alpha$
For $j = 1, \ldots, K$, draw $P_j \sim \text{Dir}(\alpha\gamma_1 + n_{j,1}, \ldots, \alpha\gamma_K + n_{j,K}, \alpha\gamma^r_{K+1})$.

(6) $\theta \mid y_{1:T}, s_{1:T}, \nu$
See steps (2) and (3) in Section 1.1 for the estimation of the state-dependent parameters $\mu_j, \sigma_j^2$, for $j = 1, \ldots, K$.

(7) $\nu \mid y_{1:T}, s_{1:T}, \{\sigma_j^2\}_{j=1}^{K}$
Same as step (4) in Section 1.1.

(8) $\eta \mid s_{1:T}, \Gamma, \alpha$
Recompute the vector $C$ as in step (4) and define auxiliary variables $\nu$ and $\lambda$, where $\nu \sim \text{Bernoulli}\left(\frac{\sum_{i=1}^{K} c_i}{\sum_{i=1}^{K} c_i + \eta}\right)$ and $\lambda \sim \text{Beta}(\eta + 1, \sum_{i=1}^{K} c_i)$. Then draw a new value of $\eta \sim G(a_1 + K - \nu, b_1 - \log(\lambda))$.
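A sketch of this two-block update for $\eta$ (illustrative names; `c` is the vector $C$ from step (4)):

```python
import numpy as np

def draw_eta(c, eta, a1, b1, rng=np.random.default_rng()):
    """Escobar-West style update for the DP concentration parameter eta."""
    K, c_sum = len(c), float(np.sum(c))
    nu_aux = rng.uniform() < c_sum / (c_sum + eta)      # Bernoulli auxiliary
    lam = rng.beta(eta + 1.0, c_sum)                    # Beta auxiliary
    # Gamma(shape, rate) with rate b1 - log(lam); numpy uses scale = 1/rate
    return rng.gamma(a1 + K - nu_aux, 1.0 / (b1 - np.log(lam)))
```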

(9) $\alpha \mid s_{1:T}, C$
Define $\nu'_j, \lambda'_j$, for $j = 1, \ldots, K$, where $\nu'_j \sim \text{Bernoulli}\left(\frac{\sum_{i=1}^{K} n_{j,i}}{\sum_{i=1}^{K} n_{j,i} + \alpha}\right)$ and $\lambda'_j \sim \text{Beta}(\alpha + 1, \sum_{i=1}^{K} n_{j,i})$. Then draw $\alpha \sim G\left(a_2 + \sum_{j=1}^{K} c_j - \sum_{j=1}^{K} \nu'_j, \; b_2 - \sum_{j=1}^{K} \log(\lambda'_j)\right)$.

1.7 Univariate Joint IHMM-logRV and Joint IHMM-logRAV
See steps (1)-(5), (8) and (9) in Section 1.6 for the estimation of the auxiliary variables $u_{1:T}$, the latent state variable $s_{1:T}$, $\Gamma$, the transition matrix $P$, and the DP concentration parameters $\eta$ and $\alpha$. The estimation of $\theta = \{\mu_j, \zeta_j, \delta_j^2\}_{j=1}^{\infty}$ in the IHMM-logRV model is the same as in the MS-logRV model; see Section 1.2. The parameter estimation of the IHMM-logRAV model can be done similarly.

1.8 Joint IHMM-RCOV

See steps (1)-(5), (8) and (9) in Section 1.6 for the estimation of $u_{1:T}$, $s_{1:T}$, $\Gamma$, $P$, $\eta$ and $\alpha$. The estimation of $\theta = \{M_j, \Sigma_j\}_{j=1}^{\infty}$ and $\kappa$ is the same as in the MS-RCOV model; see Section 1.5.

2 Priors
The prior specifications of univariate joint and benchmark models are provided in Table 1.
Table 2 shows the prior specifications of multivariate models.

3 Results
Figure 1 plots $E\left[\sigma_{s_t}^2 \mid y_{1:T}\right]$ for the IHMM-RV and IHMM models. Volatility estimates vary over a larger range in the IHMM-RV model. For example, the IHMM appears to overestimate the return variance during calm market periods and underestimate it in several highly volatile periods, such as the October 1987 crash and the 2008 financial crisis. In contrast, the return variance from the IHMM-RV model is closer to RV during these times. The differences between the models are due to the additional information from ex-post volatility.
Figure 2 displays the posterior average of active states in both IHMM and IHMM-RCOV
models at each point in the out-of-sample period. It shows that more states are used in the
joint return-RCOV model in order to better capture the dynamics of returns and volatility.

References
Chib S. 1996. Calculating posterior distributions and modal estimates in Markov mixture
models. Journal of Econometrics 75: 79–97.

Maheu JM, Yang Q. 2016. An infinite hidden Markov model for short-term interest rates.
Journal of Empirical Finance 38: 202–220.

Song Y. 2014. Modelling regime switching and structural breaks with an infinite hidden
Markov model. Journal of Applied Econometrics 29: 825–842.

Table 1: Prior Specifications of Univariate Return Models

Panel A: Priors for MS and Joint MS Models

| Model | $\mu_{s_t}$ | $\sigma^2_{s_t}$ | $\nu$ | $\delta^2_{s_t}$ | $P_j$ |
|---|---|---|---|---|---|
| MS | N(0, 1) | IG(2, $\widehat{\text{var}}(r_t)$) | - | - | Dir(1, ..., 1) |
| MS-RV | N(0, 1) | G($\overline{RV_t}$, 1) | IG(2, 1) | - | Dir(1, ..., 1) |
| MS-RAV | N(0, 1) | G($\overline{RV_t}$, 1) | IG(2, 1) | - | Dir(1, ..., 1) |
| MS-logRV | N(0, 1) | N($\overline{\log(RV_t)}$, 5) | - | IG(2, 0.5) | Dir(1, ..., 1) |
| MS-logRAV | N(0, 1) | N($\overline{\log(RAV_t)}$, 5) | - | IG(2, 0.5) | Dir(1, ..., 1) |

Panel B: Priors for IHMM and Joint IHMM Models

| Model | $\mu_{s_t}$ | $\sigma^2_{s_t}$ | $\nu$ | $\delta^2_{s_t}$ | $\eta$ | $\alpha$ |
|---|---|---|---|---|---|---|
| IHMM | N(0, 1) | IG(2, $\widehat{\text{var}}(r_t)$) | - | - | G(1, 4) | G(1, 4) |
| IHMM-RV | N(0, 1) | G($\overline{RV_t}$, 1) | IG(2, 1) | - | G(1, 4) | G(1, 4) |
| IHMM-logRV | N(0, 1) | N($\overline{\log(RV_t)}$, 5) | - | IG(2, 0.5) | G(1, 4) | G(1, 4) |
| IHMM-logRAV | N(0, 1) | N($\overline{\log(RAV_t)}$, 5) | - | IG(2, 0.5) | G(1, 4) | G(1, 4) |

$\widehat{\text{var}}(r_t)$ is the sample variance; $\overline{RV_t}$, $\overline{\log(RV_t)}$ and $\overline{\log(RAV_t)}$ are the sample means. All are computed using in-sample data.

Table 2: Prior Specifications of Multivariate Models

Panel A: Priors for Multivariate MS Models

| Model | $M_{s_t}$ | $\Sigma_{s_t}$ | $\kappa$ | $P_j$ |
|---|---|---|---|---|
| MS | N(0, 5I) | IW($\widehat{\text{Cov}}(R_t)$, 5) | - | Dir(1, ..., 1) |
| MS-RCOV | N(0, 5I) | W($\frac{1}{3}\overline{RCOV_t}$, 3) | G(20, 1)$\mathbf{1}_{\kappa > 4}$ | Dir(1, ..., 1) |

Panel B: Priors for Multivariate IHMM Models

| Model | $M_{s_t}$ | $\Sigma_{s_t}$ | $\kappa$ | $\eta$ | $\alpha$ |
|---|---|---|---|---|---|
| IHMM | N(0, 5I) | IW($\widehat{\text{Cov}}(R_t)$, 5) | - | G(1, 4) | G(1, 4) |
| IHMM-RCOV | N(0, 5I) | W($\frac{1}{3}\overline{RCOV_t}$, 3) | G(20, 1)$\mathbf{1}_{\kappa > 4}$ | G(1, 4) | G(1, 4) |

$0$ denotes the zero vector and $I$ the identity matrix. $\widehat{\text{Cov}}(R_t)$ and $\overline{RCOV_t}$ are computed using in-sample data.

[Figure omitted: time series of RV, $E[\sigma_{s_t}^2 \mid y_{1:T}, \text{IHMM}]$ and $E[\sigma_{s_t}^2 \mid y_{1:T}, \text{IHMM-RV}]$, 1885/03-2013/03; y-axis: variance estimates.]

Figure 1: $E\left[\sigma_{s_t}^2 \mid y_{1:T}, \text{IHMM}\right]$, $E\left[\sigma_{s_t}^2 \mid y_{1:T}, \text{IHMM-RV}\right]$ and Realized Variance

[Figure omitted: posterior average of active states for the IHMM and IHMM-RCOV models, 1951/01-2011/01.]

Figure 2: Number of Active Clusters: IHMM and IHMM-RCOV Models
