Publisher: Institute for Operations Research and the Management Sciences (INFORMS)
INFORMS is located in Maryland, USA

Marketing Science
Publication details, including instructions for authors and subscription information:
http://pubsonline.informs.org

A Hidden Markov Model of Customer Relationship Dynamics
Oded Netzer, James M. Lattin, V. Srinivasan

To cite this article:


Oded Netzer, James M. Lattin, V. Srinivasan, (2008) A Hidden Markov Model of Customer Relationship Dynamics. Marketing
Science 27(2):185-204. http://dx.doi.org/10.1287/mksc.1070.0294

Copyright 2008, INFORMS

Vol. 27, No. 2, March–April 2008, pp. 185–204
issn 0732-2399, eissn 1526-548X, doi 10.1287/mksc.1070.0294
© 2008 INFORMS

Oded Netzer
Graduate School of Business, Columbia University, New York, New York 10027, on2110@columbia.edu

James M. Lattin, V. Srinivasan


Graduate School of Business, Stanford University, Stanford, California 94305
{jlattin@stanford.edu, srinivasan_seenu@gsb.stanford.edu}

This research models the dynamics of customer relationships using typical transaction data. Our proposed model permits not only capturing the dynamics of customer relationships, but also incorporating the effect of the sequence of customer-firm encounters on the dynamics of customer relationships and the subsequent buying behavior. Our approach to modeling relationship dynamics is structurally different from existing approaches. Specifically, we construct and estimate a nonhomogeneous hidden Markov model to model the transitions among latent relationship states and effects on buying behavior. In the proposed model, the transitions between the states are a function of time-varying covariates such as customer-firm encounters that could have an enduring impact by shifting the customer to a different (unobservable) relationship state. The proposed model enables marketers to dynamically segment their customer base and to examine methods by which the firm can alter long-term buying behavior. We use a hierarchical Bayes approach to capture the unobserved heterogeneity across customers. We calibrate the model in the context of alumni relations using a longitudinal gift-giving data set. Using the proposed model, we probabilistically classify the alumni base into three relationship states and estimate the effect of alumni-university interactions, such as reunions, on the movement of alumni between these states. Additionally, we demonstrate improved prediction ability on a hold-out sample.
Key words: customer relationship management; hidden Markov models; dynamic choice models; segmentation; Bayesian analysis
History: This paper was received July 13, 2005, and was with the authors 18 months for 2 revisions; processed by Prasad Naik.

1. Introduction

"In order to implement CRM, a company must have an integrated database available at every customer touch point and analyze that data well. ... CRM allows companies to automate the way they interact with their customers, and to communicate with relevant, timely messages." (Source: Peter Heffring, president of Teradata's CRM division, 2002).

Customer relationship management (CRM) has been a prominent aspect of business marketing for the past decade. Given the wide adoption of CRM in the business world, we aim to develop a model that could help businesses analyze transaction data to assess customer relationships and put forward a support system for marketing decisions. Recently, marketing scientists have started to develop models that relate customer relationships and database marketing through measures like customer duration and customer lifetime value (e.g., Reinartz and Kumar 2003). However, far less attention has been given to modeling the dynamics of customer relationships and the effect of encounters between the customer and the firm on customer-firm relationships and the customer's choice behavior. Marketers often engage in activities that are aimed at creating an enduring impact on the relationship between the customers and the firm,¹ such as loyalty programs and university reunions. These interactions between the customer and the firm are designed to move the customer into a different state with different behavioral propensities (e.g., where the customer is less likely to switch to a competitor or to exhibit price sensitivity). Once the customer is engaged in a certain behavior, this behavior is likely to affect subsequent relationship with the firm.

¹ Because the model proposed in this paper applies to the customer's relationship with firms, brands, services, or nonprofit and for-profit organizations, we use the term "firm" to represent the business partner for the relationship with the customer.

The objective of this research is to capture the dynamics of customer relationships. We suggest a modeling framework for estimating and understanding the relationship dynamics which is formed by a series

of customer-firm interactions. The proposed model allows one to probabilistically identify the customer's state of relationship at any given time and enables comparing the impact of alternative customer-firm encounters on moving the customer to a higher state of relationship.

We propose a hidden Markov model (HMM) in which the states are a finite set of relationship states. The transitions between the states are determined by a set of time-varying covariates such as customer-firm interactions, leading to a nonhomogeneous HMM. The relationship-state dependence is defined by the dependency between the relationship state and the likelihood of the customer's purchase behavior. The number of states is determined by the complexity of the relationship and its dynamics over time. To distinguish between relationship-state dependence and zero-order heterogeneity (Fader and Lattin 1993), unobserved heterogeneity is captured through a set of random-effect coefficients. The HMM is estimated using a Markov chain Monte Carlo (MCMC) hierarchical Bayes procedure.

We apply the model to a university-alumni customer relationship data set. This empirical application stresses the value of the model for CRM marketers. We identify three states, which correspond to dormant, occasional, and active (very frequent) donors. The states are relatively sticky (large diagonal elements in the transition matrix). Attending a reunion seems to have a strong impact on moving alumni from the dormant to the occasional donation state and from the occasional to the active state. In contrast to the commonly used "highest customer lifetime value" approach, using the HMM we find only a small effect of reunion attendance on alumni in the frequent donation state. Volunteering to university roles, on the other hand, seems to have its primary impact on alumni in the dormant and active states, but not on alumni in the occasional state. In our empirical application, we also find superior predictive validity of the HMM relative to a heterogeneous, yet static, latent class model and a dynamic and heterogeneous recency-frequency model.

The remainder of this paper is organized as follows. Section 2 relates the current work to the relationship marketing and dynamic choice modeling literature. Section 3 develops the HMM for capturing the dynamics of customer relationships and describes the hierarchical Bayes estimation procedure. In §4 we describe the application of the proposed model in the context of alumni relations using longitudinal gift-giving data from the alumni association of a major private university. Section 5 concludes this paper with a discussion of the theoretical and practical contributions of this research, as well as an outline of directions for future research.

2. Relationship Marketing and Dynamics in Buying Behavior

2.1. Relationship Marketing Dynamics

Research in the area of relationship marketing has been emerging in the past decade both from the consumer behavior perspective (e.g., Fournier 1998) and from the empirical modeling perspective (e.g., Bolton 1998, Thomas 2001).

Theoretical models (e.g., Dwyer et al. 1987) suggest that relationships evolve (not always monotonically) through several discrete levels. In particular, it is suggested that relationships develop as a consequence of changes in the relationship's environment and interactions between the relationship's partners (Aaker et al. 2004, Fournier 1998, Hinde 1979). Furthermore, Oliver (1997) suggests that a discrete shift in the relationship occurs if the aggregate satisfaction from a sequence of critical incidents is strong enough to move the customer to a different "conceptual plane of loyalty." Thus, transitions between relationship stages might be triggered by discrete encounters between relationship parties. For example, offering an airline traveler an upgrade to business class could serve as a critical incident (Flanagan 1954). If the act of upgrade and the experiences of the traveler during the business class flight pass the customer's satisfaction threshold, this critical incident could have a long-term impact on the traveler's relationship with the airline and the traveler's subsequent choice of flights. A sequence of discrete encounters between the customer and the firm constructs a relationship. Such encounters include transactions, service encounters, customer initiated interactions, or exposure and response to marketing actions initiated by the firm. We use the notion that relationships are built from a series of customer-firm encounters as the building block of our model.

Recently, with the increase in popularity of CRM software applications in the business world, more academic research has been focused on building relationship models using marketing databases. This includes models of customer lifetime duration (e.g., Allenby et al. 1999, Bolton 1998, Reinartz and Kumar 2003, Schmittlein and Peterson 1994) and customer lifetime value (e.g., Libai et al. 2002, Rust et al. 2004). With the exception of Reinartz and Kumar (2003), these models do not take into consideration the dynamics in the relationship that result from changes in the customer's environment, which is the main focus of the current study. Indeed, in a review of service and relationship marketing models, Rust and Chung (2006) suggest that future research should model the dynamics in customers' preferences as a function of the dynamic interactions between the customers and
the firm. In a review of customer lifetime value models, Jain and Singh (2002) suggest incorporating factors that drive consumer purchase over time and the stochasticity in the buying behavior in the customer relationship model.

2.2. Dynamics in Buying Behavior and Hidden Markov Models

Methodologically, the model we developed is more similar to the literature on marketing dynamics. Many marketing settings involve dynamics in consumer behavior. These situations include both individual-level and market aggregate dynamics. The difficulty with capturing such dynamics is that in most marketing data sets the number of observations or time periods observed is relatively small, and the nature and structure of dynamics is often latent. To capture the latent structure of dynamics in a relatively parsimonious way, researchers developed various approaches that could be generally divided into discrete or continuous state space structures.

If the dynamics are assumed gradual or smooth, one could use a continuous state structure to capture the dynamics. For example, time series autoregressive error models are used to capture the dynamics in sales and the long-term effect of marketing activities such as advertising (see Dekimpe and Hanssens 2000 and Pauwels et al. 2004 for a review). Naik et al. (1998) and Xie et al. (1997) use Kalman filtering to capture the dynamics in advertising scheduling and new product introduction, respectively. In the choice modeling literature, a smooth dynamic effect is often captured by a state-dependent term in the utility function using an exponentially smoothed sum (Guadagni and Little 1983, Srinivasan and Kesavan 1976) or a simple running average (Bucklin and Lattin 1991) of past purchases.

However, the continuous state space is inadequate to capture dynamics that are postulated to develop in a discrete manner such as an instantaneous regime shift in the market conditions or consumer preferences (e.g., due to an inclusion or a drop of a brand from the consumer's consideration set). One could model such dynamics, by allowing consumers (or markets) to transition over time between a set of discrete states. Probably the simplest demonstration of such discrete states in the choice modeling literature is the state-dependent model (Heckman 1981). In this model, the observed previous choice of the customer (captured by a lagged dependent variable) constitutes the customer state in the current choice occasion. Choice modelers include state dependence in their econometric models to capture heterogeneity across individuals as well as the serial correlation in purchases over time (McAlister et al. 1991). Using scanner panel data, Keane (1997) and Erdem and Sun (2001) find positive and significant state dependence effects across product categories even after controlling for heterogeneity; other studies find mixed results (e.g., Jeuland et al. 1980). In the context of CRM, Pfeifer and Carraway (2000) use a Markov model between the observed purchase recency states to capture dynamics in customer lifetime value. Morrison et al. (1982) modified the brand switching Markov model to classify Merrill Lynch's customers into "prime" and "not prime" states using managerial judgment. In the context of alumni donations, Soukup (1983) used an ad-hoc dichotomization of past donations (donor and noncontributor) to define the customer's state of donation behavior.

A limitation of the observed states models is their restrictive account for buyer behavior dynamics, whereby an ad-hoc specification of state dependence is added to an otherwise static model. A second shortcoming of these models is that they often ignore other important sources of dynamics in buying behavior, such as the enduring effects of marketing stimuli. Indeed, for exogenous variables that are correlated over time, and are not controlled for, previous behavior might be a determinant of current behavior simply because it captures the effect of the omitted variables (Erdem and Sun 2001). This problem is likely to be more severe in the context of relationship marketing because marketing actions such as loyalty programs (Lewis 2004) and customer initiated interactions such as service encounters (Bolton 1998) might alter the customer's relationship with the firm, and therefore might have an enduring effect on the customer's buying behavior. Finally, often the researcher or marketer does not observe the consumer or market states that govern the dynamics.

To overcome the problem of unobserved states one could describe a set of latent states and transitions between these states and translate these latent states to the observed behavior through a stochastic model. This process can be described as an HMM. MacDonald and Zucchini (1997, Chapter 4) describe several applications of HMMs in areas ranging from biology, geology, and climatology to finance and criminology. The most common application of HMMs is in the area of speech recognition (Rabiner 1989, Rabiner and Juang 1993). In econometrics, Hamilton (1989) proposed an HMM to estimate the impact of discrete regime shifts on the growth rates of the real gross national product.

Within the marketing literature, HMMs are closely related to the family of latent class models (Kamakura and Russell 1989). Like most latent class models, HMMs classify individuals into a set of states or segments based on their buying behavior. However, unlike the latent class models, in HMMs the membership in the latent states is dynamic and follows
a Markov process. A handful of attempts have been made to model dynamic change in the latent segment membership in marketing applications (e.g., Poulsen 1990, Ramaswamy 1997). Wedel and Kamakura (2000, chapter 10) and Dillon et al. (1994) survey these studies as well as alternative forms of dynamics in segmentations (e.g., Böckenholt and Dillon 2000, Böckenholt and Langeheine 1996). Wedel and Kamakura conclude that the issue of nonstationarity in marketing segmentation should be further investigated. More recently, Smith et al. (2006) developed a Markov switching criterion for HMMs and empirically tested it in the context of dynamic effectiveness of advertising on brand sales. Montgomery et al. (2004) used a time-continuous HMM, which combines discrete states and continuous transition times, to study web-path analysis. Fader et al. (2004) proposed a changepoint model to predict new product sales. Liechty et al. (2003) apply an HMM to identify visual attention mode in advertising viewing. Du and Kamakura (2006) use an HMM to identify latent states in American families' life cycles. Moon et al. (2007) use a random-effect HMM to augment unobserved competitors' promotions in a pharmaceutical context.

Our HMM of customer relationships pushes forward the marketing literature related to dynamic latent class models in several aspects. First, relationships are constructed from a series of interactions between the customer and firm. Because we are interested in understanding the effect of these interactions on dynamics, we relax the assumption made in all the marketing HMM applications mentioned above of stationary transitions between the latent states. We use a nonhomogeneous HMM (Hughes and Guttorp 1994) in which the Markovian transitions are a function of time-varying covariates. To our knowledge, this is the first paper to do so in the marketing literature. Allowing for time-varying covariates in the transitions is important if one wishes to understand the drivers of the dynamics rather than merely build a model that fits the dynamics in the data. Second, CRM data sets are often collected at the individual level. When modeling dynamics using individual-level data it is crucial to account for heterogeneity in order to distinguish cross-individual heterogeneity from dynamics. Most HMM applications estimate the model at the aggregate level or using aggregate data (see Liechty et al. 2003, Montgomery et al. 2004, and Moon et al. 2007 for exceptions). Finally, from a substantive point of view, our application of the HMM to the area of customer relationships brings an advanced methodological modeling approach to help address an emerging managerial need to manage customer relationships over time.

3. Model Development

3.1. The Hidden Markov Model²

² To keep this manuscript in a manageable length, we provide only a brief description of HMMs. The interested reader is referred to Rabiner (1989) and MacDonald and Zucchini (1997) for a detailed treatment of the topic.

The model described in this section is an individual-level model of buying behavior. We consider a set of customers, each of whom is involved in repeated interactions with a brand, firm, service provider, or institution. The marketer observes the choice history for each individual and the marketing environment at every time period. These data are similar to typical transaction data commonly used in choice models.

We define a relationship encounter as an interaction between the customer and the firm. Such interactions might include purchase transactions, exposure to relationship marketing activities, or other nonpurchase related exposure to the firm. Relationships are made of a longitudinal sequence of relationship encounters. We further define a set of hidden (latent or unobserved) relationship states, which differ with respect to the strength of the relationship between the customer and the firm and the conditional likelihood of choice given the relationship state. The transitions between the states are probabilistically determined and are affected by relationship encounters. This structure of latent states and observed behavior can be modeled by an HMM.

An HMM is a model of a stochastic process that is not directly observable but can be observed only through another set of stochastic processes that produces a set of observations. In the proposed HMM, the transition between the relationship states is characterized by a Markov process. This stochastic process is then transformed into the observed buyer behavior through the stochastic process of choice. Specifically, we develop an HMM of repeated binary choices that relates the transitions between the latent relationship states to the observed buying behavior (see Figure 1 for a graphical representation of the proposed HMM).

The proposed HMM consists of three main components:

(1) The initial state distribution: the probability that customer $i$ is in state $s$ at time 1 is $P(S_{i1} = s) = \pi_{is}$.

(2) The transitions: a sequence of Markovian transitions ($Q_{i,\,t-1 \to t}$) that express, in a probabilistic manner, the likelihood that the series of customer-firm interactions in the previous time period were strong enough to transition the customer to another state. The probability that a customer transitions from state $s_{t-1}$ at time $t-1$ to state $s_t$ at time $t$ is $P(S_{it} = s_t \mid S_{it-1} = s_{t-1}) = q_{it s_{t-1} s_t}$.
Figure 1. A Hidden Markov Model of Customer Relationships
[Diagram: customer-firm interactions at times $1, \ldots, t-1$ drive transitions among the latent relationship states $S = 1$ (very weak), ..., $S = NS-1$ (strong), $S = NS$ (very strong). Each state $s$ is linked to the observed choice at time $t$ through the state-dependent choice probability $P(\text{Choice}_t \mid S = s)$.]

(3) The state-dependent choice: the probability that the customer will choose the product at time $t$ conditioned on her state is $P(Y_{it} = 1 \mid S_{it} = s) = m_{it|s}$,³ where
$S_{it}$ is the state of customer $i$ at time $t$ in a Markov process with $NS$ states, and
$Y_{it}$ is the choice made by customer $i$ at time $t$.

³ In what follows, we assume no information is available about the competition. We take this approach because this is generally the case in CRM transaction data sets, where the firm does not observe transactions with the competition. Nevertheless, if information about the competition exists, one could extend the model to incorporate competitive information.

3.2. The Model's Components

3.2.1. The Markov Chain Transition Matrix. We model the transitions between states as a Markov process. The transition matrix is defined as

$$
Q_{i,\,t-1 \to t} \;=\;
\begin{array}{c|cccccc}
\text{State at } t-1 \,\backslash\, \text{State at } t & 1 & 2 & 3 & \cdots & NS-1 & NS \\ \hline
1 & q_{it11} & q_{it12} & q_{it13} & \cdots & q_{it1,NS-1} & q_{it1NS} \\
2 & q_{it21} & q_{it22} & q_{it23} & \cdots & q_{it2,NS-1} & q_{it2NS} \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
NS & q_{itNS1} & q_{itNS2} & q_{itNS3} & \cdots & q_{itNS,NS-1} & q_{itNSNS}
\end{array}
\tag{1}
$$

where $q_{itss'} = P(S_{it} = s' \mid S_{it-1} = s)$ is the conditional probability that individual $i$ moves from state $s$ at time $t-1$ to state $s'$ at time $t$, and where $0 \le q_{itss'} \le 1 \;\forall\, s, s'$, and $\sum_{s'} q_{itss'} = 1$. In applying our model in the context of alumni-university relationships (see §4), we put a structure on the general transition matrix in Equation (1). Specifically, we define the transition matrix as a "random walk with a sudden death," whereby from each state the customer/alumni could move to an adjacent state or drop immediately to dormancy. This assumption was made primarily for model parsimony. Nevertheless, in the context of alumni donations used in our empirical application, we found this assumption to be both behaviorally and empirically grounded.

Each one of the matrix elements in Equation (1) represents a probability of transition. We assume that the customer's propensity for transition is affected by his/her relationship encounters with the firm. We model the transitions between the states as a threshold model, where a discrete transition occurs if the propensity for transition passes a threshold level. As mentioned previously, the idea that a movement to a discrete level of relationship occurs when the aggregate measure of satisfaction or dissatisfaction from relationship encounters passes a threshold has roots in the relationship and service marketing literature (Oliver 1997).

The norm theory of Kahneman and Miller (1986) postulates that past experiences create a norm, against which current experiences are judged. Thus, it might be reasonable to expect that current relationship encounters be judged relative to the status quo. If the cumulative experience from the encounters between the customer and the firm is highly negative (e.g., service failure), it is likely to shift the propensity for transition below the threshold needed for a transition to a lower state. On the other hand, if the encounter is highly positive (e.g., an important product benefit learned from an advertisement campaign), it is likely to shift the propensity for transition above the threshold needed for a transition to a higher state. If the relationship encounters in the previous period did not have a strong impact on the customer, the customer is likely to stay in her current state.

Putting this process in mathematical terms, and assuming that the unobserved part of the propensity for transition is independently and identically distributed (IID) of the extreme value type, we can model the nonhomogeneous transition probabilities following the ordered logit model (Greene 1997).
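A minimal numerical sketch of this threshold mechanism may help fix ideas. It assumes logistic cumulative probabilities, as in a standard ordered logit, and uses made-up threshold and propensity values. The function name `ordered_logit_row` and all numbers are hypothetical illustrations and are not taken from the paper or its estimation code.

```python
import numpy as np


def ordered_logit_row(thresholds, propensity):
    """Transition probabilities out of one state under an ordered logit.

    thresholds: increasing cut points, one fewer than the number of states
                (hypothetical values, for illustration only).
    propensity: scalar covariate-driven part of the propensity for
                transition (the inner product of covariates and coefficients).
    """
    thresholds = np.asarray(thresholds, dtype=float)
    # Cumulative probability of ending in state k or lower:
    # logistic(threshold_k - propensity).
    cum = 1.0 / (1.0 + np.exp(-(thresholds - propensity)))
    # Pad with 0 and 1 so that successive differences give one probability
    # per destination state, from the lowest state to the highest.
    cum = np.concatenate(([0.0], cum, [1.0]))
    return np.diff(cum)


# Three states, hence two thresholds (illustrative values).
row_neutral = ordered_logit_row([-1.0, 2.0], propensity=0.0)
row_positive = ordered_logit_row([-1.0, 2.0], propensity=1.5)
print(row_neutral, row_positive)
```

In this sketch each row sums to one, and a larger propensity (for example, after a highly positive encounter) moves probability mass away from the lowest state and toward the highest, which is the qualitative behavior described above.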
Specifically, the terms $q_{itss'}$ in the transition matrix in Equation (1) could be written as

$$
q_{its1} = \Pr(\text{transition from } s \text{ to state } 1)
= \frac{\exp(\mu(1)_{is} - a_{it}'\rho_{is})}{1 + \exp(\mu(1)_{is} - a_{it}'\rho_{is})}, \tag{2}
$$

$$
q_{itss'} = \Pr(\text{transition from } s \text{ to } s')
= \frac{\exp(\mu(s')_{is} - a_{it}'\rho_{is})}{1 + \exp(\mu(s')_{is} - a_{it}'\rho_{is})}
- \frac{\exp(\mu(s'-1)_{is} - a_{it}'\rho_{is})}{1 + \exp(\mu(s'-1)_{is} - a_{it}'\rho_{is})}, \tag{3}
$$

$$
q_{itsNS} = \Pr(\text{transition from } s \text{ to } NS)
= 1 - \frac{\exp(\mu(NS-1)_{is} - a_{it}'\rho_{is})}{1 + \exp(\mu(NS-1)_{is} - a_{it}'\rho_{is})}, \tag{4}
$$

for $s \in \{1, \ldots, NS\}$ and $s' \in \{2, \ldots, NS-1\}$, where
$\rho_{is}$ is a vector of parameters capturing the effect of relationship encounters of individual $i$ on the propensity for transition from state $s$,
$a_{it}$ is a vector of time-varying covariates for individual $i$ between time $t-1$ and time $t$, and
$\mu(s')_{is}$ is the $s'$th ordered logit threshold for individual $i$ in state $s$, where $s' \in \{1, \ldots, NS-1\}$.

Note that the marginal effects of the time-varying covariates in the transition matrix are state specific, thus allowing for different impacts of the time-varying covariates depending on the customer's state.

3.2.2. The Initial State Distribution. For an HMM with a time-homogeneous transition matrix, the initial state distribution is commonly defined as the stationary distribution of the transition matrix (MacDonald and Zucchini 1997). However, because our transition matrix is a function of time-varying covariates, we calculate the stationary distribution of the transition matrix by solving the equation $\pi_i = \pi_i Q_i$, under the constraint $\sum_{s=1}^{NS} \pi_{is} = 1$, where $Q_i$ is the transition matrix with the parameter estimates following §3.2.1 and all covariates are set to their mean value across individuals and time periods.⁴ Generally, the stationarity conditions above do not guarantee existence or uniqueness of a stationary distribution. In our empirical application, all the estimated individual transition probabilities were strictly positive, confirming that the transition matrices are aperiodic and irreducible, thus guaranteeing the existence and uniqueness of the stationary distribution.

⁴ An alternative approach would be to take the stationary distribution of the transition matrix with the covariates set to zero. For the empirical application in §4, the two approaches yielded very similar results. In general, one should use the stationary distribution of the transition matrix with the covariates set to zero if the data set is not left truncated (i.e., we observe the initial interaction between the customer and the firm), and the stationary distribution of the transition matrix at the mean of the covariates otherwise.

3.2.3. The State-Dependent Choice. Given the customer's state, the customer choices are assumed to be conditionally independent. Thus, given relationship state $s$, we model the probability of a dichotomous choice following the well-known binary logit model,⁵

$$
m_{it|s} = \frac{\exp(\tilde{\beta}_{0s} + x_{it}'\beta_s)}{1 + \exp(\tilde{\beta}_{0s} + x_{it}'\beta_s)}, \quad s = 1, \ldots, NS, \tag{5}
$$

where
$\tilde{\beta}_{0s}$ is the state-specific coefficient for state $s$,
$x_{it}$ is a vector of time-varying covariates associated with the choice of individual $i$ at time $t$, and
$\beta_s$ is a vector of state-specific response coefficients.
The full vector of conditional choice probabilities is $m_{it} = (m_{it|1}, m_{it|2}, \ldots, m_{it|NS})$.

The difference between the vector of covariates to be included in the transition matrix ($a_{it}$) and the vector of covariates in the state-dependent choice ($x_{it}$) is noteworthy. The conceptual distinction is between those variables that have an enduring impact on the attitude of the customer toward the product or service and those that affect only the short-term choice behavior. For example, advertising is often assumed to have a long-term impact on attitude while a price promotion affects only the short-term choice behavior. In the transition matrix, one should include covariates that are hypothesized to have an enduring impact on the customer's buying behavior (e.g., advertisement, service encounters, or relationship-based marketing activities). On the other hand, the covariates in the state-dependent choice vector are assumed to primarily have an immediate effect on the customer (e.g., price and display promotions).⁶ The researcher could also test empirically (e.g., using fit measures) the appropriate location for each covariate in the model.

⁵ In the context of alumni university donations, we model the state-dependent and dichotomous choices rather than donation amounts because we believe that the act of choice is a stronger determinant of relationship strength than quantity measures. Nevertheless, the model could be extended to capture quantity or amount outcomes using Poisson or Tobit models.

⁶ The covariate in the state-dependent choice vector could still have an indirect long-term effect through the longitudinal effect of the current choice on future buying behavior.

To ensure identification of the states, we restrict the choice probabilities to be nondecreasing in the relationship states. Because both the intercepts and the
response parameters are state-specific, we impose this restriction at the mean of the vector of covariates, \(x_{it}\). Thus, the vector \(x_{it}\) is mean-centered and the restriction \(\tilde\beta_{01} \le \tilde\beta_{02} \le \cdots \le \tilde\beta_{0NS}\) is imposed by

\[ \tilde\beta_{0s} = \beta_{01} + \sum_{s'=2}^{s} \exp(\beta_{0s'}), \quad s = 2, \ldots, NS. \tag{6} \]

3.2.4. Accounting for Heterogeneity. Because our model is dynamic, one must ensure that the zero-order heterogeneity is fully accounted for to distinguish it from time dynamics. Heckman (1981) suggests that ignoring heterogeneity might lead to a strong spurious state dependence, even when the actual choices are not correlated over time. Similarly, a model that accounts for heterogeneity but ignores state dependence may overestimate the degree of heterogeneity (Keane 1997). Our proposed HMM addresses the second problem by offering a flexible specification of state dependence. To distinguish between heterogeneity and dynamics, we define random-effect parameters in the transition matrix \(Q_{i,t-1\to t}\). We incorporate heterogeneity by allowing the threshold parameters in Equations (2)–(4) to vary across individuals.⁷ This specification allows for heterogeneity in the stickiness to different states, since the distance between the low and high thresholds could vary across individuals. Heterogeneity in the transition matrix also implies heterogeneity in the initial state distribution. An alternative heterogeneity specification would be to allow the state-dependent choice to vary across individuals. However, from a managerial point of view, such heterogeneity specification implies individual-specific state interpretation, thus losing the ability to classify the customers into a common set of states.

⁷ To keep the model parsimonious we avoid a full random-effect specification of the transition matrix \(Q_{i,t-1\to t}\).

3.3. The Likelihood of an Observed Sequence of Choices
Due to the Markovian structure of the model, the individual choice probabilities are correlated through the common underlying path of the hidden states. Accordingly, the joint likelihood of a sequence of choices is given by the sum over all possible routes the individual could take over time between the underlying states:

\[ P_i(Y_{i1} = y_{i1}, \ldots, Y_{iT} = y_{iT}) = \sum_{s_1=1}^{NS} \sum_{s_2=1}^{NS} \cdots \sum_{s_T=1}^{NS} \Big[ P(S_{i1} = s_1) \prod_{\tau=2}^{T} P(S_{i\tau} = s_\tau \mid S_{i\tau-1} = s_{\tau-1}) \Big] \times \prod_{\tau=1}^{T} P(Y_{i\tau} = y_{i\tau} \mid S_{i\tau} = s_\tau). \tag{7} \]

Using our notations for the three components of the HMM in Equations (1)–(5), we can rewrite Equation (7) as

\[ P_i(Y_{i1} = y_{i1}, \ldots, Y_{iT} = y_{iT}) = \sum_{s_1=1}^{NS} \sum_{s_2=1}^{NS} \cdots \sum_{s_T=1}^{NS} \Big[ \pi_{is_1} \prod_{\tau=2}^{T} q_{i s_\tau s_{\tau-1}} \Big] \times \prod_{\tau=1}^{T} m_{i\tau|s_\tau}^{y_{i\tau}} (1 - m_{i\tau|s_\tau})^{1-y_{i\tau}}. \tag{7a} \]

A problem with Equation (7a) is that it has \(NS^{T}\) elements and is therefore computationally intractable for even modest values of \(T\). Following MacDonald and Zucchini (1997), we can rewrite Equation (7a) in a matrix products form that simplifies computation:

\[ L_{iT} = P(Y_{i1} = y_{i1}, \ldots, Y_{iT} = y_{iT}) = \pi_i \tilde m_{i1} Q_{i,1\to 2} \tilde m_{i2} \cdots Q_{i,T-1\to T} \tilde m_{iT} \mathbf{1}, \tag{8} \]

where \(\tilde m_{it|s} = m_{it|s}^{y_{it}} (1 - m_{it|s})^{1-y_{it}}\), \(\tilde m_{it}\) is an \(NS \times NS\) diagonal matrix with the elements of \(\tilde m_{it|s}\) on the diagonal, and \(\mathbf{1}\) is an \(NS \times 1\) vector of ones. To avoid underflow of the likelihood function in Equation (8) we divide the joint state likelihood after every time period by \(L_{it}/NS\), accumulate the logarithms of these scale factors, and add them to the logarithm of the likelihood function (for details see MacDonald and Zucchini 1997, p. 79). The scaled log-likelihood function across individuals is simply the sum of the individual scaled log-likelihoods over \(i = 1, \ldots, N\).

3.4. Estimation Procedure
In this section, we describe the procedure used to estimate our model. In choosing the estimation procedure, we focus on properly accounting for observed and unobserved heterogeneity. We estimate the HMM parameters: the transition matrix parameters and the state-dependent choice parameters described in Equations (2)–(6) using the joint likelihood function in Equations (7)–(8).

We estimate our HMM using a standard hierarchical Bayes estimation procedure (Rossi and Allenby 2003) using two sets of parameters: random-effect parameters and parameters that are common across individuals. The random-effect parameters are the individual-specific transition thresholds for the states \(s = 1, \ldots, NS\). The common parameters comprise three sets of state-specific parameters, each indexed by \(s = 1, \ldots, NS\): the covariate coefficients in the transition matrix, the state-dependent choice intercepts, and the state-dependent response coefficients. Heterogeneity is introduced into the model for the random-effect parameters by defining uninformative multivariate normal (MVN) priors. We also estimate a model in which the observed individual characteristics, such as demographics, are introduced into the model in a hierarchical manner (e.g., Allenby and Ginter 1995). We complete the specification by assuming appropriate
and diffuse priors on the population-level parameters. We sequentially draw from the set of conditional posterior distributions.⁸ The conditional posterior distributions of the random-effect and common parameters do not have a closed form. Thus, we use the Metropolis-Hastings algorithm to draw from these posterior distributions. To reduce the degree of autocorrelation between draws of the Metropolis-Hastings algorithm and to improve the mixing of the MCMC we use an adaptive Metropolis adjusted Langevin algorithm (Atchadé 2006). We found this recent approach to be very useful in our HMM application.

⁸ The complete set of conditional distributions and the iteration sequence of the MCMC simulation appear in Appendix A.

It should be noted that unlike the HMM Bayesian estimation methods that augment the latent state memberships (Djuric and Chun 2002, Kim and Nelson 1999, Moon et al. 2007, Scott 2002), we use a Bayesian approach to estimate directly the likelihood function in Equation (8) with random-effect parameters.

3.5. Recovering the State Membership Distribution
An attractive feature of the HMM is the ability to use it to probabilistically recover the individual's state at any given time period. This measure could be directly derived from the likelihood function in Equation (8). Two approaches have been suggested for recovering the state membership distribution: filtering and smoothing (see Hamilton 1989 for a discussion). Filtering uses only the information known up to time t to recover the individual's state at time t, whereas smoothing uses the full information available in the data. The filtering approach is more appealing for marketing applications, where decisions are made based only on the history of the observed behavior. The filtering probability that individual i is in state s at time t conditioned on the individual's history of choices is given by

\[ P(S_{it} = s \mid Y_{i1}, Y_{i2}, \ldots, Y_{it}) = \pi_i \tilde m_{i1} Q_{i,1\to 2} \tilde m_{i2} \cdots Q_{i,t-1\to t}^{\cdot s}\, \tilde m_{it|s} / L_{it}, \tag{9} \]

where \(Q_{i,t-1\to t}^{\cdot s}\) is the sth column of the transition matrix \(Q_{i,t-1\to t}\), and \(L_{it}\) is the likelihood of the observed sequence of choices up to time t from Equation (8).

4. Empirical Application
In this section, we describe the empirical application of the proposed HMM in the context of the relationship between alumni and their alma mater. We describe, in order, the data set, the alternative models estimated, the estimation results, the comparison of the alternative models in terms of their prediction ability on a hold-out sample, and a survey-based analysis of the behavioral dimensions underlying the alumni-university relationship states.

4.1. Application to Alumni Relations
To empirically illustrate the ability of the proposed model to capture the dynamics in customer relationships and choice behavior, we apply the proposed HMM in the context of university-alumni relations and gift-giving (i.e., donation) behavior. The objective of the empirical application is to show how one can use observed alumni gift-giving data to (1) dynamically classify the alumni into relationship strength states, (2) understand the factors that influence the dynamics in gift-giving behavior (i.e., assess what alumni-university interactions are likely to move alumni to a higher relationship state), and (3) predict future gift-giving behavior.

There are several reasons for choosing the alumni gift-giving data as an empirical application for our model. First, we believe that the dynamics in the gift-giving behavior, due to the strong relationship underlying its construct, are stronger than the dynamics found in typical scanner panel data. Second, this data set contains most of the components suggested for a good CRM data set. Specifically, it includes four out of the five elements suggested by Winer (2001): transaction data, customer contacts, descriptive information about the customers, and longitudinal data. As discussed further below, this data set is somewhat limited in terms of tracking exposure and response to marketing activities. As a result, we use other time-varying covariates that may affect the alumni state such as reunion attendance. Finally, in the sagging economy, private and public schools face severe financial problems. Charitable contributions to universities dropped in 2002 for the first time in 15 years (New York Times 2003). Therefore, addressing the problem of managing the $24 billion market of U.S. alumni fundraising is of significant financial consequence.

Previous research on alumni gift-giving behavior is limited. The few articles that have been published in this area investigated the effect on alumni gift giving of (1) institutional characteristics (Baade and Sundberg 1996, Harrison et al. 1995), (2) reunions (Willemain et al. 1994), and (3) individual characteristics such as demographics, financial aid, and participation in college athletics (Okunade 1993, Okunade and Berl 1997, Taylor and Martin 1995). Recently, in the marketing literature, Arnett et al. (2003) related alumni-university relations to the behavioral identity salience model. To our knowledge, no study has previously investigated and modeled the dynamics of gift-giving behavior and the factors that can alter this dynamic behavior.
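The scaled matrix-product likelihood of Equation (8) and the filtering probabilities of Equation (9) in §3.3 and §3.5 can be illustrated with a minimal sketch. It is not the authors' GAUSS implementation: the function names, the input arrays, and all numerical values are hypothetical, and the recursion rescales the forward vector to sum to one each period, a common variant of the scaling by \(L_{it}/NS\) described in §3.3. A brute-force sum over all state paths, as in Equation (7a), is included only to check the recursion on a very short sequence.

```python
import itertools
import numpy as np


def forward_filter(pi, Q_seq, m_seq, y):
    """Scaled forward recursion for one individual (cf. Equations (8) and (9)).

    pi    : (NS,) initial state distribution
    Q_seq : (T-1, NS, NS); Q_seq[t-1][r, s] = P(state s at t | state r at t-1)
    m_seq : (T, NS); m_seq[t][s] = P(Y_t = 1 | state s)
    y     : (T,) observed 0/1 choices
    Returns the log-likelihood and a (T, NS) array of filtering probabilities.
    """
    T, NS = m_seq.shape
    filtered = np.empty((T, NS))
    log_lik = 0.0
    alpha = np.asarray(pi, dtype=float)
    for t in range(T):
        if t > 0:
            alpha = alpha @ Q_seq[t - 1]      # move one period ahead
        m_tilde = m_seq[t] if y[t] == 1 else 1.0 - m_seq[t]
        alpha = alpha * m_tilde               # weight by the observed choice
        scale = alpha.sum()                   # P(y_t | y_1, ..., y_{t-1})
        log_lik += np.log(scale)              # accumulate the log scale factors
        alpha = alpha / scale                 # rescale to avoid underflow
        filtered[t] = alpha                   # P(S_t = s | y_1, ..., y_t)
    return log_lik, filtered


def brute_force_likelihood(pi, Q_seq, m_seq, y):
    """Sum over all NS**T state paths (cf. Equation (7a)); feasible only for tiny T."""
    T, NS = m_seq.shape
    total = 0.0
    for path in itertools.product(range(NS), repeat=T):
        p = pi[path[0]]
        for t in range(1, T):
            p *= Q_seq[t - 1][path[t - 1], path[t]]
        for t in range(T):
            m = m_seq[t][path[t]]
            p *= m if y[t] == 1 else 1.0 - m
        total += p
    return total


# Hypothetical inputs: 3 states, 4 periods, the same transition matrix each period.
Q = np.array([[0.90, 0.10, 0.00],
              [0.15, 0.55, 0.30],
              [0.05, 0.30, 0.65]])
Q_seq = np.stack([Q] * 3)
m_seq = np.tile([0.05, 0.45, 0.95], (4, 1))
pi = np.array([0.5, 0.3, 0.2])
y = np.array([0, 0, 1, 1])

log_lik, filtered = forward_filter(pi, Q_seq, m_seq, y)
print(log_lik, np.log(brute_force_likelihood(pi, Q_seq, m_seq, y)))
print(filtered[-1])  # filtered state membership after the last observed choice
```

The recursion costs on the order of \(T \cdot NS^2\) operations per individual, whereas the path enumeration grows as \(NS^{T}\), which is the tractability argument made in §3.3. The last row of `filtered` is the quantity a manager would use under the filtering approach, since it conditions only on the choices observed so far.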
4.2. Data Description
The data used to calibrate and validate the model are sampled from the database provided by the alumni association of a large west coast university. Our data set consists of over 17,000 randomly sampled alumni. This represents 10% of the total university alumni base (see Appendix B in Netzer 2004 for a detailed description of the data set). From this data set, we use in our analysis 1,256 alumni who graduated with an undergraduate degree⁹ between 1966 and 1988 and donated at least once in the first 10 years following graduation (see Table 1 for descriptive statistics of the sample).

⁹ Following the alumni association's recommendation, we sampled only alumni who received their undergraduate degree (possibly followed by higher degrees) from the university.

Table 1    Descriptive Statistics of the Calibration and Hold-Out Sample

Key characteristics                           Percentage/Average
Overall observations (gift opportunities)     27,526
Overall number of alumni                      1,256
Mean observations per individual              21.9
Proportion of donation years                  43%
Mean donation                                 $478
Q25 yearly donation                           $30
Median yearly donation                        $100
Q75 yearly donation                           $200
Gender
  Female                                      40%
  Male                                        60%
School
  Earth                                       3%
  Humanities                                  79%
  Engineering                                 15%
  Other (e.g., Undeclared)                    3%
Degree
  Undergraduate                               73%
  Undergraduate + Graduate                    27%
Spouse is university alumna/alumnus
  Yes                                         56%
  No                                          44%
Alumni association membership
  Member                                      69%
  Not member                                  31%

For each alumna/alumnus the data provide cumulative information on total gift giving since the time of graduation, as well as detailed disaggregate data about his/her gift giving since 1976 (or time of graduation, whichever is more recent). The data set also contains disaggregate information about different alumni-university interactions for the years 1976–2001, including participation in university events and volunteering for alumni roles. We use the observations in the years 1976–1998 to calibrate the model, and the last three years of possible gift giving for each alumna/alumnus (1999–2001) for validation.

A necessary condition for the identification of the proposed HMM is dynamics in donation behavior over time. To examine whether such dynamics exist in our data we used the Run Test (Frank 1962). The Run Test strongly supports the existence of individual-level dynamics in the donation behavior; in particular, it suggests that alumni go through periods of donation and nondonation (for the full description of the Run Test results see Netzer 2004).

4.3. Variables Description
The variables of this data set can be divided into three categories:
1. Alumni-university interactions: This set of variables defines the interactions between the alumni and the university in the vector \(a_{it}\) in Equations (2)–(4). These are recorded over time post graduation. In this study, we consider two types of interactions (besides donations): reunion attendance and volunteering for a university role. It should be noted that our modeling approach assumes that these alumni-university interactions occur, and are observed, prior to donation. This assumption seems to be valid for these two variables, because donations generally occur post the reunion attendance and volunteering decisions.
2. Influence attempts: These alumni-university interactions are postulated to have mainly short-term effects. These covariates determine the vector \(x_{it}\) in Equation (5). In the scanner data context, price promotion and display are examples of activities with short-term influence. In the current application, we use reunion year (any multiple of five years since graduation) as such a covariate. We consider a reunion year as a short-term influence to donate due to increased salience of the university during that year, which might raise the likelihood of giving in the specific year but not in subsequent years. However, it is actual participation in a reunion that might lead to a change in the strength of relationship between the alumna/alumnus and the university and to a long-term impact on subsequent donation behavior. Accordingly, reunion year is included in the state-dependent choice (the vector \(x_{it}\)), and participation in a reunion in the transition matrix (the vector \(a_{it}\)).
3. Choice behavior: Gift-giving behavior. This is the dependent variable, which is captured by the incidence of donation (0 or 1). If one is interested in merely predicting donations, this variable could be replaced by the actual amount donated using an ordered logit or a Tobit model. However, in the context of a relationship dynamics model, the alumni association considers the actual act of giving as a stronger determinant of relationship than the amount given. According to the alumni association, we aggregated gift-giving and events data by calendar year. To account for time dynamics that are not related to the
relational marketing encounters, a linear time trend, representing the number of years since graduation, is included in the state-dependent vector. Additionally, the data set contains alumni characteristics such as year and major of graduation, gender, marriage to a university alumna/alumnus, and membership in the alumni association.

4.4. Estimated Models
We estimated the HMM of customer relationship dynamics described in §3, using the MCMC hierarchical Bayes estimation described above. Additionally, we estimated a restricted version of this model, with no heterogeneity, as well as a nondynamic benchmark model and the state dependent recency-frequency model commonly used in relationship marketing.

Model 1: This is the full HMM.¹⁰
Model 2: HMM with no heterogeneity. This model is similar to Model 1, but with common parameters across individuals. A limitation of this model is that it cannot distinguish between heterogeneity and dynamics in the gift-giving behavior.
Model 3: Nondynamic model. This is the latent class model (Kamakura and Russell 1989) with the same number of latent states as in Model 1. This model does not allow individuals to move between the states over time.
Model 4: Recency-frequency model. This model is frequently used in relationship management applications (e.g., Bult and Wansbeek 1995). In this model, the recency since the last donation and frequency of donations up to time t enter as covariates in the model.¹¹ In the recency-frequency model, the probability of choice follows the binary logit formulation:

\[ P(Y_{it} = 1) = \frac{\exp(x_{it}'\beta_i)}{1 + \exp(x_{it}'\beta_i)}, \tag{10} \]

where \(\beta_i\) is a vector of random-effect parameters, and \(x_{it}\) is a vector of time-varying covariates.

The vector \(x_{it}\) includes all the covariates used in the HMM (reunion years, reunion participation, volunteering to university roles and years since graduation) as well as recency since last donation and donation frequency. We incorporate heterogeneity in this model by estimating a random-effect intercept and the recency-frequency parameters using hierarchical Bayes estimation.

¹⁰ In addition to the full HMM, we also estimated a hierarchical HMM in which the random-effect coefficients are a function of individual characteristics (see Table 6).

¹¹ We do not model the monetary donation amounts to be consistent with the proposed HMM. Moreover, adding the running average of donation amounts as a predictor did not significantly improve the model's fit or prediction for donation incidents.

The distinction between the HMMs and the recency-frequency model is noteworthy. The recency-frequency model accounts for dynamics in an ad-hoc fashion in which the structure of the effect of past choices on the current choice is defined a priori through the recency and frequency terms. The HMM offers a more behaviorally structured approach for state dependence through a Markovian transition between a set of relationship states. Furthermore, although some structure is imposed on the transitions between the states, the choice of the number of states allows us to determine the structure of dynamics based on the complexity of the dynamics of the data at hand.

4.5. Estimation Results
Models 2 and 3 were estimated using the maximum likelihood procedure MAXLIK in the GAUSS statistical software. Models 1 and 4 were estimated using an MCMC hierarchical Bayes procedure, using the Gibbs sampler and the Metropolis-Hastings algorithm coded in GAUSS. In the hierarchical Bayes estimation, the first 90,000 iterations were used as a burn-in period, and the last 10,000 iterations were used to estimate the conditional posterior distributions and moments. To assess convergence of the MCMC we adopt the method proposed by Gelman and Rubin (1992), which compares the within to between variance for each parameter estimated across multiple chains. Across three parallel chains, the scale reduction estimate for all the parameters estimated is lower than 1.2, suggesting that convergence has been achieved.

¹² The Schwarz Bayesian information criterion (BIC; Schwarz 1978) commonly used for model selection in classic statistical applications, asymptotically approximates the Bayesian posterior marginal density.

¹³ The MSC was originally developed for a stationary, aggregate data HMM. We adapted this criterion to our random-effect nonstationary HMM. The details of the modified MSC appear in Appendix B.

4.5.1. Selecting the Number of States. The first stage in estimating the HMM is selecting the number of states. Model selection measures can be used to choose the number of states. Due to the sensitivity of some Bayesian model selection criteria, such as the marginal log-likelihood and Bayes factor, to the specified priors (Rossi and Allenby 2003), we compare alternative model selection criteria. Specifically, we contrast the Bayes factor with the log-marginal density,¹² the marginal validation log-likelihood measure (Andrews and Currim 2003), the deviance information criterion (DIC) proposed by Spiegelhalter et al. (2002), and the Markov switching criterion (MSC), recently developed for selecting states and variables in HMMs by Smith et al. (2005).¹³ The validation log-likelihood is calculated following Equation (8), using
the hold-out sample (the years 1999–2001 in the data set). The log marginal density and Bayes factor are calculated using a harmonic mean of the individual likelihoods across iterations (Newton and Raftery 1994) from the output of the Metropolis-Hastings sampler.

Based on all measures, the best-fitting model is the model with three states. This model minimizes the −2 log-marginal density, validation −2 log-likelihood, DIC and MSC, and shows a favorable Bayes factor¹⁴ (see Table 2).

¹⁴ The log Bayes factor compares the model with NS states to the model with NS − 1 states. A log Bayes factor larger than five suggests strong evidence in favor of the model with NS states.

Table 2    Choosing the Number of States

Number       −2 Marginal     Log Bayes                                  Validation
of states    log-density     factor       DIC          MSC          −2 log-likelihood
1            26,708.5                     26,493.0     50,476.5     4,768.5
2            25,557.1        575.7        25,378.8     48,806.6     3,689.1
3            25,397.1        80.0         25,176.3     48,332.8     3,583.6
4            25,689.1        −146.2       25,658.2     48,639.5     3,717.5

4.5.2. HMM Estimates. Table 3 reports the posterior means and posterior standard deviations (parameter estimates and standard errors for Model 2) of the two variants of the HMM based on the calibration sample. In the heterogeneous HMM, the parameters that capture the effect of reunion attendance, reunion participation and volunteering are all positive as expected. With the exception of state 1, likelihood of donation is increasing with years since graduation. To get a better understanding of the magnitude of the parameter estimates in Table 3 we plug these parameters into Equations (2)–(6) to get the state dependent choice and transition probabilities.

Table 3    Estimation Results for the Hidden Markov Models

                                              Model 1            Model 2
                                              HMM with           HMM no
Parameter                                     heterogeneity      heterogeneity
β01 state dependent intercept (State 1)       2.778 (0.297)      2.659 (0.092)
β02 state dependent intercept (State 2)       0.956 (0.244)      1.076 (0.038)
β03 state dependent intercept (State 3)       2.388 (0.432)      2.578 (0.225)
Reunion year (State 1)                        0.646 (0.375)      0.592 (0.128)
Reunion year (State 2)                        0.017 (0.300)      0.048 (0.070)
Reunion year (State 3)                        0.359 (1.397)      0.950 (0.128)
Years since graduation (State 1)              0.071 (0.099)      0.056 (0.009)
Years since graduation (State 2)              0.053 (0.099)      0.062 (0.008)
Years since graduation (State 3)              0.992 (0.458)      1.247 (0.287)
High threshold (State 1)                      2.159 (0.132)      2.507 (0.078)
Low threshold (State 2)                       1.807 (0.177)      2.141 (0.078)
High threshold (State 2)                      1.005 (0.140)      1.500 (0.037)
Low threshold (State 3)                       3.461 (0.129)      9.490 (8.214)
High threshold (State 3)                      1.001 (0.137)      2.085 (1.021)
V(high threshold, State 1)                    0.200 (0.080)
V(low threshold, State 2)                     0.672 (0.227)
V(high threshold, State 2)                    0.232 (0.076)
V(low threshold, State 3)                     0.168 (0.103)
V(high threshold, State 3)                    0.212 (0.054)
Volunteering (State 1)                        1.604 (0.523)      1.479 (0.230)
Volunteering (State 2)                        0.092 (0.534)      0.049 (0.190)
Volunteering (State 3)                        0.969 (0.517)      1.080 (0.260)
Reunion participation (State 1)               2.895 (0.621)      2.526 (0.263)
Reunion participation (State 2)               1.044 (0.734)      0.959 (0.959)
Reunion participation (State 3)               0.779 (0.861)      0.181 (0.677)
−2 log-marginal density/−2 log-likelihood     25,397.1           25,565.9

Numbers in parentheses are posterior standard deviations for Model 1 and standard errors for Model 2.
V(·) refers to the posterior standard deviation across individuals.

The interpretation of the three states is primarily determined by the state-specific intrinsic propensity to donate (the parameters β01, β02, and β03). At the mean of the covariates years since graduation and reunion year, the conditional probability of donation given state 1 is 6%, given state 2 it is 46%, and given state 3 it is 100%. Accordingly, we label these three states as dormant, occasional, and active states, respectively.

An interesting feature of our model is the ability to investigate the effects of time-varying covariates on the transitions between the states. Specifically, we compare the effect of reunion attendance and volunteering to university roles. The middle and right matrices in Table 4 demonstrate the average effect (across alumni) of reunion attendance and volunteering on the state transitions. Attending a reunion increases the likelihood that an alumna/alumnus in the dormant state moves to the occasional state from 10% to 68%. Reunion attendance also has a strong impact on moving alumni from the occasional to the active state; it increases this likelihood from 28% to 53%. More importantly, reunion attendance decreases the likelihood of dropping from the occasional state to the sticky dormant state from 14% to only 5%. In contrast, the effect of reunion participation on keeping alumni in the active state is moderate (i.e., from 68% to 82%). Volunteering to university roles, on the other hand, has its primary impact on alumni in the dormant and active states, whereas the effect on alumni in the occasional state is minimal. While targeting the most active customers is consistent with the customer lifetime value approach and with the practice of rewarding loyal customers, the result that reunion attendance has the highest impact on alumni in the dormant and occasional state is consistent with the theory of intermittent reinforcement. Due to the probabilistic nature of the transitions, active alumni are likely to transition into the occasional state once in
several time periods and therefore should be affected by reunion attendance.

Table 4    The Mean Posterior Transition Matrices

No interactions
State at t−1     Dormant at t       Occasional at t     Active at t
Dormant          90% [89%–90%]      10% [10%–11%]       0% [–]
Occasional       14% [14%–15%]      58% [56%–59%]       28% [27%–30%]
Active           3% [3%–3%]         29% [28%–31%]       68% [66%–69%]

Reunion attendance
State at t−1     Dormant at t       Occasional at t     Active at t
Dormant          32% [17%–48%]      68% [52%–82%]       0% [–]
Occasional       5% [1%–12%]        42% [17%–56%]       53% [31%–80%]
Active           1% [0%–4%]         17% [2%–35%]        82% [61%–97%]

Volunteering
State at t−1     Dormant at t       Occasional at t     Active at t
Dormant          64% [49%–74%]      36% [25%–51%]       0% [–]
Occasional       13% [7%–21%]       57% [47%–60%]       30% [20%–45%]
Active           1% [0%–2%]         14% [9%–21%]        85% [77%–90%]

95% confidence interval in brackets.

It is important to note that the results presented in Table 4 are the mean of the posterior distribution across individuals. Inference from our estimation could and should be derived at the individual level in a similar manner.

The time-varying covariates reunion attendance and volunteering are endogenous to the alumna/alumnus, thus one should be careful about treating these as decision variables in the context of our data set. Nevertheless, given the observed behavior, one could still use these alumni-university interactions to dynamically segment alumni to the relationship states. Table 4 demonstrates the value of adding time-varying covariates such as customer initiated interaction or marketing interventions. Future research could explore the application of our model to data sets in which the time-varying covariates are also decision variables.

The transition probabilities in Table 4 suggest that reunion attendance might have an enduring impact on the donation behavior of alumni in the occasional state, by moving them away from the sticky dormant state and by increasing the likelihood of a transition to the relatively sticky active state. We use the posterior mean of the transition matrix parameters to gain better understanding of the long-term effect of the time-varying covariates in the transition matrix. Specifically, we calculate the effect of reunion attendance and volunteering over the course of 20 years for an alumna/alumnus with an average initial state distribution at time 0 (average donation rate of 41%) and attended a reunion and/or volunteered to a university role at year 1.

As is evident from Figure 2, both time-varying covariates have a long-term impact on the propensity to donate. The immediate as well as long-term effects of reunion attendance are stronger than the effects of volunteering to a university role. Five years after the reunion attendance or volunteering year, approximately 50% of the effect of these events still carries over.

Figure 2    The Long-Term Effects of the Time-Varying Covariates
[Line chart: probability of donation (%), on a scale from 30 to 70, against years 0 to 20, for four series: reunion attendance, volunteering, reunion attendance + volunteering, and baseline.]

One interesting product of the transition matrices in Table 4 is the stationary distribution of the transition matrix at the mean of the covariates, which is
also our initial state distribution. The stationary distribution of the alumni in the three states is 46%, 29%, and 25% in the dormant, occasional, and active states, respectively.

In the HMM, following each choice, the model updates the state membership distribution. This allows estimating the effect of a donation at time t on the probability of being in state s at time t and therefore on the probability of donation at time t + 1 (see Table 5). Using Equation (11) we calculated the state membership probability at time t following a donation at time t.

$$P(S_{it} = s \mid Y_{it} = 1, S_{it-1}) = \frac{\Pr(S_{it-1})\, Q_{i,t-1 \to t,s}\, m_{it|s}}{\Pr(S_{it-1})\, Q_{i,t-1 \to t}\, m_{it}\, \mathbf{1}'} \qquad (11)$$

where
$\Pr(S_{it-1})$ is individual i's state membership distribution at time t − 1,
$Q_{i,t-1 \to t,s}$ is the sth column of the transition matrix $Q_{i,t-1 \to t}$, and
$m_{it}$ is an NS × NS diagonal matrix with the elements of $m_{it|s}$ on the diagonal.

As expected, donations have a very strong impact on the transition probabilities.

Table 5 Mean Posterior Transition Following a Donation at Time t

                        State at t (donation at t)
State at t − 1    Dormant        Occasional     Active
Dormant           53%            47%            0%
                  [47%–58%]      [42%–53%]      [–]
Occasional        1%             48%            51%
                  [1%–2%]        [43%–51%]      [47%–55%]
Active            0%             16%            84%
                  [0%–0%]        [14%–18%]      [71%–86%]

The transition matrices in Tables 4 and 5 present the mean of the posterior distribution across alumni. Next, we describe the heterogeneity in these matrices across alumni.

4.5.3. Posterior Distributions and Observed Heterogeneity. In our HMM, alumni can differ with respect to their propensity to switch between states due to the random-effect transition threshold parameters. Figure 3 depicts the distribution of the alumni's posterior propensity to stay in each one of the states, given that they were in that state in the previous period (the diagonal elements of the first transition matrix in Table 4). Not only is the dormant state the most sticky on average, as suggested by Table 4, but also alumni in this state are most homogeneous in terms of their likelihood of staying in this state. On the other hand, alumni in the occasional and active states are relatively heterogeneous in terms of their propensity to stay in these states.

We relate the random-effect parameters to observed heterogeneity using a hierarchical Bayes structure (see Appendix A for details). Table 6 presents the parameter estimates of the observed heterogeneity covariates. Several individual characteristics are significantly related to the propensity for transition between the states. Specifically, membership in the alumni association increases the likelihood of transitioning up from the dormant and occasional states. Females and alumni families in which both household members are alumni tend to avoid falling into dormancy from the occasional state but, when in the active state, tend either to stay active or to drop to dormancy. Furthermore, graduates of the earth sciences, humanities, and engineering schools are less likely to transition to the active state relative to the other majors.

Figure 3 Posterior Distribution of the Propensity to Stay in the Three States
[Histogram; series: Dormant state, Occasional state, Active state; horizontal axis: propensity to stay (%), 0 to 100; vertical axis: Frequency]

Table 6 Mean Posteriors for Observed Heterogeneity Parameters

                                       hi1          lo2          hi2          03             lo3
                                       High         Low          High         Drop to        Low
                                       threshold,   threshold,   threshold,   dormant        threshold,
Variable                               dormant      occasional   occasional   threshold,     active
                                                                              active
Intercept                              3.320        3.622        1.261        1.720          0.603
Alumni association member              0.466        0.177        0.260        0.277          0.243
Spouse is university alumna/alumnus    0.687        0.362        0.109        0.413          0.278
Years since graduation in 1976         0.085        0.047        0.007        0.081          0.018
Female                                 0.180        0.523        0.204        0.722          0.401
Earth sciences major                   0.030        8.641        3.351        0.150          0.102
Humanities major                       0.023        2.660        2.429        1.678          0.052
Engineering major                      0.076        1.587        2.609        0.494          0.567
Only undergraduate degree              0.084        0.300        0.113        0.193          0.335
  from the university

The 90% confidence interval does not include zero.
The 95% confidence interval does not include zero.

4.6. Predictive Ability
We use the hold-out data to assess the prediction ability of the HMM and compare it to the four benchmark models.¹⁵ The parameters estimated based on the calibration period are used to predict the 1,256 (alumni) × 3 (years) = 3,768 observations of possible gift giving in the validation period. We compare the prediction ability of the alternative models using the hit rate measures¹⁶ (overall hit rates as well as hits and misses of donation and nondonation periods), the root-mean-square prediction error (RMSPE) between the predicted choice probabilities and the actual choices across alumni and time periods, and the validation log-likelihood (see Table 7).

¹⁵ The parameter estimates of the recency-frequency model and the latent class model can be obtained from the authors.
¹⁶ To estimate hit rates, we predicted a donation if the predicted probability of donation is larger than 50%.

Table 7 Fit and Predictive Ability Measures

                                         Model 1         Model 2         Model 3        Model 4
                                         HMM with        HMM no          nondynamic     recency-frequency
Measure                                  heterogeneity   heterogeneity   latent class   model
Calibration: −2 log marginal density/    25,397.1        25,565.8        27,399.8       26,045.1
  −2 log likelihood
Overall hit rate (%)                     77.6            77.3            67.9           72.5
Donation hit rate (%)                    78.8            78.0            67.2           67.7
Nondonation hit rate (%)                 76.5            76.6            68.5           76.9
RMSPE                                    0.392           0.393           0.448          0.426
Improvement over the random RMSPE (%)    20.7            20.6            9.4            14.0
Validation −2 log likelihood             3,583.6         3,597.0         4,484.2        4,148.6

The RMSPE of each model is compared to the RMSPE of a random choice rule. The random choice rule's RMSPE was calculated based on the aggregate donation probability in the calibration period (43.0%). The random choice rule's RMSPE is 0.495. The prediction ability of all models is significantly better than that of a random choice rule, based on the RMSPE.

The HMM predicts the hold-out choices significantly better than the nondynamic latent class model (z = 9.5; p-value < 0.001) and the dynamic, observed states, recency-frequency model (z = 5.1; p-value < 0.001). The improvement in prediction ability of the HMM relative to the alternative models is consistent across all prediction measures used. The recency-frequency model seriously under-predicts donation years, which are arguably more important to predict than nondonation years.

While the fit of the HMM with heterogeneity (Model 1) is better than that of the HMM with no heterogeneity (Model 2), the prediction ability of the parsimonious HMM with no heterogeneity is similar to that of the full HMM. This is not surprising because in Model 2 heterogeneity in the hold-out period is elicited from the history of choices in the calibration period even when the model's parameters do not vary across individuals. Consequently, in Model 2 one cannot distinguish between the effect of time dynamics and cross-individual heterogeneity.

One of the major advantages of the HMM in the context of customer relationships is the ability to dynamically segment the firm's customer base. An alternative, static segmentation approach is the latent class model (Model 3). In Table 7, the HMM segmentation provides better donation predictive validity and fit relative to the latent class model. To investigate further the difference between the dynamic and the static segmentation models, we analyzed the predictive ability of the two models for alumni for whom the two segmentation methods coincide and alumni for whom they diverge. Specifically, for the HMM we classified each alumnus to the state with the highest state membership probability in the last year of
the calibration sample. The latent class solution suggested three segments that are similar to the three HMM states in terms of donation propensity (dormant 14%, occasional 45%, active 80%). We classified alumni into these three latent class states based on the highest probability rule. Table 8 compares the ability of the HMM and the latent class models to predict the hold-out donation years of alumni who were classified to the same state/segment by the two methods and of alumni for whom the two segmentations diverged.

Table 8 Predictive Ability by Segmentation Method

                                                    Latent class     HMM hit
Segment allocation                          N       hit rates (%)    rates (%)
Dormant in HMM but not in latent class      142     42.0             77.2
Dormant in both                             393     82.4             81.9
Occasional in HMM but not in latent class   123     55.0             62.9
Occasional in both                          154     52.6             68.2
Active in HMM but not in latent class       216     58.6             76.7
Active in both                              228     85.2             85.4
Total                                       1,256   67.9             77.6

The predictive ability of the latent class model is similar to that of the HMM for the alumni who were classified to the dormant or active segment/state by both models. However, for alumni who were classified by the latent class model into a different segment than the one suggested by the HMM, the predictive ability of the latent class model was poor. Additionally, the HMM outpredicted the latent class model for alumni in the more transient occasional state. This analysis might suggest that the HMM segmentation matches the true alumni segmentation better than the static latent class model.

In summary, the previous sections demonstrated the insights one could gain from including time-varying covariates in the state transitions. The hold-out sample analysis suggests that an additional advantage of the relationship HMM is in improving the ability to predict future choices over the static and observed state models.

4.7. Behavioral and Attitudinal Dimensions
To broaden the applicability of the HMM to most common CRM data sets, the model uses only observed behavioral measures (such as choice) to elicit the relationship states. This could raise concerns that the relationship states in the HMM represent merely differences in the likelihood of choice rather than true differences in relationship reflected by attitude. It has been suggested that repeated transactions could be transformed into true relational behavior through attitudinal bonds such as commitment, identity salience, self-connection, and satisfaction (e.g., Arnett et al. 2003, Fournier 1998, Morgan and Hunt 1994). To explore the underlying attitudinal dimensions of the three relationship states, we complement the observed behavior data with survey-based data. This approach is consistent with the call of Gupta and Zeithaml (2006) for incorporating attitudinal and perceptual constructs with behavioral outcome models.

In the years 1998, 2000, and 2002, the alumni association conducted a survey that measured alumni engagement and attitude towards the university. Over 1,600 randomly sampled alumni were surveyed. The vast majority of these alumni were surveyed only in one of the three years. The questionnaire included questions about the relationship between alumni and the university in dimensions such as satisfaction, emotional connection, pride, and others. Of the total survey sample, 128 surveys matched our original data set of 17,000 alumni. For this subset of 128 alumni, we calibrated the HMM (Model 1) and calculated, using Equation (9), the alumni probability of membership in each of the three relationship states in the year of the survey.¹⁷ We then assigned each alumna/alumnus to the relationship state with the highest probability. Table 9 compares the mean responses to the relationship questions across alumni in the three relationship states.

¹⁷ Because the gift-giving data ended at the end of 2001, the relationship state in 2001 was used for the survey of 2002.

On all relationship measures, except affinity to the graduating class, the average ratings toward the university are increasing in the states from dormant to occasional to active. The last column in Table 9 presents the ANCOVA of the different relationship measures on the states membership with lagged donation as a covariate. Controlling for the lagged choice provides a more conservative estimate of the difference in the relationship measures between the states over and beyond the impact of an immediate donation.

This analysis revealed that even after controlling for the effect of typical state dependence, alumni in the active and occasional states had stronger feelings toward the university than those in the dormant state. Specifically, the difference was significant in terms of the emotional connection to the university, affinity with the graduating class, the feeling of responsibility toward the university, the perception of being valued by the university, and the likelihood of recommending the university to others. Indeed, it has been suggested in the relationship marketing literature that these dimensions of commitment, responsibility, and self-connection (Fournier 1998, Morgan
and Hunt 1994) are important factors of customer-firm relational behavior.

Table 9 Behavioral Dimensions of Relationship in the Three States

                                                                                                        ANCOVA
Question                                                          Scale    Dormant  Occasional  Active  p-values
How satisfied are you with your university experience?            1–5      4.51     4.75        4.80    0.220
How strong is your feeling about the university?                  1–5      4.31     4.50        4.60    0.098
Do you feel proud of your university degree?                      1–4      3.47     3.62        3.69    0.833
Do you feel your university experience helped shape your life?    1–4      2.89     3.24        3.43    0.069
Do you feel a strong emotional connection to the university?      1–4      2.72     3.14        3.22    0.022
Do you feel a strong responsibility to help the university?       1–4      2.28     2.66        3.03    0.015
Do you feel strong affinity with your graduating class?           1–4      1.94     2.52        2.26    0.009
How strongly would you recommend the university                   1–4      3.35     3.67        3.74    0.042
  to prospective students?
How good of a job is the university doing in                      1–4      2.74     2.93        2.96    0.223
  serving your needs as an alum?
To what extent do you feel the university values its alumni?      1–3      2.25     2.41        2.55    0.047
Do your parents/grandparents have a degree                        Yes/No   19%      18%         12%     0.109
  from this university?
Did you receive financial aid from the university as a student?   Yes/No   40%      40%         39%     0.557
Median lifetime donation (from the actual donation data)                   $100     $475        $1,382
Sample size (N = 128)                                                      64       29          35

Bold numbers reflect significant differences (at the 5% level) between the three relationship states.

These attitudinal measures give behavioral support to the projection from observed donation behavior to latent states of relationship. Furthermore, the high ratings for positive word-of-mouth among alumni in the active and occasional states suggest that alumni in these states not only have a higher propensity to donate, but are also more likely to be active on other dimensions. Indeed, Arnett et al. (2003) used actual donation and word-of-mouth to measure relationship marketing success in the context of alumni-university relationships.

5. General Discussion
In this paper, we use data on alumni gift-giving behavior to estimate a hidden Markov model of relationship dynamics. The HMM, which was estimated using a hierarchical Bayes MCMC procedure to account for observed and unobserved heterogeneity, offers several insights into the drivers of these dynamics.

The main contribution of this research is in suggesting a behaviorally grounded model that helps marketers to infer the underlying structure of relationship states. Using the model, the researcher can dynamically classify customers into the relationship states and assess the dynamic effect of alternative time-varying covariates on the transition between the relationship states and consequent buying behavior.

The empirical application to the problem of university-alumni relationships demonstrates the application of the model to a dynamic relationship problem. Examining the time-varying covariates in the transition matrix, we find that the impact of reunion attendance is the strongest for alumni in the dormant and the transitory, occasional state, while the effect of volunteering is stronger on alumni in the sticky dormant and active states. Both of these covariates have a long-term impact on the donation behavior. In terms of prediction ability, we demonstrate that, for our empirical application, the proposed model predicts future donations significantly better than the static latent class model and the dynamic, observed states, recency and frequency model. Additionally, we use the hold-out sample to demonstrate that HMM dynamic segmentation provides superior segmentation to the static latent class segmentation. More generally, the empirical application demonstrates the value of the proposed model for CRM marketers. The HMM enables marketers to use buyer behavior data to dynamically segment their customers into relationship states and estimate the evolution of customer relationships over time.

The proposed model extends the marketing literature by suggesting a Markovian framework for estimating the dynamics in customer relationships. Using the nonhomogeneous HMM, one can estimate the long-term impact of customer-firm interactions on customer relationships, an issue that has been largely neglected in the customer relationship literature. Methodologically, the proposed model extends the observed states models, such as the state-dependence and recency-frequency models, by offering a latent specification of state dependence, which incorporates the effect of time-varying covariates into the state-dependence structure. The ability to determine the number of states based on the data at hand relaxes the ad hoc specification of dynamics in the observed state models. The proposed model also extends the family of HMMs. To our knowledge, this is the first
nonhomogeneous HMM in the marketing literature that investigates the impact of time-varying covariates, such as customer-firm interactions, on the transition between the latent states. Indeed, Wedel and Kamakura (2000, p. 176) point out that the issue of nonstationarity in marketing segmentation in general, and specifically nonstationarity that could be related to time-varying covariates, has received limited attention.

The ability to incorporate time-varying covariates in the transition matrix opens the opportunity to investigate which marketing activities are most effective in building customer-firm relationships and driving actionable behavior, and to determine the optimal targeting of such marketing activities. Future research could investigate this issue using a data set that incorporates both exposure to marketing activities and observed buying behavior.

It is worthwhile to distinguish between the proposed nonhomogeneous HMM, which incorporates time-varying covariates in the transition matrix, and a nonstationary HMM (Djuric and Chun 2002), which models the transition probabilities as a function of the state duration. Because our main interest is in investigating the effect of customer-firm interactions on HMM dynamics, we adopt the first approach. However, future research could investigate the value of the additional flexibility offered by nonstationary HMMs in marketing applications.

To broaden the applicability of the proposed model to most common CRM data sets, we used only observed buying behavior to elicit the relationship states. Using survey data, we show that the relationship states, which were estimated based on observed buying behavior, are also different in terms of behavioral dimensions of relationship, such as satisfaction, emotional connection, and responsibility. Future research could use longitudinal survey data to estimate the relationship states. Estimating the dynamics in relationship states directly from attitudinal variables would be helpful in providing insight into the mediating factors of the connection between observed buying behavior and the relationship states.

To increase the external validity of the model, it would be constructive to investigate the application of this model in relationship marketing contexts other than university-alumni relationships. Some possible applications are other institutional gift giving; continuous service provider relationships that face high churn rates, such as banks and telephone carriers (Bolton 1998); institutional memberships (Thomas 2001); direct selling efforts; and dynamic brand choice problems. The HMM framework could also be extended to study alternative measures of customer relationships. For example, one could explore the connection between alternative relationship measures such as service encounters, satisfaction, and actionable behavior (Bolton 1998). Future research could also investigate an HMM with multivariate outcomes such as donation amounts, donation incidence, and participation in university events, as multiple modes of behavior that determine the customer relationship state.

To summarize, we believe that from the modeling perspective, we have provided CRM practitioners with an implementable model for evaluating customer relationships, their evolution over time, and the effect of customer-firm interactions on altering this evolution. These factors are necessary to transform a CRM system from an information system into a decision support system. This research takes a fundamental step in that direction.

Acknowledgments
The authors thank Amy Paulsen and the alumni association members for help with the data, and Michaela Draganska, Ricardo Montoya, Olivier Toubia, and participants in seminars at Carnegie Mellon University, Chicago Business School, Columbia University, Hebrew University at Jerusalem, Hong-Kong University of Science and Technology, London Business School, New York University, Northwestern University, Stanford University, The Interdisciplinary Center at Hertzelia, The Israeli Institute of Technology, University of Maryland, University of Texas at Austin, University of Texas at Dallas, and University of Toronto for helpful comments and suggestions. The authors are also grateful for the support of the Marketing Science Institute through the Alden G. Clayton Dissertation Competition.

Appendix A. Hierarchical Bayes Estimation Algorithm
The parameters in our model could be divided into two groups: (1) parameters that vary across individuals (random-effect parameters) and (2) parameters that do not vary across individuals (fixed parameters). The set of parameters in each group is determined by the model's heterogeneity specification as described in §3.2.4 and §3.4.

We denote by $\theta_i$ the set of random-effect parameters and by $\Psi$ the set of fixed parameters. For the HMM described in §3, the random-effect parameters include $\theta_i = \{\mu(s)_{i1}, \ldots, \mu(s)_{i,NS-1}\}$, $s = 1, \ldots, NS$. The vector $\Psi$ includes $\Psi = \{\rho_1, \rho_2, \ldots, \rho_{NS}, \beta_{01}, \beta_{02}, \ldots, \beta_{0NS}, \beta_1, \beta_2, \ldots, \beta_{NS}\}$.

Observed individual characteristics, such as demographics, can be introduced into the model in a hierarchical manner (Allenby and Ginter 1995). Thus, the vector of heterogeneous parameters ($\theta_i$) can be written as a function of observed and unobserved individual characteristics

$$\theta_i = \Gamma Z_i + \varepsilon_i \qquad (A1)$$

where $Z_i$ is a vector of individual characteristics for individual i, $\Gamma$ is a matrix of parameters relating the individual characteristics to the random-effect parameters ($\theta_i$), and $\varepsilon_i \sim N(0, \Sigma_\theta)$.

The MCMC procedure recursively generates draws from the conditional distribution of the model's parameters:

$$\theta_i \mid \Psi, Y_i, X_i, a_i, Z_i, \Gamma, \Sigma_\theta$$
$$\Gamma \mid \{\theta_i\}, Z, \Sigma_\theta$$
$$\Sigma_\theta \mid \{\theta_i\}, Z, \Gamma$$
$$\Psi \mid Y, X, a, \{\theta_i\}$$

where $Y_i$, $X_i$, and $a_i$ are the vectors of donations, covariates in the state-dependent choice, and covariates in the transition matrix, respectively, as described in §4.3.

(1) Generate $\theta_i$

$$f(\theta_i \mid \Psi, Y_i, X_i, a_i, Z_i, \Gamma, \Sigma_\theta) \propto N(\theta_i \mid \Gamma Z_i, \Sigma_\theta)\, L(Y_i) \propto |\Sigma_\theta|^{-1/2} \exp\left[-\tfrac{1}{2} (\theta_i - \Gamma Z_i)' \Sigma_\theta^{-1} (\theta_i - \Gamma Z_i)\right] L(Y_i) \qquad (A2)$$

where $L(Y_i)$ is the likelihood function from Equation (8). Because (A2) does not have a closed form, the Metropolis-Hastings algorithm is used to draw from the conditional distribution of $\theta_i$. The Metropolis-Hastings algorithm proceeds as follows: Let us define $\theta_i^{(k)}$ as the accepted draw of $\theta_i$ in iteration k and $\theta_i^{(k+1)}$ as the draw in iteration k + 1. Then, the sequence of draws is given by $\theta_i^{(k+1)} = \theta_i^{(k)} + \Delta\theta$, where $\Delta\theta$ is a draw from $N(0, \sigma^2 \Delta)$, and $\sigma$ and $\Delta$ are chosen adaptively to reduce the autocorrelation among the MCMC draws, with an acceptance rate of approximately 20%, following Atchadé (2006).

The probability of accepting $\theta_i^{(k+1)}$ is

$$\Pr(\text{acceptance}) = \min\left\{ \frac{\exp\left[-\tfrac{1}{2} (\theta_i^{(k+1)} - \Gamma Z_i)' \Sigma_\theta^{-1} (\theta_i^{(k+1)} - \Gamma Z_i)\right] L(Y_i \mid \theta_i^{(k+1)})}{\exp\left[-\tfrac{1}{2} (\theta_i^{(k)} - \Gamma Z_i)' \Sigma_\theta^{-1} (\theta_i^{(k)} - \Gamma Z_i)\right] L(Y_i \mid \theta_i^{(k)})},\ 1 \right\} \qquad (A3)$$

(2) Generate $\Gamma$

Define $v_\Gamma = \mathrm{vec}(\Gamma)$.

$$v_\Gamma \mid \{\theta_i\}, Z, \Sigma_\theta = MVN(u_n, V_n)$$

where
$V_n = \left[(Z'Z \otimes \Sigma_\theta^{-1}) + V_0^{-1}\right]^{-1}$,
$u_n = V_n \left[(Z' \otimes \Sigma_\theta^{-1})\, \theta^* + V_0^{-1} u_0\right]$,
$Z = (Z_1, Z_2, \ldots, Z_N)$ is an $N \times n_z$ matrix of covariates,
$\Theta = (\theta_1, \theta_2, \ldots, \theta_N)$ is an $N \times n_\theta$ matrix which stacks $\{\theta_i\}$,
$\theta^* = \mathrm{vec}(\Theta')$,
$V_0$ and $u_0$ are prior hyperparameters,
$n_\theta = \dim(\theta_i)$, and
$n_z = \dim(z_i)$.

We define diffuse priors by setting $u_0$ to an $n_\theta n_z \times 1$ vector of zeros, and $V_0 = 100\, I_{n_\theta n_z}$.

(3) Generate $\Sigma_\theta$

$$\Sigma_\theta \mid \{\theta_i\}, Z, \Gamma \sim IW_{n_\theta}\left(f_0 + N,\ G_0^{-1} + \sum_{i=1}^{N} (\theta_i - \Gamma Z_i)(\theta_i - \Gamma Z_i)'\right) \qquad (A4)$$

where $f_0$ and $G_0$ are prior hyperparameters, $f_0$ is the degrees of freedom, and $G_0$ is the scale matrix of the inverse Wishart distribution. We define diffuse priors by setting $f_0 = n_\theta + 5$, and $G_0 = I_{n_\theta}$.

(4) Generate $\Psi$

Similar to (A2), the conditional distribution of $\Psi$ can be defined by

$$f(\Psi \mid Y, X, a, \{\theta_i\}) \propto L(Y)\, N(\Psi_0, V_{\Psi_0}) \propto |V_{\Psi_0}|^{-1/2} \exp\left[-\tfrac{1}{2} (\Psi - \Psi_0)' V_{\Psi_0}^{-1} (\Psi - \Psi_0)\right] L(Y) \qquad (A5)$$

where $\Psi_0$ and $V_{\Psi_0}$ are diffuse priors, and $L(Y)$ is the likelihood function from Equation (8). Because (A5) does not have a closed form, the M-H algorithm is used to draw from the conditional distribution of $\Psi$. The acceptance probability at step k + 1 is defined by

$$\Pr(\text{acceptance}) = \min\left\{ \frac{\exp\left[-\tfrac{1}{2} (\Psi^{(k+1)} - \Psi_0)' V_{\Psi_0}^{-1} (\Psi^{(k+1)} - \Psi_0)\right] L(Y \mid \Psi^{(k+1)})}{\exp\left[-\tfrac{1}{2} (\Psi^{(k)} - \Psi_0)' V_{\Psi_0}^{-1} (\Psi^{(k)} - \Psi_0)\right] L(Y \mid \Psi^{(k)})},\ 1 \right\} \qquad (A6)$$

We define diffuse priors for the conditional distribution of $\Psi$ by setting $\Psi_0$ to an $n_\Psi \times 1$ vector of zeros, and $V_{\Psi_0} = 30\, I_{n_\Psi}$, where $n_\Psi = \dim(\Psi)$.

Appendix B. Adapting the Markov Switching Criterion to Our Estimation Algorithm
Recently, Smith, Naik, and Tsai (2006; hereafter SNT) developed the Markov switching criterion (MSC) for HMMs' states and variables selection. The MSC as described by SNT was developed for an HMM with a stationary transition matrix, estimated using aggregate data and common parameters across individuals. In contrast, our HMM is estimated using individual-level data, with some of the parameters varying across individuals. Additionally, we allow the transition matrix to be a function of time-varying covariates, thus relaxing the assumption of a stationary transition matrix commonly made in HMMs. Therefore, to apply the MSC as a criterion for choosing the number of states, we need to adapt it to our specific application.

Following Equation (15) in SNT, the MSC is given by

$$MSC = -2 \log f(Y \mid \hat{\theta}_i, \hat{\Psi}) + \sum_{s=1}^{NS} \frac{\hat{T}_s (\hat{T}_s + \lambda_s K)}{\delta_s \hat{T}_s - \lambda_s K - 2} \qquad (B1)$$

where
$-2 \log f(Y \mid \hat{\theta}_i, \hat{\Psi})$ is the maximized log-likelihood value,
$\hat{T}_s = \mathrm{trace}(\hat{W}_s)$, $\hat{W}_s = \mathrm{diag}(P_s)$, and $P_s = (\mathrm{prob}(S_1 = s), \ldots, \mathrm{prob}(S_T = s))$,
$\lambda_s = NS$, following SNT's recommendation, where NS is the number of states,
$\delta_s = 1$, following SNT's recommendation, and
K is the number of covariates in the state-dependent vector.

To adapt the MSC to our application we need to modify the different components in (B1).

The first term in Equation (B1), $-2 \log f(Y \mid \hat{\theta}_i, \hat{\Psi})$, captures the model's fit. Because our model is estimated using an MCMC hierarchical Bayes estimation procedure, one needs to define the procedure used to calculate the model's fit measure using the simulation output. We adopt the deviance measure suggested by Spiegelhalter et al. (2002) as
the model fit component of the deviance information criterion (DIC). We calculate the deviance as the expectation over the estimates at each iteration of the MCMC simulation. Thus,

$$-2 \log f(Y \mid \hat{\theta}_i, \hat{\Psi}) = -2 \sum_{j=1}^{J} \sum_{i=1}^{N} \log f(Y_i \mid \theta_i^{j}, \Psi^{j}) \qquad (B2)$$

where
J is the number of MCMC iterations used to obtain the posterior distributions after the burn-in period,
N is the number of individuals,
$\theta_i^{j}$ is the vector of parameter estimates for individual i in iteration j, and
$f(Y_i \mid \theta_i^{j}, \Psi^{j})$ is the likelihood of observing a sequence of choices following Equation (8).

The term $\hat{T}_s$ in (B1) captures the effective sample size at state s. Accordingly, we define

$$\hat{T}_s = \sum_{i=1}^{N} \sum_{t=1}^{T_i} P(S_{it} = s)$$

where $P(S_{it} = s)$ is the probability that individual i is in state s at time t. Thus, the only difference between our formulation of $\hat{T}_s$ and that of SNT is the summation over individuals in our model.

Finally, we need to count the number of additional parameters in each state, K. Unlike SNT, we include covariates both in the transition matrix and in the state-dependent vector. Accordingly, in our application, K is the number of covariates for each state in both the transition matrix and the state-dependent vector. Note that because the parameters that capture the effect of the covariates are common across individuals, we can simply count the number of parameters. If one estimates a full random-effect model using a hierarchical Bayesian approach, we recommend using the effective number of parameters, $p_D$ of Spiegelhalter et al. (2002), divided by the number of states as a measure for K.

References
Aaker, J., S. Fournier, A. S. Brasel. 2004. When good brands do bad. J. Consumer Res. 31(June) 1–18.
Allenby, G. M., J. L. Ginter. 1995. Using extremes to design products and segment markets. J. Marketing Res. 32(November) 392–403.
Allenby, G. M., R. P. Leone, L. Jen. 1999. A dynamic model of purchase timing with application to direct marketing. J. Amer. Statist. Assoc. 93(446) 365–374.
Andrews, R. L., I. S. Currim. 2003. A comparison of segment retention criteria for finite mixture models. J. Marketing Res. 39(May) 235–243.
Arnett, D. B., S. German, S. D. Hunt. 2003. The identity salience model of relationship marketing success: The case of nonprofit marketing. J. Marketing 67(2) 89–105.
Atchadé, Y. F. 2006. An adaptive version for the Metropolis adjusted Langevin algorithm with a truncated drift. Methodology and Comput. Appl. Probab. 8(2) 235–254.
Baade, R. A., J. O. Sundberg. 1996. What determines alumni generosity? Econom. Ed. Rev. 15(1) 75–81.
Böckenholt, U., W. R. Dillon. 2000. Inferring latent brand dependencies. J. Marketing Res. 37(February) 72–87.
Böckenholt, U., R. Langeheine. 1996. Latent change in recurrent choice data. Psychometrika 61(2) 285–302.
Bolton, R. N. 1998. A dynamic model of the duration of the customer's relationship with a continuous service provider: The role of satisfaction. Marketing Sci. 17(1) 45–65.
Bucklin, R. E., J. M. Lattin. 1991. A two-state model of purchase incidence and brand choice. Marketing Sci. 10(1) 24–39.
Bult, J. R., T. Wansbeek. 1995. Optimal selection for direct mail. Marketing Sci. 14(4) 378–394.
Dekimpe, M. G., D. M. Hanssens. 2000. Time-series models in marketing: Past, present and future. Internat. J. Res. Marketing 17(2–3) 183–193.
Dillon, W. R., U. Böckenholt, M. S. de Borrero, H. Bozdogan, W. DeSarbo, S. Gupta, W. Kamakura, A. Kumar, V. Ramaswamy, M. Zenor. 1994. Issues in the estimation and application of latent structure models of choice. Marketing Lett. 5(4) 323–334.
Djuric, P. M., J.-H. Chun. 2002. An MCMC sampling approach to estimation of nonstationary hidden Markov models. IEEE Trans. Signal Processing 50(5) 1113–1123.
Du, R., W. A. Kamakura. 2006. Household lifecycles and lifestyles in America. J. Marketing Res. 43(February) 121–132.
Dwyer, R. F., P. H. Schurr, S. Oh. 1987. Developing buyer-seller relationships. J. Marketing 51(April) 11–27.
Erdem, T., B. Sun. 2001. Testing for choice dynamics in panel data. J. Bus. Econom. Statist. 19(2) 142–152.
Fader, P. S., J. M. Lattin. 1993. Accounting for heterogeneity and nonstationarity in a cross-sectional model of consumer purchase behavior. Marketing Sci. 12(3) 304–317.
Fader, P. S., B. G. S. Hardie, C.-Y. Huang. 2004. A dynamic changepoint model for new product sales forecasting. Marketing Sci. 23(1) 50–65.
Flanagan, J. C. 1954. The critical incident technique. Psych. Bull. 51(4) 327–359.
Fournier, S. 1998. Consumers and their brands: Developing relationship theory in consumer research. J. Consumer Res. 24(March) 343–373.
Frank, R. E. 1962. Brand choice as a probability process. J. Bus. 35(January) 43–56.
Gelman, A., D. B. Rubin. 1992. Inference from iterative simulation using multiple sequences. Statist. Sci. 7 457–511.
Greene, W. H. 1997. Econometric Analysis, 3rd ed. Prentice Hall, Englewood Cliffs, NJ.
Guadagni, P. M., J. D. C. Little. 1983. A logit model of brand choice calibrated on scanner data. Marketing Sci. 2(3) 203–238.
Gupta, S., V. A. Zeithaml. 2006. Customer metrics and their impact on financial performance. Marketing Sci. 25(6) 718–739.
Hamilton, J. D. 1989. A new approach to the economic analysis of nonstationary time series and the business cycle. Econometrica 57(2) 357–384.
Harrison, W. B., S. K. Mitchell, S. P. Peterson. 1995. Alumni donations and colleges' development expenditures: Does spending matter? Amer. J. Econom. Social. 54(October) 397–413.
Heckman, J. J. 1981. Heterogeneity and state dependence. S. Rosen, ed. Studies in Labor Markets. University of Chicago Press, Chicago, 91–139.
Hinde, R. A. 1979. Towards Understanding Relationships. Academic, London.
Hughes, J. P., P. Guttorp. 1994. A class of stochastic models for relating synoptic atmospheric patterns to regional hydrologic phenomena. Water Resources Res. 30(5) 1535–1546.
Jain, D., S. S. Singh. 2002. Customer lifetime value research in marketing: A review and future direction. J. Interactive Marketing 16(2) 34–46.
Jeuland, A. P., F. M. Bass, G. P. Wright. 1980. A multibrand stochastic model compounding heterogeneous Erlang timing and multinomial choice processes. Oper. Res. 28(2) 255–277.
Kahneman, D., D. T. Miller. 1986. Norm theory: Comparing reality to its alternatives. Psych. Rev. 93(2) 136–153.
Kamakura, W. A., G. J. Russell. 1989. A probabilistic choice model for market segmentation and elasticity structure. J. Marketing Res. 26(November) 379–390.
Keane, M. P. 1997. Modeling heterogeneity and state dependence in consumer choice behavior. J. Bus. Econom. Statist. 15(July) 310–327.
Kim, C.-J., C. R. Nelson. 1999. State-Space Models with Regime Switching: Classical and Gibbs Sampling Approaches with Applications. M.I.T. Press, Cambridge, MA.
Lewis, M. 2004. The influence of loyalty programs and short-term promotions on customer retention. J. Marketing Res. 41(August) 281–292.
Libai, B., D. Narayandas, C. Humby. 2002. Toward an individual customer profitability model: A segment-based approach. J. Service Res. 5(1) 69–76.
Liechty, J. C., M. Wedel, R. Pieters. 2003. The representation of local and global scanpaths in eye movements through Bayesian hidden Markov models. Psychometrika 68(4) 519–541.
MacDonald, I. L., W. Zucchini. 1997. Hidden Markov and Other Models for Discrete-Valued Time Series. Chapman and Hall, London.
McAlister, L., R. Srivastava, J. Horowitz, M. Jones, W. Kamakura, J. Kulchitsky, B. Ratchford, G. Russell, F. Sultan, T. Yai, D. Weiss, R. Winer. 1991. Incorporating choice dynamics in models of consumer behavior. Marketing Lett. 2(3) 241–252.
Montgomery, A. L., S. Li, K. Srinivasan, J. C. Liechty. 2004. Predicting online purchase conversion using web path analysis. Marketing Sci. 23(4) 579–595.
Moon, S., W. A. Kamakura, J. Ledolter. 2007. Estimating promotion response when competitive promotions are unobservable. J. Marketing Res. Forthcoming.
Morgan, R. M., S. D. Hunt. 1994. The commitment-trust theory of relationship marketing. J. Marketing 58(3) 20–38.
Morrison, D. G., R. D. H. Chen, S. L. Karpis, K. E. A. Britney. 1982. Modeling retail customer behavior at Merrill Lynch. Marketing Sci. 1(2) 123–141.
Naik, P. A., M. K. Mantrala, A. G. Sawyer. 1998. Planning media schedules in the presence of dynamic advertising quality. Marketing Sci. 17(3) 214–235.
Netzer, O. 2004. A hidden Markov model of customer relationship dynamics. Ph.D. dissertation, Stanford University, Stanford, CA.
Newton, M. A., A. E. Raftery. 1994. Approximate Bayesian inference by the weighted likelihood bootstrap. J. Roy. Statist. Soc. B 56(1) 3–48.
Okunade, A. A. 1993. Logistic regression and probability of business school alumni donations: Micro-data evidence. Ed. Econom. 1 243–258.
Okunade, A. A., R. L. Berl. 1997. Determinants of charitable giving of business school alumni. Res. Higher Ed. 38(April) 201–214.
Oliver, R. L. 1997. Satisfaction: A Behavioral Perspective on the Consumer. McGraw-Hill, New York.
Pauwels, K., I. Currim, M. G. Dekimpe, E. Ghysels, D. M. Hanssens, N. Mizik, P. Naik. 2004. Modeling marketing dynamics by time series econometrics. Marketing Lett. 15(4) 167–183.
Pfeifer, P. E., R. L. Carraway. 2000. Modeling customer relationships using Markov chains. J. Interactive Marketing 14(2) 43–55.
Poulsen, C. S. 1990. Mixed Markov and latent Markov modeling applied to brand choice behavior. Internat. J. Res. Marketing 7(1) 5–19.
Rabiner, L. R. 1989. A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 77(2) 257–286.
Rabiner, L. R., B.-H. Juang. 1993. Fundamentals of Speech Recognition. Prentice Hall, Englewood Cliffs, NJ.
Ramaswamy, V. 1997. Evolutionary preference segmentation with panel survey data: An application to new products. Internat. J. Res. Marketing 14(1) 57–80.
Reinartz, W., V. Kumar. 2003. The impact of customer relationship characteristics on profitable lifetime duration. J. Marketing 67(1) 77–99.
Rossi, P. E., G. M. Allenby. 2003. Bayesian statistics and marketing. Marketing Sci. 22(3) 304–328.
Rust, R. T., T. S. Chung. 2006. Marketing models of service and relationships. Marketing Sci. 25(6) 560–580.
Rust, R. T., K. N. Lemon, V. A. Zeithaml. 2004. Return on marketing: Using customer equity to focus marketing strategy. J. Marketing 68(1) 109–127.
Schmittlein, D. C., R. A. Peterson. 1994. Customer base analysis: An industrial purchase process application. Marketing Sci. 13(1) 41–67.
Schwarz, G. 1978. Estimating the dimension of a model. Ann. Statist. 6(2) 461–464.
Scott, L. S. 2002. Bayesian methods for hidden Markov models, recursive computing in the 21st century. J. Amer. Statist. Assoc. 97(475) 337–351.
Smith, A., P. A. Naik, C.-L. Tsai. 2006. Markov-switching model selection using Kullback-Leibler divergence. J. Econometrics 134(2) 553–577.
Soukup, J. D. 1983. A Markov analysis of fund-raising alternatives. J. Marketing Res. 20(August) 314–319.
Spiegelhalter, D. J., N. G. Best, B. P. Carlin, A. van der Linde. 2002. Bayesian measures of model complexity and fit. J. Roy. Statist. Soc. Ser. B 64(3) 583–639.
Srinivasan, V., R. Kesavan. 1976. An alternative interpretation of the linear learning model of brand choice. J. Consumer Res. 3(September) 76–83.
Storm, S. 2003. After years of cash flow, universities hit an ebb. New York Times 13(March), 25.
Taylor, A. L., J. C. Martin, Jr. 1995. Characteristics of alumni donors at a research I public university. Res. Higher Ed. 36(3) 283–302.
Thomas, J. S. 2001. The importance of linking customer acquisition to customer retention. J. Marketing Res. 39(May) 262–268.
Wedel, M., W. A. Kamakura. 2000. Market Segmentation: Conceptual and Methodological Foundations. Kluwer, Dordrecht, The Netherlands.
Willemain, T. R., A. Goyal, M. Van Deven, I. S. Thukral. 1994. Alumni giving: The influences of reunion, class and year. Res. Higher Ed. 35(2) 201–214.
Winer, R. S. 2001. A framework for customer relationship management. California Management Rev. 43(Summer) 89–105.
Xie, J. X., M. Song, M. Sirbu, Q. Wang. 1997. Kalman filter estimation of new product diffusion models. J. Marketing Res. 34(August) 378–393.