
Lecture by John Shortle, partially transcribed by James LaBelle, based on the class textbook: Ross, S. M., 2003, Introduction to Probability Models, 8th ed., Academic Press.
OR / STAT 645: Stochastic Processes
Lecture 1: Probability Review, Exponential Distribution
Given: 8/31/2006

Why You Need to Know about Stochastic Processes
Life is stochastic
o Commute to work
o Wait in line for lunch
o Even deterministic things are stochastic (e.g., Metro busses)
Stochastic problems are directly relevant to your life
o Why do bad things happen in groups?
o How do I increase the page rank of my web site in Google?
o How should stock options be priced?
o When should I replace my aging car?
o Why do I have to wait so long for a bus at Dulles airport? And why do the busses get
clumped up in groups?

Probability Review
Notation
$f(x)$ is the Probability Density Function (PDF).
$F(x)$ is the Cumulative Distribution Function (CDF): $F(x) = \int_{-\infty}^{x} f(u)\,du$.
$F^c(x) = 1 - F(x)$ is the Complement of the CDF (or CCDF).

Relationships
$f(x) = \frac{d}{dx} F(x) = -\frac{d}{dx} F^c(x)$
Exponential Distribution
$f(x) = \lambda e^{-\lambda x}$,  $x \ge 0$
$F(x) = 1 - e^{-\lambda x}$,  $x \ge 0$
$F^c(x) = e^{-\lambda x}$,  $x \ge 0$
Memorize These Formulas!
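These formulas are easy to sanity-check numerically. The following Python sketch (the rate $\lambda = 2$ and the point $x = 1.5$ are arbitrary choices, not from the lecture) verifies that $F$ is the integral of $f$ and that the CCDF is the complement of the CDF:

```python
import math

lam = 2.0  # arbitrary rate for illustration

def f(x):  return lam * math.exp(-lam * x)  # PDF
def F(x):  return 1 - math.exp(-lam * x)    # CDF
def Fc(x): return math.exp(-lam * x)        # CCDF

# F(x) should equal the integral of f from 0 to x (trapezoidal rule)
x, n = 1.5, 10_000
h = x / n
integral = sum(0.5 * (f(i * h) + f((i + 1) * h)) * h for i in range(n))
assert abs(integral - F(x)) < 1e-6

# the CCDF is the complement of the CDF
assert abs(Fc(x) - (1 - F(x))) < 1e-12
```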

Gamma Distribution

PDF: $f(x) = \dfrac{x^{\alpha-1} e^{-x/\theta}}{\Gamma(\alpha)\,\theta^{\alpha}}$, $x > 0$, where $\Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1} e^{-x}\,dx$.
Note: $\Gamma(\alpha) = (\alpha - 1)!$ when $\alpha$ is a positive integer.
CDF: When $\alpha$ is a positive integer, $F(x) = 1 - \sum_{j=0}^{\alpha-1} e^{-x/\theta} \dfrac{(x/\theta)^j}{j!}$, $x > 0$.
We will derive this property later.

Other properties
1. When $\alpha$ is a positive integer, a gamma random variable (RV) is equivalent to the sum of $\alpha$ independent exponential RVs with mean $\theta$.
2. When $\alpha = 1$, a gamma RV is an exponential RV with mean $\theta$.
Check: $f(x) = \dfrac{x^{0} e^{-x/\theta}}{\Gamma(1)\,\theta} = \dfrac{e^{-x/\theta}}{\theta} = \lambda e^{-\lambda x}$, where $\lambda = 1/\theta$.
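Property 1 can be checked by simulation. The following Python sketch (the choices $\alpha = 3$, $\theta = 2$ are arbitrary) sums $\alpha$ independent exponentials with mean $\theta$ and compares the empirical CDF at one point against the closed-form Erlang CDF above:

```python
import math, random

random.seed(1)
alpha, theta = 3, 2.0   # integer shape; mean of each exponential
n = 200_000

# empirical CDF at x of the sum of alpha independent exp(mean theta) draws
x = 5.0
hits = sum(1 for _ in range(n)
           if sum(random.expovariate(1 / theta) for _ in range(alpha)) <= x)
empirical = hits / n

# closed-form gamma CDF for integer shape (Erlang)
r = x / theta
exact = 1 - sum(math.exp(-r) * r**j / math.factorial(j) for j in range(alpha))

assert abs(empirical - exact) < 0.01
```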
Mean of a Random Variable
(1) Typical method
Discrete Case: $E[X] = \sum_i x_i p_i$, where $p_i = P(X = x_i)$.
Continuous Case: $E[X] = \int_0^{\infty} x f(x)\,dx$.
(2) When X is a non-negative random variable
$E[X] = \int_0^{\infty} F^c(x)\,dx$
Proof: Suppose X is discrete with $p_i = P(X = x_i)$. Plot $F^c(x)$ as a function of x: the plot is a decreasing staircase that drops by $p_i$ at each $x_i$.

Calculate the area under the curve two ways:
First way: $\int_0^{\infty} F^c(x)\,dx$.
Second way: Add up the areas of horizontal rectangles. This gives $\sum_i x_i p_i$, which is $E[X]$.
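For the exponential distribution this identity is easy to verify numerically: $\int_0^{\infty} F^c(x)\,dx$ should equal $1/\lambda$. A Python sketch ($\lambda = 0.5$ is an arbitrary choice):

```python
import math

lam = 0.5                        # illustrative rate; E[X] = 1/lam = 2
Fc = lambda x: math.exp(-lam * x)

# integrate the CCDF numerically out to a point where it is negligible
upper, n = 60.0, 600_000
h = upper / n
mean = sum(0.5 * (Fc(i * h) + Fc((i + 1) * h)) * h for i in range(n))

assert abs(mean - 1 / lam) < 1e-4
```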
(3) Using Moment Generating Function
Moment generating function for random variable X: $\phi(t) = E[e^{tX}]$



$\phi(t) = \begin{cases} \sum_k e^{tk} p_k & \text{discrete case} \\ \int e^{tx} f(x)\,dx & \text{continuous case} \end{cases}$
Then, $E[X^n] = \phi^{(n)}(0)$.
For example, $E[X] = \phi'(0)$ and $E[X^2] = \phi''(0)$.




Note: The moment generating function is closely related to the Laplace transform: $f^*(s) = E[e^{-sX}] = \int_0^{\infty} e^{-sx} f(x)\,dx$
Example: Exponential Distribution
$\phi(t) = \int_0^{\infty} e^{tx} \lambda e^{-\lambda x}\,dx = \lambda \int_0^{\infty} e^{-(\lambda - t)x}\,dx = \lambda \left[ \dfrac{-e^{-(\lambda - t)x}}{\lambda - t} \right]_0^{\infty} = \dfrac{\lambda}{\lambda - t}$ (for $t < \lambda$).
Therefore,
$E[X] = \phi'(0) = \left. \dfrac{\lambda}{(\lambda - t)^2} \right|_{t=0} = \dfrac{1}{\lambda}$
$E[X^2] = \phi''(0) = \left. \dfrac{2\lambda}{(\lambda - t)^3} \right|_{t=0} = \dfrac{2}{\lambda^2}$


Variance of Random Variable
Variance: $\mathrm{var}[X] = E\left[(X - E[X])^2\right] = E[X^2] - (E[X])^2$, where $E[X^2] = \int_0^{\infty} x^2 f(x)\,dx$.
Standard Deviation (std. dev.) $= \sqrt{\mathrm{var}[X]}$
Coefficient of Variation (CV) $= \dfrac{\text{std. dev.}}{E[X]}$

Example: Exponential Random Variable

$\mathrm{var}[X] = E[X^2] - (E[X])^2 = \dfrac{2}{\lambda^2} - \dfrac{1}{\lambda^2} = \dfrac{1}{\lambda^2}$
std. dev. $= \dfrac{1}{\lambda}$
CV $= \dfrac{1/\lambda}{1/\lambda} = 1$
Memoryless Property
Def. 1. A random variable X has the memoryless property if:
$P(X > t + s \mid X > s) = P(X > t)$

Intuition: Suppose X represents the time that you wait for a bus. Given that you have already been waiting s time units (i.e., $X > s$), the probability that you wait an additional t units, $P(X > t + s \mid X > s)$, is the same as the probability of waiting t units in the first place, $P(X > t)$.
We now formulate an alternate definition. If X has the memoryless property, then
$P(X > t) = P(X > t + s \mid X > s) = \dfrac{P(X > t + s \text{ and } X > s)}{P(X > s)} = \dfrac{P(X > t + s)}{P(X > s)}$

Def. 2. A random variable X has the memoryless property if:
$P(X > t + s) = P(X > t)\,P(X > s)$

The exponential distribution is the only continuous distribution that has the memoryless property.
Check that the exponential has this property:
$P(X > t + s) = e^{-\lambda(t+s)} = e^{-\lambda t} e^{-\lambda s} = P(X > t)\,P(X > s)$

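The memoryless property can also be checked empirically. A Python sketch ($\lambda = 1$, $s = 1.2$, $t = 0.7$ are arbitrary choices) estimates both sides of Def. 1 from the same exponential sample:

```python
import random

random.seed(2)
lam, s, t = 1.0, 1.2, 0.7
samples = [random.expovariate(lam) for _ in range(400_000)]

# conditional probability P(X > t + s | X > s), estimated on the survivors
survived_s = [x for x in samples if x > s]
p_cond = sum(1 for x in survived_s if x > s + t) / len(survived_s)

# unconditional probability P(X > t)
p_plain = sum(1 for x in samples if x > t) / len(samples)

# both estimates should be near exp(-lam * t)
assert abs(p_cond - p_plain) < 0.01
```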
Useful Properties of Exponential Distribution
Suppose that
$X_1 \sim \exp(\lambda_1)$ (time until event 1 happens)
$X_2 \sim \exp(\lambda_2)$ (time until event 2 happens)
...
$X_n \sim \exp(\lambda_n)$ (time until event n happens)
All $X_i$ are independent.

1. First occurrence among events
What is the probability that $X_1 < X_2$?
$P(X_1 < X_2) = \int_0^{\infty} \int_{x_1}^{\infty} \lambda_1 e^{-\lambda_1 x_1} \lambda_2 e^{-\lambda_2 x_2}\,dx_2\,dx_1 = \int_0^{\infty} \lambda_1 e^{-\lambda_1 x_1} e^{-\lambda_2 x_1}\,dx_1$


$P(X_1 < X_2) = \dfrac{\lambda_1}{\lambda_1 + \lambda_2}$

The double integration is over the region $x_2 > x_1$ in the $(x_1, x_2)$ plane (the area above the line $x_2 = x_1$).
The second to last equality uses the known CCDF for the exponential distribution.
For the opposite relationship,
$P(X_2 < X_1) = 1 - P(X_1 < X_2) = 1 - \dfrac{\lambda_1}{\lambda_1 + \lambda_2} = \dfrac{\lambda_2}{\lambda_1 + \lambda_2}$,
as expected from symmetry.

More generally,
$P(X_i = \min(X_1, \ldots, X_n)) = \dfrac{\lambda_i}{\sum_{j=1}^{n} \lambda_j}$

To derive the general result from the 2-variable case, build up inductively.
3-variable example: $P(X_1 = \min(X_1, X_2, X_3)) = P(X_1 < \min(X_2, X_3)) = \dfrac{\lambda_1}{\lambda_1 + \lambda_2 + \lambda_3}$


2. Distribution of time of first event (minimum)
$P(\min(X_1, X_2) > x) = P(X_1 > x,\ X_2 > x) = P(X_1 > x)\,P(X_2 > x) = F_1^c(x)\,F_2^c(x)$

For an exponential RV,
$P(\min(X_1, X_2) > x) = F_1^c(x)\,F_2^c(x) = e^{-\lambda_1 x} e^{-\lambda_2 x} = e^{-(\lambda_1 + \lambda_2)x}$

This is the CCDF of an exponential RV with rate $(\lambda_1 + \lambda_2)$, therefore $\min(X_1, X_2) \sim \exp(\lambda_1 + \lambda_2)$.

More generally,

$\min(X_1, X_2, \ldots, X_n) \sim \exp(\lambda_1 + \cdots + \lambda_n)$.
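Both results, the distribution of the minimum and the probability of which variable achieves it, can be checked by simulation. A Python sketch with illustrative rates $\lambda_1 = 2$, $\lambda_2 = 3$:

```python
import random, statistics

random.seed(3)
l1, l2 = 2.0, 3.0
n = 300_000
mins, first_is_1 = [], 0
for _ in range(n):
    x1 = random.expovariate(l1)
    x2 = random.expovariate(l2)
    mins.append(min(x1, x2))
    first_is_1 += (x1 < x2)

# min(X1, X2) ~ exp(l1 + l2): its mean should be 1/(l1 + l2) = 0.2
assert abs(statistics.fmean(mins) - 1 / (l1 + l2)) < 0.005

# P(X1 < X2) = l1 / (l1 + l2) = 0.4
assert abs(first_is_1 / n - l1 / (l1 + l2)) < 0.005
```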

Key intuition: Think of exponential RVs as times until something happens and the $\lambda$'s as rates.

3. Independence property (stated without proof). The time of the first occurrence of an event is independent of the ordering of the events. That is, the event $X_1 < X_2 < \cdots < X_n$ is independent of $\min(X_1, \ldots, X_n)$.




4. Distribution of time of last event (maximum)
$P(\max(X_1, X_2) > x) = 1 - P(\max(X_1, X_2) \le x)$
$\quad = 1 - P(X_1 \le x,\ X_2 \le x)$
$\quad = 1 - P(X_1 \le x)\,P(X_2 \le x)$
$\quad = 1 - F_1(x)\,F_2(x)$

For an exponential RV,
$P(\max(X_1, X_2) > x) = 1 - F_1(x)\,F_2(x)$
$\quad = 1 - (1 - e^{-\lambda_1 x})(1 - e^{-\lambda_2 x})$
$\quad = e^{-\lambda_1 x} + e^{-\lambda_2 x} - e^{-(\lambda_1 + \lambda_2)x}$

Note: This could have been derived from Venn diagram principles:
$P(\max(X_1, X_2) > x) = P(X_1 > x) + P(X_2 > x) - P(X_1 > x,\ X_2 > x)$

5. Sum of exponentials (with same rate) is a gamma (stated earlier)

Example (Prob. 5.28)
Consider n components with independent lifetimes. Component i functions for an exponential time with rate $\lambda_i$. All components are initially in use and remain so until they fail.
a. Find the probability that component 1 is the second component to fail.
b. Find the expected time of failure of the second component.

Possible orderings for component 1 to be the second component to fail:
a. 2 fails first, then 1, then some other component fails.
b. 3 fails first, then 1, then some other component fails.
c. ...
d. n fails first, then 1, then some other component fails.

Probability of event (a) is:
$P(\text{2 fails first, then 1, then another}) = P(\text{2 fails before all others}) \cdot P(\text{1 fails before all except 2}) = \dfrac{\lambda_2}{\sum_{i=1}^{n} \lambda_i} \cdot \dfrac{\lambda_1}{\sum_{i \ne 2} \lambda_i}$


Likewise, the probability of event (b) is: $\dfrac{\lambda_3}{\sum_{i=1}^{n} \lambda_i} \cdot \dfrac{\lambda_1}{\sum_{i \ne 3} \lambda_i}$.

Events (a), (b), ..., (d) are mutually exclusive, therefore P(component 1 is second to fail) is the sum of all the above probabilities:
$P(\text{1 is second to fail}) = \dfrac{\lambda_2}{\sum_{i=1}^{n} \lambda_i} \cdot \dfrac{\lambda_1}{\sum_{i \ne 2} \lambda_i} + \cdots + \dfrac{\lambda_n}{\sum_{i=1}^{n} \lambda_i} \cdot \dfrac{\lambda_1}{\sum_{i \ne n} \lambda_i} = \sum_{k=2}^{n} \dfrac{\lambda_k}{\sum_{i=1}^{n} \lambda_i} \cdot \dfrac{\lambda_1}{\sum_{i \ne k} \lambda_i}$



(b)
Expected time of the first failure is $\dfrac{1}{\sum_{i=1}^{n} \lambda_i}$.
Probability that the first failure is type k is $\dfrac{\lambda_k}{\sum_{i=1}^{n} \lambda_i}$.
Expected time from the first failure to the second failure, given the first failure is type k, is $\dfrac{1}{\sum_{i \ne k} \lambda_i}$.

Thus, the total expected time until the second failure is
$\dfrac{1}{\sum_{i=1}^{n} \lambda_i} + \sum_{k=1}^{n} \dfrac{\lambda_k}{\sum_{i=1}^{n} \lambda_i} \cdot \dfrac{1}{\sum_{i \ne k} \lambda_i}$



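Both answers to Prob. 5.28 can be checked by simulation. A Python sketch (the four failure rates below are illustrative; component 1 is index 0):

```python
import random, statistics

random.seed(4)
rates = [1.0, 2.0, 0.5, 1.5]          # illustrative failure rates
n = 200_000
second_is_1, second_times = 0, []
for _ in range(n):
    # sort (lifetime, component index) pairs; entry [1] is the second failure
    lifetimes = sorted((random.expovariate(r), i) for i, r in enumerate(rates))
    second_is_1 += (lifetimes[1][1] == 0)
    second_times.append(lifetimes[1][0])

L = sum(rates)
# formula (a): sum over k != 1 of [l_k / L] * [l_1 / (L - l_k)]
p1_second = sum(rates[k] / L * rates[0] / (L - rates[k])
                for k in range(len(rates)) if k != 0)
# formula (b): 1/L plus the average additional wait given the first failure type
e_second = 1 / L + sum(rates[k] / L * 1 / (L - rates[k])
                       for k in range(len(rates)))

assert abs(second_is_1 / n - p1_second) < 0.005
assert abs(statistics.fmean(second_times) - e_second) < 0.01
```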

Computing Expectations by Conditioning
Basic idea: Compute the expectation or variance of a (complicated) random variable by
conditioning on another random variable.
In stochastic processes, it is often useful to condition on the first event.
Use the formulas
$E(X) = E(E(X \mid Y))$
$V(X) = V(E(X \mid Y)) + E(V(X \mid Y))$


Example
The probability of an accident on I-66 during my morning commute is 0.1.
If there is an accident, commute time $\sim N(50, 6^2)$.
If there is no accident, commute time $\sim N(30, 4^2)$.
What is the average time to get to work? What is the variance of the time to get to work?

Average time to get to work (easy): $0.1 \cdot 50 + 0.9 \cdot 30 = 32$.
But let's work it out carefully in the language of conditional expectation:
X = Time to get to work
Y = Accident or no accident
$E(X \mid Y) = \begin{cases} 50 & \text{if accident} \\ 30 & \text{if no accident} \end{cases}$
Note: $E(X \mid Y)$ is a random variable (call it Z). In other words,
$Z = E(X \mid Y) = \begin{cases} 50 & \text{w.p. } 0.1 \\ 30 & \text{w.p. } 0.9 \end{cases}$
Finally,
$E(X) = E(E(X \mid Y)) = E(Z) = 0.1 \cdot 50 + 0.9 \cdot 30 = 32$.

To compute V(X), first evaluate $V(E(X \mid Y))$. We already know $E(X \mid Y)$ (which we called Z).
$E(Z^2) = 0.1 \cdot 2500 + 0.9 \cdot 900 = 1{,}060$.
$V(E(X \mid Y)) = V(Z) = E(Z^2) - (E(Z))^2 = 1060 - 32^2 = 36$

Now, evaluate $E(V(X \mid Y))$.
$V(X \mid Y) = \begin{cases} 36 & \text{w.p. } 0.1 \\ 16 & \text{w.p. } 0.9 \end{cases}$
Note: $V(X \mid Y)$ is a random variable.
$E(V(X \mid Y)) = 0.1 \cdot 36 + 0.9 \cdot 16 = 18$
In summary, $V(X) = V(E(X \mid Y)) + E(V(X \mid Y)) = 36 + 18 = 54$. (Note: The variance is bigger than the variances of the conditioned normal variables.)
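A quick Monte Carlo check of $E(X) = 32$ and $V(X) = 54$ for this commute example:

```python
import random, statistics

random.seed(5)
n = 300_000
# draw from the mixture: accident w.p. 0.1 -> N(50, 6^2), else N(30, 4^2)
times = [random.gauss(50, 6) if random.random() < 0.1 else random.gauss(30, 4)
         for _ in range(n)]

assert abs(statistics.fmean(times) - 32) < 0.1
assert abs(statistics.pvariance(times) - 54) < 1.0
```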



OR / STAT 645: Stochastic Models
Lecture 2: The Poisson Process
Given: 9/7/2006



The Poisson Distribution
Def. A Poisson random variable with mean A has probability mass function:
$P(X = i) = e^{-A} \dfrac{A^i}{i!}$, where $i = 0, 1, 2, \ldots$
Note: $e^{-A}$ is a normalization constant.

Mean of distribution is A.
Variance of distribution is A.

Note: For an exponential RV, the mean and std. dev. are equal. Here, the mean and variance are
equal.

Historical Background
Ladislaus Bortkiewicz. Born 1868 in St. Petersburg, Russia, born into Russian nobility. He was a
military man and an instructor teaching artillery and mathematics. After being awarded a doctorate,
he led a career in statistics and actuarial science. Some have argued that the Poisson distribution
should be named the von Bortkiewicz distribution.

Bortkiewicz observed that events with a low frequency in a large population follow a Poisson
distribution, even when the probabilities of the events vary. The classical example is the following
data set (Bortkewicz L von. Das Gesetz der Kleinen Zahlen. Leipzig: Teubner; 1898):
14 (out of 16 total) Prussian army corps units observed over 20 years (1875-1894).
A count of men killed by a horse kick, each year, for each unit (280 data points).
Total deaths = 196
Average deaths per unit per year = 196 / 280 = 0.70.

Assume the number of deaths (for one unit in one year) is a Poisson RV with mean 0.70. Then the
predicted and actual distributions are as follows:

Deaths   Theoretical # of Units   Observed # of Units
0        139.04                   144
1         97.33                    91
2         34.07                    32
3          7.95                    11
4          1.39                     2
5+         0.22                     0
Total    280                      280

Some sources give an alternate account of the data:
10 Prussian army corps units observed over 20 years (1875-1894).
Total deaths = 122
Average deaths per unit per year = 122 / 200 = 0.61.

Deaths   Theoretical # of Units   Observed # of Units
0        108.67                   109
1         66.29                    65
2         20.22                    22
3          4.11                     3
4          0.63                     1
5          0.08                     0
6          0.01                     0

Another Example
During World War II, Germans attacked London with V-2 flying bombs. It was observed that the
impacts of the bombs tended to be grouped in clusters, rather than showing a random distribution.
A possible explanation was that (a) specific areas were targeted and (b) the precision of the bombs
was very high. However, the bombs were launched from across Europe and so this explanation
seemed implausible.

The following data were taken:
144 square kilometers of south London were divided into 576 squares of 1/4 square kilometer each.
A count was made of the number of bombs in each square.
Total bombs observed: 537.
Average bombs per square: 537 / 576 = 0.932.

Assume the number of bombs in a given square is a Poisson RV with mean 0.932. Then the predicted and actual distributions are as follows:
Bombs per Square   Theoretical # of Squares   Observed # of Squares
0                  226.74                     229
1                  211.39                     211
2                   98.54                      93
3                   30.62                      35
4                    7.14                       7
5+                   1.57                       1
Total              576.00                     576

Conclusion: When rare events are randomly distributed, there tend to appear gaps in which no
events occur and then periods in which events appear in clusters. Mentally, we tend to forget about
the gaps and focus on the unusual occurrence of multiple rare events in the same space, giving an
inflated illusion of rare-event clustering. It would actually be quite unusual to see rare events
evenly distributed throughout time or space in a grid-like fashion.

Poisson Convergence
Why does the Poisson distribution work so well?

Roughly speaking, one way to think of a Poisson RV is the sum of a large number of independent
rare events (not necessarily identical). We motivate with an example:

Let
$X_i = \begin{cases} 1 & \text{if person } i \text{ enters Giant between 12:05 and 12:10 pm on 9/6/05} \\ 0 & \text{otherwise} \end{cases}$
Let $X = \sum_{i=1}^{N} X_i$, where the summation is over all the people in Fairfax county.


Check conditions
Large number of events? yes
Independent events? Mostly
o Counter-example: Many customers arrive in a short time period. Subsequent
customers see a full parking lot and decide not to enter.
o Counter-example: One car comes with multiple people
Rare events? yes
Identical probabilities of events? no

For the moment, we assume all events are identical, and we relax the assumption later. Specifically,
we suppose $P(X_i = 1) = A/N$ for all i. Note: $E[X] = A$.
Based on these assumptions, we have a binomial distribution:
$P\!\left(\sum_{i=1}^{N} X_i = k\right) = \dbinom{N}{k} \left(\dfrac{A}{N}\right)^k \left(1 - \dfrac{A}{N}\right)^{N-k}$
$\quad = \dfrac{N!}{(N-k)!\,k!} \left(\dfrac{A}{N}\right)^k \left(1 - \dfrac{A}{N}\right)^{N-k}$
$\quad = \dfrac{N(N-1)\cdots(N-k+1)}{N^k} \cdot \dfrac{A^k}{k!} \cdot \dfrac{(1 - A/N)^N}{(1 - A/N)^k}$
$\quad \to 1 \cdot \dfrac{A^k}{k!} \cdot e^{-A} = e^{-A} \dfrac{A^k}{k!}$, a Poisson random variable!

In other words, Bin(N, p) is approximately Poisson(Np) under the previous assumptions.
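This convergence is easy to see numerically. The sketch below (A = 0.7 echoes the horse-kick data; N = 100,000 is an arbitrary large population) compares the binomial pmf with p = A/N to the Poisson pmf term by term:

```python
import math

def binom_pmf(k, n, p):
    # P(Bin(n, p) = k)
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, a):
    # P(Poisson(a) = k)
    return math.exp(-a) * a**k / math.factorial(k)

A, N = 0.7, 100_000          # rare events: p = A/N is tiny
for k in range(6):
    assert abs(binom_pmf(k, N, A / N) - poisson_pmf(k, A)) < 1e-5
```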

Now, we eliminate the identically distributed assumption:
Theorem. Let $X_{n,m}$ ($1 \le m \le n$) be a sequence of RVs where for each n:
$X_{n,m}$ are independent
$X_{n,m} = \begin{cases} 1 & \text{w.p. } p_{n,m} \\ 0 & \text{otherwise} \end{cases}$ (we are counting events)
$p_{n,1} + \cdots + p_{n,n} \to A \in (0, \infty)$ (collectively, the events are rare, since n is large)
$\max_{1 \le m \le n} p_{n,m} \to 0$ (all events are rare; no one event hogs the probability)
then $X_{n,1} + \cdots + X_{n,n} \to \mathrm{Poisson}(A)$.

Note: This looks similar to the Central Limit Theorem. However, in the CLT, condition 3 is replaced with $p_{n,1} + \cdots + p_{n,n} \approx nA$, $A \in (0, \infty)$ (in other words, the means of the random variables $X_{n,m}$ are approximately constant in n, so the mean of the sum grows linearly in n). Here, the means of the random variables are shrinking in n, so the mean of the sum stays roughly constant.

Example
400 students are in a calculus class. Let X be the number of students who have a birthday on the day
of the final. What is the probability that there are 2 or more birthdays on the final?

Let
$X_i = \begin{cases} 1 & \text{if student } i \text{ has a birthday on the final} \\ 0 & \text{otherwise} \end{cases}$
$P(X_i = 1) = 1/365$
$X = \sum_{i=1}^{400} X_i$ is approximately Poisson with mean 400 / 365.

Then, $P(X \ge 2) = 1 - P(X < 2) = 1 - e^{-A} - A e^{-A}$, where A = 400 / 365. (Answer: 0.2995)
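The arithmetic can be confirmed in a couple of lines of Python:

```python
import math

A = 400 / 365                               # mean number of birthdays on the final
p = 1 - math.exp(-A) - A * math.exp(-A)     # P(X >= 2) = 1 - P(X=0) - P(X=1)
assert round(p, 4) == 0.2995                # matches the answer in the notes
```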

Preliminary Definitions
Def. A stochastic process is a collection of random variables (RV) indexed by time, $\{X(t),\ t \in T\}$.
If T is a continuous set, the process is a continuous-time stochastic process (e.g., Poisson process).
If T is countable, then the process is a discrete-time stochastic process (e.g., Markov chain).

Def. A counting process is a stochastic process $\{N(t);\ t \ge 0\}$ such that
$N(t) \in \{0, 1, 2, \ldots\}$ (that is, $N(t)$ is a non-negative integer).
If $s < t$ then $N(s) \le N(t)$ (that is, $N(t)$ is non-decreasing in t).
For $s < t$, $N(t) - N(s)$ is the number of events occurring in the time interval $(s, t]$.
Interpretation: $N(t)$ is the number of events that have occurred by time t.

Def. A counting process has independent increments if the numbers of events in disjoint (non-overlapping) intervals are independent.

Def. A counting process has stationary increments if the distribution of the number of events in an interval depends on the length of the interval, but not on the starting point of the interval. That is, $P(N(s + t) - N(s) = n)$ does not depend on s. Intuitively, the interval can be slid around without changing its stochastic nature.

Def. A function $f(\cdot)$ is $o(h)$ if $\lim_{h \to 0} \dfrac{f(h)}{h} = 0$. That is, $f(\cdot)$ goes to zero faster than h goes to zero.


Example: Which functions are $o(h)$?
$f(x) = x^2$: yes
$f(x) = 0.01x$: no
$f(x) = x^{1.5}$: yes
$f(x) = x + x^2$: no


Definitions of the Poisson Process
Definition 1: A Poisson process is a counting process $\{N(t);\ t \ge 0\}$ with rate $\lambda > 0$, if:
1. $N(0) = 0$
2. The process has independent increments
3. The number of events in any interval of length t is a Poisson RV with mean $\lambda t$.

That is, for all $s, t \ge 0$, and $n = 0, 1, 2, \ldots$
$P(N(s + t) - N(s) = n) = e^{-\lambda t} \dfrac{(\lambda t)^n}{n!}$

Example: Consider people entering a McDonalds over a short period of time, say 20 minutes.

Q: How do you verify these conditions?
A: Condition 1 holds. Condition 2 may hold if people do not come in batches. Hard to verify
assumption 3 without collecting data.

[Note: Cinlar (1975), Introduction to Stochastic Processes, gives a similar definition, without assuming independent increments. Assumption 3 is changed to: the number of events on any finite union of disjoint intervals is a Poisson RV with mean $\lambda b$, where b is the length of the union.]

Is it possible to use the physics of the situation to derive a Poisson process, similar to the rare
event law given previously?

Definition 2: A Poisson process is a counting process $\{N(t);\ t \ge 0\}$ with rate $\lambda > 0$, if:
1. $N(0) = 0$
2. The process has stationary increments
3. The process has independent increments
4. $P(N(h) = 1) = \lambda h + o(h)$ (# of events approximately proportional to length of interval)
5. $P(N(h) \ge 2) = o(h)$ (can't have 2 or more events at the same time; "orderliness")

This is a more fundamental, qualitative definition of the Poisson process.

Theorem: Definitions 1 and 2 are equivalent.

Q: Can these conditions be verified for the McDonalds example?
A: Stationarity over small intervals ok; independent increments not valid if external events occur.

[Note: Cinlar (1975), Introduction to Stochastic Processes, gives a similar definition, without assumptions 4 and 5, instead assuming that the process has only unit jumps. Assumption 4 can actually be eliminated: in Cinlar (1975), Lemma 1.8 derives (4) from (1), (2), (3), (5), and the fact that the process is a counting process.]

Eliminating individual assumptions yields variations on the Poisson process:
Eliminate Assumption 2: Non-stationary Poisson process
Eliminate Assumption 3: Mixture of Poisson processes (choose randomly, then run a
Poisson process)
Eliminate Assumption 5: Compound Poisson process

Def. 2 Implies Def. 1
Assume a Poisson process under definition 2. Consider a time horizon [0, T] divided up into n bins (where n is large). Suppose on average $\lambda T$ events arrive in the time period [0, T].
By orderliness (Property 5), there is (loosely speaking) at most 1 event in each bin.
By Property 4 and stationarity (Property 2), $P(\text{1 event in a given bin}) \approx \dfrac{\lambda T}{n}$.
By independent increments (Property 3), the numbers in each bin are independent.

Therefore, the total number of events is approximately a binomial distribution $\mathrm{bin}(n, p = \lambda T / n)$. By the previous discussion on Poisson convergence, the total number of events in the interval is approximately Poisson with mean $np = \lambda T$.

Additional Poisson Properties
Let $T_1, T_2, \ldots, T_n$ be the inter-event times for a Poisson process ($T_n$ is the time between events n-1 and n). Let $S_1, S_2, \ldots, S_n$ be the times of each event (ordered in time). Then:
$T_N = S_N - S_{N-1}$
$T_1 = S_1 - S_0 = S_1$
$S_N = \sum_{i=1}^{N} T_i$, and
the following are equivalent: $N(t) \ge N \iff S_N \le t \iff \sum_{i=1}^{N} T_i \le t$

Inter-event Times
First, we derive the distribution of the time $T_1$ of the first event. $P(T_1 > t)$ is the probability that no events occur in [0, t]. The number of events in [0, t] is a Poisson RV with mean $\lambda t$. So,
$P(T_1 > t) = P(N(t) = 0) = e^{-\lambda t} \dfrac{(\lambda t)^0}{0!} = e^{-\lambda t}$.
This is the CCDF of an exponential random variable. So, $T_1 \sim \exp(\lambda)$.

Now, we derive the distribution of the second inter-event time $T_2$. First, we condition on the time of the first event:
$P(T_2 > t \mid T_1 = s)$
$\quad = P(N(s + t) - N(s) = 0 \mid T_1 = s)$  [0 events in $(s, s+t]$, 1 event in $[0, s]$]
$\quad = P(N(s + t) - N(s) = 0)$  [by independent increments]
$\quad = P(N(t) - N(0) = 0)$  [by stationary increments]
$\quad = P(N(t) = 0)$  [since $N(0) = 0$]
$\quad = e^{-\lambda t}$
Since $P(T_2 > t \mid T_1 = s) = e^{-\lambda t}$ does not depend on s,
$P(T_2 > t) = P(T_2 > t \mid T_1 = s) = e^{-\lambda t}$. So,
$T_2 \sim \exp(\lambda)$ and $T_2$ is independent of $T_1$.

We can continue with the same logic for $T_3, T_4, \ldots$

Definition 3: A Poisson process with rate $\lambda$ is a counting process such that the times between events are i.i.d. with distribution $\exp(\lambda)$.
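Definition 3 also gives the standard way to simulate a Poisson process: generate i.i.d. $\exp(\lambda)$ inter-event times and accumulate them. The sketch below ($\lambda = 3$ and $t = 10$ are arbitrary choices) checks that the resulting count N(t) has mean and variance close to $\lambda t$, as Definition 1 requires:

```python
import random, statistics

random.seed(6)
lam, t_end = 3.0, 10.0
counts = []
for _ in range(50_000):
    # accumulate exponential inter-event times until we pass t_end
    t, n = random.expovariate(lam), 0
    while t <= t_end:
        n += 1
        t += random.expovariate(lam)
    counts.append(n)

# N(t_end) should be Poisson(lam * t_end): mean and variance both 30
assert abs(statistics.fmean(counts) - lam * t_end) < 0.2
assert abs(statistics.pvariance(counts) - lam * t_end) < 1.0
```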

Conditional Distribution of Event Times
Given: One event in [0, t]. What is the distribution of $T_1$?
$P(T_1 \le s \mid N(t) = 1) = \dfrac{P(T_1 \le s,\ N(t) = 1)}{P(N(t) = 1)}$
$\quad = \dfrac{P(\text{1 event in } [0, s],\ \text{0 events in } (s, t])}{P(N(t) = 1)}$
$\quad = \dfrac{P(\text{1 event in } [0, s]) \cdot P(\text{0 events in } (s, t])}{P(N(t) = 1)}$  [by independent increments]
$\quad = \dfrac{\frac{\lambda s}{1!} e^{-\lambda s} \cdot e^{-\lambda (t - s)}}{\frac{\lambda t}{1!} e^{-\lambda t}} = \dfrac{s}{t}$.
So, $P(T_1 \le s \mid N(t) = 1) = \dfrac{s}{t}$.

This is the CDF of a uniform distribution on [0, t]. Thus, given one event in [0, t], its location is uniformly distributed in [0, t].

The general result (not proven here) is:
Theorem 5.2: Given n events in [0, t] (i.e., $N(t) = n$), the un-ordered event times $S_1, S_2, \ldots, S_n$ are distributed as i.i.d. uniform random variables on [0, t].

"Un-ordered" means that the event times $S_1, S_2, \ldots, S_n$ are not listed in the order of occurrence (that is, not necessarily $S_1 < S_2 < \cdots < S_n$). One could think of throwing n darts at the number line [0, t] (according to a uniform distribution). The dart which happens to be the lowest on the number line would correspond to the first event. However, this is not necessarily the first dart thrown.

Example (Prob. 5.50)
Hours between successive train arrivals ~ Uniform[0, 1].
Passengers arrive ~ PP($\lambda$ = 7 / hour).
Suppose a train has just left. Let X = number who get on the next train.
Find E[X] and Var[X].

Condition on the time Y of the next train arrival (Y ~ Unif[0, 1]).
Then, $(X \mid Y) \sim \mathrm{Poisson}(7Y)$.
E[X]:
$E(X \mid Y) = 7Y$
$E(X) = E(E(X \mid Y)) = E(7Y) = 7/2$
Var[X]:
$\mathrm{var}(X) = E[\mathrm{var}[X \mid Y]] + \mathrm{var}[E[X \mid Y]]$
Now, $\mathrm{var}[X \mid Y] = 7Y$ (variance of a Poisson random variable), so $E[\mathrm{var}[X \mid Y]] = 7/2$.
Since $E[X \mid Y] = 7Y$ and Y ~ Unif[0, 1], then $\mathrm{var}[E[X \mid Y]] = 49/12$.
So $\mathrm{var}(X) = 7/2 + 49/12 = 91/12$.
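A simulation sketch of this example, sampling the Poisson count by accumulating exp(7) inter-arrival times up to the train time Y:

```python
import random, statistics

random.seed(7)
n = 300_000
xs = []
for _ in range(n):
    y = random.random()                 # next train arrives at Y ~ Unif[0, 1] hours
    # number boarding ~ Poisson(7y): count exp(7) inter-arrivals landing in [0, y]
    t, k = random.expovariate(7), 0
    while t <= y:
        k += 1
        t += random.expovariate(7)
    xs.append(k)

assert abs(statistics.fmean(xs) - 7 / 2) < 0.05          # E[X] = 3.5
assert abs(statistics.pvariance(xs) - 91 / 12) < 0.2     # Var[X] ~= 7.583
```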


Lecture by J ohn Shortle, partially transcribed by J ames LaBelle, based on the class textbook: Ross, S. M., 2003,
Introduction to Probability Models, 8
th
, Academic Press.
OR / STAT 645: Stochastic Processes
Lecture 3: Poisson Process
Further Properties, Generalizations, and Applications
Given: 9/14/2006


Splitting a Poisson Process

Problem Set-up / Assumptions:
Let $N(t)$ be a Poisson process with rate $\lambda$.
Each event is labeled:
o Type-I with probability p,
o Type-II with probability 1 - p.
Assignment of event types is i.i.d.
Split Poisson process:
o Let $N_I(t)$ be the number of Type-I events by time t.
o Let $N_{II}(t)$ be the number of Type-II events by time t.

Proposition (5.2, p. 296). $N_I(t)$ and $N_{II}(t)$ are independent Poisson processes with rates $\lambda p$ and $\lambda (1 - p)$, respectively.

Proof.
$N_I(0) = 0$
$N_I(t)$ has stationary and independent increments.
$P(N_I(h) \ge 2) \le P(N(h) \ge 2) = o(h)$
$P(N_I(h) = 1) = P(N_I(h) = 1 \mid N(h) = 1)\,P(N(h) = 1) + P(N_I(h) = 1 \mid N(h) \ge 2)\,P(N(h) \ge 2)$
$\quad = p(\lambda h + o(h)) + P(N_I(h) = 1 \mid N(h) \ge 2)\,o(h)$
$\quad = \lambda p h + o(h)$
= + + =
= +


Proposition (5.3, p. 303): Same assumptions as above except:
An event at time t is a type-i event with probability $p_i(t)$, $i = 1, 2, \ldots, n$, where $\sum_{i=1}^{n} p_i(t) = 1$ (independent of all else). Note: the splitting probability may depend on time.
Let $N_i(t)$ be the number of type-i events by time t.
Then, the $N_i(t)$ are independent Poisson random variables, with $E[N_i(t)] = \lambda \int_0^t p_i(s)\,ds$.
Note: the split processes are not technically Poisson processes (why?).

Corollary. If the splitting probabilities have no time dependence, then the split processes are independent Poisson processes with rate $\lambda p_i$ (or mean $\lambda p_i t$).

Example
Calls to a central office arrive according to a Poisson process with rate $\lambda = 20$ per min.
The probability that an arriving call is a voice call is 80%; the probability of a data call is 20%, independent of all else.

What is the probability that 100 or more voice calls and 50 or more data calls arrive in a 5 minute period?
Voice calls have a Poisson distribution with mean $20 \cdot 0.8 \cdot 5 = 80$.
Data calls have a Poisson distribution with mean $20 \cdot 0.2 \cdot 5 = 20$.
The two random variables are independent.
The answer is
$\left(1 - \sum_{i=0}^{99} e^{-80} \dfrac{80^i}{i!}\right) \left(1 - \sum_{i=0}^{49} e^{-20} \dfrac{20^i}{i!}\right)$.

What is the probability there are more data calls than voice calls in a 5 minute period?
$\sum_{i=0}^{\infty} \sum_{j=i+1}^{\infty} e^{-80} \dfrac{80^i}{i!} \cdot e^{-20} \dfrac{20^j}{j!}$.
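The first of these probabilities is straightforward to evaluate numerically; a Python sketch (the helper `poisson_ccdf` is our own, not from the text):

```python
import math

def poisson_ccdf(k, mean):
    """P(X >= k) for X ~ Poisson(mean), via the complementary lower tail."""
    return 1 - sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k))

p_voice = poisson_ccdf(100, 80)   # P(100 or more voice calls), mean 20*0.8*5 = 80
p_data = poisson_ccdf(50, 20)     # P(50 or more data calls),  mean 20*0.2*5 = 20
p_both = p_voice * p_data         # the two counts are independent by splitting

assert 0 < p_voice < 0.05         # a moderately rare tail event
assert 0 < p_data < 1e-6          # a very rare tail event, so p_both is tiny
```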

Example: Minimizing # of Encounters (Optional)
Assumptions:
Cars enter the highway according to a Poisson process with rate $\lambda$.
Velocity of each car is constant, but chosen according to distribution G.
Cars pass each other with no loss of time.
Q: What speed should you travel to minimize the number of encounters?

Solution
Consider a section of highway with length d and the following variables:

            Time Enter Highway   Time on Highway   Velocity
You         0                    t = d/v           v
Other car   s                    T = d/V           V

The decision variable is v (or equivalently t).

An encounter with this car occurs if:
$s < 0$ and $T + s > t$ (you pass the car), or
$s > 0$ and $T + s < t$ (the car passes you)

We classify all other cars into those involving an encounter with you and those not. A car arriving at time s is involved in an encounter with probability p(s):
$p(s) = \begin{cases} F^c(t - s) & \text{if } s < 0 \\ F(t - s) & \text{if } s > 0 \end{cases}$
where $F(t) = P(T \le t) = P(d/V \le t) = P(V \ge d/t) = G^c(d/t)$ is the CDF of the time spent by cars on this section of highway. (Note: $F(t - s) = 0$ when $t - s < 0$.)

Think of other cars arriving as a Poisson process and classify cars by whether or not they have an encounter with you. By Poisson splitting, the number of cars (over all time) involved in an encounter with you is a Poisson random variable with mean:
$\lambda \int_{-\infty}^{\infty} p(s)\,ds = \lambda \int_{-\infty}^{0} F^c(t - s)\,ds + \lambda \int_{0}^{\infty} F(t - s)\,ds$
(Note: we start counting time, for the Poisson splitting, at $-\infty$, rather than at 0.)
$\quad = \lambda \int_{t}^{\infty} F^c(s)\,ds + \lambda \int_{-\infty}^{t} F(s)\,ds$  (change of vars)
To minimize this mean, take the derivative with respect to t and set it equal to 0:
$-F^c(t) + F(t) = 0$
This implies that $F(t) = F^c(t)$. In other words, t is the median of the travel times on the road. Equivalently, you should travel at the median velocity of all cars on the road.

Application: M/G/$\infty$ Queue

Notation
M: Markovian or Memoryless arrival process (i.e., a Poisson process).
G: General service time (not necessarily exponential).
$\infty$: Infinite number of servers.

Let
$X(t)$ be the number of customers who have completed service by time t
$Y(t)$ be the number of customers who are being served at time t
$N(t)$ be the total number of customers who have arrived by time t
Then, $N(t) = X(t) + Y(t)$.

Splitting the arrival process
Fix a reference time T.
Consider the process of customers arriving prior to time T (i.e., assume $t \le T$). Note: notation is slightly different than the book, p. 304.
A customer arriving at $t \le T$ is
o Type-I if service is completed before T; occurs with probability $G(T - t)$
o Type-II if the customer is still in service at T; occurs with probability $G^c(T - t)$

Since arrival times and service times are all independent, the type assignments are independent. Therefore, we can apply Proposition 5.3:
$X(T)$ is a Poisson random variable with mean $\lambda \int_0^T G(T - t)\,dt = \lambda \int_0^T G(t)\,dt$.
$Y(T)$ is a Poisson random variable with mean $\lambda \int_0^T G^c(T - t)\,dt = \lambda \int_0^T G^c(t)\,dt$.
$X(T)$ and $Y(T)$ are independent.

What happens when $T \to \infty$?

$G(t) \to 1$ for large t. Therefore, $X(T)$ is a Poisson random variable with mean approximately $\lambda T$.
$Y(T)$ is a Poisson random variable with mean $\lambda \int_0^T G^c(t)\,dt \to \lambda E[G]$ (why does the last equality hold?)

Summary: The number of customers in service in an M/G/$\infty$ queue in steady state is a Poisson random variable with mean $\lambda E[G]$.

Note: If $\rho = \lambda/\mu$ and $1/\mu = E[G]$, then the steady-state number in service is $\sim \mathrm{Poisson}(\rho)$.

Example
Suppose insurance claims arrive according to a Poisson process with rate 5 per day. (Q: What types of insurance claims can be modeled this way? Hurricane claims? Auto-accident claims?) Suppose the time it takes to process an insurance claim is uniformly distributed on [1 day, 7 days]. What is the probability that there are no insurance claims being processed at a given moment?

Solution
The process can be modeled as an M/G/∞ queue. Assumptions made:
• Service times are independent
• There are a large number of agents, so that effectively the number of servers is infinite (i.e., no claim ever waits for service)
• The system is in steady state

Under these assumptions, the number X of customers in service is a Poisson random variable with mean λE[G] = 5 · 4 = 20. Thus,
P(X = 0) = e⁻²⁰
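As a sanity check on the Poisson(λE[G]) result, here is a small Monte Carlo sketch of the insurance example. The parameter names and simulation cutoffs are mine, not from the lecture:

```python
import random

random.seed(1)

LAM = 5.0      # claim arrival rate (per day)
T_OBS = 100.0  # observation time, long past the 7-day maximum service time
N_REPS = 2000

def claims_in_process(t_obs):
    """Simulate one M/G/infinity sample path; count claims still in service at t_obs."""
    count, t = 0, 0.0
    while True:
        t += random.expovariate(LAM)              # exponential inter-arrival times
        if t > t_obs:
            break
        if t + random.uniform(1.0, 7.0) > t_obs:  # service time G = Uniform[1, 7]
            count += 1
    return count

mean_in_service = sum(claims_in_process(T_OBS) for _ in range(N_REPS)) / N_REPS
# Theory: Poisson with mean 5 * 4 = 20, so the sample mean should be near 20
```

The empirical mean settles near 20, consistent with the steady-state claim above.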

Combining Poisson Processes

If N_I(t) and N_II(t) are independent Poisson processes with rates λ_I and λ_II, respectively, and if N(t) counts the number of events in both processes, then N(t) is a Poisson process with rate λ_I + λ_II.

Why? Inter-event times in N(t) are the minimum of inter-event times in N_I(t) and N_II(t). Hence, inter-event times in N(t) are exponential with rate λ_I + λ_II (using properties of the exponential distribution). Hence, N(t) is a Poisson process with rate λ_I + λ_II.

Non-Homogeneous Poisson Process (NHPP)

Properties
1. N(0) = 0
2. N(t) has independent increments.
3. P[N(t + h) − N(t) = 1] = λ(t)h + o(h)
4. P[N(t + h) − N(t) ≥ 2] = o(h)

Notes:
• This is like a Poisson process, without the stationarity assumption.
• In property 3, if we had just a constant λ, then we would have a regular Poisson process (stationarity is implied by properties 3 and 4).

A process with the above properties is a NHPP with intensity (or rate) function λ(t).

Def. The mean value function (for a NHPP) is m(t) = ∫₀ᵗ λ(u) du

Note: If λ(t) = λ, then m(t) = λt

Key Property
For a NHPP, N(t + s) − N(s) (the number of events between s and s + t) is a Poisson random variable with mean m(s + t) − m(s).

Proof (p. 316)

[Figure: the interval [s, s + t] divided into n bins; N(s + t) − N(s) counts the events in the interval.]

Divide the interval [s, s + t] into n bins. Let N_i be the number of events in interval i.
• Index i corresponds to the interval ( s + (i − 1)t/n, s + it/n ]
• Bin width is t/n

Using the assumed properties:
• P(N_i ≥ 2) ≈ 0. (Property 4)
• P(N_i = 1) ≈ λ(s + it/n) · (t/n). (Property 3)
• The N_i are independent. (Property 2)

Then, N(s + t) − N(s) = Σ_{i=1}^n N_i. For n large, N(s + t) − N(s) is the sum of a large number of independent, rare events. Thus, N(s + t) − N(s) is approximately a Poisson random variable with mean:
E[N(s + t) − N(s)] = E[ Σ_{i=1}^n N_i ] = Σ_{i=1}^n E[N_i]
Now, E[N_i] ≈ P(N_i = 1) ≈ λ(s + it/n) · (t/n), so
Σ_{i=1}^n E[N_i] ≈ Σ_{i=1}^n λ(s + it/n) · (t/n) → ∫_s^{s+t} λ(u) du = m(s + t) − m(s)

[Figure: the sum above is a Riemann sum, rectangles of width t/n and height λ(s + it/n) approximating the area under λ(u).]

Example

Consider a NHPP with rate
λ(t) = 10 for 0 ≤ t ≤ 0.5
λ(t) = 20 for 0.5 < t ≤ 1
Find the probability of no events on [0.25, 0.75].

The mean value function is m(t) = ∫₀ᵗ λ(u) du.

We break this integral into 2 parts: [0, 0.5] and [0.5, 1.0].

On [0, 0.5]: m(t) = ∫₀ᵗ 10 du = 10t.
On [0.5, 1]: m(t) = ∫₀^0.5 10 du + ∫_0.5^t 20 du = 5 + 20(t − 0.5) = 20t − 5.
In summary,
m(t) = 10t for 0 ≤ t ≤ 0.5
m(t) = 20t − 5 for 0.5 < t ≤ 1

Check: m(t) should be continuous at 0.5 (both pieces give m(0.5) = 5).

N(0.75) − N(0.25) is Poisson with mean m(0.75) − m(0.25) = 10 − 2.5 = 7.5.
P(N(0.75) − N(0.25) = 0) = e⁻⁷·⁵ ≈ 0.00055.
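The numbers in this example are easy to check numerically. The sketch below integrates the piecewise rate with a midpoint Riemann sum (the function names are mine):

```python
import math

def lam(t):
    """Intensity from the example: 10 on [0, 0.5], 20 on (0.5, 1]."""
    return 10.0 if t <= 0.5 else 20.0

def m(t, steps=200000):
    """Mean value function m(t) = integral of lam(u) over [0, t], midpoint rule."""
    dt = t / steps
    return sum(lam((i + 0.5) * dt) for i in range(steps)) * dt

mean_events = m(0.75) - m(0.25)       # should be 7.5
p_no_events = math.exp(-mean_events)  # should be about 0.00055
```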

Compound Poisson Process (CPP)
Remove the restriction that two or more customers cannot arrive at the same time (i.e., remove the orderliness property).
Let N(t) be a PP with rate λ, and let Y_i be i.i.d. RVs; then X(t) = Σ_{i=1}^{N(t)} Y_i is a compound Poisson process. Interpretation: there is an underlying Poisson process, but each arrival generates a batch of events.

Example #1: Buses arrive according to a Poisson process. Let Y_i be the number of people on bus i, and let X(t) be the total number of people arriving by time t.

Example #2: Insurance claims arrive according to a Poisson process. Let Y_i be the size of claim i (in dollars), and let X(t) be the total amount due to all claims by time t.

Expectation
E[X(t)] = E[ E[X(t) | N(t)] ]
Now, E[X(t) | N(t) = n] = E[ Σ_{i=1}^n Y_i ] = nE[Y_i]
So, E[ E[X(t) | N(t)] ] = E[N(t)] E[Y_i] = λt E[Y_i]
⇒ E[X(t)] = λt E[Y_i]
Variance
V[X(t)] = V[ E[X(t) | N(t)] ] + E[ V[X(t) | N(t)] ]
Now, V[X(t) | N(t) = n] = V[ Σ_{i=1}^n Y_i ] = nV[Y_i]
So,
V[ E[X(t) | N(t)] ] + E[ V[X(t) | N(t)] ]
  = V[ N(t) E[Y_i] ] + E[ N(t) V[Y_i] ]
  = λt E[Y_i]² + λt V[Y_i]
  = λt ( V[Y_i] + E[Y_i]² ) = λt E[Y_i²]
⇒ V[X(t)] = λt E[Y_i²]

Example (similar to 5.26)
People call Ticketmaster according to a Poisson process with rate λ = 2 per minute. The number of tickets ordered per call is 1, 2, 3, or 4 with probabilities 1/6, 1/3, 1/3, and 1/6, respectively.

What is the probability that at least 240 tickets are sold in the next 50 minutes?

Let N(t) be the number of calls by time t.
Let Y_i be the number of tickets sold for call i.
Let X(t) be the number of tickets sold by time t.

Then, X(t) = Σ_{i=1}^{N(t)} Y_i is a compound Poisson process, with:
E(Y_i) = (1·1 + 2·2 + 3·2 + 4·1)/6 = 15/6 = 5/2
E(Y_i²) = (1²·1 + 2²·2 + 3²·2 + 4²·1)/6 = 43/6
E(X(t)) = λt E(Y_i) = (2)(50)(5/2) = 250
V(X(t)) = λt E(Y_i²) = (2)(50)(43/6) = 2150/3

Since N(t) is relatively large, X(t) is approximately a normal random variable. Thus,
P(X(50) > 240) = P( (X(50) − 250)/√(2150/3) > (240 − 250)/√(2150/3) )
  = 1 − Φ(−0.3735) = Φ(0.3735) ≈ 0.6456
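A quick Monte Carlo check of the normal approximation. The variable names and the replication count are my choices:

```python
import random

random.seed(7)

LAM, T = 2.0, 50.0
TICKETS, WEIGHTS = [1, 2, 3, 4], [1, 2, 2, 1]   # probabilities 1/6, 1/3, 1/3, 1/6
N_REPS = 20000

def tickets_sold():
    """One sample of X(T) for the compound Poisson process."""
    t, total = 0.0, 0
    while True:
        t += random.expovariate(LAM)             # Poisson call arrivals
        if t > T:
            break
        total += random.choices(TICKETS, weights=WEIGHTS)[0]
    return total

samples = [tickets_sold() for _ in range(N_REPS)]
mean_sold = sum(samples) / N_REPS               # theory: 250
p_240 = sum(s >= 240 for s in samples) / N_REPS # normal approx gives about 0.65
```

The simulated P(X(50) ≥ 240) comes out close to the 0.6456 normal approximation (slightly higher, since the approximation ignores the continuity correction).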



OR / STAT 645: Stochastic Processes
Lecture 4: Markov Chains, Discrete & Continuous Time
Given: 9/21/2006


Discrete-Time Markov Chain (DTMC)
Let X_n (n = 0, 1, 2, …) be a stochastic process, taking on a finite or countable number of values (generally, assume that X_n ∈ {0, 1, 2, …}).

X_n is a DTMC if it has the Markov property: given the present, the future is independent of the past:
P(X_{n+1} = j | X_n = i, X_{n−1} = i_{n−1}, …, X_1 = i_1, X_0 = i_0) = P(X_{n+1} = j | X_n = i)

In this class, we assume that X_n is stationary. That is, P(X_{n+1} = j | X_n = i) does not depend on n. That is, P(X_{n+1} = j | X_n = i) ≡ p_ij. The DTMC is said to have stationary transition probabilities.

Transition probabilities must satisfy Σ_j p_ij = 1.
Often we write the transition probabilities as a matrix P.
Q: Do columns or rows sum to 1?
Continuous-Time Markov Chain (CTMC)

Let X(t) (t ≥ 0) be a stochastic process, taking on a finite or countable number of values (generally, assume that X(t) ∈ {0, 1, 2, …}).

X(t) is a CTMC if it has the Markov property: given the present, the future is independent of the past:
P(X(t + s) = j | X(s) = i, X(u) = x(u) for 0 ≤ u < s) = P(X(t + s) = j | X(s) = i)

In this class, we assume that X(t) is stationary. That is, P(X(t + s) = j | X(s) = i) does not depend on s, only on t. The CTMC is said to have stationary transition probabilities.

Distribution of Time in a State
Let T_i be the time spent in state i (before a transition). Suppose
• The MC enters state i at time 0
• The MC remains in state i through time s
What is the probability that the MC remains in state i for at least an additional t time units?

P(T_i > s + t | T_i > s)
  = P(T_i > s + t | X(s) = i) (by the Markov property)
  = P(T_i > t | X(0) = i) (by stationarity)
  = P(T_i > t)

Thus, T_i has the memoryless property, so T_i ~ exp(v_i).

CTMC: Alternate Definition
This gives an alternate definition for a CTMC:

X(t) is a CTMC if:
1. The amount of time spent in state i (before a transition) is exponentially distributed with rate v_i: T_i ~ exp(v_i).
2. When the process leaves state i, it enters state j w.p. p_ij.
3. All transitions and times are independent (in particular, the transition probability out of a state is independent of the time spent in the state).

Summary: the process moves from state to state according to a DTMC, and the time spent in each state is exponentially distributed.

The transition probabilities p_ij denote the embedded DTMC.
• As before, Σ_j p_ij = 1.
• But now, we require that p_ii = 0 (otherwise, the time spent in state i is not exponential).

Def. The instantaneous transition rate from state i to j is q_ij ≡ v_i p_ij, where v_i is the instantaneous transition rate out of state i.

Note:
Σ_j q_ij = v_i Σ_j p_ij = v_i
p_ij = q_ij / Σ_j q_ij = q_ij / v_i

Thus, you can specify a CTMC with either {p_ij, v_i} or {q_ij}.

Example
A company has 4 machines.
• The time until each machine breaks is exponentially distributed with mean 6 days.
• The repair time of each machine is exponentially distributed with mean 2 days.
• There is only one repair person.
• All random variables are independent.

Let X(t) be the number of working machines at time t.
The transition rates out of each state are (why?):
v_0 = 1/2
v_1 = 1/6 + 1/2 = 2/3
v_2 = 2/6 + 1/2 = 5/6
v_3 = 3/6 + 1/2 = 1
v_4 = 4/6 = 2/3

The transition probabilities for the embedded DTMC are (why?)
P =
[  0    1    0    0    0  ]
[ 1/4   0   3/4   0    0  ]
[  0   2/5   0   3/5   0  ]
[  0    0   1/2   0   1/2 ]
[  0    0    0    1    0  ]

Or, define the Markov chain using the transition rates q_ij (diagonal entries omitted):
Q =
[  ·   1/2   0    0    0  ]
[ 1/6   ·   1/2   0    0  ]
[  0   2/6   ·   1/2   0  ]
[  0    0   3/6   ·   1/2 ]
[  0    0    0   4/6   ·  ]
(In a moment, we will define the rate matrix Q with non-zero elements on the diagonal.)

Note: It is often easier to construct Q first and then construct P and v_i.

DTMC: n-Step Transition Probabilities
Def. n-step transition probability. Let Pⁿ_ij be the probability that the system is in state j in n steps, given the system is in state i now.
Pⁿ_ij = P(X_{n+k} = j | X_k = i).
By stationarity, Pⁿ_ij = P(X_n = j | X_0 = i).
Note: P¹_ij = p_ij (using our original notation).

Chapman-Kolmogorov equations:
P^{n+m}_ij = P(X_{n+m} = j | X_0 = i)
Must be at one of the possible states at time n:
  = Σ_k P(X_{n+m} = j, X_n = k | X_0 = i)
Apply Bayes' rule (easier to see if we ignore X_0 = i):
  = Σ_k P(X_{n+m} = j | X_n = k, X_0 = i) P(X_n = k | X_0 = i)
By the Markov property:
  = Σ_k P(X_{n+m} = j | X_n = k) P(X_n = k | X_0 = i)
Thus,
(*) P^{n+m}_ij = Σ_k P^m_kj P^n_ik.

If P⁽ⁱ⁾ is the matrix of i-step transition probabilities, then (*) is matrix multiplication:
P⁽ⁿ⁺ᵐ⁾ = P⁽ⁿ⁾ P⁽ᵐ⁾

Also, P⁽¹⁾ = P, so P⁽ⁿ⁾ = P⁽¹⁾P⁽¹⁾⋯P⁽¹⁾ = Pⁿ. In other words, the n-step transition probabilities are the elements of the matrix obtained by raising P to the nth power.

Example
[Diagram: three states 0, 1, 2; state 1 moves to state 0 w.p. 0.3 and to state 2 w.p. 0.7; states 0 and 2 move to state 1 w.p. 1.]

P =
[  0    1    0  ]
[ 0.3   0   0.7 ]
[  0    1    0  ]

What is P²₀₁? Should be 0.
What is P²₀₂? Should be 0.7.

Check:
P² =
[ 0.3   0   0.7 ]
[  0    1    0  ]
[ 0.3   0   0.7 ]
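The check can be done mechanically with matrix multiplication; a minimal numpy sketch:

```python
import numpy as np

P = np.array([[0.0, 1.0, 0.0],
              [0.3, 0.0, 0.7],
              [0.0, 1.0, 0.0]])

P2 = np.linalg.matrix_power(P, 2)   # two-step transition probabilities
# P2[0, 1] is 0 and P2[0, 2] is 0.7, as argued above; each row still sums to 1
```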

CTMC: t-time Transition Probabilities
Def. t-time transition probability. Let P_ij(t) be the probability that the system is in state j in t time units, given the system is in state i now.
P_ij(t) = P(X(t + s) = j | X(s) = i)
        = P(X(t) = j | X(0) = i) (by stationarity)

Lemma 6.2
1. lim_{h→0} [1 − P_ii(h)] / h = v_i (the rate at which the process leaves i)
Proof. P_ii(h) ≈ P(0 transitions in time h) = P(T_i > h) = e^{−v_i h}. Thus,
[1 − P_ii(h)] / h ≈ [1 − (1 − v_i h + (v_i h)²/2! − ⋯)] / h = v_i + o(h)/h

2. lim_{h→0} P_ij(h) / h = q_ij = v_i p_ij (the rate at which the process goes from i to j)

Proof. P_ij(h) ≈ P(transition before time h and transition is to state j)
= [1 − exp(−v_i h)] p_ij ≈ [1 − (1 − h v_i)] p_ij = h v_i p_ij

Lemma 6.3
P_ij(t + s) = P(X(t + s) = j | X(0) = i)
  = Σ_k P(X(t + s) = j, X(t) = k | X(0) = i)
Apply Bayes' rule (easier to see if we ignore X(0) = i):
  = Σ_k P(X(t + s) = j | X(t) = k, X(0) = i) P(X(t) = k | X(0) = i)
By the Markov property:
  = Σ_k P(X(t + s) = j | X(t) = k) P(X(t) = k | X(0) = i)
Thus,
(*) P_ij(t + s) = Σ_k P_ik(t) P_kj(s).

Forward Chapman-Kolmogorov Equations:
Basic idea: apply Lemma 6.3 using a small time step h:
P_ij(t + h) = Σ_k P_ik(t) P_kj(h)
P_ij(t + h) − P_ij(t) = Σ_{k≠j} P_ik(t) P_kj(h) − [1 − P_jj(h)] P_ij(t)
So,
P′_ij(t) = lim_{h→0} [P_ij(t + h) − P_ij(t)] / h
         = lim_{h→0} { Σ_{k≠j} P_ik(t) P_kj(h) − [1 − P_jj(h)] P_ij(t) } / h
         = Σ_{k≠j} q_kj P_ik(t) − v_j P_ij(t)

Now, let us define q_jj ≡ −v_j. Then, the previous expression becomes
P′_ij(t) = Σ_k P_ik(t) q_kj

This is just matrix multiplication P′(t) = P(t) Q with
P(t) = [ P₀₀(t) P₀₁(t) ⋯ ]      Q = [ −v₀  q₀₁  ⋯ ]
       [ P₁₀(t) P₁₁(t) ⋯ ]          [ q₁₀  −v₁  ⋯ ]
       [   ⋮      ⋮    ⋱ ]          [  ⋮    ⋮   ⋱ ]

Thus, we usually define the transition rate matrix Q with the negative diagonal elements as described.

The solution to the differential equation is
P(t) = e^{Qt} = I + Qt + (Qt)²/2! + (Qt)³/3! + ⋯
(This is the matrix analog of solving x′ = ax ⇒ x(t) = Ce^{at}.)

This solution is valid provided the v_i are bounded. In particular, it works when the number of states is finite.
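For a finite chain, P(t) = e^{Qt} can be computed directly with a matrix exponential. Below is a sketch using the 4-machine example from the earlier lecture (the choice t = 3 days is arbitrary):

```python
import numpy as np
from scipy.linalg import expm

# Rate matrix Q for the machine-repair example; state = number of working
# machines, with q_jj = -v_j on the diagonal.
Q = np.array([
    [-1/2,  1/2,    0,    0,    0],
    [ 1/6, -2/3,  1/2,    0,    0],
    [   0,  1/3, -5/6,  1/2,    0],
    [   0,    0,  1/2,   -1,  1/2],
    [   0,    0,    0,  2/3, -2/3]])

P_t = expm(Q * 3.0)   # transition probability matrix over t = 3 days
# Each row of P_t is a probability distribution (rows sum to 1)
```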


OR / STAT 645: Stochastic Processes
Lecture 5: Markov Chains, Discrete & Continuous Time
Given: 9/28/2006


Classifications of States

Def. A path is a sequence of states, where each transition has a positive probability of occurring.

Def. State j is reachable from state i (written i → j) if there is a path from i to j; equivalently, Pⁿ_ij > 0 for some n ≥ 0.

Def. States i and j communicate (i ↔ j) if i is reachable from j and j is reachable from i. (Note: a state i always communicates with itself.)

Def. A MC is irreducible if all states are in the same communication class.

Def. State i is an absorbing state if p_ii = 1.

Def. A set of states S is a closed set if no state outside of S is reachable from any state in S (like an absorbing state, but with multiple states).

Def. State i is a transient state if there exists a state j such that j is reachable from i but i is not reachable from j.

Def. A state that is not transient is recurrent. There are two types of recurrent states:
1. Positive recurrent, if the expected time to return to the state is finite.
2. Null recurrent (less common), if the expected time to return to the state is infinite (this requires an infinite number of states).

Def. A state i is periodic with period k > 1 if k is the smallest number such that all paths leading from state i back to state i have a multiple of k transitions.

Def. A state is aperiodic if it has period k = 1.

Def. A state is ergodic if it is positive recurrent and aperiodic.

Examples

[Diagram: a chain on states 0, 1, 2 with period 2]

[Diagram: a chain on states 0, 1, 2 with period 3]

[Diagram: a chain on states 0, 1, 2 with period 1
(P¹₁₁ = 0, P²₁₁ = 0, P³₁₁ > 0, P⁴₁₁ > 0)]
Communication Classes
Properties of communication:
1. i ↔ i (reflexivity)
2. i ↔ j ⇒ j ↔ i (symmetry)
3. i ↔ j and j ↔ k ⇒ i ↔ k (transitivity)
These three properties partition the set of states into communication classes. The classes are disjoint, and every state is contained in exactly one class. Each class contains states that communicate with each other. If there is only one class, the MC is irreducible.

Example
Gambler's Ruin: You win $1 with probability p and lose $1 with probability 1 − p. You stop when you reach $0 or $N. For example, for N = 4:

[Diagram: states 0 through 4; from states 1, 2, 3 move up w.p. p and down w.p. 1 − p; states 0 and 4 are absorbing.]

Communication classes are:
{0} recurrent
{1, 2, 3} transient
{4} recurrent

Example
[Diagram: a chain on states 0, 1, 2, 3.]

Communication classes are:
{0, 1} transient
{2, 3} recurrent

Transient and Recurrent Classes
Let I_n = 1 if X_n = i, and I_n = 0 if X_n ≠ i. Then Σ_{n=0}^∞ I_n is the total number of visits to state i. The expected number of visits to state i (given the MC starts in state i) is:
E[ Σ_{n=0}^∞ I_n | X_0 = i ] = Σ_{n=0}^∞ E[I_n | X_0 = i] = Σ_{n=0}^∞ P(X_n = i | X_0 = i) = Σ_{n=0}^∞ Pⁿ_ii

Therefore, the state is recurrent if Σ_{n=0}^∞ Pⁿ_ii = ∞ and transient if Σ_{n=0}^∞ Pⁿ_ii < ∞.

Technical note: Switching the expectation and the infinite sum is allowed by the monotone convergence theorem (e.g., Durrett, Probability Theory and Examples, p. 14): if Y_j ≥ 0 and Y_j ↑ Y, then E(Y_j) → E(Y). The proof is as follows. (For notational simplicity, assume all random variables are conditioned on X_0 = i.)
Let Y_j = Σ_{n=0}^j I_n and Y = Σ_{n=0}^∞ I_n. Then Y_j ≥ 0 and Y_j ↑ Y, so the MCT can be used. The switching of the expectation and the infinite sum is proved by:
Σ_{n=0}^∞ E[I_n] = lim_j Σ_{n=0}^j E[I_n] = lim_j E[ Σ_{n=0}^j I_n ] = lim_j E(Y_j) = E(Y) = E[ Σ_{n=0}^∞ I_n ]
Random Walk
With probability p, we move up 1 step; with probability 1 − p, we move down 1 step:

[Diagram: states …, −2, −1, 0, 1, 2, …; up-transitions w.p. p, down-transitions w.p. 1 − p.]

Is this chain recurrent or transient?

Probability of returning to state 0 in 2n steps:
P^{2n}₀₀ = C(2n, n) pⁿ(1 − p)ⁿ = [(2n)! / (n! n!)] pⁿ(1 − p)ⁿ

Use Stirling's approximation: n! ≈ n^{n+1/2} e^{−n} √(2π).
P^{2n}₀₀ ≈ [ (2n)^{2n+1/2} e^{−2n} √(2π) / ( n^{n+1/2} e^{−n} √(2π) )² ] pⁿ(1 − p)ⁿ
        = [ 2^{2n+1/2} / ( n^{1/2} √(2π) ) ] pⁿ(1 − p)ⁿ
        = [4p(1 − p)]ⁿ / √(πn)

State 0 is transient if Σ_{n=1}^∞ P^{2n}₀₀ < ∞, i.e., if Σ_{n=1}^∞ [4p(1 − p)]ⁿ / √(πn) < ∞.

If p = 1/2, then Σ_{n=1}^∞ P^{2n}₀₀ = Σ_{n=1}^∞ 1/√(πn) = ∞, so state 0 is recurrent.

If p ≠ 1/2, then Σ_{n=1}^∞ P^{2n}₀₀ = Σ_{n=1}^∞ aⁿ/√(πn) < Σ_{n=1}^∞ aⁿ < ∞, where a = 4p(1 − p) < 1, so state 0 is transient.

Note: A 2-dimensional symmetric random walk is recurrent. However, a 3-dimensional (or higher) symmetric random walk is transient.
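The Stirling step above can be checked against the exact binomial probability; a small sketch:

```python
import math

def p00_exact(n, p=0.5):
    """Exact return probability in 2n steps: C(2n, n) p^n (1-p)^n."""
    return math.comb(2 * n, n) * p**n * (1 - p)**n

def p00_stirling(n, p=0.5):
    """Approximation derived above: (4 p (1-p))^n / sqrt(pi n)."""
    return (4 * p * (1 - p))**n / math.sqrt(math.pi * n)

# At n = 50 the two already agree to a fraction of a percent
rel_err = abs(p00_exact(50) - p00_stirling(50)) / p00_exact(50)
```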

Limiting Probabilities (DTMC)
Theorem. For an irreducible, ergodic MC, π_j ≡ lim_{n→∞} Pⁿ_ij exists and is independent of the starting state i. Then π_j is the unique solution of π_j = Σ_i π_i p_ij and Σ_j π_j = 1.

Proof (sketch). Using the law of total probability:
P(X_{n+1} = j) = Σ_i P(X_{n+1} = j | X_n = i) P(X_n = i).
Taking limits of both sides as n → ∞:
lim_{n→∞} P(X_{n+1} = j) = lim_{n→∞} Σ_i P(X_{n+1} = j | X_n = i) P(X_n = i)
π_j = Σ_i p_ij π_i.

In matrix form, this theorem can be stated:
π = πP

Two interpretations for π_i:
1. The probability of being in state i a long time into the future (large n).
2. The long-run fraction of time in state i.

If the MC is irreducible and ergodic, then interpretations 1 and 2 are equivalent. Otherwise, π_i is still the solution to π = πP, but only interpretation 2 is valid.

Example 1
[Diagram: two states 0 and 1; each moves to the other w.p. 1.]

P = [ 0 1 ]
    [ 1 0 ]

[π₀ π₁] = [π₀ π₁] P ⇒ π₀ = π₁
π₀ + π₁ = 1 ⇒ π_i = 0.5

The chain is irreducible and positive recurrent, but not aperiodic. Thus, interpretation 1 is not valid. In particular, P^{2n}₀₀ = 1 and P^{2n+1}₀₀ = 0 for integer n.

Example 2
Planes arrive at Dulles airport.
• Three types: Heavy (H), Large (L), Small (S)
• Assume the sequence of airplanes follows a MC

[Diagram: three states H, L, S with the transition probabilities below.]

       H    L    S
H  [ 0.2  0.8   0  ]
L  [ 0.3  0.3  0.4 ]
S  [  0   0.7  0.3 ]

The stationary equations π = πP, Σπ = 1 are:
π_H = 0.2π_H + 0.3π_L
π_L = 0.8π_H + 0.3π_L + 0.7π_S
π_S = 0.4π_L + 0.3π_S
π_H + π_L + π_S = 1

One equation is redundant; eliminate the complicated equation π_L = 0.8π_H + 0.3π_L + 0.7π_S:
π_H = (3/8)π_L
π_S = (4/7)π_L
So, π_L(3/8 + 1 + 4/7) = 1, giving
π_H ≈ 0.193
π_L ≈ 0.514
π_S ≈ 0.294
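The arithmetic can be verified by solving π = πP numerically; a sketch (states ordered H, L, S):

```python
import numpy as np

P = np.array([[0.2, 0.8, 0.0],    # H
              [0.3, 0.3, 0.4],    # L
              [0.0, 0.7, 0.3]])   # S

# pi (P - I) = 0 transposed; replace one redundant balance equation with
# the normalization sum(pi) = 1
A = np.vstack([(P.T - np.eye(3))[:2], np.ones(3)])
pi = np.linalg.solve(A, np.array([0.0, 0.0, 1.0]))
# pi is approximately [0.193, 0.514, 0.294], i.e. [21, 56, 32] / 109
```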


Example 3: Google
The following Markov chain is motivated by the Google search engine.

Consider the following MC:
• States are web pages
• Randomly choose a new page from the available links (w.p. 1/n, where n is the number of links on the current page)

Page rank is determined by π_j, the overall fraction of visits to page j.
Note: Page rank is boosted by
• Many links to the site
• Having the pages which link to the site have a high page rank themselves

Some issues:
• Web pages with no links (absorbing states)
• Web pages with circular links (absorbing communication class)

Solution: At each site,
• With probability p, choose a random web page from all web pages
• With probability 1 − p, choose a random web page from the existing links
Limiting Probabilities (CTMC)
Let P_j ≡ lim_{t→∞} P_ij(t) (assume no dependence on i).

Using the Chapman-Kolmogorov forward equations (recall, we defined q_jj = −v_j):
P′_ij(t) = Σ_k P_ik(t) q_kj (in matrix form: P′(t) = P(t) Q)
lim_{t→∞} P′_ij(t) = lim_{t→∞} Σ_k P_ik(t) q_kj
Now, assuming that the limit exists, P′_ij(t) must go to zero, since probabilities are bounded by 0 and 1. Therefore, with P_ik(t) → P_k (assuming the limit does not depend on the initial state i):
0 = Σ_k P_k q_kj

In matrix notation, this is PQ = 0, where P = [P₀ P₁ P₂ ⋯] and
Q = [ −v₀  q₀₁  q₀₂  ⋯ ]
    [ q₁₀  −v₁  q₁₂  ⋯ ]
    [ q₂₀  q₂₁  −v₂  ⋯ ]

Remarks: We have assumed that the limiting probabilities P_i exist (and do not depend on the initial condition). A sufficient condition for this is: the MC is positive recurrent and irreducible (note: we don't need aperiodicity, as we do in the DTMC case).

Interpretation of this equation:
PQ = 0
⇒ 0 = Σ_{k≠j} P_k q_kj − P_j v_j
⇒ P_j v_j = Σ_{k≠j} P_k q_kj
The left-hand side is the rate of transitions out of state j.
The right-hand side is the rate of transitions into state j.
Example

• 3 machines, time to failure ~ exp(1)
• 2 service workers, time to repair ~ exp(8)

Here the state is the number of working machines. Then PQ = 0 is:

              [ −16   16    0    0 ]
[P₀ P₁ P₂ P₃] [   1  −17   16    0 ] = 0
              [   0    2  −10    8 ]
              [   0    0    3   −3 ]

−16P₀ + P₁ = 0 ⇒ P₁ = 16P₀
16P₀ − 17P₁ + 2P₂ = 0 ⇒ P₂ = 128P₀
16P₁ − 10P₂ + 3P₃ = 0 ⇒ P₃ = (1024/3)P₀

P₀ + P₁ + P₂ + P₃ = P₀(1 + 16 + 128 + 1024/3) = 1

P₀ = 3/1459 ≈ 0.00206
P₁ = 48/1459 ≈ 0.03290
P₂ = 384/1459 ≈ 0.26319
P₃ = 1024/1459 ≈ 0.70185
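The same answer comes out of solving PQ = 0 with ΣP = 1 numerically; a sketch:

```python
import numpy as np

# Rate matrix for 3 machines (failure rate 1 each) and 2 repair workers
# (repair rate 8 each); state = number of working machines.
Q = np.array([[-16.0,  16.0,   0.0,  0.0],
              [  1.0, -17.0,  16.0,  0.0],
              [  0.0,   2.0, -10.0,  8.0],
              [  0.0,   0.0,   3.0, -3.0]])

# P Q = 0 plus normalization: drop one balance equation, add sum(P) = 1
A = np.vstack([Q.T[:3], np.ones(4)])
P_lim = np.linalg.solve(A, np.array([0.0, 0.0, 0.0, 1.0]))
# P_lim equals [3, 48, 384, 1024] / 1459
```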



Example: M/M/1 Queue
[Diagram: states 0, 1, 2, 3, 4, …; up-transitions at rate λ, down-transitions at rate μ.]

Q = [ −λ      λ       0       0    ⋯ ]
    [  μ   −(λ+μ)     λ       0    ⋯ ]
    [  0      μ    −(λ+μ)     λ    ⋯ ]
    [  ⋮                           ⋱ ]

PQ = 0 gives the balance equations:
λP₀ = μP₁
(λ + μ)P₁ = λP₀ + μP₂
(λ + μ)P₂ = λP₁ + μP₃
⋮
Σ_i P_i = 1

Generalizing this we get:
P_i = (λ/μ)ⁱ P₀
P₀ = 1 − λ/μ
⇒ P_i = (λ/μ)ⁱ (1 − λ/μ)

OR / STAT 645: Stochastic Processes
Lecture 6: Markov Chains, Applications, Branching Processes


Limiting Probabilities (Review)

DTMC:
π = πP and Σ_i π_i = 1.
Sufficient conditions for the limiting probabilities and a unique solution to exist: irreducible and ergodic.
CTMC:
PQ = 0 and Σ_i P_i = 1.
Sufficient conditions for the limiting probabilities and a unique solution to exist: irreducible and positive recurrent.
Since (under the given assumptions) the solution is unique, if you can guess π_i or P_i satisfying the above equations, then π_i or P_i are the limiting probabilities.

CTMC Example: Tandem Queue

[Diagram: customers arrive at rate λ to Queue #1 (service rate μ₁), then proceed to Queue #2 (service rate μ₂), then depart.]

Assumptions
• Exponential inter-arrival times
• Exponential service times
• All times independent

To model as a CTMC, choose a 2-dimensional state space: X(t) = (a, b), where
• a is the number at queue 1 (including any customer in service)
• b is the number at queue 2 (including any customer in service)

For a Markov chain like this, it is hard to write out the transition matrix Q, because the state space is 2-dimensional. Instead, we write out the rate balance equations for each state.

Let P_{a,b} be the limiting probability of being in state (a, b). The rate balance equations are:

1. Node (a, b), where a, b ≥ 1:
(λ + μ₁ + μ₂) P_{a,b} = λ P_{a−1,b} + μ₁ P_{a+1,b−1} + μ₂ P_{a,b+1}
2. Node (0, b), where b ≥ 1:
(λ + μ₂) P_{0,b} = μ₁ P_{1,b−1} + μ₂ P_{0,b+1}
3. Node (a, 0), where a ≥ 1:
(λ + μ₁) P_{a,0} = μ₂ P_{a,1} + λ P_{a−1,0}
4. Node (0, 0):
λ P_{0,0} = μ₂ P_{0,1}

These equations are based on the rate diagram for the two-dimensional state space.
[Figure: grid of states (a, b) for a, b = 0, 1, 2, with arrival transitions at rate λ, queue-1 service transitions at rate μ₁, and queue-2 service transitions at rate μ₂; the four equation types correspond to interior, left-boundary, bottom-boundary, and corner nodes.]
Now, we guess the form of P_{a,b} and show that it satisfies all of the equations above. Clearly, the first queue operates as an M/M/1 queue. Recall, the limiting probabilities for an M/M/1 queue are:
P_n = (1 − λ/μ)(λ/μ)ⁿ

Conjecture that the second queue operates as an independent M/M/1 queue.
Queue 1: arrival rate = λ, service rate = μ₁
Queue 2: arrival rate = λ, service rate = μ₂
Thus, the joint distribution is:
P_{a,b} = (1 − λ/μ₁)(λ/μ₁)ᵃ (1 − λ/μ₂)(λ/μ₂)ᵇ

We can regard the terms that do not depend on a or b as a normalizing constant:
P_{a,b} = C (λ/μ₁)ᵃ (λ/μ₂)ᵇ

Check that this solves the above equations.

Equation (1): (λ + μ₁ + μ₂) P_{a,b} = λ P_{a−1,b} + μ₁ P_{a+1,b−1} + μ₂ P_{a,b+1}
Plugging in:
(λ + μ₁ + μ₂) C (λ/μ₁)ᵃ(λ/μ₂)ᵇ = λ C (λ/μ₁)^{a−1}(λ/μ₂)ᵇ + μ₁ C (λ/μ₁)^{a+1}(λ/μ₂)^{b−1} + μ₂ C (λ/μ₁)ᵃ(λ/μ₂)^{b+1}
Dividing by C (λ/μ₁)ᵃ(λ/μ₂)ᵇ:
λ + μ₁ + μ₂ = λ(μ₁/λ) + μ₁(λ/μ₁)(μ₂/λ) + μ₂(λ/μ₂)
λ + μ₁ + μ₂ = μ₁ + μ₂ + λ ✓
We also need to check the other equations, but we omit that here.

Summary: The steady-state distribution for the number in each queue is as if the 2 queues were independent M/M/1 queues. But the second queue is not really independent of the first.
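The omitted checks of equations (2) through (4) are easy to do numerically. A sketch, with arbitrary rates λ = 1, μ₁ = 2, μ₂ = 3 of my choosing:

```python
lam, mu1, mu2 = 1.0, 2.0, 3.0
rho1, rho2 = lam / mu1, lam / mu2

def P(a, b):
    """Conjectured product form (the constant C cancels in every balance check)."""
    return rho1**a * rho2**b

tol, ok = 1e-12, True
for a in range(1, 6):
    for b in range(1, 6):   # equation (1), interior states
        ok &= abs((lam+mu1+mu2)*P(a,b) - (lam*P(a-1,b) + mu1*P(a+1,b-1) + mu2*P(a,b+1))) < tol
for b in range(1, 6):       # equation (2), boundary a = 0
    ok &= abs((lam+mu2)*P(0,b) - (mu1*P(1,b-1) + mu2*P(0,b+1))) < tol
for a in range(1, 6):       # equation (3), boundary b = 0
    ok &= abs((lam+mu1)*P(a,0) - (mu2*P(a,1) + lam*P(a-1,0))) < tol
ok &= abs(lam*P(0,0) - mu2*P(0,1)) < tol   # equation (4), corner
# ok is True: the product form satisfies every balance equation checked
```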

DTMC Example: Family Genetics
Consider left- and right-handed people. The book "What to Expect: The Toddler Years" provides the following probabilities of having left-handed children based on the handedness of the parents:
Parents   Prob(Left-handed Child)
LL        a = 0.50
LR        b = 0.17
RR        c = 0.02
Using this information, what is the fraction p of left-handed people?

Let X_n be the handedness of the first-born of the nth generation (X_n ∈ {L, R}).

A left-handed child
• Marries a left-handed spouse with probability p
  o Has a left-handed kid with probability a
  o Has a right-handed kid with probability 1 − a
• Marries a right-handed spouse with probability 1 − p
  o Has a left-handed kid with probability b
  o Has a right-handed kid with probability 1 − b
A right-handed child
• Marries a left-handed spouse with probability p
  o Has a left-handed kid with probability b
  o Has a right-handed kid with probability 1 − b
• Marries a right-handed spouse with probability 1 − p
  o Has a left-handed kid with probability c
  o Has a right-handed kid with probability 1 − c

Thus, the transition matrix is
       L                    R
L [ pa + (1−p)b    1 − [pa + (1−p)b] ]
R [ pb + (1−p)c    1 − [pb + (1−p)c] ]

Now, solve π = πP:
π_L = π_L[pa + (1−p)b] + π_R[pb + (1−p)c],
where π_L represents the probability of being left-handed for large n, which is, by definition, p. Also, we have π_R = 1 − π_L.
p = p[pa + (1−p)b] + (1−p)[pb + (1−p)c]
p = p²a + 2p(1−p)b + (1−p)²c
0 = (a − 2b + c)p² + (2b − 2c − 1)p + c
0 = 0.18p² − 0.70p + 0.02
p = [0.70 ± √(0.49 − 4(0.18)(0.02))] / 0.36
p = 3.86 or 0.03
Choose the value of p that is a probability, so p ≈ 0.03.

The actual percentage of left-handed people is about 10%. What explains the incorrect value of p?
• Not a Markov chain: the next state may also depend on the handedness of grandparents, great-grandparents, …
• Other factors influencing the next state
• The probability that two left-handed parents have a left-handed child may be much greater than 50% (the value presented in the book was approximate)
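The quadratic can be solved numerically; a sketch:

```python
import numpy as np

a, b, c = 0.50, 0.17, 0.02
# 0 = (a - 2b + c) p^2 + (2b - 2c - 1) p + c, from the derivation above
roots = np.roots([a - 2*b + c, 2*b - 2*c - 1, c])
# One root is about 3.86 (not a probability); keep the one in [0, 1]
p = min(r.real for r in roots if 0 <= r.real <= 1)
```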

Branching Process
Consider a population.
• Each individual produces j new offspring each period with probability p_j, j ≥ 0.
• Assume that p_j < 1 for all j (i.e., the problem is not deterministic).
• Let X_n be the size of the population at period n.

[Diagram: a tree of generations X₀, X₁, X₂, with each individual branching into its offspring.]

Usually modeled as a DTMC. The states of the Markov chain are 0, 1, 2, 3, 4, ….

How many communication classes?
• State 0 is absorbing.
• All other states are transient (assuming p₀ > 0), since one can get to 0 from any state, but one cannot get to that state from 0. In other words, it is possible that all individuals fail to produce any offspring during the same time period.
• Nevertheless, it is possible that the MC has an infinite positive drift to the right (in other words, every state is transient, but you don't have to end up in state 0).

Let μ be the average number of offspring per individual. That is, μ = Σ_{j=0}^∞ j p_j.

• If μ ≤ 1, the system will always end up in state 0 (the population dies out).
• If μ > 1, the system may end up in state 0 or may grow to infinity.

Fundamental question: What is the probability the population survives indefinitely?

Let Z_i be the number of offspring of the ith individual from generation n − 1. Then,
X_n = Σ_{i=1}^{X_{n−1}} Z_i

E[X_n] = Σ_{k=0}^∞ E[X_n | X_{n−1} = k] P(X_{n−1} = k)
      = Σ_{k=0}^∞ E[ Σ_{i=1}^k Z_i ] P(X_{n−1} = k)
      = Σ_{k=0}^∞ kμ P(X_{n−1} = k)
      = μ E[X_{n−1}]

Thus (if we start with one individual),
E[X₀] = 1
E[X₁] = μ
E[X₂] = μ²
⋮
E[X_n] = μⁿ
Let π_0 be the probability that the population dies out. Condition on X_1:

π_0 = Σ_{j=0}^∞ P(population dies out | X_1 = j) P(X_1 = j)
    = Σ_{j=0}^∞ π_0^j p_j        (*)

(Given X_1 = j, the population dies out only if all j independent family lines die out, each with probability π_0, hence the factor π_0^j.)

When μ > 1, it can be shown that π_0 is the smallest positive number satisfying (*).

Example
Suppose:
p_0 = 0.3
p_1 = 0.3
p_2 = 0.4
From this data, μ = 0.3(0) + 0.3(1) + 0.4(2) = 1.1.

π_0 = 0.3 + 0.3 π_0 + 0.4 π_0^2
0 = 0.3 - 0.7 π_0 + 0.4 π_0^2

π_0 = [ .7 ± sqrt(.49 - .48) ] / .8 = (.7 ± .1)/.8 = 3/4   (taking the smaller root)
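As a quick numerical check, the smallest root of (*) can be found by iterating π ← Σ_j p_j π^j starting from 0, a standard fixed-point approach for branching processes. The sketch below is an illustration, not from the lecture:

```python
def extinction_probability(p, iterations=200):
    """Iterate pi <- sum_j p_j * pi**j starting from 0; this converges to
    the smallest nonnegative root of the fixed-point equation (*)."""
    pi = 0.0
    for _ in range(iterations):
        pi = sum(pj * pi**j for j, pj in enumerate(p))
    return pi

# Offspring distribution from the example: p0 = 0.3, p1 = 0.3, p2 = 0.4 (mu = 1.1)
pi0 = extinction_probability([0.3, 0.3, 0.4])
print(round(pi0, 4))  # -> 0.75
```

Starting the iteration at 0 matters: the map is monotone, so the iterates increase to the smallest root rather than jumping to the trivial root π = 1.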
OR 645: Stochastic Models II
Lecture 7: Markov Chains, Birth/Death Processes, Reversible Chains

Birth Death Process
DTMC Birth-Death Process: a DTMC in which the only possible transitions are up (+1) and down (-1), like a random walk. Births are represented by +1; deaths are represented by -1. Births and deaths occur one at a time.
CTMC Birth-Death Process: Let X(t) be a CTMC that represents the population size at time t (p. 353). With i people in the system, births occur at rate λ_i (arrivals) and deaths occur at rate μ_i (departures). CTMC characteristics: (a) the time spent in each state, and (b) the embedded DTMC.

For a birth-death CTMC, the embedded DTMC has probability matrix P with

P_{01} = 1,  P_{i,i+1} = λ_i/(λ_i + μ_i),  P_{i,i-1} = μ_i/(λ_i + μ_i)  for i ≥ 1,

and all other entries 0. The transition rate matrix Q has

q_{i,i+1} = λ_i,  q_{i,i-1} = μ_i,  q_{ii} = -(λ_i + μ_i),

and all other entries 0.
For the M/M/3 Queue:

λ_0 = λ_1 = λ_2 = λ_3 = λ_4 = ... = λ
μ_1 = μ,  μ_2 = 2μ,  μ_3 = μ_4 = μ_5 = ... = 3μ

Expected Time to State n: Let T_i be the time to first get to i+1 starting at i. Condition on the 1st step. Let

I_i = 1 if the 1st step is +1
I_i = 0 if the 1st step is -1

If we move directly up (e.g., 2→3):

E[T_i | I_i = 1] = 1/(λ_i + μ_i),

which occurs with probability

P[I_i = 1] = λ_i/(λ_i + μ_i)

If we move down first (e.g., 2→1→...→3), we pay the holding time, then the time to climb back from i-1 to i, then the time to go from i to i+1:

E[T_i | I_i = 0] = 1/(λ_i + μ_i) + E[T_{i-1}] + E[T_i],

which occurs with probability

P[I_i = 0] = μ_i/(λ_i + μ_i)

Unconditionally, E[T_i] = E[ E[T_i | I_i] ], so

E[T_i] = 1/(λ_i + μ_i) + (μ_i/(λ_i + μ_i)) (E[T_{i-1}] + E[T_i])

Solving for E[T_i]:

E[T_i] = 1/λ_i + (μ_i/λ_i) E[T_{i-1}],

where the initial condition is E[T_0] = 1/λ_0.
Example: (HW 6.20) There are two machines, one of which is used as a spare. A working machine will function for an exponential time with rate λ and will then fail. Upon failure, it is immediately replaced by the other machine if that one is in working order, and it goes to the repair facility. The repair facility consists of a single person who takes an exponential time with rate μ to repair a failed machine. At the repair facility, the newly failed machine enters service if the repairperson is free. If the repairperson is busy, it waits until the other machine is fixed; at that time, the newly repaired machine is put in service and repair begins on the other one. Starting with both machines working, find the expected value and variance of the time until both are in the repair facility. In the long run, what proportion of time is there a working machine?
[State diagram: states 0, 1, 2 = number of machines in repair; failures move up at rate λ, repairs move down at rate μ]

Using the hitting-time recursion with λ_0 = λ_1 = λ and μ_1 = μ:

E[T_0] = 1/λ
E[T_1] = 1/(λ + μ) + (μ/(λ + μ)) (E[T_0] + E[T_1]),  so  E[T_1] = 1/λ + μ/λ²

Expected time until both machines are in the repair facility (starting with both working) = E[T_0] + E[T_1] = 2/λ + μ/λ².
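The hitting-time recursion is easy to evaluate numerically. The sketch below (the values of λ and μ are illustrative, not from the lecture) computes E[T_i] for a general birth-death chain and checks the two-machine example against the closed form:

```python
def expected_hitting_times(lam, mu, n):
    """E[T_i] = expected time to first reach i+1 from i, for i = 0..n-1,
    using E[T_i] = 1/lam_i + (mu_i/lam_i) * E[T_{i-1}], E[T_0] = 1/lam_0."""
    ET = [1.0 / lam[0]]
    for i in range(1, n):
        ET.append(1.0 / lam[i] + (mu[i] / lam[i]) * ET[i - 1])
    return ET

# Two-machine repair example: failures at rate lam, repairs at rate mu
lam, mu = 2.0, 3.0  # illustrative values
ET = expected_hitting_times([lam, lam], [0.0, mu], 2)
total = sum(ET)  # expected time until both machines are in repair
print(total, 2 / lam + mu / lam**2)  # simulation-free check vs. 2/lam + mu/lam^2
```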

Variance in Time to State n: Use the conditional variance formula:

V[X] = E[ V[X | Y] ] + V[ E[X | Y] ]
V[T_i] = E[ V[T_i | I_i] ] + V[ E[T_i | I_i] ]

where

E[T_i | I_i] = 1/(λ_i + μ_i)                          if I_i = 1  (w.p. p = λ_i/(λ_i + μ_i))
E[T_i | I_i] = 1/(λ_i + μ_i) + E[T_{i-1}] + E[T_i]    if I_i = 0  (w.p. q = μ_i/(λ_i + μ_i))

The constant term 1/(λ_i + μ_i) contributes 0 to the variance. The remaining piece equals A = E[T_{i-1}] + E[T_i] with probability q and 0 with probability p. This is A times a Bernoulli variable, for which V[X] = E[X²] - E[X]² becomes V = A²q - (Aq)² = A²q(1 - q) = A² p q. So

V[ E[T_i | I_i] ] = ( E[T_{i-1}] + E[T_i] )² p q

For the conditional variances: given I_i = 1, T_i is a single exponential holding time, so

V[T_i | I_i = 1] = 1/(λ_i + μ_i)²

Given I_i = 0, T_i is the holding time plus the (independent) times to go from i-1 back up to i and from i to i+1, so

V[T_i | I_i = 0] = 1/(λ_i + μ_i)² + V[T_{i-1}] + V[T_i]

Therefore

E[ V[T_i | I_i] ] = 1/(λ_i + μ_i)² + q ( V[T_{i-1}] + V[T_i] )

and

V[T_i] = 1/(λ_i + μ_i)² + q ( V[T_{i-1}] + V[T_i] ) + p q ( E[T_{i-1}] + E[T_i] )²,

which can be solved recursively for V[T_i], where the initial condition is V[T_0] = 1/λ_0².
Note: In a regular Markov chain, we had P_ij = P[X_{n+1} = j | X_n = i]. For the reversed chain, define

Q_ij = P[X_n = j | X_{n+1} = i]

Using Bayes' formula:

Q_ij = P[X_n = j, X_{n+1} = i] / P[X_{n+1} = i]
     = P[X_{n+1} = i | X_n = j] P[X_n = j] / P[X_{n+1} = i]

(Note: don't confuse this with the CTMC rates q_ij.) Assume this is an ergodic, irreducible Markov chain, and let n → ∞. Then:

a. Q_ij = π_j P_ji / π_i. This is always true.

b. A chain is time reversible if Q_ij = P_ij. This is only true for a time-reversible MC.
Given the following sequence:

2 1 2 3 1 2 2 1 3 1 2 1 3 2 1 3

What if we had to estimate the 1→2 pattern occurrence? P_12 = 3/6 = 1/2. In the reverse sequence,

3 1 2 3 1 2 1 3 1 2 2 1 3 2 1 2

we get Q_12 = 4/6 = 2/3. In other words, NOT EVERY CHAIN is time reversible. A Markov chain is time reversible if the number of transitions from i→j equals the number of transitions from j→i. Expressing Q_ij = π_j P_ji / π_i together with Q_ij = P_ij gives

π_i P_ij = π_j P_ji

We can see that the transition rate i→j must equal the transition rate j→i.
Example: The following chain is not time reversible; we can go 1→3 but we cannot go 3→1.

[Diagram: states 1, 2, 3 arranged in a one-way cycle]

Example: A birth-death chain moves, for instance, like:

0 1 2 1 0 1 0 1 2 1 0

The forward chain's sequence will always be within one step of the reverse chain's sequence.
Conclusion: a Birth-Death process is time reversible.
Theorem: If you can find π_i that satisfy π_i P_ij = π_j P_ji and Σ_i π_i = 1, then the chain is time reversible and the π_i's are the limiting probabilities. In other words, if you can guess a solution to π_i P_ij = π_j P_ji, then this Markov chain is time reversible.
Example: Given the following random walk:

[Diagram: states 0, 1, 2, 3 with up-probability p and down-probability q]

Is this a time-reversible Markov chain? YES. The typical approach is to solve the balance equations:
1. π_0 p = π_1 q
2. π_1 = π_0 p + π_2 q
3. π_2 = π_1 p + π_3 q

Time-reversible (guess) approach: let i = 0 and j = 1, and using π_i P_ij = π_j P_ji we get:
1. π_0 p = π_1 q
2. π_1 p = π_2 q
3. π_2 p = π_3 q
Cut Method: Using a midterm exam problem to illustrate. Given the following Markov chain:

[Diagram: states 0, 1, 2, 3; upward probabilities 3/10, 2/10, 1/10; downward probabilities 1/2, 1, 1]

Cut across both paths between adjacent states. The rate you cross from 1→2 must equal the rate of crossings going 2→1. This chain is reversible by inspection; the equality has to be true for every possible cut. Key point: π_i P_ij = π_j P_ji and Σ_i π_i = 1.

Example: consider the midterm problem represented in the above figure. Go from 0→1 and 1→0:

π_0 (3/10) = π_1 (1/2)  so  π_1 = (3/5) π_0

Go from 1→2 and 2→1:

π_1 (2/10) = π_2 (1)  so  π_2 = (3/25) π_0

Go from 2→3 and 3→2:

π_2 (1/10) = π_3 (1)  so  π_3 = (3/250) π_0

Normalizing:

Σ_i π_i = π_0 (1 + 3/5 + 3/25 + 3/250) = 1
π_0 = 250/433,  π_1 = 150/433,  π_2 = 30/433,  π_3 = 3/433

Birth-Death processes are time reversible. A time-reversible process need not be a Birth-Death process (i.e., there are other Markov chains that are reversible besides Birth-Death).
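A quick numerical sanity check of the cut-method answer, using the transition values from the midterm example above (a sketch, done with exact fractions):

```python
from fractions import Fraction as F

# Upward transition probabilities 0->1, 1->2, 2->3 and downward 1->0, 2->1, 3->2
up = [F(3, 10), F(2, 10), F(1, 10)]
down = [F(1, 2), F(1), F(1)]

# Detailed balance: pi_{i+1} = pi_i * up[i] / down[i], starting from pi_0 = 1
pi = [F(1)]
for u, d in zip(up, down):
    pi.append(pi[-1] * u / d)

total = sum(pi)
pi = [x / total for x in pi]  # normalize so the probabilities sum to 1
print(pi)  # -> [Fraction(250, 433), Fraction(150, 433), Fraction(30, 433), Fraction(3, 433)]
```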
Example: apply the same cut argument to the M/M/3 queue from earlier. Assume arrival rate λ and service rate μ for a single server. Go from 0→1 and 1→0:

P_0 λ = P_1 μ  so  P_1 = (λ/μ) P_0

Go from 1→2 and 2→1:

P_1 λ = P_2 (2μ)  so  P_2 = λ²/(2μ²) P_0

Go from 2→3 and 3→2:

P_2 λ = P_3 (3μ)  so  P_3 = λ³/(3! μ³) P_0

and for any n ≥ 3,

P_n = λ^n / (3! 3^{n-3} μ^n) P_0
Proposition 6.8 (p. 381): A time-reversible chain with limiting probabilities P_j, j ∈ S, that is truncated to the set A ⊂ S and remains irreducible is also time reversible and has limiting probabilities P_j^A given by

P_j^A = P_j / Σ_{i∈A} P_i,   j ∈ A

[Figure: birth-death chain on states 0-5; eliminate states 4 and 5 to create the truncated Markov chain]

This is a renormalization, given that we have thrown away some states (see the figure), so that in the new truncated Markov chain Σ_i π_i = 1 and the individual probabilities keep the same relationship relative to each other. The renormalization constant is Σ_{i∈A} P_i.
Example: M/M/3/3 (from the previous example)

P_j^A = [ (λ/μ)^j / j! ] / [ Σ_{i=0}^3 (λ/μ)^i / i! ],   j = 0, 1, 2, 3

where the renormalization constant is Σ_{i=0}^3 (λ/μ)^i / i!.

In general, the blocking probability for the M/M/C/C queue is

P_C = [ (λ/μ)^C / C! ] / [ Σ_{j=0}^C (λ/μ)^j / j! ]

This is called the Erlang-B Blocking Formula. The same formula is used for M/G/C/C queues. Use it to determine the probability of being blocked (all circuits are busy).
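The Erlang-B formula is straightforward to evaluate. The sketch below uses the numerically stable recursion B(0) = 1, B(c) = aB(c-1)/(c + aB(c-1)), a standard identity equivalent to the formula above (the recursion itself is not from the lecture):

```python
def erlang_b(c, a):
    """Blocking probability for an M/M/c/c (or M/G/c/c) queue with
    offered load a = lambda/mu, via the standard Erlang-B recursion."""
    b = 1.0
    for k in range(1, c + 1):
        b = a * b / (k + a * b)
    return b

# Example: 3 servers, offered load a = 1 -> blocking probability 1/16
print(erlang_b(3, 1.0))  # -> 0.0625
```

The recursion avoids the large factorials and powers in the direct formula, which matters when C is large.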

OR 645: Stochastic Models II
Lecture 8: Exam Review, Renewal Theory

Renewal Theory
A generalization of the Poisson process in which the inter-arrival times may have any distribution.

[Figure: timeline of arrivals; X_1 is the first inter-arrival time]

DEFN: X_n is the time between the (n-1)st and nth arrivals. Basic assumption of renewal theory: the X_n are i.i.d. with distribution G, and μ = E[X_n]. Let's say μ = 30 minutes and there is variation. Question: if you show up for the Metro, how long do you wait? Intuition says the average wait would be μ/2 minutes. Answer:

E[Wait] = E[G]/2 + Var[G]/(2 E[G])

Example #1: Let G ~ Deterministic with length μ.
E[G] = μ,  Var[G] = 0
E[Wait] = E[G]/2 + Var[G]/(2 E[G]) = μ/2

Example #2: Let G ~ exp(1/μ).
E[G] = μ,  Var[G] = μ²
E[Wait] = μ/2 + μ²/(2μ) = μ
[Figure: nested classes of processes: Counting process (no structure to process) ⊃ Renewal process (i.i.d. inter-arrival times) ⊃ Poisson process (i.i.d. exponential inter-arrival times)]

DEFN: A counting process N(t), t ≥ 0, is a renewal process if the X_n are i.i.d. ~ F. N(t) is the number of arrivals by time t.
DEFN: Let S_n = Σ_{i=1}^n X_i, n ≥ 1 (the time the nth event occurred).
Notation: F denotes the inter-arrival distribution. Assume F(0) < 1, where F(0) [a CDF value] is the probability of an inter-arrival time being 0 (zero); so we are saying that inter-arrival times are not always 0. F(0) = 0: two or more arrivals cannot occur at the same time. F(0) > 0: two or more arrivals can occur at the same time.
Notation: μ = E[X_n], where μ is a mean (not a rate).
Notation: N(t) ≥ n if and only if S_n ≤ t.
DEFN: m(t) = E[N(t)] is the expected number of counted events by time t (note: m(t) is a deterministic function and N(t) is a random variable). m(t) is the Renewal Function. For a Poisson process, m(t) = λt. Theorem: m(t) uniquely determines F and vice-versa. Corollary: if m(t) = λt then F ~ exp(λ). The Poisson process is the only renewal process with a linear m(t).
Theorem (Limit Theorem, Proposition 7.1): With probability 1, N(t)/t → 1/μ as time t → ∞.
Elementary Renewal Theorem: m(t)/t → 1/μ.
Example #1: When you buy a car, the times between purchases are i.i.d. UNIF[5,15] years. What is the rate of car buying? Once every 10 years?

μ = E[UNIF[5,15]] = 10 yrs

By the Elementary Renewal Theorem, m(t)/t → 1/(10 yrs).
Example #2: The life of a machine is exponential with mean 100 days; that is, the mean time to failure is ~ exp(mean 100 d). Repair time is ~ UNIF[1,5] days. What is the rate of machine repairs? Key: find the renewal points on a timeline.

[Figure: timeline alternating Failed and Repaired, where an event is a machine failure]

μ = E[UNIF[1,5]] + E[exp(mean 100)] = 3 + 100 = 103 days
rate = 1/μ = 1/(103 days)
Example #3: M/G/1/1 queue. Customers arrive at a pay phone according to a Poisson process (λ). Time on the phone is ~ G with mean μ_G = E[G], e.g., 3.5 minutes. Any customer who arrives while the phone is in use leaves without joining the queue. Question: what is the rate of calls made? What are the renewal points? A renewal occurs each time a call ends, so a cycle is an idle period plus a call:

μ = 1/λ + μ_G

The rate at which calls are made is:

1/μ = 1/(1/λ + μ_G) = λ/(1 + λ μ_G)

Example #4: What fraction of potential customers make a call?

(rate of calls)/λ = 1/(1 + λ μ_G),

where λ is the rate of arrival of potential customers.
OR 645: Stochastic Models II
Lecture 9: Renewal Theory, Renewal-Reward Process

Renewal Reward
During every interval, you get a reward R_n, which can be a monetary gain or a monetary loss. Let R_n be the reward collected at the nth event. The R_n are i.i.d. random variables, but they are not necessarily independent of X_n; they can be dependent on X_n (e.g., a taxi fare is a reward dependent on the duration of the taxi ride). The X_i are i.i.d. random variables.

DEFN: Let R(t) = Σ_{n=1}^{N(t)} R_n be the cumulative (total) reward by time t.

Proposition:
(a) lim_{t→∞} R(t)/t = E[R_n]/E[X_n] with probability 1.
(b) lim_{t→∞} E[R(t)]/t = E[R_n]/E[X_n].
Assume that E[R_n] < ∞ and E[X_n] < ∞.
Proof:

R(t)/t = ( Σ_{n=1}^{N(t)} R_n ) / t = ( Σ_{n=1}^{N(t)} R_n / N(t) ) · ( N(t)/t ),

where N(t)/t is the event rate. By the Strong Law of Large Numbers,

Σ_{n=1}^{N(t)} R_n / N(t) → E[R]  as t → ∞,

and by Proposition 7.1,

N(t)/t → 1/E[X_n]  as t → ∞.
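The proposition is easy to see in simulation. The sketch below (the distribution and reward rule are chosen for illustration, not from the lecture) uses cycle lengths X ~ UNIF[5,15] with reward R = X² per cycle, so E[R]/E[X] = E[X²]/E[X] ≈ 10.83:

```python
import random

random.seed(42)

def renewal_reward_rate(horizon):
    """Simulate a renewal-reward process with cycle lengths X ~ UNIF[5,15]
    and per-cycle reward R = X**2; return R(t)/t at time t = horizon."""
    t = 0.0
    total_reward = 0.0
    while True:
        x = random.uniform(5, 15)
        if t + x > horizon:
            break  # stop before starting a cycle that ends past the horizon
        t += x
        total_reward += x * x  # reward collected at the end of the cycle
    return total_reward / horizon

rate = renewal_reward_rate(1_000_000)
print(rate)  # close to E[X^2]/E[X] = (100 + 100/12)/10 ~= 10.83
```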
Example #1: Car buying. Assume that a car's lifespan is ~ H [CDF]. Develop a policy where you buy a new car when (a) the car dies (i.e., a wreck) or (b) the car's life exceeds your objective value. DEFN: C_1 = cost to buy a new car; C_2 = cost to repair the car. You want to minimize the long-run average cost. Your control variable is T, the length of the car-buying cycle (the time span from one procurement to the next; or the longest you hold onto any one vehicle). The Long Run Cost Rate (LRCR) is:

LRCR = E[R_n]/E[X_n]

Renewal event: when you buy a new car. Y_n ~ H is the car's life.

X_n = Y_n  if Y_n ≤ T   (a)
X_n = T    if Y_n > T   (b)

E[X_n] = ∫_0^T x h(x) dx + ∫_T^∞ T h(x) dx
       = ∫_0^T x h(x) dx + T P(Y_n > T)
       = ∫_0^T x h(x) dx + T (1 - H(T))

R_n = C_1 + C_2  if Y_n ≤ T   (a)
R_n = C_1        if Y_n > T   (b)

E[R_n] = C_1 (1 - H(T)) + (C_1 + C_2) H(T) = C_1 + C_2 H(T)

The Long Run Cost Rate can now be written as:

LRCR = ( C_1 + C_2 H(T) ) / ( ∫_0^T x h(x) dx + T (1 - H(T)) )

Now suppose that our cars have a lifespan ~ UNIF[5,15].

[Figure: density h(t) and CDF H(t) for UNIF[5,15]]

h(x) = 1/10           for 5 ≤ x ≤ 15
H(x) = (x - 5)/10     for 5 ≤ x ≤ 15;  H(x) = 0 for x < 5;  H(x) = 1 for x > 15

Example #1 (continued): For 5 ≤ T ≤ 15,

LRCR = ( C_1 + C_2 (T - 5)/10 ) / ( ∫_5^T (x/10) dx + T (1 - (T - 5)/10) )

The denominator is

(T² - 25)/20 + T (15 - T)/10 = (T² - 25 + 30T - 2T²)/20 = (30T - T² - 25)/20

so

LRCR = g(T) = ( 20 C_1 + 2 C_2 T - 10 C_2 ) / ( 30T - T² - 25 )

The Long Run Cost Rate can be minimized by setting the 1st derivative equal to 0 and solving the resulting quadratic equation for its roots:

g'(T) = [ 2C_2 (30T - T² - 25) - (20C_1 + 2C_2 T - 10C_2)(30 - 2T) ] / (30T - T² - 25)²

Setting the numerator to 0 gives

0 = 2 C_2 T² + (40 C_1 - 20 C_2) T + 250 C_2 - 600 C_1

and solving for the roots:

roots = [ -(40C_1 - 20C_2) ± sqrt( (40C_1 - 20C_2)² - 8C_2 (250C_2 - 600C_1) ) ] / (4 C_2)

Assume that C_1 = $25,000 and C_2 = $1,000.

roots = [ -9.8×10^5 ± sqrt( 9.604×10^11 + 1.18×10^11 ) ] / 4000
      = [ -9.8×10^5 ± sqrt( 1.0784×10^12 ) ] / 4000
      = ( -9.8×10^5 ± 1.038×10^6 ) / 4000
      = 14.62, -504.6

You should keep your car for 14-15 years. (Text pages 418-419 use a different set of parameters, C_1 = $3000 and C_2 = $500, to achieve a different result, T ≈ 9.25 years.)
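As a check on the algebra, the optimal T can also be found numerically by minimizing g(T) directly, using the example's parameter values (a sketch):

```python
import math

def lrcr(T, c1, c2):
    """Long-run cost rate g(T) for UNIF[5,15] lifetimes, 5 <= T <= 15."""
    return (20 * c1 + 2 * c2 * T - 10 * c2) / (30 * T - T**2 - 25)

c1, c2 = 25_000, 1_000

# Positive root of 2*c2*T^2 + (40*c1 - 20*c2)*T + 250*c2 - 600*c1 = 0
a, b, c = 2 * c2, 40 * c1 - 20 * c2, 250 * c2 - 600 * c1
t_star = (-b + math.sqrt(b**2 - 4 * a * c)) / (2 * a)
print(round(t_star, 2))  # -> 14.62

# A grid search over [5, 15] confirms the minimizer
grid = [5 + 10 * k / 100_000 for k in range(100_001)]
t_grid = min(grid, key=lambda T: lrcr(T, c1, c2))
print(round(t_grid, 2))  # -> 14.62
```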
Example #2:
Customers arrive at a bus stop with the times between customer arrivals given by the i.i.d. random variables Y_n, with μ = E[Y_n] (units of minutes). The bus leaves when there are n customers. The cost for a bus is (a) a fixed cost K for each bus departure; and (b) a waiting cost of $c per customer per minute (i.e., $kc/min while k customers are waiting). Minimize the Long Run Cost Rate. Events are bus departures. Let X_n be the time between bus departures:

E[X_n] = n μ

[Figure: timeline of customer arrivals between bus departures; waiting costs accumulate as 0, cY_2, 2cY_3, ..., c(n-1)Y_n]

E[R_n] = E[ c Y_2 + 2c Y_3 + ... + (n-1)c Y_n + K ]
       = K + c μ (1 + 2 + ... + (n-1))
       = K + c μ n(n-1)/2

E[R_n]/E[X_n] = ( K + c μ n(n-1)/2 ) / (n μ) = K/(n μ) + c(n-1)/2

The Long Run Cost Rate can be minimized by setting the 1st derivative (with respect to n) equal to 0:

L(n) = E[R_n]/E[X_n] = K/(n μ) + c(n-1)/2
L'(n) = -K/(n² μ) + c/2 = 0

n = sqrt( 2K/(c μ) )
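A small sketch confirming the optimum (the parameter values K, c, μ are illustrative, not from the lecture):

```python
import math

def cost_rate(n, K, c, mu):
    """Long-run cost rate L(n) = K/(n*mu) + c*(n-1)/2."""
    return K / (n * mu) + c * (n - 1) / 2

K, c, mu = 100.0, 0.5, 2.0
n_star = math.sqrt(2 * K / (c * mu))  # continuous optimum: sqrt(200) ~ 14.14
print(n_star)

# The best integer n is a neighbor of the continuous optimum
best = min(range(1, 100), key=lambda n: cost_rate(n, K, c, mu))
print(best)  # -> 14
```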

Average Age of the Renewal Process
(See text pp. 423-424, 430-432 & 467.) Question: What is the average age of the cars you have owned? Let the Age of the renewal process be the time since the last renewal event:

A(t) = t - S_{N(t)}

We are interested in the average value of the Age.

[Figure: timeline with renewal events; at time t, the age is the time since the most recent event]

If we are at time t and N(t) = 3, then S_{N(t)} is the time of the 3rd event. The average value of A(t) over [0, S] is

(1/S) ∫_0^S A(t) dt
For Long Run Average Value.
For the long-run average value, take

lim_{S→∞} (1/S) ∫_0^S A(t) dt

To determine this quantity, we use renewal-reward theory in the following way. DEFN:

R_n = ∫ over cycle n of A(t) dt

Given that (per the above figure) the span between events n-1 and n is X_n, the age grows linearly from 0 to X_n over the cycle. Therefore,

R_n = X_n²/2  and  E[R_n] = E[X_n²]/2

The long-run average value of A(t) is

E[R_n]/E[X_n] = E[X_n²]/(2 E[X_n]) = ( Var[X_n] + E[X_n]² )/(2 E[X_n]) = E[X_n]/2 + Var[X_n]/(2 E[X_n])

Example #1: Suppose we have E[X_n] = 10 yrs. For instance, you buy a car every 10 yrs on average. What is the average age of the car?
CASE #1: Deterministic. You buy cars at exactly 10-yr intervals. In this case, your car's average age is 5 years:

E[X_n²]/(2 E[X_n]) = (10² + 10²)/(2 (10 + 10)) = 100/20 = 5 yr

CASE #2: Random. You buy two cars; one at age 5 yrs on the previous car, and then at age 15 yrs on the previous car:

E[X_n²]/(2 E[X_n]) = (5² + 15²)/(2 (5 + 15)) = 125/20 = 6.25 yr
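These two cases can be checked directly from E[X²]/(2E[X]); a minimal sketch:

```python
def average_age(values):
    """Long-run average age of a renewal process whose inter-event time is
    equally likely to be any of the given sample values: E[X^2] / (2 E[X])."""
    ex = sum(values) / len(values)
    ex2 = sum(v * v for v in values) / len(values)
    return ex2 / (2 * ex)

print(average_age([10, 10]))  # deterministic 10-yr cycles -> 5.0
print(average_age([5, 15]))   # random 5-yr / 15-yr cycles -> 6.25
```

The random case gives a larger average age than the deterministic case with the same mean, which is the Var[X]/(2E[X]) correction term at work.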
OR 645: Stochastic Models II
Lecture 10: Renewal Theory: Regenerative Proc., Alternating Renewal Proc.



Problem 7.9: A worker sequentially works on jobs. Each job takes a random amount of time with distribution ~ F to complete. Shocks occur and disrupt the job, arriving as a Poisson process ~ PP(λ). What is the rate of job completion?

[Figure: timeline marking shocks (x's) and finished jobs (o's)]

What and when are the renewals? When the cycle starts over in a stochastic sense.
Different renewal options, and the rewards per renewal cycle:
1. Job completions (o's): reward 1
2. Job starts (x's & o's): reward 0 or 1
3. Shock events (x's): reward 0, 1, 2, ..., n
Selecting Renewal Option #2 (job starts), with either 0 or 1 job completions before the next cycle start (the next job start). Let Z = time to complete a job (~F) and Y = time between shocks ~ exp(λ).

E[X_i] = E[min(Z, Y)]

R_i = 0  w.p. P[Y < Z]   (the shock hits first)
R_i = 1  w.p. P[Y ≥ Z]   (the job finishes first)

E[R_i]/E[X_i] = (# jobs completed)/time = P[Y ≥ Z] / E[min(Z, Y)]
[Figure: the (Y, Z) plane; the shaded region is where Y ≥ Z]

For this,

P[Y ≥ Z] = ∫_0^∞ ∫_0^y f(z) λ e^{-λy} dz dy      (joint pdf of (Y, Z))
         = ∫_0^∞ F(y) λ e^{-λy} dy

You cannot proceed any further until you know what F(y) is.
For E[min(Z, Y)], recall that for a nonnegative random variable, E[X_i] = ∫_0^∞ x f(x) dx = ∫_0^∞ F^c(x) dx. So

E[min(Z, Y)] = ∫_0^∞ P[min(Z, Y) > u] du
             = ∫_0^∞ P[Z > u] P[Y > u] du
             = ∫_0^∞ F^c(u) e^{-λu} du
Selecting Renewal Option #3 (shock events), with 0, 1, 2, ..., n job completions before the next shock. Here the cycle length is the time between shocks, so

E[X_i] = 1/λ

R_i = 0  w.p. P[Z > Y]
R_i = 1  w.p. P[Z ≤ Y] P[Z > Y]
R_i = 2  w.p. P[Z ≤ Y]² P[Z > Y]
...

(by the memoryless property of the exponential, each successive job independently finishes before the next shock with probability P[Z ≤ Y]). This is a geometric variable (the number of wins before the first loss); therefore:

E[R_i] = P[Z ≤ Y] / (1 - P[Z ≤ Y]) = P[Z ≤ Y] / P[Z > Y]

E[R_i]/E[X_i] = λ P[Z ≤ Y] / P[Z > Y]
Regenerative Process
With a renewal process, you only keep track of when renewals occur. With a regenerative process, you keep track of when the process starts over and additional information about what goes on during a cycle. DEFN: A Regenerative Process is: 1) X(t) is a stochastic process; 2) the process starts over stochastically; and 3) the times between start-overs are i.i.d. with distribution ~ F. The key question we try to answer is: what is the fraction of time in one state? DEFN: Indicator function I(t), such that

I(t) = 1 if X(t) = j (i.e., an event occurs), 0 otherwise

[Figure: sample path of X(t) taking discrete values, the indicator I(t), and the renewal points]
Let

I(t) = 1 if X(t) = 2, 0 otherwise.

Let the reward in cycle n be

R_n = ∫_{S_{n-1}}^{S_n} I(t) dt = time spent in j during cycle n,

where S_n is the time of the nth event and S_{n-1} is the time of the (n-1)th event. By the Renewal-Reward Theorem:
Long-run fraction of time in state j = lim_{t→∞} (1/t) ∫_0^t I(u) du = E[R_n]/E[X_n] = E[ ∫_{S_{n-1}}^{S_n} I(u) du ] / E[X_n]
Alternating Renewal Process
There is an ON period and an OFF period during each cycle. DEFN: The system can be ON or can be OFF. Let
Z_i = ON time of cycle i and
Y_i = OFF time of cycle i.

[Figure: sample path alternating ON periods (z_1, z_2, z_3) and OFF periods (y_1, y_2); cycle 1, cycle 2]

Note:
Z_i is independent of Z_j for i ≠ j
Y_i is independent of Y_j for i ≠ j
Z_i and Y_i may be dependent.
Assume: cycle time = X_i = Y_i + Z_i. Main result: fraction of ON time =

E[Z_i] / ( E[Z_i] + E[Y_i] )
Example: A machine works for time Z_i ~ gamma(k=3, λ=2), where this gamma distribution is the sum of 3 i.i.d. exponential distributions, each ~ exp(λ) with λ = 2. The machine is not working for time ~ UNIF(0,1).

Fraction of ON time = E[Z_i]/(E[Z_i] + E[Y_i]) = (3 · 1/2) / (3 · 1/2 + 1/2) = (3/2)/2 = 3/4

Example: A police officer sits on a country road waiting for cars to pass. Cars arrive ~ PP(λ). 1/5 of cars are speeding. The officer spends ~ UNIF(10,14) min pulling a speeder over. Assume λ = 2/min. Question: what fraction of time is he doing nothing? A renewal occurs when he completes processing a speeder.

[Figure: timeline with non-speeding cars passing, a speeder arriving, and the resulting busy period]

Fraction of time doing nothing = E[Z_j]/E[X_j], where E[X_j] = E[UNIF(10,14)] + E[Y_j].

Speeders are i.i.d. with probability 1/5, so (by thinning) speeders arrive ~ PP(λ/5) = PP((1/5)(2)) = PP(2/5), and the expected wait for the next speeder is 5/2 min:

E[X_j] = 12 + 5/2 = 29/2
E[Z_j]/E[X_j] = (5/2)/(29/2) = 5/29

Question: What is the rate of citations?
- With no time to write citations: 2/5 per min
- With time to write citations: (2/5)(5/29) = 2/29 per min
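A tiny sketch of the arithmetic in this example, done with exact fractions:

```python
from fractions import Fraction as F

lam = F(2)                 # cars per minute
speeder_rate = lam / 5     # thinned Poisson rate of speeders: 2/5 per min
idle = 1 / speeder_rate    # mean exponential wait for the next speeder: 5/2 min
busy = F(10 + 14, 2)       # mean of UNIF(10, 14): 12 min

cycle = busy + idle        # 29/2 min per renewal cycle
frac_idle = idle / cycle   # fraction of time doing nothing: 5/29
citation_rate = 1 / cycle  # one citation per cycle: 2/29 per min
print(frac_idle, citation_rate)  # -> 5/29 2/29
```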

Example: People get sick ~ PP(λ). Durations of sickness are i.i.d. ~ F. Assume the 1st person gets sick at t = 0. Question: What is the expected time until nobody is sick?
[Figure: number of sick people X(t) over time, and the indicator I(t)]

Using the Alternating Renewal Process: let

I(t) = 1 when X(t) > 0, 0 otherwise

T = length of time when X(t) > 0 and I(t) = 1.
Goal: find E[T]. This is an M/G/∞ queue where people get sick with rate λ.

X(t) ~ Poisson with mean λ ∫_0^t F^c(u) du

For t → ∞ (i.e., t large),

X(t) ~ Poisson( λ E[F] )
P[X(t) = 0] = e^{-λE[F]}
P[X(t) ≥ 1] = 1 - e^{-λE[F]},

and this is the long-run fraction of time that one or more people are sick. The OFF period (nobody sick) is the exponential wait for the next arrival, with mean 1/λ, so by the alternating renewal result:

e^{-λE[F]} = (1/λ) / ( 1/λ + E[T] )

Solving for E[T]:

E[T] = ( e^{λE[F]} - 1 ) / λ

Excess of Renewal Process
DEFN: The excess is the time until the next renewal: Excess = Y(t). Recall: A(t) = time since the last renewal occurred.

Average Long Run Excess = E[X_n]/2 + Var[X_n]/(2 E[X_n])

(the same as for the Age of the process).
Now, derive P[Y(t) > u], the probability that you will have to wait longer than u for the next event, e.g., the next bus.

[Figure: sample path with renewal points z_1, z_2, z_3, z_4]

We want to compute the long-run fraction of time that Y(t) > u. Define the indicator variable

I(t) = 1 if Y(t) > u, 0 otherwise

Let Z_j be the ON time in cycle j. Z_j = X_j - u if u < X_j; therefore,

Z_j = max(X_j - u, 0)

By the Renewal-Reward Theorem,

P[Y(t) > u] = E[Z_j]/E[X_j]

E[Z_j] = E[ max(X_j - u, 0) ] = ∫_0^∞ P[ max(X_j - u, 0) > y ] dy = ∫_0^∞ P[X_j - u > y] dy   (since y ≥ 0)

so

P[Y(t) > u] = ∫_0^∞ P[X_j > u + y] dy / E[X_j] = ∫_u^∞ F^c(z) dz / E[X_j]

This limiting distribution of the excess is the Equilibrium Distribution (pp. 432 & 469).

OR / STAT 645: Stochastic Processes
Lecture 11: Brownian Motion


Derivation / Motivation
First, consider a symmetric random walk (as a discrete-time MC)
[Figure: symmetric random walk as a Markov chain on states ..., −2, −1, 0, 1, 2, ...; from each state, move right w.p. p and left w.p. 1 − p.]

(where p = 1/2).

Intuitively, Brownian motion can be thought of as a symmetric random walk where the jump sizes are very small and where jumps occur very frequently.

Specifically, suppose that
- Jump sizes are ±Δx (rather than ±1)
- The time increment for the DTMC is Δt (rather than 1 time unit).
We will achieve Brownian motion by letting Δx → 0 and Δt → 0 in the right way.

Let X_i indicate whether the ith jump is to the right (+1) or to the left (−1). That is,

    X_i = +1 w.p. 1/2
    X_i = −1 w.p. 1/2

Note that:
    E(X_i) = 0
    V(X_i) = E(X_i²) − [E(X_i)]² = 1 − 0 = 1

Then, the state of the Markov chain after n jumps is:

    Δx (X_1 + X_2 + ... + X_n)

If we think of the discrete Markov chain operating in continuous time, then the state of the random walk at time t is:

    X(t) = Δx (X_1 + X_2 + ... + X_⌊t/Δt⌋),

where ⌊z⌋ (the floor function) is the greatest integer less than or equal to z.

Then,

    E(X(t)) = 0
    V(X(t)) = (Δx)² V[X_1 + X_2 + ... + X_⌊t/Δt⌋]
            = (Δx)² ⌊t/Δt⌋
            ≈ (Δx)² (t/Δt)

Now, we let Δx → 0 and Δt → 0.

Method #1. The most obvious way is to let Δx = Δt → 0. Then,

    E(X(t)) = 0
    V(X(t)) ≈ (Δx)² (t/Δt) = Δx · t → 0

In other words, in the limit, the random walk converges to a process where X(t) = 0. This is not an interesting process. We need to take the limit in such a way that the variance is neither 0 nor infinite.

Method #2. Let Δx = σ√Δt → 0. Then

    V(X(t)) ≈ (Δx)² (t/Δt) = σ²Δt (t/Δt) = σ²t

Note: The variance of X(t) increases linearly in time. In other words, you have less certainty about the distant future than about the process a short time from now.

Note: σ can technically be negative, but the convention is that σ > 0.

Summary: Brownian motion is the limit of a symmetric random walk where Δx = σ√Δt → 0.
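This limiting construction can be illustrated numerically. The sketch below (illustration only; σ = 1.5, t = 2, Δt = 0.001 are arbitrary choices) builds the scaled random walk with Δx = σ√Δt and checks that the mean and variance at time t are close to 0 and σ²t:

```python
import math
import random

random.seed(3)

sigma = 1.5
t = 2.0
dt = 0.001                    # time increment (delta t)
dx = sigma * math.sqrt(dt)    # jump size: dx = sigma * sqrt(dt)
n_steps = int(t / dt)         # floor(t / dt) jumps by time t

def walk_value():
    """Position of the scaled symmetric random walk at time t."""
    s = 0
    for _ in range(n_steps):
        s += 1 if random.random() < 0.5 else -1
    return dx * s

vals = [walk_value() for _ in range(2000)]
mean = sum(vals) / len(vals)
var = sum((v - mean) ** 2 for v in vals) / len(vals)
print(mean, var)  # theory: mean 0, variance sigma^2 * t = 4.5
```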

Properties of Brownian Motion

Basic properties of Brownian motion (giving only intuitive proofs)
1. X(0) = 0 (assumed)
2. X(t) ~ N(0, σ²t)
   a. Why? X(t) is the sum of a large number of i.i.d. random variables (Central Limit Theorem).
3. X(t) has independent increments.
   a. That is, X(t₂) − X(t₁) (the net change over the interval [t₁, t₂]) is independent of X(t₄) − X(t₃) (the net change over the interval [t₃, t₄]), assuming the intervals are disjoint.
   b. Why? The underlying random walk has independent increments.
4. X(t) has stationary increments.
   a. That is, X(t₂) − X(t₁) has the same distribution as X(t₄) − X(t₃) if t₂ − t₁ = t₄ − t₃. In other words, the net change over an interval depends only on the length of the interval, not on where the interval starts.
   b. Why? The underlying random walk has stationary increments.

Side note: X(t) is continuous everywhere, but differentiable nowhere (no proof given here).

What is the distribution of X(t₂) − X(t₁)?

By stationary increments (property 4), X(t₂) − X(t₁) has the same distribution as

    X(t₂ − t₁) − X(0) = X(t₂ − t₁)

By property 2, X(t₂) − X(t₁) ~ N(0, σ²(t₂ − t₁)).

Standard Brownian Motion

When σ = 1, the process is called Standard Brownian Motion (SBM).

Any Brownian motion process X(t) can be converted to SBM by dividing by σ (as we now show):

Let X(t) be Brownian motion with V(X(t)) = σ²t.
Let Y(t) = X(t)/σ.
Then Y(t) has all of the properties above, but with V(Y(t)) = t. That is, the variance coefficient for Y(t) is σ_Y = 1.

In this class, we generally analyze SBM, since any Brownian motion can be converted to SBM.

Brownian Bridge

The basic idea here is to condition on the final value of a Brownian motion process and
derive the stochastic properties in between.

[Figure: simulated Brownian path X(s) (built from the random walk with Δx = 1/16, Δt = 1/64), plotted up to time t with endpoint X(t) = B; a diagonally sloped line is also shown.]

Note: There is an error in the figure. The diagonally sloped line should run from (0,0) to
(t, B).

Let X(t) be standard Brownian motion. Suppose we know that X(t) = B. What is the distribution of X(s) (where 0 < s < t)?

(If we do not condition on X(t) = B, then X(s) ~ N(0, s).)

The goal is to find f_{X(s)|X(t)}(x | B), i.e., the density function of (X(s) | X(t) = B).

Main result

    f_{X(s)|X(t)}(x | B) ~ N( Bs/t, (s/t)(t − s) )

That is, (X(s) | X(t) = B) is normally distributed with:

    E(X(s) | X(t) = B) = Bs/t
    V(X(s) | X(t) = B) = (s/t)(t − s)

NOTE: These results are for standard Brownian motion.

Some intuitive checks:
- The expected value of (X(s) | X(t) = B) lies on the line from (0,0) to (t, B) (see figure).
- End point checks
  o If s = 0, then E(X(s)|X(t) = B) = 0 and V(X(s)|X(t) = B) = 0.
  o If s = t, then E(X(s)|X(t) = B) = B and V(X(s)|X(t) = B) = 0.
  o That is, the endpoints are known (no variance).
- The value of s with the highest variance is s = t/2. That is, we know the endpoints of the process, but we don't know exactly what happens in between.
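The conditional mean and variance can be checked by brute-force simulation (a sketch of my own, not from the lecture): sample pairs (X(s), X(t)) from SBM, keep only paths whose endpoint lands near B, and look at the retained X(s) values. With s = 3, t = 6, B = 2.5, the theory predicts mean Bs/t = 1.25 and variance (s/t)(t − s) = 1.5:

```python
import math
import random

random.seed(4)

s, t, B = 3.0, 6.0, 2.5
eps = 0.05  # keep paths whose endpoint is within eps of B

kept = []
for _ in range(400000):
    xs = random.gauss(0.0, math.sqrt(s))           # X(s) ~ N(0, s)
    xt = xs + random.gauss(0.0, math.sqrt(t - s))  # X(t) = X(s) + N(0, t - s)
    if abs(xt - B) < eps:
        kept.append(xs)

mean = sum(kept) / len(kept)
var = sum((x - mean) ** 2 for x in kept) / len(kept)
print(len(kept), mean, var)  # theory: mean = B*s/t = 1.25, var = (s/t)*(t-s) = 1.5
```

Rejection sampling is wasteful (most paths miss the window around B) but it conditions on X(t) ≈ B without assuming the result being checked.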

Derivation

    f_{X(s)|X(t)}(x | B) = f_{X(s),X(t)}(x, B) / f_{X(t)}(B)

We can split the numerator into 2 parts: The process must first go from 0 to x in s time units, then from x to B in t − s time units. These parts are independent.

    = f_{X(s)}(x) f_{X(t−s)}(B − x) / f_{X(t)}(B)

Note: the denominator does not depend on x; it is a normalizing constant. Then, using the normal density function and basic properties of SBM,

    = K₁ exp( −x²/(2s) ) exp( −(B − x)²/(2(t − s)) ),

where K₁ is a normalizing constant (absorbing the denominator and the normal-density constants) that makes the expression integrate to 1.

Multiplying out:

    = K₁ exp{ −[ 1/(2s) + 1/(2(t − s)) ] x² + Bx/(t − s) + K₂ },

where K₂ is some constant that does not depend on x.

Now, we complete the square for the expression in the exponent:

    −[ 1/(2s) + 1/(2(t − s)) ] x² + Bx/(t − s) + K₂
        = −[ t/(2s(t − s)) ] [ x² − 2Bxs/t ] + K₂
        = −[ t/(2s(t − s)) ] ( x − Bs/t )² + K₃

where K₃ is a constant. Thus,

    f(x | B) = K₄ exp{ −[ t/(2s(t − s)) ] ( x − Bs/t )² },

where again K₄ is a normalizing constant. This is exactly the density of a normal distribution with mean Bs/t and variance s(t − s)/t.

This completes the proof.

Note on all the normalizing constants: We could, if we wanted to, figure out K₁, K₂, etc. But we do not need to. At each step, we know that we have a density function that integrates to 1, so we can use that information to avoid explicitly calculating the constants.
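The completing-the-square step can be verified numerically: the difference between the original exponent and the completed square should be a constant in x (the constant absorbed into K₃, which works out to −B²/(2t)). A quick check with arbitrary values s = 2, t = 5, B = 1.7:

```python
# Verify: -x^2/(2s) - (B-x)^2/(2(t-s)) and -t/(2s(t-s)) * (x - Bs/t)^2
# differ by a constant that does not depend on x.
s, t, B = 2.0, 5.0, 1.7

def exponent(x):
    """Original exponent before completing the square."""
    return -x**2 / (2 * s) - (B - x)**2 / (2 * (t - s))

def completed(x):
    """Completed-square form (without the constant K3)."""
    return -t / (2 * s * (t - s)) * (x - B * s / t)**2

diffs = [exponent(x) - completed(x) for x in (-3.0, 0.0, 1.0, 4.2)]
print(diffs)  # all entries equal -B^2/(2t), the constant K3
```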

Example 1

Suppose you have a stock whose value follows Brownian motion with X(t) ~ N(0, σ²t) and σ = 4.

A) If the stock is up $10 after 3 hours, what is the probability you at least break even at the 6-hour mark?

We want to find P(X(6) > 0 | X(3) = 10)

    = P(X(6) − X(3) > −10 | X(3) = 10)
    = P(X(6) − X(3) > −10)        by independent increments
    = P(Y(6) − Y(3) > −10/4),     where Y(t) = X(t)/4 is SBM
    = P(N(0,3) > −2.5)            since Y(6) − Y(3) ~ N(0,3)
    = P(N(0,1) > −2.5/√3)
    = 1 − Φ(−2.5/√3)
    = Φ(2.5/√3)

Note 1: We did not use the Brownian bridge process in this first part.

B) If the stock is up $10 after 6 hours, what is the probability you were ahead after 3 hours?

We want to find P(X(3) > 0 | X(6) = 10)

    = P(Y(3) > 0 | Y(6) = 2.5),   where Y(t) = X(t)/4 is SBM.

This is a Brownian bridge process where

    (Y(3) | Y(6) = 2.5) ~ N( 2.5 · (3/6), (3/6)(6 − 3) ) = N(5/4, 3/2)

Thus,

    P(Y(3) > 0 | Y(6) = 2.5) = P(N(1.25, 1.5) > 0)
    = P(N(0,1) > −1.25/√1.5)
    = 1 − Φ(−1.25/√1.5)
    = Φ(1.25/√1.5)
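The two answers can be evaluated numerically with the standard normal CDF, written here via `math.erf` (this evaluation is mine, not in the notes):

```python
import math

def Phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Part A: P(X(6) > 0 | X(3) = 10) = Phi(2.5 / sqrt(3))
part_a = Phi(2.5 / math.sqrt(3.0))
# Part B: P(X(3) > 0 | X(6) = 10) = Phi(1.25 / sqrt(1.5))
part_b = Phi(1.25 / math.sqrt(1.5))
print(round(part_a, 4), round(part_b, 4))  # roughly 0.93 and 0.85
```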

Example 2
In a bicycle race between two competitors, let X(t) denote the amount of time (in seconds) by which the racer that started in the inside position is ahead when 100t percent of the race has been completed, and suppose that {X(t), 0 ≤ t ≤ 1} can be effectively modeled as a Brownian motion process with variance parameter σ².

(a) If the inside racer is leading by σ seconds at the midpoint of the race, what is the probability that she is the winner?

    P(X(1) > 0 | X(0.5) = σ)
    = P(X(1) − X(0.5) > −σ | X(0.5) = σ)
    = P(X(1) − X(0.5) > −σ)     by independent increments
    = P(X(0.5) − X(0) > −σ)     by stationary increments
    = P(X(0.5) > −σ)
    = P(N(0, 0.5σ²) > −σ)
    = P(N(0,1) > −σ/(σ√0.5))
    = 1 − Φ(−1/√0.5)
    = Φ(1/√0.5) = Φ(√2)
    ≈ 0.9213

(b) If the inside racer wins the race by a margin of σ seconds, what is the probability that she was ahead at the midpoint?

    P(X(0.5) > 0 | X(1) = σ)
    = P(Y(0.5) > 0 | Y(1) = 1),   where Y(t) = X(t)/σ is SBM.

Using Brownian bridge properties, (Y(0.5) | Y(1) = 1) ~ N(1 · 0.5, 0.5(1 − 0.5)) = N(0.5, 0.25):

    = P(N(0.5, 0.25) > 0)
    = P(N(0,1) > −0.5/√0.25)
    = 1 − Φ(−1)
    = Φ(1) = 0.8413

Brownian Motion with Drift

[Figure: sample path of Brownian motion with drift; the straight line shows the drift component μt.]

Definition 1. Let B(t) be standard Brownian motion. Let X(t) = σB(t) + μt. Then X(t) is Brownian motion with drift μ (and variance parameter σ²).

Definition 2. {X(t); t ≥ 0} is Brownian motion with drift μ (and variance parameter σ²) if
1. X(0) = 0
2. {X(t); t ≥ 0} has stationary and independent increments
3. X(t) ~ N(μt, σ²t)

These two definitions are equivalent.
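Definition 1 gives a direct way to simulate Brownian motion with drift, and Definition 2 tells us what distribution to expect at a fixed time. A sketch check (parameters μ = 0.1, σ = 2, t = 30 are arbitrary choices):

```python
import math
import random

random.seed(5)

mu, sigma, t = 0.1, 2.0, 30.0

# Definition 1: X(t) = sigma * B(t) + mu * t, where B(t) ~ N(0, t).
# Definition 2 predicts X(t) ~ N(mu*t, sigma^2 * t).
vals = [sigma * random.gauss(0.0, math.sqrt(t)) + mu * t for _ in range(200000)]
mean = sum(vals) / len(vals)
var = sum((v - mean) ** 2 for v in vals) / len(vals)
print(mean, var)  # theory: mean = mu*t = 3, variance = sigma^2 * t = 120
```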

Example
Let X(t) be Brownian motion with σ = 2 and drift μ = 0.1.
What is P[X(30) > 0 | X(10) = −3]?

Method 1

    P[X(30) > 0 | X(10) = −3] = P[X(30) − X(10) > 3]

Now, X(30) − X(10) is a normal random variable with mean μ(30 − 10) = 2 and variance σ²(30 − 10) = 80. Thus,

    P[X(30) − X(10) > 3] = P[N(2, 80) > 3]
        = P[ N(0,1) > (3 − 2)/√80 ]
        = 1 − Φ( 1/(4√5) )
Method 2
First convert to standard Brownian motion:

    Y(t) = (X(t) − μt)/σ is standard Brownian motion

    P(X(30) > 0 | X(10) = −3)
    = P( (X(30) − 30μ)/σ > (0 − 30μ)/σ | (X(10) − 10μ)/σ = (−3 − 10μ)/σ )
    = P( Y(30) > −3/2 | Y(10) = −2 )
    = P( Y(30) − Y(10) > −3/2 + 2 | Y(10) = −2 )
    = P( Y(30) − Y(10) > 1/2 )      by independent increments
    = P( Y(20) > 1/2 )              by stationary increments
    = P( N(0, 20) > 1/2 )
    = P( N(0,1) > (1/2)/√20 )
    = 1 − Φ( 1/(4√5) ),

which agrees with Method 1.
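As a numeric cross-check (mine, not in the notes), both methods reduce to the same normal-tail probability. Assuming the example's conditioning value X(10) = −3, Method 1 computes P[N(2,80) > 3] and Method 2 computes P[N(0,20) > 1/2]:

```python
import math

def Phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Method 1: P[N(2, 80) > 3] = 1 - Phi((3 - 2) / sqrt(80))
method1 = 1.0 - Phi((3.0 - 2.0) / math.sqrt(80.0))
# Method 2: P[N(0, 20) > 1/2] = 1 - Phi(0.5 / sqrt(20))
method2 = 1.0 - Phi(0.5 / math.sqrt(20.0))
print(method1, method2)  # both equal 1 - Phi(1/(4*sqrt(5))), roughly 0.46
```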