
Application of origin-destination
matrices to the design of train services *
AR Albrecht and PG Howlett
School of Mathematics and Statistics, University of South Australia
D Coleman
TTG Transportation Technology, Sydney, NSW

SUMMARY: We consider two related problems in the design of train services on a linear rail
network. In the first case, for a prescribed set of practical stopping plans, we determine the number
of train services with each allowable stopping pattern that best meets the known demand. We
establish fundamental results to define the concept of a maximal origin-destination demand matrix,
and use this insight to formulate and solve an integer program that finds the best collection of train
services. In the second case we discuss demand estimation from a collection of observed traffic counts.
Our aim is to outline the fundamental procedures proposed in a celebrated paper by Van Zuylen
& Willumsen (1980). These two problems arose during an Australian Mathematical Sciences
Institute (AMSI) industry internship sponsored by Sydney-based company TTG Transportation
Technology. These problems are well-suited as a basis of a senior level project-based mathematics
course in which students build research skills and develop real-world technical experience
through the study of industrial problems. The instructor may use the problems to motivate the
study of deterministic mathematical programming and stochastic optimisation, and to introduce
undergraduate mathematics students to important techniques in modern applied mathematics.

1 Introduction

An important element in creating a train timetable is
the design of train services to meet a known travel
demand within some fixed time interval. Demand is
defined by the number of people who wish to travel
between each pair of stations. The set of all possible
origin-destination (OD) demands is described by an
OD matrix (Borndörfer et al, 2004).
A key feature of each train service is the list of stations
at which the train is required to stop. This is called
a stopping pattern. Any collection of train services
that meets the known demand can be regarded as
a solution to the design problem. The solution may
include many different stopping patterns. The best
solutions could be those that maximise the number of
express services, or those that minimise travel time,
the number of passenger transfers or the number of
unnecessary stops; or it could be those that minimise
line operating costs (Borndörfer et al, 2004). At a higher
level is the issue of how an OD matrix is obtained
in the first place. For a rail network this may be a
non-trivial task. Household surveys are costly and
labour-intensive, and so indirect estimation methods
based on traffic counts have been developed. See
Cascetta & Postorino (2001) for an overview of
methods. Although these methods require more
complex mathematics, the data collection problem is
convenient and relatively inexpensive.

* Paper D09-EM09 submitted 9/06/09; accepted for
publication after review and revision 7/10/09.
Corresponding author Dr Amie Albrecht can be
contacted at amie.albrecht@unisa.edu.au.
Institution of Engineers Australia, 2009

We consider two related problems, which we now
state informally.
Problem 1: For a given OD matrix and train services
with prescribed stopping patterns and given
capacities, find the number of services for each
stopping pattern that best meets the demand.
Problem 2: On the basis of a collection of observed
traffic counts, find the best estimate for the OD matrix.
The key results in this paper are Lemmas 1 and 2
in section 2.2, and the statement and solution of
Problem 1 in section 2.4 using an integer program.
It is equally important to understand the contextual
relevance of our results. In this regard our assumption
that demand is known underlines the estimation of
Australasian Journal of Engineering Education, Vol 15 No 2


demand as a critical practical problem for transport
planners. The methods proposed by Van Zuylen &
Willumsen (1980), which we describe in sections 3.4
and 3.5, are regarded as fundamental to the problem
of demand estimation. For any serious discussion of
Problem 1, we believe it is essential to emphasise the
intrinsic connection to Problem 2 and to understand
the mathematical implications.
The Australian Mathematical Sciences Institute
(AMSI) industry internship project provides funding
for a graduate or postdoctoral student to work
on an industrial mathematics problem under the
supervision of an academic mentor and an industry
advisor. The problems discussed in this paper arose
during an AMSI internship at the University of South
Australia with industry partner TTG Transportation
Technology, which provides technological support
and consulting services to the rail industry.
In an undergraduate context, the problems described
here are well-suited as a basis of a senior level project-based
mathematics course, such as the Mathematics
Clinic program conducted at Harvey Mudd College
(HMC) in California (see Spanier (1977) for a
description). A similar course modelled on the HMC
program is run at the University of South Australia.
These programs are designed around industrial
problems that require students to build research
skills and develop real-world technical experience.
Projects are student-driven with the instructor
acting as facilitator and advisor. The instructor may
use the train service design problem to drive and
motivate the study of deterministic mathematical
programming. The demand estimation problem
requires the study of stochastic optimisation.
2 Satisfying demand

An OD matrix specifies the number of people wishing
to travel within a given time interval between each
OD pair of a traffic network. We wish to determine
train services that will best meet the demand
described by a given OD matrix. We consider travel
in one direction on a linear network.
Over the entire journey, an individual seat is occupied
by one or more passengers: when a passenger
alights at a particular station, the now empty seat
may be occupied by another passenger boarding the
train at the same station. A natural question is to ask
how many seats are required to satisfy the demand.
This number is the maximum, over all segments,
of the number of seats required on each individual
segment. The calculations are described in section
2.1. In order to minimise unnecessary stops for
individual passengers we could identify seats with
similar usage patterns and group them together to
form trains with a stopping pattern that serves the
specific needs of the group.

With these observations in mind we investigate,
in section 2.2, how an individual seat may best be
used. We define a one-seat movement pattern as a
combination of single passenger movements that,
when considered together, uses precisely one seat for
the entire journey. In section 2.3 we look at grouping
one-seat movement patterns together to form trains.
2.1 Calculating the minimum number of seats required

Consider a linear network with $m$ stations. For each $k = 1, 2, \ldots, m-1$, let $J_k$ denote the track segment between station $A_k$ and station $A_{k+1}$. The minimum number of seats required to service a given OD matrix $D$ will be the maximum, over all segments, of the number of seats occupied on each segment. We write $D = [d_{i,j}]$ for the $m \times m$ matrix where the entry $d_{i,j}$ in the $(i, j)$ position for $i < j$ denotes the number of passengers to be moved from station $i$ to station $j$. For $i \ge j$, set $d_{i,j} = 0$. The number of passengers boarding at station $k$ is given by $p_k = \sum_{j=k+1}^{m} d_{k,j}$ for $k < m$. Since no passengers board at station $A_m$, it follows that $p_m = 0$. The number of passengers alighting at station $k$ is given by $q_k = \sum_{i=1}^{k-1} d_{i,k}$ for $k > 1$. Since no passengers alight at station $A_1$, it follows that $q_1 = 0$. For each $k = 1, 2, \ldots, m$, the cumulative number of passengers that have boarded up to and including station $k$ may be expressed as $r_k = \sum_{i=1}^{k} p_i$ and the cumulative number of passengers that have alighted at or before station $k$ may be expressed as $s_k = \sum_{i=1}^{k} q_i$. Thus $t_k = r_k - s_k$ is the number of seats required on segment $J_k$ for all $k = 1, 2, \ldots, m-1$. The minimum number of seats required is $n = \max_{1 \le k \le m-1} t_k$.
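2.1.1 Example 1: a simple OD matrix

Let $m = 4$ and consider the OD matrix:

$$D = \begin{bmatrix} 0 & 4 & 3 & 4 \\ 0 & 0 & 2 & 3 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

We have:

k    p_k   q_k   r_k   s_k   t_k
1    11    0     11    0     11
2    5     4     16    4     12
3    3     5     19    9     10
4    0     10    19    19    0

and so $n = 12$. Therefore a minimum of 12 seats are required.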
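The calculation in section 2.1 can be sketched in a few lines of Python. This is a minimal illustration, not part of the original study: the function name `seats_required` and the list-of-lists representation of $D$ are assumptions made for the example, and stations are indexed from 0 inside the code.

```python
def seats_required(D):
    """Return (t, n): seats needed on segments J_1..J_{m-1} and their maximum.

    D is an m x m upper-triangular OD matrix given as a list of lists
    (hypothetical representation chosen for this sketch).
    """
    m = len(D)
    t = []
    on_board = 0
    for k in range(m - 1):  # 0-based station k, i.e. segment J_{k+1}
        boarding = sum(D[k][j] for j in range(k + 1, m))  # p_k
        alighting = sum(D[i][k] for i in range(k))        # q_k
        on_board += boarding - alighting                  # t_k = r_k - s_k
        t.append(on_board)
    return t, max(t)


D_EXAMPLE_1 = [
    [0, 4, 3, 4],
    [0, 0, 2, 3],
    [0, 0, 0, 3],
    [0, 0, 0, 0],
]

t, n = seats_required(D_EXAMPLE_1)
print(t, n)  # [11, 12, 10] 12
```

The running total `on_board` plays the role of $r_k - s_k$, so the cumulative sums never need to be stored separately.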
2.2 The set of passenger movement patterns

To define efficient movement patterns we combine
compatible demands to ensure that each seat is
occupied for the entire journey. Thus, when one
passenger alights the seat is immediately occupied
by another passenger boarding at the same station.
This leads to a collection of one-seat movement
patterns. Each one-seat movement pattern generates
a corresponding OD matrix. It is possible to service
an OD matrix without requiring that each seat
be occupied for the entire journey but this means
there would be spare capacity to move additional
passengers.
2.2.1 Example 1 (continued)

There are four different one-seat movement patterns and four corresponding OD matrices:

$$D_{(1,4)} = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad D_{(1,2,4)} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix},$$

$$D_{(1,3,4)} = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad D_{(1,2,3,4)} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$

The matrix $D_{(1,2,4)}$ corresponds to a passenger boarding at $A_1$ and alighting at $A_2$, with another boarding at $A_2$ and alighting at $A_4$.

For a general $m$ station problem, the OD matrix for a one-seat movement pattern must satisfy the following constraints:
• $\sum_j d_{1,j} = 1$ (one passenger must board at station 1)
• $\sum_i d_{i,m} = 1$ (one passenger must alight at station $m$)
• $d_{i,j} = 1$ for some $j > i$ implies $\exists\, k > j$ such that $d_{j,k} = 1$ (if one passenger alights at station $j$, one passenger must board at station $j$).

Each one-seat movement pattern $\mathbf{i}$ is defined as a permissible sequence $\mathbf{i} = (1, i_1, i_2, \ldots, i_r, m)$ where $1 < i_1 < i_2 < \cdots < i_r < m$ for some $r$ with $0 \le r \le m-2$. The set of permissible sequences will be denoted by $\mathcal{I} = \mathcal{I}_m$. We order the elements of $\mathcal{I}$ according to the rules:
1. if $r < s$, then $(1, i_1, i_2, \ldots, i_r, m) \prec (1, j_1, j_2, \ldots, j_s, m)$
2. if $i_l = j_l$ for $l < k$ and $i_k < j_k$, then $(1, i_1, i_2, \ldots, i_k, m) \prec (1, j_1, j_2, \ldots, j_k, m)$.

Each $\mathbf{i}$ has an associated OD matrix $D_{\mathbf{i}}$, with $d_{\mathbf{i},u,v} = 1$ if $u = i_p$ and $v = i_{p+1}$ for $p = 0, 1, \ldots, r$, where we write $i_0 = 1$ and $i_{r+1} = m$, and $d_{\mathbf{i},u,v} = 0$ otherwise. If we let $\mathcal{D} = \mathcal{D}_m$ be the set of OD matrices corresponding to all one-seat movement patterns for $m$ stations then $\mathcal{D} = \{D_{\mathbf{i}} \mid \mathbf{i} \in \mathcal{I}\}$.

A formula for the number of one-seat movement patterns for $m$ stations can be developed as follows. In order that a seat be occupied for the entire journey, one passenger must board at $A_1$ and one must alight at $A_m$. An exchange of passengers may or may not occur at the remaining $m-2$ stations. Hence there are $2^{m-2}$ different one-seat movement patterns for $m$ stations and $2^{m-2}$ corresponding OD matrices. Another way of counting the different one-seat movement patterns is to count the number of sequences of the form $\mathbf{i} = (1, i_1, i_2, \ldots, i_r, m)$ where $\mathbf{i} \in \mathcal{I}$. Thus we have:

$$\binom{m-2}{0} + \binom{m-2}{1} + \cdots + \binom{m-2}{m-2} = 2^{m-2} \tag{1}$$

different movement patterns. We now examine how to use one-seat movement patterns to meet the demand given in the OD matrix. Let $\mathbf{n} = \{n_{\mathbf{i}}\} \in \mathbb{Z}_+^q$, where $q = 2^{m-2}$, and define a maximal OD matrix:

$$F_{\mathbf{n}} = \sum_{\mathbf{i} \in \mathcal{I}} n_{\mathbf{i}} D_{\mathbf{i}} \tag{2}$$

The matrix $F_{\mathbf{n}}$ is useful in practice as it describes a maximal level of demand that requires the same number of seats as the original OD matrix. The number of passengers boarding the train at the first station $A_1$ is given by $n = \sum_{\mathbf{i} \in \mathcal{I}} n_{\mathbf{i}}$ and is equal to the number of passengers alighting at the final station $A_m$. At each intermediate station the number boarding is equal to the number alighting and hence the number of passengers on each segment is constant. Of course each individual one-seat movement pattern also has these same properties. We have the following results.
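The enumeration of one-seat movement patterns, their OD matrices and the weighted sum in equation (2) can be sketched as follows. The function names `one_seat_patterns`, `pattern_matrix` and `weighted_sum` are hypothetical, chosen only for this illustration; the patterns are generated shortest first and then lexicographically, matching the ordering rules given above.

```python
from itertools import combinations


def one_seat_patterns(m):
    """All sequences (1, i_1, ..., i_r, m), ordered by length and then lexicographically."""
    inner = range(2, m)  # intermediate stations 2..m-1
    return [(1, *mid, m) for r in range(m - 1) for mid in combinations(inner, r)]


def pattern_matrix(pattern, m):
    """OD matrix D_i of a one-seat movement pattern (stations numbered from 1)."""
    D = [[0] * m for _ in range(m)]
    for u, v in zip(pattern, pattern[1:]):
        D[u - 1][v - 1] = 1
    return D


def weighted_sum(n, m):
    """F_n = sum of n_i * D_i, with n listed in the order of one_seat_patterns(m)."""
    F = [[0] * m for _ in range(m)]
    for count, pattern in zip(n, one_seat_patterns(m)):
        for u, v in zip(pattern, pattern[1:]):
            F[u - 1][v - 1] += count
    return F


print(one_seat_patterns(4))       # [(1, 4), (1, 2, 4), (1, 3, 4), (1, 2, 3, 4)]
print(len(one_seat_patterns(6)))  # 16, i.e. 2**(6 - 2)
```

Because the count doubles with every extra station, explicit enumeration of this kind is only practical for small $m$.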

Lemma 1: Let $D$ be an OD matrix and let $t_k$ be the number of seats required on segment $J_k$ for each $k = 1, 2, \ldots, m$. We can find a maximal OD matrix $F$ such that the number of seats required on each segment is equal to $\max_{1 \le k \le m} t_k$.

The idea of the proof is to adjust demand so that at each station the number of alightings is the same as the number of boardings. This can be done by increasing demand for selected one segment journeys. Suppose that $t_p = \max_{1 \le k \le m} t_k$, and for the sake of argument suppose that $1 < p < m$, and $t_{p-1}$ and $t_{p+1}$ are both strictly less than $t_p$. Then at $A_p$, the number alighting must be less than the number boarding. Thus:

$$d_{1,p} + \cdots + d_{p-1,p} < d_{p,p+1} + \cdots + d_{p,m} \tag{3}$$

and hence

$$\Delta_p = (d_{p,p+1} + \cdots + d_{p,m}) - (d_{1,p} + \cdots + d_{p-1,p}) > 0 \tag{4}$$

Now define $d^*_{p-1,p} = d_{p-1,p} + \Delta_p$, with $d^*_{u,v} = d_{u,v}$ otherwise. The new OD matrix $D^*$ describes an increased level of demand. Since:

$$d^*_{1,p} + \cdots + d^*_{p-1,p} = d^*_{p,p+1} + \cdots + d^*_{p,m} \tag{5}$$

the number of people boarding at $A_p$ equals the number of people alighting and hence $t^*_{p-1} = t^*_p = t_p$. By setting $d^{**}_{p+1,p+2} = d^*_{p+1,p+2} + \Delta^*_{p+1}$, where $\Delta^*_{p+1} = t^*_p - t^*_{p+1} > 0$, and leaving the remaining elements unchanged, we can replace $D^*$ by $D^{**}$ so that $t^{**}_p = t^{**}_{p+1} = t_p$. By continuing in this way we eventually obtain a maximal OD matrix $F$ with the required properties.

Lemma 2: Each maximal OD matrix $F$ can be represented uniquely in the form $F = \sum_{\mathbf{i} \in \mathcal{I}} n_{\mathbf{i}} D_{\mathbf{i}}$, where $\mathbf{n} = \{n_{\mathbf{i}}\} \in \mathbb{Z}_+^q$ and $q = 2^{m-2}$.

The idea behind the proof is that we can subtract one-seat movement patterns from the maximal OD matrix until the matrix is exhausted.
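2.2.2 Example 1 (continued)

We use our first example to illustrate the demand modification described in the explanation of Lemma 1. Since $t_2 = 12$ is the maximum value of $t_k$, we replace $d_{1,2}$ by $d_{1,2} + \Delta_2 = d_{1,2} + t_2 - t_1 = d_{1,2} + 1$. Thus we have:

$$D = \begin{bmatrix} 0 & 4 & 3 & 4 \\ 0 & 0 & 2 & 3 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix} \;\rightarrow\; D^* = \begin{bmatrix} 0 & 5 & 3 & 4 \\ 0 & 0 & 2 & 3 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

Now we replace $d^*_{3,4}$ by $d^*_{3,4} + \Delta^*_3 = d^*_{3,4} + t^*_2 - t^*_3 = d^*_{3,4} + 2$. Therefore:

$$D^* = \begin{bmatrix} 0 & 5 & 3 & 4 \\ 0 & 0 & 2 & 3 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix} \;\rightarrow\; D^{**} = \begin{bmatrix} 0 & 5 & 3 & 4 \\ 0 & 0 & 2 & 3 \\ 0 & 0 & 0 & 5 \\ 0 & 0 & 0 & 0 \end{bmatrix} = F$$

A simple calculation shows that $F = 4D_{(1,4)} + 3D_{(1,2,4)} + 3D_{(1,3,4)} + 2D_{(1,2,3,4)}$ is maximal and that $F \ge D$. Since $F$ is defined by the vector $\mathbf{n} = (4, 3, 3, 2)$, we will write $F = F_{\mathbf{n}}$. Thus in order to satisfy the original demand matrix $D$, we require exactly four one-seat movement patterns of type (1, 4), three of type (1, 2, 4), three of type (1, 3, 4) and two of type (1, 2, 3, 4).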
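The constructions behind Lemmas 1 and 2 can be sketched in code for small examples. The sketch below makes two assumptions that go beyond the text: the demand adjustment of Lemma 1 is applied to every segment at once by topping up the one-segment journey $d_{k,k+1}$ with $n - t_k$, and the peeling step of Lemma 2 always follows the longest available leg first. The names `maximal_matrix` and `decompose` are hypothetical.

```python
def maximal_matrix(D):
    """Raise one-segment demands so every segment carries n = max t_k passengers."""
    m = len(D)
    F = [row[:] for row in D]
    load, t = 0, []
    for k in range(m - 1):
        load += sum(D[k][k + 1:]) - sum(D[i][k] for i in range(k))
        t.append(load)
    n = max(t)
    for k in range(m - 1):
        F[k][k + 1] += n - t[k]  # fill the spare seats on segment J_{k+1}
    return F


def decompose(F):
    """Peel one-seat movement patterns off a maximal matrix F.

    Returns a dict mapping each pattern (1-based station tuple) to its count.
    Assumes F is maximal, so boardings equal alightings at intermediate stations.
    """
    m = len(F)
    R = [row[:] for row in F]  # remaining demand
    counts = {}
    while any(R[0]):
        path = [0]
        while path[-1] != m - 1:
            u = path[-1]
            # illustrative rule: take the longest leg that still has demand
            path.append(max(j for j in range(u + 1, m) if R[u][j] > 0))
        legs = list(zip(path, path[1:]))
        c = min(R[u][v] for u, v in legs)
        for u, v in legs:
            R[u][v] -= c
        key = tuple(s + 1 for s in path)
        counts[key] = counts.get(key, 0) + c
    return counts


D = [[0, 4, 3, 4], [0, 0, 2, 3], [0, 0, 0, 3], [0, 0, 0, 0]]
F = maximal_matrix(D)
print(F)             # [[0, 5, 3, 4], [0, 0, 2, 3], [0, 0, 0, 5], [0, 0, 0, 0]]
print(decompose(F))  # {(1, 4): 4, (1, 3, 4): 3, (1, 2, 4): 3, (1, 2, 3, 4): 2}
```

For Example 1 this reproduces the matrix $F$ and the counts $\mathbf{n} = (4, 3, 3, 2)$ obtained above.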

2.3 The set of stopping patterns

The list of stations at which a train is required to stop is called a stopping pattern. Each stopping pattern $\mathbf{j}$ is defined as a permissible sequence $\mathbf{j} = (1, j_1, j_2, \ldots, j_s, m)$ where $1 < j_1 < j_2 < \cdots < j_s < m$ for some $s$ with $0 \le s \le m-2$. The set of permissible sequences will be denoted by $\mathcal{J} = \mathcal{J}_m$. There are $2^{m-2}$ stopping patterns for a train on a linear network with $m$ stations. Many of these may be considered impractical for a variety of reasons. In practice it is important that consumers understand the rationale for the design of train services. Thus the concept of an express train that stops only at the busiest stations is well-understood, as is the idea that some trains should stop at every station. To gain understanding of practical stopping patterns we analysed train services between Adelaide and Gawler.

2.3.1 Example 2: case study of the Gawler line

Gawler is a town with a population of approximately 20,000 people, located 40 km north of Adelaide. There are 26 stations on the Gawler line. The timetable was updated in April 2008. Figure 1 provides a summary of the stopping patterns (with associated service counts) used in the old timetable for the 29 morning services from Gawler to Adelaide. The circles indicate stations at which the service stops. Note that 16 different stopping patterns were used.

Figure 1: Old stopping patterns, Gawler to Adelaide, AM.

The stopping patterns for the new timetable shown in figure 2 are much simpler. All patterns stop at six nominated interchange stations. The new timetable mostly contains alternating repetition of patterns B and F, which together cover 25 of the 26 stations. Thus passengers may board any train with the knowledge that if this train is not scheduled to stop at their intended destination, they can alight at a preceding interchange station and board the next train.

Figure 2: New stopping patterns, Gawler to Adelaide, AM.

The case study of the Gawler line suggests that rail operators are likely to allow only practical stopping patterns. For convenience, from now on, we will use $\mathcal{J}$ to denote the set of practical stopping patterns under consideration. We observe that a passenger movement pattern $\mathbf{i}$ can be assigned to a train with stopping pattern $\mathbf{j}$ if and only if:

$$\{1, i_1, i_2, \ldots, i_r, m\} \subseteq \{1, j_1, j_2, \ldots, j_s, m\} \tag{6}$$

For convenience we write $\mathbf{i} \subseteq \mathbf{j}$. If $\mathbf{i} \subseteq \mathbf{j}$ for some practical stopping pattern $\mathbf{j} \in \mathcal{J}$, then we say that $\mathbf{i}$ is a feasible movement. If $\mathbf{i} \not\subseteq \mathbf{j}$ for all $\mathbf{j} \in \mathcal{J}$, then $\mathbf{i}$ is infeasible. If $(1, 2, \ldots, m) \in \mathcal{J}$, then all movement patterns are feasible. If $F_{\mathbf{n}} = \sum_{\mathbf{i} \in \mathcal{I}} n_{\mathbf{i}} D_{\mathbf{i}}$ is a maximal demand matrix and if $n_{\mathbf{i}} > 0$ for some infeasible $\mathbf{i}$, then $F_{\mathbf{n}}$ is infeasible. Suppose $F_{\mathbf{n}}$ is feasible and $\mathbf{i} \in \mathcal{J}$ for all $\mathbf{i}$ with $n_{\mathbf{i}} > 0$. If we provide a train of capacity $n_{\mathbf{i}}$ and stopping pattern $\mathbf{i}$ for each $\mathbf{i}$ with $n_{\mathbf{i}} > 0$, then demand is satisfied and there are no empty seats. If $F_{\mathbf{n}}$ is feasible but $\mathbf{i} \notin \mathcal{J}$ for some $\mathbf{i}$ with $n_{\mathbf{i}} > 0$, then any train service that meets demand will make unnecessary stops at intermediate stations. The most efficient service will be one with the minimal number of unnecessary stops. To see this consider a one-seat movement pattern. Such movements are full by definition. If the movement $\mathbf{i}$ is feasible but $\mathbf{i} \notin \mathcal{J}$, then we must assign this movement to a train with stopping pattern $\mathbf{j} \in \mathcal{J}$, where $\mathbf{i} \subseteq \mathbf{j}$, but $\mathbf{i} \ne \mathbf{j}$. Hence there will be unnecessary stops at intermediate stations for the movement pattern $\mathbf{i}$.
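The feasibility test of condition (6) amounts to a subset check, which the following sketch illustrates. The helper names `can_carry`, `unnecessary_stops` and `classify` are hypothetical. Counting the unnecessary stops as the number of stations in the stopping pattern that are absent from the movement pattern is an assumption made here; it is consistent with the worked example in section 2.3.2.

```python
def can_carry(movement, stopping):
    """Condition (6): every station used by the movement is a stop of the train."""
    return set(movement) <= set(stopping)


def unnecessary_stops(movement, stopping):
    """Stops made by the train that the one-seat movement pattern does not use."""
    if not can_carry(movement, stopping):
        raise ValueError("movement pattern cannot be assigned to this stopping pattern")
    return len(set(stopping) - set(movement))


def classify(movements, practical):
    """Split movement patterns into feasible and infeasible for a set of practical patterns."""
    feasible = [i for i in movements if any(can_carry(i, j) for j in practical)]
    infeasible = [i for i in movements if i not in feasible]
    return feasible, infeasible


movements = [(1, 4), (1, 2, 4), (1, 3, 4), (1, 2, 3, 4)]
print(classify(movements, [(1, 3, 4)]))
# ([(1, 4), (1, 3, 4)], [(1, 2, 4), (1, 2, 3, 4)])
print(unnecessary_stops((1, 4), (1, 2, 3, 4)))  # 2
```

Including the all-stations pattern in the practical set makes every movement feasible, as noted above.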
2.3.2 Example 1 (continued)

Suppose we have two trains, each with six seats, and prescribed stopping patterns $\mathcal{J} = \{(1, 3, 4), (1, 2, 3, 4)\}$. From before, $F = 4D_{(1,4)} + 3D_{(1,2,4)} + 3D_{(1,3,4)} + 2D_{(1,2,3,4)}$. The assignment of movement patterns $\mathbf{i} \in \mathcal{I}$ to trains with stopping patterns $\mathbf{j} \in \mathcal{J}$ is done in two steps. We first assign each movement pattern $\mathbf{i}$ to the most efficient stopping pattern $\mathbf{j} \in \mathcal{J}$. Thus we assign movement (1, 4) to stopping pattern (1, 3, 4) and movement (1, 2, 4) to stopping pattern (1, 2, 3, 4). This means seven seats are allocated to stopping pattern (1, 3, 4) and five to stopping pattern (1, 2, 3, 4). This allocation is not an efficient solution because the capacities of the available trains mean that three trains will be required. Thus we invoke the second step, which is to rearrange the feasible stopping patterns into efficient train loads. Because $(1, 3, 4) \subseteq (1, 2, 3, 4)$, one movement previously assigned to stopping pattern (1, 3, 4) can be reassigned to stopping pattern (1, 2, 3, 4). Thus six seats are assigned to (1, 3, 4) and six to (1, 2, 3, 4). Two trains are required and eight unnecessary stops are incurred.
The above arguments suggest that we may be able
to develop a general solution procedure based on
fundamental principles of counting, ordering and
allocating. These ideas will be investigated more
fully in a subsequent paper. In this paper we will use
a more standard solution procedure.
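For an instance as small as Example 1, the outcome of the two-step allocation can be checked by exhaustive search. The sketch below is illustrative only and is not the procedure used in the paper: the function `best_assignment`, the uniform train capacity and the convention that the unnecessary stops of a movement are the train's stops it does not use are all assumptions.

```python
from itertools import product
from math import ceil


def splits(total, options):
    """Yield every way of dividing `total` seats among the stopping patterns in `options`."""
    if len(options) == 1:
        yield {options[0]: total}
        return
    for first in range(total + 1):
        for rest in splits(total - first, options[1:]):
            yield {options[0]: first, **rest}


def best_assignment(demand, patterns, capacity, max_trains):
    """Exhaustively find the assignment with the fewest unnecessary stops.

    demand maps each one-seat movement pattern to its count n_i.
    Returns (stops, trains, load per stopping pattern), or None if nothing fits.
    """
    movements = list(demand)
    choices = []
    for mv in movements:
        options = [p for p in patterns if set(mv) <= set(p)]
        if not options:
            raise ValueError(f"movement {mv} is infeasible")
        choices.append(list(splits(demand[mv], options)))
    best = None
    for combo in product(*choices):
        load = {p: 0 for p in patterns}
        stops = 0
        for mv, split in zip(movements, combo):
            for p, x in split.items():
                load[p] += x
                stops += x * (len(p) - len(mv))  # stops this movement does not need
        trains = sum(ceil(load[p] / capacity) for p in patterns)
        if trains <= max_trains and (best is None or stops < best[0]):
            best = (stops, trains, load)
    return best


demand = {(1, 4): 4, (1, 2, 4): 3, (1, 3, 4): 3, (1, 2, 3, 4): 2}
stops, trains, load = best_assignment(demand, [(1, 3, 4), (1, 2, 3, 4)], capacity=6, max_trains=2)
print(stops, trains)  # 8 2
```

The search confirms that two six-seat trains suffice and that eight unnecessary stops is the minimum for this instance. The number of combinations grows quickly, so this approach does not scale beyond toy examples.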
2.4 Statement of Problem 1

Now that we have arrived at the heart of our first problem, we restate it formally.

Problem 1: Given an OD matrix $D$ on a linear network and a set of stopping patterns $\mathcal{J}$, how many one-seat movement patterns $\mathbf{i}$ (if any) for all $\mathbf{i} \in \mathcal{I}$ should be assigned to each train, and how many trains with stopping pattern $\mathbf{j} \in \mathcal{J}$ and capacity $f_{\mathbf{j}}$ are required in order to minimise the number of unnecessary stops for passengers?
2.5 The solution process

We construct an integer program that assigns each one-seat movement pattern $\mathbf{i} \in \mathcal{I}$ to a train with stopping pattern $\mathbf{j} \in \mathcal{J}$, where $\mathbf{i} \subseteq \mathbf{j}$, in such a way that the capacity of each train is not exceeded. We minimise the total number of unnecessary stops incurred by all one-seat movement patterns $\mathbf{i}$ on trains with stopping patterns $\mathbf{j}$ for all $\mathbf{i} \in \mathcal{I}$ and all $\mathbf{j} \in \mathcal{J}$. In particular examples it is more convenient to use a generic notation and simply list the relevant movements and stopping patterns. The decision variables are:
• $x_{\mathbf{i},\mathbf{j}}$ = the total number of one-seat movements with pattern $\mathbf{i}$ on a train with stopping pattern $\mathbf{j}$, where $\mathbf{i} \subseteq \mathbf{j}$
• $y_{\mathbf{j}}$ = the number of trains with stopping pattern $\mathbf{j}$.

Note that the variables $x_{\mathbf{i},\mathbf{j}}$ directly determine the values of $\mathbf{n} = \{n_{\mathbf{i}}\}$ because $n_{\mathbf{i}} = \sum_{\mathbf{j}} x_{\mathbf{i},\mathbf{j}}$. The problem data is:

• $f_{\mathbf{j}}$ = the capacity of the trains with stopping pattern $\mathbf{j}$
• $c_{\mathbf{i},\mathbf{j}}$ = the number of unnecessary stops incurred by a one-seat movement pattern $\mathbf{i}$ on a train using stopping pattern $\mathbf{j}$
• $\bar{y}$ = the maximum number of trains allowed.

We could set $\bar{y} = \lceil n/f \rceil + r$, where $n = \max_{1 \le k \le m-1} t_k$ and $f = \min_{\mathbf{j}} f_{\mathbf{j}}$ over all $\mathbf{j}$, and $r$ is an arbitrary natural number. The objective is to minimise the number of unnecessary stops:

$$\sum_{\mathbf{i},\mathbf{j}} c_{\mathbf{i},\mathbf{j}}\, x_{\mathbf{i},\mathbf{j}} \tag{7}$$

subject to:

$$\sum_{\mathbf{i},\mathbf{j}} d_{\mathbf{i},r,s}\, x_{\mathbf{i},\mathbf{j}} \ge d_{r,s} \quad \forall\, r, s \text{ with } r < s \tag{8}$$

$$\sum_{\mathbf{i}} x_{\mathbf{i},\mathbf{j}} - f_{\mathbf{j}}\, y_{\mathbf{j}} \le 0 \quad \forall\, \mathbf{j} \tag{9}$$

$$\sum_{\mathbf{j}} y_{\mathbf{j}} \le \bar{y} \tag{10}$$

for all $x_{\mathbf{i},\mathbf{j}}, y_{\mathbf{j}} \in \mathbb{Z}_+$. Because the number of one-seat movement patterns grows exponentially, when $m$ is large it may be necessary to employ techniques such as column generation to solve the integer program efficiently.

2.5.1 Example 3

We apply our integer program to the OD matrix:

$$D = \begin{bmatrix} 0 & 22 & 177 & 35 & 11 & 79 & 228 \\ 0 & 0 & 9 & 3 & 2 & 8 & 22 \\ 0 & 0 & 0 & 11 & 4 & 83 & 161 \\ 0 & 0 & 0 & 0 & 3 & 14 & 39 \\ 0 & 0 & 0 & 0 & 0 & 21 & 47 \\ 0 & 0 & 0 & 0 & 0 & 0 & 85 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

Before solving the model, we calculate that 702 seats are required. If the available trains consist of single rail cars, each with a capacity of $f_{\mathbf{j}} = 60$ seats, then at least $\bar{y} = 12$ trains are necessary to satisfy demand. To form the set $\mathcal{J}$ we consider the total usage $p_k + q_k$ for each $A_k$. Comparing these values indicates that $A_1$, $A_3$ and $A_7$ are busy, with $A_6$ less so, and remaining stations not at all. Thus we could choose $\mathcal{J} = \{(1, 7), (1, 3, 7), (1, 3, 6, 7), (1, 2, 3, 4, 5, 6, 7)\}$.

Once the integer program is solved, we obtain the values of the decision variables. The number of trains required with stopping pattern $j$ for $j = 1, 2, 3, 4$ is $y = [4, 2, 3, 3]$.

The maximal OD matrix $F$ associated with the optimal value of $\mathbf{n}$ (which is calculated from the values of $x_{\mathbf{i},\mathbf{j}}$) is given by:

$$F = \begin{bmatrix} 0 & 128 & 221 & 35 & 11 & 79 & 240 \\ 0 & 0 & 93 & 3 & 2 & 8 & 22 \\ 0 & 0 & 0 & 66 & 4 & 83 & 161 \\ 0 & 0 & 0 & 0 & 51 & 14 & 39 \\ 0 & 0 & 0 & 0 & 0 & 21 & 47 \\ 0 & 0 & 0 & 0 & 0 & 0 & 205 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

3 Estimating demand from traffic counts

In practice it may be necessary to estimate OD matrices from appropriate traffic flow and turnstile data. We consider two methods proposed in a celebrated paper by Van Zuylen & Willumsen (1980). We describe the first method in some detail and the second method more briefly. Both methods rely on the use of traffic counts. Traffic counts are routinely collected in transport research and are relatively inexpensive, although in larger networks a subset of count locations may be considered. Methods to choose locations are described in Ehlert et al (2006). Van Zuylen & Willumsen (1980) were primarily concerned with vehicle counts and hence used segment flows as the basis for their analysis. Although we have adapted their arguments to consider passenger counts on each segment, we note that turnstile counts may provide useful additional or alternative information for passenger rail problems. In modern metropolitan rail networks it is possible that OD counts could be collected routinely on ticket validation machines.
3.1 Fundamental requirements

We consider the linear network described earlier. Suppose a set of segment flows has been found by counting passengers on certain segments within the network. The flow for each OD pair $A_i A_j$ will use all intervening segments. In general we write $\pi^k_{i,j}$ for the proportion of trips from $A_i$ to $A_j$ that use segment $J_k$. In our study this simply means that $\pi^k_{i,j} = 1$ if $k \in [i, j)$ and $\pi^k_{i,j} = 0$ otherwise. In general $0 \le \pi^k_{i,j} \le 1$. If $n_k$ denotes the count on link $J_k$, then the fundamental flow equation is:

$$n_k = \sum_{i<j} \pi^k_{i,j}\, d_{i,j} = \sum_{(i,j)\,:\, i \le k < j} d_{i,j} \tag{11}$$

The problem of estimating the OD matrix is one of finding the $m(m-1)/2$ unknowns $d_{i,j}$ for $i < j$. Since the number of segment flows in the network will be at most $m-1$, and since there are $m(m-1)/2$ unknowns, it follows that counting segment flows alone will be insufficient to define a unique solution. Thus we could also consider turnstile counts. We note that the continuity equations for station $A_k$ can be written as:

$$n_k = n_{k-1} + p_k - q_k \tag{12}$$

and so $n_k$ is determined by $n_{k-1}$, $p_k$ and $q_k$. Thus if we use some turnstile counts as well as flow counts then there is the possibility of redundant information. In practice, where counting errors may occur, it could be that the counts will be inconsistent.
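The flow equation (11) and the continuity check (12) can be illustrated with a short sketch. The function names `segment_counts` and `continuity_residuals` are hypothetical, and the convention $n_0 = 0$ (nobody on board before the first station) is an assumption made so that equation (12) can be applied at $A_1$.

```python
def segment_counts(D):
    """Equation (11): n_k is the sum of d_{i,j} over all pairs with i <= k < j."""
    m = len(D)
    return [
        sum(D[i][j] for i in range(k + 1) for j in range(k + 1, m))
        for k in range(m - 1)  # 0-based k corresponds to segment J_{k+1}
    ]


def continuity_residuals(counts, boardings, alightings):
    """Residuals n_k - (n_{k-1} + p_k - q_k); all zero when the counts are consistent."""
    residuals = []
    previous = 0  # assumed n_0 = 0
    for n_k, p_k, q_k in zip(counts, boardings, alightings):
        residuals.append(n_k - (previous + p_k - q_k))
        previous = n_k
    return residuals


D = [[0, 4, 3, 4], [0, 0, 2, 3], [0, 0, 0, 3], [0, 0, 0, 0]]
m = len(D)
print(segment_counts(D))                 # [11, 12, 10]
print(m * (m - 1) // 2, "unknowns,", m - 1, "segment counts")

# Turnstile counts for stations A_1..A_3 from Example 1, with one miscounted segment flow
print(continuity_residuals([11, 13, 10], [11, 5, 3], [0, 4, 5]))  # [0, 1, -1]
```

Non-zero residuals flag exactly the kind of inconsistency between segment counts and turnstile counts described above.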
3.2 Interdependence and consistency

If we assume an intrinsic rate of flow $\rho_{i,j}$ from $A_i$ to $A_j$ that depends only on $(i, j)$, then it would be natural to describe the corresponding passenger counts by a set of independent Poisson random variables $R_{i,j}$ with:

$$P[R_{i,j} = r_{i,j}] = \frac{e^{-\rho_{i,j}}\, \rho_{i,j}^{\,r_{i,j}}}{r_{i,j}!} \tag{13}$$

for each non-negative integer $r_{i,j}$. In this case the number of passengers boarding at $A_i$ is defined by $P_i = \sum_{j=i+1}^{m} R_{i,j}$ for $1 \le i < m$ with $P_m = 0$, and the number of passengers alighting at station $A_j$ is defined by $Q_j = \sum_{i=1}^{j-1} R_{i,j}$ for $1 < j \le m$ with $Q_1 = 0$. The random variables $P_1, P_2, \ldots, P_{m-1}$ are independent Poisson variables with the probability of an observed count $p_i$ for $P_i$ given by:

$$P[P_i = p_i] = \frac{e^{-\sigma_i}\, \sigma_i^{\,p_i}}{p_i!} \tag{14}$$

for $1 \le i < m$ and each non-negative integer $p_i$, where we have written $\sigma_i = \sum_{j=i+1}^{m} \rho_{i,j}$. The random variables $Q_2, Q_3, \ldots, Q_m$ are also independent Poisson variables with the probability of an observed count $q_j$ for $Q_j$ given by:

$$P[Q_j = q_j] = \frac{e^{-\tau_j}\, \tau_j^{\,q_j}}{q_j!} \tag{15}$$

for $1 < j \le m$ and each non-negative integer $q_j$, where we have written $\tau_j = \sum_{i=1}^{j-1} \rho_{i,j}$. The likelihood of observing a boarding vector $\mathbf{p} = (p_1, p_2, \ldots, p_{m-1}, 0)$ is given by:

$$L(\mathbf{p}) = \prod_{i=1}^{m-1} \frac{e^{-\sigma_i}\, \sigma_i^{\,p_i}}{p_i!} \tag{16}$$

The maximum likelihood estimate for $\boldsymbol{\sigma} = (\sigma_1, \sigma_2, \ldots, \sigma_{m-1}, 0)$ given that the total flow is $\rho$ is found by forming the Lagrangian function:

$$\mathcal{L}_{\mathbf{p}} = \sum_{i=1}^{m-1} \left[ -\sigma_i + p_i \log_e \sigma_i - \log_e p_i! \right] + \lambda \left( \sum_{i=1}^{m-1} \sigma_i - \rho \right) \tag{17}$$

and setting the partial derivatives equal to zero. The solution is:

$$\boldsymbol{\sigma} = \frac{\rho\, \mathbf{p}}{\sum_{i=1}^{m-1} p_i} \tag{18}$$

Similar arguments show that the maximum likelihood estimate for $\boldsymbol{\tau} = (0, \tau_2, \tau_3, \ldots, \tau_m)$ when the total flow is $\rho$ is given by:

$$\boldsymbol{\tau} = \frac{\rho\, \mathbf{q}}{\sum_{j=2}^{m} q_j} \tag{19}$$

3.3 Statement of Problem 2

We can now give a precise statement of our second problem.

Problem 2: Given an observed collection of passenger flows from $A_i$ to $A_j$ for all $(i, j)$ with $i < j$ on a linear rail network, find the best estimate of the OD matrix $D$.

Of course we have not defined what we mean by the best estimate, but we now describe two fundamental solution procedures proposed in the paper by Van Zuylen & Willumsen (1980). These methods are a cornerstone for demand estimation in modern traffic theory.

3.4 Estimating the OD matrix by minimising information

We wish to choose an OD matrix that adds as little information as possible to the information contained in equation (11). If the observations are counts on a particular link in the network, we define state $(i, j)$ with $i < j$ as the state in which the observed passengers travel from $A_i$ to $A_j$. The information contained in a set of $n$ observations where state $(i, j)$ has been observed $n_{i,j}$ times is defined by:

$$I = -\log_e \left[ n! \prod_{i<j} \frac{q_{i,j}^{\,n_{i,j}}}{n_{i,j}!} \right] \tag{20}$$

where $n = \sum_{i<j} n_{i,j}$ and where $q_{i,j}$ is the a priori probability of observing state $(i, j)$. Indeed the expression in the square brackets is simply a standard multinomial probability. Although the $n_{i,j}$ could be regarded as random variables, we follow Van Zuylen & Willumsen (1980) and use a lowercase notation. If we use a simplified Stirling approximation:

$$\log_e X! \approx X \log_e X - X$$

then it follows that:

$$I \approx \sum_{i<j} n_{i,j} \log_e \frac{n_{i,j}}{n\, q_{i,j}} \tag{21}$$

Note that the count for state $(i, j)$ on segment $J_k$ denoted by $n^k_{i,j}$ satisfies the equation:

$$n^k_{i,j} = \pi^k_{i,j}\, d_{i,j} = \begin{cases} d_{i,j} & \text{if } i \le k < j \\ 0 & \text{otherwise} \end{cases} \tag{22}$$

and hence the a priori probability of observing state $(i, j)$ on segment $J_k$ could be given by an estimate of the form:

$$q^k_{i,j} = \frac{\pi^k_{i,j}\, \delta_{i,j}}{\nu_k} \tag{23}$$

where $\delta_{i,j}$ is any a priori estimate of $d_{i,j}$ and $\nu_k = \sum_{i<j} \pi^k_{i,j}\, \delta_{i,j}$. By substituting equations (22) and (23) into (21), we can see that the information contained for a total count $n_k = \sum_{i,j} n^k_{i,j}$ on segment $J_k$ is given approximately by:

$$I_k \approx \sum_{i<j} \pi^k_{i,j}\, d_{i,j} \log_e \frac{d_{i,j}\, \nu_k}{n_k\, \delta_{i,j}} \tag{24}$$

Summing over all segments for which counts have been obtained gives the approximate total information as:

$$I \approx \sum_k \sum_{i<j} \pi^k_{i,j}\, d_{i,j} \log_e \frac{d_{i,j}\, \nu_k}{n_k\, \delta_{i,j}} \tag{25}$$

over all observed flows. To find an OD matrix that minimises the additional inherent information we find the value for $D$ that minimises equation (25), subject to the flow constraints of equation (11). The solution is found by forming a Lagrangian function:

$$\mathcal{L} = \sum_k \sum_{i<j} \pi^k_{i,j}\, d_{i,j} \log_e \frac{d_{i,j}\, \nu_k}{n_k\, \delta_{i,j}} + \sum_k \lambda_k \left( \sum_{i<j} \pi^k_{i,j}\, d_{i,j} - n_k \right) \tag{26}$$

where the $\lambda_k$ are Lagrange multipliers, and by choosing the value for $D$ that sets each of the partial derivatives:

$$\frac{\partial \mathcal{L}}{\partial d_{r,s}} = \sum_k \pi^k_{r,s} \left[ \log_e \frac{d_{r,s}\, \nu_k}{n_k\, \delta_{r,s}} + 1 + \lambda_k \right] \tag{27}$$

equal to zero. By writing $\pi_{r,s} = \sum_k \pi^k_{r,s}$ and defining:

$$X_k = \frac{n_k}{\nu_k}\, e^{-(1 + \lambda_k)} \tag{28}$$

it can be shown that the optimal choice is given by:

$$d_{r,s} = \delta_{r,s} \prod_k X_k^{\,\pi^k_{r,s} / \pi_{r,s}} \tag{29}$$

for $r < s$. The quantities $X_k$ given by equation (28) can be calculated once the $m-1$ unknowns $\lambda_k$ have been found. In a linear network, the $m-1$ equations given by equation (11) take the form:

$$d_{k,k+1} = n_k - \sum_{i=1}^{k-1} \sum_{j=k+1}^{m} d_{i,j} - \sum_{j=k+2}^{m} d_{k,j} \tag{30}$$

for $k = 1, \ldots, m-1$. From equation (29) we observe that $d_{r,s} = \delta_{r,s} \prod_{k=r}^{s-1} X_k^{1/(s-r)}$ and in particular $d_{k,k+1} = \delta_{k,k+1} X_k$. A simple iterative scheme in which initial values of $X_1, \ldots, X_{m-1}$ are arbitrarily chosen could be used to solve this non-linear system of equations.

3.5 Estimating the OD matrix by maximising entropy

The beginning hypothesis is that the most likely OD matrix is the one with the greatest number of associated micro-states. The number of ways of selecting an $m \times m$ matrix $D$ with a total number $N$ of passengers is:

$$W(D) = \frac{N!}{\prod_{i<j} d_{i,j}!} \tag{31}$$

For convenience we maximise $S = \log_e W$ subject to the traffic count constraint of equation (11). Using the simplified Stirling approximation once again and assuming that $N$ is constant means that we can simply seek to maximise:

$$S^{\#} = -\sum_{i<j} \left( d_{i,j} \log_e d_{i,j} - d_{i,j} \right) \tag{32}$$

subject to equation (11). Thus we form a Lagrangian:

$$\mathcal{L}^{\#} = -\sum_{i<j} \left( d_{i,j} \log_e d_{i,j} - d_{i,j} \right) + \sum_k \lambda_k \left( n_k - \sum_{i<j} \pi^k_{i,j}\, d_{i,j} \right) \tag{33}$$

and set the partial derivatives equal to zero. If we define $X_k = e^{-\lambda_k}$, then the solution is given by:

$$d_{r,s} = \prod_k X_k^{\,\pi^k_{r,s}} \tag{34}$$

for $r < s$.
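On a linear network the product form of equation (34) reduces to $d_{r,s} = X_r X_{r+1} \cdots X_{s-1}$, so the estimate can be computed by fitting the $m-1$ factors $X_k$ to the observed segment counts. The sketch below is one possible way to do this and is not taken from the paper: it rescales each $X_k$ in turn by the ratio of the observed count to the current modelled flow, a multiplicative update chosen here for illustration. The function name `max_entropy_od`, the starting values, the sweep limit and the tolerance are all assumptions, and the counts are assumed to be strictly positive.

```python
def max_entropy_od(counts, sweeps=500, tol=1e-9):
    """Fit d_{r,s} = X_r * ... * X_{s-1} to segment counts n_1..n_{m-1}.

    Returns (D, X) with D an m x m list of lists of estimated demands.
    Stations and segments are 0-based inside the function.
    """
    m = len(counts) + 1
    X = [1.0] * (m - 1)  # arbitrary positive starting values

    def demand(r, s):
        value = 1.0
        for k in range(r, s):
            value *= X[k]
        return value

    def flow(k):
        # modelled count on segment k: all pairs (i, j) with i <= k < j
        return sum(demand(i, j) for i in range(k + 1) for j in range(k + 1, m))

    for _ in range(sweeps):
        worst = 0.0
        for k in range(m - 1):
            f = flow(k)
            worst = max(worst, abs(f - counts[k]))
            X[k] *= counts[k] / f  # flow(k) is linear in X[k], so this matches n_k exactly
        if worst < tol:
            break

    D = [[demand(i, j) if i < j else 0.0 for j in range(m)] for i in range(m)]
    return D, X


D_est, X = max_entropy_od([11, 12, 10])
for row in D_est:
    print([round(value, 3) for value in row])
```

Each update makes one segment constraint hold exactly while slightly disturbing the others, so the sweep is repeated until all modelled flows agree with the counts to within the tolerance. With no a priori matrix the estimate depends only on the segment counts, which is why it will generally differ from the true OD matrix that generated them.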
3.6 Application to train services

There are two methods of traffic count used on
passenger services. One commonly used counting
passenger services. One commonly used counting
procedure is turnstile counting. Thus we would
effectively count the number of boarding and
alighting passengers at selected major stations.
Estimating an OD matrix based on boarding and
alighting counts has been investigated in the
literature (most recently by Li & Cassidy (2007), who
provided a comprehensive review of past work).
These counts could be used as a basis for either
the information minimisation estimation or the
entropy maximisation estimation. In both cases, the
application of this counting procedure would require
only a minor change to the constraints, which we
expect would lead to a different but similar
solution. We intend to follow up this proposal
in the future.
The other popular method of passenger counting
for trains is implemented by an observer boarding
the train at a selected station $A_k$, counting the
passengers while the train traverses segment $J_k$
and then alighting at station $A_{k+1}$. This is precisely
equivalent to the method described by Van Zuylen
& Willumsen (1980) and so their estimations could be
implemented directly. To obtain a priori estimates $\delta_{i,j}$
for the demands $d_{i,j}$ where no previous OD data has
been collected, it may be possible to use population
and workforce data. We may consider this problem
in the future.
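
The two counting procedures can be related to an OD matrix by a short calculation. The Python sketch below assumes a linear network with stations indexed from 0, travel in one direction only (so the demand matrix is upper triangular), and an invented demand matrix used purely for illustration. The function names `segment_counts` and `boarding_alighting_counts` are hypothetical.

```python
def segment_counts(d):
    """Passengers on each segment k (between stations k and k + 1).

    This is what an on-board observer would count on each segment.
    """
    m = len(d)
    return [
        sum(d[r][s] for r in range(k + 1) for s in range(k + 1, m))
        for k in range(m - 1)
    ]


def boarding_alighting_counts(d):
    """Boardings and alightings at each station, as seen by turnstiles."""
    m = len(d)
    boardings = [sum(d[i][j] for j in range(i + 1, m)) for i in range(m)]
    alightings = [sum(d[i][j] for i in range(j)) for j in range(m)]
    return boardings, alightings


# Invented demands for a line with four stations (illustration only).
demand = [
    [0, 10, 5, 3],
    [0, 0, 7, 2],
    [0, 0, 0, 4],
    [0, 0, 0, 0],
]

print(segment_counts(demand))             # [18, 17, 9]
print(boarding_alighting_counts(demand))  # ([18, 9, 4, 0], [0, 10, 12, 9])
```

In this example the count on each segment equals the cumulative boardings minus the cumulative alightings up to that point on the line, so on a one-directional line the turnstile counts determine the segment counts, while neither set of counts alone determines the OD matrix.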

Conclusions and Future Work

The train service design problem could be used
in senior undergraduate courses to demonstrate
mathematical modelling of industrial problems,
and to motivate the study of counting, ordering,
allocating and integer programming. These are
standard mathematical topics in most undergraduate
mathematics programs. However, to understand the
full implications of solving real world problems, it
is necessary to consider everything that is required
for a solution. In this case, it is necessary to estimate
the OD matrix by solving Problem 2 before we
can assume that the demand is known. Only then
can we solve Problem 1. Thus, the first problem
is meaningless as an industrial application unless
students understand that it also entails consideration
of the second problem.
The AMSI study showed that OD matrices can
be used as a basis for the design of effective train
services. We note that the methods proposed by
Van Zuylen & Willumsen (1980) require knowledge
of network flows from $A_i$ to $A_j$ for each $i < j$. Future
work would require consideration of the best ways
to gather this information.
We used the elementary principles of counting,
ordering and collating, and the representation of
maximal demands as a unique combination of
movement patterns, to solve the problem of designing
train services when there is no restriction on the
available stopping patterns. Our analysis of Example
1 suggests that we may be able to generalise these
methods to find analytic solutions when the stopping
patterns and hence the elementary movement
patterns are restricted. This may ultimately obviate
the need for an integer programming solution of the
type used in this paper.

References

Borndörfer, R., Grötschel, M. & Pfetsch, M. E. 2004,
Models for line planning in public transport,
9th International Conference on Computer-Aided
Scheduling of Public Transport, San Diego, California.
Cascetta, E. & Postorino, M. N. 2001, Fixed point
approaches to the estimation of O/D matrices using
traffic counts on congested networks, Transportation
Science, Vol. 35, No. 2, pp. 134-147.
Ehlert, A., Bell, M. G. H. & Grosso, S. 2006, The
optimisation of traffic count locations in road
networks, Transportation Research B, Vol. 40, pp.
460-479.
Li, Y. & Cassidy, M. J. 2007, A generalized and
efficient algorithm for estimating transit route ODs
from passenger counts, Transportation Research B,
Vol. 41, pp. 114-125.
Spanier, J. 1977, Education in applied mathematics:
The Claremont mathematics clinic, SIAM Review,
Vol. 19, No. 3, pp. 536-549.
Van Zuylen, H. J. & Willumsen, L. G. 1980, The most
likely trip matrix estimated from traffic counts,
Transportation Research B, Vol. 14B, pp. 281-293.

Amie Albrecht
Amie Albrecht obtained her PhD in Mathematics from the University of
South Australia in 2009. She is currently a Research Associate in the Centre
for Industrial and Applied Mathematics, and a member of the Institute for
Sustainable Systems and Technologies at the University of South Australia.
Amie works on planning and scheduling problems in the rail industry, some
of which are funded by the Cooperative Research Centre for Rail Innovation.
Her research area is Operations Research, and her interests include heuristic
and exact solution techniques for discrete optimisation problems.

Phil Howlett
Phil Howlett is Professor of Industrial and Applied Mathematics in the Centre
for Industrial and Applied Mathematics, and a member of the Institute for
Sustainable Systems and Technologies at the University of South Australia. He
is the Leader of the Scheduling and Control Group, and has worked extensively
on optimal driving strategies for trains and solar-powered racing cars, and on
matters relating to the efficiency of railway operations. Phil has wide-ranging
interests in other areas of mathematics, including recent work on management
of water supply systems, singular perturbations of linear operators on Banach
space, gradient approximation, modelling of realistic systems and estimation
of random signals. He is currently the Chair of ANZIAM (Australia and New
Zealand Industrial and Applied Mathematics), and a member of the Council
and Steering Committee of the Australian Mathematical Society.

Dale Coleman
Dale Coleman is Managing Director of TTG Transportation Technology Pty Ltd,
which provides specialist software and engineering products and services to the
rail industry. Prior to his current role, Dale was Global Head of WorleyParsons
Rail following the acquisition of his consulting business by WorleyParsons
in 2006. Since forming TMG in the mid 1980s, he has provided advice on all
aspects of the planning, management, operation, maintenance and safety of rail
infrastructure and rolling stock to owners and operators in Australia and Asia.
Dale has been active in the promotion of railway research and development
and railway technology. Prior to his involvement in rail, Dale had a 13-year
career in the mining and heavy engineering industry. Dale has a Bachelor of
Engineering (Civil) from the University of Sydney.

Australasian Journal of Engineering Education

Vol 15 No 2
