
Journal of Theoretical Biology 243 (2006) 407–417

A stochastic model for the sizes of detectable metastases


Leonid Hanin (a,*), Jason Rose (a,b), Marco Zaider (c)
(a) Department of Mathematics, Idaho State University, Pocatello, ID 83209-8085, USA
(b) Department of Mathematics, College of Southern Idaho, Twin Falls, ID 83303-1238, USA
(c) Department of Medical Physics, Memorial Sloan-Kettering Cancer Center, 1275 York Avenue, New York, NY 10021, USA
Received 18 February 2006; received in revised form 5 July 2006; accepted 10 July 2006
Available online 15 July 2006
Abstract
A stochastic, entirely mechanistic model of metastatic progression of cancer is developed. Based on this model, the joint conditional
distribution of the ordered sizes of detectable metastases given their number, n, is computed. It is shown that this distribution coincides
with the joint distribution of order statistics for a random sample of size n derived from some probability distribution, and a formula for
the latter is obtained. This formula is specialized for the case of exponentially growing primary and secondary tumors and exponentially
distributed metastasis promotion times, and identifiability of model parameters is ascertained. These results allow for estimation of the
natural history of cancer. As an example, it is estimated for a breast cancer patient with 31 bone metastases of known sizes. The proposed
model for the sizes of detectable metastases provided an excellent fit to these data.
© 2006 Elsevier Ltd. All rights reserved.
Keywords: Cancer natural history; Metastasis; Model identifiability; Poisson process; Primary tumor
0. Introduction
In spite of significant progress in detection and treatment
of primary cancer, its metastatic spread continues to pose a
formidable challenge to the improvement of cancer-specific
survival of patients afflicted with the disease. The greatest
unknown faced by an oncologist who designs a curative
treatment plan for a cancer patient is the possibility of the
presence of occult (undetectable) metastases at the start of
treatment for the primary tumor. To increase the chances
of long-term survival of a patient with such metastases,
they should be treated concurrently with (or shortly after)
the treatment of the primary disease. The advent of modern
methods of metastasis ablation such as conformal stereotactic
hypofractionated radiosurgery (Schell et al., 1995)
and radioimmunotherapy (Bernhardt et al., 2001; Goddu
et al., 1994; O'Donoghue, 2000) makes this systemic
approach to cancer treatment feasible.
However, a curative therapy developed for an individual
patient can only be as good as the information about the
natural history of his/her disease. Unfortunately, many
important parameters descriptive of the course of the
disease (such as the age at onset, growth rates for primary
tumor and metastases, intensity of metastasizing, and
promotion time, that is, the time between shedding of a
metastasis by the primary tumor and inception of this
metastasis in the host organ or tissue) are unobservable. To
estimate them, one has to develop a comprehensive
mathematical model of the disease natural history and fit
it to the individual data observable in a given clinical
setting. These observations include clinical characteristics
of the disease that become available at the time of primary
diagnosis (such as age, tumor size, stage, localization,
histological grade, and various biochemical markers at
detection of the primary tumor) as well as variables
descriptive of the metastatic progression of the disease
(most notably, the number and sizes of detectable
metastases). A detailed mechanistic model of individual
cancer natural history allows one to estimate the distributional
characteristics of the number and sizes of metastases
that remain occult at the time of diagnosis or the start of
treatment. With this information at hand, an oncologist
can design a therapeutic intervention plan (including mode,
dosage, and time course of treatment) that fits a given
patient best and maximizes the probability of cure or the
residual lifetime.

* Corresponding author. Tel.: +1 208 282 3293; fax: +1 208 282 2636.
E-mail addresses: hanin@isu.edu (L. Hanin), jrose@csi.edu (J. Rose), zaiderm@mskcc.org (M. Zaider).
www.elsevier.com/locate/yjtbi
0022-5193/$ - see front matter © 2006 Elsevier Ltd. All rights reserved.
doi:10.1016/j.jtbi.2006.07.005

In the present paper, which is predominantly methodo-
logical, we develop a mathematical model of metastasis
formation and growth in a given host site. These processes
are very complex, heterogeneous, and selective (Barbour
and Gotley, 2003; Chambers et al., 1995; Evans, 1991;
Fidler, 1990, 1991, 1997, 2003). To form a micrometastasis,
a tumor cell has to separate itself from the primary tumor,
penetrate a blood or lymph vessel, traverse the circulation
system, evade attacks of the immune system, extravasate,
invade a host site, proliferate, and induce angiogenesis. As
a result, only a small fraction of cells shed by the primary
tumor gives rise to viable metastases.
The formation of metastases involves substantial variability
and random fluctuations in the characteristics of
metastatic cells and the host microenvironment. Therefore,
it can be best described using stochastic models. This
approach to modeling metastatic progression of cancer was
applied, mostly in the form of Monte Carlo simulation
studies, in Bernhardt et al. (2001) and Kendal (2006). A
statistical estimation of the empirical distribution of the
sizes of metastases resulting from autopsy studies was done
by Kendal (2001). A semi-stochastic description of the
kinetics of the number and sizes of metastases based on the
von Foerster equation was obtained by Iwata et al. (2000).
An attempt at developing a mechanistic, biologically
motivated, stochastic model of the metastasis was made
in Bartoszyński et al. (2001), where the probability that at
the time of cancer detection the primary tumor has not yet
metastasized was computed. One of the basic ideas
entertained in Bartoszyński et al. (2001) was to relate the
rate of metastases formation to the size of the primary
tumor within the framework of quantal response models
(Puri, 1967, 1971; Puri and Senturia, 1972). This approach
will be followed in the present work as well.
In order to assess the extent of the disease, a patient
diagnosed with primary cancer is given an imaging
procedure which, after reading of the images, allows the
number of detected metastases and their sizes (or volumes)
to be determined. Therefore, the first step in developing a
comprehensive model of the natural history of metastasis
consists in deriving a formula for the conditional joint
distribution of the sizes of detectable metastases given their
number. Such a formula is a key to statistical estimation of
some of the unobservable parameters of the natural history
of the disease. This leads to an important question as to
what characteristics of occult metastases (such as the
distributions of their number, sizes, and total volume) can
be estimated on the basis of data available for detectable
metastases.
When dealing with the sizes of metastases observed at a
given time t one faces a considerable mathematical and
methodological challenge. The problem is that the sizes of
metastases cannot in general be thought of as resulting
from a sequence of independently repeated trials, and thus
they do not form a random sample from a probability
distribution. The reasons for this are two-fold. First,
metastases that were shed later tend to have smaller sizes at
time t. Second, the rate of metastasis shedding may depend
on the primary tumor size, an increase in which typically causes
acceleration of the process of metastasis formation.
Therefore, although one can always construct a frequency
distribution (histogram) of the sizes of detectable metas-
tases at any time t post-diagnosis, it is generally not true
that it represents an empirical counterpart of the distribu-
tion of the size of detectable metastasis at time t. In fact,
the latter random variable is not well-defined! Labeling (or
numbering) metastases represents yet another concomitant
problem. The most natural way to label detectable
metastases is through ordering their sizes taken at a given
time from the smallest to the largest (or vice versa).
However, with such a labeling the sizes of metastases are
represented by random variables that are neither indepen-
dent nor identically distributed.
The most important findings of this work consist of
showing that under certain biologically plausible assump-
tions the joint distribution of the sizes of detectable
metastases conditional on their number coincides with that
of the vector of order statistics derived from some
probability distribution and, furthermore, obtaining a
formula for the latter. Although the mechanism of
sampling from this distribution is elusive, for many
purposes, including parameter estimation in the maximum
likelihood setting, this distribution may serve as a
surrogate of the distribution of the size of detectable
metastasis. In particular, fitting this distribution to the
observed sizes of metastases makes it possible to estimate
many parameters descriptive of the natural history of the
disease and gain an insight into the process of metastasis
formation and growth.
The structure of the paper is as follows. In Section 1 we
formulate a model of the natural history of cancer. Section
2 deals with the derivation of a formula for the distribution
underlying the sizes of a given number of detectable
metastases. Specialization of this formula for the case of
exponentially growing tumors and exponentially distribu-
ted promotion times is treated at length in Section 3.
Additionally, in this section we discuss various issues
related to model identifiability and statistical parameter
estimation. Section 4 deals with data analysis based on our
theoretical results from Sections 2 and 3 and estimation of
model parameters. Finally, our conclusions are formulated
in Section 5.
1. The model
The natural history of invasive cancer is commonly
divided into the periods of tumor latency, primary tumor
growth, and metastatic progression. These periods and
relevant model assumptions are described below.
1.1. Tumor latency
The latent period starts with the birth of an individual
and ends with the appearance of the first malignant
clonogenic cell. This event is termed the onset of the
disease.
1.2. Primary tumor growth
The size of the primary tumor (that is, the number of
cells comprising the tumor) at any time $w$ counted from the
onset of the disease will be denoted by $\Phi(w)$. It is assumed
that $\Phi$ is a strictly increasing continuous function such that
$\Phi(0) = 1$. The function $\Phi$ may depend on one or several
parameters that can be deterministic or random. We
denote by $\varphi$ the function inverse to $\Phi$.
1.3. Metastasis formation
Following Bartoszyński et al. (2001), we will assume that
metastasis shedding is governed by a non-homogeneous
Poisson process with intensity $\mu$ proportional to some
power of the current size of the primary tumor. Thus,
$$\mu(w) = \alpha\,\Phi^{\theta}(w), \tag{1}$$
where $\alpha > 0$ and $\theta \ge 0$ are constants, and time $w$ is counted
from the onset time $t$ of the disease. Note that model (1)
with $\theta = 0$ describes stationary metastasis shedding governed
by a homogeneous Poisson process with constant
rate $\alpha$. It is further assumed that metastases shed by the
primary tumor reach a given host site and get established
there independently of each other with some probability $q$.
Therefore (see e.g. Ross, 1997, pp. 257–259), inception of
metastases in the host site is governed by a Poisson process
with the intensity $\nu = q\mu$. Each established metastasis
spends some random time between detachment from the
primary tumor and inception in the host site (which may
include a period of dormancy, see Barbour and Gotley,
2003 and references therein) after which it starts to
proliferate. We assume that these promotion times for
different metastases are independent and identically
distributed with some probability density function (p.d.f.)
$f$ and the corresponding cumulative distribution function
(c.d.f.) $F$. It is well-known (see e.g. Hanin and Yakovlev,
1996) that the resulting delayed Poisson process is again
Poisson with the rate
$$\lambda(w) = \int_0^{w} \nu(s)\, f(w - s)\, ds. \tag{2}$$
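As a numerical illustration of formula (2), the following minimal sketch assumes, purely for illustration, the special case treated later in the paper: exponential primary tumor growth with rate beta and exponentially distributed promotion times with mean rho. The function names and all parameter values are hypothetical and are not part of the original text. The sketch evaluates the integral in (2) by the trapezoidal rule and compares it with the closed form that the integral takes under these assumptions.

```python
import math

def delayed_rate(w, q_alpha, theta, beta, rho, steps=20000):
    """Approximate lambda(w) = int_0^w nu(s) f(w - s) ds by the trapezoidal rule."""
    def integrand(s):
        nu = q_alpha * math.exp(theta * beta * s)   # nu(s) = q * alpha * Phi(s)**theta
        f = math.exp(-(w - s) / rho) / rho          # exponential promotion-time density
        return nu * f

    h = w / steps
    total = 0.5 * (integrand(0.0) + integrand(w))
    for k in range(1, steps):
        total += integrand(k * h)
    return total * h

def delayed_rate_closed_form(w, q_alpha, theta, beta, rho):
    """Closed form of the same integral under the exponential assumptions."""
    growth = math.exp(theta * beta * w)
    decay = math.exp(-w / rho)
    return q_alpha * (growth - decay) / (1.0 + theta * beta * rho)

# Hypothetical values: time in years, rates per year.
numeric = delayed_rate(5.0, q_alpha=0.1, theta=0.5, beta=0.4, rho=1.5)
exact = delayed_rate_closed_form(5.0, q_alpha=0.1, theta=0.5, beta=0.4, rho=1.5)
```

The two values should agree to several digits, which is a quick sanity check on any implementation of the delayed rate.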
1.4. Secondary metastasis
To retain mathematical tractability of the model, we
assume that secondary metastasizing (that is, formation of
"metastasis of metastasis") to the given site both from
other sites and from within is negligible.
1.5. Metastasis growth
After inception in the host site the growth of a metastasis
is irreversible and its size at time $w$ measured from the
inception is equal to $\Psi(w)$, where $\Psi$ is a strictly increasing
differentiable function such that $\Psi(0) = 1$. The function $\Psi$
may depend on some deterministic or random parameters.
The inverse function for $\Psi$ will be denoted by $\psi$.
1.6. Metastasis detection
The volume of a metastasis becomes measurable when
the size of the metastasis reaches some threshold value $m$.
The value of $m$ and the accuracy of volume measurement
are determined by the sensitivity of the imaging technology.
In the case of PET/CT imaging involved in the present
study the threshold volume was $0.5\ \mathrm{cm}^3$, and the accuracy
of volume determination was one pixel, which is approximately
$0.065\ \mathrm{cm}^3$.
1.7. Effects of treatment
After surgical removal of the primary tumor its size is set
to zero. Then, in accordance with formula (2), if the
primary tumor was resected at age $v$ then $\mu(w) = 0$ for
$w > v - t$. Because the rate of secondary metastasizing is
assumed to be negligible, at the time of primary tumor
extirpation the process of new metastasis formation is
stopped. Finally, chemotherapeutic or hormonal treatment
of metastases is assumed to affect them only through the
rate of their growth and the distribution of their promotion
times.
2. Joint distribution of the sizes of detectable metastases
Suppose that at age $u$ a patient was diagnosed with
primary cancer and that the primary tumor size at
diagnosis was $S$. It follows from our assumptions in
Section 1.2 that the age $t$ at the disease onset is given by
$$t = u - \varphi(S). \tag{3}$$
Suppose also that at age $v$, $v \ge u$, the primary tumor was
resected, and that at age $\tau$, $\tau \ge u$, the patient developed $n$
detectable metastases localized in the same host site with
the observed sizes $x_1, x_2, \ldots, x_n$, where
$m \le x_1 < x_2 < \cdots < x_n \le \Psi(\tau - t)$. We intend to compute the joint conditional
p.d.f. of the observed sizes of metastases given that their
number, $N$, is equal to $n$. Note that tumor resection after
time $\tau$ has no bearing on the sizes of metastases measured
at time $\tau$ so that in this case (as well as in the case of an
untreated primary tumor) we can set, for the purpose of
our computation, $v = \tau$. Thus, we will assume without loss
of generality that $0 \le t < u \le v \le \tau$, see Fig. 1.
Let $X_1, X_2, \ldots, X_n$ be the sizes of detectable metastases
at time $\tau$ ordered from the smallest to the largest. Let also
$T = (T_1, T_2, \ldots, T_n)$ be the vector of the corresponding
inception times counted from the onset of the disease. We
have $X_i = \Psi(\tau - t - T_i)$, hence $T_i = \tau - t - \psi(X_i)$, $1 \le i \le n$.
Clearly, these inception times form a decreasing
sequence. Note also that a metastasis with size $X_i$ is
detectable if and only if $0 \le T_i \le \tau - t - \psi(m)$, $1 \le i \le n$.
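In accordance with our assumptions and due to formula
(1) the inception times of metastases follow a Poisson
process with the intensity $\nu(s) = q\alpha\,\Phi^{\theta}(s)$ for $0 \le s \le v - t$
and $\nu(s) = 0$ for $v - t < s \le \tau - t$. Therefore, in view of (2)
$$\lambda(w) = q\alpha \int_0^{\min\{w,\, v-t\}} \Phi^{\theta}(s)\, f(w - s)\, ds, \qquad 0 \le w \le \tau - t. \tag{4}$$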
We are interested in the joint distribution of inception
times on the interval $[0, \tau - t - \psi(m)]$ given that $n$ inception
events occurred on that interval. It follows from a well-known
basic theorem about Poisson processes (see e.g.
Ross, 1997, pp. 264–265) that the p.d.f. of this conditional
distribution has the form
$$p_T(t_1, t_2, \ldots, t_n \mid N = n) = n!\,\omega(t_1) \cdots \omega(t_n) \quad \text{if } \tau - t - \psi(m) \ge t_1 > t_2 > \cdots > t_n \ge 0, \tag{5}$$
and equals 0 otherwise, where
$$\omega(w) = \frac{\lambda(w)}{\int_0^{\tau - t - \psi(m)} \lambda(s)\, ds}, \qquad 0 \le w \le \tau - t - \psi(m). \tag{6}$$
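Substituting (4) into (6) and changing the order of
integration in the denominator of the resulting formula
we represent the function $\omega$ as follows:
$$\omega(w) = \frac{\int_0^{\min\{w,\, v-t\}} \Phi^{\theta}(s)\, f(w - s)\, ds}{\int_0^{\min\{\tau - t - \psi(m),\, v-t\}} \Phi^{\theta}(s)\, F(\tau - t - \psi(m) - s)\, ds}, \qquad 0 \le w \le \tau - t - \psi(m). \tag{7}$$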
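The interchange of the order of integration behind (7) can be written out as follows. This is a sketch of the intermediate step only; the shorthand $L = \tau - t - \psi(m)$ is introduced here for brevity and does not appear in the original text.

$$\int_0^{L} \lambda(w)\, dw = q\alpha \int_0^{L} \int_0^{\min\{w,\, v-t\}} \Phi^{\theta}(s)\, f(w - s)\, ds\, dw = q\alpha \int_0^{\min\{L,\, v-t\}} \Phi^{\theta}(s) \int_s^{L} f(w - s)\, dw\, ds = q\alpha \int_0^{\min\{L,\, v-t\}} \Phi^{\theta}(s)\, F(L - s)\, ds.$$

The factor $q\alpha$ then cancels between the numerator (4) and this denominator, which is why $\omega$ does not depend on it.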
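Also, if the duration of the promotion time is negligible
then formula (7) is reduced to
$$\omega(w) = \frac{\Phi^{\theta}(w)}{\int_0^{\min\{\tau - t - \psi(m),\, v-t\}} \Phi^{\theta}(s)\, ds}, \qquad 0 \le w \le \min\{\tau - t - \psi(m),\, v - t\}. \tag{8}$$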
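Because the random vector $X = (X_1, X_2, \ldots, X_n)$ is related
to the absolutely continuous random vector $T = (T_1, T_2, \ldots, T_n)$
through the transformation $X_i = \Psi(\tau - t - T_i)$, $1 \le i \le n$,
the distribution of $X$ is also absolutely
continuous. To evaluate the conditional p.d.f. of the
random vector $X$ given that $N = n$ at a point
$(x_1, x_2, \ldots, x_n)$, where $m \le x_1 < x_2 < \cdots < x_n \le \Psi(\tau - t)$,
pick a number $\delta > 0$ small enough so that $\delta < m$ and the
intervals $(x_i - \delta, x_i + \delta)$, $1 \le i \le n$, are disjoint. Then according to (5)
$$\begin{aligned}
\Pr\{X_i \in (x_i - \delta, x_i + \delta),\ 1 \le i \le n \mid N = n\}
&= \Pr\{T_i \in (\tau - t - \psi(x_i + \delta),\ \tau - t - \psi(x_i - \delta)),\ 1 \le i \le n \mid N = n\} \\
&= n! \prod_{i=1}^{n} \int_{\tau - t - \psi(x_i + \delta)}^{\tau - t - \psi(x_i - \delta)} \omega(w)\, dw.
\end{aligned} \tag{9}$$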
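Dividing both sides of (9) by $(2\delta)^n$ and taking the limit as
$\delta \to 0$ we find that
$$p_X(x_1, x_2, \ldots, x_n \mid N = n) = n! \prod_{i=1}^{n} \omega(\tau - t - \psi(x_i))\, \psi'(x_i) \tag{10}$$
if $m \le x_1 < x_2 < \cdots < x_n \le \Psi(\tau - t)$, and vanishes elsewhere.
This proves the following result.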
This proves the following result.
Theorem. The sizes $X_1 < X_2 < \cdots < X_n$ of metastases in a
certain host site that are detectable at age $\tau$ are equidistributed,
given their number $n$, with the vector of order
statistics for a random sample of size $n$ drawn from the
distribution with the p.d.f. defined by
$$p(x) = \omega(\tau - t - \psi(x))\, \psi'(x), \qquad m \le x \le \Psi(\tau - t), \tag{11}$$
and $p(x) = 0$ for $x \notin [m, \Psi(\tau - t)]$, where $t$ is given by (3) and
function $\omega$ is specified in (7).
Remark 1. The p.d.f. $p$ given in (11) is independent of the
number $n$ of metastases detected at time $\tau$ and is free of the
parameter $q\alpha$ that characterizes the intensity of metastasis
seeding. The latter parameter, however, is indispensable for
determining the distribution of the number of metastases in
the site in question at any time post-diagnosis. In fact, the
distribution of the number of metastases in a given site that
are detectable at time $\tau$ is Poisson with parameter
$$q\alpha \int_0^{\min\{\tau - t - \psi(m),\, v-t\}} \Phi^{\theta}(s)\, F(\tau - t - \psi(m) - s)\, ds.$$
The same formula with $m = 1$ gives the distribution of the
total number of metastases in the site at time $\tau$.
Remark 2. If the primary tumor size $S$ at presentation is
unknown then the onset time $t$ should be treated as a
random variable. In this case an additional integration in
(11) with respect to the distribution of the onset time is
required. This distribution can be obtained by utilizing one
of the established mechanistic models of tumor latency
such as the two-stage clonal expansion model (also termed
Moolgavkar–Venzon–Knudson model) (Moolgavkar et al.,
1988; Moolgavkar and Knudson, 1981; Moolgavkar and
Luebeck, 1990; Moolgavkar and Venzon, 1979) or
Yakovlev–Polig model (Yakovlev and Polig, 1996). Alternatively,
one can assume that the tumor latency time
follows a distribution from a flexible parametric family
(e.g. of gamma or Weibull distributions). For methodological
approaches to parameter estimation for such models
from population data on cancer incidence, see Hanin et al.
(2006), Luebeck and Moolgavkar (2002), Zorin et al.
(2005).
Fig. 1. Timeline of the natural history of cancer. Marked ages: 0, birth of the individual; $t$, onset of the disease; $u$, detection of the primary tumor; $v$, resection of the primary tumor; $\tau$, measurement of the volumes of metastases.
Remark 3. If the laws of the primary tumor and/or
metastasis growth contain random parameters then for-
mula (11) should be additionally integrated with respect to
their distribution. More generally, if the growth of the
primary tumor is governed by a stochastic process then
formula (11) can be applied to any of its sample paths F
and then integrated with respect to the corresponding
probability measure on the space of sample paths of the
process (which is typically extremely hard to obtain). In the
case where growth of metastases is governed by a stochastic
process this approach is feasible only if sample paths of the
process are all increasing and never cross each other, the
assumptions used in a very essential way in our derivation
of formula (11). The main difficulty that arises for
stochastic processes whose sample paths are not necessarily
increasing or may intersect is that metastases that were
shed earlier may have smaller sizes at the time of detection
t than those shed later or even remain occult at or become
extinct by time t.
Remark 4. Formula (11) with $m = 1$ describes the distribution
underlying the sizes of all (detectable and occult)
metastases. In this case function $\omega$ in (7) takes on a simpler
form
$$\omega(w) = \frac{\int_0^{\min\{w,\, v-t\}} \Phi^{\theta}(s)\, f(w - s)\, ds}{\int_0^{v-t} \Phi^{\theta}(s)\, F(\tau - t - s)\, ds}, \qquad 0 \le w \le \tau - t.$$
In the next section, we will explicate formulas (7) and
(11) in the special case of non-random exponential growth
of both the primary tumor and its metastases combined
with an exponentially distributed metastasis promotion
time. In particular, we will consider the most common
scenario with regard to the primary tumor, namely, when
the latter is surgically removed at the time of diagnosis.
3. The distribution of the sizes of detectable metastases for
exponentially growing tumors
3.1. Theoretical distribution
Suppose that the primary tumor and metastases grow
exponentially with constant rates $\beta$ and $\gamma$, respectively:
$\Phi(w) = e^{\beta w}$ and $\Psi(w) = e^{\gamma w}$. Then $\varphi(y) = \ln y / \beta$ and
$\psi(y) = \ln y / \gamma$. Also, we assume that metastasis promotion
times follow an exponential distribution with the expected
value $\rho > 0$. The condition $t \ge 0$ implies in view of (3) that
$$\beta \ge \frac{\ln S}{u}. \tag{12}$$
Denote by $M = \Psi(\tau - t)$ the maximum possible size of a
metastasis. Then it follows from (3) that
$$M = S^{\gamma/\beta}\, e^{\gamma(\tau - u)}. \tag{13}$$
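Formulas (3), (12) and (13) can be checked numerically with a short sketch like the one below. The function names are hypothetical, the growth rates are made-up values chosen only so that condition (12) holds, and sizes are expressed in cells so that a tumor starts from size 1.

```python
import math

def onset_age(u, S, beta):
    """Age t at disease onset from formula (3), with phi(S) = ln(S) / beta."""
    return u - math.log(S) / beta

def max_metastasis_size(S, beta, gamma, tau, u):
    """Maximum possible metastasis size M from formula (13), in cells."""
    return S ** (gamma / beta) * math.exp(gamma * (tau - u))

# Hypothetical inputs: ages in years, sizes in cells, rates per year.
u, tau, S = 74.0, 82.0, 1.03e10
beta, gamma = 0.5, 0.3

if beta < math.log(S) / u:
    raise ValueError("condition (12) fails: onset would precede birth")

t = onset_age(u, S, beta)
M = max_metastasis_size(S, beta, gamma, tau, u)
```

By construction, `M` should coincide with `exp(gamma * (tau - t))`, the size reached by a metastasis that started growing at the onset of the disease.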
3.1.1. Full model
A straightforward computation based on formula (7)
yields the following expression for the p.d.f. (11) underlying
the conditional joint distribution of the sizes of
detectable metastases given their number:
$$p(x) = \begin{cases}
C x^{-1} \left[ (M/A)^{a} - (A/M)^{b} \right] (x/A)^{b} & \text{if } m \le x < A, \\
C x^{-1} \left[ (M/x)^{a} - (x/M)^{b} \right] & \text{if } A \le x \le M,
\end{cases} \tag{14}$$
where $M$ is given by (13),
$$a = \frac{\theta\beta}{\gamma}, \qquad b = \frac{1}{\gamma\rho}, \qquad A = \max\{m,\, e^{\gamma(\tau - v)}\}, \tag{15}$$
and $C$, defined through
$$C^{-1} = b^{-1} \left[ (M/A)^{a} - (A/M)^{b} \right] \left[ 1 - (m/A)^{b} \right] + a^{-1} \left[ (M/A)^{a} - 1 \right] - b^{-1} \left[ 1 - (A/M)^{b} \right],$$
is a normalization constant that makes (14) a proper
probability distribution. Observe that if $e^{\gamma(\tau - v)} \le m$ then
distribution (14) takes on a simpler form
$$p(x) = C_1 x^{-1} \left[ (M/x)^{a} - (x/M)^{b} \right], \qquad m \le x \le M, \tag{16}$$
with
$$C_1^{-1} = a^{-1} \left[ (M/m)^{a} - 1 \right] - b^{-1} \left[ 1 - (m/M)^{b} \right].$$
In particular, this is the case for $v = \tau$, that is, when the
primary tumor remained untreated by the time when the
sizes of metastases were measured.
Note that the function $p$ in (16) is decreasing for all
values of parameters $a, b > 0$ and $M > m$. The same is true
for function $p$ given by the more general model (14)
provided that $b < 1$. In the case $b > 1$, function (14)
increases on the interval $[m, A]$ (or remains constant if
$b = 1$) and decreases on the interval $[A, M]$. Also, function
(14) is continuous at the point $A$, and both functions (14)
and (16) vanish at the point $M$.
Some of the limiting forms of the full model (14) and its
particular case (16) are discussed below.
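The full model density can be sketched in code as follows. This is an illustrative implementation under the reading that the normalization constant is the reciprocal of the integral of the unnormalized two-branch density over the interval from m to M; the function names and the numerical values are hypothetical. The second function integrates the density numerically as a sanity check that it is properly normalized.

```python
import math

def full_model_pdf(x, m, A, M, a, b):
    """Two-branch density of the full model; needs 0 < m <= A < M and a, b > 0."""
    if x < m or x > M:
        return 0.0
    plateau = (M / A) ** a - (A / M) ** b
    # Integral of the unnormalized density over [m, M], i.e. 1 / C.
    inv_c = (plateau * (1.0 - (m / A) ** b) / b
             + ((M / A) ** a - 1.0) / a
             - (1.0 - (A / M) ** b) / b)
    if x < A:
        shape = plateau * (x / A) ** b          # branch m <= x < A
    else:
        shape = (M / x) ** a - (x / M) ** b     # branch A <= x <= M
    return shape / (x * inv_c)

def integrate_pdf(m, A, M, a, b, steps=20000):
    """Trapezoidal rule in log-size, since sizes may span orders of magnitude."""
    lo, hi = math.log(m), math.log(M)
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        # Clamp to [m, M] so that rounding in exp() cannot push x outside the support.
        x = min(max(math.exp(lo + k * h), m), M)
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * full_model_pdf(x, m, A, M, a, b) * x
    return total * h

# Hypothetical parameter values, for illustration only.
total_mass = integrate_pdf(m=0.5, A=2.0, M=50.0, a=0.8, b=1.5)
```

With these inputs `total_mass` should be very close to 1, and the density evaluates to zero at the right endpoint M, in line with the remark above.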
3.1.2. Homogeneous model
This model arises when $\theta = 0$ so that by setting $a = 0$ in
(14) we obtain
$$p(x) = \begin{cases}
C_2 x^{-1} \left[ 1 - (A/M)^{b} \right] (x/A)^{b} & \text{if } m \le x < A, \\
C_2 x^{-1} \left[ 1 - (x/M)^{b} \right] & \text{if } A \le x \le M,
\end{cases} \tag{17}$$
where parameters $M, A, b$ are given in (13) and (15), and
$$C_2^{-1} = \ln\frac{M}{A} - b^{-1} \left[ (m/A)^{b} - (m/M)^{b} \right].$$
If $A \le m$ then distribution (17) reduces to
$$p(x) = C_3 x^{-1} \left[ 1 - (x/M)^{b} \right], \qquad m \le x \le M, \tag{18}$$
with
$$C_3^{-1} = \ln\frac{M}{m} - b^{-1} \left[ 1 - (m/M)^{b} \right].$$
3.1.3. Instantaneous seeding model
If promotion time is very short (in other words, if $\rho \to 0$)
then the p.d.f. $p$ takes on an especially simple form.
Proceeding from (8) and (11) we obtain the power
distribution
$$p(x) = C_4 x^{-a-1}, \qquad A \le x \le M, \tag{19}$$
where
$$C_4 = \frac{a}{A^{-a} - M^{-a}}.$$
A homogeneous version of this model ($\theta = 0$) is obtained
by setting $a = 0$. In this case
$$p(x) = \frac{1}{\ln(M/A)\, x}, \qquad A \le x \le M. \tag{20}$$
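Because the instantaneous seeding density is a truncated power law, its c.d.f. can be inverted in closed form. The sketch below, with a hypothetical function name and made-up parameter values, draws a random sample by inverse transform sampling and sorts it, which by the Theorem of Section 2 mimics the ordered sizes of n detectable metastases. The exponent is called `a` here, and the case `a == 0` falls back to the homogeneous version of the model.

```python
import math
import random

def sample_instantaneous_seeding(n, A, M, a, rng):
    """Draw n sizes from the truncated power law on [A, M] and return them sorted."""
    sizes = []
    for _ in range(n):
        u = rng.random()                       # uniform on [0, 1)
        if a == 0.0:
            x = A * (M / A) ** u               # log-uniform case, density 1 / (ln(M/A) x)
        else:
            a_pow, m_pow = A ** (-a), M ** (-a)
            # Invert the c.d.f. (A^-a - x^-a) / (A^-a - M^-a) = u for x.
            x = (a_pow - u * (a_pow - m_pow)) ** (-1.0 / a)
        sizes.append(x)
    return sorted(sizes)

# Hypothetical parameters; a seeded generator keeps the illustration reproducible.
ordered_sizes = sample_instantaneous_seeding(31, A=2.0, M=50.0, a=0.8,
                                             rng=random.Random(0))
```

Simulated samples of this kind are one way to check an estimation routine before applying it to patient data.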
3.2. Identification of model parameters
To reconstruct the natural history of the disease, one has
to estimate kinetic parameters $\beta, \theta, \gamma, \rho$ from the observed
sample of sizes of detectable metastases. Such estimation is
possible only if the corresponding model is identifiable
(Hanin, 2002). We first establish identifiability of the
parameters $M, A, a, b$ of the full model assuming that
$a, b > 0$ and $m \le A < M$.
Proposition. Parameters $M, A, a, b$ and $M, a, b$ are uniquely
determined by the models (14) and (16), respectively.
The proof of the proposition is given in the Appendix.
Similar but simpler arguments would show that respective
parameters of the models (17)–(20) are jointly identifiable
as well.
We now address the reconstruction of the kinetic
parameters $\beta, \theta, \gamma, \rho$ of the full model based on the known
parameters $M, A, a, b$ (or assuming they were already
estimated from the observed sizes of detectable metastases)
and given quantities $\tau, u, v, S$. Consider the following cases:
(1) Suppose that $\tau > u$ and $A > m$ (the latter implies in
particular that $\tau > v$). Then parameters $\beta, \theta, \gamma, \rho$ are
uniquely determined by the parameters $M, A, a, b$ provided
the latter satisfy certain conditions. In fact, using formulas
(13) and (15) we find easily that
$$\beta = \frac{\ln A \, \ln S}{(\tau - v) \ln M - (\tau - u) \ln A}, \qquad \gamma = \frac{\ln A}{\tau - v}, \qquad \rho = \frac{\tau - v}{b \ln A} \tag{21}$$
and
$$\theta = a\, \frac{(\tau - v) \ln M - (\tau - u) \ln A}{(\tau - v) \ln S}. \tag{22}$$
Clearly, conditions $\theta > 0$ and (12) are satisfied if and only if
$$\frac{\tau - u}{\tau - v} < \frac{\ln M}{\ln A} \le \frac{\tau}{\tau - v}. \tag{23}$$
In particular, in the case $v = u$ (that is, when the primary
tumor is resected at the time of diagnosis) we have
$$\beta = \frac{\ln A \, \ln S}{(\tau - u) \ln(M/A)}, \qquad \gamma = \frac{\ln A}{\tau - u}, \qquad \rho = \frac{\tau - u}{b \ln A} \qquad \text{and} \qquad \theta = a\, \frac{\ln(M/A)}{\ln S} \tag{24}$$
under the condition that
$$\frac{\ln M}{\ln A} \le \frac{\tau}{\tau - u}. \tag{25}$$
(2) Keeping the assumption $\tau > u$ we now suppose that
$A = m$. In this case only three out of the four parameters
$\beta, \theta, \gamma, \rho$ of model (16) can be determined from (13) and the
first two equations in (15).
(3) We are now left with the case $\tau = u = v$ where the
sizes of metastases were measured concurrently with the
size of the primary tumor. Here $A = m$, and the relations
between kinetic parameters $\beta, \theta, \gamma, \rho$ and parameters $a, b, M$
of model (16) are given by
$$\frac{\theta\beta}{\gamma} = a, \qquad \gamma\rho = b^{-1} \qquad \text{and} \qquad \frac{\gamma}{\beta} = \frac{\ln M}{\ln S}.$$
In this case, surprisingly, parameter $\theta$ is identifiable:
$\theta = a \ln M / \ln S$. However, it is only the combinations $\gamma\rho$
and $\beta/\gamma$ that can be estimated from the given observations.
Turning to the homogeneous model we find that in the
case where $\tau > u$ and $A > m$, parameters $\beta, \gamma, \rho$ of the model
(17) are given by formulas (21) under condition (25) while
in all other cases these three parameters are not determined
uniquely by parameters $M, A, b$ of model (18). Finally,
parameters $\beta, \theta, \gamma$ of the instantaneous seeding model (19)
can be obtained from $M, A, a$ only in the case $\tau > u$, $A > m$;
they are given by the relevant formulas in (21) and (22)
under conditions (23). The same is true regarding the
recovery of parameters $\beta, \gamma$ of the model (20) from its
natural parameters $M$ and $A$.
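For the common scenario in which the primary tumor is resected at diagnosis, the conversion from the distribution parameters to the kinetic parameters can be sketched as below. The function name and the numerical values are hypothetical, and sizes are assumed to be expressed in cells. The usage part builds the distribution parameters from made-up kinetic parameters via (13) and (15) and then recovers them, which is a simple round-trip check of the algebra.

```python
import math

def kinetic_parameters(M, A, a, b, tau, u, S):
    """Formulas (24): beta, gamma, rho, theta when v = u, tau > u and A > m."""
    if math.log(M) / math.log(A) > tau / (tau - u):
        raise ValueError("condition (25) is violated")
    gamma = math.log(A) / (tau - u)
    beta = math.log(A) * math.log(S) / ((tau - u) * math.log(M / A))
    rho = (tau - u) / (b * math.log(A))
    theta = a * math.log(M / A) / math.log(S)
    return {"beta": beta, "gamma": gamma, "rho": rho, "theta": theta}

# Round trip with hypothetical kinetic parameters (rates per year, ages in years).
true = {"beta": 0.5, "gamma": 0.3, "rho": 0.2, "theta": 0.7}
u, tau, S = 74.0, 82.0, 1.03e10

A = math.exp(true["gamma"] * (tau - u))                        # (15) with v = u
M = S ** (true["gamma"] / true["beta"]) * A                    # (13)
a = true["theta"] * true["beta"] / true["gamma"]               # (15)
b = 1.0 / (true["gamma"] * true["rho"])                        # (15)

recovered = kinetic_parameters(M, A, a, b, tau, u, S)
```

Raising an error when (25) fails is a design choice of this sketch; an estimation pipeline might instead flag the fit as incompatible with a non-negative onset age.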
3.3. Statistical estimation of model parameters
Parameters $M, A, a, b$ involved in models (17)–(20) can
be estimated by maximizing the joint likelihood of
observations. Due to formula (10), the likelihood function
(with the factor $n!$ omitted) for either of these models is
given by
$$L(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} p(x_i). \tag{26}$$
Notice that it has exactly the same form it would take
should the observations be independent.
We first discuss the instantaneous seeding model (19).
The positivity of likelihood requires that $A \le x_1$ and
$M \ge x_n$. Also, because the p.d.f. (19) is monotonically
increasing in $A$ and decreasing in $M$ we conclude that
$\hat{A} = x_1$ and $\hat{M} = x_n$, where the hat stands for the maximum likelihood
estimator. Thus, the likelihood (26) becomes
$$L(a \mid x_1, x_2, \ldots, x_n) = \left( \frac{a}{x_1^{-a} - x_n^{-a}} \right)^{n} \left( \prod_{i=1}^{n} x_i \right)^{-a-1}.$$
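By taking the logarithm, changing the sign, and dividing by
$n$, we reduce the problem to minimizing the function
$$\Lambda(a) := -\frac{1}{n} \ln L(a) = \ln\left( x_1^{-a} - x_n^{-a} \right) + (a + 1)\,\xi - \ln a, \qquad a > 0,$$
where
$$\xi := \frac{1}{n} \sum_{i=1}^{n} \ln x_i. \tag{27}$$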
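We have
$$\Lambda'(a) = \frac{\ln(x_n/x_1)}{(x_n/x_1)^{a} - 1} + \xi - \ln x_1 - \frac{1}{a} = -\ln(x_n/x_1) \left[ \frac{1}{\ln(y + 1)} - \frac{1}{y} - c \right], \tag{28}$$
where $y := (x_n/x_1)^{a} - 1$ and
$$c := \frac{\xi - \ln x_1}{\ln(x_n/x_1)}. \tag{29}$$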
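Notice that $y > 0$ and $0 < c < 1$. The function
$$g(y) := \frac{1}{\ln(y + 1)} - \frac{1}{y}, \qquad y > 0,$$
is decreasing, $\lim_{y \to 0} g(y) = \frac{1}{2}$ and $\lim_{y \to \infty} g(y) = 0$. By (28),
$\Lambda'(a)$ has the same sign as $c - g(y)$, and $y$ increases with $a$. Therefore,
in the case $c < \frac{1}{2}$ the function
$\Lambda$ has a unique minimizer $\hat{a} > 0$ while for $c \ge \frac{1}{2}$ the
minimizer is $\hat{a} = 0$ in which case the best fitting model
becomes
$$p(x) = \frac{1}{\ln(x_n/x_1)\, x}, \qquad x_1 \le x \le x_n.$$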
The full and homogeneous models do not seem to allow for
a similar explicit computation of the maximum likelihood
estimates of their parameters. In the data analysis example
discussed in the next section these estimates were obtained
numerically.
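For the instantaneous seeding model, the estimation steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the exponent is called `a`, and the root of the derivative of the normalized negative log-likelihood is located by plain bisection. The sketch assumes a sample with at least two distinct positive sizes.

```python
import math

def neg_loglik_derivative(a, x1, xn, xi):
    """Derivative of the normalized negative log-likelihood with respect to a > 0."""
    log_ratio = math.log(xn / x1)
    # expm1 keeps (xn/x1)**a - 1 accurate when a is very small.
    return log_ratio / math.expm1(a * log_ratio) + xi - math.log(x1) - 1.0 / a

def fit_instantaneous_seeding(sizes, iterations=200):
    """Return (A_hat, M_hat, a_hat) for the truncated power-law model."""
    xs = sorted(sizes)
    x1, xn = xs[0], xs[-1]                               # estimates of A and M
    xi = sum(math.log(x) for x in xs) / len(xs)          # mean log-size
    c = (xi - math.log(x1)) / math.log(xn / x1)
    if c >= 0.5:
        return x1, xn, 0.0                               # degenerate, log-uniform model
    lo, hi = 1e-8, 1.0
    while neg_loglik_derivative(hi, x1, xn, xi) < 0.0:   # expand until the sign changes
        hi *= 2.0
    for _ in range(iterations):                          # bisection on the derivative
        mid = 0.5 * (lo + hi)
        if neg_loglik_derivative(mid, x1, xn, xi) < 0.0:
            lo = mid
        else:
            hi = mid
    return x1, xn, 0.5 * (lo + hi)

# Made-up sample, for illustration only.
A_hat, M_hat, a_hat = fit_instantaneous_seeding([1.0, 1.2, 1.5, 2.0, 3.0, 5.0, 10.0])
```

Bisection is used here because the derivative is monotone in `a`, so a sign change brackets the unique minimizer; any other one-dimensional root finder would serve equally well.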
4. Data analysis
To ascertain applicability of the above model for making
inference about the natural history of disseminated cancer,
we used a data base of breast cancer patients who were
diagnosed and treated at the Memorial Sloan-Kettering
Cancer Center (MSKCC) and for whom the whole body
PET/CT scans are available. The treatments included
various combinations and time courses of radical surgery,
radiation-, chemo-, and hormonal therapies. Reading PET/
CT scans to detect metastases and measure their volumes is
a laborious procedure that requires significant effort and
expertise. As of now such analysis has been done for 40
patients from the data base. To be useful for our analysis,
the patients had to satisfy several requirements: (1) the
number of metastases in a single site is large enough; (2)
primary tumor volume at presentation is available; (3) the
volumes of the primary tumor and metastases were
measured at different times; and (4) the time course of
treatment allows application of the above parametric
model of metastatic cancer natural history with a single
set of kinetic parameters $\beta, \gamma, \theta, \rho$.
Only one patient among the 40 satisfied these conditions.
At the age of 74 (specifically, on 4/1/96), this patient was
diagnosed with stage III estrogen receptor positive primary
breast cancer, and the volume of the primary tumor was
found to be $10.3\ \mathrm{cm}^3$. Shortly after the diagnosis she
underwent radical surgery and was put on an adjuvant
hormonal therapy with tamoxifen. At the age of 82 (more
exactly, on 4/6/04), 37 detectable bone, lung, lymph and
soft tissue metastases were discovered, and their sizes were
identified through PET/CT images. The prevalent metastatic
site was the skeletal system that was found to contain
31 bone metastases. The sizes of these metastases were
26, 31, 31, 31, 33, 34, 38, 47, 49, 51, 52, 54, 54, 55, 65, 67, 73,
78, 78, 81, 84, 87, 98, 101, 114, 139, 142, 172, 196, 213, 354
pixels, the volume of one pixel being approximately
$0.065\ \mathrm{cm}^3$. To convert these readings into volumes, we
resolved equal metastasis sizes in pixels by spreading them
uniformly over the corresponding pixel bins. The resulting
volumes rounded to two decimal places are
1.69, 1.98, 2.01, 2.04, 2.14, 2.20, 2.46, 3.05, 3.18, 3.31, 3.37,
3.48, 3.52, 3.57, 4.22, 4.34, 4.73, 5.04, 5.08, 5.25, 5.45, 5.64,
6.36, 6.55, 7.39, 9.01, 9.21, 11.15, 12.71, 13.81, 22.96 $\mathrm{cm}^3$.
In addition to 31 bone metastases, three lung metastases
with the volumes of 1.30, 2.01 and $7.26\ \mathrm{cm}^3$, two lymph
node metastases with the volumes of 2.85 and $9.66\ \mathrm{cm}^3$, and
one soft tissue metastasis with the volume of $11.41\ \mathrm{cm}^3$
were detected. The threshold of measurable volumes for
the PET/CT scanner at hand was $m = 5 \times 10^{8}$ cells that
corresponds to $0.5\ \mathrm{cm}^3$ based on the volume of one tumor
cell of $10^{-9}\ \mathrm{cm}^3$ that was assumed in this study.
Parameters of the full, homogeneous, and instantaneous
seeding models were estimated using the maximum likelihood
methodology. Our computation was based on the
above volumes of $n = 31$ detectable metastases, the known
quantities $u = v = 74$ years, $\tau = 82$ years, $S = 10.3\ \mathrm{cm}^3$ and
$m = 5 \times 10^{8}$ cells. The quantity $c$ defined in (29) and (27)
turned out to be about 0.382 which suggests the use of the
instantaneous seeding model (19) with $a > 0$, as opposed to
its degenerate version (20). The maximum likelihood
estimate of parameter $b$ in the full model turned out to
be very large which implies, in accordance with (24), that
the mean promotion time $\rho$ is very small. Thus, the best
fitting full model degenerated into the instantaneous
seeding model. This is also corroborated by a decreasing
pattern in the histogram of the observed volumes of
detectable metastases, see Fig. 2. Maximum likelihood
estimates of the parameters $M, A, a, b$ of the instantaneous
seeding and homogeneous models are compared in Table 1.
Clearly, conditions $\tau > u$, $A > m$ and (25) for both models
are satisfied, and hence their kinetic parameters are
identifiable (see Section 3.2). The values of these parameters
and the estimated date of the disease onset
computed using formulas (24) and (3) are presented in
Table 2.
To assess the adequacy of the instantaneous seeding and
homogeneous models, we compared the empirical c.d.f.
and the model-based c.d.f. with the optimum parameters,
see Fig. 3(a), (b). A visual comparison shows that, with
these parameters, the instantaneous seeding model provides
an excellent fit to the empirical distribution of the
sizes of detectable metastases, whereas the fit of the
homogeneous model is considerably worse. This conclusion
is confirmed by the values of the $L^2$ distance between the
empirical and theoretical c.d.f.s as well as the values of the
log-likelihood $\mathcal{L} = -(1/n) \ln L$ for the two models
reported in Table 1. Finally, 95% confidence bounds for the
parameters of the best fitting instantaneous seeding model
are also given in Table 1.
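
The likelihood-based comparison above can be illustrated with a short sketch for the instantaneous seeding model. It is not the authors' code. Formula (19) is not reproduced in this section, so the sketch assumes that the model density is the truncated power law $p(x) = a x^{-(1+a)} / (A^{-a} - M^{-a})$ on $[A, M]$ with $a > 0$, a form suggested by the description of the model as a single power function. It further assumes that $A$ and $M$ are estimated by the smallest and largest observed volumes (the values 1.69 and 22.96 cm$^3$ reported in Table 1) and that the reported criterion is the mean negative log-likelihood $-(1/n) \ln L$. The exponent is found by a coarse grid search, and all function names are hypothetical.

```python
import math

# Volumes (cm^3) of the 31 detectable bone metastases listed in the text.
VOLUMES = [1.69, 1.98, 2.01, 2.04, 2.14, 2.20, 2.46, 3.05, 3.18, 3.31, 3.37,
           3.48, 3.52, 3.57, 4.22, 4.34, 4.73, 5.04, 5.08, 5.25, 5.45, 5.64,
           6.36, 6.55, 7.39, 9.01, 9.21, 11.15, 12.71, 13.81, 22.96]


def power_law_cdf(x, a, A, M):
    """C.d.f. of the assumed truncated power-law density on [A, M]."""
    return (A ** -a - x ** -a) / (A ** -a - M ** -a)


def mean_neg_loglik(volumes, a, A, M):
    """Mean negative log-likelihood -(1/n) ln L under the assumed density."""
    log_norm = math.log(a / (A ** -a - M ** -a))
    total = sum(log_norm - (1.0 + a) * math.log(x) for x in volumes)
    return -total / len(volumes)


def fit_exponent(volumes, A, M):
    """Grid search for the exponent a in steps of 0.01 over (0, 3]."""
    grid = [i / 100 for i in range(1, 301)]
    return min(grid, key=lambda a: mean_neg_loglik(volumes, a, A, M))


A_hat, M_hat = min(VOLUMES), max(VOLUMES)  # boundary estimates of A and M
a_hat = fit_exponent(VOLUMES, A_hat, M_hat)
nll = mean_neg_loglik(VOLUMES, a_hat, A_hat, M_hat)
print(a_hat, round(nll, 3))
```

Under these assumptions the grid search returns an exponent close to 0.56 and a criterion value close to 2.40, in line with the instantaneous seeding row of Table 1. The function `power_law_cdf` gives the model-based c.d.f. that would be compared with the empirical c.d.f.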
5. Discussion
Mathematical modeling of complex biomedical processes
is always a balancing act that attempts to achieve a
delicate compromise between model adequacy, its mathematical
tractability, and identifiability of model parameters.
The complexity of the model should match the
available data from which model parameters are estimated
and against which model adequacy is tested. To be useful,
the model should provide a reasonably good fit to the data
and capture the most salient features of the process of
interest while disregarding its less important aspects.
In this work we developed a detailed mechanistic model
of metastatic progression of cancer in a given site based on
the assumption that metastasis formation is governed by a
non-homogeneous Poisson process with the intensity
proportional to some power of the primary tumor size.
This model allows for arbitrary laws of growth of primary
tumor and metastases and any distribution of metastasis
promotion time. We also formulated and thoroughly
studied a parametric model (termed the full model) that
assumes non-random exponential growth of primary
tumor and metastases and exponentially distributed
metastasis promotion time, as well as its two limiting cases
(called instantaneous seeding and homogeneous models).
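
The mechanism behind the parametric model can be illustrated with a small simulation. The sketch below is not the authors' implementation and rests on several assumptions that are not spelled out in this section: time is measured in years from disease onset, the primary tumor and every metastasis start from a single cell, shedding stops when the primary tumor is resected at time `v`, and the shedding intensity is proportional to the primary tumor size raised to the power `theta`. Conditioning on the number `n` of detectable metastases is done by drawing candidates until `n` of them exceed the detection threshold. The parameter names are hypothetical labels for the primary growth rate (`beta`), the metastasis growth rate (`gamma`), the shedding exponent (`theta`) and the mean promotion time (`rho`). The numerical values are the full-model estimates quoted in the Discussion, while `v = 1` and `t = 9` years are rounded, illustrative times.

```python
import math
import random


def sample_detectable_sizes(n, beta, gamma, theta, rho, v, t, m, rng):
    """Return n simulated detectable metastasis sizes (in cells), ascending.

    Shedding times on [0, v] have density proportional to
    exp(theta * beta * s), the power theta of an exponentially growing
    primary tumor. Each metastasis waits an exponential promotion time
    with mean rho and then grows exponentially at rate gamma until time t.
    """
    k = theta * beta
    sizes = []
    while len(sizes) < n:
        u = rng.random()
        # Inverse-c.d.f. draw of the shedding time from the normalized intensity.
        s = u * v if k == 0 else math.log1p(u * math.expm1(k * v)) / k
        # Promotion time; rho == 0 corresponds to instantaneous seeding.
        delay = rng.expovariate(1.0 / rho) if rho > 0 else 0.0
        start = s + delay
        if start >= t:
            continue  # growth has not started by the survey time
        size = math.exp(gamma * (t - start))
        if size >= m:  # keep only metastases above the detection threshold
            sizes.append(size)
    return sorted(sizes)


rng = random.Random(0)
sizes = sample_detectable_sizes(n=31, beta=24.26, gamma=2.68, theta=0.064,
                                rho=0.059, v=1.0, t=9.0, m=5e8, rng=rng)
volumes_cm3 = [x * 1e-9 for x in sizes]  # assumes 1e-9 cm^3 per cell
```

Setting `rho=0` gives the instantaneous seeding limit, in which every simulated size is at least `exp(gamma * (t - v))`.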
The full and instantaneous seeding models with the
maximum likelihood parameters estimated from the
volumes of 31 detectable bone metastases observed in a
breast cancer patient turned out to be almost indistinguishable
and provided a remarkably good fit to the empirical
distribution of the volumes of metastases. This supports
the validity of our methodological approach.

Fig. 2. Equal areas histogram of the volumes of $n = 31$ detectable bone
metastases (horizontal axis: metastasis volume in cm$^3$).

Table 1
Maximum likelihood estimates of model parameters with 95% confidence bounds for instantaneous seeding model

Model                   $M$ (cm$^3$)       $A$ (cm$^3$)      $a$               $b$        $\mathcal{L}$   $L^2$ distance
Instantaneous seeding   $22.96 \pm 8.81$   $1.69 \pm 0.24$   $0.56 \pm 0.49$   $\infty$   2.396           0.200
Homogeneous             24.58              2.01              0                 5.87       2.514           0.549

Table 2
Parameters of the disease natural history

Model                   $\beta$ (years$^{-1}$)   $\gamma$ (years$^{-1}$)   $\theta$   $\rho$ (years)   Date of onset
Instantaneous seeding   23.47                    2.66                      0.063      0                4/7/95
Homogeneous             24.66                    2.68                      0          0.064            4/25/95
One of the main limitations of our general non-
parametric model of disseminated cancer natural history
is that it neglects secondary metastasis. The parametric
version of this model is of necessity even more simplistic in
several respects. First, it proceeds from the exponential
growth laws for primary tumor and metastases that may
not be completely adequate. Second, it assumes an
exponential distribution for the metastasis promotion time
that may prove to be too rigid. Third, it postulates the
same metastatic growth rates and mean promotion times
for the entire period from the onset of the disease to the
time of metastasis surveying. However, taking into account
secondary metastases, assuming more flexible multiparametric
growth laws and promotion time distributions (for
example, Gompertz growth and gamma distributed promotion
times), and changing model parameters at the time
of primary tumor resection and/or start of treatment for
metastases would make the number of model parameters
prohibitively large, given the data available in our study.
Our analyses led us to the following conclusions:
1. Within the adopted model of cancer development, the
sizes of site-specific detectable metastases at a given time
post-treatment, rearranged in increasing order, are
equidistributed, conditional on their number $n$, with the
order statistics for a sample of size $n$ drawn from a
probability distribution $p$ given by formula (11). This
distribution is free of $n$ and independent of the rate of
metastasis inception. However, the latter parameter is
critical for determination of the distribution of the number
of both detectable and occult metastases. Therefore,
estimation of the extent and dynamics of metastasis in a
given patient requires a series of measurements of the
number and volumes of detectable metastases taken at
different time points. Furthermore, formula (11) with
$m = 1$ allows the computation of the distribution of the sizes of
all metastases at any time post-diagnosis and hence of the
sizes of occult metastases alone.
2. The four-parameter full model (14) for the p.d.f. $p$,
which governs the distribution of sizes of detectable
metastases, and its particular case (16) and degenerate
versions (17)–(20) are represented by various combinations
of power functions and display a variety of patterns
including decreasing and peak-shaped patterns. The
instantaneous seeding model (19) and its degenerate
version (20) are reduced to a single power function. All
these models are identifiable in that the set of their natural
parameters $M, A, a, b$ (or a subset thereof) is uniquely
determined by the p.d.f. $p$. However, the set of biologically
meaningful kinetic parameters $\beta, \gamma, \theta, \rho$ descriptive of the
natural history of the disease, or its respective subsets, are
identifiable from the models (14), (17), (19) and (20) only if
the time points $u$ and $t$ at which the sizes of the primary tumor and
metastases were surveyed are distinct ($t > u$) and $A > m$.
However, even when $t = u$, parameter $\theta$ can still be
recovered from models (16) and (19). If the size $S$ of the
primary tumor at diagnosis is unknown, then in the case
$t > u$, $A > m$ the rate of growth of metastases $\gamma$ and their
mean promotion time $\rho$ can nevertheless be found through
formulas (21).
3. Knowledge of the distribution of the sizes of
detectable metastases at a certain time post-diagnosis
makes it possible to estimate, under the conditions specified
in Section 3.2, all parameters of the individual natural
history of the disease except for the intensity of metastasis
shedding by the primary tumor. These parameters include
the time of onset $t$, the rates of growth $\beta$ and $\gamma$ of the
primary tumor and metastases, respectively, and the mean
metastasis promotion time $\rho$. In particular, we found that
the patient under analysis was afflicted with an extremely
aggressively growing primary tumor ($\beta = 23.47$ years$^{-1}$,
which corresponds to a tumor doubling time of about
10.8 days). By contrast, the metastases were growing much
more slowly, as suggested by the growth rate $\gamma = 2.66$ years$^{-1}$,
which corresponds to a doubling time of approximately
95 days. This difference is most likely due to estrogen
deprivation caused by hormonal therapy. It is worth noting
that, as follows from Table 2, the onset of the disease
occurred just about 1 year prior to the detection of the
primary tumor. Because this time is much smaller than the
time of about 8 years between the primary diagnosis and
detection of metastases, our extension of the rates of
growth of metastases and their inception in the host site
under hormonal therapy to the entire period between
metastasis shedding and detection is justified. We observe
also that, although parameters $\theta$ and $\rho$ differed substantially
between the instantaneous seeding and homogeneous
models, the growth rates $\beta$ and $\gamma$ estimated from these
models remained remarkably stable (see Table 2). Finally,
the estimated promotion time for bone metastases proved
to be negligibly small.

Fig. 3. Comparison between the empirical and model-based c.d.f.s of the
volumes of $n = 31$ detectable bone metastases: (a) instantaneous seeding
model ($\rho = 0$); (b) homogeneous model ($\theta = 0$). Horizontal axis:
metastasis volume (in cm$^3$); vertical axis: c.d.f.

4. Interestingly enough, the full model applied to the
entire set of volumes of 37 detectable metastases did not
degenerate into the instantaneous seeding model and
provided an excellent fit to the observed distribution of
volumes of detectable metastases, with the log-likelihood
$\mathcal{L} = 2.486$ and an $L^2$ distance between the theoretical
and empirical c.d.f.s of 0.150 (compare with Table 1).
The following estimates of the kinetic parameters of
the full model were obtained: $\beta = 24.26$ years$^{-1}$,
$\gamma = 2.68$ years$^{-1}$, $\rho = 0.059$ years and $\theta = 0.064$. The mean
metastasis promotion time $\rho$ was the only kinetic
parameter that changed significantly when six additional
non-skeletal metastases were added to the data set. In fact,
it increased from almost zero for bone metastases to 21.5
days for multiple metastatic sites. This observation can
serve as indirect evidence that the main difference
between various metastatic sites is manifested in the
metastasis promotion times rather than growth rates, and
that promotion times for non-skeletal metastatic sites may
be quite long.
5. Within our model, the metastasis that was formed first
is the one that has the largest volume (22.96 cm$^3$) at the
time of metastasis surveying (4/6/04). The inception time of
this metastasis estimated through the instantaneous seeding
model is 4/17/95, that is, as early as 10 days after the
onset of the primary tumor. Although direct extrapolation
of the laws of growth of primary tumor and metastases to
such early times would be misleading, the primary tumor
was clearly undetectable at the time of inception of the first
metastasis. This very preliminary observation may shed
some light on the limited success with which treatment of
metastatic breast cancer has met so far. It may also suggest
that certain categories of patients should be treated for
occult metastases concurrently with or shortly after
extirpation of the primary tumor.
6. Parameter $\theta$ exerts its influence on the sizes of
detectable metastases through the rate $\mu$ of metastasis
shedding by the primary tumor (see formula (1)). Because
the homogeneous model, in which $\theta = 0$, is clearly inferior to
the instantaneous seeding model in fitting the sizes of
detectable metastases, the impact of parameter $\theta$ and hence
of the size of the primary tumor is significant. However, the
values of parameter $\theta$ produced by the instantaneous
seeding model applied to 31 bone metastases and the full
model applied to 37 multiple site metastases (0.063 and
0.064, respectively) are surprisingly low. One could expect
that $\mu$ should be proportional to the tumor surface area, in
which case $\theta$ should be about 2/3. Yet another plausible
consideration is that $\theta$ should be close to the fractal
dimension of blood vessels that feed the primary tumor,
which was also estimated to be about 2/3, see Iwata et al.
(2000). The observation that $\theta$ is relatively small, if
confirmed in further analyses, may have important
consequences for the still debated question of whether the
outcome of local treatment has a significant effect on
distant metastatic failure. Indeed, a negligible value of $\theta$
would show that the rate of metastasis shedding depends
only weakly on the primary tumor volume and would thus
support the notion that failure to eradicate the primary
tumor may not contribute in any important way to
cause-specific mortality.
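
The arithmetic behind several figures quoted in the conclusions above can be checked with a few lines of code. This is a minimal sketch with hypothetical function names. It assumes exponential growth of the form exp(rate x time), so that the doubling time is ln 2 divided by the rate, a year of 365.25 days, and a shedding rate proportional to the tumor size raised to the shedding exponent.

```python
import math
from datetime import date

DAYS_PER_YEAR = 365.25  # assumed length of a year


def doubling_time_days(rate_per_year):
    """Doubling time, in days, of exponential growth exp(rate * time)."""
    return math.log(2.0) / rate_per_year * DAYS_PER_YEAR


def shedding_gain_on_doubling(exponent):
    """Factor by which a shedding rate proportional to size**exponent
    increases when the primary tumor size doubles."""
    return 2.0 ** exponent


primary_doubling = doubling_time_days(23.47)     # primary tumor growth rate
metastasis_doubling = doubling_time_days(2.66)   # metastasis growth rate
promotion_days = 0.059 * DAYS_PER_YEAR           # full-model promotion time
lead_days = (date(1995, 4, 17) - date(1995, 4, 7)).days  # onset to first metastasis

print(round(primary_doubling, 1))     # 10.8
print(round(metastasis_doubling, 1))  # 95.2
print(lead_days)                      # 10
print(shedding_gain_on_doubling(0.063), shedding_gain_on_doubling(2 / 3))
```

The last line contrasts the two exponents discussed in conclusion 6: with an exponent of 0.063 a doubling of the primary tumor raises the shedding rate by only a few percent, whereas with an exponent of 2/3 it raises it by well over half.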
Acknowledgements
The authors are grateful to the anonymous reviewers for
their constructive criticism and helpful suggestions.
Appendix
To prove the proposition in Section 3.2 we note that the
support of the function $p$ given in (14) is the interval
$[m, M]$. Therefore, parameter $M$ is uniquely determined by
the distribution (14). Next, denote $q(x) := x p(x)$, $m \le x \le M$.
Assuming that $A > m$ and using formulas (14) we compute
the left- and right-sided derivatives of the function $q$ at the
point $A$:

$$q'(A-) = \frac{b}{CA}\left[\left(\frac{M}{A}\right)^{a} - \left(\frac{A}{M}\right)^{b}\right] > 0$$

and

$$q'(A+) = -\frac{1}{CM}\left[a\left(\frac{M}{A}\right)^{a+1} + b\left(\frac{A}{M}\right)^{b-1}\right] < 0.$$

Therefore, $A$ is the only point on the interval $[m, M]$ where
the derivative of the function $q$ is discontinuous. Thus,
parameter $A$ is identifiable from the distribution (14). Also,
because function (16) is continuously differentiable on
$[m, M]$, the same property of the point $A$ enables
discrimination between the cases $A = m$ and $A > m$.
Finally, we will show that parameters $a, b$ are also
uniquely determined by the p.d.f. (14) or its particular case
(16). A straightforward computation shows that

$$q'(M) = -\frac{a + b}{CM}. \tag{A.1}$$

Also, for the function $r(x) := q'(x) x^2$ we have

$$r'(M) = \frac{a(a - 1) - b(b + 1)}{C} \tag{A.2}$$

and

$$r''(M) = -\frac{a^2(a - 1) + b^2(b + 1)}{CM}. \tag{A.3}$$

Dividing Eqs. (A.3) and (A.2) by (A.1) yields

$$\frac{r'(M)}{M q'(M)} = b - a + 1$$

and

$$\frac{r''(M)}{q'(M)} = a^2 - ab + b^2 + b - a = (b - a)(b - a + 1) + ab.$$

Therefore, the combinations $b - a$ and $ab$ of the parameters
$a$ and $b$ are uniquely determined by the distributions
(14) or (16). Hence, so are the parameters $a$ and $b$, which
completes the proof.
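
The last step of the proof, recovering the two parameters from the combinations "b minus a" and "a times b", can be made explicit with a short sketch. It assumes, beyond what the appendix states, that a is non-negative and b is positive, so that b is the non-negative root of the quadratic z^2 - (b - a) z - ab = 0 and a is minus the other root. The function names and the arguments `ratio1` and `ratio2` (standing for the two ratios of derivatives at M obtained in the proof) are hypothetical.

```python
import math


def recover_a_b(diff, prod):
    """Recover (a, b) from diff = b - a and prod = a * b.

    Assumes a >= 0 and b > 0, so b is the non-negative root of
    z**2 - diff * z - prod = 0 and a = b - diff.
    """
    disc = math.sqrt(diff * diff + 4.0 * prod)
    b = (diff + disc) / 2.0
    a = b - diff
    return a, b


def recover_from_ratios(ratio1, ratio2):
    """Recover (a, b) from the two ratios used in the proof:
    ratio1 = b - a + 1 and ratio2 = (b - a) * (b - a + 1) + a * b.
    """
    diff = ratio1 - 1.0
    prod = ratio2 - diff * ratio1
    return recover_a_b(diff, prod)


# Round trip with illustrative values a = 0.56, b = 5.87.
a_true, b_true = 0.56, 5.87
r1 = b_true - a_true + 1.0
r2 = (b_true - a_true) * (b_true - a_true + 1.0) + a_true * b_true
a_rec, b_rec = recover_from_ratios(r1, r2)
```

The round trip returns the values it started from, which is the sense in which the two combinations determine a and b uniquely under the stated sign assumption.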
References
Barbour, A., Gotley, D.C., 2003. Current concepts of tumour metastasis.
Ann. Acad. Med. Singapore 32, 176–184.
Bartoszyński, R., Edler, L., Hanin, L., Kopp-Schneider, A., Pavlova, L.,
Tsodikov, A., Zorin, A., Yakovlev, A., 2001. Modeling cancer
detection: tumor size as a source of information on unobservable
stages of carcinogenesis. Math. Biosci. 171, 113–142.
Bernhardt, P., Forssell-Aronsson, E., Jacobsson, L., Skarnemark, G.,
2001. Low-energy electron emitters for targeted radiotherapy of small
tumors. Acta Oncol. 40, 602–608.
Chambers, A.F., Macdonald, I.F., Schmidt, E., Koop, S., Morris, V.L.,
Khokha, R., Groom, A.C., 1995. Steps in tumor metastasis: new
concepts from intravital videomicroscopy. Cancer Metastasis Rev. 14,
279–301.
Evans, C.W., 1991. The Metastatic Cell: Behavior and Biochemistry.
Chapman & Hall, London.
Fidler, I.J., 1990. Critical factors in the biology of human cancer
metastasis. 28th G.H.A. Clowes memorial award lecture. Cancer
Res. 50 (19), 6130–6138.
Fidler, I.J., 1991. The biology of human cancer metastasis. 7th Jan
Waldenström lecture. Acta Oncol. 30 (6), 668–675.
Fidler, I.J., 1997. Molecular biology of cancer: invasion and metastasis.
In: DeVita, V.T., Hellman, S., Rosenberg, S.A. (Eds.), Cancer
Principles and Practice of Oncology, fifth ed. Lippincott-Raven
Publishers, Philadelphia, pp. 135–152.
Fidler, I.J., 2003. The pathogenesis of cancer metastasis: the seed and
soil hypothesis revisited. Nat. Rev. Cancer 3 (6), 453–458.
Goddu, S.M., Rao, D.V., Howell, R.W., 1994. Multicellular dosimetry for
micrometastases: dependence of self-dose versus cross-dose to cell
nuclei on type and energy of radiation and subcellular distribution of
radionuclides. J. Nucl. Med. 35 (3), 521–530.
Hanin, L.G., 2002. Identification problem for stochastic models with
application to carcinogenesis, cancer detection and radiation biology.
Discrete Dyn. Nat. Soc. 7 (3), 177–189.
Hanin, L.G., Yakovlev, A.Y., 1996. A nonidentifiability aspect of the
two-stage model of carcinogenesis. Risk Anal. 16, 711–715.
Hanin, L.G., Miller, A.B., Yakovlev, A.Y., Zorin, A.V., 2006. The
University of Rochester model of breast cancer detection and survival.
NCI Monograph Series, in press.
Iwata, K., Kawasaki, K., Shigesada, N., 2000. A dynamical model for the
growth and size distribution of multiple metastatic tumors. J. Theor.
Biol. 203, 177–186.
Kendal, W.S., 2001. The size distribution of human hematogenous
metastases. J. Theor. Biol. 211, 29–38.
Kendal, W.S., 2006. Chance mechanisms affecting the burden of
metastases. BMC-Cancer, 5, 138.
Luebeck, E.G., Moolgavkar, S.H., 2002. Multistage carcinogenesis and
the incidence of colorectal cancer. Proc. Natl Acad. Sci. USA 99 (23),
15095–15100.
Moolgavkar, S.H., Knudson, A.G., 1981. Mutation and cancer: a model
for human carcinogenesis. J. Natl Cancer Inst. 66, 1037–1052.
Moolgavkar, S.H., Luebeck, E.G., 1990. Two-event model for
carcinogenesis: biological, mathematical and statistical considerations. Risk
Anal. 10, 323–341.
Moolgavkar, S.H., Venzon, D.J., 1979. Two event model for
carcinogenesis: incidence curves for childhood and adult tumors. Math. Biosci.
47, 55–77.
Moolgavkar, S.H., Dewanji, A., Venzon, D.J., 1988. A stochastic two
stage model for cancer risk assessment. I. The hazard function and the
probability of tumor. Risk Anal. 8, 383–392.
O'Donoghue, J.A., 2000. Dosimetric principles of targeted radiotherapy.
In: Abrams, P.G., Fritzberg, A. (Eds.), Radioimmunotherapy of
Cancer. Marcel Dekker, New York, pp. 1–20.
Puri, P.S., 1967. A class of stochastic models of response after infection in
the absence of defense mechanism. In: Proceedings of the Fifth
Berkeley Symposium on Mathematical Statistics and Probability,
vol. 4. University of California Press, Berkeley, Los Angeles,
pp. 511–535.
Puri, P.S., 1971. A quantal response process associated with integrals
of certain growth processes. In: Wasan, M.T. (Ed.), Mathematical
Aspects of Life Sciences, Queen's Papers in Pure and
Applied Mathematics, vol. 26. Queen's University, Kingston, Ont.,
Canada.
Puri, P.S., Senturia, J., 1972. On a mathematical theory of quantal
response assays. In: Proceedings of the Sixth Berkeley Symposium on
Mathematical Statistics and Probability, vol. 4. University of
California Press, Berkeley, Los Angeles, pp. 231–247.
Ross, S.M., 1997. Introduction to Probability Models, sixth ed. Academic
Press, San Diego.
Schell, M.C., Bova, F.J., Larson, D.A., Leavitt, D.D., Lutz, W.R.,
Podgorsak, E.B., Wu, A., 1995. TG-42 Report on stereotactic external
beam irradiation. AAPM Report No. 54, Stereotactic Radiosurgery.
Yakovlev, A.Y., Polig, E., 1996. A diversity of responses displayed by a
stochastic model of radiation carcinogenesis allowing for cell death.
Math. Biosci. 132, 1–33.
Zorin, A.V., Hanin, L.G., Edler, L., Yakovlev, A.Y., 2005. Estimating the
natural history of breast cancer from bivariate data on age and tumor
size at diagnosis. In: Edler, L., Kitsos, C.P. (Eds.), Quantitative
Methods for Cancer and Human Health Risk Assessment. Wiley,
Chichester, pp. 317–327.