
Chapter 2 Basic Functions

In survival analysis a variety of different functions, which give alternative ways of defining a probability distribution, are used.

Survivor function
A probability distribution is completely defined by its distribution function $F(t) = P(T \le t)$, which when continuous has a probability density function $f(t)$.
By definition the pdf is
\[ f(t) = \frac{dF}{dt} \]
and a theorem establishes that
\[ F(t) = \int_0^t f(s)\,ds. \]
The survivor function, $S(t) = 1 - F(t) = P(T > t)$, is the probability of surviving beyond time $t$.

Exercise 2.1 Express $S'(t)$ in terms of $f(t)$.
Find $S(0)$.

Sol: 2.1
\[ S'(t) = \frac{dS(t)}{dt} = \frac{d}{dt}\bigl(1 - F(t)\bigr) = -f(t), \]
\[ S(0) = 1 - F(0) = 1 - 0 = 1 \quad \text{as } T > 0. \]

Hazard function
The hazard function $h(t)$ is defined by

\[ h(t) = \frac{f(t)}{S(t)}. \]

The motivation for this is that

\[ P(\text{failure before time } t + \delta t \mid \text{survival to time } t) = \frac{P(\text{failure in } (t, t + \delta t))}{P(\text{survival to time } t)} \approx \frac{f(t)\,\delta t}{S(t)}. \]

Thus, $h(t)$ is the instantaneous failure rate at time $t$ conditional on survival to time $t$.
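
The approximation behind this motivation can be checked numerically. The sketch below is only illustrative: the lifetime with $S(t) = \exp(-t^2)$ is an assumption made for the example (it does not come from these notes), as are the function names. It compares the exact conditional failure probability over a short interval with $h(t)\,\delta t$.

```python
import math

def survivor(t):
    # Assumed example lifetime (not from the notes): S(t) = exp(-t^2).
    return math.exp(-t * t)

def density(t):
    # f(t) = -S'(t) = 2 t exp(-t^2) for the assumed survivor function.
    return 2.0 * t * math.exp(-t * t)

def hazard(t):
    # h(t) = f(t) / S(t); for this example it simplifies to 2 t.
    return density(t) / survivor(t)

def conditional_failure_prob(t, dt):
    # P(failure in (t, t + dt) | survival to t) = (S(t) - S(t + dt)) / S(t)
    return (survivor(t) - survivor(t + dt)) / survivor(t)

t, dt = 1.5, 1e-4
exact = conditional_failure_prob(t, dt)
approx = hazard(t) * dt
print(exact, approx)  # the two values should agree to several digits
```

Shrinking `dt` further should bring the two numbers closer, which is the sense in which the hazard is an instantaneous rate.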

Exercise 2.2 Give an alternative definition of
the hazard function based on a derivative.

Sol: 2.2
\[ h(t) = \lim_{\delta t \to 0} \frac{P(T \in (t, t + \delta t] \mid T \in (t, \infty))}{\delta t}. \]

The cumulative or integrated hazard function is
\[ H(t) = \int_0^t h(u)\,du. \]

Other relationships follow:

\[ S(t) = 1 - \int_0^t f(u)\,du, \]
\[ f(t) = h(t)\,S(t), \]
\[ h(t) = \frac{d(-\log S(t))}{dt}, \]
\[ S(t) = \exp\{-H(t)\}. \]
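
These relationships mean that a hazard function alone is enough to recover the survivor function and the pdf. A minimal sketch follows, assuming a hypothetical linear hazard $h(t) = 2t$ and a simple trapezoidal rule for the integral; both are choices made for this illustration rather than anything prescribed in the notes.

```python
import math

def cumulative_hazard(h, t, n=10_000):
    # Trapezoidal approximation of H(t), the integral of h(u) over [0, t].
    step = t / n
    total = 0.5 * (h(0.0) + h(t))
    for i in range(1, n):
        total += h(i * step)
    return total * step

def survivor_from_hazard(h, t):
    # S(t) = exp{-H(t)}
    return math.exp(-cumulative_hazard(h, t))

def density_from_hazard(h, t):
    # f(t) = h(t) S(t)
    return h(t) * survivor_from_hazard(h, t)

def linear_hazard(t):
    # Assumed illustrative hazard h(t) = 2 t (not from the notes).
    return 2.0 * t

s = survivor_from_hazard(linear_hazard, 1.0)
f = density_from_hazard(linear_hazard, 1.0)
print(s, f)  # for this hazard H(1) = 1, so S(1) = exp(-1) and f(1) = 2 exp(-1)
```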

Exercise 2.3 Starting from $h(t) = f(t)/S(t)$, express the survivor function and the pdf in terms of the hazard function and the integrated hazard function.

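Sol: 2.3 Recall $S'(t) = -f(t)$ so that
\[ h(t) = -\frac{S'(t)}{S(t)} = -\frac{d \log S(t)}{dt}. \]

Taking the definite integral of both sides over $(0, t)$ and using $S(0) = 1$ gives $H(t) = -\log S(t)$, so

\[ S(t) = \exp\left\{ -\int_0^t h(u)\,du \right\} = e^{-H(t)}. \]

Also

\[ f(t) = h(t) S(t) = h(t) e^{-H(t)}. \]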

Relationships between basic functions
These functions capture the essential features of
lifetime variables.
Specifying one function completely determines
the others.
Consequently one may interchange between
them, but
a model may be better specified in terms of one
rather than another.

The exponential lifetime distribution

We look at a range of parametric models later; here we assume an exponential distribution:
$T \sim \mathrm{Exp}(\lambda)$.
Recall that the mean of this distribution is $1/\lambda$.
The exponential distribution has many properties
(including positivity) which make it appropriate as
a model for lifetimes.
However, it is quite restrictive and we also need
to consider generalisations.

For the exponential distribution $\mathrm{Exp}(\lambda)$:
Pdf: $f(t) = \lambda \exp(-\lambda t)$, $t \ge 0$;
Distribution function: $F(t) = 1 - \exp(-\lambda t)$, $t \ge 0$;
Survivor function: $S(t) = \exp(-\lambda t)$, $t \ge 0$;
Hazard function: $h(t) = \lambda$, $t \ge 0$;
Cumulative hazard function: $H(t) = \lambda t$, $t \ge 0$.
Check each of these expressions.
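
One way to check these expressions is numerically, as a complement to the algebra. In the sketch below the rate value and the function names are arbitrary assumptions. The pdf is compared with a finite-difference derivative of the distribution function, and the hazard and cumulative hazard are recomputed from their general definitions.

```python
import math

lam = 0.7  # assumed rate parameter, chosen arbitrarily for the check

def pdf(t):
    return lam * math.exp(-lam * t)

def cdf(t):
    return 1.0 - math.exp(-lam * t)

def survivor(t):
    return math.exp(-lam * t)

def check(t, eps=1e-6):
    # f = dF/dt, approximated by a central difference
    dF = (cdf(t + eps) - cdf(t - eps)) / (2 * eps)
    hazard = pdf(t) / survivor(t)        # h = f / S
    cum_hazard = -math.log(survivor(t))  # H = -log S
    return dF, hazard, cum_hazard

dF, hazard, cum_hazard = check(2.0)
print(dF, pdf(2.0))           # should agree closely
print(hazard, lam)            # hazard should equal the rate
print(cum_hazard, lam * 2.0)  # cumulative hazard should equal lam * t
```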

The hazard function is constant for the exponential distribution.
No matter how long a component has survived, its instantaneous failure rate remains constant.
This is the so-called lack of memory property, which among continuous distributions is possessed only by the exponential distribution.

Discussion
The hazard function tells us about the effect of time on the probability of failure of a component.
An increasing hazard function is indicative of ageing.
A decreasing hazard function may arise when
burn-in occurs, e.g. low grade components fail
first, leaving better quality components.
A bath-tub shaped hazard may combine these
two situations.
When the hazard changes with time, the
exponential distribution is inappropriate.

Some survival functions
[Figure: four panels titled Constant, NBU, NWU and Bathtub, each plotting Survival against Time for Time between 0 and 1.]

Some hazard functions
[Figure: four panels titled Constant, NBU, NWU and Bathtub, each plotting Hazard against Time for Time between 0 and 1.]

$H(t)$ can be any non-decreasing function satisfying $H(0) = 0$ and $H(t) \to \infty$ as $t \to \infty$.
$h(t)$ is the derivative of any such function.
$h(t)$ can have mid-life peaks, e.g. war, childbirth.

Censoring
The underlying lifetime variable is $T$ with pdf $f(t)$, but the observed lifetime variable when censoring occurs at a fixed time $c$ is
\[ X = \min(T, c). \]
The probability of censoring is $P\{T > c\}$ and
\[ P(T > c) = \int_c^\infty f(t)\,dt = S(c) \]
in terms of the survivor function.
Under the exponential assumption,
\[ P(T > c) = \int_c^\infty \lambda \exp(-\lambda t)\,dt = \exp(-\lambda c). \]


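The expected value of $X$ is smaller than the expected value of $T$:
\begin{align*}
E(X) = E(\min(T, c)) &= \int_0^\infty \min(t, c) f(t)\,dt \\
&= \left( \int_0^c + \int_c^\infty \right) \min(t, c) f(t)\,dt \\
&= \int_0^c t f(t)\,dt + \int_c^\infty c f(t)\,dt \\
&= [t F(t)]_0^c - \int_0^c F(t)\,dt + c S(c) \quad \text{(integration by parts)} \\
&= c F(c) - \int_0^c F(t)\,dt + c S(c) \\
&= c - \int_0^c F(t)\,dt \\
&= \int_0^c [1 - F(t)]\,dt = \int_0^c S(t)\,dt.
\end{align*}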
0 0

As $E(T) = \int_0^\infty S(t)\,dt$ we have
\[ E(X) = E(T) - \int_c^\infty S(t)\,dt. \]
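
For an exponential lifetime the integral of $S(t)$ over $(0, c)$ evaluates to $(1 - e^{-\lambda c})/\lambda$, which can be compared with a simulation. The following is a rough sketch: the parameter values, seed, and function names are assumptions made for illustration.

```python
import math
import random

def mean_censored_exponential(lam, c):
    # E(X) = integral of S(t) over [0, c] with S(t) = exp(-lam t)
    return (1.0 - math.exp(-lam * c)) / lam

def simulate(lam, c, n, seed=0):
    # Draw lifetimes T ~ Exp(lam) and record X = min(T, c).
    rng = random.Random(seed)
    observed = [min(rng.expovariate(lam), c) for _ in range(n)]
    mean_x = sum(observed) / n
    # An observation equal to c is a censored one.
    frac_censored = sum(1 for x in observed if x == c) / n
    return mean_x, frac_censored

lam, c = 2.0, 0.5
mean_x, frac_censored = simulate(lam, c, 200_000)
print(mean_x, mean_censored_exponential(lam, c))  # simulated vs. formula
print(frac_censored, math.exp(-lam * c))          # censoring rate vs. S(c)
```

The simulated mean of $X$ should also sit below $E(T) = 1/\lambda$, in line with the inequality stated above.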
c

Properties of the Exponential distribution

Example 2.4
Show that the probability that an individual lives longer than $t_1 + t_2$ years, given s/he has attained $t_1$ years, is equal to the unconditional probability that s/he survives at least $t_2$ years if and only if the survival distribution is exponential.
This requires the lemma that if $g(t_1 + t_2) = g(t_1) + g(t_2)$ for all $t_1$ and $t_2$ then $g(t) = \alpha t$ for some $\alpha \in \mathbb{R}$.
Proof: ($\Rightarrow$): Suppose

\[ P(T \ge t_1 + t_2 \mid T > t_1) = P(T \ge t_2). \]

Then
\[ \frac{P(T \ge t_1 + t_2)}{P(T \ge t_1)} = P(T \ge t_2), \]
\[ S(t_1 + t_2) = S(t_1) S(t_2), \]
\[ \log S(t_1 + t_2) = \log S(t_1) + \log S(t_2). \]

Let $g(t) = \log S(t)$; then the lemma gives

\[ \log S(t) = \alpha t \text{ for some } \alpha \in \mathbb{R}, \qquad S(t) = \exp(\alpha t). \]

Since $t > 0$ and $0 \le S(t) \le 1$, $\alpha < 0 \Rightarrow \alpha = -\lambda$ for some $\lambda > 0$.
Therefore
\[ S(t) = \exp(-\lambda t). \]

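($\Leftarrow$): If $S(t) = \exp(-\lambda t)$,
\begin{align*}
P(T \ge t_1 + t_2 \mid T > t_1) &= \frac{P(T \ge t_1 + t_2)}{P(T \ge t_1)} \\
&= \frac{\exp\{-\lambda (t_1 + t_2)\}}{\exp(-\lambda t_1)} \\
&= \exp(-\lambda t_2) \\
&= S(t_2) \\
&= P(T > t_2).
\end{align*}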

Exercise 2.5 log transform: If $T \sim \mathrm{Exp}(1)$ show that $U = \exp(-T) \sim \mathrm{Uniform}(0, 1)$, and consequently $-\log(U) \sim \mathrm{Exp}(1)$.

Sol: 2.5
For $0 < u < 1$,
\begin{align*}
P(U < u) &= P(\exp(-T) < u) \\
&= P(T > -\log(u)) \\
&= \exp[-(-\log(u))] && \text{survivor function} \\
&= u
\end{align*}
as required.
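
The log transform can also be checked by simulation, and its converse is the usual inverse-transform way of generating exponential variates. The sketch below makes its own choices of seed, sample size, and names, which are assumptions for illustration: it checks that $U = \exp(-T)$ looks uniform and that $-\log(U)$ has mean close to 1.

```python
import math
import random

rng = random.Random(1)
n = 100_000

# Forward direction: T ~ Exp(1), U = exp(-T) should be Uniform(0, 1).
u_samples = [math.exp(-rng.expovariate(1.0)) for _ in range(n)]

def empirical_cdf(samples, x):
    # Proportion of samples strictly below x.
    return sum(1 for s in samples if s < x) / len(samples)

for u in (0.1, 0.5, 0.9):
    print(u, empirical_cdf(u_samples, u))  # each proportion should be close to u

# Converse: U ~ Uniform(0, 1), so -log(U) should be Exp(1) with mean 1.
# 1 - rng.random() lies in (0, 1], which avoids taking log(0).
e_samples = [-math.log(1.0 - rng.random()) for _ in range(n)]
mean_e = sum(e_samples) / n
print(mean_e)
```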

Exercise 2.6 Scale multiplication: If $T \sim \mathrm{Exp}(\lambda)$ show that $X = \lambda T \sim \mathrm{Exp}(1)$ and that $S_T(t) = S_X(t)^{\lambda}$.

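Sol: 2.6
\begin{align*}
P(\lambda T > t) &= P(T > t/\lambda) \\
&= S_T(t/\lambda) \\
&= \exp(-\lambda t/\lambda) \\
&= \exp(-t)
\end{align*}
which is the survival function of $\mathrm{Exp}(1)$.

\[ S_T(t) = \exp(-\lambda t) = \exp(-t)^{\lambda} = S_X(t)^{\lambda}. \]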

Exercise 2.7 Lack of memory. If $T \sim \mathrm{Exp}(\lambda)$ and it is known that $T > t$ show that $T - t \sim \mathrm{Exp}(\lambda)$.
Writing down the LHS is the hard bit.

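Sol: 2.7
\begin{align*}
P(T - t > s \mid T > t) &= P(T - t > s \cap T > t) / P(T > t) && \text{definition of conditional probability} \\
&= P(T > s + t) / P(T > t) \\
&= \exp(-\lambda (s + t)) / \exp(-\lambda t) = \exp(-\lambda s).
\end{align*}
The survivor function determines the distribution, so $T - t \sim \mathrm{Exp}(\lambda)$ given $T > t$.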
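
A simulation makes the lack of memory property concrete: among the lifetimes that exceed $t$, the residual $T - t$ should exceed $s$ with probability $\exp(-\lambda s)$, whatever the value of $t$. The values of the rate, $t$, $s$, and the seed below are arbitrary assumptions.

```python
import math
import random

lam, t, s = 1.5, 0.4, 0.6  # assumed values for the illustration
rng = random.Random(2)
lifetimes = [rng.expovariate(lam) for _ in range(200_000)]

# Keep only the lifetimes known to exceed t, and look at the residual T - t.
residuals = [x - t for x in lifetimes if x > t]
cond_prob = sum(1 for r in residuals if r > s) / len(residuals)

print(cond_prob, math.exp(-lam * s))  # conditional estimate vs. exp(-lam s)
```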

Example 2.8 If $X_1, X_2 \sim \mathrm{Exp}(1)$ independently, show that the proportion of the total lifetime attributable to the first client, $B = X_1/(X_1 + X_2)$, has a uniform distribution on $(0, 1)$.
See the coursework for a proof.
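
Short of a proof, the claim in Example 2.8 can be made plausible by simulation. The sketch below is illustrative only, with an assumed seed and sample size. It draws pairs of independent standard exponentials and compares the empirical distribution function of $B$ with that of a uniform variable.

```python
import random

rng = random.Random(6)
n = 100_000

proportions = []
for _ in range(n):
    x1 = rng.expovariate(1.0)
    x2 = rng.expovariate(1.0)
    proportions.append(x1 / (x1 + x2))  # B = X1 / (X1 + X2)

def empirical_cdf(b):
    # Proportion of simulated B values strictly below b.
    return sum(1 for p in proportions if p < b) / n

for b in (0.25, 0.5, 0.75):
    print(b, empirical_cdf(b))  # each proportion should be close to b
```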

Example 2.9 Lack of memory and independence.
Lifetimes $T_1 \sim \mathrm{Exp}(\lambda_1)$ and $T_2 \sim \mathrm{Exp}(\lambda_2)$ are independent.
If it is known that both $T_1 > t$ and $T_2 > t$, show that $T_1 - t$ and $T_2 - t$ are also independent.
See the coursework for a proof.

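Exercise 2.10 If $T_1 \sim \mathrm{Exp}(\lambda_1)$ and $T_2 \sim \mathrm{Exp}(\lambda_2)$ independently, show that
\[ P(T_1 < T_2) = \lambda_1/(\lambda_1 + \lambda_2). \]
Does it make sense: e.g. if $\lambda_1$ is large, which is the most likely to fail first?
Conjecture the form of $P(T_1 < T_2 < T_3)$.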

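Sol: 2.10
\begin{align*}
P(T_1 < T_2) &= P(\lambda_1 T_1 < \rho \lambda_2 T_2) && \text{where } \rho = \lambda_1/\lambda_2 \\
&= P(X_1 < \rho X_2) && \text{standard exponentials} \\
&= P\left( \frac{X_1}{X_1 + X_2} < \rho \frac{X_2}{X_1 + X_2} \right) && \text{proportion} \\
&= P(B < \rho (1 - B)) && B \text{ uniform} \\
&= P\left( B < \frac{\rho}{1 + \rho} \right) \\
&= \frac{\rho}{1 + \rho} = \frac{\lambda_1}{\lambda_1 + \lambda_2}.
\end{align*}

\[ P(T_1 < T_2 < T_3) = \ldots \]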
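
The result $P(T_1 < T_2) = \lambda_1/(\lambda_1 + \lambda_2)$ is easy to sanity-check by simulation. In this sketch the function name, rates, sample size, and seed are all assumptions chosen for illustration. With $\lambda_1$ three times $\lambda_2$, the first component should fail first about three times in four.

```python
import random

def prob_first_fails_first(lam1, lam2, n=200_000, seed=3):
    # Monte Carlo estimate of P(T1 < T2) for independent exponentials.
    rng = random.Random(seed)
    wins = 0
    for _ in range(n):
        t1 = rng.expovariate(lam1)
        t2 = rng.expovariate(lam2)
        if t1 < t2:
            wins += 1
    return wins / n

lam1, lam2 = 3.0, 1.0
estimate = prob_first_fails_first(lam1, lam2)
print(estimate, lam1 / (lam1 + lam2))  # estimate vs. lam1 / (lam1 + lam2)
```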

Exercise 2.11 If $T_1 \sim \mathrm{Exp}(\lambda_1)$ and $T_2 \sim \mathrm{Exp}(\lambda_2)$ independently, show that
\[ \min(T_1, T_2) \sim \mathrm{Exp}(\lambda_1 + \lambda_2). \]
Does it make sense: should the first to fail be quicker to fail?

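Sol: 2.11 Note
\[ M = \min(T_1, T_2) > t \iff T_1 > t,\ T_2 > t. \]
Now find the survivor function.
\begin{align*}
P(M > t) &= P(T_1 > t, T_2 > t) \\
&= P(T_1 > t)\,P(T_2 > t) && \text{by independence} \\
&= \exp(-\lambda_1 t) \exp(-\lambda_2 t) \\
&= \exp[-(\lambda_1 + \lambda_2) t].
\end{align*}
But this is the survivor function of $\mathrm{Exp}(\lambda_1 + \lambda_2)$.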
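
As a quick check of this result, the sketch below simulates the minimum of two independent exponentials and compares its empirical survivor function and mean with those of an exponential with rate $\lambda_1 + \lambda_2$. The parameter values and seed are arbitrary assumptions.

```python
import math
import random

lam1, lam2, t = 0.5, 1.5, 0.3  # assumed values for the illustration
rng = random.Random(4)
n = 200_000

# M = min(T1, T2) with T1 ~ Exp(lam1), T2 ~ Exp(lam2) independent.
minima = [min(rng.expovariate(lam1), rng.expovariate(lam2)) for _ in range(n)]

surv_at_t = sum(1 for m in minima if m > t) / n
mean_min = sum(minima) / n

print(surv_at_t, math.exp(-(lam1 + lam2) * t))  # P(M > t) vs. exp(-(lam1 + lam2) t)
print(mean_min, 1.0 / (lam1 + lam2))            # E(M) vs. 1 / (lam1 + lam2)
```

The mean of the minimum is smaller than either $1/\lambda_1$ or $1/\lambda_2$, which answers the "does it make sense" question: the first failure among several components arrives sooner than any single one would on average.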

Exercise 2.12 Suppose $T_1, T_2, T_3$ are independently exponential with parameters $\lambda_1, \lambda_2, \lambda_3$. Show that
\[ P(T_1 < \min(T_2, T_3)) = \lambda_1/(\lambda_1 + \lambda_2 + \lambda_3). \]

Sol: 2.12 Consider $P(T_1 < \min(T_2, T_3))$.
$T_1$ and $\min(T_2, T_3)$ are independent.
$M = \min(T_2, T_3) \sim \mathrm{Exp}(\lambda_2 + \lambda_3)$ by Exercise 2.11.
Hence
\[ P(T_1 < M) = \frac{\lambda_1}{\lambda_1 + (\lambda_2 + \lambda_3)} \]
by Exercise 2.10.

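Example 2.13 Suppose $T_1, T_2, T_3$ are independently exponential with parameters $\lambda_1, \lambda_2, \lambda_3$.
Show that
\[ P(T_1 < T_2 < T_3) = \frac{\lambda_1}{\lambda_1 + \lambda_2 + \lambda_3} \cdot \frac{\lambda_2}{\lambda_2 + \lambda_3}. \]
Proof:
\begin{align*}
P(T_1 < T_2 < T_3) &= P(T_1 < \min(T_2, T_3) \cap T_2 < T_3) \\
&= P(T_1 < \min(T_2, T_3)) \times P(T_2 < T_3 \mid T_1 < \min(T_2, T_3)).
\end{align*}
The trick is to condition on the past.

The first term is given by Exercise 2.12.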

The second term is

\[ P(T_2 < T_3 \mid T_1 < \min(T_2, T_3)) = P(T_2 < T_3 \mid T_2 > T_1, T_3 > T_1) \]

and the right-hand side can be written as

\[ \pi = P(T_2 - T_1 < T_3 - T_1 \mid T_2 - T_1 > 0, T_3 - T_1 > 0). \]

We cannot argue that all terms are independent, unless we condition on $T_1 = t_1$.
Then

\[ \pi(t_1) = P(T_2 - t_1 < T_3 - t_1 \mid T_2 - t_1 > 0, T_3 - t_1 > 0) \]

and
\[ \pi = \int_{t_1 > 0} \pi(t_1) f_{T_1}(t_1)\,dt_1. \]

By Example 2.9 on independence and lack of memory,

\[ \pi(t_1) = P(T_2 - t_1 < T_3 - t_1 \mid T_2 - t_1 > 0, T_3 - t_1 > 0) = P(T_2 < T_3), \]

which does not depend on $t_1$.

Hence $\pi = P(T_2 < T_3)$, which by Exercise 2.10 is
\[ \frac{\lambda_2}{\lambda_2 + \lambda_3}. \]
This completes the proof.
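
The product formula of Example 2.13 can be compared with a direct simulation of the ordering event. The sketch below is a rough check rather than part of the proof; the function names, rates, sample size, and seed are assumptions made for illustration.

```python
import random

def ordering_probability(lam1, lam2, lam3):
    # Product formula from Example 2.13 for P(T1 < T2 < T3).
    return lam1 / (lam1 + lam2 + lam3) * lam2 / (lam2 + lam3)

def simulate_ordering(lam1, lam2, lam3, n=200_000, seed=5):
    # Monte Carlo estimate of P(T1 < T2 < T3) for independent exponentials.
    rng = random.Random(seed)
    count = 0
    for _ in range(n):
        t1 = rng.expovariate(lam1)
        t2 = rng.expovariate(lam2)
        t3 = rng.expovariate(lam3)
        if t1 < t2 < t3:
            count += 1
    return count / n

rates = (2.0, 1.0, 1.0)
formula = ordering_probability(*rates)
estimate = simulate_ordering(*rates)
print(formula, estimate)  # formula gives 0.25 for these rates
```

With equal rates the formula reduces to $1/6$, as expected when all six orderings of three lifetimes are equally likely.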

Summary up to here
Lifetime variables
Common characteristics - positive and often
censored
Censoring mechanisms
Basic functions to characterise distribution
Hazard function
Exponential distribution
