
STA635 Project by Benjamin Hall

Cox proportional hazards model
Model: $h_{y_i}(t) = h_0(t)\,e^{\beta_1 X_1 + \dots + \beta_k X_k}$ is the hazard function of the ith individual
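To make the proportionality explicit, the ratio of the hazards of two individuals i and j can be written out. This is a short worked step, writing $\beta_1, \dots, \beta_k$ for the regression coefficients; the subscripted covariates $X_{i1}, \dots, X_{ik}$ are notation introduced here for illustration only.

```latex
\frac{h_{y_i}(t)}{h_{y_j}(t)}
  = \frac{h_0(t)\, e^{\beta_1 X_{i1} + \dots + \beta_k X_{ik}}}
         {h_0(t)\, e^{\beta_1 X_{j1} + \dots + \beta_k X_{jk}}}
  = e^{\beta_1 (X_{i1} - X_{j1}) + \dots + \beta_k (X_{ik} - X_{jk})}
```

The baseline hazard cancels, so the ratio does not depend on t.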


Assumption: The hazard function for each
individual is proportional to the baseline hazard,
h0(t). This assumption implies that the hazard
function is fully determined by the covariate
vector.
Problem: There may be unobserved
covariates that cause this assumption to be
violated.

Simulation Ex: Unobserved covariate
Consider a situation with the following population:
Group   Proportion of Population   Hazard Rate with Placebo   Hazard Rate with Drug A
1       40%                        1                          .5
2       40%                        2                          1
3       20%                        10                         8

Drug A lowers the hazard rate in every group, so it is effective for the entire population.
But what happens in the Cox model if the group is unobservable?
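A short calculation suggests why hiding the group matters. As a simplification, it looks only at the time to the first incident and uses the rates from the R code on the final slide (1, 2, 10 with placebo; .5, 1, 8 with Drug A). The symbols $S_P$, $S_A$, $h_P$, $h_A$ are introduced here for illustration.

```latex
S_P(t) = 0.4\,e^{-t} + 0.4\,e^{-2t} + 0.2\,e^{-10t}, \qquad
h_P(t) = \frac{0.4\,e^{-t} + 0.8\,e^{-2t} + 2\,e^{-10t}}{S_P(t)}

S_A(t) = 0.4\,e^{-0.5t} + 0.4\,e^{-t} + 0.2\,e^{-8t}, \qquad
h_A(t) = \frac{0.2\,e^{-0.5t} + 0.4\,e^{-t} + 1.6\,e^{-8t}}{S_A(t)}

\frac{h_A(0)}{h_P(0)} = \frac{2.2}{3.2} \approx 0.69, \qquad
\lim_{t \to \infty} \frac{h_A(t)}{h_P(t)} = \frac{0.5}{1} = 0.5
```

The hazard ratio between treatment and placebo changes with t once the groups are mixed, so the marginal hazards are not proportional.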

Simulation Example,
continued
Let's simulate this example and apply the Cox model:
Have R simulate 100 people according to the previous table's probabilities and randomly assign them to treatment or placebo.
For each person, have R simulate the incidents (as times between successive incidents) within a period of length 1. At the end of the period, right-censoring occurs.
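The per-person step can be sketched compactly. This is a minimal illustration, not the code on the final slide: the function name `sim_person` and its `rate` argument are hypothetical, and it assumes the censored time is the time remaining in the period.

```r
# Sketch: simulate one person's incidents over a period of length 1.
# `rate` is that person's hazard rate (hypothetical argument name).
sim_person <- function(rate) {
  gaps <- numeric(0)
  # Draw exponential times between incidents until the period is used up.
  while (sum(gaps) < 1) {
    gaps <- c(gaps, rexp(1, rate))
  }
  n <- length(gaps)
  # The draw that crosses time 1 is replaced by the time remaining (censored).
  gaps[n] <- 1 - sum(gaps[-n])
  data.frame(time = gaps, status = c(rep(1, n - 1), 0))
}

set.seed(1)
sim_person(rate = 2)
```

Each row is one observed time; status is 1 for an incident and 0 for the final, right-censored row.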

Simulation Example,
continued
Here is some of the data generated by R (see
final slide for code):

      id  group  treat  time  status
[1,]   1                 .38
[2,]   1                 .07
[3,]   1                 .23
[4,]   1                 .15
[5,]   1                 .11
[6,]   1                 .89

Simulation Example,
continued
Now we run coxph on our data:
> myfit1
Call:
coxph(formula = Surv(time, status) ~ treat)

        coef exp(coef) se(coef)     z    p
treat -0.128      0.88     0.11 -1.17 0.24

Likelihood ratio test=1.37 on 1 df, p=0.242  n= 445

Notice that the LRT has a p-value of .242, which is not significant. But we know that treatment is effective for everyone. What is happening?

Simulation Example,
continued
The problem is that we have heterogeneity in
the data due to the unobservable groups.
Since we cannot include group in our model,
the assumption of proportional hazards is
violated.
What can we do to solve this problem? Use a
frailty model.

Frailty Model
Frailty models can help explain the unaccounted-for heterogeneity.
Frailty Model: $h_{y_i}(t) = z\,h_0(t)\,e^{\beta_1 X_1 + \dots + \beta_k X_k}$ is the hazard function of the ith individual.
The distribution of z is specified to be, say, Gamma. (Note: z must be non-negative since the hazard is non-negative.)
In this situation, the shared frailty model is appropriate; that is, multiple observations of the same individual always have the same value of z.
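To see what a Gamma frailty does to the hazard, z can be integrated out. The sketch below assumes a common convention that the slides do not state: z has mean 1 and variance $\theta$. The symbols $\theta$, $H(t)$ and $S(t)$ are introduced here for illustration, with $H(t) = \int_0^t h_0(s)\,ds \; e^{\beta_1 X_1 + \dots + \beta_k X_k}$.

```latex
S(t \mid z) = e^{-z H(t)}
\quad\Longrightarrow\quad
S(t) = E\!\left[ e^{-z H(t)} \right] = \bigl(1 + \theta H(t)\bigr)^{-1/\theta}

h(t) = -\frac{d}{dt} \log S(t)
     = \frac{h_0(t)\, e^{\beta_1 X_1 + \dots + \beta_k X_k}}{1 + \theta H(t)}
```

As $\theta \to 0$ the denominator tends to 1 and the ordinary Cox model is recovered; for $\theta > 0$ the marginal hazards are no longer proportional.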

Frailty Model in R
Let's apply the frailty model to our simulated data:
> myfit2
Call:
coxph(formula = Surv(time, status) ~ treat + frailty(id))

              coef    se(coef)  se2     Chisq   DF     p
treat        -0.147   0.160     0.111    0.85    1.0   3.6e-01
frailty(id)                             93.89   43.5   1.4e-05

Iterations: 5 outer, 17 Newton-Raphson
     Variance of random effect= 0.294   I-likelihood = -1887.9
Degrees of freedom for terms=  0.5 43.5
Likelihood ratio test=117  on 44.0 df, p=1.37e-08  n= 445

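Notice that the LRT now has a highly significant p-value. This test covers the whole model, including the frailty term; the test for treat alone still has p = 0.36.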

Frailty Model in R
Now let's try applying the frailty model to a real data set, the kidney data set.


Here are the results for the regular Cox Model:
> kfit1
Call:
coxph(formula = Surv(time, status) ~ age + sex, data = kidney)

        coef exp(coef) se(coef)      z      p
age  0.00203     1.002  0.00925  0.220 0.8300
sex -0.82931     0.436  0.29895 -2.774 0.0055

Likelihood ratio test=7.12  on 2 df, p=0.0285  n= 76

Here the LRT is significant with a p-value of .0285 even without considering frailty.

Frailty Model in R
However, a frailty model seems applicable in this situation since there are multiple observations per person (two for each patient). The fit below considers frailty:

> kfit2
Call:
coxph(formula = Surv(time, status) ~ age + sex + frailty(id), data = kidney)

              coef      se(coef)  se2      Chisq   DF   p
age           0.00525   0.0119    0.0088    0.2     1   0.66000
sex          -1.58749   0.4606    0.3520   11.9     1   0.00057
frailty(id)                                23.1    13   0.04000

Iterations: 7 outer, 49 Newton-Raphson
     Variance of random effect= 0.412   I-likelihood = -181.6
Degrees of freedom for terms=  0.5  0.6 13.0
Likelihood ratio test=46.8  on 14.1 df, p=2.31e-05  n= 76

Now the LRT is even more significant.
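The two kidney fits can be reproduced with the kidney data that ships with the survival package. This is a minimal sketch; it assumes a standard installation of survival, and the exact numbers may differ slightly between package versions.

```r
library(survival)

# Ordinary Cox model: treats all 76 rows as independent observations.
kfit1 <- coxph(Surv(time, status) ~ age + sex, data = kidney)

# Shared frailty model: one random effect per patient id
# (frailty() uses a gamma distribution by default).
kfit2 <- coxph(Surv(time, status) ~ age + sex + frailty(id), data = kidney)

# Compare the estimated sex coefficient with and without frailty.
round(c(cox = coef(kfit1)[["sex"]], frailty = coef(kfit2)[["sex"]]), 3)
```

Printing kfit1 and kfit2 gives summaries in the same layout as the output shown on the slides.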

Resources
Therneau and Grambsch, Modeling Survival Data, Chapter 9
Wienke, Andreas, Frailty Models, http://www.demogr.mpg.de/papers/working/wp-2003-032.pdf
Govindarajulu, Frailty Models and Other Survival Models, www.ms.uky.edu/~statinfo/nonparconf/govindarajulu.ppt

R Code
library(survival)

# GEN_TIME: draw one exponential time between incidents, rounded to two
# decimals. The rate depends on the hidden group and on treat (1 = Drug A).
gen_time <- function(group, treat) {
  if (group == 1) {
    return(round(rexp(1, 1 - 0.5 * treat), 2))
  }
  if (group == 2) {
    return(round(rexp(1, 2 - treat), 2))
  }
  if (group == 3) {
    return(round(rexp(1, 10 - 2 * treat), 2))
  }
}

# PERSON DATA: simulate one person over a period of length 1 and return
# one row per simulated time, with the last row right-censored.
person_data <- function(id) {
  treat <- rbinom(1, 1, 0.5)
  x <- runif(1)
  if (x < 0.4) {
    group <- 1
  } else if (x < 0.8) {
    group <- 2
  } else {
    group <- 3
  }
  # Keep drawing times until the period of length 1 is used up.
  times <- numeric(0)
  while (sum(times) < 1) {
    times <- c(times, gen_time(group, treat))
  }
  count <- length(times)
  # The last time is censored: 1 if it is the only one, otherwise 1 minus
  # the previous time (e.g. .89 = 1 - .11 in the data shown earlier).
  # Note that the time actually remaining in the period would be
  # 1 - sum(times[-count]).
  times[count] <- if (count == 1) 1 else 1 - times[count - 1]
  data.frame(id = id, group = group, treat = treat, time = times,
             status = c(rep(1, count - 1), 0))
}

# Simulate 100 people and stack their rows into one data set.
samp <- do.call(rbind, lapply(1:100, person_data))

# Cox model without and with a shared frailty term for each person.
myfit1 <- coxph(Surv(time, status) ~ treat, data = samp)
myfit2 <- coxph(Surv(time, status) ~ treat + frailty(id), data = samp)
myfit1
myfit2
