Cox PH and Frailty Models for Unobserved Covariates

STA635 Project by Benjamin Hall
Cox proportional hazards model

Model: $h_{y_i}(t) = h_0(t)\, e^{\beta_1 X_1 + \dots + \beta_k X_k}$ is the hazard function of the ith individual

Assumption: The hazard function for each individual is proportional to the baseline hazard, $h_0(t)$. This assumption implies that the hazard function is fully determined by the covariate vector.
Problem: There may be unobserved covariates that cause this assumption to be violated.
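
A minimal sketch of what the proportionality assumption means in practice. The baseline hazard `h0`, the coefficients `beta`, and the two covariate vectors are hypothetical values chosen for illustration; none of them come from the slides. Under the Cox model, the ratio of two individuals' hazards does not depend on t:

```r
# Hypothetical Weibull-type baseline hazard (an assumption for illustration)
h0 <- function(t) 1.5 * t^0.5

# Cox-model hazard: baseline times exp(linear predictor)
hazard <- function(t, x, beta) h0(t) * exp(sum(beta * x))

beta <- c(-0.7, 0.3)   # hypothetical coefficients
x_a  <- c(1, 0)        # individual A
x_b  <- c(0, 0)        # individual B (reference)

# Hazard ratio A vs. B at several time points
t_grid <- c(0.1, 0.5, 1, 2)
ratio  <- sapply(t_grid, function(t) hazard(t, x_a, beta) / hazard(t, x_b, beta))
ratio
```

Every entry of `ratio` equals `exp(-0.7)`, whatever baseline is plugged in, because the baseline cancels in the ratio.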
Simulation Ex: Unobserved covariate
Consider a situation with the following population:

Group   Proportion of Population   Hazard Rate with placebo   Hazard Rate with Drug A
1       40%                        1                          .5
2       40%                        2                          1
3       20%                        10                         8

Obviously Drug A is effective for the entire
population.
But what happens in the Cox Model if the group is
unobservable?
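
One way to see why hiding the group matters is to compute the population-level hazard in each arm when the three groups are mixed together. The sketch below uses the group proportions and exponential rates from the R code at the end of the deck. It looks only at the time to a first event, which is a simplification of the recurrent-event simulation used later, and the function name `mix_hazard` is made up for this illustration.

```r
p            <- c(0.4, 0.4, 0.2)   # group proportions
rate_placebo <- c(1, 2, 10)        # exponential hazard rates under placebo
rate_drug    <- c(0.5, 1, 8)       # exponential hazard rates under Drug A

# Hazard of a mixture of exponentials: density divided by survival
mix_hazard <- function(t, p, rate) {
  sum(p * rate * exp(-rate * t)) / sum(p * exp(-rate * t))
}

# Drug-vs-placebo hazard ratio in the mixed population over time
t_grid <- c(0, 0.25, 0.5, 1, 10)
hr <- sapply(t_grid, function(t) {
  mix_hazard(t, p, rate_drug) / mix_hazard(t, p, rate_placebo)
})
round(hr, 3)
```

The ratio is 2.2/3.2 (about 0.69) at t = 0 and approaches 0.5 for large t, so at the population level the two arms do not have proportional hazards even though the drug lowers the rate within every group.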
Simulation Example, continued
Let's simulate this example and apply the Cox model:
Have R simulate 100 people according to the previous table's probabilities and randomly assign them to treatment or placebo.
For each person, have R simulate the number of incidents within a period of length 1. At the end of the period, right-censoring occurs.
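
The per-person step can be sketched compactly as follows. This is not the code from the final slide; it is a simplified illustration in which `simulate_person` and its arguments are hypothetical names, gap times are left unrounded, and the hazard rate is passed in directly instead of being looked up from group and treatment.

```r
# Simulate one person's gap times between incidents over a period of
# length `horizon`; the last gap is cut off at the end of the period
# and marked as right-censored (status 0).
simulate_person <- function(rate, horizon = 1) {
  gaps <- numeric(0)
  while (sum(gaps) < horizon) {
    gaps <- c(gaps, rexp(1, rate))
  }
  n <- length(gaps)
  gaps[n] <- horizon - sum(gaps[-n])   # time remaining after the last incident
  data.frame(time = gaps, status = c(rep(1, n - 1), 0))
}

set.seed(635)
one <- simulate_person(rate = 2)
one
```

Each row is one gap time. All rows except the last are observed incidents, and the gap times add up to the length of the period.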
Simulation Example, continued
Here is some of the data generated by R (see final slide for code):

     id group treat time status
[1,]  1             0.38
[2,]  1             0.07
[3,]  1             0.23
[4,]  1             0.15
[5,]  1             0.11
[6,]  1             0.89

Simulation Example, continued
Now we run coxph on our data:

> myfit1
Call:
coxph(formula = Surv(time, status) ~ treat)

        coef exp(coef) se(coef)     z    p
treat -0.128      0.88     0.11 -1.17 0.24

Likelihood ratio test=1.37  on 1 df, p=0.242  n= 445

Notice that the LRT has a p-value of .242, which is not significant. But we know that treatment is effective for everyone. What is happening?
Simulation Example, continued
The problem is that we have heterogeneity in
the data due to the unobservable groups.
Since we cannot include group in our model,
the assumption of proportional hazards is
violated.
What can we do to solve this problem? Use a
frailty model.
Frailty Model
Frailty models can help explain the unaccounted-for heterogeneity.
Frailty Model: $h_{y_i}(t) = z\, h_0(t)\, e^{\beta_1 X_1 + \dots + \beta_k X_k}$ is the hazard function of the ith individual
The distribution of z is specified to be, say, Gamma. (Note: z must be non-negative since the hazard is non-negative.)
In this situation, the shared frailty model is appropriate; that is, multiple observations of the same individual always have the same value of z.
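
A small sketch of how a shared Gamma frailty enters the hazard. All numeric values here (`theta`, `base_rate`, `beta`, three observations per person) are assumptions for illustration, and the baseline hazard is taken to be constant for simplicity. The Gamma distribution is parameterized to have mean 1 and variance `theta`, a common convention that keeps z from changing the overall scale of the baseline.

```r
set.seed(1)
theta <- 0.5     # hypothetical frailty variance
n_id  <- 5000    # number of individuals

# One frailty per individual: Gamma with mean 1 and variance theta
z <- rgamma(n_id, shape = 1 / theta, scale = theta)

# Shared frailty: every observation of an individual reuses that individual's z
id    <- rep(seq_len(n_id), each = 3)
z_obs <- z[id]

# Hazard rate per observation: z * h0 * exp(beta * treat), with constant h0
base_rate <- 2
beta      <- -0.5
treat     <- rbinom(n_id, 1, 0.5)
rate      <- z_obs * base_rate * exp(beta * treat[id])
time      <- rexp(length(rate), rate)

c(mean = mean(z), var = var(z))
```

Individuals with a large z have uniformly higher hazards across all of their observations, which is the heterogeneity the frailty term is meant to absorb.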
Frailty Model in R
Let's apply the frailty model to our simulated data:

> myfit2
Call:
coxph(formula = Surv(time, status) ~ treat + frailty(id))

            coef   se(coef) se2   Chisq DF   p
treat       -0.147 0.160    0.111  0.85  1.0 3.6e-01
frailty(id)                       93.89 43.5 1.4e-05

Iterations: 5 outer, 17 Newton-Raphson
     Variance of random effect= 0.294   I-likelihood = -1887.9
Degrees of freedom for terms=  0.5 43.5
Likelihood ratio test=117  on 44.0 df, p=1.37e-08  n= 445

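Notice that the LRT now has a highly significant p-value. This test covers the whole model, including the frailty term; the test for treat alone (p = 0.36) is still not significant.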
Frailty Model in R
Now let's apply the frailty model to a real data set, the kidney data set.
Here are the results for the regular Cox model:

> kfit1
Call:
coxph(formula = Surv(time, status) ~ age + sex, data = kidney)

        coef exp(coef) se(coef)      z      p
age  0.00203     1.002  0.00925  0.220 0.8300
sex -0.82931     0.436  0.29895 -2.774 0.0055

Likelihood ratio test=7.12  on 2 df, p=0.0285  n= 76

Here the LRT is significant with a p-value of .0285 even without considering frailty.
Frailty Model in R
However, a frailty model seems applicable in this situation since there are multiple observations (i.e. 2 kidneys) per person. The fit below includes a frailty term:

> kfit2
Call:
coxph(formula = Surv(time, status) ~ age + sex + frailty(id), data = kidney)

            coef     se(coef) se2    Chisq DF p
age          0.00525 0.0119   0.0088  0.2   1 0.66000
sex         -1.58749 0.4606   0.3520 11.9   1 0.00057
frailty(id)                          23.1  13 0.04000

Iterations: 7 outer, 49 Newton-Raphson
     Variance of random effect= 0.412   I-likelihood = -181.6
Degrees of freedom for terms=  0.5  0.6 13.0
Likelihood ratio test=46.8  on 14.1 df, p=2.31e-05  n= 76

Now the LRT is even more significant.
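
To compare the two kidney fits on the hazard-ratio scale, the sex coefficients printed above can be exponentiated. The sketch below copies the coefficients and standard errors from the two outputs. The confidence intervals are a rough Wald-type approximation added here for illustration (using se(coef) for the frailty fit), not something reported in the slides.

```r
# Coefficients and standard errors for sex, copied from kfit1 and kfit2
coef_sex <- c(cox = -0.82931, frailty = -1.58749)
se_sex   <- c(cox =  0.29895, frailty =  0.4606)

# Hazard ratios and approximate 95% Wald intervals
hr       <- exp(coef_sex)
ci_lower <- exp(coef_sex - qnorm(0.975) * se_sex)
ci_upper <- exp(coef_sex + qnorm(0.975) * se_sex)

round(cbind(hr, ci_lower, ci_upper), 3)
```

The estimated hazard ratio for sex moves further from 1 once the frailty term is included, which is consistent with the larger coefficient in the second fit.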
Resources
Therneau and Grambsch, Modeling Survival Data, Chapter 9
Wienke, Andreas, Frailty Models, http://www.demogr.mpg.de/papers/working/wp-2003-032.pdf
Govindarajulu, Frailty Models and Other Survival Models, www.ms.uky.edu/~statinfo/nonparconf/govindarajulu.ppt
R Code
library(survival)

# GEN_TIME: draw one exponential gap time, rounded to 2 decimals.
# Hazard rates (placebo / Drug A): group 1: 1 / 0.5, group 2: 2 / 1, group 3: 10 / 8.
gen_time <- function(group, treat) {
  if (group == 1) {
    rate <- 1 - 0.5 * treat
  } else if (group == 2) {
    rate <- 2 - treat
  } else {
    rate <- 10 - 2 * treat
  }
  round(rexp(1, rate), 2)
}

# PERSON DATA: simulate one person's gap times over a period of length 1.
# Returns one row per gap; the last gap is right-censored at the end of the period.
person_data <- function(id) {
  treat <- rbinom(1, 1, 0.5)
  x <- runif(1)
  # Group membership: 40% / 40% / 20%
  if (x < 0.4) {
    group <- 1
  } else if (x < 0.8) {
    group <- 2
  } else {
    group <- 3
  }
  # Keep drawing gap times until the period of length 1 is used up
  times <- numeric(0)
  elapse <- 0
  while (elapse < 1) {
    times <- c(times, gen_time(group, treat))
    elapse <- sum(times)
  }
  count <- length(times)
  # Truncate the final gap at the end of the period: the time remaining
  # is 1 minus the sum of all earlier gaps.
  times[count] <- round(1 - sum(times[-count]), 2)
  # Every gap ends in an incident (status 1) except the censored last one
  status <- c(rep(1, count - 1), 0)
  data.frame(id = id, group = group, treat = treat, time = times, status = status)
}

# Simulate 100 people and stack their rows into one data set
samp <- do.call(rbind, lapply(1:100, person_data))

# Regular Cox model, then Cox model with a frailty term per person
myfit1 <- coxph(Surv(time, status) ~ treat, data = samp)
myfit2 <- coxph(Surv(time, status) ~ treat + frailty(id), data = samp)
myfit1
myfit2
