Professional Documents
Culture Documents
A. Agresti (UF)
Ordinal
1 / 51
Ordinal
2 / 51
Ordinal
3 / 51
Ordinal
4 / 51
20, y = 2 if 20 < y
y = 4 if 60 < y 80,
40, y = 3 if 40 < y
60,
y = 5 if y > 80.
Ordinal
5 / 51
40
oo o
o
o
o
20
o
oo
1
1
1
1
1
oooo
ooo
o oo oo
o
1
o oo
o oooo ooo o oo oo 1o
1 1o
o oo o
1 11 1o11 111
1o
1 1
1
1
1 1
11
o11111111 1o1111111 1
11 11 1 1 1
1
0
20
11
1 o
11
11 11
1
1
11 1
1
1
1
1
1
1
y*
o o
oo o
oo
o
oo
o
oo o
o
o1
o
60
o
oo
o
o
o
80
1
1
11
oo
o z=0
1 z=1
20
40
60
80
100
A. Agresti (UF)
20
40
60
80
100
Ordinal
6 / 51
j = 1, . . . , c 1.
Ordinal
7 / 51
Model satisfies
.
.
j | x1 )/P(y > j | x1 )
log P(y
P(y j | x )/P(y > j | x ) = (x
2
x2)
Ordinal
8 / 51
Death
59 (28%)
48 (25%)
44 (21%)
43 (22%)
Good
Recov.
32 (15%)
30 (16%)
31 (15%)
41 (21%)
Ordinal
9 / 51
Ordinal
# null model
10 / 51
R for modeling dose-response data using polr in MASS library, for which
response must be an ordered factor:
> trauma2 <- read.table("trauma2.dat", header=TRUE)
> trauma2
dose response count
1
1
1
59
2
1
2
25
3
1
3
46
...
4
5
41
20
> y <- factor(trauma2$response)
> fit.clogit <- polr(y ~ dose, data=trauma2, weight=count)
> summary(fit.clogit)
Coefficients:
Value Std. Error z value
dose 0.1754816 0.05671224 3.094245
Intercepts:
Value
Std. Error z value
1|2 -0.7192 0.1589
-4.5256
2|3 -0.3186 0.1569
-2.0308
3|4 0.6917 0.1597
4.3323
4|5 2.0570 0.1751
11.7493
Residual Deviance: 2461.349
> fitted(fit.clogit)
1
2
3
4
5
1 0.2901467 0.08878330 0.2473217 0.2415357 0.1322126
2 0.2901467 0.08878330 0.2473217 0.2415357 0.1322126
...
20 0.1944866 0.07043618 0.2325084 0.2975294 0.2050394
Note: This uses the model formula logit[P(y j )] = j x based on a latent variable model (below), for which has
opposite sign.
A. Agresti (UF)
Ordinal
11 / 51
12 / 51
Ordinal
13 / 51
Any equally-spaced scores (e.g. 0, 10, 20, 30) for dose provide same
fitted values and same test statistics (different , SE ).
Unequally-spaced scores more natural in many cases (e.g., doses may
be 0, 125, 250, 500). Sensitivity analysis usually shows substantive
results dont depend much on that choice, unless data highly
unbalanced (e.g., Graubard and Korn 1987 show mid-rank scores may
not be appropriate).
The cumulative logit model uses ordinality of y without assigning
category scores.
Alternative analysis treats dose as factor, using indicator variables.
But double the log-likelihood increases only 0.13, df = 2. With
1 = 0:
2 = 0.12, 3 = 0.32, 4 = 0.52 (SE = 0.18 each).
Testing H0: 2 = 3 = 4 = 0 gives likelihood-ratio (LR) stat. = 9.8
(df = 3, P = 0.02).
Using ordinality often increases power (focused on df = 1).
A. Agresti (UF)
Ordinal
14 / 51
A. Agresti (UF)
Ordinal
15 / 51
16 / 51
xi )
exp(
j +
(yi j ) =
.
P
1 + exp(
j + xi )
Estimated probability of outcome j is
(yi = j ) = P
(yi j ) P
(yi j 1).
P
Can motivate proportional odds structure by a regression model for
underlying continuous latent variable
(Anderson and Philips 1981, McKelvey and Zavoina 1975).
A. Agresti (UF)
Ordinal
17 / 51
y
=
y
=
= P( j x) = G (j x).
Model
G 1[P(y
j | x)] = j x
Ordinal
18 / 51
Note: This derivation suggests such models are designed to detect shifts in
location (center), not dispersion (spread), at different settings of
explanatory variables.
A. Agresti (UF)
Ordinal
19 / 51
link[P(yi j )] = j xi
of cumulative prob.s (McCullagh 1980).
e.g., cumulative probit model (link function = inverse of standard
normal cdf) applies naturally when underlying regression model has
normal y .
Effects invariant to choice and number of response categories (If
model holds for given response categories, holds with same when
response scale collapsed in any way).
A. Agresti (UF)
Ordinal
20 / 51
.Y
Y
n
c.
i =1
j =1
i =1
c
Y.Y.
i =1
j =1
A. Agresti (UF)
. .
yij
P(Y j | x ) P(Y j 1 | x )
i
P(Y i = j | xi )
.yij
.
=
j =1
. yij .
exp(j + xi )
exp(j 1 + xi )
.
1 + exp(j + xi ) 1 + exp(j 1 + xi )
Ordinal
21 / 51
Ordinal
22 / 51
SES life
1
1
1
9
1
4
0
0
A. Agresti (UF)
8
9
Ordinal
23 / 51
z value
-0.45223
1.86260
3.08075
1.80884
-2.66973
A. Agresti (UF)
Ordinal
24 / 51
A. Agresti (UF)
Ordinal
25 / 51
26 / 51
Ordinal
27 / 51
0
j
Ordinal
28 / 51
ij = ni P
j = 1, . . . , c .
. (nij
ij )2
.
ij
i ,j
ij
i ,j
df = No. multinomial parameters no. model parameters.
A. Agresti (UF)
Ordinal
29 / 51
Ordinal
30 / 51
A. Agresti (UF)
Ordinal
31 / 51
Other criteria besides significance tests can help select a good model,
such as by minimizing
AIC = 2(log likelihood number of parameters in model)
which penalizes a model for having many parameters. This attempts
to find model for which fit is closest to reality, and overfitting (too
many parameters) can hurt this.
Advantages of utilizing ordinality of response include:
No interval-scale assumption about distances between response
categories (i.e., no scores needed for y ).
Greater variety of models, including ones that are more parsimonious
than models that ignore ordering (such as baseline-category logit
models, which have a separate set of parameters for each logit).
Greater statistical power for testing effects (compared to treating
categories as nominal), because of focusing effect on smaller df .
A. Agresti (UF)
Ordinal
32 / 51
logit[P(yi j )] = j + xj i ,
j = 1, . . . , c 1.
Ordinal
33 / 51
The improvement in fit is statistically significant, but perhaps not substantively significant;
effect of dose is moderately negative for each cumulative probability.
A. Agresti (UF)
Ordinal
34 / 51
Fundamentalist
Moderate
Liberal
92 (14%)
274 (27%)
739 (44%)
192 (20%)
352 (52%)
399 (40%)
536 (32%)
423 (44%)
234 (34%)
326 (33%)
412 (24%)
355 (37%)
We create indicator variables {ri } for region and fit the model
logit[P(y j )] = j + 1r1 + 2r2 + 3r3.
A. Agresti (UF)
Ordinal
35 / 51
R for religion and region data, using vglm for cumulative logit modeling
with proportional odds structure:
> religion <> religion
region y1
1
1 92
2
2 274
3
3 739
4
4 192
read.table("religion_region.dat",header=TRUE)
y2
352
399
536
423
y3
234
326
412
355
Ordinal
36 / 51
R for religion and region data, using vglm for cumulative logit modeling
without proportional odds structure:
> fit.npo <- vglm(cbind(y1,y2,y3) ~ r1+r2+r3, family=cumulative,religion)
> summary(fit.npo)
Coefficients:
Value Std. Error
z value
(Intercept):1 -1.399231 0.080583 -17.36377
(Intercept):2 0.549504 0.066655
8.24398
r1:1
-0.452300 0.138093 -3.27532
r1:2
0.090999 0.104731
0.86888
r2:1
0.426188 0.107343
3.97032
r2:2
0.175343 0.094849
1.84866
r3:1
1.150175 0.094349 12.19065
r3:2
0.580174 0.087490
6.63135
Residual Deviance: -5.1681e-13 on 0 degrees of freedom
Log-likelihood: -28.1464 on 0 degrees of freedom
> 1 - pchisq(deviance(fit.po)-deviance(fit.npo),
df=df.residual(fit.po)-df.residual(fit.npo))
[1] 4.134028e-21
A. Agresti (UF)
Ordinal
37 / 51
logit[P(yi j )] = j + xi + j ui ,
A. Agresti (UF)
Ordinal
j = 1, . . . , c 1.
38 / 51
R for modeling mental impairment data with partial proportional odds (life
events but not SES), using vglm in VGAM library:
> fit3 < -vglm(impair ~ ses + life, family=cumulative(parallel=FALSE~ses))
> summary(fit3)
Coefficients:
Estimate Std. Error
(Intercept):1 -0.17660
0.69506
(Intercept):2 1.00567
0.66327
(Intercept):3 2.39555
0.77894
ses:1
0.98237
0.76430
ses:2
1.54149
0.73732
ses:3
0.73623
0.81213
life
-0.32413
0.12017
z value
-0.25408
1.51623
3.07539
1.28531
2.09066
0.90655
-2.69736
Deviance (97.36, df=113) similar to obtained with more complex non-proportional odds model (96.75, df=111) and the simpler
proportional odds model (99.10, df=115), which seems adequate.
A. Agresti (UF)
Ordinal
39 / 51
Placebo
A. Agresti (UF)
<20
2030
3060
>60
<20
2030
3060
>60
4
5
9
11
Ordinal
2
1
18
14
>60
0
2
1
8
1
0
2
22
40 / 51
j = 1, 2, 3.
Ordinal
41 / 51
A. Agresti (UF)
Ordinal
42 / 51
san.se
san.z Pr(>|san.z|)
0.21952 -10.3715
< 2e-16 ***
0.18119 -5.2855
< 2e-16 ***
0.17857 1.9334
0.05319 .
0.16915 6.1387
< 2e-16 ***
0.23871 0.1543
0.87738
0.24489 2.9110
0.00360 **
A. Agresti (UF)
Ordinal
43 / 51
A. Agresti (UF)
Ordinal
44 / 51
j = 1, 2, 3
GEE
1.04 (.17)
0.03 (.24)
0.71 (.24)
GLMM
1.60 (.28)
0.06 (.37)
1.08 (.38)
Ordinal
45 / 51
P(Y=1)
1.0
Conditional
Marginal
0.0
A. Agresti (UF)
x
Ordinal
46 / 51
For R, here we use the clmm function in the ordinal package, which
employs a Laplace approximation for the likelihood function.
----------------------------------------------------------------------------------> insomnia<- read.table(file="insomnia.txt", header=TRUE)
> attach(insomnia)
> insomnia[1:6,] # part of input data set (each case has been duplicated count times)
case treat occasion outcome count
1
1
1
0
1
7
2
1
1
1
1
7
3
2
1
0
1
7
4
2
1
1
1
7
> outcome<-factor(outcome,levels=1:4,labels=c("20","20-30","30-60","60"),ordered=TRUE)
> treat <- factor(treat, levels=0:1,labels=c("Placebo","Active"))
> occasion <- factor(occasion, levels=0:1, labels=c("Initial","Follow-Up"))
> require("ordinal")
> fit <- clmm(outcome ~ treat*occasion + (1|case) ) # random intercept
> summary(fit)
Cumulative Link Mixed Model fitted with the Laplace approximation
Random effects:
Var Std.Dev
case 3.043 1.744
Estimate Std. Error z value Pr(>|z|)
treatActive
-0.04831
0.34390 -0.140 0.88828
occasionFollow-Up
-1.57116
0.28404 -5.531 3.18e-08 ***
treatActive:occasionFollow-Up -1.05796
0.37812 -2.798 0.00514 **
--Threshold coefficients:
Estimate Std. Error z value
20|20-30
-3.3986
0.3501 -9.709
20-30|30-60 -1.4451
0.2780 -5.198
30-60|60
0.5457
0.2569 2.124
----------------------------------------------------------------------------------A. Agresti (UF)
Ordinal
47 / 51
A. Agresti (UF)
Ordinal
48 / 51
789.00
Effect
outcome
t Value
Pr > |t|
Intercept
1
-3.4896
0.3588
237
-9.73
<.0001
Intercept
2
-1.4846
0.2903
237
-5.11
<.0001
Intercept
3
0.5613
0.2702
237
2.08
0.0388
treat
0.05786
0.3663
235
0.16
0.8746
occasion
1.6016
0.2834
235
5.65
<.0001
treat*occasion
1.0813
0.3805
235
2.84
0.0049
-------------------------------------------------------------------------------
Estimated standard deviation of random effects is 3.628 = 1.90.
A. Agresti (UF)
Ordinal
49 / 51
A. Agresti (UF)
Ordinal
50 / 51
Ordinal
51 / 51