1. Introduction

Examples
• Ordinal variables: categories have a natural ordering.
• Nominal variables: categories have no natural ordering.
Probability Distributions for Categorical Data

Binomial Distribution

P(y) = [n!/(y!(n−y)!)] π^y (1−π)^(n−y),   y = 0, 1, 2, . . . , n

Note
• E(Y) = nπ, Var(Y) = nπ(1−π), σ = √(nπ(1−π))
• p = Y/n = proportion of successes (also denoted π̂)
E(p) = E(Y/n) = π,   σ(p) = σ(Y/n) = √(π(1−π)/n)
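The binomial facts above can be checked numerically. A minimal Python sketch (illustrative only; the course software is SAS/SPSS/R):

```python
from math import comb, sqrt

def binom_pmf(y, n, pi):
    # P(y) = [n!/(y!(n-y)!)] * pi^y * (1-pi)^(n-y)
    return comb(n, y) * pi**y * (1 - pi)**(n - y)

n, pi = 10, 0.3  # example values, not from the slides
pmf = [binom_pmf(y, n, pi) for y in range(n + 1)]
mean = sum(y * p for y, p in enumerate(pmf))             # equals n*pi
var = sum((y - mean)**2 * p for y, p in enumerate(pmf))  # equals n*pi*(1-pi)
sigma = sqrt(var)
```

Summing the pmf over y = 0, …, n returns 1, and the moments match the formulas above.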
Inference for a Proportion
With n = 2 trials and y = 1 success,
p(1) = [2!/(1!1!)] π^1 (1−π)^1 = 2π(1−π) = ℓ(π),
the likelihood function, defined for π between 0 and 1.
If π = 0, the probability of getting y = 1 is ℓ(0) = 0.
The ML estimate of π is the value maximizing ℓ(π): π̂ = 0.50.
Note
• For the binomial, π̂ = y/n = proportion of successes.
ML Inference about Binomial Parameter
π̂ = p = y/n
Recall E(p) = π, σ(p) = √(π(1−π)/n).
• Note σ(p) ↓ as n ↑, so p → π (law of large numbers; true in general for ML)
Significance Test for Binomial Parameter
Ho: π = πo
Ha: π ≠ πo (or 1-sided)
Test statistic
z = (p − πo)/σ(p) = (p − πo)/√(πo(1−πo)/n)
Confidence Interval (CI) for Binomial Parameter
Example: θ = π, θ̂ = π̂ = p
σ(p) = √(π(1−π)/n), estimated by SE = √(p(1−p)/n)
95% CI is p ± 1.96 √(p(1−p)/n)
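The Wald interval above is a one-liner to compute. A hedged Python sketch (not the course software):

```python
from math import sqrt

def wald_ci(y, n, z=1.96):
    # p +/- z * sqrt(p(1-p)/n)
    p = y / n
    se = sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se

lo, hi = wald_ci(9, 25)  # e.g., 9 successes in 25 trials (hypothetical data)
```

Note that `wald_ci(0, 20)` returns the degenerate interval (0, 0), the failure mode discussed in the next example.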
Example: Estimate π = population proportion of vegetarians
p = 0/20 = 0.0
95% CI: 0 ± 1.96 √(0 × 1/20) = 0 ± 0 = (0, 0)
• Note what happens with the Wald CI for π if p = 0 or 1.
The Wald test of Ho: π = πo vs Ha: π ≠ πo uses the SE estimated at p.
Definition: The score test and score CI use the null SE:
for Ho: π = πo vs Ha: π ≠ πo, z = (p − πo)/√(πo(1−πo)/n).
Example: π = probability of being vegetarian
y = 0, n = 20, p = 0
What πo satisfies ±1.96 = (0 − πo)/√(πo(1−πo)/20)?
1.96 √(πo(1−πo)/20) = |0 − πo|
πo = 0 is one solution
• When we solve the quadratic, we can show the midpoint of the 95% score CI is
(y + 1.96²/2)/(n + 1.96²) ≈ (y + 2)/(n + 4)
• The Wald CI p ± 1.96 √(p(1−p)/n) also works well if we add 2 successes and 2 failures before applying it (the “Agresti-Coull method”)
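The add-2-successes, add-2-failures adjustment can be sketched directly (Python, illustrative only):

```python
from math import sqrt

def agresti_coull_ci(y, n, z=1.96):
    # add 2 successes and 2 failures, then apply the Wald formula
    y2, n2 = y + 2, n + 4
    p = y2 / n2
    se = sqrt(p * (1 - p) / n2)
    return p - z * se, p + z * se

lo, hi = agresti_coull_ci(0, 20)  # vegetarian example: y = 0, n = 20
```

Unlike the plain Wald CI, the interval for y = 0 is no longer degenerate (its lower endpoint can be truncated at 0 in practice).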
2. Two-Way Contingency Tables

                        HEART ATTACK
                     Yes       No       Total
GROUP   Placebo      189     10,845    11,034
        Aspirin      104     10,933    11,037
(a 2 × 2 table)
A conditional dist. refers to the prob. dist. of Y at a fixed level of X.
Example:
                      Y
                 Yes     No     Total
X    Placebo    .017    .983     1.0
     Aspirin    .009    .991     1.0
Y = response var., X = explanatory var.
Example: Diagnostic disease tests
X = true disease status, Y = test outcome (each with levels 1, 2)
sensitivity = P(Y = 1 | X = 1)
specificity = P(Y = 2 | X = 2)

What if X, Y are both response var's?
Definition: X and Y are statistically independent if the true conditional dist. of Y is identical at each level of X.
          Y
X      .01   .99
       .01   .99
Comparing Proportions in 2 × 2 Tables
             Y
             S        F
X    1      π1     1 − π1
     2      π2     1 − π2
(conditional distributions)
Estimate π1 − π2 by π̂1 − π̂2 = p1 − p2, with
SE(p1 − p2) = √(p1(1−p1)/n1 + p2(1−p2)/n2)
Example:
SE = √(.017 × .983/11,034 + .009 × .991/11,037) = .0015
95% CI for π1 − π2 is .008 ± 1.96(.0015) = (.005, .011).
Relative Risk = π1/π2
Example: sample p1/p2 = .017/.009 = 1.82
Note
Odds Ratio
              S        F
Group  1     π1     1 − π1
       2     π2     1 − π2
The odds the response is an S instead of an F = prob(S)/prob(F)
Odds = 3 ⇒ P(S) = 3/4, P(F) = 1/4
P(S) = odds/(1 + odds)
odds = 1/3 ⇒ P(S) = (1/3)/(1 + 1/3) = 1/4
Odds ratio:
θ = [π1/(1 − π1)]/[π2/(1 − π2)]
Example
                         Yes       No      Total
Heart Attack  Placebo    189     10,845   11,034
              Aspirin    104     10,933   11,037
Sample proportions:
   p1 = .0171, 1 − p1 = .9829 (placebo)
   p2 = .0094, 1 − p2 = .9906 (aspirin)
Sample odds:
   .0171/.9829 = 189/10,845 = .0174 (placebo)
   .0094/.9906 = 104/10,933 = .0095 (aspirin)
Sample odds ratio:
θ̂ = .0174/.0095 = 1.83
• If rows are interchanged, or if columns are interchanged, θ → 1/θ.
• For counts
      S      F
     n11    n12
     n21    n22
θ̂ = (n11/n12)/(n21/n22) = n11 n22/(n12 n21) = cross-product ratio
• Treats X, Y symmetrically:
                     Placebo   Aspirin
Heart Attack   Yes
               No
→ θ̂ = 1.83
• θ = 1 ⇔ log θ = 0; e.g., θ = 2 ⇒ log θ = .7
• The sampling dist. of log θ̂ is closer to normal, so construct a CI (L, U) for log θ and then exponentiate the endpoints: (e^L, e^U) is a CI for θ.
Example: θ̂ = (189 × 10,933)/(104 × 10,845) = 1.83
log θ̂ = .605
SE(log θ̂) = √(1/189 + 1/10,933 + 1/104 + 1/10,845) = .123
95% CI for log θ is .605 ± 1.96(.123) = (.365, .846)
95% CI for θ is (e^.365, e^.846) = (1.44, 2.33)
Apparently θ > 1.
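A short Python check of the log odds ratio CI above (illustrative sketch, not the course software):

```python
from math import exp, log, sqrt

# aspirin / heart attack counts from the slide
n11, n12, n21, n22 = 189, 10845, 104, 10933
theta = (n11 * n22) / (n12 * n21)            # sample odds ratio
log_theta = log(theta)
se = sqrt(1/n11 + 1/n12 + 1/n21 + 1/n22)     # SE of log theta-hat
lo, hi = exp(log_theta - 1.96 * se), exp(log_theta + 1.96 * se)
```

This recovers θ̂ = 1.83, SE = .123, and endpoints near (1.44, 2.33).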
e denotes the exponential function:
e^0 = 1, e^1 = e = 2.718 . . . , e^(−1) = 1/e = .368
e^x > 0 for all x
e^0 = 1 means log_e(1) = 0; e^1 = 2.718 means log_e(2.718) = 1
Notes
• When π1 and π2 are close to 0,
θ = [π1/(1 − π1)]/[π2/(1 − π2)] ≈ π1/π2,
the relative risk.
Example: Case-control study in London hospitals (Doll and Hill 1950)
X = smoker (yes, no), Y = lung cancer (yes, no)
                 Lung Cancer
                Yes     No
X     Yes       688    650
      No         21     59
                709    709
Case-control studies are “retrospective.” The binomial sampling model applies to X (sampled within levels of Y), not to Y. So we cannot estimate
π1 − π2 = P(Y = yes | X = yes) − P(Y = yes | X = no)
or π1/π2.
We can estimate P(X|Y), so we can estimate θ.
Chi-Squared Tests of Independence
Example
                            JOB SATISFACTION
INCOME           Very      Little    Mod.     Very
                 Dissat.   Satis.    Satis.   Satis.   Total
< 5000              2         4        13        3       22
5000–15,000         2         6        22        4       34
15,000–25,000       0         1        15        8       24
> 25,000            0         3        13        8       24
Total               4        14        63       23      104
Data from General Social Survey (1991)
Ho: X and Y independent, i.e.,
P(X = i, Y = j) = P(X = i)P(Y = j)
πij = πi+ π+j
Expected frequency μij = n πij = mean of dist. of cell count nij = n πi+ π+j under Ho.
Test Statistic
X² = Σ (nij − μ̂ij)²/μ̂ij, summed over all cells,
called the Pearson chi-squared statistic (Karl Pearson, 1900)
Example: Job satisfaction and income
X² = 11.5
df = (I − 1)(J − 1) = 3 × 3 = 9
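In class the computation is done with software (e.g. R's chisq.test); a minimal Python sketch of the same statistic for the job satisfaction table:

```python
# job satisfaction counts from the slide (rows = income, cols = satisfaction)
table = [[2, 4, 13, 3],
         [2, 6, 22, 4],
         [0, 1, 15, 8],
         [0, 3, 13, 8]]
n = sum(map(sum, table))
row = [sum(r) for r in table]                 # row totals
col = [sum(c) for c in zip(*table)]           # column totals
# X^2 = sum over cells of (observed - expected)^2 / expected
x2 = sum((table[i][j] - row[i] * col[j] / n) ** 2 / (row[i] * col[j] / n)
         for i in range(len(row)) for j in range(len(col)))
df = (len(row) - 1) * (len(col) - 1)
```

This reproduces X² = 11.5 with df = 9.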
Note
• The chi-squared dist. has μ = df, σ = √(2 df); it is more bell-shaped as df ↑.
• Likelihood-ratio statistic:
G² = 2 Σ nij log(nij/μ̂ij)
   = −2 log [ (maximized likelihood when Ho true)/(maximized likelihood generally) ]
Example: G² = 13.47, df = 9, P-value = .14
• df for the X² test
= no. parameters in general − no. parameters under Ho
• These tests treat X, Y as nominal: if we reorder rows or columns, X² and G² are unchanged.
Standardized (Adjusted) Residuals
rij = (nij − μ̂ij)/√(μ̂ij (1 − pi+)(1 − p+j))
Example: n44 = 8, μ̂44 = 24 × 23/104 = 5.31
r44 = (8 − 5.31)/√(5.31 (1 − 24/104)(1 − 23/104)) = 1.51
Example
                             RELIGIOSITY
             Very      Mod.      Slightly     Not
Female        170       340        174         95
GENDER       (3.2)     (1.0)     (−1.1)     (−3.5)
Male           98       266        161        123
            (−3.2)    (−1.0)      (1.1)      (3.5)
(adjusted residuals in parentheses)
General Social Survey data (variables Sex, Relpersn)
Pearson residual: eij = (nij − μ̂ij)/√μ̂ij   (Σ e²ij = X²)
SPSS
ANALYZE menu
  CROSSTABS suboption
  click on STATISTICS
  click on CELLS: “adjusted standardized”
See www.ats.ucla.edu/stat/examples/icda
R: function chisq.test
Partitioning Chi-Squared
          VD   LS   MS   VS
< 5        2    4   13    3
5–15       2    6   22    4

          VD   LS   MS   VS
15–25      0    1   15    8
> 25       0    3   13    8

          VD   LS   MS   VS
< 15       4   10   35    7
> 15       0    4   28   16

   X²      G²    df
  .30     .30     3
 1.14    1.19     3
10.32   11.98     3   (P = .02, P = .01)
11.76   13.47     9
Note
Small-Sample Test of Independence
P(n11) = C(n1+, n11) C(n2+, n+1 − n11)/C(n, n+1)
where C(a, b) = a!/(b!(a−b)!)
(hypergeometric dist.)
Example: Tea Tasting (Fisher)
                        GUESS
                     Milk    Tea
Poured    Milk         ?             4
First     Tea                        4
                        4      4     8
Possible values: n11 = 0, 1, 2, 3, or 4
P(4) = C(4,4) C(4,0)/C(8,4) = [4!/(4!0!)][4!/(0!4!)]/[8!/(4!4!)] = 4!4!/8! = 1/70 = .014
P(3) = C(4,3) C(4,1)/C(8,4) = 16/70 = .229

n11    P(n11)
 0      .014
 1      .229
 2      .514
 3      .229
 4      .014
For 2 × 2 tables, this gives Fisher's exact test.
For Ho: θ = 1, Ha: θ > 1, the P-value = P(n11 ≥ observed value).
For Ha: θ ≠ 1, P-value = sum of probabilities of outcomes no more likely than the observed one.
Note:
• Fisher's exact test extends to I × J tables (P-value = .23 for job sat. and income)
Three-Way Contingency Tables
Z = white:    53   414
              11    37
Z = black:     0    16
               4   139
are partial tables.
Z = white: θ̂XY(1) = (53 × 37)/(414 × 11) = .43
Z = black: θ̂XY(2) = 0.00 (.94 after adding .5 to cells)
Adding partial tables → XY marginal table
        Yes    No
  W     n11   n12
  B     n21   n22
θ̂XY = 1.45
Cause?
Def. X and Y are conditionally independent given Z if they are independent in each partial table.
In a 2 × 2 × K table:
Marginal    A    20    20
            B    20    40     (θ̂ = 2.0)
3. Generalized Linear Models
Components of a GLM
1. Random component: the dist. of the response variable Y
2. Systematic component: α + β1x1 + β2x2 + · · · + βk xk
3. Link function: g(μ) = α + β1x1 + · · · + βk xk
Example
Note:
GLMs for Binary Data
Suppose Y = 1 or 0 (1 = present, 0 = absent)
P(Y = 1) = π, P(Y = 0) = 1 − π
E(Y) = π, Var(Y) = π(1 − π)
Linear probability model: π = α + β1x1 + · · · + βk xk
This is a GLM with binomial random component and identity link fn.
Alcohol           Malformation                  Proportion
Consumption    Present    Absent     Total       Present
0                 48      17,066     17,114       .0028
< 1               38      14,464     14,502       .0026
1–2                5         788        793       .0063
3–5                1         126        127       .0079
≥ 6                1          37         38       .0262
Using x scores (0, .5, 1.5, 4.0, 7.0), the linear prob. model for π = prob. malformation present has ML fit
π̂ = .0025 + .0011x
At x = 0, π̂ = α̂ = .0025
Note
• Wald test of Ho: β = 0 uses
z = (β̂ − 0)/SE(β̂)
(for large n, has approx. std. normal dist. under the null)
ex. z = .0011/.0007 = 1.50
• Same fit if we enter the 5 binomial “success totals” or the 32,574 individual binary responses of 0 or 1.
Logistic Regression Model
log[π/(1 − π)] = α + βx
ex. logit(π̂) = log[π̂/(1 − π̂)] = −5.96 + .32x
Note
• Both the linear probability model and the logistic regression model fit adequately.

Poisson dist.:
P(y) = e^(−μ) μ^y/y!,   y = 0, 1, 2, . . .
• μ = E(Y) = Var(Y), σ = √μ
Poisson Regression for Count Data
A: 3, 7, 6, . . .   ȳA = 5.0
B: 6, 9, 8, . . .   ȳB = 9.0
For the model μ = α + βx (identity link),
μ̂ = 5.0 + 4.0x
(← test, CI for β)
Inference for GLM Parameters
CI: β̂ ± z_(α/2)(SE)
Test: Ho: β = 0
1. Wald test: z = β̂/SE has approx. N(0, 1) dist.
2. Likelihood-ratio test: test stat. = −2(L0 − L1)
ex. Wafer defects
β = log μB − log μA
Ho: μA = μB ⇔ β = 0
Wald test: z = β̂/SE = .588/.176 = 3.33
Likelihood-ratio test: L1 = 138.2, L0 = 132.4,
−2(L0 − L1) = −2(132.4 − 138.2) = 11.6
Note
• β = log μB − log μA = log(μB/μA), so e^β = μB/μA
e^β̂ = e^.588 = 1.8 = μ̂B/μ̂A
Deviance of a GLM
The likelihood-ratio statistic for
Ho: model holds
Ha: saturated model
For Poisson and binomial models for counts,
Deviance = G² = 2 Σ yi log(yi/μ̂i)   ← for model M
X² = Σ (yi − μ̂i)²/μ̂i   (Pearson)
ex. Wafer defects
deviance G² = 16.3
Pearson X² = 16.0
df = 20 − 2 = 18
Note
Negative Binomial GLM

ex. GSS data: “In the past 12 months, how many people have you known personally that were victims of homicide?”
Y     Black    White
0      119     1070
1       16       60
2       12       14
3        7        4
4        3        0
5        2        0
6        0        1
Model: log(μ) = α + βx
For the Poisson or neg. bin. model,
e^β̂ = e^1.73 = 5.7 = .522/.092 = ȳB/ȳW
For the neg. bin. model, the estimated dispersion para. is D̂ = 4.94 (SE = 1.0)
Models for Rates
Loglinear model: log(μ/t) = α + βx,
or log(μ) − log(t) = α + βx
4. Logistic Regression
Y = 0 or 1, π = P(Y = 1)
log[π(x)/(1 − π(x))] = α + βx
logit[π(x)] = log[π(x)/(1 − π(x))]
Properties
• If β = 0, π(x) = e^α/(1 + e^α), constant as x ↑ (π > 1/2 if α > 0)
• The curve climbs steepest where π(x) = 1/2: e.g., at x with π(x) = 1/2, slope = β(1/2)(1/2) = β/4.
• When π = 1/2, log[π/(1 − π)] = log(0.5/0.5) = log(1) = 0 = α + βx −→ x = −α/β is the x value where this happens.
Ex. Horseshoe crabs
Y = 1 if a female crab has at least one satellite, Y = 0 if no satellite; x = weight; n = 173
ML fit: logit(π̂) = −3.69 + 1.82x
• β̂ > 0 so π̂ ↑ as x ↑
• At x = x̄ = 2.44,
π̂ = exp(−3.69 + 1.82(2.44))/[1 + exp(−3.69 + 1.82(2.44))] = e^0.734/(1 + e^0.734) = 2.08/3.08 = 0.676
• π̂ = 1/2 when x = −α̂/β̂ = 3.69/1.82 = 2.04.
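A hedged Python sketch of these two calculations, using the rounded coefficients from the slide:

```python
from math import exp

def pi_hat(x, alpha=-3.69, beta=1.82):
    # logistic curve: pi(x) = exp(a + b*x) / (1 + exp(a + b*x))
    lin = alpha + beta * x
    return exp(lin) / (1 + exp(lin))

p_mean = pi_hat(2.44)     # near 0.68 with these rounded coefficients
x_half = 3.69 / 1.82      # -alpha/beta, where pi_hat = 1/2
```

The small discrepancy with the slide's 0.676 comes from rounding the ML coefficients to two decimals.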
Note
The linear probability model has ML fit μ̂ = −0.145 + 0.323x.
At x = 5.2, μ̂ = 1.53! (for the estimated prob. of a satellite)
Ex.
Odds Ratio Interpretation
Since log[π/(1 − π)] = α + βx, the odds are
π/(1 − π) = e^(α+βx)
When x ↑ by 1, the odds become e^(α+β(x+1)) = e^β e^(α+βx), i.e., they multiply by e^β.
Inference
CI
95% CI for β is β̂ ± z_0.025(SE) (Wald method):
1.815 ± 1.96(0.377), or (1.08, 2.55)
95% CI for e^β, the multiplicative effect on the odds of a 1-unit increase in x, is
(e^1.08, e^2.55) = (2.9, 12.8).
95% CI for e^0.1β is
(e^0.108, e^0.255) = (1.11, 1.29)
(the odds increase at least 11%, at most 29%, per 0.1-unit increase in x).
Note:
Ex. The LR CI for e^β is (e^1.11, e^2.60) = (3.0, 13.4)
Significance Test
H0: β = 0, Ha: β ≠ 0
z = β̂/SE = 1.815/0.377 = 4.8
or Wald stat. z² = 23.2, df = 1 (chi-squared)
P-value < 0.0001
Likelihood-ratio test
Test stat. −2(L0 − L1) = 30.0
Under H0, has approx. χ² dist., df = 1 (P < 0.0001)
Note: Recall for a model M,
deviance = −2(LM − Ls),
where Ls is the log-likelihood under the saturated (perfect fit) model.
M0: logit[π(x)] = α
Multiple Logistic Regression
Y binary, π = P(Y = 1)
Model form:
logit[P(Y = 1)] = α + β1x1 + β2x2 + · · · + βk xk
or, equivalently,
π = e^(α+β1x1+β2x2+···+βk xk)/(1 + e^(α+β1x1+β2x2+···+βk xk))

Ex. Horseshoe crab data
Model:
logit[P(Y = 1)] = α + β1c1 + β2c2 + β3c3 + β4x4
has ML fit
• At each weight, medium-light crabs are more likely than dark crabs to have satellites.
β̂1 = 1.27, e^1.27 = 3.6: at a given weight, the estimated odds a med-light crab has a satellite are 3.6 times the estimated odds for a dark crab.
e.g., at x = 2.44,
(odds for med-light)/(odds for dark) = (0.70/0.30)/(0.40/0.60) = 3.6
Note
Test of H0: β1 = β2 = β3 = 0 (given weight, Y is indep. of color) uses the likelihood-ratio test statistic.
Note: Other simple models are also adequate.

Ordinal Factors
ML fit:
logit(π̂) = −2.03 + 1.65x1 − 0.51x2
SE for β̂1 is 0.38, SE for β̂2 is 0.22.
e^−0.51 = 0.60 is the estimated odds ratio for a 1-category increase in darkness.

Does the model treating color as nominal fit better?
Qualitative Predictors
95% CI: e^(0.87±1.96(0.367)) = (1.16, 4.89)
Note
• The lack of an interaction term means the estimated odds ratio between Y and each predictor is the same at each level of the other predictor.
(e^2.40 = 1/0.09 = 11.1)
Wald test of H0: β1 = 0 vs Ha: β1 ≠ 0:
z = β̂1/SE = 0.868/0.367 = 2.36
or Wald stat. z² = 5.59, df = 1, P = 0.018.
Exercise
Note
With K centers, define dummy variables:
c1 = 1 for center 1, 0 otherwise; . . . ; cK−1 = 1 for center K − 1, 0 otherwise.
logit[P(Y = 1)] = α + β1c1 + β2c2 + · · · + βK−1cK−1 + βx
A model like this with several dummy var's for a factor is often expressed as
logit[P(Y = 1)] = α + βi^c + βx
where βi^c is the effect for center i (relative to the last center).
To test the center effects, use the
• likelihood-ratio test, or
• Wald test
Ex. Exercise 4.19
The model includes terms βh^G for gender, βi^R for religion, and βj^P for party affil.
Chapter 5: Building Logistic Regression Models
• Model selection
• Model checking
Explanatory variables
• Weight
• Width
Ex. Using backward elimination:
• Refit the model after each removal.
Ex. H0: Model C + S + W, with 3 parameters for C, 2 parameters for S, 1 parameter for W.
Ha: Model C∗S∗W = C + S + W + C×S + C×W + S×W + C×S×W
Results in model fit (see text for details):
logit(π̂) = −12.7 + 1.3c1 + 1.4c2 + 1.1c3 + 0.47(width)
Setting β1 = β2 = β3 gives
logit(π̂) = −13.0 + 1.3c + 0.48(width)
where c = 1 for colors ML, M, MD; c = 0 for D.
Conclude:
• Given width, the estimated odds of a satellite for nondark crabs equal e^1.3 = 3.7 times the est. odds for dark crabs.
Criteria for selecting a model
Note
• Some software (e.g. PROC LOGISTIC in SAS) has options for stepwise selection procedures.
Correlation between observed Y and fitted π̂:
Predictors              Correlation
color                      0.28
width                      0.40
color + width              0.452
color(= dark) + width      0.447
Specificity = P(Ŷ = 0 | Y = 0) = 28/(28 + 34) = 0.45
SAS: Get this with the CTABLE option in PROC LOGISTIC, for various “cutpoints”.

Model checking
EX. Florida death penalty data
Goodness of fit:
Observed counts = 53 and 414
df = 4 − 3 = 1
(4 = no. binomial observ's, 3 = no. model parameters)
Note
(available in PROC LOGISTIC).
d1 = 1, dept. A; d1 = 0, otherwise
· · · · · ·
d5 = 1, dept. E; d5 = 0, otherwise
For department F, d1 = d2 = · · · = d5 = 0.
• Model
logit[P(Y = 1)] = α + β1d1 + · · · + β5d5 + β6g
seems to fit poorly (deviance G² = 20.2, df = 5, P = 0.01)
• In SPSS (version 16.0):
Analyze → Generalized Linear Models
  Type of model: Binary logistic
  Response: dep. var Y (binary, or identify the “trials variable” n)
  Predictors: Factors (qualitative dummy var's), Covariates (quantitative)
  Statistics: can choose likelihood-ratio χ² stat., profile likelihood CI
  Save: can choose predicted value of mean, CI for mean (π), standardized Pearson residual
Sparse Data
Ex.
         S     F
x = 1    8     2
x = 0   10     0
Model: log[P(S)/P(F)] = α + βx
e^β̂ = sample odds ratio = (8 × 0)/(2 × 10) = 0
Ex. y = 0 for x < 50 and y = 1 for x > 50 (complete separation).
logit[P(Y = 1)] = α + βx
has β̂ = ∞. Software may not realize this!
PROC GENMOD: β̂ = 3.84, SE = 15,601,054
SPSS: β̂ = 1.83, SE = 674.8
Ch 6: Multicategory Logit Models
Let πj = P(Y = j), j = 1, 2, . . . , J
Note:
Ex. Income and job satisfaction (1991 GSS data)
Prediction equations (baseline category 4):
log(π̂1/π̂4) = 0.56 − 0.20x
log(π̂2/π̂4) = 0.65 − 0.07x
log(π̂3/π̂4) = 1.82 − 0.05x
Note
Estimating Response Probabilities
πj = e^(αj+βj x)/(1 + e^(α1+β1x) + · · · + e^(αJ−1+βJ−1x)),   j = 1, 2, . . . , J − 1
πJ = 1/(1 + e^(α1+β1x) + · · · + e^(αJ−1+βJ−1x))
Then πj/πJ = e^(αj+βj x), i.e., log(πj/πJ) = αj + βj x
Note Σ πj = 1.
Ex. Job satisfaction data
Let D = 1 + e^(0.56−0.20x) + e^(0.65−0.07x) + e^(1.82−0.05x). Then
π̂1 = e^(0.56−0.20x)/D
π̂2 = e^(0.65−0.07x)/D
π̂3 = e^(1.82−0.05x)/D
π̂4 = 1/D
e.g., π̂4 = 0.365.
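The four fitted probabilities can be computed together; by construction they sum to 1 at any x. A Python sketch (illustrative; the x value is arbitrary, not from the slides):

```python
from math import exp

# (alpha_j, beta_j) for categories 1-3; category 4 is the baseline
coef = [(0.56, -0.20), (0.65, -0.07), (1.82, -0.05)]

def sat_probs(x):
    terms = [exp(a + b * x) for a, b in coef] + [1.0]  # last term is the baseline
    total = sum(terms)
    return [t / total for t in terms]

p = sat_probs(10.0)  # probabilities for categories 1-4 at an arbitrary x
```

Dividing every e^(αj+βj x) term (and the baseline's 1) by the common denominator guarantees the four probabilities are in (0, 1) and sum to 1.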
• ML estimates determine effects for all pairs of categories, e.g.
log(π̂1/π̂2) = log(π̂1/π̂4) − log(π̂2/π̂4)
= (0.564 − 0.199x) − (0.645 − 0.070x)
= −0.081 − 0.129x
Note: Inference uses the usual methods.
Model for Ordinal Responses
Cumulative logit model:
logit[P(Y ≤ j)] = αj + βx,   j = 1, 2, . . . , J − 1
Note
• [odds of (Y ≤ j) at x2]/[odds of (Y ≤ j) at x1] = e^(β(x2−x1)) = e^β when x2 = x1 + 1
Also called the proportional odds model.
Software notes
• SPSS:
Note
These tests give stronger evidence of association than if we treat:
• Y as nominal, using log(πj/π4) = αj + βj x
• X, Y as nominal
Ex. Y = political ideology (GSS data)
(1 = very liberal, . . . , 5 = very conservative)
x1 = gender (1 = F, 0 = M), x2 = political party (1 = Democratic, 0 = Republican)
ML fit:
logit[P̂(Y ≤ j)] = α̂j + 0.117x1 + 0.964x2
With an interaction term:
P̂(Y ≤ j) = e^(α̂j+0.366x1+1.265x2−0.509x1x2)/(1 + e^(α̂j+0.366x1+1.265x2−0.509x1x2))
Find P̂(Y = 1) (very liberal) for male Republicans and female Republicans.
For j = 1, α̂1 = −2.674.
Male Republicans (x1 = 0, x2 = 0):
P̂(Y = 1) = e^−2.674/(1 + e^−2.674) = 0.064
Female Republicans (x1 = 1, x2 = 0):
P̂(Y = 1) = e^(−2.674+0.366)/(1 + e^(−2.674+0.366)) = 0.090
Note P(Y = 5) = P(Y ≤ 5) − P(Y ≤ 4) = 1 − P(Y ≤ 4) (use α̂4 = 0.879)
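The two fitted probabilities for j = 1 can be checked numerically; a Python sketch using the interaction-model coefficients from the slide (the tiny discrepancy from 0.064 and 0.090 comes from coefficient rounding):

```python
from math import exp

def cum_prob(alpha_j, x1, x2, b1=0.366, b2=1.265, b12=-0.509):
    # P(Y <= j) under the cumulative logit model with a gender x party interaction
    lin = alpha_j + b1 * x1 + b2 * x2 + b12 * x1 * x2
    return exp(lin) / (1 + exp(lin))

a1 = -2.674
p_male_rep = cum_prob(a1, x1=0, x2=0)    # P(Y = 1), close to the slide's 0.064
p_female_rep = cum_prob(a1, x1=1, x2=0)  # close to the slide's 0.090
```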
Note
Ch 8: Models for Matched Pairs
Treatment    S    F    Total
Drug        61   25     86
Placebo     22   64     86
Methods so far (e.g., X², G² test of indep., CI for θ, logistic regression) assume independent samples; they are inappropriate for dependent samples (e.g., the same subjects in each sample, which yield matched pairs).
Population probabilities
        S      F
S      π11    π12    π1+
F      π21    π22    π2+
       π+1    π+2    1.0
Compare dependent samples by making inference about π1+ − π+1.
Note:
π1+ − π+1 = (π11 + π12) − (π11 + π21) = π12 − π21
So π1+ = π+1 ⟺ π12 = π21 (symmetry)
By the normal approximation to the binomial, for large n∗ = n12 + n21,
z = (n12 − n∗/2)/√(n∗(1/2)(1/2)) = (n12 − n21)/√(n12 + n21) ∼ N(0, 1)
Or
z² = (n12 − n21)²/(n12 + n21) ∼ χ² (df = 1),
called McNemar's test.
Ex.
                 Placebo
                 S     F
Drug    S       12    49    61 (71%)
        F       10    15    25
                22          86
               (26%)
z = (n12 − n21)/√(n12 + n21) = (49 − 10)/√(49 + 10) = 5.1
P < 0.0001 for H0: π1+ = π+1 vs Ha: π1+ ≠ π+1
Extremely strong evidence that the probability of success is higher for drug than placebo.
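McNemar's test depends only on the discordant counts; a Python sketch for the table above:

```python
from math import sqrt

n12, n21 = 49, 10  # discordant counts from the drug/placebo table
z = (n12 - n21) / sqrt(n12 + n21)  # approx N(0, 1) under H0
chi2 = z ** 2                      # McNemar statistic, df = 1
```

This gives z ≈ 5.1 as on the slide; the concordant counts (12 and 15) carry no information about π1+ − π+1.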
CI for π1+ − π+1
     n11   n12        12   49
     n21   n22   =    10   15     (n = 86)
p1+ − p+1 = (n11 + n12)/n − (n11 + n21)/n = (n12 − n21)/n = (49 − 10)/86 = 0.453
The standard error of p1+ − p+1 is
(1/n)√((n12 + n21) − (n12 − n21)²/n)
Measuring Agreement (Section 8.5.5)
Let πij = P(S = i, E = j)
P(agreement) = π11 + π22 + π33 = Σ πii
(= 1 if perfect agreement)
Note
Cohen's kappa:
κ = (Σ πii − Σ πi+ π+i)/(1 − Σ πi+ π+i)
• κ = 1 if perfect agreement.
Ex.
Σ π̂ii = (24 + 13 + 64)/160 = 0.63
Σ π̂i+ π̂+i = (45/160)(42/160) + · · · + (83/160)(88/160) = 0.40
κ̂ = (0.63 − 0.40)/(1 − 0.40) = 0.389
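A quick numerical check of κ̂ from the slide's summaries (the full 3 × 3 table is not shown, so the chance-agreement value 0.40 is taken from the slide rather than recomputed):

```python
p_agree = (24 + 13 + 64) / 160   # observed agreement, 0.63 on the slide
p_chance = 0.40                  # slide's sum of products of margins
kappa = (p_agree - p_chance) / (1 - p_chance)
```

This reproduces κ̂ ≈ 0.389: the observed agreement exceeds chance agreement by about 39% of the maximum possible excess.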
• 95% CI for κ: 0.389 ± 1.96(0.06) = (0.27, 0.51).
• For H0: κ = 0,
z = κ̂/SE = 0.389/0.06 = 6.5
• In SPSS,
Ch 9: Models for Correlated, Clustered Responses

• The fitting method uses the empirical dependence to adjust standard errors to reflect the actual observed dependence.
95% CI: 0.453 ± 1.96(0.075) = (0.307, 0.600) for the true diff.
Note
GEE is a “quasi-likelihood” method.
Ex. y = response on mental depression (1 = normal, 0 = abnormal), measured at three times (1, 2, 4 weeks), with two drug treatments (standard, new) and two severity-of-initial-diagnosis groups (mild, severe)
Model
log[P(Yt = 1)/P(Yt = 0)] = α + β1s + β2d + β3t
Unrealistic?
More realistic model:
log[P(Yt = 1)/P(Yt = 0)] = α + β1s + β2d + β3t + β4(d × t)
permits the time effect to differ by drug.
GEE estimates:
β̂1 = −1.31 (s)
β̂2 = −0.06 (d)
β̂3 = 0.48 (t)
β̂4 = 1.02 (d × t)
Test of H0: no interaction (β4 = 0) has
z = β̂4/SE = 1.02/0.188 = 5.4
Wald stat. z² = 29.0 (P < 0.0001)
Cumulative Logit Modeling of Repeated Ordinal Responses

Ex. Data from a randomized, double-blind clinical trial comparing a hypnotic drug with placebo in patients with insomnia problems (time to fall asleep, in minutes):
                            Response
Treatment   Occasion     <20    20–30    30–60    >60
Active      Initial      0.10    0.17     0.34    0.39
            Follow-up    0.34    0.41     0.16    0.09
Placebo     Initial      0.12    0.17     0.29    0.42
            Follow-up    0.26    0.24     0.29    0.21
Ex. (sample proportions from the cell counts)
0.10 = (7 + 4 + 1 + 0)/(7 + 4 + 1 + 0 + · · · + 9 + 17 + 13 + 8)
0.34 = (7 + 11 + 13 + 9)/(7 + 4 + 1 + 0 + · · · + 9 + 17 + 13 + 8)
data francom;
input case treat occasion outcome;
datalines;
1 1 0 1
1 1 1 1
2 1 0 1
2 1 1 1
3 1 0 1
3 1 1 1
4 1 0 1
4 1 1 1
5 1 0 1
5 1 1 1
6 1 0 1
6 1 1 1
7 1 0 1
7 1 1 1
8 1 0 1
8 1 1 2
9 1 0 1
9 1 1 2
10 1 0 1
10 1 1 2
239 0 0 4
239 0 1 4
;
proc genmod data=francom;
  class case;
  model outcome = treat occasion treat*occasion /
        dist=multinomial link=cumlogit;
  repeated subject=case / type=indep corrw;
run;
Yt = time to fall asleep
x = treatment (0 = placebo, 1 = active)
t = occasion (0 = initial, 1 = follow-up after 2 weeks)
Model
logit[P(Yt ≤ j)] = αj + β1t + β2x + β3(t × x),   j = 1, 2, 3
GEE estimates:
β̂1 = 1.04 (SE = 0.17), placebo occasion effect
β̂2 = 0.03 (SE = 0.24), treatment effect initially
β̂3 = 0.71 (SE = 0.24), interaction
• At follow-up (t = 1), the estimated odds for the active treatment are e^(0.03+0.71) = e^0.74 = 2.1 times the odds for placebo.
Handling repeated measurement and other forms of clustered data
2. Random effects models (Ch. 10)
Model
logit[P(Yit = 1)] = αi + βxt
The intercept αi varies by subject:
large positive αi ⇒ large P(Yit = 1) for each t
large negative αi ⇒ small P(Yit = 1) for each t
These will induce dependence, averaging over subjects.
Model
logit[P(Yit = 1)] = ui + α + βxt
{ui} are random effects. Parameters such as β are fixed effects.
Response at Three Times (n = normal, a = abnormal)
Diag   Drug   nnn   nna   nan   naa   ann   ana   aan   aaa
Mild   Stan    16    13     9     3    14     4    15     6
Mild   New     31     0     6     0    22     2     9     0
Sev    Stan     2     2     8     9     9    15    27    28
Sev    New      7     2     5     2    31     5    32     6
[graph omitted]
Ch 7: Loglinear Models

Loglinear models treat cell counts as Poisson and use the log link function:
log(μij) = λ + λi^X + λj^Y
λi^X: effect of X falling in row i
λj^Y: effect of Y falling in column j
Ex. Income (I) and job satisfaction (S)
Model
log(μij) = λ + λi^I + λj^S
can be expressed as
log(μij) = λ + λ1^I z1 + λ2^I z2 + λ3^I z3 + λ1^S w1 + λ2^S w2 + λ3^S w3
where
z1 = 1 if income cat. 1, 0 otherwise;   w1 = 1 if sat. cat. 1, 0 otherwise
· · · · · ·
z3 = 1 if income cat. 3, 0 otherwise;   w3 = 1 if sat. cat. 3, 0 otherwise
Parameter     No. nonredundant
λ                    1
λi^X               I − 1     (can set λI^X = 0)
λj^Y               J − 1     (can set λJ^Y = 0)
λij^XY      (I − 1)(J − 1)   (no. of products of dummy var's)
Note.
For a Poisson loglinear model,
df = no. Poisson counts − no. parameters
(no. Poisson counts = no. cells)
For log μij = λ + λi^X + λj^Y,
df = IJ − [1 + (I − 1) + (J − 1)] = (I − 1)(J − 1)
The test of indep. using Pearson X² or likelihood-ratio G² is a goodness-of-fit test of the indep. loglinear model.
Saturated model: log μij = λ + λi^X + λj^Y + λij^XY
Ex. Recall the 4 × 4 table.
The independence model
log(μij) = λ + λi^I + λj^S
has X² = 11.5, G² = 13.5, df = 9.
The estimated odds ratio using the highest and lowest categories is
μ̂11 μ̂44/(μ̂14 μ̂41) = exp(λ̂11^IS + λ̂44^IS − λ̂14^IS − λ̂41^IS)
= exp(24.288) = 35,294,747,720 (GENMOD)
= n11 n44/(n14 n41) = (2 × 8)/(3 × 0) = ∞
since the model is saturated
(software doesn't quite get the right answer when the ML est. = ∞)
Loglinear Models for Three-Way Tables
Two-factor terms represent conditional log odds ratios at a fixed level of the third var.
Ex. 2 × 2 × 2 table:
log μijk = λ + λi^X + λj^Y + λk^Z + λij^XY + λik^XZ + λjk^YZ
called the model of homogeneous association:
each pair of var's has an association that is identical at all levels of the third var.
Denote it by (XY, XZ, YZ).
Ex. Berkeley admissions data (2 × 2 × 6)
Gender(M, F) × Admitted(Y, N) × Department(1, 2, 3, 4, 5, 6)
• Model (AG, AD, DG):
θ̂AG(1) = (529.3 × 36.3)/(295.7 × 71.7) = 0.90
= θ̂AG(2) = . . . = θ̂AG(6)
= exp(λ̂11^AG + λ̂22^AG − λ̂12^AG − λ̂21^AG)
= exp(−.0999) = .90
Controlling for dept., the estimated odds of admission for males equal .90 times the est. odds for females.
Residual Analysis
Dept. 1 has
• fewer males accepted than expected by the model
• more females accepted than expected by the model
If we re-fit model (AD, DG) to the 2 × 2 × 5 table for Depts 2–6:
G² = 2.7, df = 5, a good fit.
log μijk = λ + λi^G + λj^A + λk^D + λij^GA + λik^GD + λjk^AD
H0: λij^GA = 0 (A cond. indep. of G, given D)
Recall θ̂AG(D) = exp(λ̂11^AG) = exp(−.0999) = .90
Note.
• Loglinear models extend to any no. of dimensions.
• Loglinear models treat all variables symmetrically; logistic regr. models treat Y as the response and other var's as explanatory.
Logistic regr. is the more natural approach when one has a single response var. (e.g. grad admissions). See output for logit analysis of data.
Ex. Text Sec. 6.4, 6.5: Auto accidents
G = gender (F, M), I = injury, L = location, S = seat-belt use
Loglinear model (GLS, IG, IL, IS) fits quite well (G² = 7.5, df = 4).
Loglinear model (GI, GL, GS, IL, IS, LS) (G² = 23.4, df = 5) fits almost as well as (GLS, GI, IL, IS) (G² = 7.5, df = 4) in practical terms, but n is huge (68,694).
The estimated odds of injury for those not wearing seat belts are 2.26 times the estimated odds of injury for those wearing seat belts, controlling for gender and location. (Or interchange S and I in the interp.)
Dissimilarity index
D = Σ |pi − π̂i|/2
• 0 ≤ D ≤ 1, with smaller values for better fit.
The simpler model (GL, GS, LS, IG, IL, IS) has G² = 23.4 (df = 5) for testing fit (P < 0.001), but D = 0.008. (A good fit for practical purposes, and simpler to interpret the GS, LS associations.)
D can be useful to describe the closeness of the sample cell proportions {pi} in a contingency table to the model fitted proportions {π̂i}.