Professional Documents
Culture Documents
Variables
Edpsy/Psych/Soc 589
Carolyn J. Anderson
Department of Educational Psychology
I L L I N O I S
university of illinois at urbana-champaign
c Board of Trustees, University of Illinois
Spring 2014
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Outline
In this set of notes:
Spring 2014
2.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
3.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
logit() = log
1
Spring 2014
4.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
5.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
6.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
7.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
8.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
9.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
9.2/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
10.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
10.2/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
10.3/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
11.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
12.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
McFaddens model.
Spring 2014
13.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
14.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
15.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
16.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
17.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
HSB Example
HSB Example: The simplest model, the linear probability model
(i.e., link= identity, and distribution is Binomial).
i
=
+ x
= 7.0548 + .1369xi
Note: ASE for
equals .6948 and ASE for equals .0133.
C.J. Anderson (Illinois)
Spring 2014
18.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
# attend
acad
8
18
14
17
18
22
34
23
35
56
26
# cases
46
87
44
43
50
50
58
44
68
78
32
Observed
proportion
.17
.20
.31
.39
.36
.44
.58
.52
.51
.71
.81
Predicted Prob
Sum Equation
.12
.13
.19
.23
.27
.32
.36
.38
.40
.45
.49
.52
.49
.58
.56
.65
.58
.72
.75
.82
.80
.89
Spring 2014
Pred
#acad
5.64
16.75
12.82
15.46
20.00
24.98
28.52
24.70
39.52
58.62
25.68
19.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Where. . .
P
Predicted # academic = i
(xi ), where the sum is over
those values of i that correspond to observations with xi
within each of math score categories.
P
(xi )/(# cases).
Predicted probability Sum = i
Predicted probability Equation
= exp(7.0548+.1369(achieve i ))/(1+exp(7.0548+.1369(achieve i )
Spring 2014
20.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
then
then
then
then
then
then
then
then
then
grp=2;
grp=3;
grp=4;
grp=5;
grp=6;
grp=7;
grp=8;
grp=9;
grp=10;
Spring 2014
21.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
22.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
23.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Interpreting
Recall that determines the rate of change of the curve of (x)
(plotted with values of x along the horizontal axis) such that
If
> 0,
If
< 0,
If
= 0,
Spring 2014
24.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Figure: Interpreting
Spring 2014
25.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Different s
Spring 2014
26.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
27.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
(xi ) =
exp{7.0548 + .1369xi }
1 + exp{7.0548 + .1369xi }
Spring 2014
28.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
(70) =
= .92619
1 + exp{7.0548 + .1369(70)}
1
(70) = 1 .9261 = .0739
and the slope equals
(70)(1 (70)) = (.1369)(.9261)(.0739) = .009.
Spring 2014
29.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
30.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
31.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
(51.53) = (1
(51.53)) = .5
i
1
i Slope at xi
70
.9261 .0739 .009
60
.7612 .2388 .025
52
.5160 .4840 .03419
51.5325 .5000 .5000 .03423
Spring 2014
32.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
33.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
34.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
35.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
(x = 1) = e .1369 = 1.147
Spring 2014
36.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Race
Smoking status of mother
History of high blood pressure
History of premature labor
Presence of uterine irritability
Mothers pre-pregnancy weight
37.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Example Continued
Youll have to wait for the results until later in semester (or
check references).
Spring 2014
38.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
A Special Case
Spring 2014
39.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
A Special Case
Spring 2014
39.2/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
A Special Case
Spring 2014
39.3/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
40.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
41.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
42.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Hypothesis Testing: HO : = 0
i.e., X is not related to response.
1. Wald test
2. Likelihood ratio test
Wald test: For large samples,
ASE
is approximated N (0, 1) when HO : = 0 is true.
So for 1-tailed tests, just refer to standard normal distribution.
z=
Wald statistic =
ASE
!2
Spring 2014
43.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
.1369
.0133
2
= (10.29)2 = 106
Spring 2014
44.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
= 2(LO L1 )
= 2(415.6749 (346.1340)) = 139.08,
df = 1,
Spring 2014
p < .01
45.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Parameter
Intercept
achieve
Pr > ChiSq
< .0001
< .0001
Why is the Wald statistic only 106.41, while the likelihood ratio
statistic is 139.08 and both have the same df & testing the same
hypothesis?
C.J. Anderson (Illinois)
Logistic Regression for Dichotomous
Spring 2014
46.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
(x) =
exp{
+ x}
1 + exp{
+ x}
(51.1) =
exp{7.0548 + .1369(51.1)}
1 + exp{7.0548 + .1369(51.1))}
Spring 2014
47.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Translation
(x)
logit(
+ x
q (x)) =
c
Var(logit(
(x)))
HSB Example
(for x = 51.1)
0.4853992
0.05842
0.0927311
.4402446
.5307934
Spring 2014
48.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
c + x)
var(
= var(
+ 2x cov(
c + x)
c ) + x 2 var(
c )
c , )
var(
Spring 2014
49.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Prm1
0.48271
-0.009140
Prm2
-0.009140
0.0001762
Note: ASE of =
C.J. Anderson (Illinois)
0.48271
-0.009140
-0.009140
0.0001762
Spring 2014
50.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Computing CI for
(xi )
So for x = 51.1,
+ 2x cov(
c + x)
c ) + x 2 var(
c )
c , )
var(
= var(
c + x)
= .008697 = .0933
and Var(
A 95% confidence interval for the true logit when x = 51.1 is
q
+ x)
1.96 Var(
+ x
0.0584 1.96(.0933) (.2413, .1245)
and finally to get the 95% confidence interval for the true
probability when x = 51.1, transform the endpoints of the interval
for the logit to probabilities:
exp(.2413)
exp(.1245)
(.44, .53)
,
1 + exp(.2413) 1 + exp(.1245)
C.J. Anderson (Illinois)
Spring 2014
51.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Model based CIs are better than the non-model based ones.
e.g., mean achievement 58.84, non-model based
n = 2,
# academic = 1,
and p = 1/2 = .5
Whereas the model based estimate equals
(58.84) = .73.
Cant even compute a 95% confidence interval for using the
(non-model) based sample proportion?
With the logistic regression model, the model based interval is
(.678, .779).
The model confidence interval will tend to be much narrower
than ones based on the sample proportion p, because. . . e.g.,
the estimated standard error of p is
p
p
p(1 p)/n = .5(.5)/2 = .354
while the estimated standard error of
(58.84) is .131.
The model uses all 600 observations, while the sample
proportion only uses 2 out of the 600 observations.
Spring 2014
52.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
53.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
54.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
55.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Model checking
Outline:
1. Goodness-of-fit tests for continuous x.
1.1 Group observed counts & fitted counts from estimated model.
1.2 Group observed counts and then re-fit the model.
1.3 Hosmer & Lemeshow goodness-of-fit test.
Spring 2014
56.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
57.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
58.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Group
x < 40
40 x < 45
45 x < 47
47 x < 49
49 x < 51
51 x < 53
53 x < 55
55 x < 57
57 x < 60
60 x < 65
65 x
Mean
score
37.77
42.60
46.03
47.85
50.12
52.12
53.95
56.05
58.39
62.41
66.56
Observed Data
# attend # attend # cases
acad non-acad in group
8
38
46
18
69
87
14
30
44
17
26
43
18
32
50
22
28
50
34
24
58
23
21
44
35
33
68
56
22
78
26
6
32
Fitted Values
# attend # attend
acad non-acad
6.48
39.52
16.86
70.14
11.98
32.02
13.97
29.03
17.41
32.59
21.44
28.56
24.21
33.79
20.86
23.15
33.45
34.55
50.52
27.48
22.73
9.27
Spring 2014
59.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Plot of Counts
Spring 2014
60.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
61.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
= # group # parameters
df
(11 2) = 9
(11 2) = 9
Value
9.40
9.72
pvalue
Spring 2014
62.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
63.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
64.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Alternate Step 3:
PROC RANK out=group1(keep=grp achieve acedemic)
groups=11;
var achieve;
ranks grp;
run;
Spring 2014
65.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
66.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
67.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
df
= 11 2 = 9
C.J. Anderson (Illinois)
Spring 2014
68.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Statistic
Deviance
Pearson Chi-Square
C.J. Anderson (Illinois)
G2
X2
df
9
9
Value
9.7471
9.4136
p-value
.37
.40
Spring 2014
69.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
70.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Figure of this
Spring 2014
71.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Comparison
Spring 2014
72.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
How:
1. Order the observations according to
(x) from smallest to
largest. In the HSB example, there is only 1 explanatory
variable and probabilities increase as achievement scores go up
(i.e., > 1) so we can just order the math scores. When there
is more than 1 explanatory variable, the ordering must be done
using the
(x)s.
2. Depending on the number of groups desired, it is common to
partition observations such that they have the same number
observations per group.
C.J. Anderson (Illinois)
Spring 2014
73.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Grouping Data by
(xi )
Spring 2014
74.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Hosmer-Lemeshow Statistic
PROC/LOGISTIC will compute the Hosmer-Lemeshow statistic, as
well as print out the partitioning.
Example using PROC LOGISTIC;
Spring 2014
75.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
76.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
833.350
837.747
831.350
696.268
705.062
692.268
139.0819
127.4700
106.4038
1
1
1
< .0001
< .0001
< .0001
Spring 2014
77.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Parameter
DF
Estimate
Standard
Error
Chi-Square
Pr >ChiSq
Intercept
achieve
1
1
-7.0543
0.1369
0.6948
0.0133
103.0970
106.4038
< .0001
< .0001
Spring 2014
78.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Parameter
DF
Estimate
Standard
Error
Chi-Square
Pr >ChiSq
Intercept
achieve
1
1
-7.0543
0.1369
0.6948
0.0133
103.0970
106.4038
< .0001
< .0001
Effect
Point
Estimate
achieve
1.147
95% Wald
Confidence Limits
1.117
1.177
Spring 2014
78.2/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
79.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
80.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
81.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
= 2[(L0 LS ) (L1 LS )]
= G 2 (M0 ) G 2 (M1 )
C.J. Anderson (Illinois)
Spring 2014
82.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
and . . .
and df = df0 df1 .
Assuming that M1 holds, this statistic tests
logit(xi ) = + xi
G 2 (M
Spring 2014
83.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Source
ach bar
DF
1
ChiSquare
134.57
Pr > ChiSq
<.0001
Spring 2014
84.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
y =1
y =0
Predicted
yi = 1
yi = 0
correct
incorrect (false negative)
incorrect (false positive)
correct
Logistic Regression for Dichotomous
Spring 2014
85.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Classification Table
Were more interested in conditional proportions and probabilities:
Actual
y =1
y =0
Predicted
yi = 1
yi = 0
n11 /n1+ n12 /n1+
n21 /n2+ n22 /n2+
n1+
n2+
Spring 2014
86.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Classification Table
Were more interested in conditional proportions and probabilities:
Actual
y =1
y =0
Predicted
yi = 1
yi = 0
n11 /n1+ n12 /n1+
n21 /n2+ n22 /n2+
n1+
n2+
Spring 2014
86.2/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Classification Table
Were more interested in conditional proportions and probabilities:
Actual
y =1
y =0
Predicted
yi = 1
yi = 0
n11 /n1+ n12 /n1+
n21 /n2+ n22 /n2+
n1+
n2+
Spring 2014
86.3/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
HSB Example
Spring 2014
87.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
308 academic
81 predicted non-academic
600 students
94 predicted academic
292 nonacademic
Spring 2014
88.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
89.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
90.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
If
i <
j , then the pair is discordant
If
i =
j , then the pair is tie
The area under the ROC curve equals the concordance index. The
concordance index is an estimate of the probability that predictions
and outcomes are concordant. In PROC LOGISTIC, this index is c
in the table of Association of Predicted Probabilities and
Observed Responses.
This also provides a way to compare models, the solid dots in the
next figure are from a model with more predictors. For example,. . .
C.J. Anderson (Illinois)
Spring 2014
91.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
92.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Residuals
Goodness-of fit-statistics are global, summary measures of lack of
fit.
We should also
Describe the lack of fit
Look for patterns in lack of fit.
Let
yi = observed number of events/successes.
ni = number of observations with explanatory variable equal
to xi .
i = the estimated (fitted) probability at xi .
ni
i = estimated (fitted) number of events.
Pearson residuals are
yi ni
i
ei = p
ni
i (1
i )
C.J. Anderson (Illinois)
Spring 2014
93.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Pearson Residuals
Pearson residuals are
ei = p
yi ni
i
ni
i (1
i )
P
X 2 = i ei2 ; that is, the ei s are components of X 2 .
When the model holds, the Pearson residuals are
approximately normal with mean 0 and variance slightly less
than 1. Values larger than 2 are large.
Just as X 2 and G 2 are not valid when fitted values are small,
Pearson residuals arent that useful (i.e., they have limited
meaning).
If ni = 1 at many values, then the possible values for yi = 1
and 0, so ei can assume only two values.
Spring 2014
94.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
mean
achieve
37.67
42.60
46.01
47.87
50.11
52.17
53.98
56.98
58.42
62.43
66.56
# attend
acad
8
18
14
17
20
22
43
28
47
62
29
Number
of cases
46
87
44
43
50
50
58
44
68
78
32
Observed
prop
.17
.21
.32
.40
.40
.44
.74
.63
.69
.79
.90
Predicted
prob
.13
.23
.32
.38
.45
.52
.58
.65
.72
.81
.88
Pearson
residuals
.78
.55
.07
.21
.75
1.15
2.47
.15
.45
.38
.42
Spring 2014
95.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
96.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
97.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
98.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
99.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
100.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
101.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
102.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Influence
Some observations may have too much influence on
1. Their effect on parameter estimates If the observation is
deleted, the values of parameter estimates are considerably
different.
2. Their effect on the goodness-of-fit of the model to data If
the observation is deleted, the change in how well the model
fits the data is large.
3. The effect of coding or misclassification error of the binary
response variable on statistic(s) of interest. Statistics of
interest include fit statistics and/or model parameter
estimates.
C.J. Anderson (Illinois)
Spring 2014
103.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
104.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
105.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
106.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
107.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
2.52
3.41
3.15
5.06
3.53
2.88
2.67
3.93
38
37
39
37
46
30
39
32
0
0
0
1
1
0
0
1
2
6
10
14
18
22
26
30
2.56
2.46
2.60
3.34
2.68
2.65
2.29
3.34
31
36
41
32
34
46
31
30
0
0
0
1
0
0
0
0
3
7
11
15
19
23
27
31
2.19
3.22
2.29
2.38
2.60
2.09
2.15
2.99
33
38
36
37
38
44
31
36
0
0
0
1
0
1
0
0
4
8
12
16
20
24
28
32
2.18
2.21
2.35
3.15
2.23
2.28
2.54
3.32
31
37
29
36
37
36
28
35
Spring 2014
0
0
0
0
0
0
0
0
108.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
109.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
110.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
1. Leverage or hi
These equal the diagonal elements of the hat matrix, which has
a row and column corresponding to each observation.
Spring 2014
111.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Pearson residuals:
yi ni
(xi )
ni
(xi )(1
(xi ))
Deviance residuals (where
(xi ) =
i )
p
i )
if yi = 0
di = 2ni log(1
p
i )) + (ni yi ) log((ni yi )/(ni (1
i )))
= 2{yi log(yi /(ni
ei = p
= +
=
if yi /ni <
i
2ni log(
i )
if yi = ni
Spring 2014
112.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
113.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
3. Dfbeta
This assesses the effect that an individual observation has on the
parameter estimates.
Dfbeta =
The larger the value of Dfbeta, the larger the change in the
estimated parameter when the observation is removed.
Spring 2014
114.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
4. c and c
These measure the change in the joint confidence interval for the
parameters produced by deleting an observation.
ei2 hi
(1 hi )2
ci =
ei hi
(1 hi )
and
Spring 2014
115.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
ci
hi
Spring 2014
116.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
13
X
x
x
x
Case Number
14 15 17 23
X
x
X x
X
X
X
X
X
X
X
x
X
x
X
x
X
x
X
x
X
x
X
x
X
x
X
x
X
x
X
29
X
x
X
x
x
x
x
x
x
Spring 2014
117.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Purpose/Problem:
Spring 2014
118.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
119.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
logit(
i ) =
+ (fibrin)
i
= 6.8451 + 1.8271(fibrin)i
Which has ln(likelihood) = 12.4202.
Well look at ROI statistics for
Intercept:
()i =
i
=
i (6.8451)
Spring 2014
120.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
121.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
obs
changed
13
29
27
4
3
8
20
24
11
26
12
6
17
1
28
2
delta a
3.22002
1.54184
1.86539
1.81739
1.80128
1.76887
1.73622
1.65351
1.63678
1.63678
1.53500
1.34190
0.81094
1.23279
1.19579
1.15846
delta b
-1.15387
-0.59611
-0.54622
-0.52996
-0.52450
-0.51353
-0.50247
-0.47447
-0.46881
-0.46881
-0.43437
-0.36908
-0.35647
-0.33221
-0.31971
-0.30710
delta ll
-0.98268
0.08378
-2.54509
-2.50642
-2.49342
-2.46726
-2.44087
-2.37395
-2.36039
-2.36039
-2.27786
-2.12110
0.62283
-2.03251
-2.00249
-1.97222
**
Spring 2014
122.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
obs
changed
10
19
22
25
18
14
21
31
0
16
9
7
32
30
5
15
23
delta a
1.08282
1.08282
0.98634
0.94714
0.92740
0.41425
0.51242
0.26625
0.00003
-0.11873
-0.11873
-0.29759
-0.56549
-0.62092
-0.82005
-2.57889
-4.12667
delta b
-0.28156
-0.28156
-0.24900
-0.23577
-0.22911
-0.22689
-0.08926
-0.00644
0.00000
0.12280
0.12280
0.18275
0.27242
0.29095
0.35747
0.74032
1.23224
delta ll
-1.91091
-1.91091
-1.83282
-1.80112
-1.78518
0.91436
-1.45180
-1.25611
0.00002
-0.95460
-0.95460
-0.81602
-0.61059
-0.56842
-0.41784
2.85995
3.67215
***
***
Spring 2014
123.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
124.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Running %RangeOfInfluence
title Example 1: One numerical explanatory variable;
%RangeOfInfluence(indata=esr, rangedata=range1,
samplesize=32, obs=id,
Y=response,
linpred=fibrin );
run;
title Example 2: Two numerical explanatory variables;
%RangeOfInfluence(indata=esr, rangedata=range2,
samplesize=32, obs=id,
Y=response,
linpred=fibrin globulin );
run;
C.J. Anderson (Illinois)
Spring 2014
125.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Running %RangeOfInfluence
Spring 2014
126.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Spring 2014
127.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
The 6 who were unhealthy are 13, 14, 15, 17, 23, and 29.
Spring 2014
128.1/ 129
Review Interpretation Inference Model checking. GOF Grouping Hosmer-Lemeshow LR ROC Residuals Influence Titanic
Source:
Thomas Carson
http://stat.cmu.edu/S/Harrell/data/descriptions/titanic.html
Spring 2014
129.1/ 129