
Chapter 12

Regression Models

12.1 The point $(\hat{x}_0, \hat{y}_0)$ is the closest if it lies at the right-angle vertex of the right triangle whose other vertices are $(x_0, y_0)$ and $(x_0, a+bx_0)$. By the Pythagorean theorem, we must have
$$\left[(\hat{x}_0-x_0)^2+\left(\hat{y}_0-(a+bx_0)\right)^2\right]+\left[(\hat{x}_0-x_0)^2+(\hat{y}_0-y_0)^2\right]=(x_0-x_0)^2+\left(y_0-(a+bx_0)\right)^2.$$
Substituting the values of $\hat{x}_0$ and $\hat{y}_0$ from (12.2.7), the left-hand side becomes
$$\left[\left(\frac{b(y_0-bx_0-a)}{1+b^2}\right)^2+\left(\frac{b^2(y_0-bx_0-a)}{1+b^2}\right)^2\right]+\left[\left(\frac{b(y_0-bx_0-a)}{1+b^2}\right)^2+\left(\frac{y_0-bx_0-a}{1+b^2}\right)^2\right]$$
$$=\left(y_0-(a+bx_0)\right)^2\,\frac{b^2+b^4+b^2+1}{(1+b^2)^2}=\left(y_0-(a+bx_0)\right)^2.$$
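A quick numerical check in R (a sketch; the point $(\hat{x}_0,\hat{y}_0)$ is the foot of the perpendicular from $(x_0,y_0)$ to the line $y=a+bx$, consistent with the differences substituted above):
a <- 1.3; b <- -0.7; x0 <- 2.1; y0 <- 4.5
xhat <- (x0 + b*(y0 - a))/(1 + b^2)    # foot of the perpendicular
yhat <- a + b*xhat
lhs <- ((xhat - x0)^2 + (yhat - (a + b*x0))^2) + ((xhat - x0)^2 + (yhat - y0)^2)
rhs <- (y0 - (a + b*x0))^2
all.equal(lhs, rhs)    # TRUE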
12.3 a. Differentiation yields
$$\frac{\partial f}{\partial\xi_i} = -2(x_i-\xi_i)-2\lambda\beta\left[y_i-(\alpha+\beta\xi_i)\right] \stackrel{\text{set}}{=} 0 \quad\Rightarrow\quad \xi_i(1+\lambda\beta^2)=x_i+\lambda\beta(y_i-\alpha),$$
which gives the required solution. Also, $\partial^2 f/\partial\xi_i^2 = 2(1+\lambda\beta^2) > 0$, so this is a minimum.
b. Parts i), ii), and iii) are immediate. For iv), just note that $D$ is the Euclidean distance between $(x_1,\sqrt{\lambda}\,y_1)$ and $(x_2,\sqrt{\lambda}\,y_2)$, and hence satisfies the triangle inequality.
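As a numerical sanity check of part (a) (a sketch), the closed-form minimizer can be compared with direct minimization:
x <- 1.8; y <- 3.2; alpha <- 0.5; beta <- 1.1; lambda <- 0.6
f <- function(xi) (x - xi)^2 + lambda*(y - (alpha + beta*xi))^2
(x + lambda*beta*(y - alpha))/(1 + lambda*beta^2)   # closed-form solution
optimize(f, interval = c(-10, 10))$minimum          # agrees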
12.5 Differentiate $\log L$, for $L$ in (12.2.17), to get
$$\frac{\partial}{\partial\sigma_\delta^2}\log L = \frac{-n}{2\sigma_\delta^2} + \frac{\lambda}{2(\sigma_\delta^2)^2(1+\hat\beta^2)}\sum_{i=1}^n\left[y_i-(\hat\alpha+\hat\beta x_i)\right]^2.$$
Set this equal to zero and solve for $\sigma_\delta^2$. The answer is (12.2.18).
12.7 a. Suppressing the subscript $i$ and the minus sign, the exponent is
$$\frac{(x-\xi)^2}{\sigma_\delta^2}+\frac{\left[y-(\alpha+\beta\xi)\right]^2}{\sigma_\epsilon^2} = \frac{\sigma_\epsilon^2+\beta^2\sigma_\delta^2}{\sigma_\epsilon^2\sigma_\delta^2}(\xi-k)^2+\frac{\left[y-(\alpha+\beta x)\right]^2}{\sigma_\epsilon^2+\beta^2\sigma_\delta^2},
\qquad\text{where } k=\frac{\sigma_\epsilon^2 x+\sigma_\delta^2\beta(y-\alpha)}{\sigma_\epsilon^2+\beta^2\sigma_\delta^2}.$$
Thus, integrating with respect to $\xi$ eliminates the first term.
b. The resulting function would have to be the joint pdf of $X$ and $Y$; its double integral is infinite, however.
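The completed square in part (a) can be checked numerically (a sketch):
x <- 1.2; y <- 2.7; xi <- 0.4; alpha <- 0.3; beta <- 1.5
s2e <- 0.8; s2d <- 0.5    # sigma_epsilon^2 and sigma_delta^2
k <- (s2e*x + s2d*beta*(y - alpha))/(s2e + beta^2*s2d)
lhs <- (x - xi)^2/s2d + (y - (alpha + beta*xi))^2/s2e
rhs <- (s2e + beta^2*s2d)/(s2e*s2d)*(xi - k)^2 + (y - (alpha + beta*x))^2/(s2e + beta^2*s2d)
all.equal(lhs, rhs)    # TRUE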
12.9 a. From the last two equations in (12.2.19),
$$\hat\sigma_\delta^2 = \frac{1}{n}S_{xx}-\hat\sigma_\xi^2 = \frac{1}{n}S_{xx}-\frac{1}{n}\frac{S_{xy}}{\hat\beta},$$
which is positive only if $S_{xx} > S_{xy}/\hat\beta$. Similarly,
$$\hat\sigma_\epsilon^2 = \frac{1}{n}S_{yy}-\hat\beta^2\hat\sigma_\xi^2 = \frac{1}{n}S_{yy}-\hat\beta^2\,\frac{1}{n}\frac{S_{xy}}{\hat\beta},$$
which is positive only if $S_{yy} > \hat\beta S_{xy}$.

b. We have from part a) that $\hat\sigma_\delta^2>0 \Rightarrow S_{xx}>S_{xy}/\hat\beta$ and $\hat\sigma_\epsilon^2>0 \Rightarrow S_{yy}>\hat\beta S_{xy}$. Furthermore, $\hat\sigma_\xi^2>0$ implies that $S_{xy}$ and $\hat\beta$ have the same sign. Thus $S_{xx}>|S_{xy}|/|\hat\beta|$ and $S_{yy}>|\hat\beta||S_{xy}|$. Combining yields
$$\frac{|S_{xy}|}{S_{xx}} < |\hat\beta| < \frac{S_{yy}}{|S_{xy}|}.$$
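A small illustration (a sketch; it assumes $\hat\beta$ is the root of $\lambda S_{xy}\beta^2+(S_{xx}-\lambda S_{yy})\beta-S_{xy}=0$ with the same sign as $S_{xy}$, the quadratic that reappears in Exercise 12.13):
set.seed(1)
xi <- rnorm(50, 5, 2)
x  <- xi + rnorm(50, 0, 0.5)
y  <- 1 + 2*xi + rnorm(50, 0, 0.5)
n <- length(x); lambda <- 1
Sxx <- sum((x - mean(x))^2); Syy <- sum((y - mean(y))^2)
Sxy <- sum((x - mean(x))*(y - mean(y)))
beta.hat <- ((lambda*Syy - Sxx) + sqrt((lambda*Syy - Sxx)^2 + 4*lambda*Sxy^2))/(2*lambda*Sxy)
Sxy/(n*beta.hat)                                # sigma.hat_xi^2, positive
Sxx/n - Sxy/(n*beta.hat)                        # sigma.hat_delta^2, positive
Syy/n - beta.hat^2*Sxy/(n*beta.hat)             # sigma.hat_epsilon^2, positive
c(abs(Sxy)/Sxx, abs(beta.hat), Syy/abs(Sxy))    # increasing, as in part (b)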
12.11 a.
\begin{align*}
\mathrm{Cov}(aY+bX,\,cY+dX) &= \mathrm{E}(aY+bX)(cY+dX)-\mathrm{E}(aY+bX)\,\mathrm{E}(cY+dX)\\
&= \mathrm{E}\left(acY^2+(bc+ad)XY+bdX^2\right)-\mathrm{E}(aY+bX)\,\mathrm{E}(cY+dX)\\
&= ac\,\mathrm{Var}\,Y+ac(\mathrm{E}Y)^2+(bc+ad)\,\mathrm{Cov}(X,Y)+(bc+ad)\,\mathrm{E}X\,\mathrm{E}Y\\
&\qquad +bd\,\mathrm{Var}\,X+bd(\mathrm{E}X)^2-\mathrm{E}(aY+bX)\,\mathrm{E}(cY+dX)\\
&= ac\,\mathrm{Var}\,Y+(bc+ad)\,\mathrm{Cov}(X,Y)+bd\,\mathrm{Var}\,X.
\end{align*}
b. Identify $a=\beta\lambda$, $b=1$, $c=1$, $d=-\beta$; then, using (12.3.19),
\begin{align*}
\mathrm{Cov}(\beta\lambda Y_i+X_i,\,Y_i-\beta X_i) &= \beta\lambda\,\mathrm{Var}\,Y+(1-\lambda\beta^2)\,\mathrm{Cov}(X,Y)-\beta\,\mathrm{Var}\,X\\
&= \beta\lambda\left(\sigma_\epsilon^2+\beta^2\sigma_\xi^2\right)+(1-\lambda\beta^2)\beta\sigma_\xi^2-\beta\left(\sigma_\delta^2+\sigma_\xi^2\right)\\
&= \beta\lambda\sigma_\epsilon^2-\beta\sigma_\delta^2 = 0
\end{align*}
if $\lambda\sigma_\epsilon^2=\sigma_\delta^2$. (Note that we did not need the normality assumption, just the moments.)
c. Let $W_i=\beta\lambda Y_i+X_i$ and $V_i=Y_i-\beta X_i$. Exercise 11.33 shows that if $\mathrm{Cov}(W_i,V_i)=0$, then $\sqrt{n-2}\,r/\sqrt{1-r^2}$ has a $t_{n-2}$ distribution. Thus $\sqrt{n-2}\,r_\lambda(\beta)/\sqrt{1-r_\lambda^2(\beta)}$ has a $t_{n-2}$ distribution for all values of $\beta$, by part (b). Also
$$P\left(\beta:\ \frac{(n-2)\,r_\lambda^2(\beta)}{1-r_\lambda^2(\beta)}\le F_{1,n-2,\alpha}\right) = P\left((X,Y):\ \frac{(n-2)\,r_\lambda^2(\beta)}{1-r_\lambda^2(\beta)}\le F_{1,n-2,\alpha}\right) = 1-\alpha.$$
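The identity in part (a) is easy to confirm by simulation (a sketch):
set.seed(42)
N <- 1e6
X <- rnorm(N, 1, 2); Y <- 0.7*X + rnorm(N)         # any correlated pair will do
a <- 2; b <- -1; c <- 0.5; d <- 3
cov(a*Y + b*X, c*Y + d*X)                          # sample covariance
a*c*var(Y) + (b*c + a*d)*cov(X, Y) + b*d*var(X)    # matches to Monte Carlo error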

12.13 a. Rewrite (12.2.22) to get
$$\left\{\beta:\ \hat\beta-\frac{t\hat\sigma_{\hat\beta}}{\sqrt{n-2}}\le\beta\le\hat\beta+\frac{t\hat\sigma_{\hat\beta}}{\sqrt{n-2}}\right\} = \left\{\beta:\ \frac{(\hat\beta-\beta)^2}{\hat\sigma_{\hat\beta}^2/(n-2)}\le F\right\}.$$
b. For $\hat\beta$ of (12.2.16), the numerator of $r_\lambda(\beta)$ in (12.2.22) can be written
$$\beta\lambda S_{yy}+(1-\beta^2\lambda)S_{xy}-\beta S_{xx} = -\left[\beta^2(\lambda S_{xy})+\beta(S_{xx}-\lambda S_{yy})-S_{xy}\right] = -\lambda S_{xy}\left(\beta-\hat\beta\right)\left(\beta+\frac{1}{\lambda\hat\beta}\right).$$
Again from (12.2.22), we have
$$\frac{r_\lambda^2(\beta)}{1-r_\lambda^2(\beta)} = \frac{\left(\beta\lambda S_{yy}+(1-\beta^2\lambda)S_{xy}-\beta S_{xx}\right)^2}{\left(\beta^2\lambda^2 S_{yy}+2\beta\lambda S_{xy}+S_{xx}\right)\left(S_{yy}-2\beta S_{xy}+\beta^2 S_{xx}\right)-\left(\beta\lambda S_{yy}+(1-\beta^2\lambda)S_{xy}-\beta S_{xx}\right)^2},$$
and a great deal of straightforward (but tedious) algebra will show that the denominator of this expression is equal to
$$(1+\lambda\beta^2)^2\left(S_{yy}S_{xx}-S_{xy}^2\right).$$
Thus
$$\frac{r_\lambda^2(\beta)}{1-r_\lambda^2(\beta)} = \frac{\lambda^2 S_{xy}^2\left(\beta-\hat\beta\right)^2\left(\beta+\frac{1}{\lambda\hat\beta}\right)^2}{(1+\lambda\beta^2)^2\left(S_{yy}S_{xx}-S_{xy}^2\right)} = \left(\frac{\beta-\hat\beta}{\hat\sigma_{\hat\beta}}\cdot\frac{1+\lambda\beta\hat\beta}{1+\lambda\beta^2}\right)^2\left[\frac{(1+\lambda\hat\beta^2)^2 S_{xy}^2}{\hat\beta^2\left((S_{xx}-\lambda S_{yy})^2+4\lambda S_{xy}^2\right)}\right],$$
after substituting $\hat\sigma_{\hat\beta}^2$ from page 588. Now using the fact that $\hat\beta$ and $-1/(\lambda\hat\beta)$ are both roots of the same quadratic equation, we have
$$\left(\frac{1}{\hat\beta}+\lambda\hat\beta\right)^2 = \frac{(1+\lambda\hat\beta^2)^2}{\hat\beta^2} = \frac{(S_{xx}-\lambda S_{yy})^2+4\lambda S_{xy}^2}{S_{xy}^2}.$$
Thus the expression in square brackets is equal to 1.
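The two key identities here can be spot-checked numerically (a sketch, taking $\hat\beta$ as the root of the quadratic above with the same sign as $S_{xy}$):
Sxx <- 2.3; Syy <- 4.1; Sxy <- 1.7; lambda <- 0.8; beta <- -0.6
num <- beta*lambda*Syy + (1 - beta^2*lambda)*Sxy - beta*Sxx
den <- (beta^2*lambda^2*Syy + 2*beta*lambda*Sxy + Sxx)*(Syy - 2*beta*Sxy + beta^2*Sxx) - num^2
all.equal(den, (1 + lambda*beta^2)^2*(Syy*Sxx - Sxy^2))    # TRUE
beta.hat <- ((lambda*Syy - Sxx) + sqrt((lambda*Syy - Sxx)^2 + 4*lambda*Sxy^2))/(2*lambda*Sxy)
all.equal((1 + lambda*beta.hat^2)^2*Sxy^2 /
          (beta.hat^2*((Sxx - lambda*Syy)^2 + 4*lambda*Sxy^2)), 1)    # TRUE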


12.15 a.
$$\pi(-\alpha/\beta) = \frac{e^{\alpha+\beta(-\alpha/\beta)}}{1+e^{\alpha+\beta(-\alpha/\beta)}} = \frac{e^0}{1+e^0} = \frac{1}{2}.$$
b.
$$\pi\left((-\alpha/\beta)+c\right) = \frac{e^{\alpha+\beta((-\alpha/\beta)+c)}}{1+e^{\alpha+\beta((-\alpha/\beta)+c)}} = \frac{e^{\beta c}}{1+e^{\beta c}},$$
and
$$1-\pi\left((-\alpha/\beta)-c\right) = 1-\frac{e^{-\beta c}}{1+e^{-\beta c}} = \frac{e^{\beta c}}{1+e^{\beta c}}.$$
c.
$$\frac{d}{dx}\pi(x) = \beta\,\frac{e^{\alpha+\beta x}}{\left[1+e^{\alpha+\beta x}\right]^2} = \beta\pi(x)\left(1-\pi(x)\right).$$
d. Because
$$\frac{\pi(x)}{1-\pi(x)} = e^{\alpha+\beta x},$$
the result follows from direct substitution.
e. Follows directly from (d).
f. Follows directly from
$$\frac{\partial}{\partial\alpha}F(\alpha+\beta x) = f(\alpha+\beta x) \qquad\text{and}\qquad \frac{\partial}{\partial\beta}F(\alpha+\beta x) = x\,f(\alpha+\beta x).$$
g. For $F(x)=e^x/(1+e^x)$, $f(x)=F(x)(1-F(x))$ and the result follows. For $F(x)=\pi(x)$ of (12.3.2), it follows from part (c) that $f/[F(1-F)]=\beta$.
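These properties are easy to see numerically (a sketch of parts (a) and (c)):
alpha <- -2; beta <- 0.8
pi.fn <- function(x) exp(alpha + beta*x)/(1 + exp(alpha + beta*x))
pi.fn(-alpha/beta)                         # = 1/2, part (a)
x <- 1.5; h <- 1e-6
(pi.fn(x + h) - pi.fn(x - h))/(2*h)        # numerical derivative
beta*pi.fn(x)*(1 - pi.fn(x))               # matches, part (c)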
12.17 a. The likelihood equations and solution are the same as in Example 12.3.1, with the exception that here $\pi(x_j)=\Phi(\alpha+\beta x_j)$, where $\Phi$ is the cdf of a standard normal.
b. If the 0-1 failure response is denoted "oring" and the temperature data "temp", the following R code will generate the logit and probit regressions:
summary(glm(oring~temp, family=binomial(link="logit")))
summary(glm(oring~temp, family=binomial(link="probit")))
For the logit model we have

            Estimate  Std. Error  z value  Pr(>|z|)
Intercept    15.0429      7.3719    2.041    0.0413
temp         -0.2322      0.1081   -2.147    0.0318

and for the probit model we have

            Estimate  Std. Error  z value  Pr(>|z|)
Intercept    8.77084     3.86222    2.271    0.0232
temp        -0.13504     0.05632   -2.398    0.0165

Although the coefficients are different, the fit is qualitatively the same, and the probability of failure at 31°, using the probit model, is .9999.
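That figure can be reproduced along the following lines (a sketch; it assumes the data sit in a data frame, here called "challenger", with columns oring and temp):
probit.fit <- glm(oring ~ temp, data = challenger, family = binomial(link = "probit"))
predict(probit.fit, newdata = data.frame(temp = 31), type = "response")   # about .9999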
12.19 a. Using the notation of Example 12.3.1, the likelihood (joint density) is
$$\prod_{j=1}^J\left(\frac{e^{\alpha+\beta x_j}}{1+e^{\alpha+\beta x_j}}\right)^{y_j^*}\left(\frac{1}{1+e^{\alpha+\beta x_j}}\right)^{n_j-y_j^*} = e^{\alpha\sum_j y_j^*+\beta\sum_j x_j y_j^*}\prod_{j=1}^J\left(\frac{1}{1+e^{\alpha+\beta x_j}}\right)^{n_j}.$$
By the Factorization Theorem, $\sum_j y_j^*$ and $\sum_j x_j y_j^*$ are sufficient.
b. Straightforward substitution.
12.21 Since $\frac{d}{d\pi}\log\left(\pi/(1-\pi)\right) = 1/\left(\pi(1-\pi)\right)$,
$$\mathrm{Var}\left(\log\frac{\hat\pi}{1-\hat\pi}\right) \approx \left(\frac{1}{\pi(1-\pi)}\right)^2\frac{\pi(1-\pi)}{n} = \frac{1}{n\pi(1-\pi)}.$$
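A Monte Carlo check of this delta-method approximation (a sketch):
set.seed(7)
n <- 200; p <- 0.3; B <- 1e5
p.hat <- rbinom(B, n, p)/n        # p.hat stays away from 0 and 1 here
var(log(p.hat/(1 - p.hat)))       # simulated variance
1/(n*p*(1 - p))                   # delta-method approximation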
12.23 a. If $\sum_i a_i = 0$,
$$\mathrm{E}\sum_i a_iY_i = \sum_i a_i\left[\alpha+\beta x_i+\mu(1-\delta)\right] = \beta\sum_i a_ix_i = \beta$$
for $a_i = (x_i-\bar x)/\sum_i(x_i-\bar x)^2$.
b.
$$\mathrm{E}(\bar Y-\hat\beta\bar x) = \frac{1}{n}\sum_i\left[\alpha+\beta x_i+\mu(1-\delta)\right]-\beta\bar x = \alpha+\mu(1-\delta),$$
so the least squares intercept estimate $a=\bar Y-\hat\beta\bar x$ is unbiased in the model $Y_i=\alpha'+\beta x_i+\epsilon_i$, where $\alpha'=\alpha+\mu(1-\delta)$.
12.25 a. The least absolute deviation line minimizes
$$\left|y_1-(c+dx_1)\right| + \left|y_2-(c+dx_2)\right| + \left|y_3-(c+dx_3)\right|.$$
Here $x_1=x_2$. Any line that passes between $(x_1,y_1)$ and $(x_2,y_2)$ has the same value for the sum of the first two terms, and this value is smaller than that of any line that passes outside of $(x_1,y_1)$ and $(x_2,y_2)$. Of all the lines that pass inside, the ones that go through $(x_3,y_3)$ minimize the entire sum, as illustrated in the sketch after part (b).
b. For the least squares line, $a=-53.88$ and $b=.53$. Any line with slope $b$ between $(17.9-14.4)/9=.39$ and $(17.9-11.9)/9=.67$ and intercept $a=17.9-136b$ is a least absolute deviation line.
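Here is the promised sketch (hypothetical three-point data chosen to match the numbers in part (b), with x1 = x2 = 127 and x3 = 136; the quantreg package computes the LAD, i.e. median regression, fit):
library(quantreg)
x <- c(127, 127, 136); y <- c(14.4, 11.9, 17.9)
rq(y ~ x, tau = 0.5)    # one LAD line, through (136, 17.9); rq may note the solution is nonunique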
12.27 In the terminology of M-estimators (see the argument on pages 485-486), $\hat\beta_L$ is consistent for the $\beta_0$ that satisfies $\mathrm{E}_{\beta_0}\sum_i\psi(Y_i-\beta_0 x_i)=0$, so we must take the "true" $\beta$ to be this value. We then see that
$$\sum_i\psi(Y_i-\hat\beta_L x_i)\to 0$$
as long as the derivative term is bounded, which we assume is so.



12.29 The argument for the median is a special case of Example 12.4.3, where we take $x_i=1$, so $\sigma_x^2=1$. The asymptotic distribution is given in (12.4.5), which, for $\sigma_x^2=1$, agrees with Example 10.2.3.
12.31 The LAD estimates, from Example 12.4.2, are α̃ = 18.59 and β̃ = −.89. Here is Mathematica code to bootstrap the standard deviations. (Mathematica is probably not the best choice here, as it is somewhat slow. Also, the minimization seemed a bit delicate, and worked better when done iteratively.) Sad is the sum of the absolute deviations, which is minimized iteratively in bmin and amin. The residuals are bootstrapped by generating random indices u from the discrete uniform distribution on the integers 1 to 23.
1. First enter data and initialize
Needs["Statistics`Master`"]
Clear[a,b,r,u]
a0=18.59;b0=-.89;aboot=a0;bboot=b0;
y0={1,1.2,1.1,1.4,2.3,1.7,1.7,2.4,2.1,2.1,1.2,2.3,1.9,2.4,
2.6,2.9,4,3.3,3,3.4,2.9,1.9,3.9};
x0={20,19.6,19.6,19.4,18.4,19,19,18.3,18.2,18.6,19.2,18.2,
18.7,18.5,18,17.4,16.5,17.2,17.3,17.8,17.3,18.4,16.9};
model=a0+b0*x0;
r=y0-model;
u:=Random[DiscreteUniformDistribution[23]]
Sad[a_,b_]:=Mean[Abs[model+rstar-(a+b*x0)]]
bmin[a_]:=FindMinimum[Sad[a,b],{b,{.5,1.5}}]
amin:=FindMinimum[Sad[a,b/.bmin[a][[2]]],{a,{16,19}}]
2. Here is the actual bootstrap. The vectors aboot and bboot contain the bootstrapped values.
B=500;
Do[
rstar=Table[r[[u]],{i,1,23}];
astar=a/.amin[[2]];
bstar=b/.bmin[astar][[2]];
aboot=Flatten[{aboot,astar}];
bboot=Flatten[{bboot,bstar}],
{i,1,B}]
3. Summary Statistics
Mean[aboot]
StandardDeviation[aboot]
Mean[bboot]
StandardDeviation[bboot]
4. The results are: intercept, mean 18.66, SD .923; slope, mean −.893, SD .050.
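For comparison, a residual bootstrap along the same lines can be sketched in R (this uses quantreg::rq for the LAD fit in place of the iterative FindMinimum scheme above, so the numbers will differ slightly):
library(quantreg)
y0 <- c(1,1.2,1.1,1.4,2.3,1.7,1.7,2.4,2.1,2.1,1.2,2.3,1.9,2.4,
        2.6,2.9,4,3.3,3,3.4,2.9,1.9,3.9)
x0 <- c(20,19.6,19.6,19.4,18.4,19,19,18.3,18.2,18.6,19.2,18.2,
        18.7,18.5,18,17.4,16.5,17.2,17.3,17.8,17.3,18.4,16.9)
fit <- rq(y0 ~ x0, tau = 0.5)    # LAD fit
r <- resid(fit); fits <- fitted(fit)
set.seed(23)
B <- 500
boot <- t(replicate(B, coef(rq(fits + sample(r, replace = TRUE) ~ x0, tau = 0.5))))
apply(boot, 2, mean)    # bootstrap means (intercept, slope)
apply(boot, 2, sd)      # bootstrap standard deviations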
