Regression Models
12.1 The point $(\hat{x}_0, \hat{y}_0)$ is the closest if it lies on the vertex of the right triangle with vertices $(x_0, y_0)$ and $(x_0, a + bx_0)$. By the Pythagorean theorem, we must have
\[
\left[(\hat{x}_0 - x_0)^2 + \left(\hat{y}_0 - (a + bx_0)\right)^2\right] + \left[(\hat{x}_0 - x_0)^2 + (\hat{y}_0 - y_0)^2\right] = (x_0 - x_0)^2 + \left(y_0 - (a + bx_0)\right)^2.
\]
Substituting the values of $\hat{x}_0$ and $\hat{y}_0$ from (12.2.7), we obtain for the LHS above
\[
\left[\left(\frac{b(y_0 - bx_0 - a)}{1+b^2}\right)^2 + \left(\frac{b^2(y_0 - bx_0 - a)}{1+b^2}\right)^2\right]
+ \left[\left(\frac{b(y_0 - bx_0 - a)}{1+b^2}\right)^2 + \left(\frac{y_0 - bx_0 - a}{1+b^2}\right)^2\right]
= \left(y_0 - (a + bx_0)\right)^2 \frac{b^2 + b^4 + b^2 + 1}{(1+b^2)^2} = \left(y_0 - (a + bx_0)\right)^2.
\]
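The substitution can be spot-checked numerically. Below is a minimal Python sketch (an addition, not part of the original solution); the foot-of-perpendicular coordinate $\hat{x}_0 = x_0 + b(y_0-bx_0-a)/(1+b^2)$ is taken from the terms displayed above.

```python
import random

random.seed(0)
max_err = 0.0
for _ in range(100):
    x0, y0, a, b = (random.uniform(-5, 5) for _ in range(4))
    t = y0 - b * x0 - a                      # signed vertical residual
    xhat = x0 + b * t / (1 + b**2)           # foot of the perpendicular
    yhat = a + b * xhat                      # lies on the line y = a + b x
    leg1 = (xhat - x0)**2 + (yhat - (a + b * x0))**2  # leg to (x0, a + b x0)
    leg2 = (xhat - x0)**2 + (yhat - y0)**2            # leg to (x0, y0)
    hyp = (y0 - (a + b * x0))**2                      # vertical hypotenuse
    max_err = max(max_err, abs(leg1 + leg2 - hyp))
```

The two legs always sum to the squared vertical distance, confirming the Pythagorean identity.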
12.3 a. Differentiation yields
\[
\frac{\partial f}{\partial \xi_i} = -2(x_i - \xi_i) - 2\lambda\beta\left[y_i - (\alpha + \beta\xi_i)\right] \stackrel{\mathrm{set}}{=} 0
\;\Rightarrow\; \xi_i(1+\lambda\beta^2) = x_i + \lambda\beta(y_i - \alpha),
\]
which is the required solution. Also, $\partial^2 f/\partial \xi_i^2 = 2(1+\lambda\beta^2) > 0$, so this is a minimum.
b. Parts (i), (ii), and (iii) are immediate. For (iv) just note that $D$ is the Euclidean distance between $(x_1, \sqrt{\lambda}\,y_1)$ and $(x_2, \sqrt{\lambda}\,y_2)$, hence satisfies the triangle inequality.
12.5 Differentiate $\log L$, for $L$ in (12.2.17), to get
\[
\frac{\partial}{\partial \sigma_\delta^2}\log L = \frac{-n}{2\sigma_\delta^2} + \frac{1}{2(\sigma_\delta^2)^2}\,\frac{\lambda}{1+\hat{\beta}^2}\sum_{i=1}^n\left[y_i - (\hat{\alpha} + \hat{\beta} x_i)\right]^2.
\]
Set this equal to zero and solve for $\sigma_\delta^2$. The answer is (12.2.18).
12.7 a. Suppressing the subscript $i$ and the minus sign, the exponent is
\[
\frac{(x-\xi)^2}{\sigma_\delta^2} + \frac{[y-(\alpha+\beta\xi)]^2}{\sigma_\epsilon^2}
= \frac{\sigma_\epsilon^2 + \beta^2\sigma_\delta^2}{\sigma_\epsilon^2\sigma_\delta^2}(\xi - k)^2 + \frac{[y-(\alpha+\beta x)]^2}{\sigma_\epsilon^2 + \beta^2\sigma_\delta^2},
\]
where $k = \dfrac{\sigma_\epsilon^2 x + \sigma_\delta^2\beta(y-\alpha)}{\sigma_\epsilon^2 + \beta^2\sigma_\delta^2}$. Thus, integrating with respect to $\xi$ eliminates the first term.
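The completing-the-square identity above can be verified numerically; here is a small Python check (an addition, not from the manual), drawing random values for all quantities involved.

```python
import random

random.seed(0)
max_err = 0.0
for _ in range(100):
    x, y, xi, alpha, beta = (random.uniform(-5, 5) for _ in range(5))
    s2d = random.uniform(0.1, 4.0)   # sigma_delta^2
    s2e = random.uniform(0.1, 4.0)   # sigma_epsilon^2
    # vertex of the quadratic in xi
    k = (s2e * x + s2d * beta * (y - alpha)) / (s2e + beta**2 * s2d)
    lhs = (x - xi)**2 / s2d + (y - (alpha + beta * xi))**2 / s2e
    rhs = ((s2e + beta**2 * s2d) / (s2e * s2d)) * (xi - k)**2 \
          + (y - (alpha + beta * x))**2 / (s2e + beta**2 * s2d)
    max_err = max(max_err, abs(lhs - rhs))
```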
b. The resulting function must be the joint pdf of X and Y . The double integral is infinite,
however.
12.9 a. From the last two equations in (12.2.19),
\[
\hat{\sigma}_\delta^2 = \frac{1}{n}S_{xx} - \hat{\sigma}_\xi^2 = \frac{1}{n}S_{xx} - \frac{1}{n}\frac{S_{xy}}{\hat{\beta}}
\quad\text{and, similarly,}\quad
\hat{\sigma}_\epsilon^2 = \frac{1}{n}S_{yy} - \hat{\beta}^2\hat{\sigma}_\xi^2 = \frac{1}{n}S_{yy} - \frac{\hat{\beta}}{n}S_{xy}.
\]
b. We have from part (a), $\hat{\sigma}_\delta^2 > 0 \Rightarrow S_{xx} > S_{xy}/\hat{\beta}$ and $\hat{\sigma}_\epsilon^2 > 0 \Rightarrow S_{yy} > \hat{\beta} S_{xy}$. Furthermore, $\hat{\sigma}_\xi^2 > 0$ implies that $S_{xy}$ and $\hat{\beta}$ have the same sign. Thus $S_{xx} > |S_{xy}|/|\hat{\beta}|$ and $S_{yy} > |\hat{\beta}||S_{xy}|$. Combining yields
\[
\frac{|S_{xy}|}{S_{xx}} < |\hat{\beta}| < \frac{S_{yy}}{|S_{xy}|}.
\]
12.11 a. With $W_i = \beta\lambda Y_i + X_i$ and $V_i = Y_i - \beta X_i$ (as in part (c)),
\[
\mathrm{Cov}(W_i, V_i) = \beta\lambda\,\mathrm{Var}\,Y_i - \beta\,\mathrm{Var}\,X_i + (1-\beta^2\lambda)\,\mathrm{Cov}(X_i, Y_i) = \beta\lambda\sigma_\epsilon^2 - \beta\sigma_\delta^2 = 0
\]
if $\lambda\sigma_\epsilon^2 = \sigma_\delta^2$. (Note that we did not need the normality assumption, just the moments.)
c. Let $W_i = \beta\lambda Y_i + X_i$, $V_i = Y_i - \beta X_i$. Exercise 11.33 shows that if $\mathrm{Cov}(W_i, V_i) = 0$, then $\sqrt{n-2}\,r/\sqrt{1-r^2}$ has a $t_{n-2}$ distribution. Thus $\sqrt{n-2}\,r_\lambda(\beta)/\sqrt{1-r_\lambda^2(\beta)}$ has a $t_{n-2}$ distribution for all values of $\beta$, by part (b). Also
\[
P\left(\left\{\beta : \frac{(n-2)\,r_\lambda^2(\beta)}{1-r_\lambda^2(\beta)} \le F_{1,n-2,\alpha}\right\}\right)
= P\left(\left\{(X,Y) : \frac{(n-2)\,r_\lambda^2(\beta)}{1-r_\lambda^2(\beta)} \le F_{1,n-2,\alpha}\right\}\right) = 1-\alpha.
\]
\[
\frac{r_\lambda^2(\beta)}{1-r_\lambda^2(\beta)}
= \frac{\left(\beta\lambda S_{yy} + (1-\beta^2\lambda)S_{xy} - \beta S_{xx}\right)^2}
{\left(\beta^2\lambda^2 S_{yy} + 2\beta\lambda S_{xy} + S_{xx}\right)\left(S_{yy} - 2\beta S_{xy} + \beta^2 S_{xx}\right) - \left(\beta\lambda S_{yy} + (1-\beta^2\lambda)S_{xy} - \beta S_{xx}\right)^2},
\]
and a great deal of straightforward (but tedious) algebra will show that the denominator of this expression is equal to $(1+\lambda\beta^2)^2\left(S_{xx}S_{yy} - S_{xy}^2\right)$. Thus
\[
\frac{r_\lambda^2(\beta)}{1-r_\lambda^2(\beta)}
= \frac{\lambda^2 S_{xy}^2\,(\beta-\hat{\beta})^2\left(\beta + \frac{1}{\lambda\hat{\beta}}\right)^2}{(1+\lambda\beta^2)^2\left(S_{xx}S_{yy} - S_{xy}^2\right)}
= \left(\frac{\beta-\hat{\beta}}{\hat{\sigma}_\beta}\right)^2\left(\frac{1+\lambda\beta\hat{\beta}}{1+\lambda\beta^2}\right)^2
\frac{(1+\lambda\hat{\beta}^2)^2 S_{xy}^2}{\hat{\beta}^2\left[(S_{xx}-\lambda S_{yy})^2 + 4\lambda S_{xy}^2\right]},
\]
after substituting $\hat{\sigma}_\beta^2$ from page 588. Now using the fact that $\hat{\beta}$ and $-1/\lambda\hat{\beta}$ are both roots of the same quadratic equation, we have
\[
\left(\frac{1}{\hat{\beta}} + \lambda\hat{\beta}\right)^2 = \frac{(1+\lambda\hat{\beta}^2)^2}{\hat{\beta}^2} = \frac{(S_{xx}-\lambda S_{yy})^2 + 4\lambda S_{xy}^2}{S_{xy}^2}.
\]
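The "tedious algebra" steps lend themselves to a numerical sanity check. This Python sketch (an addition, not part of the solution) verifies that the denominator $AB - C^2$ equals $(1+\lambda\beta^2)^2(S_{xx}S_{yy}-S_{xy}^2)$ and that the two roots of the covariance quadratic multiply to $-1/\lambda$, which is what makes $\hat{\beta}$ and $-1/\lambda\hat{\beta}$ roots of the same equation.

```python
import random

random.seed(0)
max_err_den = max_err_root = 0.0
for _ in range(100):
    Sxx = random.uniform(0.5, 5.0)
    Syy = random.uniform(0.5, 5.0)
    # keep |Sxy| strictly between 0 and sqrt(Sxx*Syy)
    Sxy = random.choice([-1, 1]) * random.uniform(0.2, 0.95) * (Sxx * Syy)**0.5
    lam = random.uniform(0.2, 3.0)
    beta = random.uniform(-2.0, 2.0)
    A = lam**2 * beta**2 * Syy + 2 * lam * beta * Sxy + Sxx        # Var(beta*lam*Y + X)
    B = Syy - 2 * beta * Sxy + beta**2 * Sxx                       # Var(Y - beta*X)
    C = lam * beta * Syy + (1 - lam * beta**2) * Sxy - beta * Sxx  # their covariance
    max_err_den = max(max_err_den,
                      abs(A * B - C**2
                          - (1 + lam * beta**2)**2 * (Sxx * Syy - Sxy**2)))
    # roots of C (viewed as a quadratic in beta)
    D2 = (Sxx - lam * Syy)**2 + 4 * lam * Sxy**2
    b1 = ((lam * Syy - Sxx) + D2**0.5) / (2 * lam * Sxy)
    b2 = ((lam * Syy - Sxx) - D2**0.5) / (2 * lam * Sxy)
    max_err_root = max(max_err_root, abs(b1 * b2 + 1 / lam))
```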
d. Because
\[
\frac{\pi(x)}{1-\pi(x)} = e^{\alpha+\beta x},
\]
the result follows from direct substitution.
e. Follows directly from (d).
f. Follows directly from
\[
\frac{\partial}{\partial\alpha}F(\alpha+\beta x) = f(\alpha+\beta x) \quad\text{and}\quad \frac{\partial}{\partial\beta}F(\alpha+\beta x) = x\,f(\alpha+\beta x).
\]
g. For $F(x) = e^x/(1+e^x)$, $f(x) = F(x)(1-F(x))$ and the result follows. For $F(x) = \pi(x)$ of (12.3.2), from part (c) it follows that $\dfrac{f}{F(1-F)} = \beta$.
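Both facts in part (g) are easy to confirm with a numerical derivative; a short Python check follows (an addition; the values of $\alpha$ and $\beta$ are hypothetical, chosen only for illustration).

```python
import math

F = lambda u: math.exp(u) / (1 + math.exp(u))   # logistic cdf
h = 1e-6
alpha, beta = 0.5, 2.0   # hypothetical parameter values
max_err1 = max_err2 = 0.0
for u in (-2.0, 0.0, 1.5):
    # f = F(1-F) for the logistic cdf
    fprime = (F(u + h) - F(u - h)) / (2 * h)
    max_err1 = max(max_err1, abs(fprime - F(u) * (1 - F(u))))
    # for pi(x) = F(alpha + beta*x), the ratio f/[F(1-F)] is beta
    G = lambda x: F(alpha + beta * x)
    gprime = (G(u + h) - G(u - h)) / (2 * h)
    max_err2 = max(max_err2, abs(gprime / (G(u) * (1 - G(u))) - beta))
```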
12.17 a. The likelihood equations and solution are the same as in Example 12.3.1 with the exception
that here π(xj ) = Φ(α + βxj ), where Φ is the cdf of a standard normal.
b. If the 0-1 failure response is denoted "oring" and the temperature data "temp", the following R code will fit the logit and probit regressions:
summary(glm(oring~temp, family=binomial(link=logit)))
summary(glm(oring~temp, family=binomial(link=probit)))
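For readers without R, the same likelihood equations can be solved directly by Newton-Raphson. The sketch below is an addition, written in Python on hypothetical toy data (the O-ring data itself is not listed in the text); it fits only the logit link, and the probit case differs only in the cdf and its derivative.

```python
import math

# Hypothetical toy data (illustration only, NOT the O-ring data):
# binary failure indicator y at temperature x.
x = [15.0, 16.0, 17.0, 18.0, 19.0, 20.0, 21.0, 22.0]
y = [1, 1, 1, 0, 1, 0, 0, 0]

def fit_logit(x, y, iters=25):
    """Newton-Raphson for pi(x) = exp(a + b*x)/(1 + exp(a + b*x))."""
    xbar = sum(x) / len(x)
    xc = [xi - xbar for xi in x]        # center x for numerical stability
    a = b = 0.0
    for _ in range(iters):
        p = [1 / (1 + math.exp(-(a + b * xi))) for xi in xc]
        g0 = sum(yi - pi for yi, pi in zip(y, p))                  # score wrt a
        g1 = sum((yi - pi) * xi for yi, pi, xi in zip(y, p, xc))   # score wrt b
        w = [pi * (1 - pi) for pi in p]                            # information weights
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, xc))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, xc))
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (-h01 * g0 + h00 * g1) / det
    return a - b * xbar, b              # undo the centering

alpha_hat, beta_hat = fit_logit(x, y)
```

On these toy data the fitted slope is negative, matching the pattern of more failures at lower temperatures.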
12-4 Solutions Manual for Statistical Inference
for ai = xi − x̄.
b.
\[
E(\bar{Y} - \beta\bar{x}) = \frac{1}{n}\sum_i\left[\alpha + \beta x_i + \mu(1-\delta)\right] - \beta\bar{x} = \alpha + \mu(1-\delta),
\]
Any line that lies between $(x_1, y_1)$ and $(x_2, y_2)$ has the same value for the sum of the first two terms, and this value is smaller than that of any line outside of $(x_1, y_1)$ and $(x_2, y_2)$.
Of all the lines that lie inside, the ones that go through (x3 , y3 ) minimize the entire sum.
b. For the least squares line, a = −53.88 and b = .53. Any line with b between (17.9−14.4)/9 =
.39 and (17.9 − 11.9)/9 = .67 and a = 17.9 − 136b is a least absolute deviation line.
12.27 In the terminology of $M$-estimators (see the argument on pages 485-486), $\hat{\beta}_L$ is consistent for the $\beta_0$ that satisfies $E_{\beta_0}\sum_i \psi(Y_i - \beta_0 x_i) = 0$, so we must take the "true" $\beta$ to be this value. We then see that
\[
\sum_i \psi(Y_i - \hat{\beta}_L x_i) \to 0
\]
12.29 The argument for the median is a special case of Example 12.4.3, where we take xi = 1
so σx2 = 1. The asymptotic distribution is given in (12.4.5) which, for σx2 = 1, agrees with
Example 10.2.3.
12.31 The LAD estimates from Example 12.4.2 are α̃ = 18.59 and β̃ = −.89. Here is Mathematica
code to bootstrap the standard deviations. (Mathematica is probably not the best choice here,
as it is somewhat slow. Also, the minimization seemed a bit delicate, and worked better when
done iteratively.) Sad is the sum of the absolute deviations, which is minimized iteratively
in bmin and amin. The residuals are bootstrapped by generating random indices u from the
discrete uniform distribution on the integers 1 to 23.
1. First enter data and initialize
Needs["Statistics`Master`"]
Clear[a,b,r,u]
a0=18.59;b0=-.89;aboot=a0;bboot=b0;
y0={1,1.2,1.1,1.4,2.3,1.7,1.7,2.4,2.1,2.1,1.2,2.3,1.9,2.4,
2.6,2.9,4,3.3,3,3.4,2.9,1.9,3.9};
x0={20,19.6,19.6,19.4,18.4,19,19,18.3,18.2,18.6,19.2,18.2,
18.7,18.5,18,17.4,16.5,17.2,17.3,17.8,17.3,18.4,16.9};
model=a0+b0*x0;
r=y0-model;
u:=Random[DiscreteUniformDistribution[23]]
Sad[a_,b_]:=Mean[Abs[model+rstar-(a+b*x0)]]
bmin[a_]:=FindMinimum[Sad[a,b],{b,{.5,1.5}}]
amin:=FindMinimum[Sad[a,b/.bmin[a][[2]]],{a,{16,19}}]
2. Here is the actual bootstrap. The vectors aboot and bboot contain the bootstrapped values.
B=500;
Do[
rstar=Table[r[[u]],{i,1,23}];
astar=a/.amin[[2]];
bstar=b/.bmin[astar][[2]];
aboot=Flatten[{aboot,astar}];
bboot=Flatten[{bboot,bstar}],
{i,1,B}]
3. Summary Statistics
Mean[aboot]
StandardDeviation[aboot]
Mean[bboot]
StandardDeviation[bboot]
4. The results are: Intercept: mean 18.66, SD .923; Slope: mean −.893, SD .050.