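$Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$
$\epsilon_i \sim N(0, \sigma^2)$.
We assume that the true model is known.
We assume that you've seen confidence intervals and hypothesis tests before.
$\hat\beta_0 = \bar Y - \hat\beta_1 \bar X$
$\hat\beta_1 = Cor(Y, X) \frac{Sd(Y)}{Sd(X)}$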
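As a quick check of the two closed-form estimates, the sketch below computes them on simulated data and compares them with `lm()`. The data and the true values 1 and 2 are arbitrary assumptions made for illustration; they do not come from the slides.

```r
# Simulated data; the true intercept (1) and slope (2) are arbitrary choices.
set.seed(1)
n <- 50
x <- runif(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.5)

# Closed-form least squares estimates
beta1 <- cor(y, x) * sd(y) / sd(x)
beta0 <- mean(y) - beta1 * mean(x)

# The same estimates from lm()
fit <- lm(y ~ x)
rbind(byHand = c(beta0, beta1), lm = coef(fit))
```

The two rows should agree up to floating point error.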
Review
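Statistics like $\frac{\hat\theta - \theta}{\hat\sigma_{\hat\theta}}$ often have the following properties.
1. Is normally distributed and has a finite sample Student's T distribution if the variance is replaced with a sample estimate (under normality assumptions).
2. Can be used to test $H_0 : \theta = \theta_0$ versus $H_a : \theta >, <, \neq \theta_0$.
3. Can be used to create a confidence interval for $\theta$ via $\hat\theta \pm Q_{1-\alpha/2} \hat\sigma_{\hat\theta}$, where $Q_{1-\alpha/2}$ is the relevant quantile from either a normal or T distribution.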
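To make the three properties concrete, here is a minimal sketch in which the statistic is a sample mean, so that $\hat\theta$ is `mean(x)` and its estimated standard error is `sd(x) / sqrt(n)`. The simulated sample and the hypothesized value `theta0` are assumptions chosen only for illustration.

```r
set.seed(2)
x <- rnorm(30, mean = 5, sd = 2)  # simulated sample; values are arbitrary
n <- length(x)
theta0 <- 4                       # hypothesized value under H0 (arbitrary)

thetaHat <- mean(x)
seHat <- sd(x) / sqrt(n)          # estimated standard error of the mean
tStat <- (thetaHat - theta0) / seHat

# Two-sided test using the T distribution with n - 1 degrees of freedom
pValue <- 2 * pt(abs(tStat), df = n - 1, lower.tail = FALSE)

# 95% interval: estimate +/- quantile * standard error
ci <- thetaHat + c(-1, 1) * qt(0.975, df = n - 1) * seHat

# t.test() should report the same statistic, p-value and interval
tt <- t.test(x, mu = theta0)
c(tStat = tStat, pValue = pValue, lower = ci[1], upper = ci[2])
```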
In the case of regression with iid sampling assumptions and normal errors, our inferences will follow very similarly to what you saw in your inference class.
We won't cover asymptotics for regression analysis, but suffice it to say that under assumptions on the ways in which the $X$ values are collected, the iid sampling model, and the mean model, the normal results hold and can be used to create confidence intervals.
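$$
\begin{aligned}
Var(\hat\beta_1) &= \frac{Var\left(\sum_{i=1}^n Y_i (X_i - \bar X)\right)}{\left(\sum_{i=1}^n (X_i - \bar X)^2\right)^2} \\
&= \frac{\sum_{i=1}^n \sigma^2 (X_i - \bar X)^2}{\left(\sum_{i=1}^n (X_i - \bar X)^2\right)^2} \\
&= \frac{\sigma^2}{\sum_{i=1}^n (X_i - \bar X)^2}
\end{aligned}
$$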
Results
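$\sigma_{\hat\beta_1}^2 = Var(\hat\beta_1) = \sigma^2 / \sum_{i=1}^n (X_i - \bar X)^2$
$\sigma_{\hat\beta_0}^2 = Var(\hat\beta_0) = \left(\frac{1}{n} + \frac{\bar X^2}{\sum_{i=1}^n (X_i - \bar X)^2}\right)\sigma^2$
In practice, $\sigma$ is replaced by its estimate.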
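The variance formula for the slope can be checked by simulation. The sketch below holds the $X$ values fixed, redraws normal errors many times, and compares the empirical variance of the fitted slopes with $\sigma^2 / \sum_{i=1}^n (X_i - \bar X)^2$. The sample size, the true coefficients, and the error standard deviation are all arbitrary assumptions.

```r
set.seed(3)
n <- 40
sigma <- 1.5                 # true error sd (arbitrary choice)
x <- runif(n, 0, 10)         # design points, held fixed across replications
ssx <- sum((x - mean(x))^2)

# Redraw the errors many times and record the least squares slope each time
slopes <- replicate(5000, {
  y <- 2 + 0.5 * x + rnorm(n, sd = sigma)
  sum((y - mean(y)) * (x - mean(x))) / ssx
})

c(simulated = var(slopes), formula = sigma^2 / ssx)
```

The two numbers should be close but not identical, since the first is a Monte Carlo estimate.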
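It's probably not surprising that under iid Gaussian errors $\frac{\hat\beta_j - \beta_j}{\hat\sigma_{\hat\beta_j}}$ follows a t distribution with $n - 2$ degrees of freedom.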
Example continued
coefTable
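The code that built `coefTable` is not part of this excerpt. One way such a table could be assembled by hand from the formulas above is sketched here, using simulated data in place of the diamond data. The data and the intermediate names (`seBeta0`, `tBeta1`, and so on) are illustrative assumptions.

```r
set.seed(4)
n <- 48
x <- runif(n)                          # hypothetical predictor
y <- 10 + 3 * x + rnorm(n)             # hypothetical outcome

beta1 <- cor(y, x) * sd(y) / sd(x)
beta0 <- mean(y) - beta1 * mean(x)
e <- y - beta0 - beta1 * x             # residuals
sigma <- sqrt(sum(e^2) / (n - 2))      # estimate of the error sd
ssx <- sum((x - mean(x))^2)

seBeta0 <- sqrt(1 / n + mean(x)^2 / ssx) * sigma
seBeta1 <- sigma / sqrt(ssx)
tBeta0 <- beta0 / seBeta0              # statistic for H0: beta0 = 0
tBeta1 <- beta1 / seBeta1              # statistic for H0: beta1 = 0
pBeta0 <- 2 * pt(abs(tBeta0), df = n - 2, lower.tail = FALSE)
pBeta1 <- 2 * pt(abs(tBeta1), df = n - 2, lower.tail = FALSE)

coefTable <- rbind(c(beta0, seBeta0, tBeta0, pBeta0),
                   c(beta1, seBeta1, tBeta1, pBeta1))
colnames(coefTable) <- c("Estimate", "Std. Error", "t value", "P(>|t|)")
rownames(coefTable) <- c("(Intercept)", "x")
coefTable

# For comparison, the table reported by lm()
lmTable <- summary(lm(y ~ x))$coefficients
lmTable
```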
With 95% confidence, we estimate that a 0.1 carat increase in diamond size results in a 355.6 to 388.6 increase in price in (Singapore) dollars.
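An interval of this kind is the slope interval divided by 10, because a 0.1 unit change in the predictor is one tenth of a unit change. A minimal sketch on simulated data follows; the diamond data are not in this excerpt, so the numbers it prints will not match the ones quoted above.

```r
set.seed(5)
n <- 48
x <- runif(n)                       # hypothetical predictor
y <- 10 + 3 * x + rnorm(n)          # hypothetical outcome
fit <- lm(y ~ x)
sumCoef <- summary(fit)$coefficients

# 95% interval for the slope: estimate +/- t quantile * standard error
slopeCI <- sumCoef[2, 1] + c(-1, 1) * qt(0.975, df = fit$df.residual) * sumCoef[2, 2]

# Interval for the expected change in y per 0.1 unit change in x
slopeCI / 10
```

`confint(fit)` returns the same slope interval without the manual arithmetic.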
Prediction of outcomes
Consider predicting $Y$ at a value of $X$
Predicting the price of a diamond given the carat
Predicting the height of a child given the height of the parents
The obvious estimate for prediction at point $x_0$ is
$\hat\beta_0 + \hat\beta_1 x_0$
A standard error is needed to create a prediction interval.
There's a distinction between intervals for the regression line at point $x_0$ and the prediction of what a $y$ would be at point $x_0$.
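Line at $x_0$ se, $\hat\sigma \sqrt{\frac{1}{n} + \frac{(x_0 - \bar X)^2}{\sum_{i=1}^n (X_i - \bar X)^2}}$
Prediction interval se at $x_0$, $\hat\sigma \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar X)^2}{\sum_{i=1}^n (X_i - \bar X)^2}}$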
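The two standard errors differ only by the extra 1 under the square root. The sketch below evaluates both at a single point and checks the resulting 95% intervals against `predict()`. The simulated data and the point `x0 = 7` are arbitrary assumptions.

```r
set.seed(6)
n <- 50
x <- runif(n, 0, 10)
y <- 1 + 2 * x + rnorm(n, sd = 3)    # simulated data; values are arbitrary
fit <- lm(y ~ x)

x0 <- 7                              # point at which to predict (arbitrary)
sigmaHat <- summary(fit)$sigma       # estimate of the error sd
ssx <- sum((x - mean(x))^2)
yhat0 <- sum(coef(fit) * c(1, x0))   # beta0-hat + beta1-hat * x0

seLine <- sigmaHat * sqrt(1 / n + (x0 - mean(x))^2 / ssx)
sePred <- sigmaHat * sqrt(1 + 1 / n + (x0 - mean(x))^2 / ssx)
q <- qt(0.975, df = n - 2)

byHand <- rbind(confidence = yhat0 + c(-1, 1) * q * seLine,
                prediction = yhat0 + c(-1, 1) * q * sePred)

# The same limits from predict(); columns 2 and 3 are the lower and upper limits
new0 <- data.frame(x = x0)
fromPredict <- rbind(predict(fit, new0, interval = "confidence")[, 2:3],
                     predict(fit, new0, interval = "prediction")[, 2:3])
byHand
fromPredict
```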
Discussion
Both intervals have varying widths.
Least width at the mean of the Xs.
We are quite confident in the regression line, so that interval is very narrow.
If we knew $\beta_0$ and $\beta_1$ this interval would have zero width.
The prediction interval must incorporate the variability in the data around the line.
Even if we knew $\beta_0$ and $\beta_1$ this interval would still have width.
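These points can be seen numerically by computing the width of each interval over a grid of predictor values. The sketch below uses simulated data; the grid size and the model values are arbitrary assumptions.

```r
set.seed(7)
n <- 50
x <- runif(n, 0, 10)
y <- 1 + 2 * x + rnorm(n, sd = 3)    # simulated data; values are arbitrary
fit <- lm(y ~ x)

xVals <- seq(min(x), max(x), length.out = 201)
newdata <- data.frame(x = xVals)
p1 <- predict(fit, newdata, interval = "confidence")
p2 <- predict(fit, newdata, interval = "prediction")
widthLine <- p1[, "upr"] - p1[, "lwr"]
widthPred <- p2[, "upr"] - p2[, "lwr"]

# Both widths are smallest at the grid point nearest the mean of x
c(meanX = mean(x),
  narrowestLine = xVals[which.min(widthLine)],
  narrowestPred = xVals[which.min(widthPred)])

# The prediction interval is wider everywhere, and its width stays above
# the part contributed by the error term alone
floorWidth <- 2 * qt(0.975, df = n - 2) * summary(fit)$sigma
all(widthPred > widthLine)
c(minPredWidth = min(widthPred), floorWidth = floorWidth)
```

With more data the interval for the line shrinks toward zero width, while the prediction interval width levels off near `floorWidth`.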
In R
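# Assumes x (carat), y (price) and fit <- lm(y ~ x) exist from the earlier example.
# xVals is not defined in this excerpt; a grid over the observed range is assumed here.
xVals <- seq(min(x), max(x), length.out = 100)
newdata <- data.frame(x = xVals)
p1 <- predict(fit, newdata, interval = "confidence")
p2 <- predict(fit, newdata, interval = "prediction")
plot(x, y, frame = FALSE, xlab = "Carat", ylab = "Dollars", pch = 21, col = "black", bg = "lightblue", cex = 2)
abline(fit, lwd = 2)
# Columns 2 and 3 of each result hold the lower and upper interval limits.
lines(xVals, p1[, 2]); lines(xVals, p1[, 3])
lines(xVals, p2[, 2]); lines(xVals, p2[, 3])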