Sept 24, 2009
Lecture 6
REMINDER Problem (Chatterjee S. & Hadi A. S., 2006): Consider the case of a company that markets and
repairs small computers. To study the relationship between the length of a service call and the number
of electronic components in the computer that must be repaired or replaced, a sample of records on
service calls was taken. The data consist of the length of service calls in minutes (the response variable)
and the number of components repaired (the predictor variable). (See the data in Lecture 4.)
PREVIOUS LECTURE IN CLASS
Construct a simple linear regression model (Chatterjee S. & Hadi A. S., Sections 2.5, 2.7)
a. OLS estimators for the regression model: $\hat{y}_i = 4.15 + 15.51X_i$
b. Calculate the coefficient of determination ($R^2$) to interpret the relationship: $R^2 = 0.9874$
c. Confidence intervals for $\beta_0$ and $\beta_1$:
$P(-3.160103 < \beta_0 < 11.46010) = 0.95$; $P(14.40975 < \beta_1 < 16.61025) = 0.95$
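The summary values above can be reproduced from the 14 service-call records tabulated later in these notes. The following Python sketch is only illustrative (the variable names are our own, not the textbook's); it computes the sums of squares, the OLS estimates, the residual standard error, and $R^2$. At full precision the intercept comes out near 4.16; the value 4.15 used in these notes appears to result from rounding the slope to 15.51 before computing the intercept.

```python
import math

# Service-call data (minutes, units) as tabulated in these notes.
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119, 149, 145, 154, 166]
units = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]

n = len(units)
x_bar = sum(units) / n
y_bar = sum(minutes) / n

# Corrected sums of squares and cross-products.
sxx = sum((x - x_bar) ** 2 for x in units)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(units, minutes))
syy = sum((y - y_bar) ** 2 for y in minutes)

# OLS estimates of slope and intercept.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Residual sum of squares, residual standard error, and R^2.
sse = syy - b1 * sxy
sigma_hat = math.sqrt(sse / (n - 2))
r2 = 1 - sse / syy

print(f"SXX={sxx:.0f}  SXY={sxy:.0f}")
print(f"b0={b0:.4f}  b1={b1:.4f}")
print(f"sigma_hat={sigma_hat:.4f}  R^2={r2:.4f}")
```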
In this lecture: We will answer some questions based on the fitted regression model
(Chatterjee S. & Hadi A. S., Sections 2.6, 2.8, 2.9):
$\hat{y} = 4.15 + 15.51X$
QUESTION 1: Does the length of a service call depend on the number of computer units?
QUESTION 2: Can we expect the increase in service time for each additional unit to be repaired
to be 16 minutes? Do the data support this conjecture?
QUESTION 3: What will be the length of the service call if the customer calls
regarding 9 computer units? And what is the confidence interval for this value with
confidence coefficient $(1-\alpha)$?
QUESTION 4: What if one calls for 18 computer units? How long will the service call be? And
what is the 95% confidence interval for that new observation?
QUESTION 5: What are the tools to examine the quality of fit?
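$y = \beta_0 + \beta_1 X + \varepsilon$
$t = \dfrac{\hat{\beta}_1 - \beta_1^0}{se(\hat{\beta}_1)}$, where $\beta_1^0$ is the hypothesized value of $\beta_1$ (here $\beta_1^0 = 0$)
$\hat{\beta}_1 = \dfrac{SXY}{SXX} = \dfrac{1768}{114} = 15.51$
$se(\hat{\beta}_1) = \dfrac{\hat{\sigma}}{\sqrt{SXX}} = \dfrac{5.3917}{\sqrt{114}} = 0.504979$
$t = \dfrac{15.51 - 0}{0.504979} = 30.71415$, so t.calc = 30.71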
Step 4: Decision
t.calc = 30.71 > t(0.995, 12) = 3.054540, or p.val = 4.454215e-13 < $\alpha/2$ = 0.005
(H0 is rejected.) With 99% confidence we can say that there is a significant relationship between the
length of a service call and the number of computer components.
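The test-statistic arithmetic above can be checked with a few lines of Python. This is a minimal sketch using the rounded values from these notes; the function name `slope_t_statistic` is our own, and the critical value is copied from the notes rather than computed.

```python
import math

def slope_t_statistic(b1_hat, b1_null, sigma_hat, sxx):
    """Return (t, se) for testing H0: beta_1 = b1_null in simple linear regression."""
    se_b1 = sigma_hat / math.sqrt(sxx)
    return (b1_hat - b1_null) / se_b1, se_b1

# Rounded inputs as used in the notes.
t_calc, se_b1 = slope_t_statistic(b1_hat=15.51, b1_null=0.0, sigma_hat=5.3917, sxx=114)

# Critical value t(0.995, 12) quoted in the notes (not computed here).
t_crit = 3.054540
reject_h0 = abs(t_calc) > t_crit

print(f"se(b1)={se_b1:.6f}  t.calc={t_calc:.2f}  reject H0: {reject_h0}")
```

Calling the same function with `b1_null=16.0` gives the statistic needed for Question 2.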
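$t = \dfrac{\hat{\beta}_1 - \beta_1^0}{se(\hat{\beta}_1)}$, where $\beta_1^0$ is the hypothesized value of $\beta_1$ (here $\beta_1^0 = 16$)
$\hat{\beta}_1 = \dfrac{SXY}{SXX} = \dfrac{1768}{114} = 15.51$
$se(\hat{\beta}_1) = \dfrac{\hat{\sigma}}{\sqrt{SXX}} = \dfrac{5.3917}{\sqrt{114}} = 0.504979$
$t = \dfrac{15.51 - 16}{0.504979} = -0.9703374$, so t.calc = -0.9703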
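Step 4: Decision
t.calc = -0.9703 > t(0.025, 12) = -2.1788
OR
p-value = 0.1755154 > $\alpha/2$ = 0.025 (since the test is two-sided, compare with $\alpha/2$)
H0 is not rejected. The data therefore do not contradict the conjecture that the increase in service
time for each additional unit to be repaired is 16 minutes. Note that failing to reject H0 is not
strong evidence that the increase is exactly 16 minutes.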
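An equivalent way to reach this decision is through the confidence interval for $\beta_1$: a two-sided t-test at level $\alpha$ does not reject a hypothesized slope exactly when that value lies inside the $(1-\alpha)$ confidence interval. The sketch below rebuilds the 95% interval from the rounded values in these notes and checks where 16 falls; the critical value is copied from the notes, and the variable names are our own.

```python
import math

# Rounded estimates from the notes.
b1_hat, sigma_hat, sxx = 15.51, 5.3917, 114
se_b1 = sigma_hat / math.sqrt(sxx)

# Two-sided 95% critical value with 12 degrees of freedom, as quoted in the notes.
t_crit_95 = 2.178813

lower = b1_hat - t_crit_95 * se_b1
upper = b1_hat + t_crit_95 * se_b1

conjectured_slope = 16.0
inside_interval = lower < conjectured_slope < upper

print(f"95% CI for beta_1: ({lower:.5f}, {upper:.5f})")
print(f"16 inside interval (H0 not rejected): {inside_interval}")
```

The interval should match the one reported in the previous lecture, and 16 falls inside it, which agrees with the t-test decision.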
QUESTION 3: What will be the length of the service call if the customer calls
regarding 9 computer units? And what is the confidence interval for this value with
confidence coefficient $(1-\alpha)$?
ANSWER: Using the simple linear regression model, find the fitted value at the given value
of X, and construct the confidence interval for this fitted value.
min (Y)   units (X)   Y.fit ($\hat{y} = 4.15 + 15.51X$)
   23         1          19.66
   29         2          35.17
   49         3          50.68
   64         4          66.19
   74         4          66.19
   87         5          81.70
   96         6          97.21
   97         6          97.21
  109         7         112.72
  119         8         128.23
  149         9         143.74
  145         9         143.74
  154        10         159.25
  166        10         159.25
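$\hat{y}_{11} = 143.74$
$P\left(\hat{y}_{11} \pm t(\alpha/2,\, n-2) \cdot se(\hat{y}_{11})\right) = 1 - \alpha$
$se(\hat{y}_{11}) = \hat{\sigma}\left[\dfrac{1}{n} + \dfrac{(x_{11} - \bar{x})^2}{SXX}\right]^{1/2}$
$x_{11} = 9$; $\bar{x} = 6$; $SXX = 114$; $n = 14$; $\hat{\sigma} = 5.3917$.
$se(\hat{y}_{11}) = 5.3917\left[\dfrac{1}{14} + \dfrac{(9 - 6)^2}{114}\right]^{1/2} = 2.090812$
$t(\alpha/2,\, n-2) = t(0.025, 12) = 2.178813$
$P(143.74 \pm 2.1788 \cdot 2.090812) = 0.95$
$P(139.1845 < \hat{y}_{11} < 148.2955) = 0.95$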
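The interval above can be reproduced step by step. The following Python sketch assumes the rounded coefficients and the critical value quoted in these notes; the function name `mean_response_ci` is our own and not from the textbook.

```python
import math

def mean_response_ci(x0, b0, b1, sigma_hat, n, x_bar, sxx, t_crit):
    """Fitted value at x0, its standard error, and the confidence limits."""
    fit = b0 + b1 * x0
    se_fit = sigma_hat * math.sqrt(1.0 / n + (x0 - x_bar) ** 2 / sxx)
    half_width = t_crit * se_fit
    return fit, se_fit, fit - half_width, fit + half_width

# Values taken from the notes; t_crit is the two-sided 95% value for 12 df.
fit, se_fit, lower, upper = mean_response_ci(
    x0=9, b0=4.15, b1=15.51, sigma_hat=5.3917,
    n=14, x_bar=6.0, sxx=114.0, t_crit=2.178813,
)

print(f"fit={fit:.2f}  se={se_fit:.6f}  95% CI=({lower:.4f}, {upper:.4f})")
```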
QUESTION 4: What if one calls for 18 computer units? How long will the service call be? And
what is the 95% confidence interval for that new observation?
ANSWER: Find the predicted value for this new value of the predictor.
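$\hat{y}_{15} = 4.15 + 15.51 \cdot 18$
$\hat{y}_{15} = 283.33$ min
$se(\hat{y}_{15}) = \hat{\sigma}\left[1 + \dfrac{1}{n} + \dfrac{(x_{15} - \bar{x})^2}{SXX}\right]^{1/2} = 5.3917\left[1 + \dfrac{1}{14} + \dfrac{(18 - 6)^2}{114}\right]^{1/2} = 8.2382$
$se(\hat{y}_{15}) = 8.238169$
$t(\alpha/2,\, n-2) = t(0.025, 12) = 2.178813$
$P\left(\hat{y}_{15} \pm t(\alpha/2,\, n-2) \cdot se(\hat{y}_{15})\right) = 1 - \alpha$
$P(283.33 \pm 2.1788 \cdot 8.238169) = 0.95$
$P(265.3807 < y_{15} < 301.2793) = 0.95$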
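The only difference between the interval for a new observation and the interval for a fitted value is the extra "1 +" under the square root. The sketch below makes that explicit with a single hypothetical helper, `interval_at`, whose defaults are the rounded values from these notes; it is an illustration, not a definitive implementation.

```python
import math

def interval_at(x0, new_observation, b0=4.15, b1=15.51, sigma_hat=5.3917,
                n=14, x_bar=6.0, sxx=114.0, t_crit=2.178813):
    """Return (fit, se, lower, upper) at x0.

    new_observation=True adds 1 under the square root (interval for a new
    observation); False gives the interval for the fitted (mean) value.
    """
    fit = b0 + b1 * x0
    extra = 1.0 if new_observation else 0.0
    se = sigma_hat * math.sqrt(extra + 1.0 / n + (x0 - x_bar) ** 2 / sxx)
    half_width = t_crit * se
    return fit, se, fit - half_width, fit + half_width

fit, se_pred, lower, upper = interval_at(18, new_observation=True)
_, se_mean, _, _ = interval_at(18, new_observation=False)

print(f"prediction: fit={fit:.2f}  se={se_pred:.6f}  ({lower:.4f}, {upper:.4f})")
print(f"standard error for the fitted value at the same x: {se_mean:.6f}")
```

The interval for a new observation is always wider than the one for the fitted value at the same X. Also note that X = 18 lies outside the observed range of 1 to 10 units, so this prediction is an extrapolation.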
QUESTION 5: What are the tools to examine the quality of fit?
ANSWER: The important tools are:
1) t-test for regression coefficients: The larger the absolute t value (or the smaller the p-value), the
stronger the relationship between X and Y.
2) Correlation coefficient between Y and X: $Corr(Y, X) = \dfrac{SXY}{\sqrt{(SXX)(SYY)}}$
i. Corr(Y, X) < 0: negative relationship
ii. Corr(Y, X) > 0: positive relationship
iii. Corr(Y, X) closer to 1 or -1: stronger relationship
3) Examine the scatterplot of Y versus $\hat{Y}$: the closer the set of points to a straight line, the stronger
the relationship.
4) Coefficient of determination ($R^2$): The proportion of the variability of the response explained by the predictor.
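Tools 2 to 4 are closely linked in simple linear regression: the squared correlation between Y and X equals $R^2$, and the correlation between Y and the fitted values equals the absolute value of Corr(Y, X). The following Python sketch checks this numerically on the service-call data tabulated in these notes; the helper `corr` is our own illustrative function.

```python
import math

# Service-call data (minutes, units) as tabulated in these notes.
minutes = [23, 29, 49, 64, 74, 87, 96, 97, 109, 119, 149, 145, 154, 166]
units = [1, 2, 3, 4, 4, 5, 6, 6, 7, 8, 9, 9, 10, 10]

def corr(a, b):
    """Pearson correlation: S_ab / sqrt(S_aa * S_bb)."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    s_ab = sum((u - a_bar) * (v - b_bar) for u, v in zip(a, b))
    s_aa = sum((u - a_bar) ** 2 for u in a)
    s_bb = sum((v - b_bar) ** 2 for v in b)
    return s_ab / math.sqrt(s_aa * s_bb)

# Correlation between response and predictor.
r_yx = corr(minutes, units)

# Fitted values from the rounded model used in the notes.
fitted = [4.15 + 15.51 * x for x in units]
r_y_fit = corr(minutes, fitted)

print(f"Corr(Y, X)={r_yx:.4f}  Corr(Y, Y.fit)={r_y_fit:.4f}  Corr(Y, X)^2={r_yx ** 2:.4f}")
```

The squared correlation should agree with the $R^2 = 0.9874$ reported in the previous lecture.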