Stat-UB.0103.03: Statistics for Business Control and Regression Models


Prof. Halina Frydman
December 16, 2011
Solutions to Review Questions for the Final Exam
1) Consider a simple regression of wife's height (Wife) on husband's
height (Husband). The heights are measured in centimeters.
a) Is there a statistically significant linear relationship between the height
of the husband and the height of the wife? (State the hypothesis test and carry
out the test at \( \alpha = 0.01 \).)
\[ H_0: \beta_1 = 0 \]
\[ H_a: \beta_1 \neq 0 \]
\[ t = \frac{0.69965}{0.06106} = 11.46 > z_{0.005} = 2.57 \]
Yes, there is a statistically significant linear relationship between the
height of the husband and the height of the wife.
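The test in part a) can be checked numerically. The sketch below is only an illustration: it assumes 0.06106 is the standard error of the slope (it is the denominator of the t ratio above) and uses the standard normal critical value, as the solution does. The variable names are hypothetical.

```python
from statistics import NormalDist

slope = 0.69965    # estimated slope from the solution
std_err = 0.06106  # assumed to be the slope's standard error
alpha = 0.01

# t ratio for H0: beta_1 = 0
t_stat = slope / std_err

# Two-sided critical value from the standard normal distribution
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

reject_h0 = abs(t_stat) > z_crit
print(round(t_stat, 2), round(z_crit, 3), reject_h0)
```

The computed critical value is slightly above the table value 2.57 quoted above because the table entry is rounded; the conclusion is the same.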
b) 41.93 is the y-intercept; it does not have a meaningful interpretation.
0.7 cm is the expected increase in the wife's height for a one-centimeter increase
in the husband's height.
c) What is the correlation coefficient between the heights of the husband
and wife? We compute
\[ r = \sqrt{\frac{4613.7}{7917}} = \sqrt{0.583} = 0.763 \]
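As a quick arithmetic check of part c), the sketch below recomputes the ratio and its square root. It assumes the two numbers are the regression and total sums of squares from the ANOVA table (the solution does not label them), and it takes the positive root because the estimated slope is positive.

```python
import math

ss_regression = 4613.7  # assumed: regression sum of squares
ss_total = 7917         # assumed: total sum of squares

r_squared = ss_regression / ss_total
# The slope is positive, so the correlation is the positive square root.
r = math.sqrt(r_squared)

print(round(r_squared, 3), round(r, 3))
```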
d) What would the correlation be if every woman married a man exactly
7 centimeters taller than her?
\[ \text{wife} = -7 + \text{husband} \]
In this case there is a perfect positive linear relationship between
the height of a wife and the height of her husband. The correlation coefficient is 1.
e) Construct a 95% confidence interval for the average height of wives
who are married to 195 cm tall men. Would you trust this confidence interval?
(Explain)
\[ \widehat{\text{Wife}} = 41.93 + 0.69965(195) = 178.36 \]
\[ 178.36 \pm t_{0.025,94}(1.4) \approx 178.36 \pm 1.96(1.4) \]
\[ (175.6, 181.1) \]
I would not trust this confidence interval because the height of 195 cm is
beyond the range of values for husbands' heights in the sample.
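The interval in part e) can be reproduced with a few lines of arithmetic. This is a sketch under the assumption that 1.4 is the standard error of the fitted value at 195 cm and that 1.96 is an adequate approximation to the t critical value with 94 degrees of freedom, as in the solution.

```python
b0, b1 = 41.93, 0.69965  # intercept and slope from the solution
x0 = 195                 # husband's height in cm
se_fit = 1.4             # assumed: standard error of the fitted value
crit = 1.96              # normal approximation to t_{0.025,94}

fit = b0 + b1 * x0
lower = fit - crit * se_fit
upper = fit + crit * se_fit

print(round(fit, 2), (round(lower, 1), round(upper, 1)))
```

The code only reproduces the arithmetic; it cannot tell whether 195 cm lies inside the observed range of husbands' heights, which is the reason the interval is not trusted.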
2 a) We perform the t test:
\[ H_0: \beta_1 = 0 \]
\[ H_a: \beta_1 \neq 0 \]
Reject \( H_0 \) if \( |t| > t_{0.025,48} \approx 2 \). Since
\[ |t| = 2.2 > 2 \]
we reject \( H_0 \) and conclude that the model is statistically significant at
\( \alpha = 0.05 \). One can get an exact p-value from Minitab,
p-value \( = 2 \times 0.0164 \approx 0.033 \), and an approximate one from the
Z tables as \( 2P(Z > 2.2) = 2(0.5 - 0.4861) = 0.0278 \).
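The normal-table approximation to the two-sided p-value can be sketched as follows. It uses the standard normal distribution in place of the t distribution with 48 degrees of freedom, which is why it comes out a little smaller than the Minitab value.

```python
from statistics import NormalDist

t_stat = 2.2  # observed |t| from the solution

# Two-sided p-value using the standard normal as an approximation
upper_tail = 1 - NormalDist().cdf(abs(t_stat))
p_approx = 2 * upper_tail

print(round(p_approx, 4))
```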
b) \( \hat{\beta}_0 = 23.7 \) has no interpretation.
\( \hat{\beta}_1 = 4.86 \): for a one-year increase in age, the unscrambling time increases,
on average, by 4.86 seconds.
c) We first compute SSE using \( s \):
\[ SSE = (n - 2)s^2 = 958508 \]
\[ r^2 = 1 - \frac{SSE}{SST} = 1 - \frac{958508}{1055380} = 0.0918, \]
or 9.18% of the sample variation in the unscrambling time is explained by the
model.
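The computation in part c) is short enough to verify directly. In this sketch the residual degrees of freedom (48) are taken from the critical value used in part a), and the residual standard deviation is back-computed from SSE rather than read from the Minitab output, so treat it as an illustration only.

```python
import math

sse = 958508     # error sum of squares from the solution
sst = 1055380    # total sum of squares from the solution
df_error = 48    # assumed: n - 2, matching t_{0.025,48} in part a)

r_squared = 1 - sse / sst
# Residual standard deviation implied by SSE = (n - 2) * s^2
s = math.sqrt(sse / df_error)

print(round(r_squared, 4), round(s, 1))
```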
d) Since the 95% confidence interval is narrowest at the mean value
of age, we find the mean age in the sample by using the fact that
\( (\bar{x}, \bar{y}) \) lies on the regression line:
\[ 150.1 = 23.7 + 4.86\bar{x} \]
Solving,
\[ \bar{x} = \frac{150.1 - 23.7}{4.86} \approx 26. \]
The confidence interval for 25-year-old individuals will be narrower
because 25 is closer to the mean age in the sample (26) than 35 is. The closer
the value of the predictor is to its mean value in the sample,
the smaller the standard error of the fitted value and thus the narrower
the confidence interval.
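The reasoning in part d) can be sketched in code: recover the sample mean age from the fitted line, then compare how far each candidate age is from it. The helper name `narrower_age` is hypothetical.

```python
b0, b1 = 23.7, 4.86  # fitted intercept and slope
y_bar = 150.1        # sample mean unscrambling time

# The point (x_bar, y_bar) lies on the fitted line, so solve for x_bar.
x_bar = (y_bar - b0) / b1

# The interval is narrower for the age closer to the sample mean.
dist_25 = abs(25 - x_bar)
dist_35 = abs(35 - x_bar)
narrower_age = 25 if dist_25 < dist_35 else 35

print(round(x_bar, 1), narrower_age)
```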
3 a)
\[ H_0: \mu \leq 120 \]
\[ H_a: \mu > 120 \]
Reject \( H_0 \) if
\[ t = \frac{\bar{X} - 120}{s_{\bar{X}}} > t_{0.01,49} \approx z_{0.01} = 2.33 \]
We have
\[ t = \frac{150.1 - 120}{20.8} = 1.45 < 2.33 \]
Thus there is no evidence in the data that \( \mu > 120 \). The students do not
seem to be correct.
b) Here we use the standard normal distribution:
p-value \( = P(Z > 1.45) = 0.5 - 0.4265 = 0.0735 \).
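Both parts of question 3 can be checked with the standard library. The sketch assumes 20.8 is the standard error of the sample mean and uses the normal approximation throughout, as the solution does. Because the code keeps the unrounded test statistic, its p-value differs from the table-based 0.0735 in the fourth decimal place.

```python
from statistics import NormalDist

x_bar = 150.1    # sample mean time
mu_0 = 120       # hypothesized mean
se_mean = 20.8   # assumed: standard error of the sample mean
alpha = 0.01

t_stat = (x_bar - mu_0) / se_mean
z_crit = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
reject_h0 = t_stat > z_crit

# One-sided p-value from the standard normal distribution
p_value = 1 - NormalDist().cdf(t_stat)

print(round(t_stat, 2), round(z_crit, 2), reject_h0, round(p_value, 3))
```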
4 a) From the correlation matrix we see that Gender is least correlated
with Time. In fact, it does not have a statistically significant correlation
with Time (p-value = 0.453).
b) The test is
\[ H_0: \beta_{age} = \beta_{words} = \beta_{English} = 0 \]
\( H_1 \): at least one coefficient is not equal to zero
The test statistic is \( F \). We reject \( H_0 \) if
\( F_{Minitab} = 4.39 > F_{0.01,3,46} \). From
the table of the F distribution we see that \( F_{0.01,3,46} < 4.31 \), which says that
\( 4.39 > F_{0.01,3,46} \), so the test is statistically significant at \( \alpha = 0.01 \).
c) Age and Prefer Words are statistically significant at \( \alpha = 0.05 \). They
have p-values less than 0.05.
d) \( (0.364)^2 \approx 0.132 \)
e) We have to compare the observed \( F = 5.57 \) with a tabulated value, which
is \( F_{0.01,2,47} \). \( F_{0.01,2,47} \) is between 4.98 and 5.18. Since
\( 5.57 > 5.18 \), the model is statistically significant at \( \alpha = 0.01 \).
The constant does not have an interpretation. 3.97: for every one-year
increase in age, the unscrambling time increases by about 4 seconds, keeping
gender constant. 98.44 is the difference in average unscrambling time
between individuals who judge themselves to be more verbally oriented
(compared to those who are quantitatively oriented), assuming they are of the
same age.
5 a) Let X denote the daily return of A and Y the daily return of B.
\[ 0.5 \times 0.6 = 0.3 \]
b) Let Z = the number of A returns that are larger than 4 among 3
returns and W = the number of B returns that are exactly 2. Then
\( Z \sim B(3, 0.5) \) and \( W \sim B(3, 0.6) \).
\[ P(Z \leq 2)P(W \leq 2) = [1 - P(Z = 3)][1 - P(W = 3)] = \left(1 - \frac{1}{8}\right)(1 - 0.6^3) = 0.686 \]
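The binomial calculation in part b) can be verified with a small helper. The function `binom_pmf` is a hypothetical name written here for illustration; it implements the usual binomial probability mass function.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# P(at most 2 successes) = 1 - P(all 3 are successes)
p_z_le_2 = 1 - binom_pmf(3, 3, 0.5)
p_w_le_2 = 1 - binom_pmf(3, 3, 0.6)

# The solution multiplies the two probabilities, treating the counts as independent.
answer = p_z_le_2 * p_w_le_2

print(round(p_z_le_2, 3), round(p_w_le_2, 3), round(answer, 3))
```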
c) Let \( \bar{Y} \) be the average return from B over 36 days. Then \( \bar{Y} \) has
an approximately normal distribution with mean equal to 0.4 and standard
deviation equal to \( 1.96/6 \approx 0.33 \).
\[ P(\bar{Y} > 0) = P\left(Z > \frac{0 - 0.4}{0.33}\right) = P(Z > -1.21) = 0.5 + P(0 < Z < 1.21) \]
\[ \approx 0.5 + 0.3869 = 0.8869 \]
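The normal approximation in part c) can be sketched as below. The code keeps the unrounded standard deviation 1.96/6, whereas the solution rounds it to 0.33 before using the table, so the result differs slightly in the third decimal place.

```python
import math
from statistics import NormalDist

mu = 0.4       # mean daily return of B
sigma = 1.96   # standard deviation of the daily return of B
n = 36         # number of days averaged

# Standard deviation of the sample mean
se_mean = sigma / math.sqrt(n)

# P(average return > 0) under the normal approximation
p_positive = 1 - NormalDist(mu, se_mean).cdf(0)

print(round(se_mean, 3), round(p_positive, 3))
```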
d)
\[ P(X > 6 \mid X > 4) = P(X > 6)/P(X > 4) \]
\[ = 2P(X > 6) = 2P(Z > 1) = 2(0.5 - 0.3413) = 0.3174 \]
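The conditional probability in part d) can be checked numerically. The sketch assumes the return of A is normal with mean 4 and standard deviation 2; these values are not stated in the solution but are implied by the steps P(X > 4) = 0.5 and P(X > 6) = P(Z > 1).

```python
from statistics import NormalDist

# Assumed distribution of A's daily return, inferred from the solution's steps
x = NormalDist(mu=4, sigma=2)

p_gt_6 = 1 - x.cdf(6)
p_gt_4 = 1 - x.cdf(4)

# The event {X > 6} is contained in {X > 4}, so the joint probability is P(X > 6).
p_conditional = p_gt_6 / p_gt_4

print(round(p_gt_4, 2), round(p_conditional, 4))
```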
e)
\[ 0.417 \pm 1.96\sqrt{\frac{0.417 \times 0.583}{36}} \]
\[ 0.417 \pm 1.96 \times 0.082 \]
\[ 0.417 \pm 0.16 \]
The distribution changed; we can be 95% confident that the probability of
a positive return is below 0.6.
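The confidence interval for the proportion in part e) follows the usual large-sample formula. A minimal sketch, using the sample proportion 0.417 and sample size 36 from the solution:

```python
import math

p_hat = 0.417  # sample proportion of positive returns
n = 36         # number of days
z = 1.96       # 95% two-sided normal critical value

se = math.sqrt(p_hat * (1 - p_hat) / n)
margin = z * se
lower, upper = p_hat - margin, p_hat + margin

# The hypothesized probability 0.6 lies above the whole interval.
below_06 = upper < 0.6

print(round(se, 3), round(margin, 2), (round(lower, 3), round(upper, 3)), below_06)
```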
6) a) Is this a statistically significant regression at \( \alpha = 0.05 \)? (State \( H_0 \)
and \( H_1 \) and your conclusion.)
\( H_0 \): the coefficients of all explanatory variables are zero
\( H_1 \): at least one coefficient is not equal to zero
Since the p-value of the F statistic is zero (to the displayed precision), we reject \( H_0 \).
b) Is the coefficient of Oil statistically significant at \( \alpha = 0.01 \)? (State
\( H_0 \) and \( H_1 \), the test statistic, and your conclusion.) Interpret the coefficient of
Oil.
The test statistic is
\[ t = \frac{-10.791}{3.069} = -3.52 \]
Since its absolute value is larger than \( t_{0.005,9} = 3.25 \), we conclude that the
coefficient of Oil is statistically significant. All else equal, the houses heated
by Oil sell, on average, for about $10,800 less than those heated by Electricity.
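The t test for the Oil coefficient can be sketched as follows. The negative sign of the coefficient is an assumption based on the interpretation (Oil-heated houses sell for less), and the critical value 3.25 is the table value quoted in the solution rather than something computed here.

```python
coef_oil = -10.791  # assumed sign: Oil-heated houses sell for less
se_oil = 3.069      # standard error of the Oil coefficient
t_crit = 3.25       # t_{0.005,9}, taken from the solution

t_stat = coef_oil / se_oil
significant = abs(t_stat) > t_crit

print(round(t_stat, 2), significant)
```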
c) \( (0.521)^2 \approx 0.27 \)
d) age is highly correlated with hsize. When hsize is in the regression
equation, age is not needed.
e) She made the choice based on adjusted \( R^2 \) (the best subsets regression
was not provided in the original Minitab output, so you couldn't answer
this question, but see the best subsets regression below) and the plot of
standardized residuals against fitted values.
All else equal, houses heated by Oil sell on average for $11,100 less than
those heated by either Electricity or Gas.