
CHAPTER 8
THE COMPARISON OF TWO POPULATIONS
8-1.  n = 25    D̄ = 19.08    sD = 30.67
      H0: μD = 0    H1: μD ≠ 0
      t(24) = (D̄ − D0)/(sD/√n) = 19.08/(30.67/√25) = 3.11
      Reject H0 at α = 0.01.

      Template (Paired Difference Test, populations normal): size 25, average difference 19.08 (D̄),
      stdev. of difference 30.67 (sD), test statistic t = 3.1105, df = 24;
      p-value for H0: μ1 − μ2 = 0 is 0.0048 (Reject at α = 5%).
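The paired-difference arithmetic above can be cross-checked in a few lines of Python. This is a sketch assuming SciPy is available; `paired_t_from_stats` is a helper name chosen here, not something from the text or the Excel template.

```python
from math import sqrt
from scipy import stats

def paired_t_from_stats(d_bar, s_d, n, d0=0.0):
    """Paired-difference t statistic and two-tailed p-value from summary stats."""
    t = (d_bar - d0) / (s_d / sqrt(n))       # t = (D-bar - D0) / (sD / sqrt(n))
    p = 2 * stats.t.sf(abs(t), df=n - 1)     # two-tailed p-value on n - 1 df
    return t, p

t, p = paired_t_from_stats(19.08, 30.67, 25)
# t ≈ 3.1105, two-tailed p ≈ 0.0048 — reject H0 at α = 0.01
```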

8-2.  n = 40    D̄ = 5    sD = 2.3
      H0: μD = 0    H1: μD ≠ 0
      t(39) = (5 − 0)/(2.3/√40) = 13.75
      Strongly reject H0.  95% C.I. for μD: 5 ± 2.023(2.3/√40) = [4.26, 5.74].
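The confidence interval in 8-2 follows the same pattern for any confidence level. A minimal sketch assuming SciPy; `paired_ci` is a hypothetical helper name:

```python
from math import sqrt
from scipy import stats

def paired_ci(d_bar, s_d, n, conf=0.95):
    """t-based confidence interval for the mean paired difference."""
    t_crit = stats.t.ppf(1 - (1 - conf) / 2, df=n - 1)  # e.g. ≈ 2.023 for df = 39
    half = t_crit * s_d / sqrt(n)
    return d_bar - half, d_bar + half

lo, hi = paired_ci(5, 2.3, 40)
# ≈ (4.26, 5.74), matching the interval computed above
```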


8-3.  n = 12    D̄ = 3.67    sD = 2.45    (D = Movie − Commercial)
      H0: μD = 0    H1: μD ≠ 0

      (template: Testing Paired Difference.xls, sheet: Sample Data)
      Paired Difference Test: average difference 3.66667 (D̄), stdev. of difference 2.44949 (sD),
      test statistic t = 4.4907, df = 11. p-values: H0: μ1 − μ2 = 0, 0.0020 (Reject at α = 5%);
      H0: μ1 − μ2 ≥ 0, 0.9990; H0: μ1 − μ2 ≤ 0, 0.0010 (Reject).
      (The raw paired viewer data are not reproduced here.)

      At α = 0.05, we reject H0. There are more viewers for movies than commercials.
8-4.  n = 60    D̄ = 0.2    sD = 1
      H0: μD ≤ 0    H1: μD > 0
      t(59) = (0.2 − 0)/(1/√60) = 1.549.  At α = 0.05, we cannot reject H0.

      Template (Paired Difference Test): size 60, average difference 0.2, stdev. of difference 1 (sD),
      test statistic t = 1.5492, df = 59. p-values: H0: μ1 − μ2 = 0, 0.1267;
      H0: μ1 − μ2 ≥ 0, 0.9367; H0: μ1 − μ2 ≤ 0, 0.0633.

8-5.  n = 15    D̄ = 3.2    sD = 8.436    (D = After − Before)
      H0: μD ≤ 0    H1: μD > 0
      t(14) = (3.2 − 0)/(8.436/√15) = 1.469
      At α = 5%, do not reject H0. There is no evidence that the shelf facings are effective.


8-6.  n = 12    D̄ = 37.08    sD = 43.99    (D = France − Spain)
      H0: μD = 0    H1: μD ≠ 0

      (template: Testing Paired Difference.xls, sheet: Sample Data)
      Paired Difference Test: size 12, average difference 37.0833 (D̄), stdev. of difference
      43.9927 (sD), test statistic t = 2.9200, df = 11. p-values: H0: μ1 − μ2 = 0, 0.0139
      (Reject at α = 5%); H0: μ1 − μ2 ≥ 0, 0.9930; H0: μ1 − μ2 ≤ 0, 0.0070 (Reject).
      (The raw France/Spain price data are not reproduced here.)

      Reject H0. There is strong evidence that hotels in Spain are cheaper than those in France,
      based on this small sample. p-value = 0.0139
8-7.  n = 60    σD = 1.0    α = 0.01    Power at μD = 0.1
      H0: μD ≤ 0    H1: μD > 0
      C = μ0 + 2.326(σ/√n) = 0 + 2.326(1/√60) = 0.30029
      We need:
      P(D̄ > C | μD = 0.1) = P(D̄ > 0.30029 | μD = 0.1)
      = P(Z > (0.30029 − 0.1)/(1/√60))
      = P(Z > 1.551) = 0.0604
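The power computation in 8-7 is mechanical enough to script. A sketch assuming SciPy; `power_one_sided` is a name invented here:

```python
from math import sqrt
from scipy import stats

def power_one_sided(mu_alt, sigma, n, alpha=0.01, mu0=0.0):
    """Power of the one-sided z test H0: mu <= mu0 vs H1: mu > mu0,
    evaluated at the alternative mu = mu_alt (sigma known)."""
    se = sigma / sqrt(n)
    c = mu0 + stats.norm.ppf(1 - alpha) * se  # rejection cutoff for the sample mean
    return stats.norm.sf((c - mu_alt) / se)   # P(mean > c | mu = mu_alt)

power = power_one_sided(mu_alt=0.1, sigma=1.0, n=60)
# ≈ 0.0604, matching the hand computation above
```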


8-8.  n = 20    D̄ = 1.25    sD = 42.896
      H0: μD = 0    H1: μD ≠ 0
      t(19) = (1.25 − 0)/(42.89/√20) = 0.13
      Do not reject H0; no evidence of a difference.

      Template (Paired Difference Test): size 20, average difference 1.25, stdev. of difference
      42.89, test statistic t = 0.1303, df = 19; p-value for H0: μ1 − μ2 = 0 is 0.8977.

8-9.  n1 = 100    n2 = 100    x̄1 = 76.5    x̄2 = 88.1    s1 = 38    s2 = 40
      H0: μ2 − μ1 ≤ 0    H1: μ2 − μ1 > 0

      (Template: Testing Population Means.xls; need to use the t-test since the population std.
      devs. are unknown.)
      F ratio 1.10803, p-value 0.6108: the equal-variance assumption is reasonable.
      Assuming equal variances: pooled variance 1522 (s²p), test statistic t = −2.1025, df = 198.
      p-values: H0: μ1 − μ2 = 0, 0.0368 (Reject at α = 5%); H0: μ1 − μ2 ≥ 0, 0.0184 (Reject);
      H0: μ1 − μ2 ≤ 0, 0.9816.
      95% C.I. for μ1 − μ2: −11.6 ± 10.8801 = [−22.48, −0.7199]

      Reject H0. There is evidence that gasoline outperforms ethanol.
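The pooled two-sample t statistic used throughout this section can be reproduced from summary statistics alone. A sketch assuming SciPy; `pooled_t_from_stats` is a helper name chosen here:

```python
from math import sqrt
from scipy import stats

def pooled_t_from_stats(x1, s1, n1, x2, s2, n2):
    """Equal-variance two-sample t statistic from summary statistics."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)  # pooled variance
    t = (x1 - x2) / sqrt(sp2 * (1 / n1 + 1 / n2))
    df = n1 + n2 - 2
    p_two = 2 * stats.t.sf(abs(t), df)
    return t, df, p_two

t, df, p = pooled_t_from_stats(76.5, 38, 100, 88.1, 40, 100)
# t ≈ -2.1025, df = 198, two-tailed p ≈ 0.0368 — the template's values for 8-9
```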


8-10. n1 = n2 = 30
      Nikon (1):   x̄1 = 8.5    s1 = 2.1
      Minolta (2): x̄2 = 7.8    s2 = 1.8
      H0: μ1 − μ2 = 0    H1: μ1 − μ2 ≠ 0
      z = (8.5 − 7.8)/√(2.1²/30 + 1.8²/30) = 1.386
      Do not reject H0. There is no evidence of a difference in the average ratings of the two
      cameras.
8-11. Bel Air (1): n1 = 32    x̄1 = 2.5M     s1 = 0.41M
      Marin (2):   n2 = 35    x̄2 = 4.32M    s2 = 0.87M
      H0: μ1 − μ2 = 0    H1: μ1 − μ2 ≠ 0

      (Template: Testing Population Means.xls, sheet: t-test from Stats; need to use the t-test
      since the population std. devs. are unknown.)
      F ratio 4.50268, p-value 0.0001: the equal-variance assumption is questionable.
      Assuming equal variances: pooled variance 0.47609 (s²p), t = −10.7845, df = 65;
      p-value for H0: μ1 − μ2 = 0 is 0.0000 (Reject at α = 5%);
      95% C.I.: −1.82 ± 0.33704 = [−2.157, −1.48296].
      Assuming unequal variances: t = −11.101, df = 49; p-value 0.0000 (Reject);
      95% C.I.: −1.82 ± 0.32946 = [−2.1495, −1.49054].

      Reject H0. There is evidence that the average Bel Air price is lower.
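The unequal-variance row of the template can be reproduced directly with SciPy's summary-statistics t-test, which implements the Welch form when `equal_var=False`. A sketch with the 8-11 numbers:

```python
from scipy import stats

# Welch (unequal-variance) two-sample test from summary statistics,
# matching the template's "Assuming Population Variances are Unequal" row.
res = stats.ttest_ind_from_stats(mean1=2.5, std1=0.41, nobs1=32,
                                 mean2=4.32, std2=0.87, nobs2=35,
                                 equal_var=False)
# res.statistic ≈ -11.101; res.pvalue ≈ 0 (two-tailed) — reject H0
```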
8-12. (template: Testing Population Means.xls, sheet: t-test from Stats; need to use the t-test
      since the population std. devs. are unknown)
      H0: μJ − μSP = 0    H1: μJ − μSP ≠ 0

      Evidence: Sample 1: n = 40, x̄ = 15, s = 3;  Sample 2: n = 40, x̄ = 6.2, s = 3.5
      F ratio 1.36111, p-value 0.3398: the equal-variance assumption is reasonable.
      Pooled variance 10.625 (s²p), test statistic t = 12.0735, df = 78.
      p-values: H0: μ1 − μ2 = 0, 0.0000 (Reject at α = 5%); H0: μ1 − μ2 ≥ 0, 1.0000;
      H0: μ1 − μ2 ≤ 0, 0.0000 (Reject).
      95% C.I. for μ1 − μ2: 8.8 ± 1.45107 = [7.34893, 10.2511]

      Reject the null hypothesis. The global equities outperform the U.S. market.
8-13. Music:  n1 = 128    x̄1 = 23.5    s1 = 12.2
      Verbal: n2 = 212    x̄2 = 18.0    s2 = 10.5
      H0: μ1 − μ2 = 0    H1: μ1 − μ2 ≠ 0
      z = (23.5 − 18.0)/√(12.2²/128 + 10.5²/212) = 4.24
      Reject H0. Music is probably more effective.

      Template (z-test from stats, known population std. devs. 12.2 and 10.5):
      test statistic z = 4.2397, p-value 0.0000 (Reject at α = 5%).

8-14. n1 = 13    n2 = 13    x̄1 = 20.385    x̄2 = 10.385    s1 = 7.622    s2 = 4.292    α = .05
      H0: μ1 = μ2    H1: μ1 ≠ μ2
      s²p = [(13 − 1)(7.622)² + (13 − 1)(4.292)²] / (13 + 13 − 2) = 38.2581
      t(24) = (20.385 − 10.385) / √[38.2581(1/13 + 1/13)] = 4.1219
      df = 24. Use a critical value of 2.064 for a two-tailed test. Reject H0. The two methods do
      differ.
8-15. Liz (1):    n1 = 32    x̄1 = 4,238       s1 = 1,002.5
      Calvin (2): n2 = 37    x̄2 = 3,888.72    s2 = 876.05

      a. one-tailed: H0: μ1 − μ2 ≤ 0    H1: μ1 − μ2 > 0
      b. z = (4,238 − 3,888.72 − 0)/√(1,002.5²/32 + 876.05²/37) = 1.53
      c. At α = 0.05, the critical point is 1.645. Do not reject H0: we cannot conclude that Liz
         Claiborne models get more money, on the average.
      d. p-value = .5 − .437 = .063 (It is the probability of committing a Type I error if we
         choose to reject and H0 happens to be true.)
      e. (using n1 = 10, n2 = 11)
         s²p = [(10 − 1)(1,002.5)² + (11 − 1)(876.05)²] / (10 + 11 − 2) = 879,983.804
         t = (4,238 − 3,888.72) / √[879,983.804(1/10 + 1/11)] = 0.8522,  df = 19
8-16. (template: Testing Population Means.xls, sheet: t-test from Stats; need to use the t-test
      since the population std. devs. are unknown)
      H0: μ1 − μ2 = 0    H1: μ1 − μ2 ≠ 0

      Evidence: Sample 1: n = 28, x̄ = 0.19, s = 5.72;  Sample 2: n = 28, x̄ = 0.72, s = 5.1
      F ratio 1.25792, p-value 0.5552: the equal-variance assumption is reasonable.
      Pooled variance 29.3642 (s²p), test statistic t = −0.3660, df = 54.
      p-values: H0: μ1 − μ2 = 0, 0.7158; H0: μ1 − μ2 ≥ 0, 0.3579; H0: μ1 − μ2 ≤ 0, 0.6421.
      99% C.I. for μ1 − μ2: −0.53 ± 3.86682 = [−4.3968, 3.33682]

      Do not reject the null hypothesis at α = 1%. Pre-earnings announcements have no impact on
      earnings on stock investments.
8-17. Non-research (1): n1 = 255    s1 = 0.64
      Research (2):     n2 = 300    s2 = 0.85
      x̄2 − x̄1 = 2.54
      95% C.I. for μ2 − μ1: (x̄2 − x̄1) ± z.α/2 √(s1²/n1 + s2²/n2)
      = 2.54 ± 1.96 √(.64²/255 + .85²/300) = [2.416, 2.664] percent.

8-18. Audio (1): n1 = 25    x̄1 = 87    s1 = 12
      Video (2): n2 = 20    x̄2 = 64    s2 = 23
      H0: μ1 − μ2 = 0    H1: μ1 − μ2 ≠ 0
      t(43) = (x̄1 − x̄2 − 0) / √{[(n1 − 1)s1² + (n2 − 1)s2²]/(n1 + n2 − 2) · (1/n1 + 1/n2)} = 4.326
      Reject H0. Audio is probably better (higher average purchase intent). Waldenbooks should
      concentrate on audio.

      Template: pooled variance 314.116 (s²p), test statistic t = 4.3257, df = 43;
      p-value for H0: μ1 − μ2 = 0 is 0.0001 (Reject at α = 5%).

8-19. With training (1):    n1 = 13    x̄1 = 55    s1 = 8
      Without training (2): n2 = 15    x̄2 = 48    s2 = 6    (figures in $1,000s)
      H0: μ1 − μ2 ≤ 4,000    H1: μ1 − μ2 > 4,000
      t(26) = [(55 − 48) − 4] / √{[(12)(8²) + (14)(6²)]/26 · (1/13 + 1/15)} = 1.132
      The critical value at α = .05 for t(26) in a right-hand tailed test is 1.706. Since
      1.132 < 1.706, there is no evidence at α = .05 that the program executives get an average of
      $4,000 per year more than other executives of comparable levels.

8-20. (Use template: testing difference in means.xls; need to use the t-test since the population
      std. devs. are unknown)
      H0: μP − μL = 0    H1: μP − μL ≠ 0

      Evidence: Sample 1 (Prague): n = 20, x̄ = 1, s = 1.1;  Sample 2 (London): n = 20, x̄ = 6,
      s = 2.5
      F ratio 5.16529, p-value 0.0008: the variances are not equal.
      Assuming unequal variances: test statistic t = −8.1868, df = 26.
      p-values: H0: μ1 − μ2 = 0, 0.0000 (Reject at α = 5%); H0: μ1 − μ2 ≥ 0, 0.0000 (Reject);
      H0: μ1 − μ2 ≤ 0, 1.0000.
      95% C.I. for μ1 − μ2: −5 ± 1.25539 = [−6.2554, −3.74461]

      Reject the null hypothesis: the average cost of beer is cheaper in Prague. Londoners save
      between $3.74 and $6.26.
8-21. (Use template: testing difference in means.xls; need to use the t-test since the population
      std. devs. are unknown)
      H0: μ1 − μ2 = 0    H1: μ1 − μ2 ≠ 0

      Evidence: US: n = 15, x̄ = 3.8, s = 2.2;  China: n = 18, x̄ = 6.1, s = 5.3
      F ratio 5.80372, p-value 0.0018: the equal-variance assumption is violated.
      Assuming unequal variances: test statistic t = −1.676, df = 23.
      p-values: H0: μ1 − μ2 = 0, 0.1073; H0: μ1 − μ2 ≥ 0, 0.0536; H0: μ1 − μ2 ≤ 0, 0.9464.
      99% C.I. for μ1 − μ2: −2.3 ± 3.85252 = [−6.1525, 1.55252]

      Do not reject the null hypothesis (p-value = 0.1073): investment returns are the same in
      China and the US.
8-22. Old (1): n1 = 19    x̄1 = 8.26    s1 = 1.43
      New (2): n2 = 23    x̄2 = 9.11    s2 = 1.56
      H0: μ2 − μ1 ≤ 0    H1: μ2 − μ1 > 0
      t(40) = (9.11 − 8.26 − 0) / √{[18(1.43²) + 22(1.56²)]/40 · (1/19 + 1/23)} = 1.82
      Some evidence to reject H0 (p-value = 0.038) for the t-distribution with df = 40, in a
      one-tailed test.
8-23. Take the proposed route as population 1 and the alternate route as population 2. Assume
      equal variances for both populations.
      H0: μ1 − μ2 ≤ 0    H1: μ1 − μ2 > 0
      p-value from the template = 0.8674. Cannot reject H0.

8-24. (Use template: testing difference in means.xls; need to use the t-test since the population
      std. devs. are unknown)
      H0: μ1 − μ2 = 0    H1: μ1 − μ2 ≠ 0

      Evidence: Sample 1: n = 20, x̄ = 3.56, s = 2.8;  Sample 2: n = 20, x̄ = 4.84, s = 3.2
      F ratio 1.30612, p-value 0.5662: the equal-variance assumption is reasonable.
      Pooled variance 9.04 (s²p), test statistic t = −1.3463, df = 38.
      p-values: H0: μ1 − μ2 = 0, 0.1862; H0: μ1 − μ2 ≥ 0, 0.0931; H0: μ1 − μ2 ≤ 0, 0.9069.

      Do not reject the null hypothesis at α = 5%. Neither investment outperforms the other.
8-25. Yes (1): n1 = 25    x̄1 = 12      s1 = 2.5
      No (2):  n2 = 25    x̄2 = 13.5    s2 = 1
      Assume independent random sampling from normal populations with equal population variances.
      H0: μ2 − μ1 ≤ 0    H1: μ2 − μ1 > 0
      t(48) = (13.5 − 12) / √{[24(2.5²) + 24(1²)]/48 · (1/25 + 1/25)} = 2.785
      At α = 0.05, reject H0. Also reject at α = 0.01. p-value = 0.0038.

      Template: pooled variance 3.625 (s²p), test statistic t = −2.7854, df = 48;
      p-values: H0: μ1 − μ2 = 0, 0.0076 (Reject at α = 5%); H0: μ1 − μ2 ≥ 0, 0.0038 (Reject).

8-26. H0: μ1 − μ2 = 0    H1: μ1 − μ2 ≠ 0
      z = (.1331 − .105 − 0) / √{[20(.09²) + 27(.122²)]/47 · (1/21 + 1/28)} = 0.8887
      Do not reject H0. There is no evidence of a difference in average stock returns for the two
      periods.
8-27. (Use template: testing difference in means.xls; need to use the t-test since the population
      std. devs. are unknown)
      H0: μN − μO ≤ 0    H1: μN − μO > 0

      Evidence: Sample 1: n = 8, x̄ = 3, s = 2;  Sample 2: n = 10, x̄ = 2.3, s = 2.1
      F ratio 1.1025, p-value 0.9186: the equal-variance assumption is reasonable.
      Pooled variance 4.23063 (s²p), test statistic t = 0.7175, df = 16.
      p-values: H0: μ1 − μ2 = 0, 0.4834; H0: μ1 − μ2 ≥ 0, 0.7583; H0: μ1 − μ2 ≤ 0, 0.2417.

      Do not reject the null hypothesis (p-value = 0.2417). The new advertising firm has not
      resulted in significantly higher sales.
8-28. From Problem 8-25:  n1 = n2 = 25    x̄1 = 12    x̄2 = 13.5    s1 = 2.5    s2 = 1
      We want a 95% C.I. for μ2 − μ1:
      (x̄2 − x̄1) ± 2.011 √{[(n1 − 1)s1² + (n2 − 1)s2²]/(n1 + n2 − 2) · (1/n1 + 1/n2)}
      = (13.5 − 12) ± 2.011 √{[24(2.5²) + 24(1²)]/48 · (1/25 + 1/25)}
      = [0.4170, 2.5830] percent.

8-29. Before (1): n1 = 100    x1 = 85
      After (2):  n2 = 100    x2 = 68
      H0: p1 − p2 ≤ 0    H1: p1 − p2 > 0
      z = (p̂1 − p̂2) / √[p̂(1 − p̂)(1/n1 + 1/n2)] = (.85 − .68) / √[(.765)(.235)(1/100 + 1/100)]
      = 2.835
      Reject H0. On-time departure percentage has probably declined after NW's merger with
      Republic. p-value = 0.0023.

      Template: proportions 0.8500 and 0.6800, pooled p-hat 0.7650, test statistic z = 2.8351;
      p-values: H0: p1 − p2 = 0, 0.0046 (Reject at α = 5%); H0: p1 − p2 ≥ 0, 0.9977;
      H0: p1 − p2 ≤ 0, 0.0023 (Reject).

8-30.

      Small towns (1): n1 = 1,000    x1 = 850
      Big cities (2):  n2 = 2,500    x2 = 1,950
      H0: p1 − p2 ≤ 0    H1: p1 − p2 > 0
      z = (850/1,000 − 1,950/2,500) / √[(2,800/3,500)(1 − 2,800/3,500)(1/1,000 + 1/2,500)] = 4.677
      Reject H0. There is strong evidence that the percentage of word-of-mouth recommendations in
      small towns is greater than it is in large metropolitan areas.
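The pooled two-proportion z test used in 8-29 and 8-30 is a one-liner to verify. A sketch assuming SciPy; `two_prop_z` is a name chosen here, shown with the 8-30 numbers:

```python
from math import sqrt
from scipy import stats

def two_prop_z(x1, n1, x2, n2):
    """Pooled z statistic for H0: p1 - p2 = 0; one-tailed p-value for H1: p1 > p2."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)                    # combined-sample proportion
    z = (p1 - p2) / sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return z, stats.norm.sf(z)

z, p = two_prop_z(850, 1000, 1950, 2500)
# z ≈ 4.677; the one-tailed p-value is far below 0.0001 — reject H0
```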
8-31. n1 = 31    x1 = 11    n2 = 50    x2 = 19
      H0: p1 − p2 = 0    H1: p1 − p2 ≠ 0
      z = (p̂1 − p̂2) / √[p̂(1 − p̂)(1/n1 + 1/n2)] = 0.228
      Do not reject H0. There is no evidence that one corporate raider is more successful than the
      other.
8-32. Before campaign (1): n1 = 2,060    p̂1 = 0.13
      After campaign (2):  n2 = 5,000    p̂2 = 0.19
      H0: p2 − p1 ≤ .05    H1: p2 − p1 > .05
      z = (p̂2 − p̂1 − D) / √[p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2]
      = (0.19 − 0.13 − .05) / √[(.13)(.87)/2,060 + (.19)(.81)/5,000] = 1.08
      No evidence to reject H0; cannot conclude that the campaign has increased the proportion of
      people who prefer California wines by over 0.05.

8-33. 95% C.I. for p2 − p1:
      (p̂2 − p̂1) ± 1.96 √[p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2]
      = .06 ± 1.96 √[(.13)(.87)/2,060 + (.19)(.81)/5,000] = [0.0419, 0.0781]
      We are 95% confident that the increase in the proportion of the population preferring
      California wines is anywhere from 4.19% to 7.81%.
      Template (95% confidence interval): 0.0600 ± 0.0181 = [0.0419, 0.0782]

8-34.

      The statement to be tested must be hypothesized before looking at the data:
      Chase Man. (1):  n1 = 650    x1 = 48
      Manuf. Han. (2): n2 = 480    x2 = 20
      H0: p1 − p2 ≤ 0    H1: p1 − p2 > 0
      z = (p̂1 − p̂2) / √[p̂(1 − p̂)(1/n1 + 1/n2)] = 2.248
      Reject H0. p-value = 0.0122.
8-35. American execs (1): n1 = 120    x1 = 34
      European execs (2): n2 = 200    x2 = 41
      H0: p1 − p2 ≤ 0    H1: p1 − p2 > 0
      z = (.283 − .205) / √[(.234)(1 − .234)(1/120 + 1/200)] = 1.601
      At α = 0.05, there is no evidence to conclude that the proportion of American executives who
      prefer the A380 is greater than that of European executives. (p-value = 0.0547.)

      Template: proportions 0.2833 and 0.2050, pooled p-hat 0.2344, test statistic z = 1.6015;
      p-values: H0: p1 − p2 = 0, 0.1093; H0: p1 − p2 ≥ 0, 0.9454; H0: p1 − p2 ≤ 0, 0.0546.

8-36.

      Cleveland (1): n1 = 1,000    x1 = 75    p̂1 = .075
      Chicago (2):   n2 = 1,000    x2 = 72    p̂2 = .072
      H0: p1 − p2 = 0    H1: p1 − p2 ≠ 0
      p̂ = (72 + 75)/2,000 = .0735
      z = (p̂1 − p̂2) / √[p̂(1 − p̂)(1/n1 + 1/n2)] = 0.257
      We cannot reject H0. p-value = 0.7971

8-37. (Use template: testing difference in proportions.xls)
      H0: pQ − pN = 0    H1: pQ − pN ≠ 0

      Evidence: Sample 1: n = 100, x = 18, p̂ = 0.1800;  Sample 2: n = 100, x = 6, p̂ = 0.0600
      Pooled p-hat 0.1200, test statistic z = 2.6112; p-value for H0: p1 − p2 = 0 is 0.0090
      (Reject at α = 5%).

      Reject the null hypothesis: the new accounting method is more effective.
8-38. (Use template: testing difference in proportions.xls)
      H0: pC − pD = 0    H1: pC − pD ≠ 0

      Evidence: Sample 1: n = 100, x = 32, p̂ = 0.3200;  Sample 2: n = 100, x = 19, p̂ = 0.1900
      Pooled p-hat 0.2550, test statistic z = 2.1090; p-value for H0: p1 − p2 = 0 is 0.0349.

      Do not reject the null hypothesis at α = 1%: the proportions are not significantly different.
8-39. Motorola (1):  n1 = 120    x1 = 101    p̂1 = .842
      Blaupunkt (2): n2 = 200    x2 = 110    p̂2 = .550
      H0: p1 ≤ p2    H1: p1 > p2
      p̂ = (101 + 110)/320 = .659
      z = (.842 − .550) / √[(.659)(1 − .659)(1/120 + 1/200)] = 5.33
      Strongly reject H0; Motorola's system is superior (p-value is very small).


8-40. Old method (1): n1 = 40    s1² = 1,288
      New method (2): n2 = 15    s2² = 1,112
      H0: σ1² ≤ σ2²    H1: σ1² > σ2²    (use α = .05)
      F(39,14) = s1²/s2² = 1,288/1,112 = 1.158
      The critical point at α = .05 is F(39,14) = 2.27 (using approximate df in the table). Do not
      reject H0. There is no evidence that the variance of the new production method is smaller.

      F-Test for Equality of Variances: sizes 40 and 15, variances 1288 and 1112;
      test statistic F = 1.158273, df1 = 39, df2 = 14.
      p-values: H0: σ1² − σ2² = 0, 0.7977; H0: σ1² − σ2² ≥ 0, 0.6012; H0: σ1² − σ2² ≤ 0, 0.3988.

8-41.

      Test the equal-variance assumption of Problem 8-27:
      H0: σ1² = σ2²    H1: σ1² ≠ σ2²
      F = 1.1025
      Template (Assumptions — Populations Normal; H0: Population Variances Equal):
      F ratio 1.1025, p-value 0.9186.
      Do not reject H0. The variances are equal.
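The variance-ratio tests in 8-40 and 8-41 can be scripted from the sample variances alone. A sketch assuming SciPy; `f_test_var` is a name invented here, shown with the 8-40 numbers:

```python
from scipy import stats

def f_test_var(var1, n1, var2, n2):
    """F statistic var1/var2 and right-tail p-value for H1: sigma1^2 > sigma2^2."""
    f = var1 / var2
    return f, stats.f.sf(f, n1 - 1, n2 - 1)  # P(F(n1-1, n2-1) >= f)

f, p = f_test_var(1288, 40, 1112, 15)
# F ≈ 1.158 on (39, 14) df; right-tail p ≈ 0.40 — do not reject H0
```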


8-42. Yes (1): n1 = 25    s1 = 2.5
      No (2):  n2 = 25    s2 = 1
      H0: σ1² = σ2²    H1: σ1² ≠ σ2²
      Put the larger s² in the numerator and use α/2:
      F(24,24) = s1²/s2² = (2.5)²/(1)² = 6.25
      From the F table using α = .01, the critical point is F(24,24) = 2.66. Therefore, reject H0.
      The population variances are not equal at α = 2(.01) = 0.02.

      F-Test for Equality of Variances: sizes 25 and 25, variances 6.25 and 1;
      test statistic F = 6.25, df1 = 24, df2 = 24; p-value for H0: σ1² − σ2² = 0 is 0.0000
      (Reject at α = 5%).

8-43. n1 = 21    s1 = .09    n2 = 28    s2 = .122
      F(27,20) = (.122)²/(.09)² = 1.838
      At α = .10, we cannot reject H0 because the critical point for α = .05 from the table with
      df = (30, 20) is 2.04 and for df = (24, 20) it is 2.08. We did not reject H0 at α = .10, so
      we would also not reject it at α = .02. Hence this particular C.I. contains the value 1.00.
8-44. Before (1): n1 = 12    s1² = 16,390.545
      After (2):  n2 = 11    s2² = 86,845.764
      H0: σ1² = σ2²    H1: σ1² ≠ σ2²
      F(10,11) = 86,845.764/16,390.545 = 5.298
      The critical point from the table, using α = .01, is F(10,11) = 4.54. Therefore, reject H0.
      The population variances are probably not equal. p-value < .02 (double the α).

      F-Test for Equality of Variances: sizes 11 and 12, variances 86845.76 and 16390.55;
      test statistic F = 5.298528, df1 = 10, df2 = 11.
      p-values: H0: σ1² − σ2² = 0, 0.0109 (Reject at α = 1%); H0: σ1² − σ2² ≥ 0, 0.9945;
      H0: σ1² − σ2² ≤ 0, 0.0055 (Reject).

8-45. n1 = 25    s1 = 2.5    n2 = 25    s2 = 3.1
      H0: σ1² = σ2²    H1: σ1² ≠ σ2²    α = .02
      F(24,24) = (3.1)²/(2.5)² = 1.538
      From the table: F.01(24,24) = 2.66. Do not reject H0. There is no evidence that the
      variances in the two waiting lines are unequal.
8-46. nA = 25    sA² = 6.52    nB = 22    sB² = 3.47
      H0: σA² = σB²    H1: σA² > σB²    α = .01
      F(24,21) = 6.52/3.47 = 1.879
      The critical point for α = .01 is F(24,21) = 2.80. Do not reject H0. There is no evidence
      that stock A is riskier than stock B.

      F-Test for Equality of Variances: sizes 25 and 22, variances 6.52 and 3.47;
      test statistic F = 1.878963, df1 = 24, df2 = 21.
      p-values: H0: σ1² − σ2² = 0, 0.1485; H0: σ1² − σ2² ≥ 0, 0.9258; H0: σ1² − σ2² ≤ 0, 0.0742.

8-47.

The assumptions we need are: independent random sampling from the populations in question,
and normal population distributions. The normality assumption is not terribly crucial as long as
no serious violations of this assumption exist. In time series data, the assumption of random
sampling is often violated when the observations are dependent on each other through time. We
must be careful.

8-48. (Use template: testing difference in means.xls; need to use the t-test since the population
      std. devs. are unknown)
      H0: μLeg − μKnee = 0    H1: μLeg − μKnee ≠ 0

      Evidence: Sample 1: n = 200, x̄ = 10402, s = 8500;  Sample 2: n = 200, x̄ = 11359, s = 9100
      F ratio 1.14616, p-value 0.3367: the equal-variance assumption is reasonable.
      Pooled variance 7.8E+07 (s²p), test statistic t = −1.0869, df = 398.
      p-values: H0: μ1 − μ2 = 0, 0.2778; H0: μ1 − μ2 ≥ 0, 0.1389; H0: μ1 − μ2 ≤ 0, 0.8611.

      Do not reject the null hypothesis at α = 5%. The average costs of the two procedures are
      similar.
8-49. 99% C.I. for μLeg − μKnee:
      −957 ± 2278.97 = [−3235.97, 1321.97]
      The C.I. contains zero, as expected from the results of Problem 8-48.
8-50. n = 11    Σd = 51    d̄ = 4.636    sd = 7.593
      H0: μd ≤ 0    H1: μd > 0
      t(10) = 4.636/(7.593/√11) = 2.025
      Reject H0. Performance did improve after the sessions.

8-51. For Problem 8-50, the 95% C.I. is D̄ ± t.025(10) sd/√n:
      = 4.636 ± 2.228(7.593/√11) = 4.636 ± 5.101 = [−0.465, 9.737]

      Template (95% confidence interval): 4.636 ± 5.10105 = [−0.465, 9.73705]

8-52. (Use template: testing difference in proportions.xls)
      H0: pNFL − pSCI = 0    H1: pNFL − pSCI ≠ 0

      Evidence: Sample 1: n = 200, x = 96, p̂ = 0.4800;  Sample 2: n = 200, x = 52, p̂ = 0.2600
      Pooled p-hat 0.3700, test statistic z = 4.5567.
      p-values: H0: p1 − p2 = 0, 0.0000 (Reject at α = 5%); H0: p1 − p2 ≥ 0, 1.0000;
      H0: p1 − p2 ≤ 0, 0.0000 (Reject).

      Reject H0. There is evidence that NFL viewers watch more commercials than those viewing
      Survivor.
8-53. 99% C.I. for pNFL − pSCI (the difference between viewing commercials for NFL viewers vs.
      Survivor viewers): 0.2200 ± 0.1211 = [0.0989, 0.3411]
      The C.I. does not contain zero, as expected.

8-54. (Use template: testing difference in means.xls; need to use the t-test since the population
      std. devs. are unknown)
      H0: μCR − μGuat = 0    H1: μCR − μGuat ≠ 0

      Evidence: Sample 1: n = 15, x̄ = 1242, s = 50;  Sample 2: n = 15, x̄ = 1240, s = 50
      F ratio 1, p-value 1.0000: the equal-variance assumption is reasonable.
      Pooled variance 2500 (s²p), test statistic t = 0.1095, df = 28.
      p-values: H0: μ1 − μ2 = 0, 0.9136; H0: μ1 − μ2 ≥ 0, 0.5432; H0: μ1 − μ2 ≤ 0, 0.4568.

      Do not reject the null hypothesis at α = 5%. The number of roses imported from both
      countries is about the same.

8-55. n1 = 80    x1 = 60    n2 = 100    x2 = 65
      p̂ = 125/180 = .6944
      H0: p1 − p2 = 0    H1: p1 − p2 ≠ 0
      z = (p̂1 − p̂2 − 0) / √[p̂(1 − p̂)(1/n1 + 1/n2)]
      = (.75 − .65) / √[(.6944)(1 − .6944)(1/80 + 1/100)] = 1.447
      Do not reject H0. There is no evidence that one movie will be more successful than the other
      (p-value = 0.1478).

      Template: proportions 0.7500 and 0.6500, pooled p-hat 0.6944, test statistic z = 1.4473;
      p-value for H0: p1 − p2 = 0 is 0.1478.

8-56.

      95% C.I. for the difference between the two population proportions:
      (p̂1 − p̂2) ± 1.96 √[p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2]
      = 0.10 ± 1.96 √[(.75)(.25)/80 + (.65)(.35)/100] = [−0.0332, 0.2332]
      Yes, 0 is in the C.I., as expected from the results of Problem 8-55.
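The unpooled interval used in 8-56 (and 8-33, 8-69) is easy to verify numerically. A sketch assuming SciPy; `prop_diff_ci` is a helper name chosen here, shown with the 8-55 data:

```python
from math import sqrt
from scipy import stats

def prop_diff_ci(x1, n1, x2, n2, conf=0.95):
    """Unpooled z-based confidence interval for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    z = stats.norm.ppf(1 - (1 - conf) / 2)  # ≈ 1.96 for 95%
    half = z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) - half, (p1 - p2) + half

lo, hi = prop_diff_ci(60, 80, 65, 100)
# ≈ (-0.0332, 0.2332): zero is inside, consistent with not rejecting H0 in 8-55
```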


8-57. K: nK = 12    x̄K = 12.55     sK = .7342281
      L: nL = 12    x̄L = 11.925    sL = .3078517
      H0: μK − μL = 0    H1: μK − μL ≠ 0
      t(22) = (12.55 − 11.925) / √{[11(.7342281²) + 11(.3078517²)]/22 · (1/12 + 1/12)} = 2.719
      Reject H0. The critical points for t(22) at α = .02 are ±2.508; at α = .01 they are ±2.819.
      So .01 < p-value < .02. The L-boat is probably faster.

      Template: pooled variance 0.31693 (s²p), test statistic t = 2.7194, df = 22;
      p-value for H0: μ1 − μ2 = 0 is 0.0125 (Reject at α = 5%).

8-58.

      Do Problem 8-57 with the data taken as paired. For the differences K − L:
      n = 12    D̄ = .625    sD = .7723929
      (The 12 raw differences are not reproduced here.)
      t(11) = (.625 − 0)/(.7723929/√12) = 2.803
      2.718 < 2.803 < 3.106 (between the critical points of t(11) for α = .01 and .02).
      Hence .01 < p-value < .02, as before in Problem 8-57 (the pairing did not help much here;
      we reach the same conclusion).

      Template (Paired Difference Test): size 12, average difference 0.625, stdev. of difference
      0.77239, test statistic t = 2.8031, df = 11. p-values: H0: μ1 − μ2 = 0, 0.0172 (Reject at
      α = 5%); H0: μ1 − μ2 ≥ 0, 0.9914; H0: μ1 − μ2 ≤ 0, 0.0086 (Reject).

8-59.

      (Use template: testing difference in proportions.xls)
      H0: pWest − pSouth = 0    H1: pWest − pSouth ≠ 0

      Evidence: Sample 1: n = 1000, x = 49.5, p̂ = 0.0495;  Sample 2: n = 1000, x = 67.9,
      p̂ = 0.0679
      Pooled p-hat 0.0587, test statistic z = −1.7503; p-value for H0: p1 − p2 = 0 is 0.0801.

      Do not reject the null hypothesis at α = 5%: the delinquency rates are the same.
8-60. IIT (1):        n1 = 100    p̂1 = 0.94
      Competitor (2): n2 = 125    p̂2 = 0.92
      H0: p1 − p2 = 0    H1: p1 − p2 ≠ 0
      p̂ = (94 + 115)/225 = .9289
      z = .02 / √[(.9289)(1 − .9289)(1/100 + 1/125)] = 0.58
      Do not reject H0. There is no evidence that one program is more successful than the other.
8-61. Design (1): n1 = 15    x̄1 = 2.17333      s1 = .3750555
      Design (2): n2 = 13    x̄2 = 2.5153846    s2 = .3508232
      H0: μ2 − μ1 = 0    H1: μ2 − μ1 ≠ 0
      t(26) = (2.5153846 − 2.173333) / √{[14(.3750555²) + 12(.3508232²)]/26 · (1/15 + 1/13)}
      = 2.479
      p-value = .02. Reject H0. Design 1 is probably faster.


8-62. H0: σ1² = σ2²    H1: σ1² ≠ σ2²
      F(14,12) = s1²/s2² = (.3750555)²/(.3508232)² = 1.143
      Do not reject H0 at α = 0.10 (since 1.143 < 2.62; it is also < 2.10, so the p-value > 0.20).
      The solution of Problem 8-61 is valid with respect to the equal-variance requirement.

8-63. A = After:  nA = 16    x̄A = 91.75      sA = 5.0265959
      B = Before: nB = 15    x̄B = 84.7333    sB = 5.3514573
      H0: μA − μB ≤ 5    H1: μA − μB > 5
      t(29) = (91.75 − 84.733 − 5) / √{[15(5.0265959²) + 14(5.3514573²)]/29 · (1/16 + 1/15)}
      = 1.08
      Do not reject H0. There is no evidence that advertising is effective.


8-64. H0: σ1² = σ2²    H1: σ1² ≠ σ2²
      F(14,15) = (5.3514573)²/(5.0265959)² = 1.133
      Do not reject H0 at α = 0.10. There is no evidence that the population variances are not
      equal.

      F-Test for Equality of Variances: sizes 15 and 16, variances 28.6381 and 25.26667;
      test statistic F = 1.133434, df1 = 14, df2 = 15.
      p-values: H0: σ1² − σ2² = 0, 0.8100; H0: σ1² − σ2² ≥ 0, 0.5950; H0: σ1² − σ2² ≤ 0, 0.4050.

8-65.

      From Problem 8-48:  sL = 8500    sK = 9100
      H0: σL² = σK²    H1: σL² ≠ σK²
      F = 9100²/8500² = 1.146    p = 0.34
      Do not reject the null hypothesis of equal variances.
      (Template: F ratio 1.14616, p-value 0.3367.)

8-66. H0: σK² = σL²    H1: σK² ≠ σL²
      F(11,11) = (.7342281)²/(.3078517)² = 5.688
      The critical point for α = 0.02 is about 4.5. Therefore, reject H0. Thus the analysis in
      Problem 8-57 is not valid, and we need to use the unequal-variance test. That test also
      gives t = 2.719, but the df are obtained using Equation (8-6):
      df = (s1²/n1 + s2²/n2)² / [(s1²/n1)²/(n1 − 1) + (s2²/n2)²/(n2 − 1)]
      = approximately 14 (rounded downward).
      t.02(14) = 2.624 < 2.719 < 2.977 = t.01(14), hence 0.01 < p-value < 0.02. Reject H0.
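The Equation (8-6) degrees-of-freedom formula is worth checking numerically, since the rounding step matters. A minimal sketch (plain Python; `welch_df` is a name invented here), using the 8-57 standard deviations:

```python
def welch_df(s1, n1, s2, n2):
    """Satterthwaite-style approximate df for the unequal-variance t test,
    as in Equation (8-6): (v1 + v2)^2 / (v1^2/(n1-1) + v2^2/(n2-1))."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

df = welch_df(0.7342281, 12, 0.3078517, 12)
# ≈ 14.75, rounded down to 14 as in the solution above
```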
8-67. Differences A − B: n = 16    D̄ = −2.375    sD = 9.7425185
      (The 16 raw differences are not reproduced here.)
      H0: μD = 0    H1: μD ≠ 0
      t(15) = (−2.375 − 0)/(9.7425185/√16) = −0.9751
      Do not reject H0. There is no evidence that one package is better liked than the other.

      Template (Paired Difference Test): size 16, average difference −2.375, stdev. of difference
      9.74252, test statistic t = −0.9751, df = 15. p-values: H0: μ1 − μ2 = 0, 0.3450;
      H0: μ1 − μ2 ≥ 0, 0.1725; H0: μ1 − μ2 ≤ 0, 0.8275.

8-68.

      Supplier A: nA = 200    xA = 12
      Supplier B: nB = 250    xB = 38
      H0: pA − pB = 0    H1: pA − pB ≠ 0
      p̂ = (12 + 38)/450 = .1111
      z = (p̂A − p̂B − 0) / √[p̂(1 − p̂)(1/nA + 1/nB)]
      = (.06 − .152) / √[(.1111)(.8889)(1/200 + 1/250)] = −3.086
      Reject H0. p-value = .002. Supplier A is probably more reliable, as its proportion of
      defective components is lower.
8-69. 95% C.I. for the difference in the proportion of defective items for the two suppliers:
      (p̂B − p̂A) ± 1.96 √[p̂A(1 − p̂A)/nA + p̂B(1 − p̂B)/nB]
      = .092 ± 1.96(.0282415) = [0.0366, 0.1474]

      Template (95% confidence interval): 0.0920 ± 0.0554 = [0.0366, 0.1474]

8-70.

90% C.I. for the difference in average occupancy rate at the Westin Plaza Hotel before and after
the advertising:
2

15(5.0265959) 14(5.3514573) 1
1
( x B x A ) 1.699

29
15 16

= 7.016667 3.1666375 = [3.85, 10.18] percent occupancy.
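Both interval half-widths above (the unpooled-proportion interval in 8-69 and the pooled-variance interval in 8-70, with t.05(29) = 1.699) can be reproduced directly; a sketch:

```python
import math

# 8-69: 95% C.I. half-width for the difference in defective proportions
p_a, n_a = 0.06, 200
p_b, n_b = 0.152, 250
half_69 = 1.96 * math.sqrt(p_a*(1 - p_a)/n_a + p_b*(1 - p_b)/n_b)

# 8-70: 90% C.I. half-width using the pooled variance; t.05(29) = 1.699
n1, s1 = 16, 5.0265959
n2, s2 = 15, 5.3514573
sp2 = ((n1 - 1)*s1**2 + (n2 - 1)*s2**2) / (n1 + n2 - 2)
half_70 = 1.699 * math.sqrt(sp2 * (1/n1 + 1/n2))

print(round(half_69, 4), round(half_70, 4))   # 0.0554 3.1666
```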


8-71.  (Use template: testing difference in means.xls)
(Need to use the t-test since the population std. dev. is unknown.)
H0: μB − μO = 0   H1: μB − μO ≠ 0
Evidence: Sample 1: n = 25, x̄ = 60, s = 14; Sample 2: n = 20, x̄ = 65, s = 8
Test of equal variances: F ratio = 3.0625, p-value = 0.0155. The assumption of equal variances is violated.
Assuming population variances are unequal: Test Statistic t = −1.5048, df = 39
H0: μ1 − μ2 = 0: p-value = 0.1404; H0: μ1 − μ2 ≥ 0: p-value = 0.0702; H0: μ1 − μ2 ≤ 0: p-value = 0.9298
At α = 5%, do not reject the null hypothesis. The price of the two virtual dolls is about the same.
8-72.  (Use template: testing difference in means.xls)
(Need to use the t-test since the population std. dev. is unknown.)
H0: μA − μB = 0   H1: μA − μB ≠ 0
Evidence: Sample 1: n = 74, x̄ = 28, s = 6; Sample 2: n = 65, x̄ = 22, s = 6
Test of equal variances: F ratio = 1, p-value = 1.0000. Assume equal variances.
Pooled Variance = 36; Test Statistic t = 5.8825; df = 137
At α = 5%: H0: μ1 − μ2 = 0: p-value = 0.0000, Reject; H0: μ1 − μ2 ≥ 0: p-value = 1.0000; H0: μ1 − μ2 ≤ 0: p-value = 0.0000, Reject
Reject the null hypothesis: the average returns are not the same.


8-73.  (Use template: testing difference in means.xls; sheet: t-test from stats)
H0: μ2 − μ1 = 0   H1: μ2 − μ1 ≠ 0
Evidence: Sample 1: n = 74, x̄ = 50, s = 20; Sample 2: n = 65, x̄ = 14, s = 8
Test of equal variances: F ratio = 6.25, p-value = 0.0000. The assumption of equal variances is violated.
Assuming population variances are unequal: Test Statistic t = 14.2414, df = 98
At α = 5%: H0: μ1 − μ2 = 0: p-value = 0.0000, Reject; H0: μ1 − μ2 ≥ 0: p-value = 1.0000; H0: μ1 − μ2 ≤ 0: p-value = 0.0000, Reject
95% Confidence Interval for the difference in population means: 36 ± 5.01643 = [30.9836, 41.0164]
The 95% CI: [$30.98M, $41.02M]


8-74.  a. n1 = 2500, x̄1 = 39; n2 = 2500, x̄2 = 35; s1 = s2 = 2; α = .05
H0: μ1 = μ2   H1: μ1 ≠ μ2
z = (39 − 35)/√(2²/2500 + 2²/2500) = 70.711
Reject H0. The average workweek has shortened.
b. 95% C.I.: (39 − 35) ± 1.96√(2²/2500 + 2²/2500) = 4 ± .1109 = [3.8891, 4.1109]
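Both parts use the same standard error of the difference; a quick check in Python:

```python
import math

n1 = n2 = 2500
xbar1, xbar2 = 39, 35
s = 2                                 # common standard deviation

se = math.sqrt(s**2/n1 + s**2/n2)
z = (xbar1 - xbar2) / se
half = 1.96 * se                      # 95% C.I. half-width
print(round(z, 3), round(half, 4))    # 70.711 0.1109
```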

8-75.  (Use template: testing difference in means.xls; sheet: t-test from stats)
H0: μ2 − μ1 = 0   H1: μ2 − μ1 ≠ 0
Evidence: Sample 1: n = 25, x̄ = 1.7, s = 0.4; Sample 2: n = 25, x̄ = 1.5, s = 0.7
Test of equal variances: F ratio = 3.0625, p-value = 0.0081. The assumption of equal variances is violated.
Assuming population variances are unequal: Test Statistic t = 1.24035, df = 38
H0: μ1 − μ2 = 0: p-value = 0.2225
At α = 5%, do not reject the null hypothesis. The mean catches are about the same. p-value = 0.2225


8-76.  Yes. Lower-income households are less likely to have internet access. (p-value = 0.0038)
Comparing Two Population Proportions:
Evidence: Sample 1: n = 500, x = 350, p̂ = 0.7000; Sample 2: n = 500, x = 310, p̂ = 0.6200
Hypothesized difference zero; pooled p̂ = 0.6600; Test Statistic z = 2.6702
At α = 5%: H0: p1 − p2 = 0: p-value = 0.0076, Reject; H0: p1 − p2 ≥ 0: p-value = 0.9962; H0: p1 − p2 ≤ 0: p-value = 0.0038, Reject

8-77.  The 95% C.I. contains 0, which supports the results from 8-75.
95% Confidence Interval for the difference in population means: 0.2 ± 0.32642 = [−0.1264, 0.5264]

8-78.  The ratio of the variances is 3.18. The degrees of freedom for both samples are 10 − 1 = 9. Using the F table with 9 degrees of freedom in both the numerator and the denominator, we find a value of 3.18 at α = 0.05. Therefore, there is a 5% chance of observing a ratio at least this large when the two population variances are equal.

8-79.  (Use template: testing difference in means.xls; sheet: t-test from data)
1. Assuming equal variances:
H0: μ2 − μ1 = 0   H1: μ2 − μ1 ≠ 0
Data (Co.1, n = 11; Co.2, n = 9): 2570, 2480, 2870, 2975, 2055, 2940, 2850, 2475, 2660, 1940, 2380, 2590, 2550, 2485, 2585, 2710, 2100, 2655, 1950, 2115
Evidence: Sample 1: n = 11, x̄ = 2623.18, s = 174.087; Sample 2: n = 9, x̄ = 2342.22, s = 393.55
Pooled Variance = 85673.3; Test Statistic t = 2.1356; df = 18
At α = 5%: H0: μ1 − μ2 = 0: p-value = 0.0467, Reject; H0: μ1 − μ2 ≥ 0: p-value = 0.9766; H0: μ1 − μ2 ≤ 0: p-value = 0.0234, Reject
At the 0.05 level of significance, reject the null hypothesis that the charges are the same.

2. Test the assumption of equal variances:
H0: σ1² = σ2²   H1: σ1² ≠ σ2²
F ratio = 5.11054, p-value = 0.0193
Reject the null hypothesis: the variances are not equal.

3. Assuming unequal variances:
H0: μ2 − μ1 = 0   H1: μ2 − μ1 ≠ 0
Test Statistic t = 1.98846; df = 10
H0: μ1 − μ2 = 0: p-value = 0.0748; H0: μ1 − μ2 ≥ 0: p-value = 0.9626; H0: μ1 − μ2 ≤ 0: p-value = 0.0374, Reject
Do not reject the null hypothesis: the charges are not significantly different.


Case 10: Tiresome Tires II
1) Do not reject the null hypothesis at α = 5%.
Evidence: Sample 1: n = 40, x̄ = 2742.5, s = 32.8883; Sample 2: n = 40, x̄ = 2729.35, s = 38.3189
Test of equal variances: F ratio = 1.16512, p-value = 0.6356; assume equal variances.
Pooled Variance = 1274.99; Test Statistic t = 1.6470; df = 78
H0: μ1 − μ2 ≤ 0: p-value = 0.0518. At α = 5%, do not reject.
95% Confidence Interval for the difference in population means: 13.15 ± 15.8956

2) Increasing α would decrease β. Increasing α to any value above 5.18% will cause the null hypothesis to be rejected.

3) Paired difference test: Reject the null hypothesis (p-value = 0.0471).
Data: 40 paired observations (Old Method, New Method); difference defined as Sample 1 − Sample 2.
Average Difference = 13.15; Stdev. of Difference = 48.4877; Test Statistic t = 1.7152; df = 39
H0: μ1 − μ2 ≤ 0: p-value = 0.0471. At α = 5%, Reject.

4) Reducing the variance of the new process will decrease the chances of a Type I error.
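The paired-difference statistic in part 3 follows directly from the summary values; a quick check in Python:

```python
import math

n = 40
d_bar = 13.15      # average difference, old method - new method
s_d = 48.4877      # standard deviation of the differences

t = d_bar / (s_d / math.sqrt(n))
print(round(t, 4))   # 1.7152
```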


CHAPTER 9
ANALYSIS OF VARIANCE
9-1.  H0: μ1 = μ2 = μ3 = μ4
H1: not all four population means are equal
(Dot plots of four sample configurations for groups 1 through 4:)
All 4 are different
2 equal; 2 different
3 equal; 1 different
2 equal; other 2 equal but different from first 2

9-2.

ANOVA assumptions: normal populations with equal variance. Independent random sampling
from the r populations.

9-3.

A series of paired t-tests would be dependent on each other. There is no control over the probability of a Type I error for the joint series of tests.

9-4.  r = 5; n1 = n2 = . . . = n5 = 21; n = 105
dfs of F are 4 and 100. Computed F = 3.6. The p-value is close to 0.01. Reject H0. There is evidence that not all 5 plants have equal average output.
F Distribution (1-Tail) critical values: 10%: 2.0019; 5%: 2.4626; 1%: 3.5127; 0.50%: 3.9634

9-5.  r = 4; n1 = 52, n2 = 38, n3 = 43, n4 = 47
Computed F = 12.53. Reject H0. The average price per lot is not equal at all 4 cities. Feel very strongly about rejecting the null hypothesis, as the critical point of F(3,176) for α = .01 is approximately 3.8.
F Distribution (1-Tail) critical values: 10%: 2.1152; 5%: 2.6559; 1%: 3.8948; 0.50%: 4.4264

9-6.

Originally, treatments referred to the different types of agricultural experiments being performed on a crop; today the term is used interchangeably to refer to the different populations in the study. Errors are the differences between the data points and their sample means.

9-7.

Because the sum of all the deviations from a mean is equal to 0.


9-8.
9-9.

Total deviation = xij x = ( x i x ) + x ij xi


= treatment deviation + error deviation.
The sum of squares principle says that the sum of the squared total deviations of all the data
points is equal to the sum of the squared treatment deviations plus the sum of all squared error
deviations in the data.

9-10.

An error is any deviation from a sample mean that is not explained by differences among populations. An error may be due to a host of factors not studied in the experiment.

9-11.

Both MSTR and MSE are sample statistics subject to natural variation about their own means. (If x̄ > μ0 we cannot immediately reject H0 in a single-sample case either.)

9-12.

The main principle of ANOVA is that if the r population means are not all equal then it is likely
that the variation of the data points about their sample means will be small compared to the
variation of the sample means about the grand mean.

9-13.

Distances among populations means manifest themselves in treatment deviations that are large
relative to error deviations. When these deviations are squared, added, and then divided by dfs,
they give two variances. When the treatment variance is (significantly) greater than the error
variance, population mean differences are likely to exist.

9-14.

a) degrees of freedom for Factor: 4 − 1 = 3
b) degrees of freedom for Error: 80 − 4 = 76
c) degrees of freedom for Total: 80 − 1 = 79

9-15.  SST = SSTR + SSE, but MST = SST/(n − 1) does not equal MSTR + MSE. A counterexample:
Let n = 21, r = 6, SST = 100, SSTR = 85, SSE = 15.
Then SST = SSTR + SSE = 85 + 15 = 100.
But SST/(n − 1) = 100/20 = 5, while MSTR + MSE = SSTR/(r − 1) + SSE/(n − r) = 85/5 + 15/15 = 18.
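The counterexample's arithmetic, checked in Python:

```python
n, r = 21, 6
SST, SSTR, SSE = 100, 85, 15

assert SST == SSTR + SSE      # the sums of squares are additive

MST  = SST / (n - 1)          # 5.0
MSTR = SSTR / (r - 1)         # 17.0
MSE  = SSE / (n - r)          # 1.0
print(MST, MSTR + MSE)        # 5.0 18.0 -- the mean squares are not additive
```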

9-16.

When the null hypothesis of ANOVA is false, the ratio MSTR/MSE is not the ratio of two independent, unbiased estimators of the common population variance σ², hence this ratio does not follow an F distribution.

9-17.  For each observation xij, we know that (tot.) = (treat.) + (error):
xij − x̄ = (x̄i − x̄) + (xij − x̄i)
Squaring both sides of the equation:
(xij − x̄)² = (x̄i − x̄)² + 2(x̄i − x̄)(xij − x̄i) + (xij − x̄i)²
Now sum this over all observations (all treatments i = 1, . . . , r; and within treatment i, all observations j = 1, . . . , ni):
Σi Σj (xij − x̄)² = Σi Σj (x̄i − x̄)² + Σi Σj 2(x̄i − x̄)(xij − x̄i) + Σi Σj (xij − x̄i)²
Notice that the first sum on the R.H.S. equals Σi ni(x̄i − x̄)², since for each i the summand doesn't vary over the ni values of j. Similarly, the second sum is 2 Σi [(x̄i − x̄) Σj (xij − x̄i)]. But for each fixed i, Σj (xij − x̄i) = 0, since this is just the sum of all deviations from the mean within treatment i. Thus the whole second sum on the R.H.S. above is 0, and the equation is now
Σi Σj (xij − x̄)² = Σi ni(x̄i − x̄)² + Σi Σj (xij − x̄i)²
which is precisely Equation (9-12).


9-18.  (From Minitab):
Source     df  SS      MS      F
Treatment  2   381127  190563  20.71
Error      27  248460  9202
Total      29  629587
The critical point for F(2,27) at α = 0.01 is 5.49. Therefore, reject H0. The average range of the 3 prototype planes is probably not equal.

ANOVA Table (α = 5%):
Source   SS      df  MS         F        Fcritical  p-value
Between  381127  2   190563.33  20.7084  3.3541     0.0000  Reject
Within   248460  27  9202.2222
Total    629587  29

9-19.

(Template: Anova.xls, sheet: 1-way):
ANOVA Table (α = 5%):
Source   SS       df  MS      F       Fcritical  p-value
Between  187.696  3   62.565  11.494  2.9467     0.0000  Reject
Within   152.413  28  5.4433
Total    340.108  31

MINITAB output
One-way ANOVA: UK, Mex, UAE, Oman
Source  DF  SS      MS     F      P
Factor  3   187.70  62.57  11.49  0.000
Error   28  152.41  5.44
Total   31  340.11
S = 2.333   R-Sq = 55.19%   R-Sq(adj) = 50.39%

Level  N  Mean    StDev
UK     8  60.160  2.535
Mex    8  58.390  2.405
UAE    8  55.190  2.224
Oman   8  54.124  2.149
Pooled StDev = 2.333
(Individual 95% CIs for the means, based on pooled StDev, span roughly 52.5 to 60.0.)

Critical point F(3,28) for α = 0.05 is 2.9467. Therefore we reject H0. There is evidence of differences in the average price per barrel of oil from the four sources. The Rotterdam oil market may not be efficient. The conclusion is valid only for Rotterdam, and only for Arabian Light. We need to assume independent random samples from these populations, and normal populations with equal population variance. Observations are time-dependent (days during February), thus the assumptions could be violated. This is a limitation of the study. Another limitation is that February may be different from other months.

9-20.

An F(.05,2,101) = 3.61 result, relative to a critical value of 3.08637, indicates a significant difference
in their perceptions on the roles played by African American models in commercials.

9-21.

(From Minitab):
Source
Treatment
Error
Total

df
2
38
40

SS
91.0426
140.529
231.571

9-4

MS
45.5213
3.69812

F
12.31

Chapter 09 - Analysis of Variance

p-value = .0001. Critical point for F (2,38) at = .05 is 3.245. Therefore, reject H0. There is a
difference in the length of time it takes to make a decision.

5%

ANOVA Table

Source
SS
df
MS
Fcritical
p-value
F
Between 91.0426
2 45.521302 12.3093042 3.2448213 0.0001 Reject
Within 140.529 38 3.6981215
Total 231.571 40
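The F statistic in any of these one-way tables is just the ratio of the two mean squares; a quick check in Python with the values above:

```python
SSTR, SSE = 91.0426, 140.529
df_tr, df_e = 2, 38

MSTR, MSE = SSTR / df_tr, SSE / df_e
F = MSTR / MSE
print(round(MSTR, 4), round(F, 2))   # 45.5213 12.31
```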

9-22.

An F(.05,2,55) = 52.787 result, relative to a critical value of 3.165, indicates a significant difference
in the monetary-economic reaction to the three inflation fighting policies.

9-23.

The test results exceed the critical value of F(.01,3,236) = 3.866. The results indicate that the
performances of the four different portfolios are significantly different.

9-24.  95% C.I. for the mean responses:
Martinique: x̄2 ± t(α/2)√(MSE/n2) = 75 ± 1.96√(504.4/40) = [68.04, 81.96]
Eleuthera: 73 ± 1.96√(MSE/n3) = [66.04, 79.96]
Paradise Island: 91 ± 1.96√(MSE/n4) = [84.04, 97.96]
St. Lucia: 85 ± 1.96√(MSE/n5) = [78.04, 91.96]

9-25.  Where do differences exist in the circle-square-triangle populations from Table 9-1, using Tukey? From the text: MSE = 2.125
triangles: n1 = 4, x̄1 = 6
squares:   n2 = 4, x̄2 = 11.5
circles:   n3 = 3, x̄3 = 2
For α = .01, qα(r, n − r) = q.01(3,8) = 5.63. Smallest ni is 3:
T = q√(MSE/3) = 5.63√(2.125/3) = 4.738
|x̄1 − x̄2| = 5.5 > 4.738   sig.
|x̄2 − x̄3| = 9.5 > 4.738   sig.
|x̄1 − x̄3| = 4.0 < 4.738   n.s.
Thus: μ1 = μ3; μ2 > μ1; μ2 > μ3
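The Tukey criterion T and the pairwise decisions can be checked numerically; a sketch using the values above:

```python
import math

MSE = 2.125
q = 5.63      # q.01(3, 8) from the studentized-range table
n_min = 3     # smallest group size

T = q * math.sqrt(MSE / n_min)
means = {'triangles': 6, 'squares': 11.5, 'circles': 2}
print(round(T, 3))                                      # 4.738
print(abs(means['squares'] - means['circles']) > T)     # True  (significant)
print(abs(means['triangles'] - means['circles']) > T)   # False (n.s.)
```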

9-26.  Find which prototype planes are different in Problem 9-18:
MSE = 9,202; ni = 10 for all i
x̄A = 4,407   x̄B = 4,230   x̄C = 4,135
For α = .05, q(3,27) ≈ 3.51. T = 3.51√(9,202/10) = 106.475
|x̄A − x̄B| = 177 > 106.475   sig.
|x̄B − x̄C| = 95 < 106.475    n.s.
|x̄A − x̄C| = 272 > 106.475   sig.
Prototype A is shown to have higher average range than both B and C. Prototypes B and C have no significant difference in average range (all conclusions are at α = 0.05).
Tukey test for pairwise comparison of group means (r = 3, n − r = 27, q0 = 3.51, T = 106.476): B vs. A: Sig; C vs. A: Sig; C vs. B: n.s.

9-27.

Since H0 was rejected in Problem 9-19, there are significant differences.
T = q.05(4,28)√(5.4433/8) = 4.04√(0.6804) = 3.332
|UK − MEX| = |60.16 − 58.39| = 1.77
|UK − UAE| = |60.16 − 55.19| = 4.97
|UK − OMAN| = |60.16 − 54.1238| = 6.0362
|MEX − UAE| = |58.39 − 55.19| = 3.2
|MEX − OMAN| = |58.39 − 54.1238| = 4.2662
|UAE − OMAN| = |55.19 − 54.1238| = 1.0662
Differences exceeding T = 3.332 are significant: UK vs. UAE, UK vs. Oman, and Mex vs. Oman; the remaining pairs are not significantly different.
Tukey test for pairwise comparison of group means (r = 4, n − r = 28, q0 = 4.04, T = 3.33248): Mex vs. UK: n.s.; UAE vs. UK: Sig; UAE vs. Mex: n.s.; Oman vs. UK: Sig; Oman vs. Mex: Sig; Oman vs. UAE: n.s.

9-28.

(Question has no relevance to 9-20)

9-29.

Degrees of freedom for Factor: 3 − 1 = 2
Degrees of freedom for Error: 157 − 3 = 154
Degrees of freedom for Total: 157 − 1 = 156
The overall F test indicates that there is a difference in the groups' reactions to pricing tactics. The subsequent information also indicates that there is a significant difference between each of the groups' reactions.

9-30.

a) Total sample size = 275


b) The critical value for F(.05, 2, 272) is 3.029; therefore the overall ANOVA test is very significant.
c) Monopoly prices are significantly different than limited competition and strong competition.


9-31.

We cannot extend the results to planes built after the analysis. We used fixed effects here, not
random effects. The 3 prototypes were not randomly chosen from a population of levels as would
be required for the random effects model.

9-32.

A randomized complete block design is a design with restricted randomization. Each block of
experimental units is assigned to treatments with randomization of treatments within the block.

9-33.

Fly all 3 planes on the same route every time. The route (flown by the 3 planes) is the block.

9-34.

Look at the residuals. If the spread of the residuals is not equal, we probably have unequal σ², and the assumption of equal variances is violated. A histogram of the residuals will reveal normality violations.

9-35.

Otherwise you are not randomly sampling from a population of treatments, and inference is not
valid for the entire population.

9-36.

No. Rotterdam (and Arabian Light) was not randomly chosen.

9-37.

If the locations and the artists are chosen randomly, we have a random effects model.

9-38.

1. Testing for possible interactions among factor levels.


2. Efficiency.

9-39.

Limitations and problems: (1) We don't know the overall significance level of the 3 tests; (2) If we have 1 observation per cell then there are 0 degrees of freedom for error. Also, for a fixed sample size, there is a reduction of the df for error.

9-40.

1. As more factors are included, df for error decreases.


2. As more factors are included, we lose the control on , and the probability of at least one
Type I error increases.

9-41.

Since there are interactions, there are differences in emotions averaged over all levels of
advertisements.

9-42.

At α = 0.05:
Location: F = 50.6, significant
Job type: F = 50.212, significant
Interaction: F = 2.14, n.s.

ANOVA Table (α = 5%):
Source       SS        df  MS       F       Fcritical  p-value
Location     2520.988  2   1260.49  50.645  3.1239     0.0000  Reject
Job Type     2499.432  2   1249.72  50.212  3.1239     0.0000  Reject
Interaction  212.716   4   53.179   2.1367  2.4989     0.0850
Error        1792      72  24.8889
Total        7025.136  80

9-43.  Cell sample sizes (50 observations per cell):
            ABC  CBS  NBC
Morning     50   50   50
Evening     50   50   50
Late Night  50   50   50

Source       SS    df   MS     F
Network      145   2    72.5   5.16
Newstime     160   2    80     5.69
Interaction  240   4    60     4.27
Error        6200  441  14.06
Total        6745  449

From the table: F.01(4,400) = 3.36 and F.01(2,400) = 4.66. Therefore, all are significant at α = 0.01. There are interactions. There are Network main effects averaged over Newstime levels. There are Newstime main effects averaged over Network levels.

9-44.
a. Levels of task difficulty: a − 1 = 1; therefore a = 2
b. Levels of effort: b − 1 = 1; therefore b = 2
c. There are no task-difficulty main effects because the p-value = 0.5357
d. There are effort main effects because the p-value < 0.0001
e. There are no significant interactions, as the p-value = 0.1649.

9-45.
a. Explained is Treatment: Treat = Factor A + Factor B + (AB)
b. Levels of exercise price: a − 1 = 2; therefore a = 3
c. Levels of time of expiration: b − 1 = 1; therefore b = 2
d. ab(n − 1) = 144, a = 3, b = 2; therefore n − 1 = 24, n = 25, N = 25 × 6 = 150
e. n = 25
f. There are no exercise-price main effects (F = 0.42 < 1).
g. There are time-of-expiration main effects at α = 0.05 but not at α = 0.01, because F(1,144) = 4.845; from the F table, for dfs = 1, 150, the critical point for α = 0.05 is 3.91 and for α = 0.01 it is 6.81.
h. There are no interactions: F = .193 < 1
i. There is some evidence for time-of-expiration main effects. There is no evidence for exercise-price main effects or interaction effects.
j. For time-of-expiration main effects, .01 < p-value < .05. For the other two tests, the p-values are very high.
k. We could use a t-test for time-of-expiration effects: t²(144) = F(1,144)

9-46.

Since there are interactions but neither of the main factors have significant F-tests, a likely
conclusion is that the two factors work in opposite directions, i.e., inverse to each other.

9-47.

Advantages: reduced experimental errors (the effects of extraneous factors) and greater economy
of sample sizes.

9-48.

Use blocking by firm, to reduce the error contributions arising from differences between firms.

9-49.

Could use a randomized blocking design: 4 observations, UK, Mexico, UAE, Oman at 4
locations and 4 different dates.

9-50.

A good blocking variable would be size of firm in terms of total assets or total sales, etc.

9-51.

Yes. Have people of the same occupation/age/demographics use sweaters of the 3 kinds under
study. Each group of 3 people are a block.

9-52.

As stated in 9-23, a good blocking variable would be some measure of diversity in the portfolio.

9-53.

We could group the executives into blocks according to some choice of common characteristics
such as age, sex, years employed at current firm, etc. The different blocks for the chosen attribute
would then form a third variable beyond Location and Type to use in a 3-way ANOVA.

9-54.

We must assume no block-factor interactions.

9-55.

SSTR = 3,233; SSE = 12,386; n = 100 blocks
df treatment = r − 1 = 2   df error = (n − 1)(r − 1) = 99(2) = 198
F = MSTR/MSE = (3,233/2)/(12,386/198) = 25.84
Reject H0. The p-value is very small. There are differences among the 3 sweeteners. We should be very confident of the results. Blocking reduces experimental error here, as people of the same weight/age/sex will tend to behave homogeneously with respect to losing weight.
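The randomized-block F statistic follows from the sums of squares and their dfs; a quick check in Python:

```python
SSTR, SSE = 3233, 12386
n_blocks, r = 100, 3

df_tr = r - 1                      # 2
df_e = (n_blocks - 1) * (r - 1)    # 198

F = (SSTR / df_tr) / (SSE / df_e)
print(round(F, 2))   # 25.84
```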

9-56.

n = 70 blocks; r = 4; SSTR = 9,875; SSBL = 1,445; SST = 22,364
SSE = 22,364 − 1,445 − 9,875 = 11,044
MSE = 11,044/[(69)(3)] = 53.35   MSTR = 9,875/3 = 3,291.67
F(3,207) = MSTR/MSE = 61.7
Reject H0. The p-value is very small. Not all of the four methods are equally effective.


9-57.

SSTR = 7,102; SSE = 10,511; r = 8; ni = 20 for all i
MSTR = SSTR/(r − 1) = 7,102/7 = 1,014.57
MSE = SSE/(n − r) = 10,511/(160 − 8) = 69.15
F(7,152) = 14.67 > 2.76 (critical point for α = 0.01). Therefore, reject H0. Not all tapes are equally appealing. The p-value is very small.

9-58.

n1 = 32, n2 = 30, n3 = 38, n4 = 41; n = 141
MSTR = SSTR/(r − 1) = 4,537/3 = 1,512.33
F(3,137) = MSTR/MSE = 1,512.33/412 = 3.67
2.67 (critical at α = 0.05) < 3.67 < 3.92 (critical at α = 0.01)
We can reject H0 at α = 0.05. There is some evidence that the four names are not all equally well liked.
9-59.

Software packages: 3; Computers: 4; n = 60 observations per cell
SS software = 77,645; SS computer = 54,521; SS interaction = 88,699; SSE = 434,557

Source       SS       df   MS          F
software     77,645   2    38,822.5    63.25
computer     54,521   3    18,173.667  29.60
interaction  88,699   6    14,783.167  24.09
error        434,557  708  613.78
Total        655,422  719

Both main effects and the interactions are highly significant.
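The MS and F columns above are just SS/df ratios over the common MSE; a quick check in Python:

```python
# Sums of squares and degrees of freedom from the table above
sources = {'software': (77645, 2), 'computer': (54521, 3), 'interaction': (88699, 6)}
SSE, df_e = 434557, 708
MSE = SSE / df_e   # 613.78

F_stats = {name: (ss / df) / MSE for name, (ss, df) in sources.items()}
for name, F in F_stats.items():
    print(name, round(F, 2))
```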


9-60.

Treatment df = r − 1 = 2; Block df = 74; Total df = 224 (total sample size was 225)
Error df = (n − 1)(r − 1) = (74)(2) = 148
Critical value of F(.05, 2, 148) = 3.0572, which is less than F = 13.65. The results are significant.


9-61.
Source       SS       df   MS        F
pet          22,245   3    7,415     1.93
location     34,551   3    11,517    2.99
interaction  31,778   9    3,530.89  0.92
error        554,398  144  3,849.99
Total        642,972  159

There are no interactions. There are no pet main effects.
2.68 (critical at α = 0.05) < 2.99 < 3.92 (critical at α = 0.01)
Thus there are location main effects at α = 0.05.

9-62.  F-ratio = 4.5471, p-value = .0138 (using a computer). At α = 0.05, only groups 1 and 3 are significantly different from each other: the Drug group is significantly different from the No-Treatment group.

ANOVA Table (α = 5%):
Source   SS       df  MS        F       Fcritical  p-value
Between  3203.12  2   1601.56   4.5471  3.1239     0.0138  Reject
Within   25359.6  72  352.2167
Total    28562.7  74

95% Confidence Intervals of Group Means:
Drug: 24.16 ± 7.4824   Placebo: 27.8 ± 7.4824   No-Treatment: 39.48 ± 7.4824

Tukey test for pairwise comparison of group means (r = 3, n − r = 72, q0 = 3.41, T = 12.7994): the only significant pair is Drug vs. No-Treatment.

9-63.


a. Blocking (repeated measures) is more efficient as every person is his/her own control.
Reductions in errors. Limitations? Maybe carryover effects from trial to trial.


b. SSTR = 44,572; SSE = 112,672; r = 3; n = 30
MSTR = 44,572/2 = 22,286   MSE = 112,672/[(29)(2)] = 1,942.62
F(2,58) = 11.47. Reject H0.

9-64.  n1 = n2 = n3 = 15; r = 3. A one-way ANOVA gives an F value of 22.21, which is significant even at α < 0.001, hence we reject the hypothesis of no differences among the three models. MSE = 48.1, so at α = 0.01 we use the critical point q = 4.37 (closest to the required value for dfs = 3, 42), giving the Tukey criterion T = q√(MSE/ni) = 7.83. Observed means:
x̄GI = 124.73   x̄P = 121.40   x̄Z = 108.73
|x̄GI − x̄P| = 3.33   |x̄GI − x̄Z| = 16.00*   |x̄P − x̄Z| = 12.67*
Using T = 7.83, we reject the hypotheses μGI = μZ and μP = μZ (at the 0.01 level of significance), but not the μGI = μP hypothesis.

ANOVA Table (α = 5%):
Source   SS       df  MS         F        Fcritical  p-value
Between  2137.78  2   1068.8889  22.2083  3.2199     0.0000  Reject
Within   2021.47  42  48.1302
Total    4159.24  44

95% Confidence Intervals of Group Means:
GI: 124.733 ± 3.6149   Phillips: 121.4 ± 3.6149   Zenith: 108.733 ± 3.6149

Tukey test for pairwise comparison of group means (r = 3, n − r = 42, q0 = 4.37, T = 7.82789): Phillips vs. GI: n.s.; Zenith vs. GI: Sig; Zenith vs. Phillips: Sig.

9-65.

n = 50; r = 3; SSTR = 128,889; SSE = 42,223,987
F(2,98) = (128,889/2)/(42,223,987/98) = 0.14958
Do not reject the null hypothesis.
9-66.  t²(df) = F(1,df)

9-67.

Rents are equal on average. There is no evidence of differences among the four cities.

9-68.

Answers will vary depending upon which report is selected.

9-69.

A one-way ANOVA strongly rejects H0. For the three levels of Store, the 95% confidence intervals calculated for the means, as shown, do not overlap at all.
Case 11: Rating Wines
(Template: ANOVA.xls, sheet: 1-Way)
Data: four grape types; sample sizes: Chard n = 11, Merlot n = 10, C.Blanc n = 13, C.Sauv n = 11
Chard: 89, 88, 89, 78, 80, 86, 87, 88, 88, 89, 88
Merlot, C.Blanc, C.Sauv (ratings as given in the template): 91, 81, 92, 88, 81, 89, 99, 81, 89, 90, 82, 9, 91, 81, 92, 88, 78, 90, 88, 79, 91, 89, 80, 93, 90, 83, 91, 87, 81, 97, 88, 88, 85, 86

1) Do not reject the null hypothesis: there is no difference in the average ratings due to the type of grape.

ANOVA Table (α = 5%):
Source   SS       df  MS      F       Fcritical  p-value
Between  411.617  3   137.21  0.8594  2.8327     0.4698
Within   6545.63  41  159.65
Total    6957.24  44

Case 12: Checking out Checkout


1. One-way ANOVA (n = 10 per scanner):
     Scan1  Scan2  Scan3
1    16     13     18
2    15     18     19
3    12     13     15
4    15     15     14
5    16     18     19
6    15     14     16
7    15     15     17
8    14     15     14
9    12     14     15
10   14     16     17

ANOVA Table (α = 5%):
Source   SS     df  MS      F       Fcritical  p-value
Between  20.6   2   10.3    3.4893  3.3541     0.0449  Reject
Within   79.7   27  2.9519
Total    100.3  29
Reject the null hypothesis of equal number of scans per minute.

2. Rows = clerks, columns = scanners:
ANOVA Table (α = 5%):
Source       SS        df  MS       F       Fcritical  p-value
Row          20.76667  4   5.19167  2.1239  2.5787     0.0934
Column       90.7      2   45.35    18.552  3.2043     0.0000  Reject
Interaction  14.13333  8   1.76667  0.7227  2.1521     0.6705
Error        110       45  2.44444
Total        235.6     59

Reject the null hypothesis of equal number of scans per minute (columns).
Do not reject the null hypothesis that the clerks are equally efficient.
There are no interaction effects present.
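Part 1's ANOVA can be recomputed from the scanner data (laid out row-wise as reconstructed from the template; the reconstruction reproduces the template's sums of squares); a sketch in Python:

```python
# Scanner data, one list per scanner (10 observations each)
scan1 = [16, 15, 12, 15, 16, 15, 15, 14, 12, 14]
scan2 = [13, 18, 13, 15, 18, 14, 15, 15, 14, 16]
scan3 = [18, 19, 15, 14, 19, 16, 17, 14, 15, 17]
groups = [scan1, scan2, scan3]

n = sum(len(g) for g in groups)
grand = sum(sum(g) for g in groups) / n

# Between-groups and within-groups sums of squares
SSTR = sum(len(g) * (sum(g)/len(g) - grand)**2 for g in groups)
SSE = sum((x - sum(g)/len(g))**2 for g in groups for x in g)

r = len(groups)
F = (SSTR / (r - 1)) / (SSE / (n - r))
print(round(SSTR, 1), round(SSE, 1), round(F, 4))   # 20.6 79.7 3.4893
```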


CHAPTER 10
SIMPLE LINEAR REGRESSION AND CORRELATION
(The template for this chapter is: Simple Regression.xls.)
10-1.

A statistical model is a set of mathematical formulas and assumptions that describe some real-world situation.

10-2.

Steps in statistical model building: 1) Hypothesize a statistical model; 2) Estimate the model
parameters; 3) Test the validity of the model; and 4) Use the model.

10-3.

Assumptions of the simple linear regression model: 1) A straight-line relationship between X and Y; 2) The values of X are fixed; 3) The regression errors, ε, are identically normally distributed random variables, uncorrelated with each other through time.

10-4.

β0 is the Y-intercept of the regression line, and β1 is the slope of the line.

10-5.

The conditional mean of Y, E(Y | X), is the population regression line.

10-6.

The regression model is used for understanding the relationship between the two variables, X and
Y; for prediction of Y for given values of X; and for possible control of the variable Y, using the
variable X.

10-7.

The error term captures the randomness in the process. Since X is assumed nonrandom, the addition of ε makes the result (Y) a random variable. The error term captures the effects on Y of a host of unknown random components not accounted for by the simple linear regression model.

10-8.

The equation represents a simple linear regression model without an intercept (constant) term.

10-9.

The least-squares procedure produces the best estimated regression line in the sense that the line lies "inside" the data set. The line is the best unbiased linear estimator of the true regression line, as the estimators b0 and b1 have the smallest variance of all linear unbiased estimators of the line parameters. The least-squares line is obtained by minimizing the sum of the squared deviations of the data points from the line.

10-10. Least squares is less useful when outliers exist. Outliers tend to have a greater influence on the
determination of the estimators of the line parameters because the procedure is based on
minimizing the squared distances from the line. Since outliers have large squared distances they
exert undue influence on the line. A more robust procedure may be appropriate when outliers
exist.

10-1

Chapter 10 - Simple Linear Regression and Correlation

10-11. (Template: Simple Regression.xls, sheet: Regression)
Data (X = income quantile, coded 1 through 5; Y = wealth; residuals in parentheses):
1: 17.3 (0.8); 2: 23.6 (−3.02); 3: 40.2 (3.46); 4: 45.8 (−1.06); 5: 56.8 (−0.18)
95% Confidence Interval for Slope: 10.12 ± 2.77974
95% Confidence Interval for Intercept: 6.38 ± 9.21937
Regression Equation: Wealth Growth = 6.38 + 10.12 × Income Quantile
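The fitted coefficients can be reproduced with the usual least-squares formulas, assuming the quantiles are coded 1 through 5 (consistent with the residuals above); a sketch in Python:

```python
x = [1, 2, 3, 4, 5]                   # income quantile coding (assumed)
y = [17.3, 23.6, 40.2, 45.8, 56.8]    # wealth growth

n = len(x)
xbar, ybar = sum(x)/n, sum(y)/n
ss_xy = sum((xi - xbar)*(yi - ybar) for xi, yi in zip(x, y))
ss_x = sum((xi - xbar)**2 for xi in x)

b1 = ss_xy / ss_x          # slope
b0 = ybar - b1*xbar        # intercept
print(round(b1, 2), round(b0, 2))   # 10.12 6.38
```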

10-12. b1 = SSXY /SSX = 934.49/765.98 = 1.22


10-13. (Template: Simple Regression.xls, sheet: Regression)
Thus, b0 = −3.057 and b1 = 0.187.
r² = 0.9217 (Coefficient of Determination); r = 0.9601 (Coefficient of Correlation)
95% C.I. for β1: 0.18663 ± 0.03609; s(b1) = 0.0164
95% C.I. for β0: −3.05658 ± 2.1372; s(b0) = 0.97102
95% Prediction Interval for Y given X = 10: −1.19025 ± 2.8317; s = 0.99538 (standard error of prediction)

ANOVA Table:
Source  SS       df  MS       F        Fcritical  p-value
Regn.   128.332  1   128.332  129.525  4.84434    0.0000
Error   10.8987  11  0.99079
Total   139.231  12

10-14. b1 = SSXY/SSX = 2.11
b0 = ȳ − b1x̄ = 165.3 − (2.11)(88.9) = −22.279
10-15. (Template: Simple Regression.xls, sheet: Regression) Inflation (X) and return on stocks (Y):
r² = 0.0873 (Coefficient of Determination)
r = 0.2955 (Coefficient of Correlation)
Regression Equation: Return = 16.0961 + 0.96809 Inflation
95% C.I. for the slope β1: 0.96809 ± 2.7972; s(b1) = 1.18294 (Standard Error of Slope)
95% C.I. for the intercept β0: 16.0961 ± 17.3299; s(b0) = 7.32883 (Standard Error of Intercept)
s = 20.8493 (Standard Error of prediction)

ANOVA Table
Source   SS        df   MS        F         Fcritical   p-value
Regn.    291.134    1   291.134   0.66974   5.59146     0.4401
Error    3042.87    7   434.695
Total    3334       8


[Scatter plot of Return (Y) against Inflation (X) with fitted line y = 0.9681x + 16.096]

There is a weak linear relationship (r) and the regression is not significant (r², F, p-value).
10-16. (Template: Simple Regression.xls, sheet: Regression) Average value of Aston Martin:

Year (X)    1960     1970     1980     1990     2000
Value (Y)   180000   40000    60000    160000   200000
Error       84000    -72000   -68000   16000    40000

r² = 0.1203 (Coefficient of Determination)
r = 0.3468 (Coefficient of Correlation)
95% C.I. for the slope β1: 1600 ± 7949.76; s(b1) = 2498 (Standard Error of Slope)
95% C.I. for the intercept β0: -3040000 ± 1.6E+07; s(b0) = 4946165 (Standard Error of Intercept)
s = 78993.7 (Standard Error of prediction)

ANOVA Table
Source   SS        df   MS        F         Fcritical   p-value
Regn.    2.6E+09    1   2.6E+09   0.41026   10.128      0.5674
Error    1.9E+10    3   6.2E+09
Total    2.1E+10    4


[Scatter plot of Value against Year with fitted line y = 1600x - 3E+06]

There is a weak linear relationship (r) and the regression is not significant (r², F, p-value).
Limitations: sample size is very small.
Hidden variables: the 70s and 80s models have a different valuation than other decades possibly
due to a different model or style.
10-17. Regression equation is:
Credit Card Transactions = 177.641 + 0.6202 Debit Card Transactions
r² = 0.9624 (Coefficient of Determination)
r = 0.9810 (Coefficient of Correlation)
95% C.I. for the slope β1: 0.6202 ± 0.17018; s(b1) = 0.06129 (Standard Error of Slope)
95% C.I. for the intercept β0: 177.641 ± 110.147; s(b0) = 39.6717 (Standard Error of Intercept)
s = 56.9747 (Standard Error of prediction)

ANOVA Table
Source   SS        df   MS        F         Fcritical   p-value
Regn.    332366     1   332366    102.389   7.70865     0.0005
Error    12984.5    4   3246.12
Total    345351     5

There is no implication for causality. A third-variable influence could be increases in per capita income or GDP growth.

10-18. SSE = Σ(y − b0 − b1x)². Take partial derivatives with respect to b0 and b1:

∂SSE/∂b0 = ∂/∂b0 [Σ(y − b0 − b1x)²] = −2Σ(y − b0 − b1x)
∂SSE/∂b1 = ∂/∂b1 [Σ(y − b0 − b1x)²] = −2Σx(y − b0 − b1x)

Setting the two partial derivatives to zero and simplifying, we get:

Σ(y − b0 − b1x) = 0   and   Σx(y − b0 − b1x) = 0.

Expanding, we get:

Σy − nb0 − b1Σx = 0   and   Σxy − b0Σx − b1Σx² = 0

Solving the above two equations simultaneously for b0 and b1 gives the required results.
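The closed-form solutions b1 = SSXY/SSX and b0 = ȳ − b1x̄ can be checked against the two first-order conditions directly; a minimal Python sketch (the data values below are hypothetical, chosen only for illustration):

```python
# Verify the normal-equation solutions on a small hypothetical data set.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n
ss_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
ss_x = sum((xi - xbar) ** 2 for xi in x)
b1 = ss_xy / ss_x          # slope, Equation (10-10)
b0 = ybar - b1 * xbar      # intercept
# The residuals must satisfy the two normal equations:
resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
assert abs(sum(resid)) < 1e-9                                # sum of residuals = 0
assert abs(sum(r * xi for r, xi in zip(resid, x))) < 1e-9    # x-weighted sum = 0
print(round(b1, 2), round(b0, 2))
```

Both conditions hold at the least-squares solution and at no other (b0, b1) pair, which is what makes the two normal equations a valid characterization of the fit.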
10-19. 99% C.I. for β1: 1.25533 ± 2.807(0.04972) = [1.1158, 1.3949].
The confidence interval does not contain zero.


10-20. MSE = 7.629. From the ANOVA table for Problem 10-11:

ANOVA Table
Source   SS        df   MS
Regn.    1024.14    1   1024.14
Error    22.888     3   7.62933
Total    1047.03    4

10-21. From the regression results for Problem 10-11:
s(b0) = 2.89694 (Standard Error of Intercept)   s(b1) = 0.87346 (Standard Error of Slope)

10-22. From the regression results for Problem 10-11:
95% C.I. for the slope: 10.12 ± 2.77974 = [7.34026, 12.89974]
95% C.I. for the intercept: 6.38 ± 9.21937 = [−2.83937, 15.59937]


10-23. s(b0) = 0.971   s(b1) = 0.016; the estimate of the error variance is MSE = 0.991. 95% C.I. for β1: 0.187 ± 2.201(0.016) = [0.1518, 0.2222]. Zero is not a plausible value at α = 0.05.
From the template: 95% C.I. for the slope: 0.18663 ± 0.03609; 95% C.I. for the intercept: −3.05658 ± 2.1372.

10-24. s(b0) = 85.4395   s(b1) = 0.15336. The estimate of the regression variance is MSE = 8122.
95% C.I. for β1: 1.5518 ± 2.776(0.1534) = [1.126, 1.978]. Zero is not in the range.
From the template: 95% C.I. for the slope: 1.55176 ± 0.42578; 95% C.I. for the intercept: −255.943 ± 237.219.

10-25. s² gives us information about the variation of the data points about the computed regression line.
10-26. In correlation analysis, the two variables, X and Y, are viewed in a symmetric way: neither is treated as dependent and the other as independent, as is the case in regression analysis. In correlation analysis we are interested in the relation between two random variables, both assumed normally distributed.
10-27. From the regression results for Problem 10-11: r = 0.9890 (Coefficient of Correlation)
10-28. r = 0.9601 (Coefficient of Correlation)


10-29. t(3) = 0.3468/√[(1 − 0.1203)/3] = 0.640
Accept H0. The two variables are not linearly correlated.


10-30. Yes. For example, suppose n = 5 and r = 0.51; then:
t = r/√[(1 − r²)/(n − 2)] = 1.02, and we do not reject H0. But if we take n = 10,000 and r = 0.04, we get t = 4.00, which leads to strong rejection of H0.
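The dependence of this t statistic on the sample size can be checked directly; a small Python sketch (`corr_t` is a hypothetical helper name, not from the text):

```python
import math

def corr_t(r, n):
    """t statistic for H0: rho = 0, with df = n - 2 (hypothetical helper)."""
    return r / math.sqrt((1 - r * r) / (n - 2))

# Small sample with a sizable r: not significant.
t_small = corr_t(0.51, 5)
# Very large sample with a tiny r: strongly significant.
t_large = corr_t(0.04, 10000)
print(round(t_small, 2), round(t_large, 2))
```

The same correlation coefficient can thus be "insignificant" or "highly significant" depending only on n, which is the point of the problem.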


10-31. We have r = 0.875 and n = 10. Conducting the test:
t(8) = r/√[(1 − r²)/(n − 2)] = 0.875/√[(1 − 0.875²)/8] = 5.11
There is statistical evidence of a correlation between the prices of gold and of copper.
Limitations: the data are time-series data, hence not independent random samples. Also, the data set contains only 10 points.

10-34. n = 65   r = 0.37
t(63) = 0.37/√[(1 − 0.37²)/63] = 3.16
Yes. Significant. There is a correlation between the two variables.


10-35. z′ = ½ ln[(1 + r)/(1 − r)] = ½ ln(1.37/0.63) = 0.3884
ζ0 = ½ ln[(1 + ρ0)/(1 − ρ0)] = ½ ln(1.22/0.78) = 0.2237
σz′ = 1/√(n − 3) = 1/√62 = 0.127
z = (z′ − ζ0)/σz′ = (0.3884 − 0.2237)/0.127 = 1.297
Cannot reject H0.
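The Fisher z-transform test above (r = 0.37, n = 65 from 10-34, null value ρ0 = 0.22) can be reproduced in a few lines of Python:

```python
import math

# Fisher z-transform test of H0: rho = 0.22, given r = 0.37 and n = 65.
r, rho0, n = 0.37, 0.22, 65
z_r   = 0.5 * math.log((1 + r) / (1 - r))        # transformed sample r
zeta0 = 0.5 * math.log((1 + rho0) / (1 - rho0))  # transformed null value
sigma = 1 / math.sqrt(n - 3)                     # standard error of z'
z = (z_r - zeta0) / sigma
print(round(z, 3))  # below 1.96, so cannot reject H0 at alpha = 0.05
```

The transform makes the sampling distribution approximately normal even when ρ0 is far from zero, which is why it is used here instead of the t test for ρ = 0.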

10-36. Using the TINV(α, df) function in Excel, where df = n − 2 = 52: TINV(0.05, 52) = 2.006645 and TINV(0.01, 52) = 2.6737.
Reject H0 at 0.05 but not at 0.01. There is evidence of a linear relationship at α = 0.05 only.
10-37. t(16) = b1/s(b1) = 3.1/2.89 = 1.0727.
Do not reject H0. There is no evidence of a linear relationship at any α.
10-38. Using the regression results for problem 10-11:
critical value of t (two-tailed, α = 0.05): t(0.025, 3) = 3.182
computed value of t is: t = b1/s(b1) = 10.12 / 0.87346 = 11.586
Reject H0. There is strong evidence of a linear relationship.


10-39. t (11) = b1/s(b1) = 0.187/0.016 = 11.69


Reject H0. There is strong evidence of a linear relationship between the two variables.
10-40. b1/ s(b1) = 1600/2498 = 0.641
Do not reject H0. There is no evidence of a linear relationship.
10-41. t (58) = b1/s(b1) = 1.24/0.21 = 5.90
Yes, there is evidence of a linear relationship.
10-42. Using the Excel function, TDIST(x,df,#tails) to estimate the p-value for the t-test results, where
x = 1.51, df = 585692 2 = 585690, #tails = 2 for a 2-tail test:
TDIST(1.51, 585690, 2) = 0.131.
The corresponding p-value for the results is 0.131. The regression is not significant even at the 0.10 level of significance.
10-43. t (211) = z = b1/s(b1) = 0.68/12.03 = 0.0565
Do not reject H0. There is no evidence of a linear relationship at any α. (Why report such results?)
10-44. b1 = 5.49 s(b1) = 1.21
t (26) = 4.537
Yes, there is evidence of a linear relationship.
10-45. The coefficient of determination indicates that 9% of the variation in customer satisfaction can be explained by changes in a customer's materialism measurement.
10-46 a. The model should not be used for prediction purposes because only 2.0% of the
variation in pension funding is explained by its relationship with firm profitability.
b. The model explains virtually nothing.
c. Probably not. The model explains too little.
10-47. In the Problem 10-11 regression results, r² = 0.9781. Thus, 97.8% of the variation in wealth growth is explained by the income quantile.

10-48. In Problem 10-13, r² = 0.922. Thus, 92.2% of the variation in the dependent variable is explained by the regression relationship.
10-49. r² in Problem 10-16: r² = 0.1203
10-50. Reading directly from the MINITAB output: r² = 0.962


10-51. Based on the coefficient of determination values for the five countries, the UK model explains
31.7% of the variation in long-term bond yields relative to the yield spread. This is the best
predictive model of the five. The next best model is the one for Germany, which explains 13.3%
of the variation. The regression models for Canada, Japan, and the US do not predict long-term
yields very well.
10-52. From the information provided, the slope coefficient of the equation is equal to -14.6. Since its
value is not close to zero (which would indicate that a change in bond ratings has no impact on
yields), it would indicate that a linear relationship exists between bond ratings and bond yields.
This is in line with the reported coefficient of determination of 61.56%.
10-53. r² in Problem 10-15: r² = 0.0873

10-54. Σ(y − ȳ)² = Σ[(y − ŷ) + (ŷ − ȳ)]² = Σ[(y − ŷ)² + 2(y − ŷ)(ŷ − ȳ) + (ŷ − ȳ)²]
= Σ(y − ŷ)² + 2Σ(y − ŷ)(ŷ − ȳ) + Σ(ŷ − ȳ)²
But: 2Σ(y − ŷ)(ŷ − ȳ) = 2Σŷ(y − ŷ) − 2ȳΣ(y − ŷ) = 0
because the first term on the right is the sum of the weighted regression residuals, which sum to zero. The second term is the sum of the residuals, which is also zero. This establishes the result:
Σ(y − ȳ)² = Σ(y − ŷ)² + Σ(ŷ − ȳ)².

10-55. From Equation (10-10): b1 = SSXY/SSX. From Equation (10-31): SSR = b1SSXY.
Hence, SSR = (SSXY/SSX)SSXY = (SSXY)²/SSX.
10-56. Using the results for Problem 10-11:
F = 134.238 > F(1,3) = 10.128 (p-value = 0.0014)
Reject H0.

10-57. F(1,11) = 129.525   Fcritical = 4.84434   p-value = 0.0000
t(11) = 11.381; t² = 11.381² = 129.53 = the F-statistic value already calculated.

10-58. F(1,4) = 102.389   Fcritical = 7.70865   p-value = 0.0005
t(4) = 10.119; t² = (10.119)² = 102.39 = F

10-59. F (1,7) = 0.66974 Do not reject H0.

10-60. F(1,102) = MSR/MSE = (87,691/1)/(12,745/102) = 701.8
There is extremely strong evidence of a linear relationship between the two variables.
10-61. t²(k) = F(1,k). Thus, F(1,20) = [b1/s(b1)]² = (2.556/4.122)² = 0.3845
Do not reject H0. There is no evidence of a linear relationship.

10-62. t²(k) = [b1/s(b1)]² = [(SSXY/SSX)/(s/√SSX)]²
[using Equations (10-10) and (10-15) for b1 and s(b1), respectively]
= (SSXY/SSX)²/(MSE/SSX) = (SS²XY/SSX)/MSE = (SSR/1)/MSE = MSR/MSE = F(1,k)
[because SS²XY/SSX = SSR by Equations (10-31) and (10-10)]
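The identity t² = F can also be verified numerically on any simple-regression data set; a Python sketch on hypothetical data:

```python
# Numeric check of the identity t^2 = F(1, k) for simple regression.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 5.9]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
ss_x = sum((xi - xbar) ** 2 for xi in x)
ss_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = ss_xy / ss_x
b0 = ybar - b1 * xbar
sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
mse = sse / (n - 2)
ssr = ss_xy ** 2 / ss_x          # SSR = SSXY^2 / SSX
t = b1 / (mse / ss_x) ** 0.5     # t = b1 / s(b1)
f = ssr / mse                    # F = MSR / MSE, 1 numerator df
assert abs(t * t - f) < 1e-9     # the identity holds exactly
print(round(t, 3), round(f, 3))
```

This mirrors the algebra above: squaring the t statistic reproduces the F ratio term by term.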


10-63. a. Heteroscedasticity.
b. No apparent inadequacy.
c. Data display curvature, not a straight-line relationship.
10-64. a. No apparent inadequacy.
b. A pattern of increase with time.
10-65. a. No serious inadequacy.
b. Yes. A deviation from the normal-distribution assumption is apparent.


10-66. Using the results for Problem 10-11:

Durbin-Watson statistic: d = 3.39862

[Residual plot of Error against X]

Residual variance fluctuates; with only 5 data points the residuals appear to be normally distributed.

[Normal probability plot of residuals]

10-67. Residuals plotted against the independent variable of Problem 10-14 (Quality, ranging from 30 to 80):

[Residual plot]

No apparent inadequacy.

Durbin-Watson statistic: d = 2.0846
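The Durbin-Watson statistic reported by the template is d = Σ(eₜ − eₜ₋₁)² / Σeₜ²; a small Python sketch (the residual series below is hypothetical):

```python
# Durbin-Watson statistic from a residual series; values near 2 suggest
# no first-order autocorrelation. Residuals here are hypothetical.
e = [0.5, -0.3, 0.2, -0.4, 0.1, 0.3, -0.2]
d = (sum((e[i] - e[i - 1]) ** 2 for i in range(1, len(e)))
     / sum(ei ** 2 for ei in e))
print(round(d, 3))
```

Values well below 2 indicate positive autocorrelation and values well above 2 indicate negative autocorrelation, which is how the d values quoted in these solutions are read.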

10-68. Durbin-Watson statistic: d = 1.70855

Plot shows some curvature.

10-69. In the American Express example, give a 95% prediction interval for x = 5,000:
ŷ = 274.85 + 1.2553(5,000) = 6,551.35
P.I. = 6,551.35 ± (2.069)(318.16)√[1 + 1/25 + (5,000 − 3,177.92)²/40,947,557.84] = [5,854.4, 7,248.3]
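The interval can be reproduced from the summary numbers quoted above (t = 2.069, s = 318.16, n = 25, x̄ = 3,177.92, SSX = 40,947,557.84); a short Python sketch:

```python
import math

# 95% prediction interval for Y at x = 5,000, American Express example,
# computed from the summary statistics quoted in the solution.
yhat = 274.85 + 1.2553 * 5000            # point prediction
t_crit, s, n = 2.069, 318.16, 25
xbar, ss_x = 3177.92, 40947557.84
half = t_crit * s * math.sqrt(1 + 1 / n + (5000 - xbar) ** 2 / ss_x)
print(round(yhat - half, 1), round(yhat + half, 1))
```

The "1 +" inside the root is what distinguishes a prediction interval for an individual Y from the narrower confidence interval for E[Y | X].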
10-70. Given that the slope of the equation for 10-52 is 14.6, if the rating falls by 3 the yield should
increase by 43.8 basis points.
10-71. For a 99% P.I.: t.005(23) = 2.807
6,551.35 ± (2.807)(318.16)√[1 + 1/25 + (5,000 − 3,177.92)²/40,947,557.84] = [5,605.75, 7,496.95]
10-72. Point prediction: ŷ = 6.38 + 10.12(4) = 46.86
The 99% P.I.: 46.86 ± 18.3946 = [28.465, 65.255]

10-73. The 99% P.I. for Y given X = 5: 56.98 ± 20.407 = [36.573, 77.387]

10-74. The 95% P.I. for Y given X = 1990: 144000 ± 286633 = [−142633, 430633]
10-75. The 95% P.I. for Y given X = 2000: 160000 ± 317990 = [−157990, 477990]

10-76. Point prediction: ŷ = 16.0961 + 0.96809(5) = 20.9365
10-77.
a) Simple regression equation: Y = 2.779337X − 0.284157; when X = 10, Y = 27.5092
b) Forcing through the origin: Y = 2.741537X; when X = 10, Y = 27.41537
c) Forcing through (5, 13): Y = 2.825566X − 1.12783; when X = 10, Y = 27.12783
d) Forcing the slope to 2: Y = 2X + 4.236; when X = 10, Y = 24.236
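The three constrained fits have simple closed forms; a Python sketch on hypothetical data (the data and the fixed point/slope below are illustrative assumptions, not the problem's data):

```python
# Constrained least-squares fits: (b) through the origin,
# (c) through a fixed point, (d) with a fixed slope.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.9, 5.2, 8.1, 10.9]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

# Through the origin: minimize sum (y - b1 x)^2  =>  b1 = sum(xy) / sum(x^2)
b1_origin = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

# Through a fixed point (x0, y0): fit the shifted data with no intercept
x0, y0 = 2.0, 5.0
b1_pt = (sum((xi - x0) * (yi - y0) for xi, yi in zip(x, y))
         / sum((xi - x0) ** 2 for xi in x))
b0_pt = y0 - b1_pt * x0          # the line passes through (x0, y0) exactly

# Fixed slope m: only the intercept is free  =>  b0 = ybar - m * xbar
m = 2.0
b0_slope = ybar - m * xbar
print(round(b1_origin, 3), round(b0_pt, 3), round(b0_slope, 3))
```

Each constraint removes one free parameter, so each fit reduces to a one-dimensional least-squares problem with its own closed-form solution.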


10-78. Using the Excel function TINV(x, df), where x = the p-value of 0.034 and df = 2058 − 2:
TINV(0.034, 2056) = 2.121487. Since the slope coefficient = −0.051, the t-value is negative: t = −2.121487.
a) Standard error of the slope: s(b1) = |b1/t| = 0.051/2.121487 = 0.02404
b) Using α = 0.05, we would reject the null hypothesis of no relationship between the response variable and the predictor, based on the reported p-value of 0.034.
10-79. Given the reported p-value, we would reject the null hypothesis of no relationship between
neuroticism and job performance. Given the reported coefficient of determination, 19% of the
variation in job performance can be explained by neuroticism.
10-80. The t-statistic for the reported information is:
t = b1/s(b1) = 0.233/0.055 = 4.236
Using Excel function, TDIST(t,df,#tails), we get a p-value of 0.000068:


TDIST(4.236, 70, 2) = 6.8112E-05. There is a linear relationship between frequency of online
shopping and the level of perceived risk.

10-81. (From Minitab)

The regression equation is
Stock Close = 67.6 + 0.407 Oper Income

Predictor   Coef      Stdev     t-ratio   p
Constant    67.62     12.32     5.49      0.000
Oper Inc    0.40725   0.03579   11.38     0.000

s = 9.633   R-sq = 89.0%   R-sq(adj) = 88.3%

Analysis of Variance
SOURCE       DF   SS      MS      F        p
Regression    1   12016   12016   129.49   0.000
Error        16    1485      93
Total        17   13500

Stock close based on an operating income of $305M is ŷ = $56.24.

(Minitab results for Log Y)

The regression equation is
Log_Stock Close = 2.32 + 0.00552 Oper Inc

Predictor   Coef        Stdev       t-ratio   p
Constant    2.3153      0.1077      21.50     0.000
Oper Inc    0.0055201   0.0003129   17.64     0.000

s = 0.08422   R-sq = 95.1%   R-sq(adj) = 94.8%

Analysis of Variance
SOURCE       DF   SS       MS       F        p
Regression    1   2.2077   2.2077   311.25   0.000
Error        16   0.1135   0.0071
Total        17   2.3212

Unusual Observations
Obs.   x     y        Fit      Stdev.Fit   Residual   St.Resid
1      240   3.8067   3.6401   0.0366      0.1666     2.20R

R denotes an obs. with a large st. resid.

Stock close based on an operating income of $305M is ŷ = $54.80.


The regression using the Log of monthly stock closings is a better fit. Operating Income explains
over 95% of the variation in the log of monthly stock closings versus 89% for non-transformed Y.
10-82. a) The calculated t-value for the slope coefficient is:
t = b1/s(b1) = 0.92/0.01 = 92.00
Using the Excel function TDIST(t, df, #tails), we get a p-value of approximately 0:
TDIST(92.0, 598, 2) = 0. There is a linear relationship.
b) The excess return would be 0.9592:
FER = 0.95 + 0.92(0.01) = 0.9592
10-83.
a) Adding 2 to all X values: new regression: Y = 5X − 3.
Since the intercept is b0 = Ȳ − b1X̄, the only thing that changes is that X̄ increases by 2. Therefore, the intercept changes by the change in X̄ times the slope: 7 − (2)(5) = −3.
b) Adding 2 to all Y values: new regression: Y = 5X + 9.
Using the formula for the intercept, only the value of Ȳ changes, by 2. Therefore, the intercept changes by 2.
c) Multiplying all X values by 2: new regression: Y = 2.5X + 7
d) Multiplying all Y values by 2: new regression: Y = 10X + 7
10-84. You are minimizing the squared deviations from the former x-values instead of the former y-values.
10-85.
a) Y = 3.820133X + 52.273036
b) 90% C.I. for the slope: 3.82013 ± 0.4531 = [3.36703, 4.27323]
c) r² = 0.9449, very high; F = 222.931 (p-value = 0.000): both indicate that X affects Y
d) Since the 99% C.I. for the slope, 3.82013 ± 0.77071, does not contain the value 0, the slope is not 0
e) Y = 90.47436 when X = 10
f) X = 12.49354
g) Residuals appear to be random; Durbin-Watson statistic: d = 2.56884
h) The normal probability plot appears to be a little flatter than normal

Case 13: Level of Leverage

a) Leverage = −0.118 − 0.040 (Rights)
b) Using the Excel function TDIST(t, df, #tails): TDIST(2.62, 1307, 2) = 0.0089. There is a linear relationship.
c) The reported coefficient of determination indicates that shareholder rights explain 16.5% of the variation in a firm's leverage.
Case 14: Risk and Return
1) Y = 1.166957X − 1.090724

2) The stock has above-average risk: b1 > 1.10
3) 95% C.I. for the slope: 1.16696 ± 0.37405
4) When X = 10, Y = 10.5788; 95% P.I.: 10.5788 ± 5.35692
5) Residuals appear random; Durbin-Watson statistic: d = 0.83996


6) The normal probability plot of residuals is a little flatter than normal.

7) Y = 1.157559X − 0.945353; when X = 6, Y = 6.0
Risk has dropped a little but it is still above average since b1 > 1.10


Chapter 11 - Multiple Regression

CHAPTER 11
MULTIPLE REGRESSION
(The template for this chapter is: Multiple Regression.xls.)
11-1.

The assumptions of the multiple regression model are that the errors are normally and independently distributed with mean zero and common variance σ². We also assume that the Xi are fixed quantities rather than random variables; at any rate, they are independent of the error terms. The assumption of normality of the errors is needed for conducting tests about the regression model.

11-2.

Holding advertising expenditures constant, sales volume increases by 1.34 units, on average, per
increase of 1 unit in promotional experiences.

11-3.

In a correlational analysis, we are interested in the relationships among the variables. On the
other hand, in a regression analysis with k independent variables, we are interested in the effects
of the k variables (considered fixed quantities) on the dependent variable only (and not on one
another).

11-4.

A response surface is a generalization to higher dimensions of the regression line of simple linear regression. For example, when 2 independent variables are used, each in the first order only, the response surface is a plane in 3-dimensional Euclidean space. When 7 independent variables are used, each in the first order, the response surface is a 7-dimensional hyperplane in 8-dimensional Euclidean space.

11-5.

8 equations.

11-6.

The least-squares estimators of the parameters of the multiple regression model, obtained as
solutions of the normal equations.

11-7. The normal equations are:

ΣY = nb0 + b1ΣX1 + b2ΣX2
ΣX1Y = b0ΣX1 + b1ΣX1² + b2ΣX1X2
ΣX2Y = b0ΣX2 + b1ΣX1X2 + b2ΣX2²

852 = 100b0 + 155b1 + 88b2
11,423 = 155b0 + 2,125b1 + 1,055b2
8,320 = 88b0 + 1,055b1 + 768b2

From the first equation: b0 = (852 − 155b1 − 88b2)/100. Substituting:

11,423 = 155(852 − 155b1 − 88b2)/100 + 2,125b1 + 1,055b2
8,320 = 88(852 − 155b1 − 88b2)/100 + 1,055b1 + 768b2
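Rather than substituting by hand, the three normal equations can be solved directly as a linear system; a Python sketch using NumPy:

```python
import numpy as np

# Solve the three normal equations of 11-7 as a 3x3 linear system.
A = np.array([[100.0,  155.0,   88.0],
              [155.0, 2125.0, 1055.0],
              [ 88.0, 1055.0,  768.0]])
rhs = np.array([852.0, 11423.0, 8320.0])
b0, b1, b2 = np.linalg.solve(A, rhs)
print(round(b0, 4), round(b1, 4), round(b2, 4))
```

The coefficient matrix is symmetric because it is built from the sums ΣXi and ΣXiXj, which is always the case for least-squares normal equations.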


Continue solving the equations to obtain the solutions:

b0 = −1.1454469   b1 = 0.0487011   b2 = 10.897682

11-8. Using SYSTAT:
DEP VAR: VALUE   N: 9   MULTIPLE R: .909   SQUARED MULTIPLE R: .826
ADJUSTED SQUARED MULTIPLE R: .769   STANDARD ERROR OF ESTIMATE: 59.477

VARIABLE   COEFFICIENT   STD ERROR   STD COEF   TOLERANCE   T        P(2 TAIL)
CONSTANT   -9.800        80.763      0.000                  -0.121   0.907
SIZE       0.173         0.040       0.753      0.9614430   4.343    0.005
DISTANCE   31.094        14.132      0.382      0.9614430   2.200    0.070

ANALYSIS OF VARIANCE
SOURCE       SUM-OF-SQUARES   DF   MEAN-SQUARE   F-RATIO   P
REGRESSION   101032.867        2   50516.433     14.280    0.005
RESIDUAL     21225.133         6   3537.522

Multiple Regression Results (template):

            Intercept   Size      Distance
b           -9.7997     0.17331   31.094
s(b)        80.7627     0.0399    14.132
t           -0.1213     4.34343   2.2002
p-value     0.9074      0.0049    0.0701
VIF                     1.0401    1.0401

ANOVA Table
Source   SS        df   MS       F       FCritical   p-value
Regn.    101033     2   50516    14.28   5.1432      0.0052
Error    21225.1    6   3537.5
Total    122258     8

R² = 0.8264   Adjusted R² = 0.7685   s = 59.477


11-9.

With no advertising and no spending on in-store displays, sales are b0 = 47.165 (thousand) on average. For each unit (thousand) increase in advertising expenditure, keeping in-store display expenditure constant, there is an average increase in sales of b1 = 1.599 (thousand). Similarly, for each unit (thousand) increase in in-store display expenditure, keeping advertising constant, there is an average increase in sales of b2 = 1.149 (thousand).

11-10. We test whether there is a linear relationship between Y and any of the Xi variables (that is, with at least one of the Xi). If the null hypothesis is not rejected, there is nothing more to do since there is no evidence of a regression relationship. If H0 is rejected, we need to conduct further analyses to determine which of the variables have a linear relationship with Y and which do not, and we need to develop the regression model.
11-11. Degrees of freedom for error = n − 13.
11-12. k = 2   n = 82   SSE = 8,650   SSR = 988
MSR = SSR/k = 988/2 = 494
SST = SSR + SSE = 988 + 8,650 = 9,638
MSE = SSE/[n − (k + 1)] = 8,650/79 = 109.4937
F = MSR/MSE = 494/109.4937 = 4.5116
Using the Excel function FDIST(F, dfN, dfD) to return the p-value, where F is the F-test result and the dfs refer to the degrees of freedom in the numerator and denominator, respectively:
FDIST(4.5116, 2, 79) = 0.013953
Yes, there is evidence of a linear regression relationship at α = 0.05, but not at α = 0.01.
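The same F test can be reproduced in Python, assuming SciPy is available (`stats.f.sf` gives the upper-tail probability, matching Excel's FDIST):

```python
from scipy import stats

# F test of 11-12: k = 2 predictors, n = 82 observations.
k, n = 2, 82
ssr, sse = 988.0, 8650.0
msr = ssr / k
mse = sse / (n - (k + 1))
f = msr / mse
p = stats.f.sf(f, k, n - (k + 1))   # upper-tail p-value, like FDIST
print(round(f, 4), round(p, 4))
```

The p-value of about 0.014 sits between 0.01 and 0.05, which is exactly why the conclusion differs at the two significance levels.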

11-13. F(4,40) = MSR/MSE = (7,768/4)/[(15,673 − 7,768)/40] = 1,942/197.625 = 9.827

Yes, there is evidence of a linear regression relationship between Y and at least one of the
independent variables.
11-14.
Source       SS        df   MS         F
Regression   7,474.0    3   2,491.33   48.16
Error        672.5     13   51.73
Total        8,146.5   16

Since the F-ratio is highly significant, there is evidence of a linear regression relationship
between overall appeal score and at least one of the three variables prestige, comfort, and
economy.
11-15. When the sample size is small, and when the degrees of freedom for error are relatively small, so that adding a variable (and thus losing a degree of freedom for error) is a substantial loss.


11-16. R 2 = SSR/SST. As we add a variable, SSR cannot decrease. Since SST is constant, R 2 cannot
decrease.
11-17. No. The adjusted coefficient is used in evaluating the importance of new variables in the
presence of old ones. It does not apply in the case where all we consider is a single independent
variable.
11-18. By the definition of the adjusted coefficient of determination, Equation (11-13):

R̄² = 1 − [SSE/(n − (k + 1))]/[SST/(n − 1)] = 1 − (SSE/SST)·(n − 1)/(n − (k + 1))

But SSE/SST = 1 − R², so the above is equal to:

1 − (1 − R²)(n − 1)/(n − (k + 1))

which is Equation (11-14).
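Equation (11-14) is easy to wrap as a small helper; the sketch below reproduces the 11-21 numbers (n = 17, k = 3), with `adjusted_r2` a hypothetical helper name:

```python
# Adjusted coefficient of determination, Equation (11-14).
def adjusted_r2(r2, n, k):
    return 1 - (1 - r2) * (n - 1) / (n - (k + 1))

# Reproduces 11-21: R^2 = 0.9174 with n = 17 observations, k = 3 predictors.
print(round(adjusted_r2(0.9174, 17, 3), 4))
```

Because (n − 1)/(n − (k + 1)) > 1, the adjusted value is always below R², and the penalty grows with k, which is what makes it useful for comparing models with different numbers of variables.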

11-19. The mean square error gives a good indication of the variation of the errors in regression. However, other measures such as the coefficient of multiple determination and the adjusted coefficient of multiple determination are useful in evaluating the proportion of the variation in the dependent variable explained by the regression, thus giving us a more meaningful measure of the regression fit.
11-20. Given an adjusted R² = 0.021, only 2.1% of the variation in the stock return is explained by the four independent variables.
Using the Excel function FDIST(F, dfN, dfD) to return the p-value: FDIST(2.27, 4, 433) = 0.06093
There is evidence of a linear regression relationship at α = 0.10 only.

11-21. R² = 7,474.0/8,146.5 = 0.9174. A good regression.
R̄² = 1 − (1 − 0.9174)(16/13) = 0.8983
s = √MSE = √51.73 = 7.192

11-22. Given R² = 0.94, k = 2 and n = 383, the adjusted R² is:
R̄² = 1 − (1 − R²)(n − 1)/(n − (k + 1)) = 1 − (1 − 0.94)(382/380) = 0.9397
Therefore, security and time effects characterize 93.97% of the variation in market price. Given the value of the adjusted R², the model is a reliable predictor of market price.

11-23. R̄² = 1 − (1 − R²)(n − 1)/(n − (k + 1)) = 1 − (1 − 0.918)(16/12) = 0.8907
Since R̄² has decreased, do not include the new variable.


11-24. Given R² = 0.769, k = 6 and n = 242:
R̄² = 1 − (1 − R²)(n − 1)/(n − (k + 1)) = 1 − (1 − 0.769)(241/235) = 0.7631
Since R̄² = 76.31%, approximately 76% of the variation in the information price is characterized by the 6 independent marketing variables.
Using the Excel function FDIST(F, dfN, dfD) to return the p-value: FDIST(44.8, 6, 235) = 2.48855E-36
There is evidence of a linear regression relationship at all α levels.
11-25. a. The regression expresses stock returns as a plane in space, with firm size ranking and stock price ranking as the two horizontal axes:
RETURN = 0.484 − 0.030(SIZRNK) − 0.017(PRCRNK)
The t-test for a linear relationship between returns and firm size ranking is highly significant, but not for returns against stock price ranking.
b. We know that R̄² = 0.093 and n = 50, k = 2. Using Equation (11-14), R̄² = 1 − (1 − R²)(n − 1)/(n − (k + 1)), and solving for R²:
R² = 1 − (1 − R̄²)(n − (k + 1))/(n − 1) = 1 − (1 − 0.093)(47/49) = 0.130
Thus, 13% of the variation is due to the two independent variables.

c. The adjusted R 2 is quite low, indicating that the regression on both variables is not a good
model. They should try regressing on size alone.
11-26. R̄² = 1 − (1 − R²)(n − 1)/(n − (k + 1)) = 1 − (1 − 0.72)(712/710) = 0.719
Based solely on this information, this is not a bad regression model.


11-27. k = 8   n = 500   SSE = 6,179   SST = 23,108

Source   SS       df    MS          F
Regn.    16929      8   2116.125    168.153
Error    6179     491   12.5845
Total    23108    499

Using the Excel function FDIST(F, dfN, dfD) to return the p-value: FDIST(168.153, 8, 491) = 0.00 approximately
There is evidence of a linear regression relationship at all α levels.
R² = SSR/SST = 0.7326
R̄² = 1 − [SSE/(n − (k + 1))]/[SST/(n − 1)] = 0.7282
MSE = 12.5845

11-28. A joint confidence region for both parameters is a set of pairs of likely values of β1 and β2 at 95%. This region accounts for the mutual dependency of the estimators and hence is elliptical rather than rectangular. This is why the region may not contain a bivariate point included in the separate univariate confidence intervals for the two parameters.
11-29. Assuming a very large sample size, we use z = bi/s(bi) for testing the significance of each of the slope parameters, with α = 0.05. Critical value of |z| = 1.96.
For firm size: z = 0.06/0.005 = 12.00 (significant)
For firm profitability: z = −5.533 (significant)
For fixed-asset ratio: z = −0.08
For growth opportunities: z = −0.72
For nondebt tax shield: z = 4.29 (significant)
The slope estimates with respect to firm size, firm profitability and nondebt tax shield are not zero. The adjusted R-square indicates that 16.5% of the variation in governance level is explained by the five independent variables. Next step: exclude fixed-asset ratio and growth opportunities from the regression and see what happens to the adjusted R-square.
11-30. 1. The usual caution about the possibility of a Type I error.
2. Multicollinearity may make the tests unreliable.
3. Autocorrelation in the errors may make the tests unreliable.
11-31. 95% C.I.s for β2 through β5:
β2: 5.6 ± 1.96(1.3) = [3.052, 8.148]
β3: 10.35 ± 1.96(6.88) = [−3.135, 23.835]
β4: 3.45 ± 1.96(2.7) = [−1.842, 8.742]
β5: −4.25 ± 1.96(0.38) = [−4.995, −3.505]
The intervals for β3 and β4 contain the point 0.

11-32. Use z = bi/s(bi) for testing the significance of each of the slope parameters, with α = 0.05. Critical value of |z| = 1.96.

For unexpected accruals: z = −2.0775/0.4111 = −5.054 (significant)
For auditor quality: z = 0.5176
For return on investment: z = 1.7785
For expenditure on R&D: z = 2.1161 (significant)
The R-square indicates that 36.5% of the variation in a firm's reputation can be explained by the four independent variables listed.
11-33. Yes. Considering the joint confidence region for both slope parameters is equivalent to conducting an F test for the existence of a linear regression relationship. Since (0,0) is not in the joint 95% region, this is equivalent to rejecting the null hypothesis of the F test at α = 0.05.
11-34. Prestige is not significant (or at least appears so, pending further analysis). Comfort and
Economy are significant (Comfort only at the 0.05 level). The regression should be rerun with
variables deleted.
11-35. Variable Lend seems insignificant because of collinearity with M1 or Price.
11-36. a. As Price is dropped, Lend becomes significant: there is, apparently, a collinearity between
Lend and Price.
b.,c. The best model so far is the one in Table 11-9, with M1 and Price only. The adjusted R 2 for
that model is higher than for the other regressions.
d. For the model in this problem, MINITAB reports F = 114.09. Highly significant. For the
model in Table 11-9: F = 150.67. Highly significant.
e. s = 0.3697. For Problem 11-35: s = 0.3332. As a variable is deleted, s (and its square, MSE)
increases.
f. In Problem 11-35: MSE = s 2 = (0.3332)2 = 0.111.
11-37. Autocorrelation of the regression error may cause this.
11-38. Use z = bi/s(bi) for testing the significance of each of the slope parameters, with α = 0.05. Critical value of |z| = 1.96.
For new technological process: z = −0.014/0.004 = −3.50 (significant)
For organizational innovation: z = 0.25
For commercial innovation: z = 3.2 (significant)
For R&D: z = 4.50 (significant)
All but organizational innovation are important independent variables in explaining employment growth. The R-square indicates that 74.3% of the variation in employment growth is explained by the four independent variables in the equation.


11-39. Regress Profits on Employees and Revenues.

Data
Sl.No.   Profits   Employees   Revenues
  1       -1221      96400       17440
  2       -2808      63000       13724
  3        -773      70600       13303
  4         248      39100        9510
  5          38      37680        8870
  6        1461      31700        6846
  7         442      32847        5937
  8          14      12867        2445
  9          57      11475        2254
 10         108       6000        1311

Multiple Regression Results
           Intercept    Employees     Revenues
b          834.9510     0.0085493   -0.1741487
s(b)       621.1993     0.0644170    0.3409295
t            1.3441     0.1327      -0.5108
p-value      0.2208     0.8982       0.6252
VIF                    29.8304      29.8304

ANOVA Table
Source     SS             df    MS             F       F-Critical   p-value
Regn.      4507008.861     2    2253504.430    2.166   4.737        0.1852
Error      7281731.539     7    1040247.363
Total     11788740.4       9    1309860.044

s = 1019.925    R² = 0.3823    Adjusted R² = 0.2058

Correlation matrix
            Employees   Revenues
Employees    1.0000
Revenues     0.9831     1.0000
Profits     -0.5994    -0.6171

Regression Equation:
Profits = 834.95 + 0.009 Employees - 0.174 Revenues
The regression equation is not significant (F value), and there is a large amount of
multicollinearity present between the two independent variables (0.9831). There is so much
multicollinearity present that the negative partial correlations between the independent variables
and profits are not maintained in the regression results (both of the parameters of the independent
variables should be negative). None of the values of the parameters are significant.
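The sign flip described above can be reproduced from the problem's data; a sketch assuming NumPy is available:

```python
# Sketch reproducing the 11-39 results with NumPy: the near-perfect correlation
# between Employees and Revenues (r close to 0.983) lets one OLS coefficient
# come out positive even though both predictors correlate negatively with Profits.
import numpy as np

profits   = np.array([-1221, -2808, -773, 248, 38, 1461, 442, 14, 57, 108], float)
employees = np.array([96400, 63000, 70600, 39100, 37680, 31700, 32847,
                      12867, 11475, 6000], float)
revenues  = np.array([17440, 13724, 13303, 9510, 8870, 6846, 5937,
                      2445, 2254, 1311], float)

r12 = np.corrcoef(employees, revenues)[0, 1]            # predictor-predictor correlation
X = np.column_stack([np.ones_like(profits), employees, revenues])
b, *_ = np.linalg.lstsq(X, profits, rcond=None)         # OLS: intercept, b1, b2
print(round(r12, 4), np.round(b, 4))
```

The printed correlation and coefficients should match the template output quoted above.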

11-40. The residual plot exhibits both heteroscedasticity and a curvature apparently not accounted for in
the model.


11-41.
a) residuals appear to be normally distributed
b) residuals are not normally distributed
11-42. An outlier is an observation far from the others.
11-43. A plot of the data or a plot of the residuals will reveal outliers. Also, most computer packages
(e.g., MINITAB) will automatically report all outliers and suspected outliers.
11-44. Outliers, unless they are due to errors in recording the data, may contain important information
about the process under study and should not be blindly discarded. The relationship of the true
data may well be nonlinear.
11-45. An outlier tends to tilt the regression surface toward it, because of the high influence of a large
squared deviation in the least-squares formula, thus creating a possible bias in the results.
11-46. An influential observation is one that exerts relatively strong influence on the regression surface.
For example, if all the data lie in one region in X-space and one observation lies far away in X, it
may exert strong influence on the estimates of the regression parameters.
11-47. This creates a bias. In any case, there is no reason to force the regression surface to go through
the origin.
11-48. The residual plot in Figure 11-16 exhibits strong heteroscedasticity.
11-49. The regression relationship may be quite different in a region where we have no observations
from what it is in the estimation-data region. Thus predicting outside the range of available data
may create large errors.
11-50. y = 47.165 + 1.599(8) + 1.149(12) = 73.745 (thousands), i.e., $73,745.
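The arithmetic in 11-50 can be checked directly from the fitted equation:

```python
# The 11-50 point prediction, computed from the fitted equation
# y-hat = 47.165 + 1.599*x1 + 1.149*x2 evaluated at x1 = 8, x2 = 12.

def predict(x1, x2):
    return 47.165 + 1.599 * x1 + 1.149 * x2

print(round(predict(8, 12), 3))   # 73.745 (thousands), i.e., $73,745
```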
11-51. In Problem 11-8: X2 (distance) is not a significant variable, but we use the complete original regression relationship given in that problem anyway (since this problem calls for it):
ŷ = -9.800 + 0.173X1 + 31.094X2
ŷ(1800, 2.0) = -9.800 + (0.173)1800 + (31.094)2.0 = 363.78

11-52. Using the regression coefficients reported in Problem 11-25:

ŷ = 0.484 - 0.030 Sizrnk - 0.017 Prcrnk = 0.484 - 0.030(5.0) - 0.017(6.0) = 0.232
11-53. Estimated SE( Y ) is obtained as:
(3.939 0.6846)/4 = 0.341.
Estimated SE(E(Y | x)) is obtained as: (3.939 0.1799)/4 = 0.085.


11-54. From MINITAB:

Fit: 73.742    StDev Fit: 2.765
95% C.I.: [67.203, 80.281]    95% P.I.: [65.793, 81.692]
(all numbers are in thousands)
11-55. The estimators are the same although their standard errors are different.
11-56. A prediction interval reflects more variation than a confidence interval for the conditional mean
of Y. The additional variation is the variation of the actual predicted value about the conditional
mean of Y (the estimator of which is itself a random variable).
11-57. This is a regression with one continuous variable and one dummy variable. Both variables are significant. Thus there are two distinct regression lines. The coefficient of determination is respectably high. During times of restricted trade with the Orient, the company sells 26,540 more units per month, on average.
11-58. Use the following formula for testing the significance of each of the slope parameters:

z = b_i / s(b_i),  with α = 0.05; critical value of |z| = 1.96

For the dummy variable: z = -0.003 / 0.29 = -0.0103, which is not significant. A firm's being regulated or not does not affect its leverage level.
11-59. Two-way ANOVA.
11-60. Use analysis of covariance. Run it as a regression; Length of Stay is the concomitant variable.
11-61. Early investment is not statistically significant (or may be collinear with another variable). Rerun the regression without it. The dummy variables are both significant; there is a distinct line (or plane, if you do include the insignificant variable) for each type of firm.
11-62. This is a second-order regression model in three independent variables with cross-terms.
11-63. The STEPWISE routine chooses Price and M1 * Price as the best set of explanatory variables.
This gives the estimated regression relationship:
Exports = 1.39 + 0.0229Price + 0.00248M1 * Price
The t-statistics are: 2.36, 4.57, 9.08, respectively. R 2 = 0.822.
11-64. The STEPWISE routine chooses the three original variables: Prod, Prom, and Book, with no
squares. Thus the original regression model of Example 11-3 is better than a model with squared
terms.


Example 11-3 with production costs squared: higher s than original model.

Multiple Regression Results
           Intercept     prod      promo      book     prod^2
b            7.04103   3.10543    2.2761    7.1125    -0.017
s(b)         5.82083   1.76478    0.262     1.9099     0.1135
t            1.20963   1.75967    8.6887    3.7241    -0.15
p-value      0.2451    0.0988     0.0000    0.0020     0.8827
VIF                   34.5783     1.7050    1.2454    32.3282

ANOVA Table
Source     SS         df     MS       F        F-Critical   p-value
Regn.      6325.48     4     1581.4   109.07   3.0556       0.0000
Error       217.472   15     14.498
Total      6542.95    19     344.37

s = 3.8076    R² = 0.9668    Adjusted R² = 0.9579

Example 11-3 with production and promotion costs squared: higher s and slightly higher R².

Multiple Regression Results
           Intercept     prod      promo      book     prod^2   promo^2
b            5.30825   4.29943    1.2803    6.7046   -0.0948    0.0731
s(b)         5.84748   1.95614    0.8094    1.8942    0.1262    0.0564
t            0.90778   2.19792    1.5817    3.5396   -0.7511    1.297
p-value      0.3794    0.0453     0.1360    0.0033    0.4651    0.2156
VIF                   44.4155    17.0182    1.2807   41.7465   16.2580

ANOVA Table
Source     SS         df     MS       F        F-Critical   p-value
Regn.      6348.81     5     1269.8   91.564   2.9582       0.0000
Error       194.145   14     13.867
Total      6542.95    19     344.37

s = 3.7239    R² = 0.9703    Adjusted R² = 0.9597

Example 11-3 with promotion costs squared: slightly lower s, slightly higher R².

Multiple Regression Results
           Intercept     prod      promo      book    promo^2
b            9.21031   2.86071    1.5635    7.0476    0.053
s(b)         2.64412   0.39039    0.7057    1.8114    0.0489
t            3.48332   7.3279     2.2157    3.8908    1.0844
p-value      0.0033    0.0000     0.0426    0.0014    0.2953
VIF                    1.8219    13.3224    1.2062   12.5901

ANOVA Table
Source     SS         df     MS       F        F-Critical   p-value
Regn.      6340.98     4     1585.2   117.74   3.0556       0.0000
Error       201.967   15     13.464
Total      6542.95    19     344.37

s = 3.6694    R² = 0.9691    Adjusted R² = 0.9609

11-65. Use the following formula for testing the significance of each of the slope parameters:

z = b_i / s(b_i),  with α = 0.05; critical value of |z| = 1.96

For After * Bankdep: z = -0.398 / 0.035 = -11.3714 (significant interaction)
For After * Bankdep * ROA: z = 2.7193 (significant interaction)
For After * ROA: z = -3.00 (significant interaction)
For Bankdep * ROA: z = -3.9178 (significant interaction)
An adjusted R-square of 0.53 indicates that 53% of the variation in bank equity is explained by the interactions among the independent variables.
11-66. The squared X1 variable and the cross-product term appear not significant. Drop the least significant term first, i.e., the squared X1, and rerun the regression. See what happens to the cross-product term now.
11-67. Try a quadratic regression (you should get a negative estimated x² coefficient).
11-68. Try a quadratic regression (you should get a positive estimated x² coefficient). Also try a cubic polynomial.
11-69. Linearizing a model; finding a more parsimonious model than is possible without a
transformation; stabilizing the variance.


11-70. A transformed model may be more parsimonious, when the model describes the process well.
11-71. Try the transformation logY.
11-72. A good model is log(Exports) versus log(M1) and log(Price). This model has R² = 0.8652, which implies a multiplicative relation.
11-73. A logarithmic model.
11-74. This dataset fits an exponential model, so use a logarithmic transformation to linearize it.
11-75. A multiplicative relation (Equation (11-26)) with multiplicative errors. The reported error term,
, is the logarithm of the multiplicative error term. The transformed error term is assumed to
satisfy the usual model assumptions.
11-76. An exponential model: Y = e^(β0 + β1x1 + β2x2) = e^(3.79 + 1.66x1 + 2.91x2)

11-77. No. We cannot find a transformation that will linearize this model.
11-78. Take logs of both sides of the equation, giving:
log Q = log β0 + β1 log C + β2 log K + β3 log L + log ε
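The log-linearization in 11-78 can be sketched numerically: simulate a multiplicative model, regress logs on logs, and recover the exponents. All parameter values below are made up for illustration.

```python
# Sketch of the 11-78 transformation: simulate Q = b0 * C^b1 * K^b2 * L^b3
# (illustrative parameters), then recover the exponents by OLS on the logged
# equation. Noiseless data is used so the recovery is exact.
import numpy as np

rng = np.random.default_rng(0)
n = 200
C, K, L = rng.uniform(1.0, 10.0, (3, n))
Q = 2.0 * C**0.3 * K**0.5 * L**0.2

X = np.column_stack([np.ones(n), np.log(C), np.log(K), np.log(L)])
coef, *_ = np.linalg.lstsq(X, np.log(Q), rcond=None)
print(np.round(coef, 3))   # intercept is log(2.0); slopes are 0.3, 0.5, 0.2
```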
11-79. Take reciprocals of both sides of the equation.
11-80. The square-root transformation Y′ = √Y.
11-81. No. They minimize the sum of the squared deviations relevant to the estimated, transformed
model.
11-82. It is possible that the relation between a firm's total assets and bank equity is not linear. Including the logarithm of a firm's total assets is an attempt to linearize that relationship.
11-83. Correlation matrix:

         Earn    Prod    Prom
Prod     .867
Prom     .882    .638
Book     .547    .402    .319

As evidenced by the relatively low correlations between the independent variables, multicollinearity does not seem to be serious here.


11-84. The VIFs are: 1.82, 1.70, 1.20. No severe multicollinearity is present.
11-85. The sample correlation is 0.740. VIF = 2.2, indicating a minor multicollinearity problem.
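With only one other regressor, the VIF reduces to 1 / (1 - r²), which reproduces the value quoted in 11-85:

```python
# VIF for a two-regressor model: VIF = 1 / (1 - r^2). With r = 0.740 this
# gives roughly 2.21, matching the VIF = 2.2 quoted above.

def vif_two_regressors(r):
    """Variance inflation factor given the correlation r with the other regressor."""
    return 1.0 / (1.0 - r**2)

print(round(vif_two_regressors(0.740), 2))   # → 2.21
```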
11-86.
a) Ŷ = 11.031 + 0.41869 X1 - 7.2579 X2 + 37.181 X3

Multiple Regression Results
           Intercept       X1         X2        X3
b            11.031     0.41869   -7.2579    37.181
s(b)         20.9905    0.28418    5.3287    26.545
t             0.52552   1.47334   -1.362      1.4007
p-value       0.6107    0.1714     0.2031     0.1916
VIF                     1.0561   557.7      557.9

ANOVA Table
Source     SS        df     MS       F        F-Critical   p-value
Regn.      2459.78    3     819.93   1.3709   3.7083       0.3074
Error      5981.02   10     598.1
Total      8440.8    13     649.29

s = 24.456    R² = 0.2914    Adjusted R² = 0.0788

b) Ŷ = 20.8808 + 0.29454 X1 + 16.583 X2 - 81.717 X3

Multiple Regression Results
           Intercept       X1         X2         X3
b            20.8808    0.29454   16.583     -81.717
s(b)         23.5983    0.29945   23.96      119.5
t             0.88484   0.98361    0.6921     -0.6838
p-value       0.3970    0.3485     0.5046      0.5096
VIF                     1.0262   9867.0     9867.4

ANOVA Table
Source     SS        df     MS       F        F-Critical   p-value
Regn.      1605.98    3     535.33   0.7832   3.7083       0.5300
Error      6834.82   10     683.48
Total      8440.8    13     649.29

s = 26.143    R² = 0.1903    Adjusted R² = -0.0527

c) All the parameters of the equation change values and some change signs. X2 and X3 are highly correlated. Solution: use either X2 or X3, but not both.
d) Yes, the correlation matrix indicated that X2 and X3 were correlated:

        X1        X2        X3
X1     1.0000
X2    -0.0137    1.0000
X3    -0.0237    0.9991    1.0000

11-87. Artificially high variances of regression coefficient estimators; unexpected magnitudes of some
coefficient estimates; sometimes wrong signs of these coefficients. Large changes in coefficient
estimates and standard errors as a variable or a data point is added or deleted.
11-88. Perfect collinearity exists when at least one variable is a linear combination of other variables. This causes the determinant of the X′X matrix to be zero, and thus the matrix is non-invertible. The estimation procedure breaks down in such cases. (Other, less technical, explanations based on the text will suffice.)
11-89. Not true. Predictions may be good when carried out within the same region of the
multicollinearity as used in the estimation procedure.
11-90. No. There are probably no relationships between Y and either of the two independent variables.
11-91. X 2 and X 3 are probably collinear.
11-92. Delete one of the variables X 2, X 3, X 4 to check for multicollinearity among a subset of these
three variables, or whether they are all insignificant.
11-93. Drop some of the other variables one at a time and see what happens to the suspected sign of the
estimate.
11-94. The purpose of the test is to check for a possible violation of the assumption that the regression
errors are uncorrelated with each other.
11-95. Autocorrelation is correlation of a variable with itself, lagged back in time. Third-order
autocorrelation is a correlation of a variable with itself lagged 3 periods back in time.
11-96. First-order autocorrelation is a correlation of a variable with itself lagged one period back in
time. Not necessarily: a partial fifth-order autocorrelation may exist without a first-order
autocorrelation.
11-97. 1) The test checks only for first-order autocorrelation. 2) The test may not be conclusive.
3) The usual limitations of a statistical test owing to the two possible types of errors.


11-98. DW = 0.93,  n = 21,  k = 2
dL = 1.13,  dU = 1.54,  4 - dL = 2.87,  4 - dU = 2.46
At the 0.10 level, there is some evidence of a positive first-order autocorrelation.
11-99. DW = 2.13,  n = 20,  k = 3
dL = 1.00,  dU = 1.68,  4 - dL = 3.00,  4 - dU = 2.32
At the 0.10 level, there is no evidence of a first-order autocorrelation.
(Template output: Durbin-Watson d = 2.125388)

11-100. DW = 1.79,  n = 10,  k = 2. Since the table does not list values for n = 10, we will use the closest table values, those for n = 15 and k = 2:
dL = 0.95,  dU = 1.54,  4 - dL = 3.05,  4 - dU = 2.46
At the 0.10 level, there is no evidence of a first-order autocorrelation. Note that the table values decrease as n decreases, and thus our conclusion would probably also hold if we knew the actual critical points for n = 10 and used them.
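The DW statistic tested in 11-98 through 11-100 is simple to compute from residuals; a minimal sketch (the residual series is made up for illustration):

```python
# A from-scratch Durbin-Watson statistic: d = sum((e_t - e_{t-1})^2) / sum(e_t^2).
# d near 2 suggests no first-order autocorrelation; d well below 2 suggests
# positive autocorrelation, d well above 2 suggests negative autocorrelation.

def durbin_watson(e):
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    den = sum(x * x for x in e)
    return num / den

residuals = [0.5, -0.3, 0.4, -0.6, 0.2, 0.1, -0.4, 0.3]   # illustrative values
print(round(durbin_watson(residuals), 3))
```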
11-101. Suppose that we have time-series data and that it is known that, if the data are autocorrelated, by the nature of the variables the correlation can only be positive. In such cases, where the hypothesis is made before looking at the actual data, a one-sided DW test may be appropriate. (And similarly for a negative autocorrelation.)
11-102. DW analysis of the results from Problem 11-39:
Durbin-Watson d = 1.552891, with k = 2 independent variables and n = 10.
Table 7 for the critical values of the DW statistic begins with sample size 15, which is a little larger than our sample. Using the values for size 15 as an approximation, we have, for α = 0.05, dL = 0.95 and dU = 1.54. The value of d is slightly larger than dU, indicating no autocorrelation.

Residual plot with Employees on the x-axis:

[Residual Plot: residuals (roughly -2000 to 2000) plotted against Employees (0 to 120,000).]

11-103. F(r, n-(k+1)) = [(SSE_R - SSE_F) / r] / MSE_F = [(6.996 - 6.9898) / 2] / 0.1127 = 0.0275

Cannot reject H0. The two variables should definitely be dropped; they add nothing to the model.
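The partial-F computation in 11-103 can be checked directly with the quoted numbers:

```python
# The partial-F test for dropping r variables from the full model:
# F = ((SSE_R - SSE_F) / r) / MSE_F, with the values quoted in 11-103.

def partial_f(sse_reduced, sse_full, r_dropped, mse_full):
    return ((sse_reduced - sse_full) / r_dropped) / mse_full

F = partial_f(6.996, 6.9898, 2, 0.1127)
print(round(F, 4))   # approximately 0.0275
```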
11-104. Ŷ = 47.16 + 1.599X1 + 1.149X2. The STEPWISE regression routine selects both variables for the equation. R² = 0.961.
11-105. The STEPWISE procedure selects all three variables. R 2 = 0.9667.
11-106. All-possible-regressions is the best procedure because it evaluates every possibility. It is expensive in computer time; however, as computing power and speed increase, it becomes a very viable option. Forward selection is limited by the fact that once a variable is in, there is no way it can come out once it becomes insignificant in the presence of new variables. Backward elimination is similarly limited. Stepwise regression is an excellent method that enjoys very wide use and that has stood the test of time. It has the advantages of both the forward and the backward methods, without their limitations.
11-107. Because a variable may lose explanatory power and become insignificant once other variables
are added to the model.
11-108. Highest adjusted R²; lowest MSE; highest R² for a given number of variables and the assessment of the increase in R² as we increase the number of variables; Mallows's Cp.
11-109. No. There may be several different best models. A model may be best using one criterion, and
not the best using another criterion.


11-110. Results will vary. Sample regression for Australia.
(Data source: Foreign Statistics / Handbook of International Economic Statistics / Tables)

Australia
Year    Real GDP   Defense Exp % GDP   Population   Grain Yields
1970      171            2.3              14.6         1,219
1980      238            2.7              17.0         1,052
1990      328            2.2              17.3         1,670
1992      330            2.3              17.5         1,800
1993      342            2.6              17.7         2,000
1994      359            2.5              17.9         1,230
1995      369            2.7              18.1         1,800
1996      382            2.6              18.3         2,090
1997      394            2.5              18.4         1,790

Multiple Regression Results
           Intercept   Defense Exp % GDP   Population   Grain Yields
b           -583.38        -64.709           58.04         0.035
s(b)         123.753        45.0667           8.8057       0.0246
t             -4.714        -1.4358           6.5912       1.4181
p-value       0.0053         0.2105           0.0012       0.2154
VIF                          1.3387           2.0253       1.6331

ANOVA Table
Source     SS        df     MS        F        F-Critical   p-value
Regn.      40654.3    3     13551     33.218   5.4094       0.0010
Error       2039.75   5       407.95
Total      42694      8      5336.8

s = 20.198    R² = 0.9522    Adjusted R² = 0.9236

Correlation matrix
               Defense   Population   Grain Yields
Defense         1.0000
Population      0.4444     1.0000
Grain Yields    0.0689     0.5850       1.0000
Real GDP        0.2573     0.9484       0.7023

Partial F Calculations (Australia)
Independent variables in full model: k = 3
Independent variables dropped from the model: r = 2
SSE_F = 2039.748    SSE_R = 39867.85
Partial F = 46.36369,  p-value = 0.0010

The model is significant, with a high R², a high F-value, and low multicollinearity.


11-111. Substitution of a variable with its logarithm transforms a non-linear model to a linear model. In
this case, the logarithm of size of fund has a linear relationship with the dependent variables.
11-112. Since the t-statistic for each variable alone is significant, and given the R-square, we can conclude that a good linear relation exists between the dependent and independent variables. Since the t-statistics of the cross products are not significant, there is no relation among the independent variables and the cross products. In conclusion, there is only a linear relationship among the dependent and independent variables.
11-113. Using MINITAB:

Regression Analysis: Com. Eff. versus Sincerity, Excitement, ...

The regression equation is
Com. Eff. = -36.5 + 0.098 Sincerity + 1.99 Excitement + 0.507 Ruggedness
            - 0.366 Sophistication

Predictor           Coef   SE Coef      T      P
Constant          -36.49     24.27  -1.50  0.171
Sincerity         0.0983    0.3021   0.33  0.753
Excitement        1.9859    0.2063   9.63  0.000
Ruggedness        0.5071    0.7540   0.67  0.520
Sophistication   -0.3664    0.3643  -1.01  0.344

S = 3.68895   R-Sq = 94.6%   R-Sq(adj) = 91.8%

Based on the p-values for the estimated coefficients, only the assessed excitement variable is
significant. The adjusted R-square indicates that 91.8% of the variation in commercial
effectiveness is explained by the model. The ANOVA test indicates that a linear relation exists
between the dependent and independent variables.


Analysis of Variance

Source          DF       SS      MS      F      P
Regression       4  1890.36  472.59  34.73  0.000
Residual Error   8   108.87   13.61
Total           12  1999.23

11-114. STEPWISE chooses only Number of Rooms and Assessed Value.
b0 = 91018,  b1 = 7844,  b2 = 0.2338,  R² = 0.591
11-115. Answers to this web exercise will vary with selected countries and date of access.
Case 15: Return on Capital for Four Different Sectors

Indicator variables used (Banking is the base sector):
Sector         I1   I2   I3
Banking         0    0    0
Computers       1    0    0
Construction    0    1    0
Energy          0    0    1

1. Multiple Regression Results

           Intercept     Sales      Oper M    Debt/C      I1        I2        I3
b            14.6209   2.30E-05    0.0824   -0.0919    10.051    2.8059   -1.6419
s(b)          2.51538  2.60E-05    0.0553    0.0444     2.0249   2.2756    1.8725
t             5.81259  0.88781     1.4905   -2.0692     4.9636   1.2331   -0.8769
p-value       0.0000   0.3770      0.1396    0.0414     0.0000   0.2208    0.3829
VIF                    1.2472      1.2212    1.6224     1.8560   1.8219    1.9096

Based on the regression coefficients of I1, I2, I3, the ranking of the sectors from highest return to lowest will be:
Computers, Construction, Banking, Energy

2. From the "Partial F" sheet, the p-value is almost zero. Hence the type of industry is significant.

3. 95% Prediction Intervals:

Sector          95% Prediction Interval
Banking         12.9576 ± 12.977
Computers       23.0082 ± 13.295
Construction    15.7635 ± 13.139
Energy          11.3157 ± 12.864


CHAPTER 12
TIME SERIES, FORECASTING, AND INDEX NUMBERS
12-1.

Trend analysis is a quick method of determining in which general direction the data are moving
through time. The method lacks, however, the theoretical justification of regression analysis
because of the inherent autocorrelations and the intended use of the method in extrapolation
beyond the estimation data set.

12-2. The trend regression is:
b0 = 28.7273,  b1 = -0.6947,  r² = 0.511
ŷ = 28.7273 - 0.6947 t
ŷ(Jul-2007) = 12.055% for t = 24

(Using the template: Trend Forecast.xls)

Forecasting with Trend
 t    Z-hat
24    12.0553
25    11.3607
26    10.666
27     9.9713
28     9.27668

Regression Statistics: r² = 0.5111, MSE = 22.24426, Slope = -0.69466, Intercept = 28.72727

Forecast for July, 2007 (t = 24) = 12.0553%


12-3. The trend regression is:
b0 = 34.818,  b1 = 12.566,  r² = 0.9858
ŷ = 34.818 + 12.566 t
ŷ(2008) = 198.182,  ŷ(2009) = 210.748

(Using the template: Trend Forecast.xls)


Forecasting with Trend

Data
Period    t    Zt
1996      1    53
1997      2    65
1998      3    74
1999      4    85
2000      5    92
2001      6   105
2002      7   120
2003      8   128
2004      9   144
2005     10   158
2006     11   179
2007     12   195

Forecast
 t    Z-hat
13    198.182
14    210.748
15    223.315
16    235.881
17    248.448
18    261.014
19    273.58
20    286.147
21    298.713
22    311.28
23    323.846
24    336.413

Regression Statistics: r² = 0.9858, MSE = 32.51189, Slope = 12.56643, Intercept = 34.81818

Forecast for 2008 (t = 13) = 198.182 and for 2009 (t = 14) = 210.748
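The trend fit behind 12-3 can be reproduced with an ordinary least-squares line of Z on t; a sketch assuming NumPy:

```python
# OLS trend fit for the 12-3 data: recovers the template's slope and intercept
# and the t = 13 (year 2008) forecast.
import numpy as np

t = np.arange(1, 13)
Z = np.array([53, 65, 74, 85, 92, 105, 120, 128, 144, 158, 179, 195], float)

slope, intercept = np.polyfit(t, Z, 1)       # degree-1 polynomial = linear trend
forecast_2008 = intercept + slope * 13
print(round(slope, 5), round(intercept, 5), round(forecast_2008, 3))
```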

12-4. The trend regression is:
b0 = -0.873,  b1 = 3.327,  r² = 0.8961
ŷ = -0.873 + 3.327 t
ŷ = 39.05% for t = 12

(Using the template: Trend Forecast.xls)

Forecasting with Trend
 t    Z-hat
12    39.0545
13    42.3818
14    45.7091
15    49.0364
16    52.3636

Regression Statistics: r² = 0.8961, MSE = 15.68081, Slope = 3.327273, Intercept = -0.87273

Forecast for next year (t = 12) = 39.05%


12-5.

No, because of the seasonality.

12-6.

No. Cycles are not well modeled by trend analysis.

12-7. The term "seasonal variation" is reserved for variation with a cycle of one year.

12-8.

There will be too few degrees of freedom for error.

12-9.

The weather, for one thing, changes from year to year. Thus sales of winter clothing, as an
example, would have a variable seasonal component.


12-10. Using MINITAB to conduct a multiple regression with a time variable and 11 dummy variables:

Regression Analysis: profit versus t, jan, ...

The regression equation is
profit = 0.163 + 0.0521 t + 0.123 jan + 0.121 feb + 0.319 mar + 0.567 apr
         + 0.615 may + 0.413 jun + 0.510 jul + 0.758 aug + 0.856 sep
         + 0.904 oct + 0.602 nov

Predictor      Coef    SE Coef      T      P
Constant     0.1625    0.3104   0.52  0.611
t           0.05208    0.01129  4.61  0.001
jan          0.1229    0.3543   0.35  0.735
feb          0.1208    0.3505   0.34  0.737
mar          0.3188    0.3470   0.92  0.378
apr          0.5667    0.3439   1.65  0.128
may          0.6146    0.3411   1.80  0.099
jun          0.4125    0.3387   1.22  0.249
jul          0.5104    0.3366   1.52  0.158
aug          0.7583    0.3349   2.26  0.045
sep          0.8563    0.3336   2.57  0.026
oct          0.9042    0.3326   2.72  0.020
nov          0.6021    0.3320   1.81  0.097

S = 0.331834   R-Sq = 83.2%   R-Sq(adj) = 64.8%

Analysis of Variance
Source          DF      SS      MS     F      P
Regression      12  5.9783  0.4982  4.52  0.009
Residual Error  11  1.2112  0.1101
Total           23  7.1896

The adjusted R-square is reasonable. Setting t = 25, Jan = 1, and the rest of the months = 0, we get a forecasted value for Jan 2007 of about 1.588:

Predicted Values for New Observations
New Obs     Fit   SE Fit          95% CI            95% PI
      1  1.5875   0.3104  (0.9043, 2.2707)  (0.5874, 2.5876)

Values of Predictors for New Observations
New Obs     t    jan   (feb through nov all 0.000000)
      1  25.0   1.00

12-11. Using trend analysis:

The trend regression is:
b0 = 8165707,  b1 = 40169.72,  r² = 0.9715
ŷ = 8165707 + 40169.72 t
ŷ = 8728083 for t = 14

(Using the template: Trend Forecast.xls)

Forecasting with Trend
 t    Z-hat
14    8728083
15    8768252
16    8808422
17    8848592
18    8888761

Regression Statistics: r² = 0.9715, MSE = 7.82E+08, Slope = 40169.72, Intercept = 8165707

Forecast for next year (t = 14) = 8728083

12-12. Using a computer:

Trend line:  Zhat(t) = 7.2043 - 0.0194 t
C(t) = CMA / Zhat(t);  Ratio = 100 * Z(t) / CMA;  Deseasonalized = Z(t) / S%

 t   Mon.   Z(t)   Zhat(t)   CMA    C(t)    Ratio    S       Deseas.
 1   Jul    7.40    7.18                             95.68    7.73
 2   Aug    6.80    7.17                             92.25    7.37
 3   Sep    6.40    7.15                             90.57    7.07
 4   Oct    6.60    7.13                             97.57    6.76
 5   Nov    6.50    7.11                             95.96    6.77
 6   Dec    6.00    7.09                             92.22    6.51
 7   Jan    7.00    7.07     7.02   0.993   99.76   102.47    6.83
 8   Feb    6.70    7.05     7.01   0.995   95.54    98.21    6.82
 9   Mar    8.20    7.03     7.05   1.002  116.38   114.41    7.17
10   Apr    7.80    7.01     7.10   1.012  109.92   110.59    7.05
11   May    7.70    6.99     7.15   1.022  107.76   109.60    7.03
12   Jun    7.30    6.97     7.20   1.032  101.45   100.45    7.27
13   Jul    7.00    6.95     7.25   1.043   96.55    95.68    7.32
14   Aug    7.10    6.93     7.30   1.052   97.32    92.25    7.70
15   Sep    6.90    6.91     7.30   1.057   94.47    90.57    7.62
16   Oct    7.30    6.89     7.29   1.057  100.17    97.57    7.48
17   Nov    7.00    6.87     7.28   1.059   96.16    95.96    7.29
18   Dec    6.70    6.86     7.25   1.058   92.41    92.22    7.27
19   Jan    7.60    6.84     7.20   1.053  105.62   102.47    7.42
20   Feb    7.20    6.82     7.11   1.043  101.29    98.21    7.33
21   Mar    7.90    6.80     7.00   1.029  112.92   114.41    6.90
22   Apr    7.70    6.78     6.89   1.017  111.73   110.59    6.96
23   May    7.60    6.76     6.79   1.005  111.90   109.60    6.93
24   Jun    6.70    6.74     6.71   0.996   99.88   100.45    6.67
25   Jul    6.30    6.72     6.62   0.985   95.21    95.68    6.58
26   Aug    5.70    6.70     6.51   0.971   87.58    92.25    6.18
27   Sep    5.60    6.68     6.43   0.963   87.05    90.57    6.18
28   Oct    6.10    6.66     6.40   0.960   95.37    97.57    6.25
29   Nov    5.80    6.64                             95.96    6.04
30   Dec    5.90    6.62                             92.22    6.40
31   Jan    6.20    6.60                            102.47    6.05
32   Feb    6.00    6.58                             98.21    6.11
33   Mar    7.30    6.56                            114.41    6.38
34   Apr    7.40    6.54                            110.59    6.69

-----------FORECAST-----------
t = 35 (May): (Zhat = 6.525)(S = 109.60) / 100 = 7.15

Template forecast is 7.045.
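The ratio-to-moving-average step in 12-12 can be sketched in code, using the first 24 months of the series above: a centered (2x12) moving average, then 100 * Z(t) / CMA(t) as the raw seasonal ratio.

```python
# Centered moving average and ratio-to-moving-average, reproducing the first
# few ratios of 12-12 (99.76, 95.54, ...).

def centered_ma(z, period=12):
    """Centered moving average: average of two overlapping 12-term means."""
    half = period // 2
    out = [None] * len(z)
    for t in range(half, len(z) - half):
        w1 = sum(z[t - half:t + half]) / period
        w2 = sum(z[t - half + 1:t + half + 1]) / period
        out[t] = (w1 + w2) / 2
    return out

z = [7.4, 6.8, 6.4, 6.6, 6.5, 6.0, 7.0, 6.7, 8.2, 7.8, 7.7, 7.3,
     7.0, 7.1, 6.9, 7.3, 7.0, 6.7, 7.6, 7.2, 7.9, 7.7, 7.6, 6.7]
cma = centered_ma(z)
ratios = [100 * z[t] / cma[t] for t in range(len(z)) if cma[t] is not None]
print([round(r, 2) for r in ratios[:4]])   # → [99.76, 95.54, 116.38, 109.92]
```

Averaging these ratios by month (and normalizing to sum to 1200) yields the seasonal indices S shown in the table.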

12-13. (Using the template: Trend+Season Forecasting.xls, sheet: monthly)

Forecasting with Trend and Seasonality

 t   Year   Month     Y     Deseasonalized
 1   2004   11 Nov   0.38     0.40913
 2   2004   12 Dec   0.38     0.41684
 3   2005    1 Jan   0.44     0.45224
 4   2005    2 Feb   0.42     0.42406
 5   2005    3 Mar   0.44     0.48048
 6   2005    4 Apr   0.46     0.49272
 7   2005    5 May   0.48     0.45687
 8   2005    6 Jun   0.49     0.45687
 9   2005    7 Jul   0.51     0.4539
10   2005    8 Aug   0.52     0.44922
11   2005    9 Sep   0.45     0.44242
12   2005   10 Oct   0.4      0.43222
13   2005   11 Nov   0.39     0.4199
14   2005   12 Dec   0.37     0.40587
15   2006    1 Jan   0.38     0.39057
16   2006    2 Feb   0.37     0.37357
17   2006    3 Mar   0.33     0.36036
18   2006    4 Apr   0.33     0.35347
19   2006    5 May   0.32     0.30458
20   2006    6 Jun   0.32     0.29837
21   2006    7 Jul   0.32     0.2848
22   2006    8 Aug   0.31     0.26781

Trend Equation: Intercept = 0.518283, Slope = -0.00818

Forecasts
 t   Year   Month     Y
23   2006    9 Sep   0.33587
24   2006   10 Oct   0.29803
25   2006   11 Nov   0.29152
26   2006   12 Dec   0.27867

Forecast for Sep, 2006 = 0.33587

12-14. (Using the template: Trend+Season Forecasting.xls, sheet: monthly)

Forecasting with Trend and Seasonality

Forecast for Oct, 2006 = 28.73718

 t   Year   Month     Y    Deseasonalized
 1   2005    1 Jan   14      16.8856
 2   2005    2 Feb   10      22.2728
 3   2005    3 Mar   50      54.0922
 4   2005    4 Apr   24      24.6668
 5   2005    5 May   16      15.3033
 6   2005    6 Jun   15      15.8805
 7   2005    7 Jul   20      22.3533
 8   2005    8 Aug   42      22.5141
 9   2005    9 Sep   18      21.3884
10   2005   10 Oct   26      20.2627
11   2005   11 Nov   21      20.6647
12   2005   12 Dec   20      21.4286
13   2006    1 Jan   18      21.71
14   2006    2 Feb   10      22.2728
15   2006    3 Mar   22      23.8006
16   2006    4 Apr   24      24.6668
17   2006    5 May   26      24.8678
18   2006    6 Jun   24      25.4087
19   2006    7 Jul   18      20.1179
20   2006    8 Aug   58      31.0909
21   2006    9 Sep   40      47.5297

Trend Equation: Intercept = 22.54861, Slope = -0.00694

Forecasts
 t   Year   Month      Y
22   2006   10 Oct   28.73718
23   2006   11 Nov   22.75217
24   2006   12 Dec   20.88982
25   2007    1 Jan   18.55136

The forecast for October is considerably less than the actual percents recorded for August and
September. The forecast reflects the historical percentage of negative stories instead of the
recent past history.
12-15. (Using the template: Trend+Season Forecasting.xls)

Forecasting with Trend and Seasonality (quarterly)

 t   Year   Q     Y     Deseasonalized
 1   2005   1    3.4      3.869621
 2   2005   2    4.5      4.150717
 3   2005   3    4        4.258289
 4   2005   4    5        4.554288
 5   2006   1    4.2      4.78012
 6   2006   2    5.4      4.98086
 7   2006   3    4.9      5.216404
 8   2006   4    5.7      5.191888
 9   2007   1    4.6      5.23537

Forecasts
 t   Year   Q      Y
10   2007   2    6.20676
11   2007   3    5.56327
12   2007   4    6.71894

Seasonal Indices
 Q    Index
 1    87.86
 2   108.42
 3    93.93
 4   109.79
(sum = 400)

Forecast for Q2, 2007 = 6.20676

12-16. Assuming a weight of 0.4.
(Using the template: Exponential Smoothing.xls)

Exponential Smoothing:  MAE = 3.3688,  MAPE = 7.91%,  MSE = 18.2177

Period   Actual   Forecast
45        27      27.6959
46        26      27.4175
47        27      26.8505
48        28      26.9103
49                27.3462

Forecast for next quarter = 27.3462


12-17. Using a computer:  Zhat(1) = Z(1) = 57

w = 0.3:
Zhat( 2): 0.3(57.00) + 0.7(57.00) = 57.00
Zhat( 3): 0.3(58.00) + 0.7(57.00) = 57.30
Zhat( 4): 0.3(60.00) + 0.7(57.30) = 58.11
Zhat( 5): 0.3(54.00) + 0.7(58.11) = 56.88
Zhat( 6): 0.3(56.00) + 0.7(56.88) = 56.61
Zhat( 7): 0.3(53.00) + 0.7(56.61) = 55.53
Zhat( 8): 0.3(55.00) + 0.7(55.53) = 55.37
Zhat( 9): 0.3(59.00) + 0.7(55.37) = 56.46
Zhat(10): 0.3(62.00) + 0.7(56.46) = 58.12
Zhat(11): 0.3(57.00) + 0.7(58.12) = 57.79
Zhat(12): 0.3(50.00) + 0.7(57.79) = 55.45
Zhat(13): 0.3(48.00) + 0.7(55.45) = 53.21
Zhat(14): 0.3(52.00) + 0.7(53.21) = 52.85
Zhat(15): 0.3(55.00) + 0.7(52.85) = 53.50
Zhat(16): 0.3(58.00) + 0.7(53.50) = 54.85
Zhat(17): 0.3(61.00) + 0.7(54.85) = 56.69

w = 0.8:
Zhat( 2): 0.8(57.00) + 0.2(57.00) = 57.00
Zhat( 3): 0.8(58.00) + 0.2(57.00) = 57.80
Zhat( 4): 0.8(60.00) + 0.2(57.80) = 59.56
Zhat( 5): 0.8(54.00) + 0.2(59.56) = 55.11
Zhat( 6): 0.8(56.00) + 0.2(55.11) = 55.82
Zhat( 7): 0.8(53.00) + 0.2(55.82) = 53.56
Zhat( 8): 0.8(55.00) + 0.2(53.56) = 54.71
Zhat( 9): 0.8(59.00) + 0.2(54.71) = 58.14
Zhat(10): 0.8(62.00) + 0.2(58.14) = 61.23
Zhat(11): 0.8(57.00) + 0.2(61.23) = 57.85
Zhat(12): 0.8(50.00) + 0.2(57.85) = 51.57
Zhat(13): 0.8(48.00) + 0.2(51.57) = 48.71
Zhat(14): 0.8(52.00) + 0.2(48.71) = 51.34
Zhat(15): 0.8(55.00) + 0.2(51.34) = 54.27
Zhat(16): 0.8(58.00) + 0.2(54.27) = 57.25
Zhat(17): 0.8(61.00) + 0.2(57.25) = 60.25

The w = .8 forecasts follow the raw data much more closely. This makes sense because the raw
data jump back and forth fairly abruptly, so we need a high w for the forecasts to respond to
those oscillations sooner.
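The recursion used in 12-17, F(t+1) = w*Z(t) + (1-w)*F(t) with F(1) = Z(1), can be sketched as:

```python
# Simple exponential smoothing; reproduces the final w = 0.3 and w = 0.8
# forecasts computed above for the 12-17 data.

def exponential_smoothing(z, w):
    """Return one-step-ahead forecasts F(1), ..., F(n+1)."""
    forecasts = [z[0]]                             # F(1) = Z(1)
    for obs in z:
        forecasts.append(w * obs + (1 - w) * forecasts[-1])
    return forecasts

z = [57, 58, 60, 54, 56, 53, 55, 59, 62, 57, 50, 48, 52, 55, 58, 61]
f03 = exponential_smoothing(z, 0.3)
f08 = exponential_smoothing(z, 0.8)
print(round(f03[-1], 2), round(f08[-1], 2))   # → 56.69 60.25
```

A larger w weights recent observations more heavily, which is why the w = 0.8 forecasts track the jumpy raw series more closely.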


12-18. Using a computer:

w = 0.7

Zhat(1) = Z(1) = 195

Zhat( 2): 0.7(195.00) + 0.3(195.00) = 195.00
Zhat( 3): 0.7(193.00) + 0.3(195.00) = 193.60
Zhat( 4): 0.7(190.00) + 0.3(193.60) = 191.08
Zhat( 5): 0.7(185.00) + 0.3(191.08) = 186.82
Zhat( 6): 0.7(180.00) + 0.3(186.82) = 182.05
Zhat( 7): 0.7(190.00) + 0.3(182.05) = 187.61
Zhat( 8): 0.7(185.00) + 0.3(187.61) = 185.78
Zhat( 9): 0.7(186.00) + 0.3(185.78) = 185.94
Zhat(10): 0.7(184.00) + 0.3(185.94) = 184.58
Zhat(11): 0.7(185.00) + 0.3(184.58) = 184.87
Zhat(12): 0.7(198.00) + 0.3(184.87) = 194.06
Zhat(13): 0.7(199.00) + 0.3(194.06) = 197.52
Zhat(14): 0.7(200.00) + 0.3(197.52) = 199.26
Zhat(15): 0.7(201.00) + 0.3(199.26) = 200.48
Zhat(16): 0.7(199.00) + 0.3(200.48) = 199.44
Zhat(17): 0.7(187.00) + 0.3(199.44) = 190.73
Zhat(18): 0.7(186.00) + 0.3(190.73) = 187.42
Zhat(19): 0.7(191.00) + 0.3(187.42) = 189.93
Zhat(20): 0.7(195.00) + 0.3(189.93) = 193.48
Zhat(21): 0.7(200.00) + 0.3(193.48) = 198.04
Zhat(22): 0.7(200.00) + 0.3(198.04) = 199.41
Zhat(23): 0.7(190.00) + 0.3(199.41) = 192.82
Zhat(24): 0.7(186.00) + 0.3(192.82) = 188.05
Zhat(25): 0.7(196.00) + 0.3(188.05) = 193.61
Zhat(26): 0.7(198.00) + 0.3(193.61) = 196.68
Zhat(27): 0.7(200.00) + 0.3(196.68) = 199.01
---------------FORECAST-------------
Zhat(28): 0.7(200.00) + 0.3(199.01) = 199.70


Exponential Smoothing
MAE      MAPE     MSE
4.8241   2.52%    34.8155

w = 0.7
  t     Zt    Forecast    |Error|    %Error    Error^2
  1    195    195
  2    193    195
  3    190    193.6       3.6        1.89%      12.96
  4    185    191.08      6.08       3.29%      36.9664
  5    180    186.824     6.824      3.79%      46.567
  6    190    182.047     7.9528     4.19%      63.247
  7    185    187.614     2.61416    1.41%       6.83383
  8    186    185.784     0.21575    0.12%       0.04655
  9    184    185.935     1.93527    1.05%       3.74529
 10    185    184.581     0.41942    0.23%       0.17591
 11    198    184.874     13.1258    6.63%     172.287
 12    199    194.062     4.93775    2.48%      24.3814
 13    200    197.519     2.48132    1.24%       6.15697
 14    201    199.256     1.7444     0.87%       3.04292
 15    199    200.477     1.47668    0.74%       2.18059
 16    187    199.443     12.443     6.65%     154.828
 17    186    190.733     4.7329     2.54%      22.4004
 18    191    187.42      3.58013    1.87%      12.8173
 19    195    189.926     5.07404    2.60%      25.7459
 20    200    193.478     6.52221    3.26%      42.5392
 21    200    198.043     1.95666    0.98%       3.82853
 22    190    199.413     9.413      4.95%      88.6046
 23    186    192.824     6.8239     3.67%      46.5656
 24    196    188.047     7.95283    4.06%      63.2475
 25    198    193.614     4.38585    2.22%      19.2357
 26    200    196.684     3.31575    1.66%      10.9942
 27    200    199.005     0.99473    0.50%       0.98948
 28           199.702
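The forecast column and the MAE can be reproduced directly. This is a sketch of the recursion; note that the template reports errors only from period 3 onward, which the script mirrors.

```python
# w = 0.7 smoothing of the 12-18 series, with MAE over the errors the
# template reports (periods 3 through 27).
w = 0.7
z = [195, 193, 190, 185, 180, 190, 185, 186, 184, 185, 198, 199, 200,
     201, 199, 187, 186, 191, 195, 200, 200, 190, 186, 196, 198, 200, 200]

forecasts = [z[0]]                         # F(1) = Z(1) = 195
for obs in z:
    forecasts.append(w * obs + (1 - w) * forecasts[-1])
# forecasts[-1] is F(28), the next-quarter forecast

errors = [abs(z[t] - forecasts[t]) for t in range(2, 27)]  # periods 3..27
mae = sum(errors) / len(errors)

print(round(forecasts[-1], 2), round(mae, 4))
```

This reproduces the next-quarter forecast of 199.70 and an MAE of about 4.8241, matching the template summary.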


12-19. Assuming a weight of 0.9
(Using the template: Exponential Smoothing.xls)

Exponential Smoothing
w = 0.9
  t       Zt         Forecast
  1    2565942       2565942
  2    2724292       2565942
  3    3235231       2708457
  4    3863508       3182554
  5    4819747       3795413
  6    5371689       4717314
  7    6119114       5306251
  8                  6037828

Forecast for 2007 = 6037828
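The same recursion, sketched in Python for the seven annual observations:

```python
# w = 0.9 smoothing of the annual series in 12-19.
w = 0.9
data = [2565942, 2724292, 3235231, 3863508, 4819747, 5371689, 6119114]

f = data[0]                    # F(1) = Z(1)
for z in data:
    f = w * z + (1 - w) * f    # F(t+1) = w*Z(t) + (1 - w)*F(t)

print(round(f))  # forecast for 2007
```

The result rounds to 6037828, the forecast for 2007 shown above.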


12-20. Answers will vary.
12-21. Equation (12-11):

	Zhat(t+1) = wZ(t) + w(1-w)Z(t-1) + w(1-w)^2 Z(t-2) + w(1-w)^3 Z(t-3) + ...

The same equation for Zhat(t) (shifting all subscripts back by 1):

	Zhat(t) = wZ(t-1) + w(1-w)Z(t-2) + w(1-w)^2 Z(t-3) + w(1-w)^3 Z(t-4) + ...

Now multiplying this second equation throughout by (1-w) gives:

	(1-w)Zhat(t) = w(1-w)Z(t-1) + w(1-w)^2 Z(t-2) + w(1-w)^3 Z(t-3) + w(1-w)^4 Z(t-4) + ...

Now note that all the terms on the right side of the equation above are identical to all the terms in
Equation (12-11) after the term wZ(t). Hence we can substitute in Equation (12-11) the
left-hand side of our last equation, (1-w)Zhat(t), for all the terms past the first. This gives us:

	Zhat(t+1) = wZ(t) + (1-w)Zhat(t)

which is Equation (12-12).
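The substitution in 12-21 can also be checked numerically: on a finite series, the recursive form (12-12) and the expanded weighted-sum form (12-11), with the leftover tail of weights collapsing onto the first observation, give identical forecasts. A sketch:

```python
# Check that the recursive form (12-12) matches the expanded weighted-sum
# form (12-11) on a finite series.
def recursive(series, w):
    f = series[0]                       # F(1) = Z(1)
    for z in series:
        f = w * z + (1 - w) * f         # Equation (12-12)
    return f

def weighted_sum(series, w):
    t = len(series)
    # w*Z(t) + w(1-w)Z(t-1) + ... + w(1-w)^(t-2)Z(2), Equation (12-11),
    # plus the remaining weight (1-w)^(t-1) on the seed Z(1)
    total = sum(w * (1 - w) ** k * series[t - 1 - k] for k in range(t - 1))
    return total + (1 - w) ** (t - 1) * series[0]

data = [57, 58, 60, 54, 56, 53, 55, 59, 62, 57]
for w in (0.3, 0.7):
    assert abs(recursive(data, w) - weighted_sum(data, w)) < 1e-9
print("forms agree")
```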


12-22. Equation (12-13) is:

	Zhat(t+1) = Z(t) + (1-w)(Zhat(t) - Z(t))

Multiplying out we get:

	Zhat(t+1) = Z(t) + (1-w)Zhat(t) - (1-w)Z(t) = Z(t) - (1-w)Z(t) + (1-w)Zhat(t)
	          = wZ(t) + (1-w)Zhat(t),

which is Equation (12-12).


12-23. Simply divide each CPI by 289.1/100 = 2.891; thus:

year    old CPI    new CPI
1950     72.1       24.9
1951     77.8       26.9
1952     79.5       27.5
1953     80.1       27.7
 ...      ...        ...
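The rebasing can be sketched in a line of Python, using the divisor 2.891 from the solution:

```python
# Rebase the CPI series of 12-23: divide each old value by 289.1/100 = 2.891.
old_cpi = {1950: 72.1, 1951: 77.8, 1952: 79.5, 1953: 80.1}
new_cpi = {year: round(v / 2.891, 1) for year, v in old_cpi.items()}
print(new_cpi)
```

This reproduces the new-base values 24.9, 26.9, 27.5, and 27.7 in the table above.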

12-24. 168.77 in July 2000 and 173.48 in June 2001.


12-25. A simple price index reflects changes in a single price variable over time, relative to a single
base time period.
12-26. Index numbers are used as deflators for comparing values and prices over time in a way that
prevents a given inflationary factor from affecting comparisons. They are also used to provide an
aggregate measure of changes over time in several related variables.
12-27. a. 1988, the year in which the index equals 100 (1993 index / 1988 index = 163/100).
b. Just divide each index number by 1.63 (i.e., by 163/100).
c. It fell, from 145% of the 1988 output down to 133% of that output.
d. Big increase in the mid-80s, then a sharp drop in 1986, tumbling for three more years,
then slowly climbing back up until 1995, then a drop-off.

a)
Price Index, Base Year 1988 (index for base year = 100)
Year    Price    Index
1984     175      175
1985     190      190
1986     132      132
1987      96       96
1988     100      100
1989      78       78
1990     131      131
1991     135      135
1992     154      154
1993     163      163
1994     178      178
1995     170      170
1996     145      145
1997     133      133


c)
Price Index, Base Year 1993 (price for base year = 163)
Year    Price    Index
1984     175     107.36
1985     190     116.56
1986     132      80.982
1987      96      58.896
1988     100      61.35
1989      78      47.853
1990     131      80.368
1991     135      82.822
1992     154      94.479
1993     163     100
1994     178     109.2
1995     170     104.29
1996     145      88.957
1997     133      81.595

12-28. Divide each data point by (Jan. 2004 value)/100 = 1.44; thus:

Jun. 03: 98.6      Jul. 03: 95.14

12-29. Since a yearly cycle has 12 months and there are only 18 data points, a seasonal/cyclical
decomposition isn't feasible. Simple linear regression, with the successive months numbered
1, 2, ..., gives SALES = 4.23987 - 0.03870 MONTH; thus for July 2004 (month #19), the forecast is
3.5046.
(Using the template: Trend Forecast.xls)


Forecasting with Trend

Data
Period    t     Zt
jan       1    4.4
feb       2    4.2
mar       3    3.8
apr       4    4.1
may       5    4.1
jun       6    4
jul       7    4
aug       8    3.9
sep       9    3.9
oct      10    3.8
nov      11    3.7
dec      12    3.7
jan      13    3.8
feb      14    3.9
mar      15    3.8
apr      16    3.7
may      17    3.5
jun      18    3.4

Forecast
 t     Z-hat
19     3.50458
20     3.46588
21     3.42718

The forecast of sales for July, 2004 is 3.5 million units.

Regression Statistics
r^2          0.7285
MSE          0.016906
Slope       -0.0387
Intercept    4.239869
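A direct least-squares fit reproduces the template's coefficients. This is a sketch of the computation, not the template itself:

```python
# Least-squares trend line for the 18 monthly sales figures in 12-29.
sales = [4.4, 4.2, 3.8, 4.1, 4.1, 4.0, 4.0, 3.9, 3.9,
         3.8, 3.7, 3.7, 3.8, 3.9, 3.8, 3.7, 3.5, 3.4]
n = len(sales)
t = list(range(1, n + 1))
t_bar = sum(t) / n
z_bar = sum(sales) / n

# slope = Sxy / Sxx; intercept = z_bar - slope * t_bar
slope = (sum((ti - t_bar) * (zi - z_bar) for ti, zi in zip(t, sales))
         / sum((ti - t_bar) ** 2 for ti in t))
intercept = z_bar - slope * t_bar

forecast_19 = intercept + slope * 19   # month #19 = July 2004
print(round(slope, 4), round(intercept, 4), round(forecast_19, 4))
```

The slope and intercept round to -0.0387 and 4.2399, and the month-19 forecast to about 3.5046, matching the regression statistics above.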

12-30. Trend analysis is a quick, if sometimes inaccurate, method that can give good results. The
additive and multiplicative TSCI models are sometimes useful, although they lack a firm
theoretical framework. Exponential smoothing methods are good models. The ones described in
this book do not handle seasonality, but extensions are possible. This author believes that
Box-Jenkins ARIMA models are the way to go. One limitation of these models is the need for
large data sets.

12-31. Exponential smoothing models smooth out sharp variations in the data and produce forecasts
that follow a type of average movement in the data. The greater the weighting factor w, the more
closely the exponential smoothing series follows the data, and the forecasts tend to track the
variations in the data more closely.


12-32. Using MINITAB: Stat > Time Series > Moving Average

Moving Average for Data
Moving Average Length

Accuracy Measures
MAPE    1.69534
MAD     1.75000
MSD     3.66964

Forecasts
Period    Forecast    Lower      Upper
13        103.375     99.6204    107.130

Forecast for next period = 103.375


12-33. Assuming a weight of 0.4
Use the template: Exponential Smothing.xls
w
t
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17

0.4
Zt
18
17
15
14
15
11
8
5
4
3
5
4
6
5
7
8

Forecast
18
18
17.6
16.56
15.536
15.3216
13.593
11.3558
8.81347
6.88808
5.33285
5.19971
4.71983
5.2319
5.13914
5.88348
6.73009

y(2007) = 6.73009


12-34. a) Raised the seasonal index for April to 99.38 from 99.29. We would expect to see the April
index change by a significant amount; the reason it does not is the averaging involved in the
moving-average calculations.
b) Raised the seasonal index for April to 122.27 from 99.29.
c) Raised the seasonal index for December to 100.16 from 100.09. We would expect the
December index to change by a significant amount; it does not, again because of the
moving-average calculations.
d) Very high or low values for data points at the beginning or end of a series have little impact
on the seasonal index due to their limited influence in the moving-average computations.
12-35. (Using the template: Trend Forecast.xls)

Forecasting with Trend
Data
Period    t     Zt
1998      1    6.3
1999      2    6.6
2000      3    7.3
2001      4    7.4
2002      5    7.8
2003      6    6.9
2004      7    7.8

Forecast
 t    Z-hat
 8    7.95714
 9    8.15714
10    8.35714

Forecast for 2005 = 7.957

Regression Statistics
r^2          0.5552
MSE          0.179429
Slope        0.2
Intercept    6.357143
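The same least-squares fit, sketched for the seven annual observations:

```python
# Least-squares trend line for the annual series in 12-35.
z = [6.3, 6.6, 7.3, 7.4, 7.8, 6.9, 7.8]
n = len(z)
t_bar = (n + 1) / 2                 # mean of t = 1..7
z_bar = sum(z) / n

sxy = sum((i + 1 - t_bar) * (v - z_bar) for i, v in enumerate(z))
sxx = sum((i + 1 - t_bar) ** 2 for i in range(n))
slope = sxy / sxx
intercept = z_bar - slope * t_bar

forecast_2005 = intercept + slope * 8   # t = 8 corresponds to 2005
print(round(slope, 3), round(forecast_2005, 3))
```

The slope comes out exactly 0.2 and the 2005 forecast rounds to 7.957, as in the template output.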

12-36. Answers will vary.


Case 16: Auto Parts Sales Forecast

1)
Forecasts
 t    Year    Q    Y
17    2002    1    $85,455,550.30
18    2002    2    $108,706,616.14
19    2002    3    $97,706,824.92
20    2002    4    $105,724,455.54


Using Excel's regression tool with the Centered Moving Average (col. G of the template) as our
Y and the values under t (col. B of the template) as our X, we get the following supporting detail
for the Trend + Seasonal model:

Regression Statistics
Multiple R           0.89727
R Square             0.805093
Adjusted R Square    0.785602
Standard Error       1.558112
Observations         12

             Coefficients    Standard Error    t Stat      P-value
Intercept    152.2638        1.195366          127.3785    2.18E-17
time         -0.83741        0.130296          -6.42701    7.57E-05

(Note: the coefficient values are identical to those generated by the template.)

ANOVA
              df    SS          MS          F           Significance F
Regression     1    100.2802    100.2802    41.30642    7.57E-05
Residual      10    24.27713    2.427713
Total         11    124.5573

2) Multiple Regression Equation:

Y = -2693200091 - 8445234.547 M2 + 82447357.24 NF - 3768891 Oil Price

Multiple Regression Results
             0               1               2                          3
             Intercept       M2 Index        Non Farm Activity Index    Oil Price
b            -2693200091     -8445234.547    82447357.24                -3768891
s(b)         1096606287      101021547.4     38350031.1                 1263314.066
t            -2.455940771    -0.083598349    2.149864156                -2.983336528
p-value      0.0303          0.9348          0.0527                     0.0114

ANOVA Table
Source    SS             df    MS             F            FCritical    p-value
Regn.     3.77493E+15     3    1.25831E+15    18.243631    3.4902996    0.0001
Error     8.2767E+14     12    6.89725E+13
Total     4.6026E+15     15    3.0684E+14

R^2 = 0.8202    Adjusted R^2 = 0.77521656    s = 8304970.102

3) Forecasted values using the regression model:

Quarter    Forecast
2002/Q1    $81,337,085.11
2002/Q2    $55,574,874.53
2002/Q3    $60,903,732.58
2002/Q4    $59,868,829.41
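Applying the part-2 equation is a matter of plugging in predictor values. A sketch, with made-up inputs (the m2, nf, and oil values below are hypothetical, not taken from the case data):

```python
# Apply the fitted part-2 regression equation to one set of predictor values.
# The inputs m2=2.5, nf=34.7, oil=17.0 are hypothetical illustration values.
def predict_sales(m2, nf, oil):
    return (-2693200091
            - 8445234.547 * m2       # M2 Index
            + 82447357.24 * nf       # Non Farm Activity Index
            - 3768891 * oil)         # Oil Price

y = predict_sales(m2=2.5, nf=34.7, oil=17.0)
print(round(y, 2))
```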

4. Add the new data:

   Y        X1       X2            X3             X4      X5   X6   X7
Sales      Ones   M2 Index   Non Farm Act. Idx  Oil Price  Q2   Q3   Q4
35452300     1    2.356464        34.2            19.15     0    0    0
41469361     1    2.357643        34.27           16.46     1    0    0
40981634     1    2.364126        34.3            18.83     0    1    0
42777164     1    2.379493        34.33           19.75     0    0    1
43491652     1    2.373544        34.4            18.53     0    0    0
57669446     1    2.387192        34.33           17.61     1    0    0
59476149     1    2.403903        34.37           17.95     0    1    0
76908559     1    2.42073         34.43           15.84     0    0    1
63103070     1    2.431623        34.37           14.28     0    0    0
84457560     1    2.441958        34.5            13.02     1    0    0
67990330     1    2.447452        34.5            15.89     0    1    0
68542620     1    2.445616        34.53           16.91     0    0    1
73457391     1    2.45601         34.6            16.29     0    0    0
89124339     1    2.48364         34.7            17        1    0    0
85891854     1    2.532692        34.67           18.2      0    1    0
69574971     1    2.564984        34.73           17        0    0    1

Multiple Regression Results
           0               1               2                  3               4            5            6
           Intercept       M2 Index        Non Farm           Oil Price       Q2           Q3           Q4
                                           Activity Index
b          -2655354679     -12780153.29    81566233.8         -3827527.175    5802059      7127252.8    3211850.1
s(b)       1219227600      118142020       43101535.65        1534592.501     6575281.3    6653402.9    6716387.2
t          -2.177899088    -0.108176187    1.892420596        -2.494165177    0.8824047    1.0712192    0.478211
p-value    0.0574          0.9162          0.0910             0.0342          0.4005       0.3120       0.6439
VIF                        9.9367          9.0506             1.4058          1.6411       1.6803       1.7123

ANOVA Table
Source    SS             df    MS             F            FCritical    p-value
Regn.     3.89129E+15     6    6.48548E+14    8.2058616    3.3737564    0.0031
Error     7.11312E+14     9    7.90347E+13
Total     4.6026E+15     15    3.0684E+14

R^2 = 0.8455    Adjusted R^2 = 0.742423692    s = 8890145.586

Regression Equation:
Sales = -2655354679 - 12780153.29 M2 + 81566233.8 NFAI - 3827527.175 Oil P + 5802059 Q2
+ 7127252.8 Q3 + 3211850.1 Q4
5. Forecast for next four quarters:

Quarter    Sales
02 Q1      76344324
02 Q2      56495768
02 Q3      62878143
02 Q4      57771809

6. Partial F-test:
H0: β4 = β5 = β6 = 0
H1: not all are zero
(Remember, to drop the three indicator variables, they must be the last three independent
variables in the data sheet of the template.)

Partial F Calculations
# Independent variables in full model            6
# Independent variables dropped from the model   3
SSEF         7.11E+14
SSER         8.28E+14
Partial F    0.490747
p-value      0.6973

The p-value = 0.6973 is very high. Do not reject the null hypothesis: the indicator variables are
not significant.
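The partial F statistic in the table can be checked by hand. A sketch using the SSE values from the two ANOVA tables, with n = 16 observations and k = 6 predictors in the full model:

```python
# Partial F statistic for dropping the r = 3 indicator variables:
# F = ((SSER - SSEF) / r) / (SSEF / (n - k - 1))
sse_full = 7.11312e14      # SSE of the full model (part 4)
sse_reduced = 8.2767e14    # SSE of the reduced model (part 2)
r = 3                      # variables dropped
df_error_full = 9          # n - k - 1 = 16 - 6 - 1

partial_f = ((sse_reduced - sse_full) / r) / (sse_full / df_error_full)
print(round(partial_f, 4))
```

The statistic comes out about 0.4907, matching the template's Partial F of 0.490747.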
7. Comparing the three model forecasts:
It would be ideal to have the values for 2004 to compare the forecasts to the actual values.
However, these values are not available. The next step is to compare the three models on
R^2, F, and the standard error of the model.

Model               R^2      F         Std. error
Trend + Seasonal    0.805    41.306    1.558
MR (part 2)         0.820    18.244    8,304,970.1
MR (part 4)         0.846    8.206     8,890,145.6

Clearly, the best model is the Trend + Seasonal model, with the smallest standard error and the
highest F-value. The only significant independent variable in the multiple regression models is
oil price, and a regression of sales on oil price alone yields an R^2 of 0.33 and a very high
standard error.
