
Instructors Solutions Manual - Chapter 9

Chapter 9 Solutions

Develop Your Skills 9.1
1. First, calculate differences.


Average Daily Production Before and After Music

Worker   Before Music   After Music   Difference (Before - After)
1        18             18              0
2        14             15             -1
3        10             12             -2
4        11             15             -4
5         9              7              2
6        10             11             -1
7         9              6              3
8        11             14             -3
9        10             11             -1
10       12             12              0


Differences appear normally distributed, although as usual, this can be hard to
determine with small sample sizes.


[Histogram: Production Before and After Music Is Played in the Plant. Horizontal axis:
(Average Daily Production Before Music) - (Average Daily Production After Music).
Vertical axis: Number of Workers.]



Copyright 2011 Pearson Canada Inc. 194
$H_0: \mu_D = 0$
$H_1: \mu_D < 0$
(The order of subtraction is (production before the music is played - production after
the music is played). If music increases productivity, the production before the music
should be lower than after the music is played, so the average difference would be
negative.)
$\alpha = 0.04$

Calculations can be done by hand, with Excel functions and the template, or with the
Data Analysis tool.

A completed template is shown below (mean, standard deviation, and sample size
were computed with Excel functions).


Making Decisions About the Population Mean with a Single Sample
Do the sample data appear to be normally distributed?   yes
Sample Standard Deviation s                             2.11082
Sample Mean                                             -0.7
Sample Size n                                           10
Hypothetical Value of Population Mean                   0
t-Score                                                 -1.0487
One-Tailed p-Value                                      0.16083
Two-Tailed p-Value                                      0.32166



This is a one-tailed test, so the p-value is 0.16. Since this is > 4%, we fail to reject
$H_0$. There is insufficient evidence to suggest that playing classical music led to
increased worker productivity.
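
The by-hand version of this calculation can be sketched in a few lines of Python. This is
only an illustration using the standard library, not the Excel template itself; the
variable names are our own, and the p-value still has to come from a t-table or from
software.

```python
import math
import statistics

# Average daily production for the 10 workers, from the table of differences.
before = [18, 14, 10, 11, 9, 10, 9, 11, 10, 12]
after = [18, 15, 12, 15, 7, 11, 6, 14, 11, 12]

# Order of subtraction: before - after.
differences = [b - a for b, a in zip(before, after)]

n = len(differences)
mean_d = statistics.mean(differences)   # sample mean of the differences
sd_d = statistics.stdev(differences)    # sample standard deviation (n - 1 divisor)

# t-score for H0: mu_D = 0, with n - 1 degrees of freedom.
t_score = (mean_d - 0) / (sd_d / math.sqrt(n))

print(f"mean = {mean_d:.1f}, s = {sd_d:.5f}, n = {n}, t = {t_score:.4f}")
```

The negative t-score reflects the order of subtraction: a negative mean difference
points in the direction of the alternative hypothesis, but not strongly enough here.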

2. First calculate the differences.


Weekly Sales Before and After Product Redesign

Store                  Sales After   Sales Before   Difference (After - Before)
51 Bayfield            $842.42       $813.67          $28.75
109 Mapleview Drive    $831.54       $698.71         $132.83
137 Wellington         $822.86       $734.48          $88.38
6 Collier              $876.97       $832.46          $44.51
421 Essa Road          $776.44       $791.22         -$14.78
19 Queen               $793.19       $766.73          $26.46
345 Cundles            $730.17       $668.66          $61.51
D-564 Byrne Drive      $576.95       $631.05         -$54.10
24 Archer              $758.87       $724.39          $34.48
15 Short St.           $736.04       $766.76         -$30.72


The differences appear to be normally distributed, as seen in the histogram below.


[Histogram: Sales for Gourmet Cookies, Before and After Packaging Redesign. Horizontal
axis: (Sales After Packaging Redesign) - (Sales Before Packaging Redesign). Vertical
axis: Number of Stores.]



$H_0: \mu_D = 0$
$H_1: \mu_D > 0$
(The order of subtraction is (sales after packaging redesign - sales before packaging
redesign). If the package redesign leads to increased sales, sales should be higher
after the redesign, so the differences would tend to be positive.)
$\alpha = 0.05$

Calculations can be done by hand, with Excel functions and the template, or with the
Data Analysis tool. Output from the Data Analysis t-test: paired two-sample for
means is shown below.


t-Test: Paired Two Sample for Means
                                Sales After    Sales Before
                                Packaging      Packaging
                                Redesign       Redesign
Mean                            774.545        742.813
Variance                        7085.906428    4098.8416
Observations                    10             10
Pearson Correlation             0.749516219
Hypothesized Mean Difference    0
df                              9
t Stat                          1.800489248
P(T<=t) one-tail                0.052654545
t Critical one-tail             1.833113856
P(T<=t) two-tail                0.10530909
t Critical two-tail             2.262158887



This is a one-tailed test, so the appropriate p-value is 0.053. This is > 5%, so we fail
to reject $H_0$. There is insufficient evidence to suggest that sales of the cookies
increased after the packaging was redesigned.

The completed Excel template for the confidence interval is shown below.


Confidence Interval Estimate for the Population Mean
Do the sample data appear to be normally distributed?   yes
Confidence Level (decimal form)                         0.9
Sample Mean                                             $31.73
Sample Standard Deviation s                             55.7323
Sample Size n                                           10
Upper Confidence Limit                                  64.039
Lower Confidence Limit                                  -0.575


A 90% confidence interval estimate for the mean difference in weekly sales (sales after
the redesign - sales before the redesign) is (-$0.58, $64.04).
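
The interval limits can be checked by hand. The sketch below is an illustration only: it
assumes the mean difference is 774.545 - 742.813 = 31.732 (from the Data Analysis
output), takes the standard deviation from the template, and borrows the critical
t-score from the "t Critical one-tail" line of the same output, which leaves 0.05 in
each tail for 9 degrees of freedom.

```python
import math

# Summary statistics for the differences (sales after - sales before).
mean_d = 31.732      # 774.545 - 742.813, from the Data Analysis output
sd_d = 55.7323       # sample standard deviation of the differences
n = 10

# Critical t-score for 90% confidence with 9 degrees of freedom
# (the "t Critical one-tail" value reported at the 5% level).
t_critical = 1.833113856

margin = t_critical * sd_d / math.sqrt(n)
lower, upper = mean_d - margin, mean_d + margin

print(f"90% CI: ({lower:.3f}, {upper:.3f})")
```

The lower limit is slightly below zero, which agrees with the one-tailed test narrowly
failing to reject the null hypothesis at the 5% level.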
3. We are not given the data set, but some summary data. We cannot check for
normality of differences. We proceed by assuming the differences are normally
distributed, and noting that our conclusions may not be valid if this is not the case.

$H_0: \mu_D = 0$
$H_1: \mu_D \neq 0$
(The order of subtraction is before - after. The alternative hypothesis concerns a
difference in daily sales before and after the script change, either positive or
negative.)
$\alpha = 0.05$

We are given:
$\bar{x}_D = 4.2$
$s_D = 23.4$
$n_D = 56$

$$t = \frac{\bar{x}_D - \mu_D}{s_D / \sqrt{n_D}} = \frac{4.2 - 0}{23.4 / \sqrt{56}} = 1.343$$

We refer to the t-distribution with 55 degrees of freedom. There is no such row in
the t-table, but whether we choose the row with 50 or 60 degrees of freedom, the
calculated t-score of 1.343 is between $t_{.100}$ and $t_{.050}$. This is a two-tailed
test, so

$2 \times 0.050 < \text{p-value} < 2 \times 0.100$

$0.1 < \text{p-value} < 0.2$

Since the p-value > 5%, we fail to reject $H_0$. There is insufficient evidence to
suggest there is a difference in daily sales by the telemarketers before and after the
script change.
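
When only summary statistics are available, the t-score is a one-line calculation. The
helper below is a hypothetical illustration (the function name and parameters are our
own), using only the Python standard library.

```python
import math

def paired_t_score(mean_d, sd_d, n, hypothesized=0.0):
    """t-score for matched pairs, from summary statistics of the differences."""
    return (mean_d - hypothesized) / (sd_d / math.sqrt(n))

# Summary data given in the exercise: mean 4.2, standard deviation 23.4, n = 56.
t_score = paired_t_score(mean_d=4.2, sd_d=23.4, n=56)
print(f"t = {t_score:.3f} with {56 - 1} degrees of freedom")
```

The p-value bounds still come from the t-table (or software), as described above.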

4. This is a large data set, so we will use Excel.
Differences appear normally-distributed. See the histogram below.


[Histogram: Number of Hours Studied Over a Four-Week Period. Horizontal axis: (Hours
Studied by Male Students) - (Hours Studied by Female Students). Vertical axis:
Frequency.]



$H_0: \mu_D = 0$
$H_1: \mu_D < 0$
(The order of subtraction is hours studied by male students - hours studied by female
students. If female students study more, these differences will tend to be negative.)
$\alpha = 0.02$

Output from the Data Analysis t-test: paired two-sample for means is shown below.


t-Test: Paired Two Sample for Means
                                Males       Females
Mean                            112.76      129.02
Variance                        2307.497    1774.323
Observations                    100         100
Pearson Correlation             0.02797
Hypothesized Mean Difference    0
df                              99
t Stat                          -2.51047
P(T<=t) one-tail                0.006839
t Critical one-tail             1.660392
P(T<=t) two-tail                0.013677
t Critical two-tail             1.984217



This is a one-tailed test, so the p-value is 0.007. This is less than $\alpha = 0.02$,
so we reject $H_0$. There is sufficient evidence to suggest that female students study
more than male students. However, results must be interpreted with caution, as there
are many factors that affect how much a student studies.
The completed Excel template for the 96% confidence interval estimate is shown
below.


Confidence Interval Estimate for the Population Mean
Do the sample data appear to be normally distributed?   yes
Confidence Level (decimal form)                         0.96
Sample Mean                                             -16.26
Sample Standard Deviation s                             64.7688
Sample Size n                                           100
Upper Confidence Limit                                  -2.7806
Lower Confidence Limit                                  -29.739



The 96% confidence interval estimate of the difference between the average number of
hours studied by male students and by their female counterparts (male - female) over a
four-week period is (-29.7, -2.8).


5. We are told the histogram of differences appears to be normally distributed. We are
given summary data.
$H_0: \mu_D = 0$
$H_1: \mu_D > 0$
(The order of subtraction is fuel consumption without checking tires - fuel
consumption checking tires. If checking the tires improves fuel consumption (that is,
reduces it), then these differences would tend to be positive.)
$\alpha = 0.04$

We could do this question by hand or with the Excel template.

Making Decisions About the Population Mean with a Single Sample
Do the sample data appear to be normally distributed?   yes
Sample Standard Deviation s                             1.4
Sample Mean                                             0.4
Sample Size n                                           20
Hypothetical Value of Population Mean                   0
t-Score                                                 1.27775
One-Tailed p-Value                                      0.10836
Two-Tailed p-Value                                      0.21673



This is a one-tailed test, so the p-value is 0.11. This is > 0.04, so we fail to reject
$H_0$. There is insufficient evidence to support the association's claim that checking
tire pressure regularly improves fuel consumption. However, there may be other
explanatory factors at play. Although fuel consumption was recorded during two
summer months, driving behavior and weather could have been quite different in the
two months, and so we cannot consider this test to be definitive.

Develop Your Skills 9.2
6. This is a large data set, so we will use Excel.
First check the histogram of differences.
The differences appear approximately normally distributed.
[Note that you should not be fooled into thinking that a WSRST will be required for
these questions, just because they came right after the discussion of the WSRST in
the text. Which test to use depends on the conditions.]


[Histogram: Weekly Worker Errors Before and After a Training Program. Horizontal axis:
(Weekly Worker Errors Before Training) - (Weekly Worker Errors After Training).
Vertical axis: Number of Workers.]



We can use the t-test for matched pairs.
$H_0: \mu_D = 0$
$H_1: \mu_D > 0$
(The order of subtraction is errors before training - errors after training. If the
training reduced the number of errors, these differences would tend to be positive.)
$\alpha = 0.04$

t-Test: Paired Two Sample for Means
                                Weekly           Weekly
                                Errors Before    Errors After
                                Training         Training
Mean                            16.64            16.04
Variance                        7.081212121      3.23070707
Observations                    100              100
Pearson Correlation             0.140311038
Hypothesized Mean Difference    0
df                              99
t Stat                          2.00337553
P(T<=t) one-tail                0.023935075
t Critical one-tail             1.660391157
P(T<=t) two-tail                0.047870149
t Critical two-tail             1.9842169



This is a one-tailed test, so the p-value is 0.024, which is less than $\alpha = 0.04$.
We reject $H_0$. There is enough evidence to suggest that weekly worker errors declined
after the training.

7. We have seen a similar problem in Develop Your Skills 9.1 Exercise 2, but the data
set has changed. Now, the differences are non-normal.


[Histogram: Sales for Gourmet Cookies, Before and After Packaging Redesign. Horizontal
axis: (Sales After Packaging Redesign) - (Sales Before Packaging Redesign). Vertical
axis: Number of Stores.]



The sample size is small. The histogram is not perfectly symmetric, but it does show
a somewhat symmetric U-shape, so we will proceed with the WSRST.

$H_0$: populations of weekly sales of gourmet cookies before and after the packaging
redesign are the same
$H_1$: population of weekly sales of gourmet cookies after the packaging redesign is
to the right of the population of weekly sales before the packaging redesign
(that is, weekly sales of gourmet cookies are generally greater after the
packaging redesign, compared to before the redesign)

$\alpha = 0.05$

Now we must rank the differences (their absolute values) and compute $W^+$ and $W^-$.

The table below summarizes.


              Absolute Value   Ordered Absolute   Ranks To Be   Rank For Positive   Rank For Negative
Differences   Of Differences   Differences        Assigned      Differences         Differences
 128.33       128.33            26.46              1             1
-132.83       132.83            61.51              2                                 2
 -88.38        88.38            88.38              3                                 3
 144.51       144.51           114.78              4                                 4
-114.78       114.78           128.33              5             5
  26.46        26.46           130.72              6                                 6
 -61.51        61.51           132.83              7                                 7
 154.1        154.1            134.48              8             8
 134.48       134.48           144.51              9             9
-130.72       130.72           154.1              10            10
sums                                              55            W+ = 33             W- = 22

The order of subtraction is (sales after the packaging redesign - sales before the
packaging redesign). If the packaging redesign increased sales, these differences
would tend to be positive. Many positive differences would lead to a high rank sum
for the positive differences. So, p-value = $P(W^+ \geq 33)$.

Since the sample size is small, we turn to the WSRST table, for $n_W = 10$. We see
$P(W^+ \geq 44) = 0.053$, so we know $P(W^+ \geq 33) > 0.053$. We fail to reject $H_0$.
There is insufficient evidence to suggest that sales of the gourmet cookies increased
after the packaging was redesigned.
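
The ranking procedure can also be sketched in Python. This is a minimal illustration,
not the Excel add-in: the function name is our own, zero differences are dropped, and
tied absolute values are given the average of the ranks they occupy.

```python
def signed_rank_sums(differences):
    """Return (W_plus, W_minus) for a list of paired differences.

    Zero differences are dropped, and tied absolute values share the
    average of the ranks they occupy.
    """
    nonzero = [d for d in differences if d != 0]
    ordered = sorted(nonzero, key=abs)

    w_plus = 0.0
    w_minus = 0.0
    i = 0
    while i < len(ordered):
        # Find the run of tied absolute values starting at position i.
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        # Positions i..j-1 hold ranks i+1..j; ties get the average rank.
        average_rank = (i + 1 + j) / 2
        for d in ordered[i:j]:
            if d > 0:
                w_plus += average_rank
            else:
                w_minus += average_rank
        i = j
    return w_plus, w_minus


# Differences (sales after - sales before) for the ten stores.
differences = [128.33, -132.83, -88.38, 144.51, -114.78,
               26.46, -61.51, 154.1, 134.48, -130.72]
w_plus, w_minus = signed_rank_sums(differences)
print(w_plus, w_minus)
```

The two rank sums always add up to n(n + 1)/2, which is 55 for ten non-zero
differences; this is a quick check on the ranking.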

8. We have seen a similar problem in Develop Your Skills 9.1, Exercise 3, but the data
set has changed. Now, the differences are non-normal. This is such a small data set
that a histogram is not all that useful. We must be cautious making any conclusions
about the locations of the before and after populations.


[Histogram: Daily Sales by Telemarketers. Horizontal axis: (Number of Daily Sales Before
Script Changed) - (Number of Daily Sales After Script Changed). Vertical axis:
Frequency.]



$H_0$: populations of sales of telemarketers before and after the script is changed are
the same
$H_1$: population of sales of telemarketers before the script is changed is either to the
right or to the left of the population of sales of telemarketers after the
script is changed
$\alpha = 0.05$


              Absolute Value   Ordered Absolute   Ranks To Be   Rank For Positive   Rank For Negative
Differences   Of Differences   Differences        Assigned      Differences         Differences
-11           11                1                 1                                 1.5
-10           10                1                 2             1.5
-10           10                2                 3             3.5
 -9            9                2                 4             3.5
 -1            1                7                 5             5
  1            1                9                 6                                 6
  2            2               10                 7                                 7.5
  7            7               10                 8                                 7.5
  2            2               11                 9                                 9
sums                                             45            13.5                31.5


Now we turn to the table for $n_W = 9$. The larger rank sum is $W^- = 31.5$, and we see
that $P(W^- \geq 31.5) > 0.064$. This is a two-tailed test, so the
p-value > $2 \times 0.064 = 0.128$. We fail to reject $H_0$. There is
insufficient evidence to suggest a difference in the locations of the populations of
sales of the telemarketers before and after the script is changed.

9. We have seen a similar problem in Develop Your Skills 9.1 Exercise 4, but the data
set has changed. This is a large data set, so we will use Excel.

A histogram of differences is shown below.


[Histogram: Number of Hours Studied Over a Four-Week Period. Horizontal axis: (Hours
Studied by Male Students) - (Hours Studied by Female Students). Vertical axis: Number
of Student Pairs.]



This histogram is fairly symmetric, and almost normal-looking. The problem is that
there is an outlier, which significantly affects the mean in this data set. This outlier
may be an error (or a lie). It occurs with an observation of a male student who
claims to have studied 437 hours over the 4-week period. Since this amounts to
about 15 hours of studying per day, it is suspicious. However, we have no way to
check the accuracy of the data.

$H_0$: populations of hours of study for male and female students are the same
$H_1$: population of hours of study for male students is to the left of (below) the
population of hours of study for female students
$\alpha = 0.04$

We will use the Excel add-in Wilcoxon Signed Rank Sum Test Calculations to get
$W^+$ and $W^-$ for this data set. The output is as follows.


Wilcoxon Signed Rank Sum Test Calculations
sample size 100
W+ 2189
W- 2861


Since sample size is large, at 100, we can use the Excel template based on the normal
approximation to the sampling distribution for this test.


Making Decisions About Matched Pairs, Quantitative Data, Non-Normal Differences (WSRST)
Sample Size                                   100
Is the sample size at least 25?               yes
Is the histogram of differences symmetric?    yes
W+                                            2189
W-                                            2861
z-Score                                       1.15527715
One-Tailed p-Value                            0.12398848
Two-Tailed p-Value                            0.24797695



This is a one-tailed test, so the p-value is 0.124. We fail to reject $H_0$. There is
insufficient evidence to suggest that male students study less than female students.

10. We have seen a similar problem in Develop Your Skills 9.1 Exercise 5, but the data
set has changed.

The histogram of differences is skewed to the left, and the sample size is fairly small,
so we will use the WSRST for this analysis.


[Histogram: L/100 km for Cars During Summer Months. Horizontal axis: (L/100 km Without
Checking Tire Pressure Regularly) - (L/100 km When Checking Tire Pressure Regularly).
Vertical axis: Frequency.]


$H_0$: populations of L/100 km fuel consumption for cars with and without tire
pressures checked regularly are the same
$H_1$: population of L/100 km fuel consumption for cars without tire pressures
checked regularly is to the right of the population of L/100 km fuel
consumption for cars with tire pressures checked regularly (that is, fuel
consumption is higher for cars without tire pressures checked regularly)
$\alpha = 0.04$

We will use Excel for the calculations.


Wilcoxon Signed Rank Sum Test Calculations
sample size 20
W+ 160.5
W- 49.5


Since sample size is < 25, we will use the WSRST tables.

The order of subtraction is (fuel consumption without checking tires - fuel
consumption checking tires). If checking tires regularly improves (reduces) fuel
consumption, then these differences would tend to be positive. So, we can focus on
$W^+$ for the p-value.

$0.01 < P(W^+ \geq 160.5) < 0.024$ (from the table)
This is a one-tailed test. The p-value is below $\alpha = 0.04$, so we reject $H_0$.
There is sufficient evidence to suggest that the population of L/100 km fuel
consumption for cars without tire pressures checked regularly is to the right of the
population of L/100 km fuel consumption for cars with tire pressures checked regularly.

However, as noted before, although this provides some evidence that checking tires
regularly reduces fuel consumption, there may be other explanatory factors at play.
Although fuel consumption was recorded during two summer months, driving
behavior and weather could have been quite different in the two months, and so we
cannot consider this test to be definitive.


Develop Your Skills 9.3
11. $H_0: p = 0.5$ (half the cola drinkers prefer Cola A, half prefer the other brand)
$H_1: p \neq 0.5$ (there is a difference in preferences for Cola A and the other brand)
$\alpha = 0.05$
$n_{ST} = 16 - 1 = 15$
$n^+ = 9$, so $n^- = 6$

$P(n^+ \geq 9, n = 15, p = 0.5) = 1 - P(n^+ \leq 8) = 1 - 0.696 = 0.304$
Since this is a two-tailed test, the p-value = $2 \times 0.304 = 0.608$.
We fail to reject $H_0$. There is insufficient evidence to suggest there is a difference in
preferences for Cola A and the other brand.
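
The binomial probability used above can be reproduced without tables. The sketch below
is illustrative only; the helper name is our own, and it relies solely on the Python
standard library.

```python
from math import comb

def binomial_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k, n + 1))

n_st = 15      # number of non-zero differences
n_plus = 9     # number of positive differences

one_tail = binomial_upper_tail(n_plus, n_st)
two_tail = 2 * one_tail
print(f"one-tailed = {one_tail:.3f}, two-tailed = {two_tail:.3f}")
```

The two-tailed value computed this way differs very slightly from 0.608, because the
table-based answer doubles a probability that was already rounded to three decimals.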

12. First, assess the differences in the ratings.


Shopper   Rating for     Rating for      Difference
          Ford Dealer    Honda Dealer    (Ford - Honda)
1         5              1               +
2         2              3               -
3         3              4               -
4         4              2               +
5         2              3               -
6         2              2               0
7         5              1               +
8         3              2               +
9         3              3               0

$H_0: p = 0.5$ (the ratings for the car shopping experience are the same at the Ford and
Honda dealers)
$H_1: p \neq 0.5$ (there is a difference in ratings for the car shopping experience at the
Ford and Honda dealers)
$\alpha = 0.04$
$n_{ST} = 9$ (differences) $- 2$ (differences of zero) $= 7$
$n^+ = 4$, so $n^- = 3$
Because this is a two-tailed test, we can focus on either $n^+$ or $n^-$. It is easier to
use the lower numbers, with the tables.
$P(n^- \leq 3, n_{ST} = 7, p = 0.5) = 0.5$
This is a two-tailed test, so p-value = $2 \times 0.5 = 1.0$.
We fail to reject $H_0$. There is insufficient evidence to suggest that there is a
difference in the ratings for the car shopping experience at the Ford and Honda
dealers.

13. First, analyze the differences in the ratings.


Analyst   Rating for       Rating for   Difference
          North America    Europe       (North America - Europe)
1         3                4            -
2         2                3            -
3         4                2            +
4         3                2            +
5         3                1            +
6         2                3            -
7         3                2            +
8         3                2            +
9         3                4            -
10        2                1            +
11        4                4            0

$H_0: p = 0.5$ (the ratings by analysts for the North American and European economies
are the same)
$H_1: p \neq 0.5$ (there is a difference in ratings for the North American and European
economies by all analysts)
$\alpha = 0.03$
$n_{ST} = 11 - 1 = 10$
$n^+ = 6$, $n^- = 4$
$P(n^- \leq 4, n_{ST} = 10, p = 0.5) = 0.377$
This is a two-tailed test, so p-value = $2 \times 0.377 = 0.754$.
We fail to reject $H_0$. There is insufficient evidence to suggest that there is a
difference in the ratings for the North American and European economies by
analysts.

14. $H_0: p = 0.5$ (wine-drinkers rate Californian and French wines the same)
$H_1: p > 0.5$ (wine-drinkers rate Californian wines higher than French wines, where p
is the proportion of wine-drinkers who prefer Californian wines)
$\alpha = 0.03$

Use the Excel template.


Making Decisions About Matched Pairs, Ranked Data (Sign Test)
Number of Non-Zero Differences    225
Number of Positive Differences    150
Number of Negative Differences    75
One-Tailed p-Value                3.2E-07
Two-Tailed p-Value                6.4E-07



This is a one-tailed test, so the p-value is $3.2 \times 10^{-7}$, which is very small.
It would be almost impossible to get sample results like this, if wine-drinkers rated
Californian and French wines the same. We reject $H_0$. There is sufficient evidence to
suggest that wine-drinkers rate Californian wines higher than French wines. This
conclusion presumes that wine-drinkers who attend wine and cheese shows are
representative of all wine drinkers, which may not be the case.

15. $H_0: p = 0.5$ (potential customers are equally ready to buy an HDTV before and after
seeing an ad about HDTVs)
$H_1: p > 0.5$ (potential customers are more ready to buy an HDTV after seeing an ad
about HDTVs; p is the proportion of potential customers more likely to buy an
HDTV after seeing the ad)
$\alpha = 0.05$

First we use the Non Parametric Tool, and the Sign Test Calculations, to analyze the
data.


Sign Test Calculations
#of non-zero differences 132
#of positive differences 47
#of negative differences 85


The order of subtraction is (willingness to buy before the ad - willingness to buy after
the ad). A higher number indicates a greater willingness to buy, so if the ad increases
willingness to buy, this should result in more minus signs. We see that there are 85
negative differences, which indicates that 85 of 132 customers increased their
willingness to buy after seeing the ad.


Making Decisions About Matched Pairs, Ranked Data (Sign Test)
Number of Non-Zero Differences    132
Number of Positive Differences    47
Number of Negative Differences    85
One-Tailed p-Value                0.0006
Two-Tailed p-Value                0.0012



From the template, we see the one-tailed p-value is 0.0006, which is quite small. We
reject $H_0$. There is sufficient evidence to infer that willingness to buy an HDTV
increased after potential customers saw the ad.

We can also do this question by hand. Sampling is done without replacement. There
are 132 customers in the sample, presumably less than 5% of all potential customers
for HDTVs.

$n_{ST} = 132 > 20$, so the sampling distribution of $\hat{p}$ will be approximately
normal.

$$z = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n_{ST}}}} = \frac{\frac{85}{132} - 0.5}{\sqrt{\frac{(0.5)(0.5)}{132}}} = 3.31$$

p-value = $P(z > 3.31) = 1 - 0.9995 = 0.0005$
Since the p-value is < $\alpha$, we reject $H_0$. There is sufficient evidence to suggest
that potential customers are more ready to buy an HDTV after seeing an ad about
HDTVs. Notice that the p-values with the template (based on the binomial
distribution) and the by-hand approximation are almost equal here.
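
The comparison between the normal approximation and the binomial distribution can be
checked directly. The following is a sketch under our own naming, using only the Python
standard library; it is not the Excel template, and it uses the unrounded z-score, so
the approximate p-value may differ in the last decimal from the table-based answer.

```python
from math import comb, sqrt
from statistics import NormalDist

n_st = 132       # number of non-zero differences
n_minus = 85     # negative differences (more willing to buy after the ad)

# Normal approximation to the sampling distribution of the sample proportion.
p_hat = n_minus / n_st
z = (p_hat - 0.5) / sqrt(0.5 * 0.5 / n_st)
p_approx = 1 - NormalDist().cdf(z)

# Exact binomial tail probability P(X >= 85) for X ~ Binomial(132, 0.5).
p_exact = sum(comb(n_st, x) for x in range(n_minus, n_st + 1)) / 2**n_st

print(f"z = {z:.2f}, approximate p = {p_approx:.4f}, exact p = {p_exact:.4f}")
```

Both values are far below the 5% level of significance, so the decision is the same
either way.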

Chapter Review Exercises
1. Matched-pairs samples are better than independent samples for exploring cause and
effect, because they control some of the potential causal variables, and therefore take
them out of the picture. If we match Business grads according to age, experience,
location, and academic performance, then we know that any difference in salary is
not caused by these factors.

2. These cannot be matched pairs, because the sample sizes are different. You can be
sure that the samples are independent if the sample sizes are different.

3. The two different approaches will usually, but not always, lead to the same
conclusion. It is harder to reject the null hypothesis with the Wilcoxon Signed Rank
Sum Test, and this is why the t-test of $\mu_D$ is preferred, if the necessary conditions
are met. Remember, the Wilcoxon Signed Rank Sum Test works with the ranks of the
values, not the actual values, and so it gives up some of the information available in
the sample data.

4. The computer-based version of the Sign Test is based on the binomial distribution.
The version using the sampling distribution of $\hat{p}$ is an approximation. While the
approximation can be quite good, the actual value provided by the binomial
distribution is more accurate.

5. Tom is right. He has just expressed your conclusion in a different way. If the new
version is rated more highly than the old version, we can also say that the old version
ratings are lower than the new version ratings. This exercise is a reminder that you
should read and think carefully about these comparisons. Don't get mixed up in
language.

6. $H_0$: ratings for two beer recipes are the same
$H_1$: ratings for two beer recipes are different
$\alpha = 5\%$
First analyze the data.


Taste Test of Beer
Tester Beer Recipe #3 Beer Recipe #4 Difference
1 1 4 -
2 3 2 +
3 2 1 +
4 5 3 +
5 3 4 -
6 2 1 +
7 4 5 -
8 1 3 -
9 2 1 +
10 3 4 -


We see $n_{ST} = 10$, $n^+ = 5$, $n^- = 5$.
At this point, we can clearly see that we have no evidence of a difference, because
the number of positive differences exactly matches the number of negative
differences.
Fail to reject $H_0$. There is insufficient evidence of a difference in ratings for the
two beer recipes.

7. $H_0: p = 0.5$ (students rate the two designs the same)
$H_1: p \neq 0.5$ (students rate the two designs differently)
$\alpha = 0.025$

Sampling is done without replacement. We do not know the total number of students
at the college. As long as there are 8,000 or more, the sample of 400 will be less than
about 5% of the population, and we can use the binomial distribution.

$n_{ST} = 400 - 27 = 373 > 20$, so the sampling distribution of $\hat{p}$ will be
approximately normal.

$$z = \frac{\hat{p} - p}{\sqrt{\frac{pq}{n_{ST}}}} = \frac{\frac{207}{373} - 0.5}{\sqrt{\frac{(0.5)(0.5)}{373}}} = 2.12$$

p-value = $2 \times P(z > 2.12) = 2 \times (1 - 0.9830) = 2 \times 0.0170 = 0.034 > \alpha$
Fail to reject $H_0$. There is not enough evidence to suggest that the students rate the
two designs differently.

8. $H_0: \mu_D = 0$
$H_1: \mu_D > 0$
(The order of subtraction is (time without tool - time with tool). If the tool speeds
work up, times should be longer without the tool, and the differences will be
positive.)
$\alpha = 0.05$
We are told to assume the differences are normally distributed.
We are given:
$\bar{x}_D = 3.4$ minutes
$s_D = 4.6$
$n_D = 18$

$$t = \frac{\bar{x}_D - \mu_D}{s_D / \sqrt{n_D}} = \frac{3.4 - 0}{4.6 / \sqrt{18}} = 3.136$$

Referring to the t-table, we look at the row for $n_D - 1 = 17$ degrees of freedom.
The t-score of 3.136 is to the right of $t_{.005}$. So, p-value < 0.005, which is less
than $\alpha = 0.05$.
Reject $H_0$. There is sufficient evidence to suggest that the tool speeds up the work,
assuming all other explanatory factors are the same.

9. $H_0: \mu_D = 0$
$H_1: \mu_D > 0$
(The order of subtraction is (price for job in wealthy neighbourhood - price for job in
run-down neighbourhood). If the contractors charge more in the wealthier
neighbourhoods, the differences will be positive.)
$\alpha = 0.05$
We are told to assume the differences are normally distributed.
We are given:
$\bar{x}_D = 1262$
$s_D = 478$
$n_D = 10$

The Excel template is shown below.


Making Decisions About the Population Mean with a Single Sample
Do the sample data appear to be normally distributed?   yes
Sample Standard Deviation s                             478
Sample Mean                                             1262
Sample Size n                                           10
Hypothetical Value of Population Mean                   0
t-Score                                                 8.34894
One-Tailed p-Value                                      7.9E-06
Two-Tailed p-Value                                      1.6E-05



The p-value is very small. Reject $H_0$. There is sufficient evidence to suggest that the
contractors charge higher prices in wealthier neighbourhoods.

10. The Excel template is shown below. Of course, this could also be done without
Excel.


Confidence Interval Estimate for the Population Mean
Do the sample data appear to be normally distributed?   yes
Confidence Level (decimal form)                         0.95
Sample Mean                                             1262
Sample Standard Deviation s                             478
Sample Size n                                           10
Upper Confidence Limit                                  1603.94
Lower Confidence Limit                                  920.059



We have 95% confidence that the interval ($920, $1,604) contains the true premium
that contractors charge on a bathroom renovation in a wealthy neighbourhood.

11. $H_0: p = 0.5$ (diners rate the two salads the same)
$H_1: p > 0.5$ (diners rate the mixed green salad higher, where p is defined as the
proportion of diners who prefer the mixed green salad)
$\alpha = 0.03$

Sampling is done without replacement. We do not know the total number of diners
at the restaurant. As long as there are 700 or more, the sample of 35 will be less than
about 5% of the population, and we can use the binomial distribution.

$n_{ST} = 35 - 3 = 32$

We could use the sampling distribution of $\hat{p}$ here, but the approximation will not
be that good, because the sample size is fairly small. Instead we will use the Excel
template.


Making Decisions About Matched Pairs, Ranked Data (Sign Test)
Number of Non-Zero Differences    32
Number of Positive Differences    20
Number of Negative Differences    12
One-Tailed p-Value                0.10766
Two-Tailed p-Value                0.21533



This is a one-tailed test. The p-value is 0.108. Fail to reject $H_0$. There is
insufficient evidence to suggest that diners prefer the mixed green salad.

12. $H_0$: people are willing to pay similar prices for a spa weekend in the city and in
the country
$H_1$: people are willing to pay more for a spa weekend in the country
$\alpha = 0.03$

The order of subtraction was (price for country spa weekend - price for city spa
weekend). A large $W^+$ provides evidence in favour of $H_1$.

p-value = $P(W^+ \geq 1751)$

Since the sample size is > 25, we can use the normal approximation to the sampling
distribution of W.

$$z = \frac{W - \mu_W}{\sigma_W} = \frac{W - \frac{n_W(n_W + 1)}{4}}{\sqrt{\frac{n_W(n_W + 1)(2n_W + 1)}{24}}} = \frac{1751 - \frac{75(75 + 1)}{4}}{\sqrt{\frac{75(75 + 1)(2(75) + 1)}{24}}} = \frac{326}{189.37397} = 1.72$$

p-value = $P(z > 1.72) = 1 - 0.9573 = 0.0427$
Fail to reject $H_0$. There is not enough evidence to suggest that people are willing to
pay more for a spa weekend in the country than a spa weekend in the city, at the 3%
level of significance.
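
The normal approximation for the rank sum can be sketched as a small helper. The
function name is our own, only the Python standard library is used, and the p-value is
computed from the unrounded z-score, so it may differ slightly from the table-based
0.0427.

```python
from math import sqrt
from statistics import NormalDist

def wsrst_z(w, n_w):
    """z-score for a rank sum w under the normal approximation (n_w >= 25)."""
    mean_w = n_w * (n_w + 1) / 4
    sd_w = sqrt(n_w * (n_w + 1) * (2 * n_w + 1) / 24)
    return (w - mean_w) / sd_w

z = wsrst_z(1751, 75)
p_value = 1 - NormalDist().cdf(z)   # one-tailed, upper tail
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

For a two-tailed test, the upper-tail probability of the larger rank sum would be
doubled.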

13. There is sufficient evidence to reject the hypothesis that the tasks are completed in
the same time with the two programs. There is sufficient evidence, at the 5% level of
significance, to suggest that there is a difference in the amount of time it takes to
complete tasks with the two programs. The new software would be recommended.
The interval (3.9 minutes, 14.3 minutes) probably contains the average reduction in
time on task with the new software.

14. $H_0$: There is no difference in the locations of the populations of sales of soup
for each package design.
$H_1$: There is a difference in location of the population of sales of soup for each
package design.
$\alpha = 0.05$

$$z = \frac{W - \mu_W}{\sigma_W} = \frac{W - \frac{n_W(n_W + 1)}{4}}{\sqrt{\frac{n_W(n_W + 1)(2n_W + 1)}{24}}} = \frac{300 - \frac{26(26 + 1)}{4}}{\sqrt{\frac{26(26 + 1)(2(26) + 1)}{24}}} = \frac{124.5}{39.373214} = 3.16$$

p-value = $2 \times P(z > 3.16) = 2 \times (1 - 0.9992) = 2 \times 0.0008 = 0.0016 < 0.05$
Reject $H_0$. There is sufficient evidence to suggest that there is a difference in sales
of soup for each package design.

15. $H_0: \mu_D = 0$
$H_1: \mu_D < 0$ (for order of subtraction (new business before training - new business
after training))
$\alpha = 0.025$

We are told we can assume the differences are normally distributed.
First, calculate the differences.


Staff Member   Monthly New Business       Monthly New Business      Difference
               Before Training ($000s)    After Training ($000s)    (Before - After)
Shirley        $230                       $240                      -$10
Tom            $150                       $165                      -$15
Janice         $100                       $90                        $10
Brian          $75                        $100                      -$25
Ed             $340                       $330                       $10
Kim            $500                       $525                      -$25


Using standard formulas, we calculate:
$\bar{x}_D = -\$9.16667$
$s_D = 15.942605$
$n_D = 6$

$$t = \frac{\bar{x}_D - \mu_D}{s_D / \sqrt{n_D}} = \frac{-9.16667 - 0}{15.942605 / \sqrt{6}} = -1.408$$

We refer to the t-table, looking at the row of critical values for 5 degrees of freedom.
Since $t_{.100} = 1.476$, we know $P(t \leq -1.408) > 0.10$. Fail to reject $H_0$. There is
insufficient evidence to infer that monthly new business increased after the training.

16.

$$\bar{x}_D \pm (\text{critical } t\text{-score})\left(\frac{s_D}{\sqrt{n_D}}\right) = -10.833333 \pm 2.571\left(\frac{16.557979}{\sqrt{6}}\right) = (-28.21,\ 6.542)$$


Yes, the interval should contain zero, since there was not enough evidence to
conclude there was a difference in monthly new business before and after the
training. A one-tailed test with α = 0.025 corresponds to a 95% confidence interval,
which has 0.025 in each tail.
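
The interval arithmetic can be sketched in Python as follows. The inputs are the summary values and the critical t-score (2.571, for 5 degrees of freedom) that appear in the calculation above; the helper name `paired_t_interval` is hypothetical. Because the critical value is rounded, the limits agree with the printed interval only to within a few thousandths.

```python
import math

def paired_t_interval(mean_d, sd_d, n, t_critical):
    """Confidence interval for the mean difference: mean_d +/- t_critical * sd_d / sqrt(n)."""
    margin = t_critical * sd_d / math.sqrt(n)
    return mean_d - margin, mean_d + margin

lower, upper = paired_t_interval(mean_d=-10.833333, sd_d=16.557979, n=6, t_critical=2.571)
contains_zero = lower < 0 < upper

print(round(lower, 2), round(upper, 2), contains_zero)
```

An interval that contains zero is consistent with failing to reject the null hypothesis of no difference at the matching significance level.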

17. With non-normal differences, we must use the Wilcoxon Signed Rank Sum Test.


Ordered absolute differences:  10, 10, 10, 15, 25, 25
Ranks to be assigned:          1, 2, 3, 4, 5, 6 (sum 21)
Averaged ranks for ties:       2, 2, 2, 4, 5.5, 5.5

Staff Member   Difference   Absolute Value    Rank for Positive   Rank for Negative
                            of Difference     Difference          Difference
Shirley        -$10         10                                    2
Tom            -$15         15                                    4
Janice         $10          10                2
Brian          -$25         25                                    5.5
Ed             $10          10                2
Kim            -$25         25                                    5.5
sums                                          W+ = 4              W- = 17


Because of the order of subtraction, we expect W- to be the larger rank sum, which
it is.
p-value = P(W ≥ W-) = P(W ≥ 17)
From the table, we see P(W ≥ 18) = 0.078, so P(W ≥ 17) > 0.078.
Fail to reject H0. There is insufficient evidence to suggest that monthly new business
increased after the training.
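
The rank sums and the tabled probability can be checked with a short Python sketch. The function names are illustrative. The exact tail probability is computed by listing every way the ranks 1 to 6 could be assigned to positive or negative differences, which assumes untied ranks (the same assumption behind the printed table), so it is only an approximation when ties are present, as they are here.

```python
from itertools import combinations

def signed_rank_sums(differences):
    """Return (W+, W-) using averaged ranks for tied absolute differences."""
    nonzero = [d for d in differences if d != 0]
    ordered = sorted(abs(d) for d in nonzero)
    # Average the 1-based positions occupied by each absolute value
    avg_rank = {}
    for value in set(ordered):
        positions = [i + 1 for i, v in enumerate(ordered) if v == value]
        avg_rank[value] = sum(positions) / len(positions)
    w_plus = sum(avg_rank[abs(d)] for d in nonzero if d > 0)
    w_minus = sum(avg_rank[abs(d)] for d in nonzero if d < 0)
    return w_plus, w_minus

def exact_upper_tail(n, w):
    """P(W >= w) under H0 for untied ranks 1..n, each sign equally likely."""
    ranks = range(1, n + 1)
    count = sum(
        1
        for k in range(n + 1)
        for subset in combinations(ranks, k)
        if sum(subset) >= w
    )
    return count / 2 ** n

w_plus, w_minus = signed_rank_sums([-10, -15, 10, -25, 10, -25])
p_at_18 = exact_upper_tail(6, 18)
p_at_17 = exact_upper_tail(6, 17)
print(w_plus, w_minus, p_at_18, p_at_17)
```

Either probability is well above α = 0.025, so the decision to fail to reject H0 does not depend on the tie adjustment.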


18. Because these are ranked data, we will use the Sign Test.
First, record the differences in the ratings.


Taste Test of Yogurt Formulations
Taster Recipe 1 Recipe 2 Difference
1 1 2 -
2 4 1 +
3 2 3 -
4 5 4 +
5 3 2 +
6 2 1 +
7 3 2 +
8 2 5 -
9 5 2 +
10 4 3 +


H0: p = 0.5 (tasters rate the two formulations of yogurt the same)
H1: p ≠ 0.5 (tasters prefer one yogurt over the other)
α = 0.025

nST = 10
n+ = 7
n- = 3
P(n- ≤ 3, nST = 10, p = 0.5) = 0.172
p-value = 2 × 0.172 = 0.344 > 0.025
Fail to reject H0. There is insufficient evidence to suggest that the tasters prefer one
yogurt over the other.
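
The binomial probability behind the Sign Test can be computed directly. This is a minimal Python sketch using the standard library; `binomial_cdf` is an illustrative helper, not a function from the textbook's add-in.

```python
from math import comb

def binomial_cdf(k, n, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n_nonzero = 10  # number of non-zero differences
n_minus = 3     # number of negative differences

one_tail = binomial_cdf(n_minus, n_nonzero)
two_tailed_p = 2 * one_tail  # H1 is two-sided
print(round(one_tail, 3), round(two_tailed_p, 3))
```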

19. H0: μD = 0
H1: μD < 0 (for order of subtraction (completion time with new-style drill -
completion time with old-style drill))
α = 0.05

We are told to assume the differences in completion times are normally distributed.
We are given
x̄D = -5.2
sD = 12.2
nD = 20

$$t = \frac{\bar{x}_D - \mu_D}{s_D/\sqrt{n_D}} = \frac{-5.2 - 0}{12.2/\sqrt{20}} = -1.91$$


We refer to the t-table, looking at the row with n - 1 = 20 - 1 = 19 degrees of freedom.
We see that 1.91 is located between t.050 and t.025. This is a one-tailed test.

0.025 < p-value < 0.05

Reject H0. There is sufficient evidence to suggest that task completion times with the
new-style drill are shorter than with the old-style drill.
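
Bracketing a p-value between two t-table columns, as done above, can be sketched in Python. The critical values listed are the standard t-table entries for 19 degrees of freedom; the helper `bracket_one_tailed_p` is hypothetical and assumes the t-score falls in the direction stated by H1.

```python
import math

# (one-tail area, critical value) pairs for 19 degrees of freedom from a standard t-table
T_TABLE_DF19 = [(0.100, 1.328), (0.050, 1.729), (0.025, 2.093), (0.010, 2.539), (0.005, 2.861)]

def bracket_one_tailed_p(t_score, table):
    """Return (low, high) bounds on the one-tailed p-value from tabled critical values."""
    magnitude = abs(t_score)
    low, high = 0.0, 0.5  # a t-score in the direction of H1 has a tail area below 0.5
    for area, critical in table:
        if magnitude >= critical:
            high = min(high, area)  # at least this far out, so the p-value is at most this area
        else:
            low = max(low, area)    # not this far out, so the p-value exceeds this area
    return low, high

t_score = (-5.2 - 0) / (12.2 / math.sqrt(20))
low, high = bracket_one_tailed_p(t_score, T_TABLE_DF19)
print(round(t_score, 2), low, high)
```

The upper bound is at or below α = 0.05, which is all that is needed to reject H0.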


20.
a. We could consider these quantitative data, as we do with scores on a statistics test,
and that is how we will proceed here. But an argument could also be made that these
are ranked data. It may not be possible to produce scores for self-esteem that are
objective and reproducible.

b. Differences in test scores do not appear normal, but are somewhat symmetric.


[Histogram: Differences in Scores on Self-Esteem Test, Before and After Seminar.
Horizontal axis: (Score Before Seminar) - (Score After Seminar); vertical axis: Frequency.]



H0: self-esteem test scores are the same before and after the seminar
H1: self-esteem test scores are higher after the seminar
α = 5%

We used the Wilcoxon Signed Rank Sum Test Calculations in the Non Parametric Tools
add-in to get the rank sums shown below.


Wilcoxon Signed Rank Sum Test Calculations
sample size 18
W+ 52
W- 119



Since the sample size is less than 25, we use the tables to estimate the p-value. The
order of subtraction was (test scores before the seminar) - (test scores after the
seminar). If the seminar increased test scores, these differences would tend to be
negative. So, we focus on W-, which is 119. P(W- ≥ 119) is the p-value. This rank
sum is lower than any shown in the table. We conclude p-value > 0.054.
Fail to reject H0. There is not enough evidence to suggest that the self-esteem test
scores are higher after the seminar.

21. A histogram of the differences is shown below.


[Histogram: Differences in Salaries of Business and Computer Studies Graduates.
Horizontal axis: (Salary of Business Graduate) - (Salary of Computer Studies Graduate);
vertical axis: Frequency.]



The differences are not perfectly normally distributed. However, the sample size is
fairly large, so we will continue with the t-test.


H0: μD = 0
H1: μD ≠ 0
(The order of subtraction is Business salary - Computer Studies salary.)
α = 0.025

We can use Excel functions and the template, as shown below.


Making Decisions About the Population Mean with a Single Sample
Do the sample data appear to be normally distributed?   yes
Sample Standard Deviation s                              4684.629
Sample Mean                                              $533.333
Sample Size n                                            30
Hypothetical Value of Population Mean                    0
t-Score                                                  0.623569
One-Tailed p-Value                                       0.268893
Two-Tailed p-Value                                       0.537785



The two-tailed p-value is 0.538 > 0.025. Fail to reject H0. There is insufficient
evidence to suggest that salaries of Business grads are different from salaries of
Computer Studies grads.
22. A histogram of the differences is shown below.


[Histogram: Differences in Commuting Times for Workers at a Honda Plant in Alliston.
Horizontal axis: (Commuting Time in Minutes for 8 AM Arrival) - (Commuting Time in
Minutes for 9 AM Arrival); vertical axis: Frequency.]



The differences appear non-normal, but at least somewhat symmetric. We will use
the WSRST.

H0: commuting times are the same for 8 am start and 9 am start times
H1: commuting times are less for the 8 am start time
α = 0.04

We use the add-in to compute the rank sums.


Wilcoxon Signed Rank Sum Test Calculations
sample size 30
W+ 115.5
W- 349.5



W- is the larger rank sum, which supports H1, given the order of subtraction is (8 am
start commuting time - 9 am start commuting time).

Since the sample size is large, we can use the normal approximation to the sampling
distribution of W. The Excel template is shown below (this could also be done by
hand).


Making Decisions About Matched Pairs, Quantitative Data, Non-Normal Differences (WSRST)
Sample Size                                   30
Is the sample size at least 25?               yes
Is the histogram of differences symmetric?    yes
W+                                            115.5
W-                                            349.5
z-Score                                       2.4064957
One-Tailed p-Value                            0.0080532
Two-Tailed p-Value                            0.01610639



This is a one-tailed test. The p-value is 0.008 < 0.04. Reject H0. There is sufficient
evidence to suggest that commuting times are lower for the earlier start time.

23. As usual, with a small data set, it is difficult to assess normality. One possible
histogram is shown below.


[Histogram: Playing Time for Nine Holes of Golf, Ladies' League Foursomes.
Horizontal axis: (Playing Time in Minutes Before Changes) - (Playing Time in Minutes
After Changes); vertical axis: Frequency.]


The histogram is somewhat skewed to the left, but we will assume normality and
proceed.

The completed Excel template is shown below.


Making Decisions About the Population Mean with a Single Sample
Do the sample data appear to be normally distributed?   yes
Sample Standard Deviation s                              18.1491
Sample Mean                                              -13.13
Sample Size n                                            23.00
Hypothetical Value of Population Mean                    0
t-Score                                                  -3.4697
One-Tailed p-Value                                       0.00109
Two-Tailed p-Value                                       0.00218



The order of subtraction is (Playing Times Before Changes) - (Playing Times After
Changes). If the changes increased the speed of play, we would expect a negative
difference. The p-value is 0.001 < 0.05. Reject H0. There is enough evidence to
conclude that playing times were faster after the changes were made. Note that the
change in approach may have caused the faster play. However, it may be that the fact
that the course marshal was obviously focused on faster play was the real cause.


24. We have already assessed normality. The completed Excel template for the
confidence interval estimate is shown below.


Confidence Interval Estimate for the Population Mean
Do the sample data appear to be normally distributed?   yes
Confidence Level (decimal form)                          0.9
Sample Mean                                              -13.13
Sample Standard Deviation s                              18.1491
Sample Size n                                            23.00
Upper Confidence Limit                                   -6.6321
Lower Confidence Limit                                   -19.629



A 90% confidence interval estimate for the difference in playing times is (-19.6
minutes, -6.6 minutes). We have 90% confidence that the interval from 6.6 minutes
to 19.6 minutes contains the reduction in playing times.

25. Because these are matched pairs of ranked data, we will use the Sign Test.

H0: p = 0.5 (employees rate the two presidents the same)
H1: p > 0.5 (employees rate the new president higher than the old president)
α = 0.04

We can use the Non Parametric Tools Add-In, the Sign Test Calculations, to analyze
the ratings. The output is shown below.


Sign Test Calculations
# of non-zero differences    8
# of positive differences    6
# of negative differences    2


We can then use the Sign Test template to complete the hypothesis test.


Making Decisions About Matched Pairs, Ranked Data (Sign Test)
Number of Non-Zero Differences    8
Number of Positive Differences    6
Number of Negative Differences    2
One-Tailed p-Value                0.14453125
Two-Tailed p-Value                0.2890625



We see that the one-tailed p-value is 0.145 > 0.04. Fail to reject H0. There is not
enough evidence to conclude that employees rate the new president higher than the
old president.

26.
a. First analyze the data. The histogram of differences looks normal.


[Histogram: Differences in Times to Complete Search Tasks Using Different Search Engines.
Horizontal axis: (Time to Complete Search Task in Minutes, Using Old Search Engine) -
(Time to Complete Search Task in Minutes, Using New Search Engine); vertical axis:
Frequency.]



The completed Excel template for the t-test of μD is shown below.


Making Decisions About the Population Mean with a Single Sample
Do the sample data appear to be normally distributed?   yes
Sample Standard Deviation s                              4.94218
Sample Mean                                              2.61765
Sample Size n                                            34
Hypothetical Value of Population Mean                    0
t-Score                                                  3.08839
One-Tailed p-Value                                       0.00203
Two-Tailed p-Value                                       0.00406



We could also have used the Data Analysis tool, to get the following output.


t-Test: Paired Two Sample for Means
                               Time to Complete Search Task,   Time to Complete Search Task,
                               Using Old Search Engine         Using New Search Engine
Mean                           15.35294118                     12.73529412
Variance                       40.4171123                      32.38235294
Observations                   34                              34
Pearson Correlation            0.668571931
Hypothesized Mean Difference   0
df                             33
t Stat                         3.088389544
P(T<=t) one-tail               0.002031766
t Critical one-tail            1.692360258
P(T<=t) two-tail               0.004063531
t Critical two-tail            2.034515287



Either way, the result is the same.

H0: μD = 0
H1: μD > 0
(The order of subtraction is (Time Using Old Search Engine - Time Using New
Search Engine).)
α = 0.05

p-value = 0.002 < 0.05 (this is a one-tailed test, so we use the one-tailed p-value)

Reject H0. There is enough evidence to suggest that the times to complete search
tasks are longer with the old search engine.
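
The template and the Data Analysis output agree because the variance of the paired differences can be recovered from the two sample variances and the Pearson correlation. The following Python sketch illustrates this using only the figures printed in the Data Analysis output; the variable names are illustrative.

```python
import math

# Values taken from the "t-Test: Paired Two Sample for Means" output
mean_old, mean_new = 15.35294118, 12.73529412
var_old, var_new = 40.4171123, 32.38235294
r = 0.668571931  # Pearson correlation between the paired observations
n = 34

mean_d = mean_old - mean_new
# Var(X - Y) = Var(X) + Var(Y) - 2 * r * sd(X) * sd(Y)
var_d = var_old + var_new - 2 * r * math.sqrt(var_old * var_new)
sd_d = math.sqrt(var_d)
t_stat = mean_d / (sd_d / math.sqrt(n))

print(round(mean_d, 5), round(sd_d, 5), round(t_stat, 5))
```

The recovered mean and standard deviation of the differences match the template inputs, and the t-score matches both outputs.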

b. First analyze the data. In this case, the differences do not appear normal. However,
they do appear fairly symmetric, so we will use the WSRST.


[Histogram: Differences in Times to Complete Search Tasks Using Different Search Engines.
Horizontal axis: (Time to Complete Search Task in Minutes, Using Old Search Engine) -
(Time to Complete Search Task in Minutes, Using New Search Engine); vertical axis:
Frequency.]



The output from the Non Parametric Tools Add-In for Wilcoxon Signed Rank Sum
Test Calculations is shown below.


Wilcoxon Signed Rank Sum Test Calculations
sample size 33
W+ 282
W- 279



The completed Excel template is shown below.


Making Decisions About Matched Pairs, Quantitative Data, Non-Normal Differences (WSRST)
Sample Size                                   33
Is the sample size at least 25?               yes
Is the histogram of differences symmetric?    yes
W+                                            282
W-                                            279
z-Score                                       0.02680174
One-Tailed p-Value                            0.48930893
Two-Tailed p-Value                            0.97861786



A quick look at the template reveals a very high p-value.

H0: search task completion times are the same for the old and new search engines
H1: search task completion times are lower for the new search engine
α = 0.05

p-value = 0.49 > 0.05

Fail to reject H0. There is not enough evidence to conclude that search times are
lower with the new search engine.
c. We will use the Sign Test for these ranked data.


Making Decisions About Matched Pairs, Ranked Data (Sign Test)
Number of Non-Zero Differences    29
Number of Positive Differences    21
Number of Negative Differences    8
One-Tailed p-Value                0.01206
Two-Tailed p-Value                0.02412



H0: p = 0.5 (users rate the two search engines about the same)
H1: p > 0.5 (users prefer the new search engine)
α = 0.05

p-value = 0.012 < 0.05

Reject H0. There is enough evidence to conclude that users prefer the new search
engine.
