Instructor's Solutions Manual - Chapter 3

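Chapter 3 Solutions

Develop Your Skills 3.1

1. \( \sum y = 2 + 4 + 6 + 8 = 20 \)

2. \( \sum y^2 = 2^2 + 4^2 + 6^2 + 8^2 = 120 \)
\( (\sum y)^2 = (2 + 4 + 6 + 8)^2 = 20^2 = 400 \)
The answers are different because of the different order of operations.

3. \( \bar{y} = \frac{\sum y}{n} = \frac{20}{4} = 5 \)
\( \bar{x} = \frac{\sum x}{n} = \frac{16}{4} = 4 \)

4. \( \sum (x - 4) = (1 - 4) + (3 - 4) + (5 - 4) + (7 - 4) = -3 + (-1) + 1 + 3 = 0 \)
\( \sum (y - 5) = (2 - 5) + (4 - 5) + (6 - 5) + (8 - 5) = -3 + (-1) + 1 + 3 = 0 \)

5. Consider the data set: 34, 67, 2, 31, 89, 35. For this data set, calculate:

a. \( \sum x = 34 + 67 + 2 + 31 + 89 + 35 = 258 \)

b. \( \sum x^2 = 34^2 + 67^2 + 2^2 + 31^2 + 89^2 + 35^2 = 15756 \)

c. \( \frac{\sum x}{n} = \frac{258}{6} = 43 \)

d. \[ \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} = \sqrt{\frac{15756 - \frac{258^2}{6}}{5}} = \sqrt{\frac{15756 - 11094}{5}} = \sqrt{\frac{4662}{5}} = \sqrt{932.4} = 30.535 \]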
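The arithmetic in Exercises 1 to 5 can be checked with a few lines of Python. This is only an illustrative sketch: the variable names are assumptions, and the x values 1, 3, 5, 7 are read off the deviations shown in Exercise 4.

```python
import math

# Exercises 1-4: data values as they appear in the solutions
y = [2, 4, 6, 8]
x = [1, 3, 5, 7]  # assumed from the deviations (1 - 4), (3 - 4), ...

sum_y = sum(y)                     # sum of y
sum_y_sq = sum(v ** 2 for v in y)  # square each value first, then add
sq_sum_y = sum(y) ** 2             # add first, then square the total

mean_y = sum_y / len(y)
mean_x = sum(x) / len(x)

# Deviations from the mean always add to zero
dev_x = sum(v - mean_x for v in x)
dev_y = sum(v - mean_y for v in y)

# Exercise 5: sums, mean, and the standard deviation via the computational formula
data = [34, 67, 2, 31, 89, 35]
n = len(data)
sum_d = sum(data)
sum_d_sq = sum(v ** 2 for v in data)
mean_d = sum_d / n
s = math.sqrt((sum_d_sq - sum_d ** 2 / n) / (n - 1))

print(sum_y, sum_y_sq, sq_sum_y)
print(mean_y, mean_x, dev_x, dev_y)
print(sum_d, sum_d_sq, mean_d, round(s, 3))
```

The gap between `sum_y_sq` and `sq_sum_y` is the order-of-operations point made in Exercise 2.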

Develop Your Skills 3.2

6. The mean age is 41.2, the median age is 35.5, and the mode of the ages is 30. In this case, because the data set is severely skewed to the right (as we saw when we created the histogram of ages in Develop Your Skills 2.2, Exercise 6), the median is the better measure of central tendency.

7. The mean income is $47,868.10, and the median income is $44,925. This data set is skewed to the right (as we saw when we created the histogram of incomes in Develop Your Skills 2.2, Exercise 7). As a result, the unusually high incomes have pulled the mean to the right of the median. The median is the better measure of central tendency.

Copyright 2011 Pearson Canada Inc.


8. From the stem and leaf display we constructed in Develop Your Skills 2.2, Exercise 8, we can see that this data set is slightly skewed to the right, but not much. This is reflected in the calculations of mean and median (when in doubt, calculate both!). The mean of this data set is 26.12, and the median is 26. Either would be acceptable as a measure of central tendency, but the mean is preferred, because its calculation depends on the value of every single data point in the data set.

9. Because the mean and the median are almost equal, we expect the distribution to be symmetric.

10. Because the quarterly operating profits of the oil and gas sector are highly skewed to the right, the median of $1.816 billion is the appropriate measure of central tendency. The distribution of operating profits for the manufacturing sector is not as skewed, so we might have considered using the mean as a measure of central tendency. However, we must use the median so that we are comparing the same measure for both data sets. The median quarterly operating profit for the manufacturing sector is $8.909 billion. Generally, the quarterly operating profits are much higher for the manufacturing sector than for the oil and gas sector.

Develop Your Skills 3.3

11. Since the age data are skewed to the right, the IQR is the best measure of variability. Using Excel calculations, we find: Q1 = 31, Q3 = 42, IQR = 11. The Empirical Rule could not be applied here, as the data are not symmetric and bell-shaped.

12. Since the data are skewed to the right, the IQR is the best measure of variability. Using Excel calculations we find: Q1 = $40,350, Q3 = $55,400, IQR = $15,050.

13. Since this data set is fairly symmetric with no obvious outliers, the standard deviation is the preferred measure of variability.

\[ s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}} = \sqrt{\frac{18381 - \frac{653^2}{25}}{24}} = \sqrt{\frac{18381 - 17056.36}{24}} = \sqrt{\frac{1324.64}{24}} = \sqrt{55.19333} = 7.429 \]
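When only the summary totals are available, as in Exercise 13, the same computational formula can be wrapped in a small helper. This is a hedged sketch; the function name `sd_from_sums` is hypothetical and is not taken from the text.

```python
import math

def sd_from_sums(sum_x, sum_x_sq, n):
    """Sample standard deviation from the sum of x, the sum of x squared, and n."""
    return math.sqrt((sum_x_sq - sum_x ** 2 / n) / (n - 1))

# Totals used in Exercise 13
s = sd_from_sums(sum_x=653, sum_x_sq=18381, n=25)
mean = 653 / 25  # agrees with the mean of 26.12 reported in Exercise 8

print(round(mean, 2), round(s, 3))
```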


14. Because the distribution is reasonably symmetric and bell-shaped, the Empirical Rule can be applied. You must create a histogram to check this. Shown below is one possible histogram for the data set.

[Figure: histogram, "Daily Customer Counts, Downtown Automotive"; vertical axis: Number of Days; horizontal axis: Number of Customers]

15. The mean number of daily customers at Downtown Automotive is 26.12 (calculated for Develop Your Skills 3.2, Exercise 8). The standard deviation is 7.43 (the answer to Exercise 13 above). The Empirical Rule says that about 95% of the data points will lie within 2 standard deviations of the mean.
\( \bar{x} + 2s = 26.12 + 2(7.43) = 40.98 \)
\( \bar{x} - 2s = 26.12 - 2(7.43) = 11.26 \)
If this sample is representative of the population, then about 95% of the daily customer counts will be between 11.26 and 40.98. Since the data set is (more or less) symmetric, this means about 2.5% of the data will lie below 11.26, and about 2.5% will lie above 40.98. About 97.5% of the time, the maximum number of customers Doug would need to plan for is 41.
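The Empirical Rule interval in Exercise 15 is simply the mean plus or minus k standard deviations. The sketch below shows one way to compute it; the helper name `empirical_interval` is an assumption made for illustration.

```python
def empirical_interval(mean, sd, k):
    """Return (lower, upper) = mean - k*sd, mean + k*sd."""
    return mean - k * sd, mean + k * sd

# Exercise 15: mean 26.12, standard deviation 7.43, k = 2 (about 95% of the data)
lower, upper = empirical_interval(26.12, 7.43, 2)

# For a symmetric distribution, the remaining 5% splits evenly between the tails
tail = (1 - 0.95) / 2

print(round(lower, 2), round(upper, 2), round(tail, 3))
```

Rounding the upper limit up to a whole number of customers gives the planning figure of 41.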
Develop Your Skills 3.4

16. The scatter diagram showed some (not much) evidence of a positive relationship between household income and monthly spending on restaurant meals. Since the relationship appears to be linear, the Pearson r is the appropriate measure of association. Excel calculates it as 0.42. This is positive and less than 0.5, as we would expect, since the relationship is not very strong.


17. The only choice is b (-0.88). Choices a and c are incorrect, because they are positive and the relationship is clearly negative. Choice d is not correct, because the negative relationship is obviously fairly strong.

18. The Spearman rank correlation coefficient must be used here, since the data are ranked. The Spearman r (calculated with Excel) is 0.61. This indicates a positive relationship between the recruiter's ranking and the supervisor's ranking, but the relationship is not particularly strong.

19. These are quantitative data, and the graph created for Develop Your Skills 2.5, Exercise 24 shows a linear relationship. The Pearson r is the correct measure of association. Excel calculates it at -0.67 (note that you must check for linearity of the relationship before you calculate the Pearson r). There is a negative relationship between the two variables. The greater the number of hours of paid employment during the semester, the lower the semester average mark.

20. Exhibit 3.44b is the graph that corresponds to the negative correlation coefficient of -0.90. This is obvious, since it is the only graph of the three showing a negative relationship. Exhibits 3.44a and c share the same correlation coefficient of 0.73. This is interesting because the correlation probably looks stronger in Exhibit 3.44a. However, notice that these two graphs depict exactly the same data, but with the x- and y-axes reversed. Realize that you cannot reliably eyeball the strength of a relationship. The correlation coefficient allows us to make much more precise comparisons.
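The point in Exercise 20, that reversing the axes leaves the correlation coefficient unchanged, follows from the symmetry of the Pearson r formula. The sketch below illustrates this with made-up numbers. The data are hypothetical and are not the values behind Exhibit 3.44.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient from sums of squared deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 2.9, 4.2, 3.8, 5.5, 6.1]

r_xy = pearson_r(x, y)  # x on the horizontal axis
r_yx = pearson_r(y, x)  # axes reversed

print(round(r_xy, 3), round(r_yx, 3))
```

Both calls return the same value, even though the two scatter diagrams would look different.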
Chapter Review Exercises

1. The mean mark is quite a bit higher than the median mark. This suggests that the distribution of marks is skewed to the right. It is likely that there are a few unusually high marks in the distribution.

2. The mean weekly sales for both businesses are similar, although the mean sales at the haircutting salon are a bit lower than at the day spa. However, sales at the haircutting salon are much less variable than at the day spa. This would result in a greater number of weeks with higher sales for the haircutting salon, and as a result, it would be a better purchase (all other things being equal).

3. The mean age is 26.05, and the median age is 20.5. This is as expected. Because the distribution of ages is skewed to the right, the mean is greater than the median. There are several modes in the data set: 8, 9, 12, 20. Clearly, the three lower modes are not good indications of central tendency in this data set. The standard deviation is 17.1. Calculation of the interquartile range (manual method) is as follows: The location of Q1 is 5.25, and its value is 12. The location of Q3 is 15.75, and its value is 38, so the IQR is 26.

4. Because the Pearson r is higher for Don's data set, the correlation between test marks and calories consumed (for Don) will be higher than the correlation between test marks and hours spent studying (for Jane). However, there is no obvious reason why eating more calories would result in higher test marks. There is a logical connection between hours spent studying and test marks, so this cause-and-effect relationship is stronger.

5. First, remember that with sample data, we cannot absolutely prove anything. As well, although the correlation coefficient is low, this does not mean that there is no relationship between incomes and purchases. As discussed in the answer to Develop Your Skills Exercise 23 in Chapter 2, the lack of relationship between an individual purchase and annual income does not preclude the existence of a relationship between annual purchases and annual income.

6. Both histograms showed some right-skewness, particularly the purchases by females. However, the mean and median purchases for both groups are similar. The mean purchase by females is $30.86, and the median purchase is $29.50. The mean purchase for males is $28.90, and the median purchase is $28.38. Because the means and medians are so close, we will use the mean as the measure of central tendency. On average, the purchases of females are slightly higher than the purchases of males. Because we used the mean for the measure of central tendency, we will use the standard deviation as the measure of variability. The standard deviation for purchases by females is $10.97, while the standard deviation for purchases by males is $5.39. As we saw in the histograms we created in Chapter 2 (Chapter Review Exercise 6), there is less variability in purchases by males than purchases by females.

7. Because a histogram of the data is symmetric and bell-shaped, we can apply the Empirical Rule. If the sample is representative of the population, then we can expect that about 68% of the data lie within one standard deviation of the mean, that is, between 170 cm and 184.4 cm, with 32% divided between the two tails of the distribution. This means that about 16% of young men aged 18-24 would be shorter than 170 cm. Almost all of the heights would be within three standard deviations of the mean, that is, between 155.6 cm and 198.8 cm. Therefore, there would not be many young men aged 18-24 who were taller than 199 cm.

8. This is a small data set, and so it is not possible to create a histogram to assess the shape of each location's sales distribution. However, if you order the two data sets, in both cases, there are more observations on the high end of the range than elsewhere, suggesting some skewness. Therefore the interquartile range is probably the best measure of variability. If you do the calculations by hand, the results are as follows. Both data sets have 7 data points.
Q1 location is the 0.25(n+1) = 0.25(8) = 2nd place
Q3 location is the 0.75(n+1) = 0.75(8) = 6th place

        Red Deer    Vernon
Q1      109.55      112.30
Q3      122.48      122.01
IQR      12.93        9.71

The Red Deer location's sales are more variable than the Vernon location's sales over the period. If you do the calculations with Excel, the numerical results are different but the conclusion is the same.

        Red Deer    Vernon
Q1      112.23      114.795
Q3      122.42      121.675
IQR      10.19        6.88

9. The mean price of the inkjet printer cartridges is $26.93. The median price is $25.95.
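The manual quartile method used in Exercises 3 and 8 places Q1 at position 0.25(n+1) and Q3 at position 0.75(n+1) in the ordered data. The sketch below is one possible implementation. Interpolating between neighbouring values when the location is fractional is an assumption here, and the seven data values are hypothetical, since the Red Deer and Vernon sales figures are not reproduced in these solutions.

```python
def quartile_location(n, p):
    """1-based location of a quartile in ordered data: p * (n + 1)."""
    return p * (n + 1)

def quartile(sorted_values, p):
    """Value at location p*(n+1), interpolating between neighbours (assumed)."""
    n = len(sorted_values)
    loc = quartile_location(n, p)
    whole = int(loc)      # whole-number part of the location
    frac = loc - whole    # fractional part of the location
    if whole < 1:
        return sorted_values[0]
    if whole >= n:
        return sorted_values[-1]
    below = sorted_values[whole - 1]
    above = sorted_values[whole]
    return below + frac * (above - below)

# Hypothetical ordered data set with 7 points, for illustration only
data = [10, 12, 15, 18, 20, 25, 30]
q1 = quartile(data, 0.25)
q3 = quartile(data, 0.75)
iqr = q3 - q1

print(quartile_location(7, 0.25), quartile_location(7, 0.75))
print(quartile_location(20, 0.25), quartile_location(20, 0.75))
print(q1, q3, iqr)
```

With n = 7 the locations are the 2nd and 6th places, as in Exercise 8. With n = 20 they are 5.25 and 15.75, matching the locations quoted in Exercise 3. Excel uses a different location rule, which is why its quartiles differ from the manual ones.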

10. The mean weight of the honey in the jars is 497.3 grams. While this is below 500 grams, it is not much below. Without knowing more about the variability of the weights of honey in the jars, we cannot make a conclusion about whether the jars are being consistently underfilled. When you master the techniques of Chapter 7, you will be able to decide.


11. This data set is reasonably symmetric and bell-shaped. The mean weekly sales for the sample of stores trying out the new marketing approach are $5101.07, with a standard deviation of $325.60. Applying the Empirical Rule, almost all of the sales would be between $4124.28 and $6077.85.

[Figure: histogram, "Weekly Sales for Stores with New Marketing Approach"; vertical axis: Number of Stores; horizontal axis: Weekly Sales]

12. First we must assess the shape of the distribution. The histogram below shows a reasonably symmetric and bell-shaped data set.

[Figure: histogram, "Annual Days Off (Other Than Vacation) for a Random Sample of Employees"; vertical axis: Number of Employees; horizontal axis: Days Off]


The mean days off is 6.04, with a standard deviation of 1.29. The sample mean is below the average days off in the past, so there is a reason to hope that there has been an improvement. However, we need to do a formal hypothesis test (covered in Chapter 7) to determine whether there is sufficient evidence to conclude that average days off among all employees has actually decreased. As well, even if we conclude that there has been a decrease in average days off, we cannot necessarily conclude that the wellness program is the cause. Other factors may account for the difference, such as a change in the workforce.

13. Since there are many tied values, it is a bit of a challenge to do the ranking process. The results are as follows.

Ratings by Customers    Rank    Ratings by Bosses    Rank
2                       4       3                    7.5
2                       4       4                    9.5
3                       8       2                    4.5
4                       10      1                    1.5
3                       8       2                    4.5
3                       8       2                    4.5
2                       4       4                    9.5
2                       4       1                    1.5
1                       1       3                    7.5
2                       4       2                    4.5
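The tied ranks for Exercise 13 can be checked with a short sketch. Two assumptions are made: tied values receive the average of the ranks they occupy, and the Spearman coefficient is computed as the Pearson r of the two rank columns. The function names are hypothetical, and the ratings are read column by column from the table in this exercise.

```python
import math

def average_ranks(values):
    """Rank from smallest (1) upward; tied values share the mean of their ranks."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1   # first rank position held by this value
        count = ordered.count(v)       # number of tied observations
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]

def pearson_r(x, y):
    """Pearson correlation coefficient from sums of squared deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

customers = [2, 2, 3, 4, 3, 3, 2, 2, 1, 2]
bosses = [3, 4, 2, 1, 2, 2, 4, 1, 3, 2]

rank_customers = average_ranks(customers)
rank_bosses = average_ranks(bosses)
r_s = pearson_r(rank_customers, rank_bosses)

print(rank_customers)
print(rank_bosses)
print(round(r_s, 3))
```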

The Spearman rank correlation coefficient is -0.574. This indicates a negative correlation between the ratings by customers and the ratings by bosses, that is, the ratings tend to be higher by customers when the ratings by bosses are lower. However, the correlation coefficient indicates that this relationship is weak.

14. Since the data sets are fairly symmetric, the mean and the standard deviation are the appropriate measures of central tendency and variability. The results are shown below.

                      Location 1    Location 2
Mean                  108.4         124.4
Standard Deviation    17.3          29.6

Mean daily pedestrian traffic is higher at Location 2, at 124.4, compared with 108.4 at Location 1. The daily pedestrian traffic at Location 2 (standard deviation of 29.6) is also more variable than at Location 1 (standard deviation of 17.3).


15. Because the data are quantitative and appear to be linearly related, the Pearson r is the appropriate measure of association. The Pearson r is 0.958, indicating a high correlation between the mark in Business Math and in Statistics.

16. Because the data are quantitative and appear to be linearly related, the Pearson r is the appropriate measure of association. The Pearson r is 0.941, indicating a high correlation between annual advertising expenditure and annual sales.

17. The customer incomes are skewed to the right, with a few incomes much higher than the rest in the data set. Therefore, the median and the interquartile range are the appropriate measures. The median income of the drugstore customers is $44,925. The interquartile range is $15,050 (Excel) or $15,762.50 (by hand).


18. Before we can decide on appropriate numerical measures to compare the data sets, we must examine the shapes of the distributions, by creating histograms, as shown below. Note that these histograms are not appropriate for comparison of the distributions; they are just for deciding on the appropriate measures.

[Figure: histogram, "Kate's Clients' RRSP Holdings"; vertical axis: Number of Clients; horizontal axis: RRSP Holding]

[Figure: histogram, "Wally's Clients' RRSP Holdings"; vertical axis: Number of Clients; horizontal axis: RRSP Holding]


Since both data sets are reasonably symmetric, we can use the mean and the standard deviation to compare them. The results are shown in the table below.

                      Kate's Clients'    Wally's Clients'
                      RRSP Holdings      RRSP Holdings
Mean                  $111,021.86        $101,092.89
Standard Deviation    $32,050.79         $40,192.47

The mean holdings of Kate's clients' RRSPs are higher, at $111,021.86, than the mean holdings of Wally's clients' RRSPs, at $101,092.89. The variability of the RRSP holdings of Kate's clients is less than for Wally's clients (standard deviation of $32,050.79, compared with $40,192.47).

19. First, we must examine the shape of the distribution. A histogram shows a reasonably symmetric data set (see below).

[Figure: histogram, "Contents of a Sample of Soup Cans"; vertical axis: Number of Cans; horizontal axis: Contents in Millilitres]

The mean measurement is 540.4 mL, with a standard deviation of 5.17 mL. The maximum measurement in the sample is 551 mL, and this does not give any cause for concern that the cans contain more than 556 mL. As well, if we apply the Empirical Rule, we note that almost all of the measurements would be within three standard deviations of the mean, that is, between 524.9 mL and 555.9 mL, which again does not give any cause for concern that the cans contain more than 556 mL. A measurement of 530 mL is about two standard deviations below the mean. The Empirical Rule says that about 95% of the data will lie within two standard deviations of the mean, with the remaining 5% split between the two tails of the distribution. If this can be applied to the population data, then about 2.5% of the cans would contain less than 530 mL.
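The statement in Exercise 19 that 530 mL is about two standard deviations below the mean can be checked by converting the measurement to a z-score. This is a minimal sketch; the helper name `z_score` is an assumption made for illustration.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd

mean, sd = 540.4, 5.17  # sample mean and standard deviation from Exercise 19

z_530 = z_score(530, mean, sd)

# Three-standard-deviation range ("almost all" of the data under the Empirical Rule)
low = mean - 3 * sd
high = mean + 3 * sd

print(round(z_530, 2), round(low, 1), round(high, 1))
```

A z-score close to -2 places 530 mL at the lower edge of the central 95% band, which is why about 2.5% of cans are expected to fall below it.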


20. Once again, we must check the distribution of the data set to see if the Empirical Rule applies.

[Figure: histogram, "Contents of a Sample of Soup Cans"; vertical axis: Number of Cans; horizontal axis: Contents in Millilitres]

Since the distribution is approximately bell-shaped and symmetric, we can apply the Empirical Rule. The mean measurement is 543.63 mL, with a standard deviation of 6.44 mL. There is one can of soup in the sample that contains more than 556 mL. Applying the Empirical Rule, we note that about 95% of the soup cans would contain between 530.7 mL and 556.5 mL. This leaves about 2.5% of the soup cans with more than 556 mL, and about 2.5% of the soup cans with less than 530 mL.

21. Once again, the answers will depend on the most up-to-date data available when you are answering this question. (As a guide, the retail sector data that match the data in the text for the manufacturing sector are discussed. Because of revisions, the more recent data sets may not exactly match these data. For example, the retail series was significantly changed between the time of the original download in May of 2009, and a subsequent download in November 2009. It is a challenge to come up with a class width for comparison for the two sectors, since quarterly operating profits are much smaller for the retail sector than for the manufacturing sector. The compromise choice of $1 billion is really too narrow for the manufacturing data, and not wide enough for the retail sector data. However, these histograms give a starting point for the analysis.)


[Figure: histogram, "Quarterly Operating Profits, Canadian Retail Sector, I 1988 to III 2008"; vertical axis: Number of Quarters; horizontal axis: Millions of Dollars]

[Figure: histogram, "Quarterly Operating Profits, Canadian Manufacturing, I 1988 to III 2008"; vertical axis: Number of Quarters; horizontal axis: Millions of Dollars]

The histograms show that quarterly operating profits are smaller for the Canadian retail sector than for the manufacturing sector. For over half the period, quarterly operating profits for the retail sector were less than $2 billion. In over 60% of the quarters in the period under study, the quarterly operating profits of the Canadian manufacturing sector were $8 billion or more. The distributions of quarterly operating profits also differ. The distribution for the retail sector profits is skewed to the right, with profits above $3 billion in a few quarters. The distribution for the manufacturing sector profits is skewed to the left, with a few quarters where operating profits were unusually low (below $5 billion).

Median quarterly operating profits for the manufacturing sector were $8.909 billion, much greater than the median quarterly operating profits for the retail sector, at $1.827 billion. Quarterly operating profits for the manufacturing sector were much more variable over the period, with an interquartile range of $4.648 billion, compared with only $1.1 billion for the retail sector. There appears to be some slight positive correlation between the quarterly operating profits of the two sectors, but it is not strong. The scatter diagram below illustrates.

[Figure: scatter diagram, "Quarterly Operating Profits for Two Canadian Sectors, I 1988 to III 2008"; vertical axis: Quarterly Operating Profits of the Manufacturing Sector ($ Millions), $0 to $16,000; horizontal axis: Quarterly Operating Profits of the Retail Sector ($ Millions), $0 to $5,000]

The Pearson r is 0.42, confirming the impression from the scatter diagram of a weak positive relationship. When quarterly operating profits of the retail sector are higher, the quarterly operating profits of the manufacturing sector tend to be higher, but the correlation is weak. (Your comparison, with more up-to-date data, should contain all of the elements shown in this answer.)
