
Chapter 1: Introduction to Statistics

Copyright 2015, 2012, and 2009 Pearson Education, Inc.

Chapter Outline

1.1 An Overview of Statistics


1.2 Data Classification
1.3 Frequency Distributions and Their Graphs
1.4 More Graphs and Displays
1.5 Measures of Central Tendency
1.6 Measures of Variation
1.7 Measures of Position


Objectives
By the end of this chapter, students should be able to:
Define statistics
Distinguish between a population and a sample and
between a parameter and a statistic
Distinguish between descriptive statistics and
inferential statistics
Distinguish between qualitative data and quantitative
data


Objectives
By the end of this chapter, students should be able to:
Construct a frequency distribution table
Construct frequency histograms, frequency polygons,
relative frequency histograms, and ogives
Graph quantitative data and qualitative data


What is Data?
Data
Consist of information coming from observations, counts, measurements, or
responses.

People who eat three daily servings of whole grains
have been shown to reduce their risk of stroke by
37%. (Source: Whole Grains Council)
Seventy percent of the 1500 U.S. spinal cord
injuries to minors result from vehicle accidents, and
68 percent were not wearing a seatbelt. (Source: UPI)

What is Statistics?
Statistics
The science of collecting,
organizing, analyzing, and
interpreting data in order to
make decisions.


Data Sets
Population
The collection of all outcomes,
responses, measurements, or
counts that are of interest.
Sample
A subset, or part, of the population.


Example: Identifying Data Sets


In a recent survey, 1500 adults in the United States were
asked if they thought there was solid evidence for global
warming. Eight hundred fifty-five of the adults said yes.
Identify the population and the sample. Describe the
data set. (Adapted from: Pew Research Center)


Solution: Identifying Data Sets


The population consists of the
responses of all adults in the U.S.
The sample consists of the
responses of the 1500 adults in the
U.S. in the survey.
The sample is a subset of the
responses of all adults in the U.S.
The data set consists of 855 "yes" responses
and 645 "no" responses.

[Figure: Venn diagram in which the responses of adults in the survey (sample) form a subset of the responses of adults in the U.S. (population)]

TRY IT YOURSELF 1
Page 23


Parameter and Statistic


Parameter
A numerical description of a population
characteristic.
Average age of all people in the United States

Statistic
A numerical description of a sample
characteristic.
Average age of people from a sample
of three states

Example: Distinguish Parameter and Statistic


Decide whether the numerical value describes a
population parameter or a sample statistic.
1. A recent survey of a sample of college
career centers reported that the average
starting salary for petroleum
engineering majors is $83,121. (Source:
National Association of Colleges and
Employers)

Solution:
Sample statistic (the average of $83,121 is based
on a subset of the population)

Example: Distinguish Parameter and Statistic


Decide whether the numerical value describes a
population parameter or a sample statistic.
2. The 2182 students who accepted
admission offers to Northwestern
University in 2009 have an average SAT
score of 1442. (Source: Northwestern
University)

Solution:
Population parameter (the SAT score of 1442 is
based on all the students who accepted admission
offers in 2009)

Example: Distinguish Parameter and Statistic


Decide whether the numerical value describes a
population parameter or a sample statistic.
3. In a random check of 400 retail stores, the
Food and Drug Administration found that
34% of the stores were not storing fish at the
proper temperature.
Solution:
Sample statistic because the percent 34% is based
on a subset of the population.


TRY IT YOURSELF 2
Page 24


Branches of Statistics
Descriptive Statistics
Involves organizing, summarizing, and displaying data.
e.g., tables, charts, averages

Inferential Statistics
Involves using sample data to draw conclusions about a population.

Example: Descriptive and Inferential Statistics
Decide which part of the study represents the descriptive branch
of statistics. What conclusions might be drawn from the study
using inferential statistics?

Question:
A large sample of men, aged 48,
was studied for 18 years. For
unmarried men, approximately
70% were alive at age 65. For
married men, 90% were alive at
age 65.

Solution: Descriptive and Inferential Statistics
Descriptive statistics involves statements such as "For unmarried
men, approximately 70% were alive at age 65" and "For married
men, 90% were alive at 65."
A possible inference drawn from the study is that being married is
associated with a longer life for men.


TRY IT YOURSELF 3
Page 25


Types of Data
Qualitative Data
Consists of attributes, labels, or nonnumerical entries.
Major
Place of birth
Eye color

Types of Data
Quantitative data
Numerical measurements or counts.
Age
Weight of a letter
Temperature

Example: Classifying Data by Type


The base prices of several vehicles are shown in the
table. Which data are qualitative data and which are
quantitative data? (Source: Ford Motor Company)


Solution: Classifying Data by Type

Qualitative Data
(Names of vehicle
models are nonnumerical
entries)

Quantitative Data
(Base prices of
vehicle models are
numerical entries)


TRY IT YOURSELF 1
Page 29


Frequency Distribution
Frequency Distribution
A table that shows classes or intervals of data with a count of the
number of entries in each class.
The frequency, f, of a class is the number of data entries in the class.

Class     Frequency, f
1–5       5
6–10      8
11–15     6
16–20     8
21–25     5
26–30     4

In each class, the first number is the lower class limit and the second
is the upper class limit.
Class width: 6 - 1 = 5

Constructing a Frequency Distribution


1. Decide on the number of classes.
Usually between 5 and 20; otherwise, it may be
difficult to detect any patterns.
2. Find the class width.
Determine the range of the data.
Divide the range by the number of classes.
Round up to the next convenient number.


Constructing a Frequency Distribution


3. Find the class limits.
You can use the minimum data entry as the lower
limit of the first class.
Find the remaining lower limits (add the class
width to the lower limit of the preceding class).
Find the upper limit of the first class. Remember
that classes cannot overlap.
Find the remaining upper class limits.


Constructing a Frequency Distribution


4. Make a tally mark for each data entry in the row of
the appropriate class.
5. Count the tally marks to find the total frequency f
for each class.


Example: Constructing a Frequency Distribution
The following sample data set lists the prices (in dollars)
of 30 portable global positioning system (GPS)
navigators. Construct a frequency distribution that has
seven classes.
90 130 400 200 350 70 325 250 150 250
275 270 150 130 59 200 160 450 300 130
220 100 200 400 200 250 95 180 170 150


Solution: Constructing a Frequency Distribution
90 130 400 200 350 70 325 250 150 250
275 270 150 130 59 200 160 450 300 130
220 100 200 400 200 250 95 180 170 150
1. Number of classes = 7 (given)
2. Find the class width
\[ \frac{\text{max} - \text{min}}{\text{number of classes}} = \frac{450 - 59}{7} = \frac{391}{7} \approx 55.86 \]
Round up to 56
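
The class width calculation can be checked with a short Python sketch. The variable names are illustrative, and `math.ceil` is only one way to read "round up to the next convenient number": it would not bump a quotient that is already a whole number, which some textbooks do.

```python
import math

# GPS navigator prices (in dollars) from the example
prices = [
    90, 130, 400, 200, 350, 70, 325, 250, 150, 250,
    275, 270, 150, 130, 59, 200, 160, 450, 300, 130,
    220, 100, 200, 400, 200, 250, 95, 180, 170, 150,
]
num_classes = 7  # given in the example

data_range = max(prices) - min(prices)   # 450 - 59
raw_width = data_range / num_classes     # range divided by number of classes
class_width = math.ceil(raw_width)       # assumption: round up to the next whole number

print(data_range, round(raw_width, 2), class_width)
```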


Solution: Constructing a Frequency Distribution
3. Use 59 (minimum value) as the first lower limit. Add the class width
of 56 to get the lower limit of the next class.
59 + 56 = 115
Find the remaining lower limits.

Lower limit   Upper limit
59
115
171
227
283
339
395
(Class width = 56)

Solution: Constructing a Frequency Distribution
The upper limit of the first class is 114 (one less than the lower limit
of the second class).
Add the class width of 56 to get the upper limit of the next class.
114 + 56 = 170
Find the remaining upper limits.

Lower limit   Upper limit
59            114
115           170
171           226
227           282
283           338
339           394
395           450
(Class width = 56)

Solution: Constructing a Frequency Distribution
4. Make a tally mark for each data entry in the row of
the appropriate class.
5. Count the tally marks to find the total frequency f
for each class.
Class      Tally        Frequency, f
59–114     IIIII        5
115–170    IIIII III    8
171–226    IIIII I      6
227–282    IIIII        5
283–338    II           2
339–394    I            1
395–450    III          3
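
The tallying in steps 4 and 5 can be sketched in Python as follows. This is a minimal illustration, not the textbook's procedure verbatim: the class limits are rebuilt from the minimum value and the class width of 56, and the variable names are assumptions.

```python
# GPS navigator prices (in dollars) from the example
prices = [
    90, 130, 400, 200, 350, 70, 325, 250, 150, 250,
    275, 270, 150, 130, 59, 200, 160, 450, 300, 130,
    220, 100, 200, 400, 200, 250, 95, 180, 170, 150,
]
class_width = 56
num_classes = 7

# Lower limits: start at the minimum and add the class width repeatedly
lower_limits = [min(prices) + i * class_width for i in range(num_classes)]
# Upper limit of each class is one less than the next lower limit
classes = [(lo, lo + class_width - 1) for lo in lower_limits]

# Count the entries that fall inside each class (limits are inclusive)
frequencies = [sum(lo <= p <= hi for p in prices) for lo, hi in classes]

for (lo, hi), f in zip(classes, frequencies):
    print(f"{lo}-{hi}: {f}")
```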


TRY IT YOURSELF 1
Page 62


Determining the Midpoint


Midpoint of a class
\[ \text{Midpoint} = \frac{(\text{Lower class limit}) + (\text{Upper class limit})}{2} \]

Class      Midpoint                   Frequency, f
59–114     (59 + 114) / 2 = 86.5      5
115–170    (115 + 170) / 2 = 142.5    8
171–226    (171 + 226) / 2 = 198.5    6

Class width = 56

Determining the Relative Frequency


Relative frequency of a class
Portion or percentage of the data that falls in a particular class.
\[ \text{Relative frequency} = \frac{\text{class frequency}}{\text{sample size}} = \frac{f}{n} \]

Class      Frequency, f   Relative frequency
59–114     5              5/30 ≈ 0.17
115–170    8              8/30 ≈ 0.27
171–226    6              6/30 = 0.2

Determining the Cumulative Frequency


Cumulative frequency of a class
The sum of the frequencies of that class and all previous classes.

Class      Frequency, f   Cumulative frequency
59–114     5              5
115–170    8              5 + 8 = 13
171–226    6              13 + 6 = 19

Expanded Frequency Distribution

Class      Frequency, f   Midpoint   Relative frequency   Cumulative frequency
59–114     5              86.5       0.17                 5
115–170    8              142.5      0.27                 13
171–226    6              198.5      0.2                  19
227–282    5              254.5      0.17                 24
283–338    2              310.5      0.07                 26
339–394    1              366.5      0.03                 27
395–450    3              422.5      0.1                  30

\( \sum f = 30 \)          \( \sum \frac{f}{n} \approx 1 \)
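
The three extra columns of the expanded table can be reproduced with a short Python sketch. The class limits and frequencies are taken from the GPS example; the variable names are illustrative only.

```python
from itertools import accumulate

# Class limits and frequencies from the GPS navigator example
classes = [(59, 114), (115, 170), (171, 226), (227, 282),
           (283, 338), (339, 394), (395, 450)]
frequencies = [5, 8, 6, 5, 2, 1, 3]

n = sum(frequencies)  # sample size

# Midpoint: average of the lower and upper class limits
midpoints = [(lo + hi) / 2 for lo, hi in classes]
# Relative frequency: f / n, rounded to two decimals as in the table
relative = [round(f / n, 2) for f in frequencies]
# Cumulative frequency: running total of the frequencies
cumulative = list(accumulate(frequencies))

for row in zip(classes, frequencies, midpoints, relative, cumulative):
    print(row)
```

Because the relative frequencies are rounded, their sum is only approximately 1.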


TRY IT YOURSELF 2
Page 63


Graphs of Frequency Distributions

Frequency Histogram
A bar graph that represents the frequency distribution.
The horizontal scale is quantitative and measures the
data values.
The vertical scale measures the frequencies of the
classes.
Consecutive bars must touch.
[Figure: sample histogram; horizontal axis: data values, vertical axis: frequency]

Class Boundaries
Class boundaries
The numbers that separate classes without forming
gaps between them.
The distance from the upper limit of the first class to the
lower limit of the second class is 115 - 114 = 1.
Half this distance is 0.5.
First class lower boundary = 59 - 0.5 = 58.5
First class upper boundary = 114 + 0.5 = 114.5

Class      Class boundaries   Frequency, f
59–114     58.5–114.5         5
115–170                       8
171–226                       6

Class Boundaries
Class      Class boundaries   Frequency, f
59–114     58.5–114.5         5
115–170    114.5–170.5        8
171–226    170.5–226.5        6
227–282    226.5–282.5        5
283–338    282.5–338.5        2
339–394    338.5–394.5        1
395–450    394.5–450.5        3
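
The boundary rule (subtract half the gap from each lower limit, add it to each upper limit) can be sketched in Python. The sketch assumes, as in this example, that the gap between consecutive classes is the same throughout; the names are illustrative.

```python
# Class limits from the GPS navigator example
classes = [(59, 114), (115, 170), (171, 226), (227, 282),
           (283, 338), (339, 394), (395, 450)]

# Gap between the upper limit of one class and the lower limit of the next
gap = classes[1][0] - classes[0][1]   # 115 - 114
half_gap = gap / 2

# Shift each limit outward by half the gap so adjacent classes meet
boundaries = [(lo - half_gap, hi + half_gap) for lo, hi in classes]

for (lo, hi), (lb, ub) in zip(classes, boundaries):
    print(f"{lo}-{hi}: {lb}-{ub}")
```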


Example: Frequency Histogram


Construct a frequency histogram for the global
positioning system (GPS) navigators.
Class      Class boundaries   Midpoint   Frequency, f
59–114     58.5–114.5         86.5       5
115–170    114.5–170.5        142.5      8
171–226    170.5–226.5        198.5      6
227–282    226.5–282.5        254.5      5
283–338    282.5–338.5        310.5      2
339–394    338.5–394.5        366.5      1
395–450    394.5–450.5        422.5      3

Solution: Frequency Histogram (using midpoints)

Solution: Frequency Histogram (using class boundaries)

You can see that more than half of the GPS navigators are
priced below $226.50.

TRY IT YOURSELF 3
Page 65


Graphs of Frequency Distributions

Frequency Polygon
A line graph that emphasizes the continuous change in
frequencies.
[Figure: sample frequency polygon; horizontal axis: data values, vertical axis: frequency]

Example: Frequency Polygon


Construct a frequency polygon for the GPS navigators
frequency distribution.
Class      Midpoint   Frequency, f
59–114     86.5       5
115–170    142.5      8
171–226    198.5      6
227–282    254.5      5
283–338    310.5      2
339–394    366.5      1
395–450    422.5      3

Solution: Frequency Polygon


The graph should begin and end on the horizontal axis, so extend the
left side to one class width before the first class midpoint and extend
the right side to one class width after the last class midpoint.

You can see that the frequency of GPS navigators increases
up to $142.50 and then decreases.

Graphs of Frequency Distributions

Relative Frequency Histogram
Has the same shape and the same horizontal scale as
the corresponding frequency histogram.
The vertical scale measures the relative frequencies,
not frequencies.
[Figure: sample relative frequency histogram; horizontal axis: data values, vertical axis: relative frequency]

TRY IT YOURSELF 4
Page 65


Example: Relative Frequency Histogram


Construct a relative frequency histogram for the GPS
navigators frequency distribution.
Class      Class boundaries   Midpoint   Frequency, f   Relative frequency
59–114     58.5–114.5         86.5       5              0.17
115–170    114.5–170.5        142.5      8              0.27
171–226    170.5–226.5        198.5      6              0.2
227–282    226.5–282.5        254.5      5              0.17
283–338    282.5–338.5        310.5      2              0.07
339–394    338.5–394.5        366.5      1              0.03
395–450    394.5–450.5        422.5      3              0.1

Solution: Relative Frequency Histogram

[Figure: relative frequency histogram of the GPS navigator prices]

From this graph you can see that 27% of GPS navigators are
priced between $114.50 and $170.50.

Graphs of Frequency Distributions

Cumulative Frequency Graph or Ogive
A line graph that displays the cumulative frequency of
each class at its upper class boundary.
The upper boundaries are marked on the horizontal
axis.
The cumulative frequencies are marked on the vertical
axis.
[Figure: sample ogive; horizontal axis: data values, vertical axis: cumulative frequency]

TRY IT YOURSELF 5
Page 66


Constructing an Ogive
1. Construct a frequency distribution that includes
cumulative frequencies as one of the columns.
2. Specify the horizontal and vertical scales.
The horizontal scale consists of the upper class
boundaries.
The vertical scale measures cumulative
frequencies.
3. Plot points that represent the upper class boundaries
and their corresponding cumulative frequencies.

Constructing an Ogive
4. Connect the points in order from left to right.
5. The graph should start at the lower boundary of the
first class (cumulative frequency is zero) and should
end at the upper boundary of the last class
(cumulative frequency is equal to the sample size).
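
The five steps above can be sketched in Python as a list of points to plot. The sketch uses the class boundaries and frequencies of the GPS navigator distribution from earlier in the chapter; it only computes the points and leaves the actual drawing to any plotting tool. The names are illustrative.

```python
from itertools import accumulate

# Class boundaries and frequencies of the GPS navigator distribution
boundaries = [(58.5, 114.5), (114.5, 170.5), (170.5, 226.5), (226.5, 282.5),
              (282.5, 338.5), (338.5, 394.5), (394.5, 450.5)]
frequencies = [5, 8, 6, 5, 2, 1, 3]

# Step 1: cumulative frequencies
cumulative = list(accumulate(frequencies))

# Step 5: start at the lower boundary of the first class with frequency 0
points = [(boundaries[0][0], 0)]
# Step 3: one point per class at (upper boundary, cumulative frequency)
points += [(upper, c) for (_, upper), c in zip(boundaries, cumulative)]

# Step 4: the points are already ordered left to right for connecting
for x, y in points:
    print(x, y)
```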


Example: Ogive
Construct an ogive for the GPS navigators frequency
distribution.
Class      Class boundaries   Midpoint   Frequency, f   Cumulative frequency
59–114     58.5–114.5         86.5       5              5
115–170    114.5–170.5        142.5      8              13
171–226    170.5–226.5        198.5      6              19
227–282    226.5–282.5        254.5      5              24
283–338    282.5–338.5        310.5      2              26
339–394    338.5–394.5        366.5      1              27
395–450    394.5–450.5        422.5      3              30

Solution: Ogive

[Figure: ogive of the GPS navigator prices]

From the ogive, you can see that about 25 GPS navigators cost
$300 or less. The greatest increase occurs between $114.50 and
$170.50.

TRY IT YOURSELF 6
Page 67


Graphing Quantitative Data Sets


Stem-and-leaf plot
Each number is separated into a stem and a leaf.
Similar to a histogram.
Still contains original data values.
Data: 21, 25, 25, 26, 27, 28,
30, 36, 36, 45

2 | 1 5 5 6 7 8
3 | 0 6 6
4 | 5

For example, 26 has stem 2 and leaf 6.

Example: Constructing a Stem-and-Leaf Plot
The following are the numbers of text messages sent last
month by the cellular phone users on one floor of a college
dormitory. Display the data in a stem-and-leaf plot.
155 159 144 129 105 145 126 116 130 114 122 112 112 142 126
156 118 108 122 121 109 140 126 119 113 117 118 109 109 119
139 139 122 78 133 126 123 145 121 134 124 119 132 133 124
129 112 126 148 147


Solution: Constructing a Stem-and-Leaf Plot
155 159 144 129 105 145 126 116 130 114 122 112 112 142 126
156 118 108 122 121 109 140 126 119 113 117 118 109 109 119
139 139 122 78 133 126 123 145 121 134 124 119 132 133 124
129 112 126 148 147

The data entries go from a low of 78 to a high of 159.


Use the rightmost digit as the leaf.
For instance,
78 = 7 | 8 and 159 = 15 | 9
List the stems, 7 to 15, to the left of a vertical line.
For each data entry, list a leaf to the right of its stem.
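
The construction above can be sketched in Python. The rightmost digit is the leaf and the remaining digits are the stem, which `divmod(value, 10)` gives directly. Sorting the leaves within each row and the wording of the key line are choices made for this sketch, not something the slides prescribe.

```python
from collections import defaultdict

# Numbers of text messages from the example
messages = [
    155, 159, 144, 129, 105, 145, 126, 116, 130, 114, 122, 112, 112, 142, 126,
    156, 118, 108, 122, 121, 109, 140, 126, 119, 113, 117, 118, 109, 109, 119,
    139, 139, 122, 78, 133, 126, 123, 145, 121, 134, 124, 119, 132, 133, 124,
    129, 112, 126, 148, 147,
]

# Group leaves (last digit) under their stems (remaining digits)
leaves = defaultdict(list)
for value in sorted(messages):
    stem, leaf = divmod(value, 10)
    leaves[stem].append(leaf)

# List every stem from the smallest to the largest, even if it has no leaves
for stem in range(min(leaves), max(leaves) + 1):
    row = " ".join(str(leaf) for leaf in leaves.get(stem, []))
    print(f"{stem:>2} | {row}")

print("Key: 15 | 5 = 155")
```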

Solution: Constructing a Stem-and-Leaf Plot
Include a key to identify
the values of the data.

From the display, you can conclude that more than 50% of the
cellular phone users sent between 110 and 130 text messages.

TRY IT YOURSELF 1
Page 76


Graphing Quantitative Data Sets


Dot plot
Each data entry is plotted, using a point, above a
horizontal axis.
Data: 21, 25, 25, 26, 27, 28, 30, 36, 36, 45
[Figure: dot plot of the data above a horizontal axis running from 20 to 45; for example, the entry 26 is a point above 26]

Example: Constructing a Dot Plot


Use a dot plot to organize the text messaging data.
155 159 144 129 105 145 126 116 130 114 122 112 112 142 126
156 118 108 122 121 109 140 126 119 113 117 118 109 109 119
139 139 122 78 133 126 123 145 121 134 124 119 132 133 124
129 112 126 148 147
So that each data entry is included in the dot plot, the horizontal axis should include
numbers between 70 and 160.
To represent a data entry, plot a point above the entry's position on the axis.
If an entry is repeated, plot another point above the previous point.


Solution: Constructing a Dot Plot


155 159 144 129 105 145 126 116 130 114 122 112 112 142 126
156 118 108 122 121 109 140 126 119 113 117 118 109 109 119
139 139 122 78 133 126 123 145 121 134 124 119 132 133 124
129 112 126 148 147

From the dot plot, you can see that most values cluster between 105 and 148 and the
value that occurs the most is 126. You can also see that 78 is an unusual data value.


TRY IT YOURSELF 3
Page 77


Graphing Qualitative Data Sets


Pie Chart
A circle is divided into sectors that represent
categories.
The area of each sector is proportional to the
frequency of each category.


Example: Constructing a Pie Chart


The numbers of earned degrees conferred (in thousands)
in 2007 are shown in the table. Use a pie chart to
organize the data. (Source: U.S. National Center for
Educational Statistics)
Type of degree       Number (thousands)
Associate's          728
Bachelor's           1525
Master's             604
First professional   90
Doctoral             60

Solution: Constructing a Pie Chart

Find the relative frequency (percent) of each category.

Type of degree       Frequency, f   Relative frequency
Associate's          728            728/3007 ≈ 0.24
Bachelor's           1525           1525/3007 ≈ 0.51
Master's             604            604/3007 ≈ 0.20
First professional   90             90/3007 ≈ 0.03
Doctoral             60             60/3007 ≈ 0.02
                     Σf = 3007

Solution: Constructing a Pie Chart


Construct the pie chart using the central angle that
corresponds to each category.
To find the central angle, multiply 360° by the
category's relative frequency.
For example, the central angle for associate's degrees is
360°(0.24) ≈ 86°

Solution: Constructing a Pie Chart


Type of degree       Frequency, f   Relative frequency   Central angle
Associate's          728            0.24                 360°(0.24) ≈ 86°
Bachelor's           1525           0.51                 360°(0.51) ≈ 184°
Master's             604            0.20                 360°(0.20) = 72°
First professional   90             0.03                 360°(0.03) ≈ 11°
Doctoral             60             0.02                 360°(0.02) ≈ 7°
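
The relative frequencies and central angles can be reproduced with a short Python sketch. It follows the slides in computing each angle from the relative frequency already rounded to two decimals; computing from the unrounded ratio would give slightly different angles. The dictionary keys and variable names are illustrative.

```python
# Earned degrees conferred in 2007 (in thousands), from the example
degrees = {
    "Associate's": 728,
    "Bachelor's": 1525,
    "Master's": 604,
    "First professional": 90,
    "Doctoral": 60,
}

total = sum(degrees.values())

# Relative frequency of each category, rounded to two decimals
relative = {name: round(f / total, 2) for name, f in degrees.items()}
# Central angle: 360 degrees times the (rounded) relative frequency
angles = {name: round(360 * r) for name, r in relative.items()}

for name in degrees:
    print(f"{name}: {relative[name]:.2f}, {angles[name]} degrees")
```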


Solution: Constructing a Pie Chart


Type of degree       Relative frequency   Central angle
Associate's          0.24                 86°
Bachelor's           0.51                 184°
Master's             0.20                 72°
First professional   0.03                 11°
Doctoral             0.02                 7°

From the pie chart, you can see that over half of the degrees
conferred in 2007 were bachelor's degrees.

TRY IT YOURSELF 4
Page 78


Graphing Qualitative Data Sets

Pareto Chart
A vertical bar graph in which the height of each bar
represents frequency or relative frequency.
The bars are positioned in order of decreasing height,
with the tallest bar positioned at the left.
[Figure: sample Pareto chart; horizontal axis: categories, vertical axis: frequency]

Graphing Paired Data Sets


Paired Data Sets
Each entry in one data set corresponds to one entry in
a second data set.
Graph using a scatter plot.
The ordered pairs are graphed as points in a coordinate plane.
Used to show the relationship between two quantitative variables.
[Figure: sample scatter plot; horizontal axis: x, vertical axis: y]

TRY IT YOURSELF 5 & 6


Page 79 & 80


WEEK 2


Objectives
By the end of this chapter, students should be able to:
Find the mean, median, and mode of a population and
of a sample (measures of central tendency)
Find the range of a data set, and the variance and standard
deviation of a population and of a sample
Find the first, second, and third quartiles of a data set and
the interquartile range of a data set, and represent a data
set graphically using a box-and-whisker plot


Measures of Central Tendency


Measure of central tendency
A value that represents a typical, or central, entry of a
data set.
Most common measures of central tendency:
Mean
Median
Mode


Measure of Central Tendency: Mean


Mean (average)
The sum of all the data entries divided by the number
of entries.
Sigma notation: \( \sum x \) = add all of the data entries (x)
in the data set.
Population mean: \( \mu = \frac{\sum x}{N} \)
Sample mean: \( \bar{x} = \frac{\sum x}{n} \)

Example: Finding a Sample Mean


The prices (in dollars) for a sample of roundtrip flights
from Chicago, Illinois to Cancun, Mexico are listed.
What is the mean price of the flights?
872 432 397 427 388 782 397


Solution: Finding a Sample Mean


872 432 397 427 388 782 397

The sum of the flight prices is
\[ \sum x = 872 + 432 + 397 + 427 + 388 + 782 + 397 = 3695 \]
To find the mean price, divide the sum of the prices by the number of prices in the
sample:
\[ \bar{x} = \frac{\sum x}{n} = \frac{3695}{7} \approx 527.9 \]
The mean price of the flights is about $527.90.
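
As a quick check of the arithmetic, the same sample mean can be computed in Python with the standard-library `statistics.mean` function. The variable names are illustrative.

```python
from statistics import mean

# Roundtrip flight prices (in dollars) from the example
prices = [872, 432, 397, 427, 388, 782, 397]

total = sum(prices)          # sum of the data entries
sample_mean = mean(prices)   # sum divided by the number of entries

print(total, round(sample_mean, 1))
```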

TRY IT YOURSELF 1
Page 87


Measure of Central Tendency: Median


Median
The value that lies in the middle of the data when the data
set is ordered.
Measures the center of an ordered data set by dividing it
into two equal parts.
If the data set has an
odd number of entries: median is the middle data
entry.
even number of entries: median is the mean of the
two middle data entries.

Example: Finding the Median


The prices (in dollars) for a sample of roundtrip flights
from Chicago, Illinois to Cancun, Mexico are listed.
Find the median of the flight prices.
872 432 397 427 388 782 397


Solution: Finding the Median


872 432 397 427 388 782 397

First order the data.


388 397 397 427 432 782 872
Because there are seven entries (an odd number), the median
is the middle, or fourth, data entry.
The median price of the flights is $427.


Example: Finding the Median


The flight priced at $432 is no longer available. What is
the median price of the remaining flights?
872 397 427 388 782 397


Solution: Finding the Median


872 397 427 388 782 397

First order the data.


388 397 397 427 782 872
Because there are six entries (an even number), the median is
the mean of the two middle entries.
\[ \text{Median} = \frac{397 + 427}{2} = 412 \]
The median price of the flights is $412.
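
Both median examples can be checked in Python with the standard-library `statistics.median` function, which sorts the data and averages the two middle entries when the count is even. The variable names are illustrative.

```python
from statistics import median

# All seven flight prices, then the list without the $432 flight
all_flights = [872, 432, 397, 427, 388, 782, 397]
remaining = [price for price in all_flights if price != 432]

median_all = median(all_flights)       # odd count: the middle entry
median_remaining = median(remaining)   # even count: mean of the two middle entries

print(median_all, median_remaining)
```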


TRY IT YOURSELF 2 & 3


Page 88


Measure of Central Tendency: Mode


Mode
The data entry that occurs with the greatest frequency.
If no entry is repeated the data set has no mode.
If two entries occur with the same greatest frequency,
each entry is a mode (bimodal).


Example: Finding the Mode


The prices (in dollars) for a sample of roundtrip flights
from Chicago, Illinois to Cancun, Mexico are listed.
Find the mode of the flight prices.
872 432 397 427 388 782 397


Solution: Finding the Mode


872 432 397 427 388 782 397

Ordering the data helps to find the mode.


388 397 397 427 432 782 872
The entry of 397 occurs twice, whereas the other
data entries occur only once.
The mode of the flight prices is $397.


Example: Finding the Mode


At a political debate a sample of audience members was
asked to name the political party to which they belong.
Their responses are shown in the table. What is the
mode of the responses?
Political Party    Frequency, f
Democrat           34
Republican         56
Other              21
Did not respond

Solution: Finding the Mode


Political Party    Frequency, f
Democrat           34
Republican         56
Other              21
Did not respond

The mode is Republican (the response occurring with the greatest frequency). In this
sample there were more Republicans than people of any other single affiliation.


TRY IT YOURSELF 4
Page 89


Example: Comparing the Mean, Median, and Mode
Find the mean, median, and mode of the sample ages of
a class shown. Which measure of central tendency best
describes a typical entry of this data set? Are there any
outliers?
Ages in a class
20 20 20 20 20 20 21
21 21 21 22 22 22 23
23 23 23 24 24 65

Solution: Comparing the Mean, Median, and Mode
Ages in a class
20 20 20 20 20 20 21
21 21 21 22 22 22 23
23 23 23 24 24 65

Mean: \( \bar{x} = \frac{\sum x}{n} = \frac{20 + 20 + \cdots + 24 + 65}{20} \approx 23.8 \) years
Median: \( \frac{21 + 22}{2} = 21.5 \) years
Mode: 20 years (the entry occurring with the greatest frequency)

Solution: Comparing the Mean, Median, and Mode
Mean ≈ 23.8 years
Median = 21.5 years
Mode = 20 years

The mean takes every entry into account, but is
influenced by the outlier of 65.
The median also takes every entry into account, and
it is not affected by the outlier.
In this case the mode exists, but it doesn't appear to
represent a typical entry.
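
The effect of the outlier can be seen by computing the three measures with and without the entry 65. This is a small Python sketch using the standard-library `statistics` module; dropping the outlier is done here only for comparison and is not part of the slide's solution.

```python
from statistics import mean, median, mode

# Ages in the class, from the example
ages = [20] * 6 + [21] * 4 + [22] * 3 + [23] * 4 + [24] * 2 + [65]

mean_age = mean(ages)
median_age = median(ages)
mode_age = mode(ages)
print(mean_age, median_age, mode_age)

# Same measures with the outlier removed, for comparison only
without_outlier = [age for age in ages if age != 65]
print(round(mean(without_outlier), 1), median(without_outlier), mode(without_outlier))
```

The mean drops by about two years once 65 is removed, while the median moves only from 21.5 to 21 and the mode stays at 20.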


Solution: Comparing the Mean, Median, and Mode
Sometimes a graphical comparison can help you decide which measure of central
tendency best represents a data set.

In this case, it appears that the median best describes the data set.


TRY IT YOURSELF 6
Page 90


Mean of Grouped Data


Mean of a Frequency Distribution
Approximated by
\[ \bar{x} = \frac{\sum (x \cdot f)}{n}, \qquad n = \sum f \]
where x and f are the midpoint and frequency of a
class, respectively

Finding the Mean of a Frequency Distribution

In Words                                  In Symbols
1. Find the midpoint of each class.       \( x = \frac{(\text{lower limit}) + (\text{upper limit})}{2} \)
2. Find the sum of the products of        \( \sum (x \cdot f) \)
   the midpoints and the frequencies.
3. Find the sum of the frequencies.       \( n = \sum f \)
4. Find the mean of the frequency         \( \bar{x} = \frac{\sum (x \cdot f)}{n} \)
   distribution.

Example: Find the Mean of a Frequency Distribution
Use the frequency distribution to approximate the mean
number of minutes that a sample of Internet subscribers
spent online during their most recent session.

Class    Midpoint   Frequency, f
7–18     12.5       6
19–30    24.5       10
31–42    36.5       13
43–54    48.5       8
55–66    60.5       5
67–78    72.5       6
79–90    84.5       2

Solution: Find the Mean of a Frequency Distribution

Class    Midpoint, x   Frequency, f   (x·f)
7–18     12.5          6              12.5 · 6 = 75.0
19–30    24.5          10             24.5 · 10 = 245.0
31–42    36.5          13             36.5 · 13 = 474.5
43–54    48.5          8              48.5 · 8 = 388.0
55–66    60.5          5              60.5 · 5 = 302.5
67–78    72.5          6              72.5 · 6 = 435.0
79–90    84.5          2              84.5 · 2 = 169.0
                       n = 50         Σ(x·f) = 2089.0

\[ \bar{x} = \frac{\sum (x \cdot f)}{n} = \frac{2089}{50} \approx 41.8 \text{ minutes} \]
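
The four steps for the mean of a frequency distribution can be sketched in Python. The class limits and frequencies come from the Internet subscriber example; the variable names are illustrative.

```python
# Class limits and frequencies from the Internet subscriber example
classes = [(7, 18), (19, 30), (31, 42), (43, 54), (55, 66), (67, 78), (79, 90)]
frequencies = [6, 10, 13, 8, 5, 6, 2]

# Step 1: midpoint of each class
midpoints = [(lo + hi) / 2 for lo, hi in classes]
# Step 2: sum of the products of midpoints and frequencies
weighted_sum = sum(x * f for x, f in zip(midpoints, frequencies))
# Step 3: sum of the frequencies
n = sum(frequencies)
# Step 4: approximate mean of the frequency distribution
grouped_mean = weighted_sum / n

print(n, weighted_sum, round(grouped_mean, 1))
```

The result is an approximation because every entry in a class is treated as if it sat at the class midpoint.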

TRY IT YOURSELF 8
Page 92


The Shape of Distributions


Symmetric Distribution
A vertical line can be drawn through the middle of a graph of the
distribution and the resulting halves are approximately mirror images.


The Shape of Distributions


Skewed Left Distribution (negatively skewed)
The tail of the graph elongates more to the left.
The mean is to the left of the median.


The Shape of Distributions


Skewed Right Distribution (positively skewed)
The tail of the graph elongates more to the right.
The mean is to the right of the median.


Range
Range
The difference between the maximum and minimum
data entries in the set.
The data must be quantitative.
Range = (Max. data entry) − (Min. data entry)

Example: Finding the Range


A corporation hired 10 graduates. The starting salaries
for each graduate are shown. Find the range of the
starting salaries.
Starting salaries (1000s of dollars)
41 38 39 45 47 41 44 41 37 42


Solution: Finding the Range


Ordering the data helps to find the least and greatest
salaries.
37 38 39 41 41 41 42 44 45 47
(minimum = 37, maximum = 47)

Range = (Max. salary) − (Min. salary)
      = 47 − 37 = 10
The range of starting salaries is 10, or $10,000.
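As a quick check of the range calculation, here is a minimal Python sketch. The helper name `data_range` is an assumption made for illustration; the data are the starting salaries from the example.

```python
# Starting salaries in thousands of dollars (from the example).
salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]


def data_range(entries):
    """Range = (maximum data entry) - (minimum data entry)."""
    return max(entries) - min(entries)


salary_range = data_range(salaries)
print(salary_range)  # 10, i.e. $10,000
```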

TRY IT YOURSELF 1
Page 102


Deviation, Variance, and Standard Deviation

Deviation
The difference between the data entry, x, and the
mean of the data set.
Population data set:
Deviation of x = $x - \mu$
Sample data set:
Deviation of x = $x - \bar{x}$

Example: Finding the Deviation

A corporation hired 10 graduates. The starting salaries
for each graduate are shown. Find the deviation of each
starting salary.
Starting salaries (1000s of dollars)
41 38 39 45 47 41 44 41 37 42
Solution:
First determine the mean starting salary.

$$\mu = \frac{\sum x}{N} = \frac{415}{10} = 41.5$$

Solution: Finding the Deviation

Determine the deviation for each data entry.

Salary ($1000s), x   Deviation: $x - \mu$
41                   41 − 41.5 = −0.5
38                   38 − 41.5 = −3.5
39                   39 − 41.5 = −2.5
45                   45 − 41.5 = 3.5
47                   47 − 41.5 = 5.5
41                   41 − 41.5 = −0.5
44                   44 − 41.5 = 2.5
41                   41 − 41.5 = −0.5
37                   37 − 41.5 = −4.5
42                   42 − 41.5 = 0.5
Σx = 415             Σ(x − μ) = 0
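The deviation table can be reproduced with a short Python sketch. This is an illustration only; the variable names are assumptions. It also shows why deviations alone cannot measure spread: they always sum to zero.

```python
# Starting salaries in thousands of dollars (from the example).
salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]

# Population mean: mu = (sum of x) / N
mu = sum(salaries) / len(salaries)

# Deviation of each entry: x - mu
deviations = [x - mu for x in salaries]

print(mu)               # 41.5
print(deviations)       # negative below the mean, positive above it
print(sum(deviations))  # 0.0
```

Since the positive and negative deviations cancel, the variance squares each deviation before adding.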

Deviation, Variance, and Standard Deviation

Population Variance

$$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$$

The numerator, $\sum (x - \mu)^2$, is the sum of squares, SSx.

Population Standard Deviation

$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$

Finding the Population Variance & Standard Deviation

In Words, followed by In Symbols:

1. Find the mean of the population data set.
   $\mu = \frac{\sum x}{N}$

2. Find deviation of each entry.
   $x - \mu$

3. Square each deviation.
   $(x - \mu)^2$

4. Add to get the sum of squares.
   $SS_x = \sum (x - \mu)^2$

Finding the Population Variance & Standard Deviation

In Words, followed by In Symbols:

5. Divide by N to get the population variance.
   $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$

6. Find the square root to get the population standard deviation.
   $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$

Example: Finding the Population Standard Deviation

A corporation hired 10 graduates. The starting salaries
for each graduate are shown. Find the population
variance and standard deviation of the starting salaries.
Starting salaries (1000s of dollars)
41 38 39 45 47 41 44 41 37 42
Recall $\mu = 41.5$.

Solution: Finding the Population Standard Deviation

Determine SSx
N = 10

Salary, x   Deviation: $x - \mu$   Squares: $(x - \mu)^2$
41          41 − 41.5 = −0.5       (−0.5)² = 0.25
38          38 − 41.5 = −3.5       (−3.5)² = 12.25
39          39 − 41.5 = −2.5       (−2.5)² = 6.25
45          45 − 41.5 = 3.5        (3.5)² = 12.25
47          47 − 41.5 = 5.5        (5.5)² = 30.25
41          41 − 41.5 = −0.5       (−0.5)² = 0.25
44          44 − 41.5 = 2.5        (2.5)² = 6.25
41          41 − 41.5 = −0.5       (−0.5)² = 0.25
37          37 − 41.5 = −4.5       (−4.5)² = 20.25
42          42 − 41.5 = 0.5        (0.5)² = 0.25
            Σ(x − μ) = 0           SSx = 88.5

Solution: Finding the Population Standard Deviation

Population Variance

$$\sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{88.5}{10} \approx 8.9$$

Population Standard Deviation

$$\sigma = \sqrt{\sigma^2} = \sqrt{8.85} \approx 3.0$$

The population standard deviation is about 3.0, or $3000.
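The six steps for the population variance and standard deviation can be sketched in Python as follows. The function name `population_variance` is hypothetical, chosen for illustration; the data are the starting salaries from the example.

```python
import math

# Starting salaries in thousands of dollars (from the example).
salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]


def population_variance(entries):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(entries)
    mu = sum(entries) / n                           # step 1: mean
    ss_x = sum((x - mu) ** 2 for x in entries)      # steps 2-4: sum of squares
    return ss_x / n                                 # step 5: divide by N


sigma_squared = population_variance(salaries)
sigma = math.sqrt(sigma_squared)                    # step 6: square root

print(round(sigma_squared, 2))  # 8.85
print(round(sigma, 1))          # 3.0
```

Python's standard library offers `statistics.pvariance` and `statistics.pstdev` for the same quantities; the explicit version above is written out only to mirror the steps.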

TRY IT YOURSELF 2
Page 104


Deviation, Variance, and Standard Deviation

Sample Variance

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$

Sample Standard Deviation

$$s = \sqrt{s^2} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

Finding the Sample Variance & Standard Deviation

In Words, followed by In Symbols:

1. Find the mean of the sample data set.
   $\bar{x} = \frac{\sum x}{n}$

2. Find deviation of each entry.
   $x - \bar{x}$

3. Square each deviation.
   $(x - \bar{x})^2$

4. Add to get the sum of squares.
   $SS_x = \sum (x - \bar{x})^2$

Finding the Sample Variance & Standard Deviation

In Words, followed by In Symbols:

5. Divide by n − 1 to get the sample variance.
   $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$

6. Find the square root to get the sample standard deviation.
   $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$

Example: Finding the Sample Standard


Deviation
The starting salaries are for the Chicago branches of a
corporation. The corporation has several other branches,
and you plan to use the starting salaries of the Chicago
branches to estimate the starting salaries for the larger
population. Find the sample standard deviation of the
starting salaries.
Starting salaries (1000s of dollars)
41 38 39 45 47 41 44 41 37 42


Solution: Finding the Sample Standard Deviation

Determine SSx
n = 10

Salary, x   Deviation: $x - \bar{x}$   Squares: $(x - \bar{x})^2$
41          41 − 41.5 = −0.5           (−0.5)² = 0.25
38          38 − 41.5 = −3.5           (−3.5)² = 12.25
39          39 − 41.5 = −2.5           (−2.5)² = 6.25
45          45 − 41.5 = 3.5            (3.5)² = 12.25
47          47 − 41.5 = 5.5            (5.5)² = 30.25
41          41 − 41.5 = −0.5           (−0.5)² = 0.25
44          44 − 41.5 = 2.5            (2.5)² = 6.25
41          41 − 41.5 = −0.5           (−0.5)² = 0.25
37          37 − 41.5 = −4.5           (−4.5)² = 20.25
42          42 − 41.5 = 0.5            (0.5)² = 0.25
            Σ(x − x̄) = 0               SSx = 88.5

Solution: Finding the Sample Standard Deviation

Sample Variance

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{88.5}{10 - 1} \approx 9.8$$

Sample Standard Deviation

$$s = \sqrt{s^2} = \sqrt{\frac{88.5}{9}} \approx 3.1$$

The sample standard deviation is about 3.1, or $3100.
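The sample version differs from the population version only in the divisor, n − 1 instead of N. A minimal Python sketch, with the hypothetical helper name `sample_std_dev`, cross-checked against the standard library's `statistics.stdev`:

```python
import math
import statistics

# Chicago-branch starting salaries in thousands of dollars (from the example).
salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]


def sample_std_dev(entries):
    """Square root of the sum of squares divided by n - 1."""
    n = len(entries)
    x_bar = sum(entries) / n
    ss_x = sum((x - x_bar) ** 2 for x in entries)
    return math.sqrt(ss_x / (n - 1))


s = sample_std_dev(salaries)
print(round(s, 1))                           # 3.1
print(round(statistics.stdev(salaries), 1))  # same value from the library
```

For the same ten salaries, the sample standard deviation (about 3.1) is slightly larger than the population standard deviation (about 3.0) because the sum of squares is divided by 9 rather than 10.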

TRY IT YOURSELF 3
Page 106


Interpreting Standard Deviation


Standard deviation is a measure of the typical amount
an entry deviates from the mean.
The more the entries are spread out, the greater the
standard deviation.


Interpreting Standard Deviation: Empirical Rule (68–95–99.7 Rule)

For data with a (symmetric) bell-shaped distribution, the
standard deviation has the following characteristics:
About 68% of the data lie within one standard
deviation of the mean.
About 95% of the data lie within two standard
deviations of the mean.
About 99.7% of the data lie within three standard
deviations of the mean.

Interpreting Standard Deviation: Empirical Rule (68–95–99.7 Rule)

[Figure: bell-shaped curve with the horizontal axis marked at x̄ − 3s, x̄ − 2s, x̄ − s, x̄, x̄ + s, x̄ + 2s, and x̄ + 3s. The areas between consecutive marks are 2.35%, 13.5%, 34%, 34%, 13.5%, and 2.35%, so 68% of the data lie within 1 standard deviation, 95% within 2 standard deviations, and 99.7% within 3 standard deviations.]
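To see what the Empirical Rule says for a concrete data set, the three intervals can be computed directly. The sketch below is illustrative only: the function name and the mean of 64 and standard deviation of 2.5 are hypothetical values, not taken from the slides, and the rule applies only when the data are roughly bell-shaped.

```python
# Approximate share of bell-shaped data within k standard deviations.
COVERAGE = {1: 68.0, 2: 95.0, 3: 99.7}


def empirical_rule_intervals(mean, sd):
    """Return {k: (mean - k*sd, mean + k*sd)} for k = 1, 2, 3."""
    return {k: (mean - k * sd, mean + k * sd) for k in (1, 2, 3)}


# Hypothetical sample mean and sample standard deviation.
intervals = empirical_rule_intervals(64.0, 2.5)
for k, (low, high) in intervals.items():
    print(f"About {COVERAGE[k]}% of the data lie between {low} and {high}")
```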

Standard Deviation for Grouped Data

Sample standard deviation for a frequency distribution

When a frequency distribution has classes, estimate the sample mean and standard deviation by using the midpoint of each class.

$$s = \sqrt{\frac{\sum (x - \bar{x})^2 f}{n - 1}}$$

where $n = \sum f$ (the number of entries in the data set)

Example: Finding the Standard Deviation for Grouped Data

You collect a random sample of the number of children per household in
a region. Find the sample mean and the sample standard deviation of the
data set.

[Table: Number of Children in 50 Households]

Solution: Finding the Standard Deviation for Grouped Data

First construct a frequency distribution.
Find the mean of the frequency distribution.

x    f     xf
0    10    0(10) = 0
1    19    1(19) = 19
2     7    2(7) = 14
3     7    3(7) = 21
4     2    4(2) = 8
5     1    5(1) = 5
6     4    6(4) = 24
     Σf = 50   Σ(xf) = 91

$$\bar{x} = \frac{\sum xf}{n} = \frac{91}{50} \approx 1.8$$

The sample mean is about 1.8 children.

Solution: Finding the Standard Deviation for Grouped Data

Determine the sum of squares.

x    f     $x - \bar{x}$      $(x - \bar{x})^2$   $(x - \bar{x})^2 f$
0    10    0 − 1.8 = −1.8     (−1.8)² = 3.24      3.24(10) = 32.40
1    19    1 − 1.8 = −0.8     (−0.8)² = 0.64      0.64(19) = 12.16
2     7    2 − 1.8 = 0.2      (0.2)² = 0.04       0.04(7) = 0.28
3     7    3 − 1.8 = 1.2      (1.2)² = 1.44       1.44(7) = 10.08
4     2    4 − 1.8 = 2.2      (2.2)² = 4.84       4.84(2) = 9.68
5     1    5 − 1.8 = 3.2      (3.2)² = 10.24      10.24(1) = 10.24
6     4    6 − 1.8 = 4.2      (4.2)² = 17.64      17.64(4) = 70.56

$$\sum (x - \bar{x})^2 f = 145.40$$

Solution: Finding the Standard Deviation for Grouped Data

Find the sample standard deviation.

$$s = \sqrt{\frac{\sum (x - \bar{x})^2 f}{n - 1}} = \sqrt{\frac{145.40}{50 - 1}} \approx 1.7$$

The standard deviation is about 1.7 children.
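The grouped-data calculation can be sketched in Python as follows. The function name `grouped_sample_stats` is an assumption for illustration; the values and frequencies come from the frequency distribution in the solution. The sketch keeps the unrounded mean (1.82) rather than the rounded 1.8 used in the table, so its sum of squares is about 145.38 instead of 145.40; the rounded results agree.

```python
import math

# Number of children (x) and number of households with that count (f).
values = [0, 1, 2, 3, 4, 5, 6]
frequencies = [10, 19, 7, 7, 2, 1, 4]


def grouped_sample_stats(values, frequencies):
    """Return (sample mean, sample standard deviation) for grouped data."""
    n = sum(frequencies)                                   # n = sum of f
    mean = sum(x * f for x, f in zip(values, frequencies)) / n
    ss = sum((x - mean) ** 2 * f for x, f in zip(values, frequencies))
    return mean, math.sqrt(ss / (n - 1))


mean, s = grouped_sample_stats(values, frequencies)
print(round(mean, 1))  # 1.8 children
print(round(s, 1))     # 1.7 children
```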

TRY IT YOURSELF 8
Page 110


Quartiles
Fractiles are numbers that partition (divide) an ordered
data set into equal parts.
Quartiles approximately divide an ordered data set into
four equal parts.
First quartile, Q1: About one quarter of the data fall
on or below Q1.
Second quartile, Q2: About one half of the data fall on
or below Q2 (median).
Third quartile, Q3: About three quarters of the data
fall on or below Q3.

Example: Finding Quartiles


The numbers of nuclear power plants in the top 15 nuclear
power-producing countries in the world are listed. Find
the first, second, and third quartiles of the data set.
7 18 11 6 59 17 18 54 104 20 31 8 10 15 19
Solution:
Order the data. Q2, the median, divides the data set into two halves.

Ordered data: 6 7 8 10 11 15 17 18 18 19 20 31 54 59 104
Lower half:   6 7 8 10 11 15 17
Q2 = 18 (the eighth of the 15 ordered entries)
Upper half:   18 19 20 31 54 59 104

Solution: Finding Quartiles

The first and third quartiles are the medians of the lower and
upper halves of the data set.

Lower half: 6 7 8 10 11 15 17, so Q1 = 10
Q2 = 18
Upper half: 18 19 20 31 54 59 104, so Q3 = 31

About one fourth of the countries have 10 or fewer nuclear power plants,
about one half have 18 or fewer, and about three fourths
have 31 or fewer.
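The median-of-halves procedure can be sketched in Python. The function name `quartiles` is hypothetical. Following the example, the median itself is left out of both halves when the number of entries is odd; other quartile conventions exist and can give slightly different values.

```python
from statistics import median

# Nuclear power plants in the top 15 producing countries (from the example).
plants = [7, 18, 11, 6, 59, 17, 18, 54, 104, 20, 31, 8, 10, 15, 19]


def quartiles(entries):
    """Return (Q1, Q2, Q3) using the median of the lower and upper halves."""
    ordered = sorted(entries)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For an odd count, skip the middle entry so it belongs to neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)


q1, q2, q3 = quartiles(plants)
print(q1, q2, q3)  # 10 18 31
```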

TRY IT YOURSELF 1
Page 122


Interquartile Range
Interquartile Range (IQR)
The difference between the third and first quartiles.
IQR = Q3 − Q1

Example: Finding the Interquartile Range


Find the interquartile range of the data set.
Recall Q1 = 10, Q2 = 18, and Q3 = 31
Solution:
IQR = Q3 − Q1 = 31 − 10 = 21
The numbers of power plants in the middle half of
the data set vary by at most 21.

TRY IT YOURSELF 3
Page 124


Box-and-Whisker Plot
Box-and-whisker plot
Exploratory data analysis tool.
Highlights important features of a data set.
Requires (five-number summary):
Minimum entry
First quartile Q1
Median Q2
Third quartile Q3
Maximum entry

Drawing a Box-and-Whisker Plot


1. Find the five-number summary of the data set.
2. Construct a horizontal scale that spans the range of
the data.
3. Plot the five numbers above the horizontal scale.
4. Draw a box above the horizontal scale from Q1 to Q3
and draw a vertical line in the box at Q2.
5. Draw whiskers from the box to the minimum and
maximum entries.

[Figure: box-and-whisker plot with labels, from left to right, for the minimum entry, the left whisker, Q1, the median Q2 inside the box, Q3, the right whisker, and the maximum entry.]

Example: Drawing a Box-and-Whisker Plot

Draw a box-and-whisker plot that represents the data set of 15 entries.
Min = 6, Q1 = 10, Q2 = 18, Q3 = 31, Max = 104
Solution:

[Figure: box-and-whisker plot of the data set, with the box from 10 to 31, the median line at 18, and whiskers extending to 6 and 104.]

About half the data entries are between 10 and 31. By looking
at the length of the right whisker, you can conclude that 104
is a possible outlier.
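The five numbers needed for the plot, together with the interquartile range, can be computed with a short Python sketch. The function name `five_number_summary` is an assumption for illustration. It relies on `statistics.quantiles` (Python 3.8 or later) with its default "exclusive" method, which for this 15-entry data set gives the same quartiles as the median-of-halves approach; that agreement is not guaranteed for every data set.

```python
from statistics import quantiles

# Nuclear power plants in the top 15 producing countries (from the example).
plants = [7, 18, 11, 6, 59, 17, 18, 54, 104, 20, 31, 8, 10, 15, 19]


def five_number_summary(entries):
    """Return (minimum, Q1, Q2, Q3, maximum)."""
    q1, q2, q3 = quantiles(entries, n=4)  # three cut points for four parts
    return min(entries), q1, q2, q3, max(entries)


summary = five_number_summary(plants)
iqr = summary[3] - summary[1]  # IQR = Q3 - Q1

print(summary)  # minimum, Q1, Q2, Q3, maximum
print(iqr)      # width of the box in the plot
```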

TRY IT YOURSELF 4
Page 125


Percentiles and Other Fractiles

Fractiles     Summary                            Symbols
Quartiles     Divides data into 4 equal parts    Q1, Q2, Q3
Deciles       Divides data into 10 equal parts   D1, D2, D3, ..., D9
Percentiles   Divides data into 100 equal parts  P1, P2, P3, ..., P99

Percentile that corresponds to a specific data entry, x

$$\text{Percentile of } x = \frac{\text{number of data entries less than } x}{\text{total number of data entries}} \cdot 100$$

Then round to the nearest whole number.

Example: Finding a Percentile

For the data set below (tuition costs in thousands of dollars), find the percentile that corresponds to $30,000.
38 33 40 42 34 27 44 38 32 34 45 32 23 46 27 23 30 27 41
22 26 45 31 26 19

Solution:

Of the 25 data entries, 9 are less than 30, and 9/25 · 100 = 36.
The tuition cost of $30,000 corresponds to the 36th percentile.

Interpretation: The tuition cost of $30,000 is greater than 36% of the other tuition
costs.
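The percentile calculation can be checked with a short Python sketch. The function name `percentile_of` is hypothetical; the data are the tuition costs from the example, in thousands of dollars.

```python
# Tuition costs in thousands of dollars (from the example).
tuition = [38, 33, 40, 42, 34, 27, 44, 38, 32, 34, 45, 32, 23, 46, 27, 23,
           30, 27, 41, 22, 26, 45, 31, 26, 19]


def percentile_of(x, entries):
    """Percent of entries strictly less than x, as a whole number."""
    below = sum(1 for entry in entries if entry < x)
    # Note: Python's round() sends exact halves to the nearest even integer.
    return round(100 * below / len(entries))


print(percentile_of(30, tuition))  # 36
```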

TRY IT YOURSELF 6
Page 127
