Chapter
Introduction to
Statistics
Chapter Outline
Objectives
By the end of this chapter, students should be able to:
Define statistics
Distinguish between a population and a sample and
between a parameter and a statistic
Distinguish between descriptive statistics and
inferential statistics
Distinguish between qualitative data and quantitative
data
Construct a frequency distribution table
Construct frequency histograms, frequency polygons,
relative frequency histograms, and ogives
Graph quantitative data and qualitative data
What is Data?
Data
Consist of information coming from observations, counts, measurements, or
responses.
What is Statistics?
Statistics
The science of collecting,
organizing, analyzing, and
interpreting data in order to
make decisions.
Data Sets
Population
The collection of all outcomes,
responses, measurements, or
counts that are of interest.
Sample
A subset, or part, of the population.
Responses of adults in
the U.S. (population)
Responses of
adults in survey
(sample)
TRY IT YOURSELF 1
Page 23
Statistic
A numerical description of a sample
characteristic.
Average age of people from a sample
of three states
Copyright 2015, 2012, and 2009 Pearson Education, Inc.
Solution:
Sample statistic (the average of $83,121 is based
on a subset of the population)
Solution:
Population parameter (the SAT score of 1442 is
based on all the students who accepted admission
offers in 2009)
TRY IT YOURSELF 2
Page 24
Branches of Statistics
Descriptive Statistics
Involves organizing, summarizing, and displaying data.
Inferential Statistics
Involves using sample data to draw conclusions about a population.
Question:
A large sample of men, aged 48,
was studied for 18 years. For
unmarried men, approximately
70% were alive at age 65. For
married men, 90% were alive at
age 65.
TRY IT YOURSELF 3
Page 25
Types of Data
Qualitative Data
Consists of attributes, labels, or nonnumerical entries.
Major
Place of birth
Eye color
Quantitative data
Numerical measurements or counts.
Age
Weight of a letter
Temperature
Qualitative Data
(Names of vehicle
models are nonnumerical
entries)
Quantitative Data
(Base prices of
vehicles models are
numerical entries)
TRY IT YOURSELF 1
Page 29
Frequency Distribution
A table that shows classes or intervals of data with a count of the
number of entries in each class.
The frequency, f, of a class is the number of data entries in the class.

Class     Frequency, f
1–5       5
6–10      8
11–15     6
16–20     8
21–25     5
26–30     4

The first number in each class is the lower class limit and the second
is the upper class limit. Class width: 6 − 1 = 5.
Class width = (maximum entry − minimum entry) / number of classes
            = (450 − 59) / 7 ≈ 55.86
Round up to 56.
Lower class limits (class width = 56): 59, 115, 171, 227, 283, 339, 395
Lower limit   Upper limit
59            114
115           170
171           226
227           282
283           338
339           394
395           450
Class width = 56
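The class-limit steps can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name `class_limits` is hypothetical, and the maximum entry of 450 is assumed from the upper limit of the last class.

```python
import math

def class_limits(minimum, maximum, num_classes):
    """Return the class width and a list of (lower, upper) class limits."""
    # Range divided by the number of classes, rounded up to a whole number.
    width = math.ceil((maximum - minimum) / num_classes)
    # Each lower limit is the previous lower limit plus the class width.
    lowers = [minimum + i * width for i in range(num_classes)]
    # For whole-number data, each upper limit is one less than the next lower limit.
    uppers = [low + width - 1 for low in lowers]
    return width, list(zip(lowers, uppers))

# Minimum 59 comes from the table; maximum 450 is an assumption (see lead-in).
width, limits = class_limits(59, 450, 7)
print(width)
for low, high in limits:
    print(low, high)
```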
Class      Tally        Frequency, f
59–114     IIIII        5
115–170    IIIII III    8
171–226    IIIII I      6
227–282    IIIII        5
283–338    II           2
339–394    I            1
395–450    III          3
TRY IT YOURSELF 1
Page 62
Midpoint of a class = (lower class limit + upper class limit) / 2

Class      Midpoint                   Frequency, f
59–114     (59 + 114) / 2 = 86.5      5
115–170    (115 + 170) / 2 = 142.5    8
171–226    (171 + 226) / 2 = 198.5    6

Class width = 56
Relative frequency of a class = class frequency / sample size = f / n

Class      Frequency, f   Relative frequency
59–114     5              5/30 ≈ 0.17
115–170    8              8/30 ≈ 0.27
171–226    6              6/30 = 0.2
Class      Frequency, f   Cumulative frequency
59–114     5              5
115–170    8              5 + 8 = 13
171–226    6              13 + 6 = 19
Class      Frequency, f   Midpoint   Relative frequency   Cumulative frequency
59–114     5              86.5       0.17                 5
115–170    8              142.5      0.27                 13
171–226    6              198.5      0.2                  19
227–282    5              254.5      0.17                 24
283–338    2              310.5      0.07                 26
339–394    1              366.5      0.03                 27
395–450    3              422.5      0.1                  30
           Σf = 30                   Σ(f/n) = 1
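The extra columns of the expanded frequency distribution can be reproduced with a short Python sketch. The class limits and frequencies are taken from the GPS navigator table; the variable names are illustrative only.

```python
from itertools import accumulate

limits = [(59, 114), (115, 170), (171, 226), (227, 282),
          (283, 338), (339, 394), (395, 450)]
frequencies = [5, 8, 6, 5, 2, 1, 3]

n = sum(frequencies)  # sample size
# Midpoint = (lower limit + upper limit) / 2
midpoints = [(low + high) / 2 for low, high in limits]
# Relative frequency = f / n, rounded to two decimals as on the slides
relative = [round(f / n, 2) for f in frequencies]
# Cumulative frequency = running total of the frequencies
cumulative = list(accumulate(frequencies))

for row in zip(limits, frequencies, midpoints, relative, cumulative):
    print(row)
```

Because each relative frequency is rounded, the rounded column may not add up to exactly 1 even though the exact values do.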
TRY IT YOURSELF 2
Page 63
Frequency Histogram
A bar graph that represents the frequency distribution.
The horizontal scale is quantitative and measures the
data values.
The vertical scale measures the frequencies of the
classes.
Consecutive bars must touch.
[Figure: frequency histogram; horizontal axis: data values, vertical axis: frequency]
Class Boundaries
Class boundaries
The numbers that separate classes without forming
gaps between them.
The distance from the upper limit of the first class to the
lower limit of the second class is 115 − 114 = 1.
Half this distance is 0.5.
First class lower boundary = 59 − 0.5 = 58.5
First class upper boundary = 114 + 0.5 = 114.5
Class Boundaries

Class      Class boundaries   Frequency, f
59–114     58.5–114.5         5
115–170    114.5–170.5        8
171–226    170.5–226.5        6
227–282    226.5–282.5        5
283–338    282.5–338.5        2
339–394    338.5–394.5        1
395–450    394.5–450.5        3
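As a rough sketch, the class boundaries can be computed from the class limits by subtracting and adding half the gap between consecutive classes. The variable names are illustrative and not from the slides.

```python
limits = [(59, 114), (115, 170), (171, 226), (227, 282),
          (283, 338), (339, 394), (395, 450)]

# Gap between the upper limit of the first class and the lower limit of the second.
gap = limits[1][0] - limits[0][1]
half = gap / 2

# Lower boundary = lower limit - half gap; upper boundary = upper limit + half gap.
boundaries = [(low - half, high + half) for low, high in limits]

for (low, high), (lb, ub) in zip(limits, boundaries):
    print(f"{low}-{high}: {lb}-{ub}")
```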
Class      Class boundaries   Midpoint   Frequency, f
59–114     58.5–114.5         86.5       5
115–170    114.5–170.5        142.5      8
171–226    170.5–226.5        198.5      6
227–282    226.5–282.5        254.5      5
283–338    282.5–338.5        310.5      2
339–394    338.5–394.5        366.5      1
395–450    394.5–450.5        422.5      3
You can see that more than half of the GPS navigators are
priced below $226.50.
TRY IT YOURSELF 3
Page 65
Frequency Polygon
A line graph that emphasizes the continuous change in
frequencies.
[Figure: frequency polygon; horizontal axis: data values, vertical axis: frequency]
Class      Midpoint   Frequency, f
59–114     86.5       5
115–170    142.5      8
171–226    198.5      6
227–282    254.5      5
283–338    310.5      2
339–394    366.5      1
395–450    422.5      3
[Figure: relative frequency histogram; horizontal axis: data values, vertical axis: relative frequency]
TRY IT YOURSELF 4
Page 65
Class      Class boundaries   Midpoint   Frequency, f   Relative frequency
59–114     58.5–114.5         86.5       5              0.17
115–170    114.5–170.5        142.5      8              0.27
171–226    170.5–226.5        198.5      6              0.2
227–282    226.5–282.5        254.5      5              0.17
283–338    282.5–338.5        310.5      2              0.07
339–394    338.5–394.5        366.5      1              0.03
395–450    394.5–450.5        422.5      3              0.1
[Figure: relative frequency histogram of GPS navigator prices]
From this graph you can see that 27% of GPS navigators are
priced between $114.50 and $170.50.
[Figure: cumulative frequency graph (ogive); horizontal axis: data values, vertical axis: cumulative frequency]
TRY IT YOURSELF 5
Page 66
Constructing an Ogive
1. Construct a frequency distribution that includes
cumulative frequencies as one of the columns.
2. Specify the horizontal and vertical scales.
The horizontal scale consists of the upper class
boundaries.
The vertical scale measures cumulative
frequencies.
3. Plot points that represent the upper class boundaries
and their corresponding cumulative frequencies.
4. Connect the points in order from left to right.
5. The graph should start at the lower boundary of the
first class (cumulative frequency is zero) and should
end at the upper boundary of the last class
(cumulative frequency is equal to the sample size).
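The steps above can be sketched as a list of points to plot. This is only an illustration using the GPS navigator class boundaries and frequencies; drawing the line itself (for example with a plotting library) is left out.

```python
from itertools import accumulate

boundaries = [(58.5, 114.5), (114.5, 170.5), (170.5, 226.5), (226.5, 282.5),
              (282.5, 338.5), (338.5, 394.5), (394.5, 450.5)]
frequencies = [5, 8, 6, 5, 2, 1, 3]

# Step 5: start at the lower boundary of the first class with cumulative frequency 0.
points = [(boundaries[0][0], 0)]
# Step 3: pair each upper class boundary with its cumulative frequency.
points += [(upper, cum)
           for (_, upper), cum in zip(boundaries, accumulate(frequencies))]

# Step 4: the points are already ordered from left to right.
for x, y in points:
    print(x, y)
```

The last point has a cumulative frequency equal to the sample size, as step 5 requires.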
Example: Ogive
Construct an ogive for the GPS navigators frequency
distribution.

Class      Class boundaries   Midpoint   Frequency, f   Cumulative frequency
59–114     58.5–114.5         86.5       5              5
115–170    114.5–170.5        142.5      8              13
171–226    170.5–226.5        198.5      6              19
227–282    226.5–282.5        254.5      5              24
283–338    282.5–338.5        310.5      2              26
339–394    338.5–394.5        366.5      1              27
395–450    394.5–450.5        422.5      3              30
Solution: Ogive
[Figure: ogive of GPS navigator prices]
From the ogive, you can see that about 25 GPS navigators cost
$300 or less. The greatest increase occurs between $114.50 and
$170.50.
TRY IT YOURSELF 6
Page 67
[Figure: stem-and-leaf plot]
From the display, you can conclude that more than 50% of the
cellular phone users sent between 110 and 130 text messages.
TRY IT YOURSELF 1
Page 76
[Figure: dot plot]
From the dot plot, you can see that most values cluster between 105 and 148 and the
value that occurs the most is 126. You can also see that 78 is an unusual data value.
TRY IT YOURSELF 3
Page 77
Type of degree       Number (thousands)
Associate's          728
Bachelor's           1525
Master's             604
First professional   90
Doctoral             60
Type of degree       Frequency, f   Relative frequency
Associate's          728            728/3007 ≈ 0.24
Bachelor's           1525           1525/3007 ≈ 0.51
Master's             604            604/3007 ≈ 0.20
First professional   90             90/3007 ≈ 0.03
Doctoral             60             60/3007 ≈ 0.02
                     Σf = 3007
Type of degree       Frequency, f   Relative frequency   Central angle
Associate's          728            0.24                 360°(0.24) ≈ 86°
Bachelor's           1525           0.51                 360°(0.51) ≈ 184°
Master's             604            0.20                 360°(0.20) = 72°
First professional   90             0.03                 360°(0.03) ≈ 11°
Doctoral             60             0.02                 360°(0.02) ≈ 7°
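The central angles for the pie chart follow from the relative frequencies. A minimal sketch, using the degree counts from the table and rounding the way the slides do (relative frequency to two decimals, angle to the nearest degree):

```python
degrees_conferred = {
    "Associate's": 728,
    "Bachelor's": 1525,
    "Master's": 604,
    "First professional": 90,
    "Doctoral": 60,
}

total = sum(degrees_conferred.values())

central_angles = {}
for name, count in degrees_conferred.items():
    relative = round(count / total, 2)            # relative frequency f / n
    central_angles[name] = round(360 * relative)  # central angle in degrees

print(total)
print(central_angles)
```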
Type of degree       Relative frequency   Central angle
Associate's          0.24                 86°
Bachelor's           0.51                 184°
Master's             0.20                 72°
First professional   0.03                 11°
Doctoral             0.02                 7°
From the pie chart, you can see that most fatalities in motor
vehicle crashes were those involving the occupants of cars.
TRY IT YOURSELF 4
Page 78
Pareto Chart
A vertical bar graph in which the height of each bar
represents frequency or relative frequency.
The bars are positioned in order of decreasing height,
with the tallest bar positioned at the left.
[Figure: Pareto chart; horizontal axis: categories, vertical axis: frequency]
WEEK 2
Objectives
By the end of this chapter, students should be able to:
Find the mean, median, and mode of a population and
of a sample (measures of central tendency)
Find the range of a data set, and the variance and standard
deviation of a population and of a sample
Find the first, second, and third quartiles of a data set and
the interquartile range of a data set, and represent a data
set graphically using a box-and-whisker plot
Population mean: \( \mu = \dfrac{\sum x}{N} \)
Sample mean: \( \bar{x} = \dfrac{\sum x}{n} \)
To find the mean price, divide the sum of the prices by the number of prices in the
sample:
\( \bar{x} = \dfrac{\sum x}{n} = \dfrac{3695}{7} \approx 527.9 \)
The mean price of the flights is about $527.90.
TRY IT YOURSELF 1
Page 87
Response      Frequency, f
Democrat      34
Republican    56
Other         21
The mode is Republican (the response occurring with the greatest frequency). In this
sample there were more Republicans than people of any other single affiliation.
TRY IT YOURSELF 4
Page 89
20 20 20 20 20 20 21 21 21 21 22 22 22 23 23 23 23 24 24 65
Mean: \( \bar{x} = \dfrac{\sum x}{n} = \dfrac{20 + 20 + \cdots + 24 + 65}{20} \approx 23.8 \) years
Median: \( \dfrac{21 + 22}{2} = 21.5 \) years
Mode: 20 years (the entry that occurs with the greatest frequency)
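These three measures can be checked with Python's standard `statistics` module. A small sketch using the 20 ages listed on the slide:

```python
import statistics

# The 20 ages from the example, including the outlier 65.
ages = [20] * 6 + [21] * 4 + [22] * 3 + [23] * 4 + [24] * 2 + [65]

mean_age = statistics.mean(ages)      # sum of entries / number of entries
median_age = statistics.median(ages)  # average of the two middle entries
mode_age = statistics.mode(ages)      # most frequent entry

print(mean_age, median_age, mode_age)
```

The mean is pulled upward by the single entry of 65, while the median is not.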
In this case, it appears that the median best describes the data set.
TRY IT YOURSELF 6
Page 90
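\( \bar{x} = \dfrac{\sum (x \cdot f)}{n} \), where \( n = \sum f \)
In Symbols
Sum of the products of the midpoints and the frequencies: \( \sum (x \cdot f) \)
Sum of the frequencies: \( n = \sum f \)
Mean of the frequency distribution: \( \bar{x} = \dfrac{\sum (x \cdot f)}{n} \)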
Class     Midpoint   Frequency, f
7–18      12.5       6
19–30     24.5       10
31–42     36.5       13
43–54     48.5       8
55–66     60.5       5
67–78     72.5       6
79–90     84.5       2
Class     Midpoint, x   Frequency, f   (x·f)
7–18      12.5          6              12.5·6 = 75.0
19–30     24.5          10             24.5·10 = 245.0
31–42     36.5          13             36.5·13 = 474.5
43–54     48.5          8              48.5·8 = 388.0
55–66     60.5          5              60.5·5 = 302.5
67–78     72.5          6              72.5·6 = 435.0
79–90     84.5          2              84.5·2 = 169.0
                        n = 50         Σ(x·f) = 2089.0

\( \bar{x} = \dfrac{\sum (x \cdot f)}{n} = \dfrac{2089}{50} \approx 41.8 \) minutes
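The same calculation can be written as a short Python sketch. The midpoints and frequencies are those of the table; the variable names are illustrative.

```python
midpoints = [12.5, 24.5, 36.5, 48.5, 60.5, 72.5, 84.5]
frequencies = [6, 10, 13, 8, 5, 6, 2]

n = sum(frequencies)                                         # n = sum of f
sum_xf = sum(x * f for x, f in zip(midpoints, frequencies))  # sum of x * f
grouped_mean = sum_xf / n                                    # estimated mean

print(n, sum_xf, round(grouped_mean, 1))
```

Because every entry in a class is represented by the class midpoint, the result is an approximation of the mean of the underlying data.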
TRY IT YOURSELF 8
Page 92
Range
Range
The difference between the maximum and minimum
data entries in the set.
The data must be quantitative.
Range = (Max. data entry) − (Min. data entry)
TRY IT YOURSELF 1
Page 102
\( \mu = \dfrac{\sum x}{N} = \dfrac{415}{10} = 41.5 \)
Salary, x   Deviation: x − μ
41          41 − 41.5 = −0.5
38          38 − 41.5 = −3.5
39          39 − 41.5 = −2.5
45          45 − 41.5 = 3.5
47          47 − 41.5 = 5.5
41          41 − 41.5 = −0.5
44          44 − 41.5 = 2.5
41          41 − 41.5 = −0.5
37          37 − 41.5 = −4.5
42          42 − 41.5 = 0.5
Σx = 415    Σ(x − μ) = 0
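Population variance: \( \sigma^2 = \dfrac{\sum (x - \mu)^2}{N} \)
Population standard deviation: \( \sigma = \sqrt{\sigma^2} = \sqrt{\dfrac{\sum (x - \mu)^2}{N}} \)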
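In Symbols
1. Mean: \( \mu = \dfrac{\sum x}{N} \)
2. Deviation: \( x - \mu \)
3. Squared deviation: \( (x - \mu)^2 \)
4. Sum of squares: \( SS_x = \sum (x - \mu)^2 \)
5. Population variance: \( \sigma^2 = \dfrac{\sum (x - \mu)^2}{N} \)
6. Population standard deviation: \( \sigma = \sqrt{\dfrac{\sum (x - \mu)^2}{N}} \)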
Salary, x   Deviation: x − μ      Squares: (x − μ)²
41          41 − 41.5 = −0.5      (−0.5)² = 0.25
38          38 − 41.5 = −3.5      (−3.5)² = 12.25
39          39 − 41.5 = −2.5      (−2.5)² = 6.25
45          45 − 41.5 = 3.5       (3.5)² = 12.25
47          47 − 41.5 = 5.5       (5.5)² = 30.25
41          41 − 41.5 = −0.5      (−0.5)² = 0.25
44          44 − 41.5 = 2.5       (2.5)² = 6.25
41          41 − 41.5 = −0.5      (−0.5)² = 0.25
37          37 − 41.5 = −4.5      (−4.5)² = 20.25
42          42 − 41.5 = 0.5       (0.5)² = 0.25
            Σ(x − μ) = 0          SSx = 88.5
\( \sigma^2 = \dfrac{\sum (x - \mu)^2}{N} = \dfrac{88.5}{10} \approx 8.9 \)
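As a check on the table, the population variance can be computed step by step and compared with `statistics.pvariance` from the Python standard library. The salary values are the ten entries from the table; the variable names are illustrative.

```python
import statistics

salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]

N = len(salaries)
mu = sum(salaries) / N                             # population mean
sum_squares = sum((x - mu) ** 2 for x in salaries) # SSx
pop_variance = sum_squares / N                     # divide by N for a population
pop_std = pop_variance ** 0.5

print(mu, sum_squares, pop_variance, round(pop_std, 1))
print(statistics.pvariance(salaries), statistics.pstdev(salaries))
```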
TRY IT YOURSELF 2
Page 104
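Sample variance: \( s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1} \)
Sample standard deviation: \( s = \sqrt{s^2} = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}} \)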
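In Symbols
1. Mean: \( \bar{x} = \dfrac{\sum x}{n} \)
2. Deviation: \( x - \bar{x} \)
3. Squared deviation: \( (x - \bar{x})^2 \)
4. Sum of squares: \( SS_x = \sum (x - \bar{x})^2 \)
5. Sample variance: \( s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1} \)
6. Sample standard deviation: \( s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}} \)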
Salary, x   Deviation: x − x̄      Squares: (x − x̄)²
41          41 − 41.5 = −0.5      (−0.5)² = 0.25
38          38 − 41.5 = −3.5      (−3.5)² = 12.25
39          39 − 41.5 = −2.5      (−2.5)² = 6.25
45          45 − 41.5 = 3.5       (3.5)² = 12.25
47          47 − 41.5 = 5.5       (5.5)² = 30.25
41          41 − 41.5 = −0.5      (−0.5)² = 0.25
44          44 − 41.5 = 2.5       (2.5)² = 6.25
41          41 − 41.5 = −0.5      (−0.5)² = 0.25
37          37 − 41.5 = −4.5      (−4.5)² = 20.25
42          42 − 41.5 = 0.5       (0.5)² = 0.25
            Σ(x − x̄) = 0          SSx = 88.5
\( s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1} = \dfrac{88.5}{10 - 1} \approx 9.8 \)
\( s = \sqrt{s^2} = \sqrt{\dfrac{88.5}{9}} \approx 3.1 \)
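Treating the same ten salaries as a sample, the only change is the divisor n − 1. A brief sketch with the Python standard library, whose `statistics.variance` and `statistics.stdev` use the sample (n − 1) formulas:

```python
import statistics

salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]

sample_variance = statistics.variance(salaries)  # SSx / (n - 1)
sample_std = statistics.stdev(salaries)          # square root of the sample variance

print(round(sample_variance, 1), round(sample_std, 1))
```

Dividing by n − 1 instead of N makes the sample variance slightly larger than the population variance computed from the same numbers.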
TRY IT YOURSELF 3
Page 106
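[Figure: bell-shaped distribution marked at x̄ − 3s, x̄ − 2s, x̄ − s, x̄, x̄ + s, x̄ + 2s, and x̄ + 3s. About 34% of the data lie in each interval between x̄ and one standard deviation on either side, about 13.5% in each interval between one and two standard deviations, and about 2.35% in each interval between two and three standard deviations.]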
When a frequency distribution has classes, estimate the sample mean and standard deviation by using the midpoint of each class.
\( s = \sqrt{\dfrac{\sum (x - \bar{x})^2 f}{n - 1}} \)
Number of Children in
50 Households
x    f    xf
0    10   0(10) = 0
1    19   1(19) = 19
2    7    2(7) = 14
3    7    3(7) = 21
4    2    4(2) = 8
5    1    5(1) = 5
6    4    6(4) = 24
     Σf = 50   Σ(xf) = 91

\( \bar{x} = \dfrac{\sum xf}{n} = \dfrac{91}{50} \approx 1.8 \)
The sample mean is about 1.8 children.
x    f    x − x̄              (x − x̄)²           (x − x̄)² f
0    10   0 − 1.8 = −1.8     (−1.8)² = 3.24     3.24(10) = 32.40
1    19   1 − 1.8 = −0.8     (−0.8)² = 0.64     0.64(19) = 12.16
2    7    2 − 1.8 = 0.2      (0.2)² = 0.04      0.04(7) = 0.28
3    7    3 − 1.8 = 1.2      (1.2)² = 1.44      1.44(7) = 10.08
4    2    4 − 1.8 = 2.2      (2.2)² = 4.84      4.84(2) = 9.68
5    1    5 − 1.8 = 3.2      (3.2)² = 10.24     10.24(1) = 10.24
6    4    6 − 1.8 = 4.2      (4.2)² = 17.64     17.64(4) = 70.56
                                                Σ(x − x̄)² f = 145.40
\( s = \sqrt{\dfrac{\sum (x - \bar{x})^2 f}{n - 1}} = \sqrt{\dfrac{145.40}{50 - 1}} \approx 1.7 \)
The sample standard deviation is about 1.7 children.
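The frequency-weighted calculation can be sketched in Python. The values and frequencies come from the table of children per household. Note that this sketch uses the unrounded mean (1.82) rather than the rounded 1.8 used in the table, so the sum of squares differs slightly from 145.40 while the rounded result is the same.

```python
import math

values = [0, 1, 2, 3, 4, 5, 6]          # number of children, x
frequencies = [10, 19, 7, 7, 2, 1, 4]   # number of households, f

n = sum(frequencies)
mean = sum(x * f for x, f in zip(values, frequencies)) / n
# Weighted sum of squared deviations: sum of (x - mean)^2 * f
sum_squares = sum((x - mean) ** 2 * f for x, f in zip(values, frequencies))
s = math.sqrt(sum_squares / (n - 1))    # sample standard deviation

print(n, round(mean, 1), round(s, 1))
```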
TRY IT YOURSELF 8
Page 110
Quartiles
Fractiles are numbers that partition (divide) an ordered
data set into equal parts.
Quartiles approximately divide an ordered data set into
four equal parts.
First quartile, Q1: About one quarter of the data fall
on or below Q1.
Second quartile, Q2: About one half of the data fall on
or below Q2 (median).
Third quartile, Q3: About three quarters of the data
fall on or below Q3.
Ordered data: 6 7 8 10 11 15 17 18 18 19 20 31 54 59 104
Q2 (median of all 15 entries): 18
Lower half: 6 7 8 10 11 15 17, so Q1 = 10
Upper half: 18 19 20 31 54 59 104, so Q3 = 31
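The median-of-halves approach shown in this example can be sketched as follows. The helper name `quartiles` is hypothetical, and other conventions for computing quartiles exist that can give slightly different values on other data sets.

```python
import statistics

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median of the lower and upper halves."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # When n is odd, the middle entry (the median) is left out of both halves.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))

data = [6, 7, 8, 10, 11, 15, 17, 18, 18, 19, 20, 31, 54, 59, 104]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3)
```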
TRY IT YOURSELF 1
Page 122
145
Interquartile Range
Interquartile Range (IQR)
The difference between the third and first quartiles.
IQR = Q3 − Q1
TRY IT YOURSELF 3
Page 124
Box-and-Whisker Plot
Box-and-whisker plot
Exploratory data analysis tool.
Highlights important features of a data set.
Requires (five-number summary):
Minimum entry
First quartile Q1
Median Q2
Third quartile Q3
Maximum entry
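A five-number summary can be assembled with the Python standard library. This sketch reuses the 15-entry data set from the quartile example and relies on `statistics.quantiles` with its default method, which agrees with the median-of-halves quartiles for this particular data set; that agreement is not guaranteed for other data.

```python
import statistics

data = [6, 7, 8, 10, 11, 15, 17, 18, 18, 19, 20, 31, 54, 59, 104]

# Quartiles: the three cut points that divide the ordered data into four parts.
q1, q2, q3 = statistics.quantiles(data, n=4)

five_number_summary = (min(data), q1, q2, q3, max(data))
iqr = q3 - q1  # interquartile range, the length of the box

print(five_number_summary, iqr)
```

In a box-and-whisker plot, the box spans Q1 to Q3 with a line at the median, and the whiskers extend to the minimum and maximum entries.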
[Figure: box-and-whisker plot; labels: whisker, Q1, median (Q2), Q3, maximum entry]
TRY IT YOURSELF 4
Page 125
Fractiles     Summary                              Symbols
Quartiles     Divides data into 4 equal parts      Q1, Q2, Q3
Deciles       Divides data into 10 equal parts     D1, D2, D3, ..., D9
Percentiles   Divides data into 100 equal parts    P1, P2, P3, ..., P99
Solution:
TRY IT YOURSELF 6
Page 127