
Measures of Central Tendency

Measure of central tendency


A value that represents a typical, or central, entry of a
data set.
Most common measures of central tendency:
Mean
Median
Mode

Larson/Farber 4th ed. 1


Measure of Central Tendency: Mean

Mean (average)
The sum of all the data entries divided by the number
of entries.
Sigma notation: \( \sum x \) = add all of the data entries (x)
in the data set.
Population mean: \( \mu = \frac{\sum x}{N} \)
Sample mean: \( \bar{x} = \frac{\sum x}{n} \)


Example: Finding a Sample Mean

The prices (in dollars) for a sample of roundtrip flights
from Chicago, Illinois to Cancun, Mexico are listed.
What is the mean price of the flights?
872 432 397 427 388 782 397


Solution: Finding a Sample Mean

872 432 397 427 388 782 397

The sum of the flight prices is
\( \sum x = 872 + 432 + 397 + 427 + 388 + 782 + 397 = 3695 \)

To find the mean price, divide the sum of the prices
by the number of prices in the sample:
\( \bar{x} = \frac{\sum x}{n} = \frac{3695}{7} \approx 527.9 \)

The mean price of the flights is about $527.90.

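The arithmetic in this solution can be checked with a few lines of Python. This is only a minimal sketch; the variable names (`prices`, `total`, `sample_mean`) are illustrative and do not come from the slides.

```python
# Flight prices (in dollars) from the example
prices = [872, 432, 397, 427, 388, 782, 397]

# Sum of the entries (sigma x) and number of entries (n)
total = sum(prices)
n = len(prices)

# Sample mean: sum of the entries divided by the number of entries
sample_mean = total / n

print(total)                  # 3695
print(round(sample_mean, 1))  # 527.9
```
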
Measure of Central Tendency: Median

Median
The value that lies in the middle of the data when the
data set is ordered.
Measures the center of an ordered data set by dividing
it into two equal parts.
If the data set has an
odd number of entries: median is the middle data
entry.
even number of entries: median is the mean of
the two middle data entries.
Example: Finding the Median

The prices (in dollars) for a sample of roundtrip flights
from Chicago, Illinois to Cancun, Mexico are listed.
Find the median of the flight prices.
872 432 397 427 388 782 397


Solution: Finding the Median

872 432 397 427 388 782 397

First order the data.
388 397 397 427 432 782 872

There are seven entries (an odd number), so the median
is the middle, or fourth, data entry.

The median price of the flights is $427.


Example: Finding the Median

The flight priced at $432 is no longer available. What is
the median price of the remaining flights?
872 397 427 388 782 397


Solution: Finding the Median

872 397 427 388 782 397


First order the data.
388 397 397 427 782 872

There are six entries (an even number), so the median is
the mean of the two middle entries.
\( \text{Median} = \frac{397 + 427}{2} = 412 \)

The median price of the flights is $412.
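Both median examples (odd and even number of entries) can be reproduced with Python's standard `statistics.median`, which sorts the data and averages the two middle entries when the count is even. The sketch below assumes the same flight prices; the variable names are illustrative.

```python
from statistics import median

prices = [872, 432, 397, 427, 388, 782, 397]

# Seven entries (odd): the median is the middle entry of the ordered data
median_all = median(prices)
print(median_all)  # 427

# Drop the $432 flight, leaving six entries (even):
# the median is the mean of the two middle entries, (397 + 427) / 2
remaining = [p for p in prices if p != 432]
median_remaining = median(remaining)
print(median_remaining)  # 412.0
```
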


Measure of Central Tendency: Mode

Mode
The data entry that occurs with the greatest frequency.
If no entry is repeated the data set has no mode.
If two entries occur with the same greatest frequency,
each entry is a mode (bimodal).



Example: Finding the Mode

The prices (in dollars) for a sample of roundtrip flights
from Chicago, Illinois to Cancun, Mexico are listed.
Find the mode of the flight prices.
872 432 397 427 388 782 397


Solution: Finding the Mode

872 432 397 427 388 782 397

Ordering the data helps to find the mode.
388 397 397 427 432 782 872

The entry of 397 occurs twice, whereas the other
data entries occur only once.

The mode of the flight prices is $397.


Example: Finding the Mode

At a political debate a sample of audience members was
asked to name the political party to which they belong.
Their responses are shown in the table. What is the
mode of the responses?

Political Party    Frequency, f
Democrat           34
Republican         56
Other              21
Did not respond    9


Solution: Finding the Mode

Political Party    Frequency, f
Democrat           34
Republican         56
Other              21
Did not respond    9

The mode is Republican (the response occurring with
the greatest frequency). In this sample there were more
Republicans than people of any other single affiliation.
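The two mode examples can be sketched in Python. The helper `modes` below is a hypothetical function written for this illustration (it is not from the slides); it follows the definition given earlier: it returns every entry with the greatest frequency, and an empty list when no entry is repeated.

```python
from collections import Counter

def modes(data):
    """Return every entry that occurs with the greatest frequency.

    Returns an empty list when no entry is repeated (no mode).
    """
    counts = Counter(data)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []
    return [value for value, count in counts.items() if count == top]

# Quantitative data: the flight prices
prices = [872, 432, 397, 427, 388, 782, 397]
print(modes(prices))  # [397]

# Qualitative data given as a frequency table: pick the largest frequency
party_counts = {"Democrat": 34, "Republican": 56, "Other": 21, "Did not respond": 9}
modal_party = max(party_counts, key=party_counts.get)
print(modal_party)  # Republican
```

Note that the mode is the only one of the three measures that applies to the party data, because the responses are categories rather than numbers.
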


Comparing the Mean, Median, and Mode

All three measures describe a typical entry of a data set.
Advantage of using the mean:
The mean is a reliable measure because it takes
into account every entry of a data set.
Disadvantage of using the mean:
Greatly affected by outliers (a data entry that is far
removed from the other entries in the data set).


Example: Comparing the Mean, Median,
and Mode
Find the mean, median, and mode of the sample ages of
a class shown. Which measure of central tendency best
describes a typical entry of this data set? Are there any
outliers?
Ages in a class
20 20 20 20 20 20 21
21 21 21 22 22 22 23
23 23 23 24 24 65



Solution: Comparing the Mean, Median,
and Mode
Ages in a class
20 20 20 20 20 20 21
21 21 21 22 22 22 23
23 23 23 24 24 65

Mean: \( \bar{x} = \frac{\sum x}{n} = \frac{20 + 20 + \dots + 24 + 65}{20} \approx 23.8 \) years

Median: \( \frac{21 + 22}{2} = 21.5 \) years

Mode: 20 years (the entry occurring with the
greatest frequency)

Solution: Comparing the Mean, Median,
and Mode

Mean ≈ 23.8 years    Median = 21.5 years    Mode = 20 years

The mean takes every entry into account, but is
influenced by the outlier of 65.
The median also takes every entry into account, and
it is not affected by the outlier.
In this case the mode exists, but it doesn't appear to
represent a typical entry.
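A short Python sketch makes the effect of the outlier concrete. It recomputes the three measures for the class ages and then, purely as an illustration that is not part of the slides, recomputes the mean and median with the entry 65 removed.

```python
from statistics import mean, median, mode

# Ages in the class, entered by frequency (20 entries in total)
ages = [20] * 6 + [21] * 4 + [22] * 3 + [23] * 4 + [24] * 2 + [65]

print(mean(ages))    # 23.75 (about 23.8 years)
print(median(ages))  # 21.5
print(mode(ages))    # 20

# Illustration only: remove the outlier and recompute
without_outlier = [a for a in ages if a != 65]
print(round(mean(without_outlier), 2))  # 21.58
print(median(without_outlier))          # 21
```

Removing the single entry 65 moves the mean by more than two years but moves the median by only half a year, which is why the median is the better description of a typical age here.
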


Solution: Comparing the Mean, Median,
and Mode
Sometimes a graphical comparison can help you decide
which measure of central tendency best represents a
data set.

In this case, it appears that the median best describes
the data set.

Weighted Mean

Weighted Mean
The mean of a data set whose entries have varying
weights.

\( \bar{x} = \frac{\sum (x \cdot w)}{\sum w} \), where w is the weight of each entry x


Example: Finding a Weighted Mean

You are taking a class in which your grade is
determined from five sources: 50% from your test
mean, 15% from your midterm, 20% from your final
exam, 10% from your computer lab work, and 5% from
your homework. Your scores are 86 (test mean), 96
(midterm), 82 (final exam), 98 (computer lab), and 100
(homework). What is the weighted mean of your
scores? If the minimum average for an A is 90, did you
get an A?


Solution: Finding a Weighted Mean
Source          Score, x    Weight, w    x·w
Test Mean       86          0.50         86(0.50) = 43.0
Midterm         96          0.15         96(0.15) = 14.4
Final Exam      82          0.20         82(0.20) = 16.4
Computer Lab    98          0.10         98(0.10) = 9.8
Homework        100         0.05         100(0.05) = 5.0
                            \( \sum w = 1 \)    \( \sum (x \cdot w) = 88.6 \)

\( \bar{x} = \frac{\sum (x \cdot w)}{\sum w} = \frac{88.6}{1} = 88.6 \)

Your weighted mean for the course is 88.6. You did not
get an A.

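The weighted mean in this solution can be reproduced with a short Python sketch. The list layout and names (`grades`, `weighted_mean`, `got_a`) are assumptions made for this illustration.

```python
# (source, score x, weight w) for each part of the grade
grades = [
    ("Test Mean", 86, 0.50),
    ("Midterm", 96, 0.15),
    ("Final Exam", 82, 0.20),
    ("Computer Lab", 98, 0.10),
    ("Homework", 100, 0.05),
]

# Weighted mean: sum of (x * w) divided by the sum of the weights
sum_xw = sum(x * w for _, x, w in grades)
sum_w = sum(w for _, _, w in grades)
weighted_mean = sum_xw / sum_w

# The minimum average for an A is 90
got_a = weighted_mean >= 90

print(round(weighted_mean, 1))  # 88.6
print(got_a)                    # False
```

Dividing by the sum of the weights keeps the formula valid even when the weights do not add up to 1.
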
Mean of Grouped Data

Mean of a Frequency Distribution


Approximated by
\( \bar{x} = \frac{\sum (x \cdot f)}{n}, \quad n = \sum f \)
where x and f are the midpoints and frequencies of a
class, respectively


Finding the Mean of a Frequency
Distribution
In Words                                        In Symbols
1. Find the midpoint of each class.             \( x = \frac{(\text{lower limit}) + (\text{upper limit})}{2} \)
2. Find the sum of the products of the
   midpoints and the frequencies.               \( \sum (x \cdot f) \)
3. Find the sum of the frequencies.             \( n = \sum f \)
4. Find the mean of the frequency
   distribution.                                \( \bar{x} = \frac{\sum (x \cdot f)}{n} \)

Example: Find the Mean of a Frequency
Distribution
Use the frequency distribution to approximate the mean
number of minutes that a sample of Internet subscribers
spent online during their most recent session.
Class    Midpoint    Frequency, f
7-18     12.5        6
19-30    24.5        10
31-42    36.5        13
43-54    48.5        8
55-66    60.5        5
67-78    72.5        6
79-90    84.5        2

Solution: Find the Mean of a Frequency
Distribution
Class    Midpoint, x    Frequency, f    (x·f)
7-18     12.5           6               12.5 · 6 = 75.0
19-30    24.5           10              24.5 · 10 = 245.0
31-42    36.5           13              36.5 · 13 = 474.5
43-54    48.5           8               48.5 · 8 = 388.0
55-66    60.5           5               60.5 · 5 = 302.5
67-78    72.5           6               72.5 · 6 = 435.0
79-90    84.5           2               84.5 · 2 = 169.0
                        n = 50          \( \sum (x \cdot f) = 2089.0 \)

\( \bar{x} = \frac{\sum (x \cdot f)}{n} = \frac{2089}{50} \approx 41.8 \) minutes

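The four steps for the mean of a frequency distribution translate directly into Python. This is a sketch under the assumption that each class is stored as a (lower limit, upper limit, frequency) tuple; that layout and the variable names are illustrative, not from the slides.

```python
# (lower limit, upper limit, frequency f) for each class, in minutes
classes = [
    (7, 18, 6),
    (19, 30, 10),
    (31, 42, 13),
    (43, 54, 8),
    (55, 66, 5),
    (67, 78, 6),
    (79, 90, 2),
]

# Step 1: midpoint of each class
midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]

# Step 2: sum of the products of the midpoints and the frequencies
sum_xf = sum(x * f for x, (_, _, f) in zip(midpoints, classes))

# Step 3: sum of the frequencies
n = sum(f for _, _, f in classes)

# Step 4: approximate mean of the frequency distribution
grouped_mean = sum_xf / n

print(sum_xf)                  # 2089.0
print(n)                       # 50
print(round(grouped_mean, 1))  # 41.8
```

The result is an approximation because every entry in a class is treated as if it were equal to the class midpoint.
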
The Shape of Distributions

Symmetric Distribution
A vertical line can be drawn through the middle of
a graph of the distribution and the resulting halves
are approximately mirror images.



The Shape of Distributions

Uniform Distribution (rectangular)


All entries or classes in the distribution have equal
or approximately equal frequencies.
Symmetric.



The Shape of Distributions

Skewed Left Distribution (negatively skewed)


The tail of the graph elongates more to the left.
The mean is to the left of the median.



The Shape of Distributions

Skewed Right Distribution (positively skewed)


The tail of the graph elongates more to the right.
The mean is to the right of the median.



Range

Range
The difference between the maximum and minimum
data entries in the set.
The data must be quantitative.
Range = (Max. data entry) - (Min. data entry)


Example: Finding the Range

A corporation hired 10 graduates. The starting salaries
for each graduate are shown. Find the range of the
starting salaries.
Starting salaries (1000s of dollars)
41 38 39 45 47 41 44 41 37 42


Solution: Finding the Range

Ordering the data helps to find the least and greatest
salaries.
37 38 39 41 41 41 42 44 45 47
(minimum = 37, maximum = 47)

Range = (Max. salary) - (Min. salary)
      = 47 - 37 = 10

The range of starting salaries is 10, or $10,000.
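In code the ordering step is unnecessary, because `max` and `min` find the extreme entries directly. A minimal sketch with illustrative variable names:

```python
# Starting salaries in thousands of dollars
salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]

# Range = (maximum data entry) - (minimum data entry)
salary_range = max(salaries) - min(salaries)

print(salary_range)         # 10
print(salary_range * 1000)  # 10000 (dollars)
```
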


Deviation, Variance, and Standard
Deviation
Deviation
The difference between the data entry, x, and the
mean of the data set.
Population data set:
Deviation of x = \( x - \mu \)
Sample data set:
Deviation of x = \( x - \bar{x} \)


Example: Finding the Deviation

A corporation hired 10 graduates. The starting salaries
for each graduate are shown. Find the deviation of the
starting salaries.
Starting salaries (1000s of dollars)
41 38 39 45 47 41 44 41 37 42
Solution:
First determine the mean starting salary.
\( \mu = \frac{\sum x}{N} = \frac{415}{10} = 41.5 \)


Solution: Finding the Deviation

Determine the deviation for each data entry.

Salary ($1000s), x    Deviation: \( x - \mu \)
41                    41 - 41.5 = -0.5
38                    38 - 41.5 = -3.5
39                    39 - 41.5 = -2.5
45                    45 - 41.5 = 3.5
47                    47 - 41.5 = 5.5
41                    41 - 41.5 = -0.5
44                    44 - 41.5 = 2.5
41                    41 - 41.5 = -0.5
37                    37 - 41.5 = -4.5
42                    42 - 41.5 = 0.5
\( \sum x = 415 \)    \( \sum (x - \mu) = 0 \)

Deviation, Variance, and Standard
Deviation
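Population Variance
\( \sigma^2 = \frac{\sum (x - \mu)^2}{N} \)
The numerator, \( \sum (x - \mu)^2 \), is the sum of squares, \( SS_x \).

Population Standard Deviation
\( \sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (x - \mu)^2}{N}} \)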


Finding the Population Variance &
Standard Deviation
In Words                                        In Symbols
1. Find the mean of the population data set.    \( \mu = \frac{\sum x}{N} \)
2. Find the deviation of each entry.            \( x - \mu \)
3. Square each deviation.                       \( (x - \mu)^2 \)
4. Add to get the sum of squares.               \( SS_x = \sum (x - \mu)^2 \)


Finding the Population Variance &
Standard Deviation
In Words                                        In Symbols
5. Divide by N to get the population
   variance.                                    \( \sigma^2 = \frac{\sum (x - \mu)^2}{N} \)
6. Find the square root to get the
   population standard deviation.               \( \sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} \)


Example: Finding the Population
Standard Deviation
A corporation hired 10 graduates. The starting salaries
for each graduate are shown. Find the population
variance and standard deviation of the starting salaries.
Starting salaries (1000s of dollars)
41 38 39 45 47 41 44 41 37 42
Recall \( \mu = 41.5 \).


Solution: Finding the Population
Standard Deviation
Determine \( SS_x \), with N = 10.

Salary, x    Deviation: \( x - \mu \)    Squares: \( (x - \mu)^2 \)
41           41 - 41.5 = -0.5            (-0.5)^2 = 0.25
38           38 - 41.5 = -3.5            (-3.5)^2 = 12.25
39           39 - 41.5 = -2.5            (-2.5)^2 = 6.25
45           45 - 41.5 = 3.5             (3.5)^2 = 12.25
47           47 - 41.5 = 5.5             (5.5)^2 = 30.25
41           41 - 41.5 = -0.5            (-0.5)^2 = 0.25
44           44 - 41.5 = 2.5             (2.5)^2 = 6.25
41           41 - 41.5 = -0.5            (-0.5)^2 = 0.25
37           37 - 41.5 = -4.5            (-4.5)^2 = 20.25
42           42 - 41.5 = 0.5             (0.5)^2 = 0.25
             \( \sum (x - \mu) = 0 \)    \( SS_x = 88.5 \)

Solution: Finding the Population
Standard Deviation
Population Variance
\( \sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{88.5}{10} \approx 8.9 \)

Population Standard Deviation
\( \sigma = \sqrt{\sigma^2} = \sqrt{8.85} \approx 3.0 \)

The population standard deviation is about 3.0, or $3000.

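The six steps for the population variance and standard deviation can be followed literally in Python and then cross-checked against the standard library's `statistics.pvariance` and `statistics.pstdev`. The variable names below are illustrative.

```python
from math import sqrt
from statistics import pstdev, pvariance

# Starting salaries in thousands of dollars, treated as a population
salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]

# Step 1: population mean
N = len(salaries)
mu = sum(salaries) / N

# Steps 2-4: deviations, squared deviations, sum of squares
deviations = [x - mu for x in salaries]
ss_x = sum(d ** 2 for d in deviations)

# Step 5: population variance (divide by N)
pop_variance = ss_x / N

# Step 6: population standard deviation
pop_std = sqrt(pop_variance)

print(mu)                 # 41.5
print(sum(deviations))    # 0.0
print(ss_x)               # 88.5
print(pop_variance)       # 8.85
print(round(pop_std, 1))  # 3.0

# Cross-check with the standard library
print(round(pvariance(salaries), 2))  # 8.85
print(round(pstdev(salaries), 1))     # 3.0
```

The deviations always sum to zero, which is why they are squared before being averaged.
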
Deviation, Variance, and Standard
Deviation
Sample Variance
\( s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \)

Sample Standard Deviation
\( s = \sqrt{s^2} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \)


Finding the Sample Variance & Standard
Deviation
In Words                                        In Symbols
1. Find the mean of the sample data set.        \( \bar{x} = \frac{\sum x}{n} \)
2. Find the deviation of each entry.            \( x - \bar{x} \)
3. Square each deviation.                       \( (x - \bar{x})^2 \)
4. Add to get the sum of squares.               \( SS_x = \sum (x - \bar{x})^2 \)


Finding the Sample Variance & Standard
Deviation
In Words                                        In Symbols
5. Divide by n - 1 to get the sample
   variance.                                    \( s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \)
6. Find the square root to get the
   sample standard deviation.                   \( s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \)


Example: Finding the Sample Standard
Deviation
The starting salaries are for the Chicago branches of a
corporation. The corporation has several other branches,
and you plan to use the starting salaries of the Chicago
branches to estimate the starting salaries for the larger
population. Find the sample standard deviation of the
starting salaries.
Starting salaries (1000s of dollars)
41 38 39 45 47 41 44 41 37 42



Solution: Finding the Sample Standard
Deviation
Determine \( SS_x \), with n = 10.

Salary, x    Deviation: \( x - \bar{x} \)    Squares: \( (x - \bar{x})^2 \)
41           41 - 41.5 = -0.5                (-0.5)^2 = 0.25
38           38 - 41.5 = -3.5                (-3.5)^2 = 12.25
39           39 - 41.5 = -2.5                (-2.5)^2 = 6.25
45           45 - 41.5 = 3.5                 (3.5)^2 = 12.25
47           47 - 41.5 = 5.5                 (5.5)^2 = 30.25
41           41 - 41.5 = -0.5                (-0.5)^2 = 0.25
44           44 - 41.5 = 2.5                 (2.5)^2 = 6.25
41           41 - 41.5 = -0.5                (-0.5)^2 = 0.25
37           37 - 41.5 = -4.5                (-4.5)^2 = 20.25
42           42 - 41.5 = 0.5                 (0.5)^2 = 0.25
             \( \sum (x - \bar{x}) = 0 \)    \( SS_x = 88.5 \)

Solution: Finding the Sample Standard
Deviation
Sample Variance
\( s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{88.5}{10 - 1} \approx 9.8 \)

Sample Standard Deviation
\( s = \sqrt{s^2} = \sqrt{\frac{88.5}{9}} \approx 3.1 \)

The sample standard deviation is about 3.1, or $3100.

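The only change from the population calculation is the divisor, n - 1 instead of N. The sketch below computes the sample variance and standard deviation by hand and compares them with `statistics.variance` and `statistics.stdev`, which also divide by n - 1. Variable names are illustrative.

```python
from math import sqrt
from statistics import stdev, variance

# Starting salaries in thousands of dollars, treated as a sample
salaries = [41, 38, 39, 45, 47, 41, 44, 41, 37, 42]

n = len(salaries)
x_bar = sum(salaries) / n

# Sum of squared deviations from the sample mean
ss_x = sum((x - x_bar) ** 2 for x in salaries)

# Divide by n - 1 (not n) for a sample
sample_variance = ss_x / (n - 1)
sample_std = sqrt(sample_variance)

print(round(sample_variance, 1))  # 9.8
print(round(sample_std, 1))       # 3.1

# Cross-check with the standard library
print(round(variance(salaries), 1))  # 9.8
print(round(stdev(salaries), 1))     # 3.1
```

For the same data, the sample standard deviation (about 3.1) is slightly larger than the population standard deviation (about 3.0) because the divisor is smaller.
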
Interpreting Standard Deviation

Standard deviation is a measure of the typical amount
an entry deviates from the mean.
The more the entries are spread out, the greater the
standard deviation.


Interpreting Standard Deviation:
Empirical Rule (68-95-99.7 Rule)

For data with a (symmetric) bell-shaped distribution, the
standard deviation has the following characteristics:

About 68% of the data lie within one standard
deviation of the mean.
About 95% of the data lie within two standard
deviations of the mean.
About 99.7% of the data lie within three standard
deviations of the mean.


Interpreting Standard Deviation:
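Empirical Rule (68-95-99.7 Rule)

[Figure: bell-shaped curve with the horizontal axis marked at
\( \bar{x} - 3s \), \( \bar{x} - 2s \), \( \bar{x} - s \), \( \bar{x} \), \( \bar{x} + s \), \( \bar{x} + 2s \), \( \bar{x} + 3s \).
68% of the data lie within 1 standard deviation (34% on each side of the mean),
95% within 2 standard deviations (a further 13.5% on each side), and
99.7% within 3 standard deviations (a further 2.35% on each side).]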


Example: Using the Empirical Rule

In a survey conducted by the National Center for Health
Statistics, the sample mean height of women in the
United States (ages 20-29) was 64 inches, with a sample
standard deviation of 2.71 inches. Estimate the percent
of the women whose heights are between 64 inches and
69.42 inches.


Solution: Using the Empirical Rule
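Because the distribution is bell-shaped, you can use
the Empirical Rule.

[Figure: bell-shaped curve of heights with the axis marked at
55.87 (\( \bar{x} - 3s \)), 58.58 (\( \bar{x} - 2s \)), 61.29 (\( \bar{x} - s \)), 64 (\( \bar{x} \)),
66.71 (\( \bar{x} + s \)), 69.42 (\( \bar{x} + 2s \)), 72.13 (\( \bar{x} + 3s \)).
The region from 64 to 66.71 holds 34% of the data and the
region from 66.71 to 69.42 holds 13.5%.]

34% + 13.5% = 47.5% of women are between 64 and
69.42 inches tall.
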
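The band percentages from the Empirical Rule can be put in a small lookup table to reproduce this estimate. This is a sketch under two assumptions: the data are bell-shaped, and both endpoints fall a whole number of standard deviations from the mean. The names `BAND_PERCENT` and `percent_between` are hypothetical helpers, not from the slides.

```python
# Approximate percent of the data in each one-standard-deviation band.
# Key k stands for the band from (mean + k*s) to (mean + (k+1)*s).
BAND_PERCENT = {-3: 2.35, -2: 13.5, -1: 34.0, 0: 34.0, 1: 13.5, 2: 2.35}

def percent_between(k_low, k_high):
    """Percent of data between mean + k_low*s and mean + k_high*s."""
    return sum(BAND_PERCENT[k] for k in range(k_low, k_high))

mean_height = 64.0  # inches
s = 2.71            # inches

# Express both endpoints as a number of standard deviations from the mean
k_low = round((64.0 - mean_height) / s)
k_high = round((69.42 - mean_height) / s)

estimate = percent_between(k_low, k_high)

print(k_low, k_high)  # 0 2
print(estimate)       # 47.5
```

The same helper gives the three headline figures of the rule: `percent_between(-1, 1)` is 68.0 and `percent_between(-2, 2)` is 95.0, while `percent_between(-3, 3)` comes to about 99.7.
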
Standard Deviation for Grouped Data

Sample standard deviation for a frequency distribution

\( s = \sqrt{\frac{\sum (x - \bar{x})^2 f}{n - 1}} \), where \( n = \sum f \) (the number of
entries in the data set)

When a frequency distribution has classes, estimate the
sample mean and standard deviation by using the
midpoint of each class.


Example: Finding the Standard Deviation
for Grouped Data

You collect a random sample of the number of children
per household in a region. Find the sample mean and
the sample standard deviation of the data set.

Number of Children in 50 Households
1 3 1 1 1
1 2 2 1 0
1 1 0 0 0
1 5 0 3 6
3 0 3 1 1
1 1 6 0 1
3 6 6 1 2
2 3 0 1 1
4 1 1 2 2
0 3 0 2 4


Solution: Finding the Standard Deviation
for Grouped Data
First construct a frequency distribution.

x    f     xf
0    10    0(10) = 0
1    19    1(19) = 19
2    7     2(7) = 14
3    7     3(7) = 21
4    2     4(2) = 8
5    1     5(1) = 5
6    4     6(4) = 24
     \( \sum f = 50 \)    \( \sum (xf) = 91 \)

Find the mean of the frequency distribution.
\( \bar{x} = \frac{\sum xf}{n} = \frac{91}{50} \approx 1.8 \)

The sample mean is about 1.8 children.


Solution: Finding the Standard Deviation
for Grouped Data
Determine the sum of squares.

x    f     \( x - \bar{x} \)     \( (x - \bar{x})^2 \)    \( (x - \bar{x})^2 f \)
0    10    0 - 1.8 = -1.8    (-1.8)^2 = 3.24     3.24(10) = 32.40
1    19    1 - 1.8 = -0.8    (-0.8)^2 = 0.64     0.64(19) = 12.16
2    7     2 - 1.8 = 0.2     (0.2)^2 = 0.04      0.04(7) = 0.28
3    7     3 - 1.8 = 1.2     (1.2)^2 = 1.44      1.44(7) = 10.08
4    2     4 - 1.8 = 2.2     (2.2)^2 = 4.84      4.84(2) = 9.68
5    1     5 - 1.8 = 3.2     (3.2)^2 = 10.24     10.24(1) = 10.24
6    4     6 - 1.8 = 4.2     (4.2)^2 = 17.64     17.64(4) = 70.56
                                             \( \sum (x - \bar{x})^2 f = 145.40 \)


Solution: Finding the Standard Deviation
for Grouped Data
Find the sample standard deviation.
\( s = \sqrt{\frac{\sum (x - \bar{x})^2 f}{n - 1}} = \sqrt{\frac{145.40}{50 - 1}} \approx 1.7 \)

The standard deviation is about 1.7 children.
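The grouped-data calculation can be sketched in Python from the frequency distribution alone. One difference from the slides is worth flagging: the slides round the mean to 1.8 before computing the deviations (giving a sum of squares of 145.40), whereas this sketch keeps the unrounded mean 1.82 (giving about 145.38). The rounded answers agree. The dictionary layout and names are illustrative.

```python
from math import sqrt

# Frequency distribution: number of children x -> number of households f
freq = {0: 10, 1: 19, 2: 7, 3: 7, 4: 2, 5: 1, 6: 4}

# n is the sum of the frequencies
n = sum(freq.values())

# Mean of the frequency distribution: sum of x*f divided by n
x_bar = sum(x * f for x, f in freq.items()) / n

# Sum of squares, weighting each squared deviation by its frequency
ss = sum((x - x_bar) ** 2 * f for x, f in freq.items())

# Sample standard deviation: divide by n - 1, then take the square root
s = sqrt(ss / (n - 1))

print(n)                # 50
print(round(x_bar, 1))  # 1.8
print(round(s, 1))      # 1.7
```
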


Quartiles
Fractiles are numbers that partition (divide) an
ordered data set into equal parts.
Quartiles approximately divide an ordered data set
into four equal parts.
First quartile, Q1: About one quarter of the data
fall on or below Q1.
Second quartile, Q2: About one half of the data
fall on or below Q2 (median).
Third quartile, Q3: About three quarters of the
data fall on or below Q3.



Example: Finding Quartiles

The test scores of 15 employees enrolled in a CPR
training course are listed. Find the first, second, and
third quartiles of the test scores.
13 9 18 15 14 21 7 10 11 20 5 18 37 16 17
Solution:
Q2 divides the data set into two halves.
Ordered data: 5 7 9 10 11 13 14 15 16 17 18 18 20 21 37
Lower half: 5 7 9 10 11 13 14
Q2 = 15
Upper half: 16 17 18 18 20 21 37

Solution: Finding Quartiles
The first and third quartiles are the medians of the
lower and upper halves of the data set.
Lower half: 5 7 9 10 11 13 14 (Q1 = 10)
Q2 = 15
Upper half: 16 17 18 18 20 21 37 (Q3 = 18)

About one fourth of the employees scored 10 or less,
about one half scored 15 or less, and about three
fourths scored 18 or less.


Interquartile Range

Interquartile Range (IQR)
The difference between the third and first quartiles.
IQR = Q3 - Q1


Example: Finding the Interquartile Range

Find the interquartile range of the test scores.
Recall Q1 = 10, Q2 = 15, and Q3 = 18.

Solution:
IQR = Q3 - Q1 = 18 - 10 = 8

The test scores in the middle portion of the data set
vary by at most 8 points.
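The quartile method used in these slides (Q2 is the median; Q1 and Q3 are the medians of the lower and upper halves, leaving the middle entry out when the count is odd) can be sketched as follows. The function name `quartiles` is a hypothetical helper; note that other software may use a different quartile convention and return slightly different values for some data sets.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For an odd count, the middle entry belongs to neither half
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

scores = [13, 9, 18, 15, 14, 21, 7, 10, 11, 20, 5, 18, 37, 16, 17]

q1, q2, q3 = quartiles(scores)
iqr = q3 - q1

print(q1, q2, q3)  # 10 15 18
print(iqr)         # 8
```
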


Percentiles and Other Fractiles

Fractiles      Summary                              Symbols
Quartiles      Divides data into 4 equal parts      Q1, Q2, Q3
Deciles        Divides data into 10 equal parts     D1, D2, D3, ..., D9
Percentiles    Divides data into 100 equal parts    P1, P2, P3, ..., P99


Example: Interpreting Percentiles

The ogive represents the cumulative frequency
distribution for SAT test scores of college-bound
students in a recent year. What test score represents
the 72nd percentile? How should you interpret this?
(Source: College Board Online)


Solution: Interpreting Percentiles

The 72nd percentile corresponds to a test score
of 1700.
This means that 72% of the students had an SAT score
of 1700 or less.


The Standard Score

Standard Score (z-score)
Represents the number of standard deviations a given
value x falls from the mean \( \mu \).

\( z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} = \frac{x - \mu}{\sigma} \)


Example: Comparing z-Scores from
Different Data Sets
In 2007, Forest Whitaker won the Best Actor Oscar at
age 45 for his role in the movie The Last King of
Scotland. Helen Mirren won the Best Actress Oscar at
age 61 for her role in The Queen. The mean age of all
best actor winners is 43.7, with a standard deviation of
8.8. The mean age of all best actress winners is 36, with
a standard deviation of 11.5. Find the z-score that
corresponds to the age for each actor or actress. Then
compare your results.



Solution: Comparing z-Scores from
Different Data Sets

Forest Whitaker
\( z = \frac{x - \mu}{\sigma} = \frac{45 - 43.7}{8.8} \approx 0.15 \)
(0.15 standard deviations above the mean)

Helen Mirren
\( z = \frac{x - \mu}{\sigma} = \frac{61 - 36}{11.5} \approx 2.17 \)
(2.17 standard deviations above the mean)
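Both z-scores can be computed with a one-line function. The sketch below uses a hypothetical helper `z_score` and applies the "more than two standard deviations from the mean" guideline for an unusual value.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

# Best Actor: winner aged 45, mean age 43.7, standard deviation 8.8
z_whitaker = z_score(45, 43.7, 8.8)

# Best Actress: winner aged 61, mean age 36, standard deviation 11.5
z_mirren = z_score(61, 36, 11.5)

print(round(z_whitaker, 2))  # 0.15
print(round(z_mirren, 2))    # 2.17

# A value more than two standard deviations from the mean is unusual
print(abs(z_whitaker) > 2)   # False
print(abs(z_mirren) > 2)     # True
```

Because a z-score is expressed in standard deviations rather than years, the two ages can be compared even though they come from data sets with different means and spreads.
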


Solution: Comparing z-Scores from
Different Data Sets

Forest Whitaker: z = 0.15    Helen Mirren: z = 2.17
The z-score corresponding to the age of Helen Mirren
is more than two standard deviations from the mean,
so it is considered unusual. Compared to other Best
Actress winners, she is relatively older, whereas the
age of Forest Whitaker is only slightly higher than the
average age of other Best Actor winners.