
2.4 - 2.5

The procedure for finding the variance and standard
deviation for grouped data is similar to that for finding
the mean for grouped data, and it uses the midpoints
of each class.

Make a table as shown:

A        B            C                 D                E
Class    Frequency    Midpoint $x_m$    $f \cdot x_m$    $f \cdot x_m^2$

Multiply the frequency by the midpoint for each class, and place the products in
column D.
Multiply the frequency by the square of the midpoint, and place the products in
column E.
Find the sums of columns B, D, and E. (The sum of column B is $n$. The sum of
column D is $\sum f \cdot x_m$. The sum of column E is $\sum f \cdot x_m^2$.)


Substitute in the formula $s^2 = \dfrac{n\left(\sum f \cdot x_m^2\right) - \left(\sum f \cdot x_m\right)^2}{n(n-1)}$ and solve to get the variance.


Take the square root to get the standard deviation.

Example: Find the variance and the standard deviation for the frequency
distribution of the data. The data represent the number of miles that 20
runners ran during one week.
Class        Frequency    Midpoint
5.5-10.5     1            8
10.5-15.5    2            13
15.5-20.5    3            18
20.5-25.5    5            23
25.5-30.5    4            28
30.5-35.5    3            33
35.5-40.5    2            38
Class        Frequency    Midpoint $x_m$    $f \cdot x_m$    $f \cdot x_m^2$
5.5-10.5     1            8                 8                64
10.5-15.5    2            13                26               338
15.5-20.5    3            18                54               972
20.5-25.5    5            23                115              2,645
25.5-30.5    4            28                112              3,136
30.5-35.5    3            33                99               3,267
35.5-40.5    2            38                76               2,888
Sums: $n = 20$, $\sum f \cdot x_m = 490$, $\sum f \cdot x_m^2 = 13{,}310$

Multiply the frequency by the midpoint for each class, and place the products in
the 4th column.
Multiply the frequency by the square of the midpoint, and place the products in
the 5th column. Note that only the midpoint is squared: the entry is
$f \cdot x_m^2$, not $(f \cdot x_m)^2$.
Find the sums of the 2nd, 4th, and 5th columns.

$s^2 = \dfrac{n\left(\sum f \cdot x_m^2\right) - \left(\sum f \cdot x_m\right)^2}{n(n-1)}$
$= \dfrac{20(13{,}310) - 490^2}{20(20-1)}$
$= \dfrac{266{,}200 - 240{,}100}{20(19)}$
$= \dfrac{26{,}100}{380}$
$\approx 68.7$

Take the square root to get the standard deviation.
$s = \sqrt{68.7} \approx 8.3$
Be sure to use the number found in the sum of the 2nd column for n. Do not use
the number of classes.
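The table procedure above can be sketched in Python. This is only an illustrative sketch: the function name `grouped_variance` and the tuple layout (lower boundary, upper boundary, frequency) are assumptions made for this example, and the midpoint is taken as the average of the two class boundaries. The data are the runner classes and frequencies from the example.

```python
import math

def grouped_variance(classes):
    """Sample variance for grouped data.

    `classes` is a list of (lower_boundary, upper_boundary, frequency)
    tuples (a layout assumed for this sketch).
    """
    n = 0          # sum of the frequency column
    sum_fx = 0.0   # sum of f * x_m
    sum_fx2 = 0.0  # sum of f * x_m^2 (only the midpoint is squared)
    for lower, upper, f in classes:
        x_m = (lower + upper) / 2  # class midpoint
        n += f
        sum_fx += f * x_m
        sum_fx2 += f * x_m ** 2
    # n is the total frequency, not the number of classes
    return (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))

runner_classes = [
    (5.5, 10.5, 1),
    (10.5, 15.5, 2),
    (15.5, 20.5, 3),
    (20.5, 25.5, 5),
    (25.5, 30.5, 4),
    (30.5, 35.5, 3),
    (35.5, 40.5, 2),
]

variance = grouped_variance(runner_classes)
std_dev = math.sqrt(variance)
print(round(variance, 1), round(std_dev, 1))
```

Because the function accumulates n from the frequencies, it cannot mistake the number of classes (7) for the sample size (20).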
The range can be used to approximate the standard deviation. The approximation
is called the range rule of thumb.
$s \approx \dfrac{\text{range}}{4}$
Example: The data set 5, 8, 8, 9, 10, 12, and 13 has a standard deviation of
2.7, and the range is 13 - 5 = 8. The range rule of thumb gives
$s \approx 8/4 = 2$.
In this example the range rule of thumb underestimates the standard deviation,
but it is in the ballpark.
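As a quick check of this example, the sketch below compares the sample standard deviation with the range rule of thumb, using Python's standard `statistics` module. The variable names are chosen only for illustration.

```python
import statistics

data = [5, 8, 8, 9, 10, 12, 13]

sample_sd = statistics.stdev(data)   # sample standard deviation (n - 1 divisor)
data_range = max(data) - min(data)   # 13 - 5 = 8
rule_of_thumb = data_range / 4       # range rule of thumb estimate

print(f"s = {sample_sd:.1f}, range/4 = {rule_of_thumb:.1f}")
# prints: s = 2.7, range/4 = 2.0
```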
The range rule of thumb can also be used to estimate the largest and smallest
data values of a data set. The smallest value will be approximately 2 standard
deviations below the mean, and the largest data value will be approximately 2
standard deviations above the mean of the data set.
Example: The mean of the data set 5, 8, 8, 9, 10, 12, and 13 is 9.3, and the
standard deviation is 2.7; hence,
Smallest data value = $\bar{X} - 2s$ = 9.3 - 2(2.7) = 3.9
Largest data value = $\bar{X} + 2s$ = 9.3 + 2(2.7) = 14.7
Now look back at the original data set. The smallest value was 5 and the
largest was 13. Again, these are considered rough estimates. Better
approximations can be obtained by using Chebyshev's theorem and the empirical
rule.
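The estimate of the smallest and largest values can be sketched the same way. This version works from the unrounded mean and sample standard deviation, so its results may differ slightly from a hand calculation with rounded values. The variable names are illustrative only.

```python
import statistics

data = [5, 8, 8, 9, 10, 12, 13]

mean = statistics.mean(data)
s = statistics.stdev(data)           # sample standard deviation

estimated_smallest = mean - 2 * s    # about 2 standard deviations below the mean
estimated_largest = mean + 2 * s     # about 2 standard deviations above the mean

print(f"estimated: {estimated_smallest:.1f} to {estimated_largest:.1f}")
print(f"actual:    {min(data)} to {max(data)}")
```

The estimated interval is wider than the actual spread of the data, which is why these are treated as rough estimates.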

Chebyshev's theorem: The portion of values from any data set lying within $z$
standard deviations ($z > 1$) of the mean is at least $1 - \dfrac{1}{z^2}$.
$z = 2$: In any data set, at least $1 - \dfrac{1}{2^2} = \dfrac{3}{4}$, or 75%,
of the data lie within 2 standard deviations of the mean.
$z = 3$: In any data set, at least $1 - \dfrac{1}{3^2} = \dfrac{8}{9}$, or
88.9%, of the data lie within 3 standard deviations of the mean.
The theorem applies to any distribution, regardless of its shape.
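The bound is simple to evaluate for any z greater than 1. A minimal sketch follows; the function name `chebyshev_min_fraction` is made up for this example.

```python
def chebyshev_min_fraction(z):
    """Minimum fraction of data within z standard deviations of the mean."""
    if z <= 1:
        raise ValueError("the bound is only informative for z > 1")
    return 1 - 1 / z ** 2

for z in (2, 3):
    print(f"z = {z}: at least {chebyshev_min_fraction(z):.1%}")
# prints:
# z = 2: at least 75.0%
# z = 3: at least 88.9%
```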
The age distributions for Alaska and Florida are shown in the histograms. Decide
which is which. Apply Chebyshev's theorem to the data for Florida.
The mean price of houses in a certain neighborhood is
$50,000, and the standard deviation is $10,000. Find
the price range for which at least 75% of the houses
will sell.
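One way to work this problem is to solve 1 - 1/z^2 = 0.75 for z, which gives z = 2, and then go 2 standard deviations on either side of the mean. The sketch below does this with Chebyshev's theorem; the helper name `chebyshev_interval` is hypothetical.

```python
import math

def chebyshev_interval(mean, sd, min_fraction):
    """Interval around the mean holding at least `min_fraction` of the data."""
    if not 0 < min_fraction < 1:
        raise ValueError("min_fraction must be between 0 and 1")
    z = 1 / math.sqrt(1 - min_fraction)  # solve 1 - 1/z^2 = min_fraction
    return mean - z * sd, mean + z * sd

low, high = chebyshev_interval(50_000, 10_000, 0.75)
print(f"${low:,.0f} to ${high:,.0f}")
# prints: $30,000 to $70,000
```

So at least 75% of the houses will sell for between $30,000 and $70,000.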
Chebyshev's theorem can be used to find the minimum percentage of data values
that will fall between any two given values.
Example: A survey of local companies found that the mean amount of travel
allowances for executives was $0.25 per mile. The standard deviation was $0.02.
Using Chebyshev's theorem, find the minimum percentage of the data values that
will fall between $0.20 and $0.30.
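A possible solution: both limits are $0.05 from the mean, which is 0.05/0.02 = 2.5 standard deviations, so the bound is 1 - 1/2.5^2 = 0.84. The sketch below reproduces this, using exact fractions to avoid floating-point rounding. The variable names are illustrative.

```python
from fractions import Fraction

mean = Fraction("0.25")
sd = Fraction("0.02")
low, high = Fraction("0.20"), Fraction("0.30")

# Chebyshev's theorem is applied to an interval centered on the mean
if mean - low != high - mean:
    raise ValueError("interval is not symmetric about the mean")

z = (high - mean) / sd           # number of standard deviations from the mean
min_fraction = 1 - 1 / z ** 2    # Chebyshev lower bound

print(f"z = {float(z)}, at least {float(min_fraction):.0%}")
# prints: z = 2.5, at least 84%
```

At least 84% of the travel allowances fall between $0.20 and $0.30 per mile.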
Data values that lie more than 2 standard deviations from the mean are considered
unusual. Data values that lie more than three standard deviations from the mean are
very unusual.
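This rule can be expressed as a small classifier. The sketch below is illustrative only: the function name `classify` is made up, the mean and standard deviation are taken from the house price example, and the three sale prices are hypothetical values chosen to show each case.

```python
def classify(value, mean, sd):
    """Label a value by its distance from the mean in standard deviations."""
    z = abs(value - mean) / sd
    if z > 3:
        return "very unusual"
    if z > 2:
        return "unusual"
    return "not unusual"

mean_price, sd_price = 50_000, 10_000   # from the house price example
for price in (55_000, 75_000, 85_000):  # hypothetical sale prices
    print(price, classify(price, mean_price, sd_price))
# prints:
# 55000 not unusual
# 75000 unusual
# 85000 very unusual
```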
The empirical rule applies only to bell-shaped (normal) distributions:
Approximately 68% of the data values will fall within 1 standard deviation of
the mean.
Approximately 95% of the data values will fall within 2 standard deviations of
the mean.
Approximately 99.7% of the data values will fall within 3 standard deviations
of the mean.


Many real-life data sets have distributions that are approximately symmetric
and bell shaped.
68% of the data lie within 1 standard deviation
95% of the data lie within 2 standard deviations
99.7% of the data lie within 3 standard deviations
Example: In a survey conducted by the National Center for Health Statistics,
the sample mean height of women in the U.S. (ages 20-29) was 64 inches, with a
sample standard deviation of 2.75 inches. Estimate the percent of women whose
heights are between 64 inches and 69.5 inches.
We know that 64 is the mean. To find the value that is 2 standard deviations
above the mean, take mean + 2(standard deviation): 64 + 2(2.75) = 69.5.
Because the distribution is bell shaped, you can use the empirical rule.
Because 69.5 is 2 standard deviations above the mean height, the percent of the
heights between 64 inches and 69.5 inches is half of 95%, that is,
34% + 13.5%, or 47.5%.
So about 47.5% of women are between 64 inches and 69.5 inches tall.
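The reasoning in this example can be sketched in Python. The first part applies the empirical rule's rounded percentages. The second part, added only for comparison, computes the exact area under a normal curve with the same mean and standard deviation using `statistics.NormalDist` (Python 3.8+), so the two figures differ slightly. Treating heights as exactly normal is an assumption of this sketch.

```python
from statistics import NormalDist

mean, sd = 64.0, 2.75
upper = 69.5

z = (upper - mean) / sd   # standard deviations above the mean

# Empirical rule: about 95% lies within 2 standard deviations,
# so by symmetry about half of that lies between the mean and mean + 2s.
empirical_pct = 95 / 2

# Exact normal-curve area between the mean and the upper value
heights = NormalDist(mu=mean, sigma=sd)
normal_pct = 100 * (heights.cdf(upper) - heights.cdf(mean))

print(f"z = {z}")
print(f"empirical rule: {empirical_pct}%")
print(f"normal model:   {normal_pct:.1f}%")
```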
