LESSON OUTLINE:
1. Review of Continuous Random Variables
2. Motivation: Distribution of Weights of Babies
3. Main Lesson: The Normal Curve and Its Properties
4. Seatwork: Validating the Empirical Rule in the Weights Distribution
5. Enhancement: History regarding the Normal Curve
6. Further Enhancement: Distribution of Balls in The Quincunx
Ask students to recall the definition of a continuous random variable. (It is a random
variable that can take any real value within a specified range, whereas a discrete random
variable takes on only a countable number of values.) Students should also remember that a
continuous variable involves a measurement of something, such as the height of a randomly
selected student, the weight of a newborn baby, or the length of time that the battery of a
cellphone lasts.
Consider the following data on the weights (in pounds) of all 36 babies born in the
maternity ward of a certain hospital.
Show students the histogram for this data set, or ask them to generate it; help them observe
that the histogram is approximately bell-shaped:
[Figure: histogram of the babies' weights. Horizontal axis: weight (2 to 10); vertical axis: density (0 to .3).]
Inform students that many continuous random variables, such as IQ scores, heights of
people, or weights of M&Ms, have histograms that look bell-shaped.
Tell them that the most important distribution in statistical science is a normal distribution,
which has a "bell-shaped" curve. Explain that there are many reasons why the normal
distribution is considered the most important curve in statistics.
(a) Many random variables are either normally distributed or, at least, approximately
normally distributed. Heights, weights, examination scores, and the log of the lifelength of
some equipment are a few examples of random variables that are approximately normally
distributed. Although the distributions are only approximately normal, the approximation
is usually quite close.
(b) It is easy for mathematical statisticians to work with the normal curve. A number of
hypothesis tests, as well as the regression model, are based on the assumption that the
underlying data have normal distributions. (Extra note: There are, however, other kinds
of continuous distributions that are used in practice. For instance, the distribution that
has been found convenient for modeling the lifelength of equipment is the Weibull
distribution.)
Stress that the normal distribution is a continuous distribution, just like the uniform and
triangular distributions. However, the left and right tails of the normal distribution extend
indefinitely, coming ever closer to the x-axis without touching it.
Explain that the graph of the normal distribution depends on two factors: the mean $\mu$ and the
standard deviation $\sigma$. In fact, the mean and standard deviation characterize the whole
distribution. That is, we can get areas under the normal curve given information about the
mean and standard deviation.
Mention that the mean determines the location of the center of the bell shaped curve. Thus, a
change in the value of the mean shifts the graph of the normal curve to the right or to the left.
Ask students to recall what the mean, median and mode of a distribution represent. They
should say that (a) the mean represents the balancing point of the graph of the distribution;
(b) the mode represents the high point of the probability density function (i.e., the graph of
the distribution); and (c) the median represents the point where 50% of the area under the
distribution is to the left and 50% of the area under the distribution is to the right.
For symmetric distributions with a single peak, such as the normal curve, help students
see that the mean = median = mode.
Inform students that the standard deviation determines the shape of the graphs (particularly,
the height and width of the curve). When the standard deviation is large, the normal curve is
short and wide, while a small value for the standard deviation yields a skinnier and taller
graph.
Draw the curves on the board:
Mention to students that the curve above on the left is shorter and wider than the curve on the
right, because the curve on the left has a bigger standard deviation.
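If a quick numerical check is preferred over a freehand drawing, the effect of the mean and the standard deviation can be sketched as follows. This is only an illustration: the means (0 and 5) and standard deviations (1 and 2) are assumed values, not taken from the lesson, and `statistics.NormalDist` is part of the Python standard library (version 3.8 or later).

```python
from statistics import NormalDist

# Two normal curves with the same mean but different standard deviations.
# The values 0, 1 and 2 are illustrative assumptions.
narrow = NormalDist(mu=0, sigma=1)
wide = NormalDist(mu=0, sigma=2)

# Height of each curve at its center (the mean).
peak_narrow = narrow.pdf(0)
peak_wide = wide.pdf(0)
print(f"sigma = 1: peak height = {peak_narrow:.4f}")
print(f"sigma = 2: peak height = {peak_wide:.4f}")

# Changing only the mean shifts the curve; the peak height stays the same.
shifted = NormalDist(mu=5, sigma=1)
peak_shifted = shifted.pdf(5)
print(f"mu = 5, sigma = 1: peak height = {peak_shifted:.4f}")
```

Doubling the standard deviation halves the peak height, which is why the curve with the bigger standard deviation looks shorter and wider.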
Help students notice that a normal curve is symmetric about its mean and is more
concentrated in the middle rather than in the tails, aside from observing that normal curves
differ in how spread out they are (and that the spread or variability is measured by the
standard deviation $\sigma$).
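Tell students that when a random variable X has a normal distribution with mean $\mu$ and
variance $\sigma^2$, we denote this as $X \sim N(\mu, \sigma^2)$. Its probability density function is

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{1}{2\sigma^2}(x-\mu)^2\right\}.$$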
You may notice that this formula involves three famous numbers in the history of mathematics:
$\sqrt{2} \approx 1.414213562$,
$\pi \approx 3.141592654$, and
Euler's number $e \approx 2.7182818$.
Students need not be given this expression as they may feel threatened by it. Instead, you
should use the graphical form of the normal distribution by drawing the bell shaped curve.
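For the teacher's own reference, the normal density $f(x) = \exp\{-(x-\mu)^2 / (2\sigma^2)\} / (\sigma\sqrt{2\pi})$ can be evaluated directly. The sketch below is a minimal illustration; the function name `normal_pdf` and the values $\mu = 52$, $\sigma = 1$ are assumptions chosen for the example.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal curve with mean mu and standard deviation sigma at x."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)

# Illustrative (assumed) parameter values.
mu, sigma = 52.0, 1.0

# Tabulate the curve at a few points on either side of the mean.
for x in (49, 50, 51, 52, 53, 54, 55):
    print(x, round(normal_pdf(x, mu, sigma), 4))
```

The printed heights are largest at the mean and equal at points the same distance to its left and right, which is the bell shape and symmetry that students see in the drawing.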
Present the empirical rule for the normal curve:
About 68% of the area under the curve falls within 1 standard deviation of the mean.
About 95% of the area under the curve falls within 2 standard deviations of the mean.
Nearly the entire distribution (about 99.7% of the area under the curve) falls within 3
standard deviations of the mean.
Explanatory Note: The empirical rule is actually a theoretical result based on an analysis of
the normal distribution. In the first chapter, it was pointed out that the importance of the
mean and standard deviation as summary measures is due to Chebyshev's inequality, which
guarantees that the area under a distribution within two standard deviations from the mean is
at least 75%. For nearly all sets of data, the actual percentage of data is much greater
than the bound specified by Chebyshev's inequality. In fact, for a normal curve, the area
within two standard deviations from the mean is about 95%. Also, about two thirds of the
distribution lies within one standard deviation from the mean, and nearly the entire distribution
(99.7%) is within three standard deviations from the mean.
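The percentages in the empirical rule can be checked against the Chebyshev bound $1 - 1/k^2$ with a few lines of code. This is a sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal curve: mean 0, standard deviation 1

areas = {}
for k in (1, 2, 3):
    # Area under the normal curve within k standard deviations of the mean.
    areas[k] = z.cdf(k) - z.cdf(-k)
    # Chebyshev's lower bound, valid for any distribution.
    chebyshev_bound = 1 - 1 / k**2
    print(f"k = {k}: normal area = {areas[k]:.4f}, Chebyshev bound = {chebyshev_bound:.4f}")
```

For k = 2 the normal area is about 0.9545, well above the Chebyshev bound of 0.75; for k = 1 the bound is 0 and therefore uninformative.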
Ask students to determine the frequency (and relative frequency) of babies' weights
that are within:
a) One standard deviation from the mean
ANSWER: 26 out of 36, or about 72%. Values within one $\sigma$ from $\mu$ are in boldface:
b) Two standard deviations from the mean
ANSWER: 34 out of 36, or about 94%. Values within two $\sigma$ from $\mu$ are in boldface:
c) Three standard deviations from the mean (ANSWER: 36 out of 36; 100%)
Remark:
In place of examining the distribution of weights from babies, you may want to examine
heights or weights of students obtained from the data collection activities in Lesson 1 of
Chapter 1.
(D) Enrichment
(ii) In some disciplines, such as engineering, the Normal distribution is also called the
Gaussian distribution (in honor of Gauss, who did not first propose it!). The first
unambiguous use of the term Normal distribution is attributed to Sir Francis Galton
in 1889, although Karl Pearson's consistent and exclusive use of this term in his
prolific writings led to its eventual adoption throughout the statistical community.
(E) Further Enrichment: Distribution of Balls in a Quincunx
http://www.mathsisfun.com/data/quincunx.html
On this webpage, students are shown a quincunx or "Galton Board" (named after Sir Francis
Galton). This is a triangular array of pegs. Balls are dropped onto the top peg and then
bounce their way down to the bottom, where they are collected in little
bins. Each time a ball hits one of the pegs, it bounces either to the left or to the right with
equal probability; consequently, the numbers of balls collecting in the bins form a "bell-shaped"
curve (especially as the number of rows (and bins) and the number of balls increase).
Tell students to reset the simulator to its defaults, use 6 rows, and drop about 50 balls.
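Where the web simulator is not available, the same experiment can be imitated offline. The sketch below is not the code behind the webpage; it simply assumes that each ball bounces left or right independently at every row, and the names `drop_balls`, `p_left` and `seed` are hypothetical.

```python
import random
from collections import Counter

def drop_balls(rows, balls, p_left=0.5, seed=None):
    """Simulate a quincunx and return the number of balls in each bin.

    Bin k holds the balls that bounced left exactly k times.
    """
    rng = random.Random(seed)
    bins = Counter()
    for _ in range(balls):
        # One left/right decision per row of pegs.
        lefts = sum(rng.random() < p_left for _ in range(rows))
        bins[lefts] += 1
    return [bins[k] for k in range(rows + 1)]

# Same settings as the classroom activity: 6 rows and 50 balls.
counts = drop_balls(rows=6, balls=50, seed=1)
for k, count in enumerate(counts):
    print(f"{k} left bounces: {'*' * count}")
```

Each run with a different seed gives a slightly different picture, and with more balls the row of stars tends to look more bell-shaped.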
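For n rows and a probability p of bouncing left, the probability that a ball lands in the
k-th bin from the right is

$$\frac{n!}{k!\,(n-k)!}\; p^k (1-p)^{n-k}$$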
In particular, for 12 rows (n=12) and a probability of bouncing left of 0.5 (p=0.5), we can
calculate the probability of being in the 5th bin from the right (k=5) as follows:
$$\frac{12!}{5!\,7!}\,(0.5)^5 (0.5)^7 \approx 0.193$$
In fact we can build the entire probability distribution for rows=12 and probability=0.5 like
this:
Bin number from right   12        11        10        9         8         7         6
Probability             0.000244  0.002930  0.016113  0.053711  0.120850  0.193359  0.225586

Bin number from right   5         4         3         2         1         0
Probability             0.193359  0.120850  0.053711  0.016113  0.002930  0.000244
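The table above can be reproduced with the binomial formula $\frac{n!}{k!(n-k)!} p^k (1-p)^{n-k}$. The following is a minimal sketch; the function name `bin_probability` is an illustrative choice.

```python
from math import comb

def bin_probability(n, k, p):
    """Probability of landing in bin k for n rows and bounce probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 12, 0.5

# Build the whole distribution, listing bins from 12 down to 0 as in the table.
distribution = {k: bin_probability(n, k, p) for k in range(n, -1, -1)}
for k, prob in distribution.items():
    print(f"bin {k:2d} from right: {prob:.6f}")
```

Because p = 0.5, the probabilities are symmetric about the middle bin (k = 6), and they add up to 1.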
KEY POINTS
ASSESSMENT
1. The data below and the accompanying histogram give the weights, to the nearest
hundredth of a gram, of a sample of 100 coins (each with a value of P10). The mean
weight is 8.69 grams and the standard deviation is approximately 0.055 gram.
[Figure: histogram of the coin weights. Vertical axis: density (0 to 8).]
2. Fifty students were asked to run a 100-meter dash. The data below give the time (in seconds)
each student took to finish the dash, together with the histogram. The mean time for the 50
students is 15.8 seconds, and the standard deviation s is approximately 3.29 seconds.
16 14 14 16 21 14 17 15 16 21
14 10 9 20 12 12 19 11 15 14
18 18 13 18 23 8 20 13 16 23
16 17 15 18 17 16 13 15 18 19
12 12 15 17 14 16 17 16 16 21
Soln:
a. Very close, the median is 8.7 grams.
b. 70%, 92%, 100%.
c. According to the empirical rule, the chances are 68%, 95% and 99.7%.
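The percentages 70%, 92% and 100% in (b) can be reproduced from the 100-meter dash data of item 2. The sketch below assumes the population form of the standard deviation (`statistics.pstdev`), which gives the stated value of about 3.29 seconds; the helper name `share_within` is illustrative.

```python
from statistics import mean, pstdev

# Times (in seconds) of the 50 students in the 100-meter dash.
times = [
    16, 14, 14, 16, 21, 14, 17, 15, 16, 21,
    14, 10, 9, 20, 12, 12, 19, 11, 15, 14,
    18, 18, 13, 18, 23, 8, 20, 13, 16, 23,
    16, 17, 15, 18, 17, 16, 13, 15, 18, 19,
    12, 12, 15, 17, 14, 16, 17, 16, 16, 21,
]

m = mean(times)
s = pstdev(times)  # assumption: population standard deviation

def share_within(data, center, spread, k):
    """Proportion of values within k standard deviations of the mean."""
    lower, upper = center - k * spread, center + k * spread
    return sum(lower <= x <= upper for x in data) / len(data)

shares = [share_within(times, m, s, k) for k in (1, 2, 3)]
print(f"mean = {m:.1f}, standard deviation = {s:.2f}")
for k, share in zip((1, 2, 3), shares):
    print(f"within {k} standard deviation(s): {share:.0%}")
```

These observed shares can then be set beside the 68%, 95% and 99.7% given by the empirical rule.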
3. Toss a fair coin twice and let X be the number of heads obtained. Generate the histogram
for the distribution. Consider tossing the fair coin three times, five times, ten times,
fifteen times; generate the histogram for the number of heads also for these cases.
As the number of tosses increases, what curve can be used to approximate the histogram?
(a) Probability Distribution for number of heads in two tosses of a fair coin
x         0      1      2
P(X=x)    0.25   0.5    0.25
[Figure: histogram of the number of heads in two tosses of a fair coin.]
(b) Probability Distribution for number of heads in three tosses of a fair coin
x         0       1       2       3
P(X=x)    0.125   0.375   0.375   0.125
[Figure: histogram of the number of heads in three tosses of a fair coin.]
(c) Probability Distribution for number of heads in five tosses of a fair coin
x         0         1         2        3        4         5
P(X=x)    0.03125   0.15625   0.3125   0.3125   0.15625   0.03125
[Figure: histogram of the number of heads in five tosses of a fair coin.]
(d) Probability Distribution for number of heads in ten tosses of a fair coin
x         0          1          2          3          4          5
P(X=x)    0.000977   0.009766   0.043945   0.117188   0.205078   0.246094

x         6          7          8          9          10
P(X=x)    0.205078   0.117188   0.043945   0.009766   0.000977
[Figure: histogram of the number of heads in ten tosses of a fair coin.]
(e) Probability Distribution for number of heads in fifteen tosses of a fair coin
x         0          1          2          3          4          5
P(X=x)    0.000031   0.000458   0.003204   0.013885   0.041656   0.091644

x         6          7          8          9          10
P(X=x)    0.152740   0.196381   0.196381   0.152740   0.091644

x         11         12         13         14         15
P(X=x)    0.041656   0.013885   0.003204   0.000458   0.000031
[Figure: histogram of the number of heads in fifteen tosses of a fair coin.]
(f) As the number n of tosses increases, the normal curve can be used to approximate the
histogram of the number of heads in n tosses of a fair coin (a binomial probability
distribution).
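The distributions in (a) to (e) can be generated, and compared with a normal curve, using the sketch below. It relies on the standard facts that the number of heads in n tosses has mean $np$ and standard deviation $\sqrt{np(1-p)}$; these formulas and the name `heads_distribution` are not part of the exercise and are introduced here only for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

def heads_distribution(n, p=0.5):
    """Probabilities of 0, 1, ..., n heads in n tosses of a coin."""
    return [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

gaps = {}
for n in (2, 3, 5, 10, 15):
    probs = heads_distribution(n)
    # Normal curve with the same mean and standard deviation as the binomial.
    curve = NormalDist(mu=n * 0.5, sigma=sqrt(n * 0.25))
    # Largest difference between a histogram bar and the curve at the same x.
    gaps[n] = max(abs(prob - curve.pdf(x)) for x, prob in enumerate(probs))
    print(f"n = {n:2d}: largest gap between bar and curve = {gaps[n]:.4f}")
```

The gap for fifteen tosses is much smaller than the gap for two tosses, in line with the statement in (f).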
4. Suppose that the weights of Filipino grade 11 students are normally distributed with a
mean of 52 kilograms and a standard deviation of 1 kilogram. Explain what this means in
terms of the properties of a normal distribution.
Solution. Consider the population of all grade 11 Filipino students, and let X denote their
weight (in kg). Then $X \sim N(52, 1)$. That is,
(i) the average weight, the most likely weight, and the median weight are all 52 kg.
(ii) weights of the grade 11 Filipino students as a whole are symmetric about the weight
52 kg.
(iii) Around 68% of weights are from 51 to 53 kg ($\mu \pm \sigma$); around 95% of weights are
from 50 to 54 kg ($\mu \pm 2\sigma$); and around 99.7% of weights are from 49 to 55 kg ($\mu \pm 3\sigma$).
(iv) A histogram of weights of Filipino grade 11 students creates a bell-shaped curve,
with the percentages of high and low weights dropping off exponentially.
The following data pertain to the points scored by a high school basketball team in 28
games:
66 75 41 57 54 82 67
42 60 37 49 87 101 78
60 66 48 43 42 61 64
67 37 51 63 68 77 13
The average of the points scored is 59.14286, while the standard deviation is 17.856 and the
histogram is provided below.
[Figure: histogram of the basketball scores. Horizontal axis: points scored (0 to 100); vertical axis: density (0 to .03).]
Can we approximate the distribution of scores with a normal curve, and what does this mean
in terms of the properties of a normal curve?
Solution. Yes, we can approximate the basketball scores distribution with a normal curve.
Let X denote the basketball score. Then, approximately, $X \sim N(59.14, 17.86^2)$. That is,
(i) the average score, the most likely score, and the median score are all about 59 points.
(ii) scores in the basketball games as a whole are symmetric about 59 points.
(iii) Around 68% of scores are from 41 to 77 points ($\mu \pm \sigma$); around 95% of scores are from
23 to 95 points ($\mu \pm 2\sigma$); and around 99.7% of scores are from 6 to 113 points ($\mu \pm 3\sigma$).
(iv) A histogram of the basketball scores creates an approximate bell-shaped curve, with
the percentages of high and low scores dropping off exponentially.
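The intervals in (iii) follow from the mean and standard deviation of the 28 scores. The sketch below recomputes them and also counts how many games actually fall inside each interval; it assumes the population form of the standard deviation (`statistics.pstdev`), which matches the stated 17.856, and the variable names are illustrative.

```python
from statistics import mean, pstdev

# Points scored by the basketball team in 28 games.
scores = [
    66, 75, 41, 57, 54, 82, 67,
    42, 60, 37, 49, 87, 101, 78,
    60, 66, 48, 43, 42, 61, 64,
    67, 37, 51, 63, 68, 77, 13,
]

m = mean(scores)
s = pstdev(scores)  # assumption: population standard deviation

intervals = {}
for k in (1, 2, 3):
    lower, upper = m - k * s, m + k * s
    # Rounded endpoints, as reported in the solution.
    intervals[k] = (round(lower), round(upper))
    # Observed share of games inside the unrounded interval.
    share = sum(lower <= x <= upper for x in scores) / len(scores)
    print(f"within {k} sd: {intervals[k]} points, observed share = {share:.1%}")
```

Comparing the observed shares with 68%, 95% and 99.7% gives students a concrete way to judge how well the normal curve approximates these scores.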