Probability Distributions for Continuous Data
Continuous data can take on any value within the range of possible values, so describing the distribution of continuous data in a table is not very practical.
[Figure: probability density curve for systolic blood pressure]
Shapes of Probability Density Curves
There are many possible shapes for the probability density curves of continuous data.
Right Skewed
Left Skewed
Bimodal
Multimodal
The most commonly used probability distribution in the study of statistics is the normal distribution.

Normal Distribution
The Normal Distribution is also called the Gaussian Distribution after Karl Friedrich Gauss, a German mathematician (1777-1855).
Characteristics of any Normal Distribution
Bell-shaped curve
Unimodal: peak is at the mean
Symmetric about the mean
Mean = Median = Mode
Tails of the curve extend to infinity in both directions
PubH 6414 Lesson 6 Part 1
Normal Distribution
Q: Is every variable normally distributed?
A: No. There are skewed (asymmetric) distributions and there are bimodal distributions.
Q: Then why do we spend so much time studying the normal distribution?
A: Two answers:
1. Many variables in health research are normally distributed
2. More importantly: many statistical tests are based on the normal distribution
Symbol Notation
A convention in statistics notation is to use Roman letters for sample statistics and Greek letters for population parameters. Since the density curve describes the population, Greek letters are used for the mean (μ, mu) and SD (σ, sigma).
Symbol              Sample   Density Curve
Mean                X̄        μ
Standard Deviation  s        σ

Describing Normal Distributions
Every Normal distribution is uniquely described by its mean (μ) and standard deviation (σ).
The notation for a normal distribution is N(μ, σ).
N(125, 4) refers to a normal distribution with mean = 125 and variance = 16 (standard deviation = 4).
[Figure: normal density with mean = 5 and σ = 1]
[Figure: two normal densities with different mean values and the same σ]

The 68-95-99.7 Approximation for all Normal Distributions
Regardless of the mean and standard deviation of the normal distribution:
68% of the observations fall within one standard deviation of the mean
95% of the observations fall within approximately two standard deviations of the mean
99.7% of the observations fall within three standard deviations of the mean
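The approximation can be checked numerically. The sketch below is an illustration only, not part of the course materials: it uses Python's standard-library `statistics.NormalDist`, and the helper name `area_within` is made up for this example. It computes the area within k standard deviations of the mean for the standard normal and for a normal distribution with mean 125 and SD 14 (the SBP values used elsewhere in these slides).

```python
from statistics import NormalDist

def area_within(k, mu=0.0, sigma=1.0):
    """Probability that a Normal(mu, sigma) value lies within k SDs of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

for k in (1, 2, 3):
    # The area depends only on k, not on the particular mean or SD
    print(k, round(area_within(k), 4), round(area_within(k, 125, 14), 4))
```

The two columns printed for each k agree, which is the sense in which the rule holds "regardless of the mean and standard deviation".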
Distributions of Blood Pressure
[Figure: the 68-95-99.7 approximation applied to the distribution of blood pressure; x-axis ticks at 83, 97, 111, 125, 139, 153, 167]
Areas under the Curve
What if you wanted to find the probability of a man having SBP < 105 mmHg?
We want the area below 105.
The 68-95-99.7 rule can't be used to find this area under the curve, because 105 is not a whole number of standard deviations from the mean.
[Figure: normal curve of SBP in mmHg with the area below 105 marked; x-axis ticks at 83, 97, 111, 125, 139, 153, 167]

Calculating the Areas under the Curve
Calculating area (or probability) under a normal distribution curve is a numeric problem involving integration of the formula for the normal distribution (see page 77 of text). This is not an easy calculation. Other options are:
Table A-2 in the text is a table of areas under the standard normal curve, the normal distribution with mean = 0 and standard deviation = 1.
The NORMDIST function in Excel can be used to find the area under a normal distribution density curve.
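To make the "integration" remark concrete, here is a minimal sketch in Python (standard library only) that integrates the normal density numerically with the trapezoid rule and compares the result with a cumulative distribution function. The function names `normal_pdf` and `area_below`, the step count, and the choice to start 10 SDs below the mean are assumptions made for this illustration; `NormalDist.cdf` stands in for Excel's NORMDIST(x, mean, sd, 1).

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density: exp(-(x - mu)^2 / (2 sigma^2)) / (sigma * sqrt(2 pi))."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def area_below(x, mu, sigma, steps=10_000):
    """Trapezoid-rule area under the density from mu - 10 SD up to x."""
    lo = mu - 10 * sigma  # the area further to the left is negligible
    h = (x - lo) / steps
    total = 0.5 * (normal_pdf(lo, mu, sigma) + normal_pdf(x, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(lo + i * h, mu, sigma)
    return total * h

p_integrated = area_below(105, 125, 14)   # P(SBP < 105) by integration
p_cdf = NormalDist(125, 14).cdf(105)      # same area from the built-in CDF
print(p_integrated, p_cdf)
```

Both numbers should agree to several decimal places, which is why a table or a spreadsheet function is normally used instead of integrating by hand.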
[Figure: normal curve of SBP in mmHg with the area above 150 marked]
The probability that a man has SBP > 150 = 0.037
Areas under the Curve
What if you wanted to find the probability of a man having SBP between 115 and 135?
We want the area between 115 and 135.
[Figure: normal curve of SBP in mmHg with the area between 115 and 135 marked; x-axis ticks at 83, 97, 111, 125, 139, 153, 167]

Using NORMDIST function
What is the probability that a man has SBP between 115 and 135 mmHg?
For area between two values, subtract the area to the left of the smaller value from the area to the left of the larger value:
=NORMDIST(135, 125, 14, 1) - NORMDIST(115, 125, 14, 1) = 0.52
The probability that a man has SBP between 115 and 135 mmHg = 0.52
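The same subtraction can be sketched outside Excel. This example assumes Python's standard-library `statistics.NormalDist` as a stand-in for NORMDIST with the cumulative flag set to 1; the helper name `prob_between` is hypothetical.

```python
from statistics import NormalDist

sbp = NormalDist(mu=125, sigma=14)  # SBP distribution used in the slides

def prob_between(dist, low, high):
    """P(low < X < high) = area left of high minus area left of low."""
    return dist.cdf(high) - dist.cdf(low)

p = prob_between(sbp, 115, 135)
print(round(p, 2))  # 0.52
```

Because the interval is symmetric about the mean, the same area could also be found as 1 minus twice the lower-tail area below 115.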
Standard Normal Transformation
Any normal distribution of some variable X can be transformed to a standard normal distribution by the following calculations:
Subtract the mean (μ) from each value of X
Divide each resulting difference by the standard deviation (σ)
These transformed variables are called Z-scores. They are sometimes referred to as z-variables, z-values, or standard scores.

Formula for Z-score
$$Z = \frac{X - \mu}{\sigma}$$
Z is calculated by subtracting the mean (μ) from X and dividing by the standard deviation (σ).
Subtracting the mean centers the distribution at 0.
Dividing by σ rescales the standard deviation to 1.
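A short sketch of the transformation in Python follows. The readings are made-up numbers used only for illustration, and `z_score` is a hypothetical helper name; the point is that after subtracting the mean and dividing by the SD, the transformed values have mean 0 and SD 1.

```python
from statistics import mean, pstdev

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

# Hypothetical SBP readings in mmHg, invented for this example
readings = [97, 111, 125, 139, 153]
mu, sigma = mean(readings), pstdev(readings)
z_values = [z_score(x, mu, sigma) for x in readings]

# Mean and SD of the transformed values: centered at 0, rescaled to SD 1
print(round(mean(z_values), 6), round(pstdev(z_values), 6))
```

Here `pstdev` treats the five readings as the whole population, which matches the use of μ and σ as population parameters.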
[Figure: transforming a normal curve into the standard normal curve: subtract the mean, then divide by the standard deviation]

Standard Normal Scores
The z-score is interpreted as the number of SD an observation is from the mean.
Z = 1: The observation lies one SD above the mean
Z = 2: The observation lies two SD above the mean
Z = -1: The observation lies one SD below the mean
Z = -2: The observation lies two SD below the mean
Since the area under the curve = 1.0, 50% of the area is on either side of the mean.
Therefore, the probability of an observation being greater than 0 = 0.50 and the probability that an observation is less than 0 = 0.50.

Standard Normal Distribution with 95% area marked
95% of the probability is between z = -1.96 and z = 1.96 on the standard normal curve.
Using NORMSDIST
What is the probability that a man has SBP > 150?
First calculate the Z-score for 150:
$$Z = \frac{150 - 125}{14} = 1.79$$
In Excel use =1 - NORMSDIST(1.79) = 0.0367
The probability that a man has SBP > 150 = 0.037.
This is the same as the probability obtained using the NORMDIST function.

Using NORMSDIST
What is the probability that a man has SBP < 105?
Calculate the z-score for 105 from the normal distribution with μ = 125 and σ = 14:
$$Z = \frac{105 - 125}{14} = -1.43$$
In Excel use =NORMSDIST(-1.43) = 0.076
This agrees with the result using NORMDIST(105, 125, 14, 1), which is 0.0766; the small difference comes from rounding the z-score to two decimals.
The probability that a randomly selected man has SBP < 105 is approximately 0.076.
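The two routes (standardize first, or work directly with the original distribution) can be compared in a few lines of Python. This is a sketch under the assumption that `statistics.NormalDist()` plays the role of NORMSDIST and `NormalDist(125, 14)` the role of NORMDIST; the function names are hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist()        # mean 0, SD 1
sbp = NormalDist(mu=125, sigma=14)    # original SBP distribution

def upper_tail_via_z(x, mu, sigma):
    """P(X > x): convert x to a z-score, then use the standard normal."""
    z = (x - mu) / sigma
    return 1 - standard_normal.cdf(z)

def lower_tail_via_z(x, mu, sigma):
    """P(X < x): convert x to a z-score, then use the standard normal."""
    z = (x - mu) / sigma
    return standard_normal.cdf(z)

# Each pair shows the z-score route next to the direct route
print(upper_tail_via_z(150, 125, 14), 1 - sbp.cdf(150))
print(lower_tail_via_z(105, 125, 14), sbp.cdf(105))
```

Unlike the hand calculation, the code does not round the z-score to two decimals, so the two routes agree to many decimal places.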
Inverse problem: Ex. 1
Find a z value such that the probability of obtaining a larger z score = 0.10.
[Figure: standard normal curve with the upper-tail area = 0.10 marked]
What is this z score?

NORMSINV function in Excel
Find the z-score such that the probability of having a larger z-score = 0.10.
NORMSINV(0.10) returns the z-score such that the probability of being < Z = 0.10.
If the area to the right of the z-score = 0.1, then the area to the left of the z-score = 1 - 0.1 = 0.9.
Use NORMSINV(0.9) = 1.28
The probability that a z-score is greater than 1.28 = 0.10
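The inverse lookup can be sketched in Python as well. This assumes the standard-library method `NormalDist.inv_cdf` as a stand-in for NORMSINV; the helper name `z_with_upper_tail` is made up for the example.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, SD 1

def z_with_upper_tail(area):
    """z such that P(Z > z) = area, found from the lower-tail area 1 - area."""
    return standard_normal.inv_cdf(1 - area)

z = z_with_upper_tail(0.10)
print(round(z, 2))  # 1.28
```

Like NORMSINV, `inv_cdf` works with the area to the left, so the upper-tail area has to be converted to 1 - 0.10 = 0.90 first.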
Human conception & the normal curve
X ~ N(266, 16)
Using the 68% - 95% - 99.7% approximation.

What percent of the data fall above 266 days?
1. 5%
2. 34%
3. 50%
4. 68%
5. 81.5%

What percent of the data fall below 234 days?
1. 2.5%
2. 5%
3. 34%
4. 50%
5. 81.5%

What percent of the data fall between 250 days and 298 days?

The top 16% of pregnancies last approximately how many days?
The standard normal curve.

WHICH EQUATION WILL GIVE YOU THE RED SHADED AREA UNDER THE CURVE?
1. P(Z<1)
2. P(0<Z<1)
3. P(Z>1)
4. P(Z=1)
5. P(Z1)

WHICH EQUATION WILL GIVE YOU THE RED SHADED AREA UNDER THE CURVE?
1. P(Z<1)
2. P(0<Z<1)
3. P(Z>-1)
4. P(Z=-1)
5. P(Z-1)

WHICH EQUATION WILL GIVE YOU THE RED SHADED AREA UNDER THE CURVE?
1. P(Z2)
2. P(Z<-1)-P(Z<2)
3. 1-P(Z<-1)
4. P(Z<2)-P(Z-1)
5. 1-P(Z-2)
1. P(Z0.62)=z*
2. P(Z>z*)=0.62
3. 1-0.62=P(Z<z*)
4. P(Z<z*)=0.62
5. 1- P(Zz*)= 0.62

Reading
Chapter 4 pgs. 76-80: Normal Distribution
Lesson 6 Practice Exercises
Work through the Excel Module 6 examples
Start Homework 4