
Marvine's Cliff Notes for Statistics

February 29, 2016

[1] Basics

For a specified population you can calculate a mean, μ, and a variance, σ², where the standard
deviation is the square root of the variance, i.e. σ = √σ². Furthermore, you can sample the
population, e.g. with a sample size n, to estimate population parameters. For a sample you can
calculate the mean of the sample, x̄, with a sample variance, s². Similarly, if categorical data is
involved you can use a sample proportion, p̂, to estimate a population proportion, p.
Using this notation, keep in mind that z-scores can be calculated using:

z = (x − μ) / σ

A standard normal distribution is a normally distributed random variable where the mean
equals zero, i.e. μ = 0, and the standard deviation equals 1, i.e. σ = 1. We can use z-scores and the
normal probability tables to evaluate the cumulative probability of a random variable. Note that we're using X to denote a normally distributed random variable, or normal random
variable, and Z to denote a standard normal random variable.
So, let's get into some of the equations. If X is a normally distributed random variable with
mean μ and standard deviation σ, then the cumulative normal probability table can be used to
compute P(a < X < b) by:

P(a < X < b) = P((a − μ)/σ < Z < (b − μ)/σ)

where Z denotes a standard normal random variable, a can be any decimal number or −∞, and
b can be any decimal number or ∞. The endpoints (a − μ)/σ and (b − μ)/σ are really the z-scores, where a and b are values of x. Notice that the nomenclature here denotes the mean and
standard deviation of a population.

Example 1.1:
Let X be a normal random variable with a mean = 10 and standard deviation = 2.5.
Calculate the probability that X < 14 or P(X<14).

Solution:
P(X < 14) = P(Z < (14 − 10)/2.5)
          = P(Z < 1.60)
          = 0.9452
where 0.9452 is found in the normal probability table using the value 1.60. This yields the area
of the left hand side of the normal distribution, i.e. the left hand tail. If you are looking for X >
14, or the right hand tail, you must subtract 0.9452 from 1, or 1 − 0.9452 = 0.0548.
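The table lookup in Example 1.1 can be double checked in Python's standard library. This is a sketch rather than part of the original notes, and it assumes Python 3.8+ for statistics.NormalDist:

```python
from statistics import NormalDist

# X ~ Normal(mu = 10, sigma = 2.5), as in Example 1.1
X = NormalDist(mu=10, sigma=2.5)

# Left hand tail: P(X < 14), equivalent to P(Z < 1.60)
left = X.cdf(14)

# Right hand tail: P(X > 14) = 1 - P(X < 14)
right = 1 - X.cdf(14)

print(round(left, 4))   # 0.9452, matching the table
print(round(right, 4))  # 0.0548
```

NormalDist.cdf computes the same cumulative probability the printed tables give, without first converting to a z-score.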

But what about a sample rather than a population? We use the nomenclature X̄ to designate the
sample mean as a random variable and x̄ to denote the values it takes. The sample mean has a mean
denoted by μ_X̄ and a standard deviation denoted by σ_X̄. Remember that the probability
distribution of a discrete random variable X is a list of each possible value of X together with the
probability that X takes that value in one trial. For an example of such a probability distribution
see Table 1 below. Also remember that each probability P(x) must be between 0 and 1, i.e.
0 ≤ P(x) ≤ 1, and the sum of all probabilities must equal 1, i.e. Σ P(x) = 1. Therefore, for a
sample we have the following equations:

μ_X̄ = Σ x̄ P(x̄)        σ_X̄ = √( Σ x̄² P(x̄) − μ_X̄² )

Example 1.2
For Table 1 find the mean and the standard deviation of the sample mean.
Table 1

For Table 1 above, the probability distribution of the sample mean is:

x̄:     152   154   156   158   160   162   164
P(x̄):  1/16  2/16  3/16  4/16  3/16  2/16  1/16

Now we can apply the equations above to get the mean of the random variable X̄ and the
corresponding standard deviation. That is,

μ_X̄ = Σ x̄ P(x̄)
     = 152(1/16) + 154(2/16) + 156(3/16) + 158(4/16) + 160(3/16) + 162(2/16) + 164(1/16)
     = 158

To find the standard deviation of the sample mean we must first calculate

Σ x̄² P(x̄) = 152²(1/16) + 154²(2/16) + 156²(3/16) + 158²(4/16) + 160²(3/16) + 162²(2/16) + 164²(1/16) = 24,974

Thus, σ_X̄ = √( Σ x̄² P(x̄) − μ_X̄² ) = √(24,974 − 158²) = √10 = 3.16


For this example, we have 16 samples. The population mean is 158; and its standard deviation
is 4.54. We also see that the mean of the sample means equals the mean of the population.
And, the variance in the sample means is less than the variance in the population. Therefore, if
we follow the rules of thumb we have discussed (e.g. the sample size n > 30 is sufficiently large,
etc.) for random samples of size n drawn from a population with a mean μ and standard
deviation σ, we can relate the sample mean and standard deviation of the sample mean to the
mean and standard deviation of the population using μ_X̄ = μ and σ_X̄ = σ/√n. This is the
crux of the Central Limit Theorem; and the basis of the concept that the larger the sample size
gets, the better the approximation. For more on this see Appendix XXX.
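The table-based calculation in Example 1.2 can be sketched in a few lines of Python, using the probability distribution of the sample mean read off above (a check, not part of the original notes):

```python
from math import sqrt

# Sampling distribution of the mean from Example 1.2
# (x-bar values with probabilities k/16)
dist = {152: 1/16, 154: 2/16, 156: 3/16, 158: 4/16,
        160: 3/16, 162: 2/16, 164: 1/16}

# mu = sum of x * P(x)
mu = sum(x * p for x, p in dist.items())

# sigma = sqrt( sum of x^2 * P(x) - mu^2 )
second_moment = sum(x**2 * p for x, p in dist.items())
sigma = sqrt(second_moment - mu**2)

print(mu)               # 158.0
print(round(sigma, 2))  # 3.16
```

The same two formulas work for any discrete probability distribution, not just a sampling distribution of the mean.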

[2] Tails of Probability Distributions


Having some introduction to the use of z-scores and the standard normal probability tables,
let's begin looking at some of their additional uses. For example, we may want to determine
how many students will get grades better than a specified score on an exam or how many
people are taller than a given height. To do this we first need to determine how to work with
the areas of the tails of a (normal) probability distribution. Consider Example 2.1 below.

Example 2.1:
Find the (cumulative) probability that Z is less than 1.48, i.e. P(Z < 1.48).

Solution:
Using the normal cumulative probability tables, find the value 1.4 on the vertical axis of the table
for positive Z and the value 0.08 on the horizontal axis. The intersection of these values is
0.9306. Therefore, the (cumulative) probability P(Z < 1.48) = 0.9306. This can be depicted as
shown in Figure 1 below.

Figure 1. Probability Computed Using the Normal Cumulative Probability Tables


But what if we're asked to find the probability (notice that I dropped the term "cumulative"
now) that Z lies within an interval? For example, can we find the probability P(0.5 < Z <
1.57)? Yes, we can! First we find the probability that Z is less than 1.57, then the probability
that Z < 0.5, and compute the difference. This is shown in the calculations and in Figure 2
below.

P(0.5 < Z < 1.57) = P(Z < 1.57) − P(Z < 0.5) = 0.9418 − 0.6915 = 0.2503

Figure 2. Computed Probability for an Interval of Finite Length
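The same difference-of-cumulative-probabilities calculation can be checked in Python (a sketch, assuming Python 3.8+ for statistics.NormalDist):

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mu = 0, sigma = 1

# P(0.5 < Z < 1.57) = P(Z < 1.57) - P(Z < 0.5)
prob = Z.cdf(1.57) - Z.cdf(0.5)
print(round(prob, 4))  # 0.2503, matching the table calculation
```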

There are practical uses for this. For example, the empirical rule says that the probability of the
value of a normal random variable falling within some number of standard deviations of the
mean is given. That is, we often talk about the probability of a random variable falling one
standard deviation from the mean; or one sigma equals a given probability which we convert to
a percent, two sigma equals a given probability, and so on. This comes from finding the interval
for a probability in the same way we just did. Consider P(−1 < Z < 1). Computing the
probability for the corresponding interval we get:

P(−1 < Z < 1) = 0.8413 − 0.1587 = 0.6826

That is, the probability of Z falling within ±1 standard deviation is 0.6826 or about 68%. The values
for ±2 (about 95%) and ±3 (about 99.7%) are found the same way.
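The empirical rule values can be reproduced the same way (a sketch; note the four-digit tables round slightly, e.g. the exact one-sigma probability is 0.6827 rather than the 0.6826 obtained from the tables):

```python
from statistics import NormalDist

Z = NormalDist()

# Probability of falling within k standard deviations of the mean
for k in (1, 2, 3):
    prob = Z.cdf(k) - Z.cdf(-k)
    print(k, round(prob, 4))
# 1 -> 0.6827, 2 -> 0.9545, 3 -> 0.9973
```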
This leads us to a discussion about the tails of the standard normal distribution and cutoff
values. Let's consider Example 2.2 below.

Example 2.2
Find the value of z* that cuts off the right hand tail of a standard normal distribution with an area a =
0.0250, i.e. P(Z > z*) = 0.0250.
Solution:

In this case the area of the right hand tail is known, i.e. a = 0.0250, but we want to know z*, the
value that cuts off this area on the right hand side. The trick is to keep in mind that the tables
are tables of cumulative probabilities. That is, the values begin from the left hand side of the
standard normal distribution and move to the right, starting from an area of zero and
accumulating area to the maximum of 1.0. So, we can't use the area given for the right hand
side in either table to find z*. In order to use the normal cumulative probability tables we have
to find the corresponding area on the left hand side, or 1 − a, in this case 1 − 0.0250 =
0.9750. This is the number we use for the area to look up in the tables. Using 0.9750 we find that
z* is 1.96. This is illustrated in Figure 3 below.

Figure 3. Finding the Value of z* Given the Right Hand Cutoff Area
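Looking up a value from an area is an inverse-CDF lookup, which NormalDist also provides (a sketch, assuming Python 3.8+):

```python
from statistics import NormalDist

Z = NormalDist()

# Right hand tail area a = 0.0250, so look up the left hand area 1 - a = 0.9750
a = 0.0250
z_star = Z.inv_cdf(1 - a)
print(round(z_star, 2))  # 1.96
```

inv_cdf plays the role of reading the table "backwards": given a cumulative area, it returns the z value that produces it.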


But what about cases when we need to determine, for a normally distributed random variable X
and a known area a, how to find the value of x* such that:

P(X < x*) = a or P(X > x*) = a

that is, the left hand tail or the right hand tail, whichever is required? Now we cannot directly
exploit the standard normal probability tables. In this case we need to have a bit more
information, e.g. the mean and standard deviation of the random variable X. Consider Example
2.3 below.

Example 2.3
Find x* for an area equal to 0.9332 when the mean and standard deviation of the normal
random variable X are μ = 10 and σ = 2.5.
Solution:
If this were a standard normal random variable and a textbook problem we could use the
given area to find z* in the tables. In this case, we can begin by looking at the tables for the z*
given an area equal to 0.9332 and find z* = 1.50. What this means is that x* will be 1.50
standard deviations above the mean. So we need to essentially "de-standardize" in order to
find x*. We can use the formula x* = μ + z*σ to do this de-standardization.

Using this formula we find:

x* = μ + z*σ = 10 + (1.50 × 2.5) = 13.75

This is shown in Figure 4 below.

Figure 4. Finding the Value of x* for a Given Cutoff Area
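The de-standardization step can be skipped entirely in code, since NormalDist accepts the mean and standard deviation directly (a sketch, assuming Python 3.8+):

```python
from statistics import NormalDist

# X ~ Normal(10, 2.5) as in Example 2.3; the area to the left of x* is 0.9332
X = NormalDist(mu=10, sigma=2.5)
x_star = X.inv_cdf(0.9332)
print(round(x_star, 2))  # 13.75

# Equivalent de-standardization by hand: x* = mu + z* * sigma
z_star = NormalDist().inv_cdf(0.9332)
print(10 + z_star * 2.5)  # same value
```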


So, now let's finally consider a more practical example for the use of all this. Consider Example
2.4 below.

Example 2.4
Assume that the scores on a standardized college entrance examination are normally
distributed with a mean of 510 and a standard deviation of 60. Further, suppose you worked at
a selective university, i.e. one that only admitted students with scores in the top 5% of this
entrance examination. If you wanted to determine what the minimum score would be to meet
that criterion how could you do that?
Solution:
First, let's assume that the scores on this examination are normally distributed. And, let X be
the random variable representing that distribution. We are given that μ = 510 and σ = 60.
Then, what we need to find is the score, x*, that produces a cutoff area of 0.05 on the right
hand side. This is similar to Example 2.2 above except that now we're looking for x* rather than
z*. But we've already seen how to de-standardize. So, first we find the left hand area, or:

a = 1 − 0.05 = 0.95

From the standard normal cumulative probability tables we find that z* is not listed for an area
of 0.95. In this case 0.95 lies exactly between two listed values, i.e. 0.9495 and 0.9505, so we
can take the average of the two to get z* = 1.645. Since we were given the mean and the
standard deviation we use our equation to de-standardize and get:

x* = μ + z*σ = 510 + (1.645 × 60) = 608.7

This is illustrated in Figure 5 below.

Figure 5. Finding the Cutoff Value for a Normally Distributed Random Variable
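The whole admission-cutoff calculation collapses to one inverse-CDF call in Python (a sketch, assuming Python 3.8+ for statistics.NormalDist):

```python
from statistics import NormalDist

# Exam scores ~ Normal(510, 60); we want the top 5% cutoff (Example 2.4)
scores = NormalDist(mu=510, sigma=60)

# Top 5% on the right means a left hand area of 0.95
x_star = scores.inv_cdf(0.95)
print(round(x_star, 1))  # 608.7
```

No table interpolation is needed; inv_cdf resolves 0.95 to z* = 1.6449 internally before de-standardizing.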


This is all well and good if we have a population that is small enough that we can compute the
relevant statistics directly, e.g. the mean and standard deviation. But sometimes we cannot do
that. In that case we resort to samples, with the mean of the sample denoted by x̄ and the
sample variance denoted by s², as above. And, sometimes we are interested in more than a point
estimate, i.e. using one single point to represent a large or very large population. That is what
we are really doing when we compute the mean of a sample. We are generating a point
estimate for the entire population. In reality, we usually want to generate a range of values
that is likely to contain the population parameter, e.g. the population mean, at a stated level
of confidence. This is called a confidence interval.

[3] Confidence Intervals


Confidence intervals take variations from sample to sample into consideration. They can
provide information about unknown population parameters in terms of levels of confidence,
e.g. 95% confident, 99% confident. Fortunately, we can use all the things we've already
covered to find and/or evaluate confidence intervals for given levels of confidence. Two
conditions must be true to reliably construct a confidence interval: 1) it must be a random
sample; and, 2) the sampling distribution must be approximately or nearly normal. Remember
that regardless of the underlying distribution, if the sample size is large enough the sampling
distribution will be nearly normal.
So, let's add another notation to our set. Let's let α represent the level of significance. Some
common values for α are 0.01, 0.05 and 0.10. For a given α, then 1 − α will be our confidence
level. For example, if α = 0.05 then our confidence level will be 0.95. In this case we can say we
are 95% confident of whatever it is we're talking about. Now we can take this and what we've
already learned to develop the concept of confidence intervals.
We already know how to find the area of an interval. We looked at this in Example 2.1. To
evaluate a confidence interval we are looking at the interval that spans an area of a normal
probability distribution. That is, given an α, e.g. α = 0.05, then the corresponding level of
confidence of 0.95 is in fact the area we need in order to find the cutoff values for the interval.
Let's look at another example, Example 3.1.

Example 3.1
Consider a case in which we want a level of significance, or α, of 0.05. As we've discussed, this
leads us to an area equal to 0.95 of a normal probability distribution. Now we want to
determine what we can about the cutoff values.
Solution:
We assume that a confidence interval is centered symmetrically about the mean of a normal
distribution. Given that symmetry, the right and left hand tails will correspond to areas equal to
α/2. Thus, for a standard normal distribution z* will be z_α/2, and for α = 0.05 this is z_0.025.
This is depicted in Figure 6 below.

Figure 6. Evaluating the Placement of Cutoff Values for a 95% Confidence Interval

In cases where we know the sample size, the sample mean, x̄, and the population standard
deviation, σ, then we can easily make evaluations for different levels of confidence. Consider
Example 3.2 below.

Example 3.2
Given a sample size of 49 with a sample mean of 35 and corresponding standard deviation of
14, evaluate a 98% confidence interval.
Solution:
Given a level of confidence of 98%, α = 1 − 0.98 = 0.02. So, using the standard normal
probability tables, z_α/2 = z_0.01 = 2.326.
Earlier we demonstrated that for samples we can use μ_X̄ = μ and σ_X̄ = σ/√n. Now we
need to expand this to cover our confidence interval. We can do this by considering Example
3.1 and what we have learned about sample means and their corresponding standard
deviations to develop an equation for the margin of error, denoted as E, built from the
standard error, SE = σ/√n, and the cutoff value for the sample:

E = z_α/2 · σ/√n

thus the confidence interval will be given by x̄ ± z_α/2 · σ/√n

Now, using our equation for a confidence interval we have:

x̄ ± z_α/2 · σ/√n = 35 ± 2.326 × 14/√49 = 35 ± 4.652 ≈ 35 ± 4.7

This means that we can be 98% confident that our population mean, μ, lies within the interval
(30.3, 39.7). To consider additional cases, e.g. when the population standard deviation is
unknown, look in Appendix A: Selecting the Appropriate Test Statistic.
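The steps of Example 3.2 can be sketched in Python (assuming Python 3.8+ for statistics.NormalDist; a check, not part of the original notes):

```python
from math import sqrt
from statistics import NormalDist

# Example 3.2: n = 49, sample mean 35, standard deviation 14, 98% confidence
n, xbar, sigma = 49, 35, 14
alpha = 1 - 0.98

z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}, i.e. z_0.01
E = z * sigma / sqrt(n)                  # margin of error

lower, upper = xbar - E, xbar + E
print(round(z, 3))                       # 2.326
print(round(lower, 1), round(upper, 1))  # 30.3 39.7
```

Changing the confidence level only changes `alpha`; the rest of the computation is identical.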

[4] Hypothesis Testing


Now we will make use of what we have learned to consider hypothesis testing. Hypothesis
testing allows us to evaluate statements about population parameters using samples. For
example, a manufacturer may claim that a fire safe will maintain its integrity for 75 minutes
during a fire. Hypothesis testing uses a statistical procedure to test the validity of such
statements.
To begin an evaluation using hypothesis testing both a null and an alternative hypothesis must
be formulated. The null hypothesis, denoted H0, is generally a claim such as the example
above, an expression of the status quo, or some other statement about a population
parameter. The null hypothesis is normally formulated to express the least favorable
position. For example, if an automobile manufacturer claimed that the average price of their
cars is at most $10,000, then H0: μ ≤ $10,000, and $10,000 is the least favorable position, or in
this case the least favorable price. The alternative hypothesis, denoted Ha, makes another
statement. For example, consumers may believe that the average price of the auto
manufacturer's cars is much more than $10,000. In this case the alternative hypothesis would
be expressed as Ha: μ > $10,000.
Hypothesis testing always proceeds with the assumption that the null hypothesis is true. When
we talk about the results of the test we will say that we either Reject H0, i.e. we reject the null
hypothesis and therefore accept the alternative hypothesis, or we fail to reject the null
hypothesis and therefore fail to accept the alternative hypothesis. We test a hypothesis by
establishing a critical value or values that determine a rejection region or regions. This is similar
to our work above where we established cutoff values and corresponding areas for the right
hand tail, left hand tail, or both. However, for hypothesis testing the critical
value(s) are denoted by C, where the cutoff value(s) before were denoted by x* or z*, depending
on whether or not we were using the standard normal distribution.
The way hypothesis testing works is that we use the criterion we are given to establish the null
and alternative hypotheses as well as a rejection region or regions. If the sample value of the
population parameter, e.g. x̄, falls within the rejection region then we reject the null hypothesis
and accept the alternative hypothesis; otherwise we fail to reject the null hypothesis and
correspondingly fail to accept the alternative hypothesis. In this case, μ0 is the value being
tested in the null hypothesis and we can say:

If Ha has the form Ha: μ < μ0 we reject H0 if x̄ is far to the left of μ0, i.e. to the left of the
critical value C, such that the rejection region is the interval (−∞, C);
If Ha has the form Ha: μ > μ0 we reject H0 if x̄ is far to the right of μ0, i.e. to the right of the
critical value C, such that the rejection region is the interval (C, ∞);
If Ha has the form Ha: μ ≠ μ0 we reject H0 if x̄ is far away from μ0 in either direction, i.e. to
the left or right of the critical values, such that the rejection region is (−∞, C_lower) ∪ (C_upper, ∞).

Now what we have to do is determine the critical value or values, C. We want to be confident
in our rejection of H0; or in other words, when H0 is true we want only a very small probability that the value x̄
will fall in the rejection region. So, we'll define α = 0.01 as defining a rare event, which
provides us with the confidence we want. Then, we have the situation as before. This is shown
in Figure 7.

Figure 7. Hypothesis Testing Rejection Regions


Consider Example 4.1 below.

Example 4.1
Suppose you are a bakery chef and have developed a recipe for the world's most delicious
chocolate cupcakes. Each of these cupcakes has 8 grams of fat per serving. Furthermore,
suppose that you know that the amount of fat in all the cupcakes baked (the population) is
normally distributed with a standard deviation of 0.15 grams. You want to make sure that, on
average, this is truly how much fat the cupcakes contain. You set aside 5 cupcakes for testing.
But, your testing equipment is not very accurate so you'll allow α = 0.10. How would you
validate this?
Solution:
You can use hypothesis testing to validate the statement that your cupcakes (on average) have
8 grams of fat per serving, where the serving size is one cupcake. Establish the null hypothesis
as H0: μ = 8.0. Thus, the alternative hypothesis will be Ha: μ ≠ 8.0.
Remember from before that μ_X̄ = μ = 8.0 and σ_X̄ = σ/√n = 0.15/√5 = 0.067.

Since Ha is two-sided we have rejection regions for both the left and right hand tails.
Therefore, we are looking for rejection regions each with an area equal to α/2 = 0.10/2 = 0.05. Using
the normal probability tables we find that this corresponds to z_0.05 = 1.645. So, the critical
values will be 1.645 standard deviations of X̄ to the right and left of the mean 8.0.
We still have one last thing to do. We have to de-standardize this.

C = 8.0 − (1.645)(0.067) = 7.89 and C = 8.0 + (1.645)(0.067) = 8.11

Therefore, for our sample of 5 cupcakes there will be a less than 10% chance that we'll find a
mean of 7.89 grams of fat or less, or 8.11 grams of fat or more, when the true mean is 8.0. If we
do indeed find that, we will have to reject the null hypothesis!
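The critical values in Example 4.1 can be reproduced with the standard library (a sketch, assuming Python 3.8+ for statistics.NormalDist):

```python
from math import sqrt
from statistics import NormalDist

# Example 4.1: H0: mu = 8.0, sigma = 0.15, n = 5, alpha = 0.10 (two-sided)
mu0, sigma, n, alpha = 8.0, 0.15, 5, 0.10

se = sigma / sqrt(n)                     # standard deviation of the sample mean
z = NormalDist().inv_cdf(1 - alpha / 2)  # 1.645 for alpha/2 = 0.05

lower, upper = mu0 - z * se, mu0 + z * se  # critical values C
print(round(se, 3))                        # 0.067
print(round(lower, 2), round(upper, 2))    # 7.89 8.11
```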

In reality, it is more commonly the case that the population standard deviation is unknown.
Then, we use the t-distribution rather than the normal distribution, and the respective test
statistics are shown below.

Type of Test                              Test Statistic
One sample test for mean, σ known         z = (x̄ − μ0)/(σ/√n)
One sample test for mean, σ unknown       t = (x̄ − μ0)/(s/√n)

where μ0 is the hypothesized mean of the population.
We also use the t-distribution when the sample size is small, i.e. less than n = 30.
There are essentially two types of errors that occur in hypothesis testing. These are shown in
Table 2 below. One type of error occurs when we reject the null hypothesis even though it is
actually true. This is generally considered the worst type of error and is labeled a Type I error.
The second type of error is to fail to reject the null hypothesis when it is actually false. This is
labeled a Type II error.
Table 2. Possible Outcomes of Hypothesis Testing

                          True State of Nature
Our Decision              H0 is true          H0 is false
Do not reject H0          Correct decision    Type II error
Reject H0                 Type I error        Correct decision

Now we need to say a bit more about α. When we talked about confidence intervals we said
that α denoted the significance level. That is, the significance level reflects how much risk we
are willing to take in our results. Before, if α = 0.05 we were implicitly saying that we wanted to
be 95% confident we would be right, or conversely that we were willing to take a 5% risk of being
wrong. This established the critical values we used as criteria to accept or reject our results.
Now, we'll talk about the significance level as the probability of rejecting the null hypothesis if it
is true. So for α = 0.05 what we mean is that we are willing to accept that 5% of the time we
will reject the null hypothesis even though it is true.
This is directly related to P values. That is, the P value is the probability of obtaining a result
from sampling that is equal to or more extreme than what was actually observed, assuming the
null hypothesis is true. So, if the P value is less than or equal to the significance level we reject
the null hypothesis. Another way to say this is that the P value is the probability of an
observation at least as favorable to the alternative hypothesis as the one we actually observed.
Luckily for us, this all boils down to the P value being equal to the tail area that we've already
learned how to calculate. To clarify, let's consider another example.

Example 4.2
Let's suppose you're still working at your job as a bakery chef but now you want to greatly reduce
the amount of fat in your delicious cupcakes. Wouldn't it be great if you could invent
something that actually had negative fat? Let's set up an experiment to see if we have invented
a recipe that has fat, on average, equal to −4.0 grams. Since this would be such a breakthrough
we'll include many more cupcakes in our observation, i.e. 90 cupcakes. The mean of the
sample turns out to be −5.033 and the standard deviation 3.567. And, the minimum fat
measured is −13.511 and the maximum 4.490.
Solution:
First let's set up our hypotheses.

H0: average grams of fat μ = −4
Ha: average grams of fat μ ≠ −4

In this case we do not know what the population standard deviation will be. What we have in
the problem description is the sample standard deviation. So, we will use the data given in the
problem description with the t-distribution. Our test statistic is:

t = (x̄ − μ0)/(s/√n) = (−5.033 + 4)/(3.567/√90) = −2.747

Because we have our hypothesis set up as an equality this will be a two-sided t-test, i.e. t <
−2.747 and t > 2.747 will both be considered a rare event or extreme. Next we compare our test
statistic to the t-distribution with n − 1 degrees of freedom to find the P value. Unfortunately,
the resolution for most t-distribution tables does not extend to 89 degrees of freedom so some
interpolation is required. You can use a program such as R to calculate the value, or it is easy
enough to do a linear interpolation between values in the tables to get:

P(t_89 < −2.747) = 0.00364

Since the t-distribution is symmetric we have the same result for t > 2.747. So, the P value will
be:

P = 2 × 0.00364 = 0.00727

which means that the null hypothesis is rejected at the 1% significance level. That is, P =
0.00727 is less than α = 0.01.
There is yet another way to look at this problem. Let's go back to our original problem
statement. We have a sample size of n = 90. Let's also assume that we want our level of
significance to be 1%. Then, we can also use the t-distribution tables to find a critical value given
these parameters. That value is 2.632. So if we draw our typical diagram showing the rejection
regions we have:

[Diagram: two-sided rejection regions beyond the critical values ±2.632, with the test statistic
t = −2.747 falling inside the left hand rejection region.]

Either way, it doesn't look like we've validated that we made our cupcakes with negative fat!
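The test statistic from Example 4.2 is easy to check in Python. The standard library has no t-distribution CDF, so this sketch only computes the t-score and compares it to the ±2.632 critical values quoted in the text (a P value would need R or a stats library):

```python
from math import sqrt

# Example 4.2: H0: mu = -4, sample of n = 90 cupcakes,
# sample mean -5.033, sample standard deviation 3.567
mu0, n, xbar, s = -4.0, 90, -5.033, 3.567

t = (xbar - mu0) / (s / sqrt(n))
print(round(t, 3))  # -2.747

# Compare against the two-sided critical values at alpha = 0.01,
# i.e. +/-2.632 from the t-tables with 89 degrees of freedom
print(abs(t) > 2.632)  # True -> reject H0
```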


[5] Two Sample Tests and Analysis of Variance


Up to now we've only considered cases where we have one population or one sample. There
are many times that we may want to consider the differences between two populations and so
compare their means, proportions or other population parameters. Cases such as these follow
from what we've done already.
For example, we might set up a hypothesis to compare the parameter of one population to
the same parameter of a second population, e.g. the average of one group is greater than or
less than that of the other group. Let's let D0 be the hypothesized value of the difference between
the population parameters. Then:

H0: population parameter (1) − population parameter (2) ≥ D0
Ha: population parameter (1) − population parameter (2) < D0

produces a lower-tailed test. Conversely,

H0: population parameter (1) − population parameter (2) ≤ D0
Ha: population parameter (1) − population parameter (2) > D0

produces an upper-tailed test. For these cases remember which parameter is subtracted from which, keep signs
straight, etc. When we want to validate that the population parameter of one population
varies from that of another population, then we set up our null hypothesis as an equality and perform a
two-tailed test.

H0: population parameter (1) − population parameter (2) = D0
Ha: population parameter (1) − population parameter (2) ≠ D0

To properly perform this hypothesis testing we need to ensure that the samples from the
respective populations have all the properties discussed before and in addition they are
independent. That is, the samples are drawn without reference to and have no connection to
each other.
In fact, there are several tests that involve two samples including:

Two sample test for means, σ² known
Two sample test for means, σ² unknown and assumed unequal
Two sample test for means, σ² unknown but assumed equal
Paired two sample test for means
Two sample test for equality of variances

Selecting the proper test statistic involves determining whether or not the populations'
standard deviations are known and whether or not they are assumed to be equal. Typically the
populations' standard deviations will be unknown. Furthermore, it will probably be difficult to
support the variances of two populations being equal. So, the most common case is the two
sample test for means, σ² unknown and assumed unequal, which requires the use of the t-test.
We'll also introduce one more new notation. To remind ourselves to use the t-distribution we'll
call our test statistic the t-score. This is entirely analogous to the z-score as before. Let's work
through an example.

Example 5.1
One classic example is evaluating whether or not there is a relationship between a mother
smoking (designated by the letter s), or not (designated by the letter n), and a baby's average
birth weight. Our data consists of 50 samples for which the mother was a smoker and 100
samples for which the mother was a non-smoker. Table 3 summarizes our data. We'll establish
our level of significance at 0.050, i.e. α = 0.050.
Table 3

Solution:
The null hypothesis will represent the case where there is no difference. So we set up our
hypotheses as:
H0: μn − μs = 0 (no difference in birth weight)
Ha: μn − μs ≠ 0 (there is a difference in birth weight)
We check the conditions and see that the data comes from a random sample and consists of
less than 10% of all cases, so we believe the observations to be independent; and, the sample
sizes are both greater than 30, so we believe the distribution will be nearly normal. Therefore,
the t-score is appropriate. Since we are comparing the difference between two population
means, with some manipulation¹ the equation for our t-score becomes:

t = (x̄1 − x̄2 − D0) / √( s1²/n1 + s2²/n2 )

¹ For the derivation of this equation see Appendix C: Calculating the t-score and Confidence Interval for the
Difference of Two Means.


where D0 for H0 equals 0. I have actually found a couple different ways of calculating the
number of degrees of freedom for comparing the means of two populations. In this case we'll
set the number of degrees of freedom equal to the smaller sample size minus one, i.e. ns − 1 = 49.

t = (0.40 − 0) / 0.26 = 1.54

Keep in mind that here we have the t-score and need to find the corresponding area or P value.
Again, we go to the t-tables and find that the P value falls between 0.100 and 0.200, i.e. the P
value is greater than 0.050. Therefore, we fail to reject the null hypothesis. Or, stating this
another way, there is insufficient evidence in this data set to say that there is a statistically
significant difference in a baby's birth weight if the mother smokes.
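The t-score formula above can be sketched as a small Python function. Table 3's per-group standard deviations are not reproduced in these notes, so the check below uses only the summary numbers from the worked example (observed difference 0.40, standard error 0.26); the function itself shows the general formula:

```python
from math import sqrt

def two_sample_t(xbar1, xbar2, s1, s2, n1, n2, d0=0.0):
    """t-score for the difference of two means, sigmas unknown and unequal."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    return (xbar1 - xbar2 - d0) / se

# Check using the summary numbers from Example 5.1 directly:
# observed difference 0.40, combined standard error 0.26
t = (0.40 - 0) / 0.26
print(round(t, 2))  # 1.54
```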

In summary, to use the t-distribution to perform statistical inference the general procedure is:
1. Write the appropriate hypothesis.
2. Verify the required conditions for using the t-distribution.
a. Samples must be independent and nearly normal. For large sample sizes the
condition that the sample distribution be nearly normal can be relaxed.
b. When comparing population parameters between two populations, the samples
must be independent and each sample must satisfy the conditions in 2.a above.
3. Calculate the point estimate for the parameter of interest, e.g. the mean, the corresponding
standard error², and the degrees of freedom. A software program such as R can be used to
calculate the degrees of freedom, or, taking a conservative approach, the smaller of n1 − 1 or
n2 − 1 can be used for the degrees of freedom.
4. Calculate the t-score and find the corresponding P value.
5. State the results.

[6] Appendix A: Selecting the Appropriate Test Statistic


There are really 2 cases: 1) the population standard deviation σ is known, as in Example 3.2; and,
2) the population standard deviation is not known, i.e. σ is unknown. When the population
standard deviation is unknown we can use:

x̄ ± z_α/2 · s/√n

² See Appendix B: Sampling and Estimation for a discussion about the Standard Error.

Where, as at the beginning, s² is the sample variance. We'll look at this more closely in our
discussion of hypothesis testing. For now, similar to the empirical rule which covers specific
numbers of standard deviations, there are common levels of confidence, i.e. 90%, 95%, 98%,
and 99%. Correspondingly these correlate to specific cutoff values, i.e. 1.645, 1.96, 2.326 and
2.576 respectively. This means we can reduce our equations for the confidence interval, e.g.
for 95%, to:

x̄ ± 1.96 σ/√n or x̄ ± 1.96 s/√n

depending on whether the population standard deviation is known or unknown.

[7] Appendix B: Sampling and Estimation

[8] Appendix C: Calculating the t-score and Confidence Interval for the Difference of
Two Means

