
3 Statistical Inference with one sample from a population


3.3.1 Introduction
A hypothesis is a theory that has neither been proven nor
disproven.
A statistical test will never prove nor disprove a hypothesis with
100% certainty.
Statistical Hypotheses
A statistical hypothesis is one that can be tested using a random
sample or samples.
It relates to a parameter of the population, commonly the
population mean or proportion e.g.
a) 30% of the adult population smoke.
b) On average, Americans are heavier than Japanese.
In a statistical test we test between two opposing hypotheses, the
null hypothesis $H_0$ and the alternative $H_A$ (sometimes referred to
as $H_1$).
The null hypothesis always contains an equality e.g. in case a) we
test $H_0: p = 0.3$.
In case b), the hypothesis in words states that two population
means (say $\mu_X$: mean mass of Americans and $\mu_Y$: mean mass of
Japanese) are unequal.
This cannot be the null hypothesis. The null hypothesis in this
case is $H_0: \mu_X = \mu_Y$ i.e. on average Americans and Japanese
weigh the same.
One-tailed and two-tailed tests
The alternative $H_A$ can be either directional (one-tailed) or
non-directional (two-tailed).
Two-tailed alternatives simply state that there is a difference.
One-tailed alternatives state what type of difference there is.
e.g. in case a) $H_A: p \neq 0.3$ would be the appropriate two-tailed
alternative hypothesis i.e. the proportion of smokers in the
population simply differs from 30%.
In case b) the appropriate one-tailed alternative hypothesis would
be $H_A: \mu_X > \mu_Y$ i.e. on average Americans weigh more than
Japanese.
We use two-tailed alternatives when the initial hypothesis (written
in words) i) does not state what sort of difference is expected and
ii) comes from a source that is unknown or can be assumed to
be unbiased.
This is true in case a), hence we use a two-tailed alternative.
This is not true in case b): the hypothesis given in words states
that on average Americans weigh more than Japanese.
Hence, the alternative in this case is one-tailed i.e. $H_A: \mu_X > \mu_Y$.
Suppose the hypothesis is given by a source that is likely to be biased,
for example, a producer's hypothesis regarding one of his products.
The alternative should state that the product is worse than the
producer states, i.e. if a car producer states that his car consumes
$\mu_0$ litres/100km, then the alternative should be that the car
consumes more petrol i.e. $H_A: \mu > \mu_0$.
General procedure of statistical testing
Before the test is carried out, it is assumed that the null hypothesis
($H_0$) is correct.
The test is based on a sample of observations. If the sample is
close to what we would expect under $H_0$, then we DO NOT
REJECT $H_0$.
This does not indicate that $H_0$ is true, only that it is a reasonable
hypothesis given our data.
If the sample is far away from what we would expect under $H_0$,
then we accept that $H_A$ is correct i.e. WE REJECT $H_0$.
In this case it is likely that $H_0$ is false, but we can never be 100%
sure of this.
3.3.2 Types of Error
A type I error is committed when a true null hypothesis is rejected.
A type II error is committed when a false null hypothesis is not
rejected.
Suppose a drug company is testing whether a new drug is more
effective than the presently used drug.
The null hypothesis here is that the two drugs are equally effective.
The alternative is that the new drug is better than the
old drug.
A type I error occurs when the test indicates that the new drug is
better, but in fact it is not ($H_0$ is incorrectly rejected).
This leads to costs, since such a conclusion implies the
introduction of a new drug which is no better (and possibly worse)
than the old drug.
A type II error occurs when the test indicates that the new drug is
not better, but in fact it is ($H_0$ should be rejected, but is not).
This leads to costs, since such a conclusion implies that an
improved drug is not introduced to the market.
The probability of a type I error is the significance level of a test
and is denoted by $\alpha$.
The significance level is defined by the person carrying out the test.
Clearly, the significance level of a test should be small (commonly
5%).
The stronger the evidence required to reject $H_0$, the lower the
significance level (e.g. when the costs of rejecting $H_0$ are high, we
wish to avoid wrongly rejecting $H_0$ and so we reduce $\alpha$).
However, the more we try to avoid type I errors (as the significance
level is decreased), the more likely type II errors become.
The probability of a type II error is denoted by $\beta$.
The power of a test is defined to be $1 - \beta$.
This is the probability of correctly rejecting a false null hypothesis.
The power of a test cannot be measured in practice, since it
depends on the (unknown) value of the parameter which is the
subject of the test.
The power of a test
For example, consider the test of the hypothesis that 30% of the
adult population smoke.
$H_0: p = 0.3$ versus $H_1: p \neq 0.3$.
If in reality p = 0.5, the power of this test will be greater than if
p = 0.4 (the further $H_0$ is from reality, the more likely it is that we
reject it).
For a fixed significance level, the power of a test increases as the
sample size increases.
Ideally, we would like a test to have a low significance level and
high power (i.e. the likelihood of either type of error is small).
This can only be achieved if we have a large sample.
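The two claims about power can be checked numerically. The following is a minimal sketch, assuming the large-sample normal approximation for a sample proportion and a two-sided test at the 5% level; the function name `approx_power` and the sample sizes 50 and 200 are illustrative choices, not values from the notes.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(p_true, n, p0=0.3, alpha=0.05):
    """Approximate power of the two-sided z-test of H0: p = p0
    when the real population proportion is p_true (an assumed value)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for a two-sided test
    se_null = sqrt(p0 * (1 - p0) / n)              # standard error assumed under H0
    se_true = sqrt(p_true * (1 - p_true) / n)      # actual spread of the sample proportion
    upper = p0 + z_crit * se_null                  # H0 is rejected above this proportion
    lower = p0 - z_crit * se_null                  # ... or below this one
    sampling = NormalDist(p_true, se_true)         # approximate law of the sample proportion
    return (1 - sampling.cdf(upper)) + sampling.cdf(lower)

# Hypothetical sample sizes, chosen only to show the trend.
for n in (50, 200):
    print(n, round(approx_power(0.4, n), 3), round(approx_power(0.5, n), 3))
```

In each row the power is larger for p = 0.5 than for p = 0.4, and both values grow when n goes from 50 to 200. When `p_true` equals `p0` the function returns the significance level, as it should.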
3.3.3 The process of hypothesis testing
The procedure is as follows:
1. State $H_0$ and $H_A$.
2. Choose the appropriate test statistic, T. This can be
thought of as a measure of distance from $H_0$. If e.g.
$H_0: p = 0.3$ is true, then we expect that close to
30% of our sample will smoke (there will be some
random variation around this population proportion).
3. Calculate the realisation of the test statistic, t, based
on the sample.
4. Either a) calculate the p-value of the test (this is a
measure of the credibility of the null hypothesis;
statistical packages give this value). If the p-value of
the test is less than the significance level then we
reject $H_0$, or
b) determine the appropriate critical value (this is a
critical distance from $H_0$). If the realisation of the
test statistic exceeds this critical value then we reject
$H_0$.
5. Based on the p-value of the test (or the critical value
and realisation of the test statistic) state your
conclusion in words.
3.3.4 Testing hypotheses for a population mean ($\sigma$ known
or n > 30)
The null hypothesis is $H_0: \mu = \mu_0$ ($\mu_0$ given).
The test statistic is
$$Z = \frac{\bar{X} - \mu_0}{\mathrm{S.E.}(\bar{X})}$$
Given that the null hypothesis is true, this statistic has
approximately a standard normal distribution (independently of the
distribution that the observations come from).
Note that when the sample mean is close to $\mu_0$, then the realisation
of the test statistic is close to 0. In this case, we do not reject $H_0$.
Realisations of the test statistic far from 0 correspond to the
sample mean being significantly different from $\mu_0$. In this case,
we should (in general) reject $H_0$.
If the population variance $\sigma^2$ (or standard deviation $\sigma$) is known,
we use
$$\mathrm{S.E.}(\bar{X}) = \frac{\sigma}{\sqrt{n}}.$$
When $\sigma$ is unknown, we use
$$\mathrm{S.E.}(\bar{X}) \approx \frac{s}{\sqrt{n}}.$$
Calculation of the p-value for two-sided tests
Suppose the alternative is non-directional i.e. $H_A: \mu \neq \mu_0$.
The p-value of the test is given by $p = P(|Z| > |t|) = 2P(Z > |t|)$,
where t is the realisation of the test statistic.
The p-value is the probability that, given $H_0$ is true, a randomly
chosen sample favours the alternative more than the sample
observed.
Low p-values indicate that $H_0$ should be rejected.
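As a sketch of how a statistical package might obtain this value, the two-sided p-value can be computed from the standard normal distribution function in Python's standard library. The function name `two_sided_p_value` and the realisation t = -2.5 are illustrative assumptions.

```python
from statistics import NormalDist

def two_sided_p_value(t):
    """p = P(|Z| > |t|) = 2 P(Z > |t|) for a standard normal Z."""
    return 2 * (1 - NormalDist().cdf(abs(t)))

# Illustrative realisation of the test statistic.
print(round(two_sided_p_value(-2.5), 5))
```

Because only |t| enters the formula, realisations of t and -t give the same p-value.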
Interpretation of the p-value
p > 0.05 indicates that there is no evidence against $H_0$ (do not
reject at the 5% level).
0.01 < p < 0.05 indicates that there is evidence against $H_0$ (reject
at the 5% level but not at the 1% level).
0.001 < p < 0.01 indicates that there is strong evidence against
$H_0$ (reject at the 1% level but not at the 0.1% level).
p < 0.001 indicates that there is very strong evidence against $H_0$
(reject at the 0.1% level).
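The verbal scale above can be written as a small lookup. This is only a sketch: the function name `evidence_against_h0` is hypothetical, and the scale does not say what to do when p is exactly 0.05, 0.01 or 0.001, so assigning those boundary values to the weaker category is an assumption made here.

```python
def evidence_against_h0(p_value):
    """Map a p-value to the verbal scale; boundary handling is an assumption."""
    if p_value < 0.001:
        return "very strong evidence against H0"
    if p_value < 0.01:
        return "strong evidence against H0"
    if p_value < 0.05:
        return "evidence against H0"
    return "no evidence against H0"

# Illustrative p-values, one per category.
for p in (0.2, 0.03, 0.005, 0.0005):
    print(p, evidence_against_h0(p))
```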
The critical value of such a test
The critical value of such a test at a significance level of $\alpha$ is
$Z_{\alpha/2} = t_{\infty,\alpha/2}$.
We reject $H_0$ if and only if $|t| > Z_{\alpha/2} = t_{\infty,\alpha/2}$.
It should be noted that if $|t| > Z_{\alpha/2} = t_{\infty,\alpha/2}$ then the p-value is
less than $\alpha$.
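Instead of reading the critical value from a table, it can be obtained from the inverse of the standard normal distribution function. The sketch below uses Python's standard library; the function names are illustrative.

```python
from statistics import NormalDist

def two_sided_critical_value(alpha):
    """Point with upper-tail probability alpha/2 under the standard normal."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def reject_two_sided(t, alpha):
    """Reject H0 if and only if |t| exceeds the critical value."""
    return abs(t) > two_sided_critical_value(alpha)

print(round(two_sided_critical_value(0.05), 3))
print(round(two_sided_critical_value(0.01), 3))
```

The two printed values agree with the usual table entries 1.96 and 2.576. The rule `reject_two_sided` gives the same decision as comparing the two-sided p-value with the significance level.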
Example 3.3.1
The average weight of a sample of 100 students is 72kg with a
standard deviation of 12kg.
Test the hypothesis that on average students weigh 75kg at a
significance level of 1%.
i) First, we state the hypotheses
$H_0: \mu = 75$; $H_A: \mu \neq 75$.
ii) Second, we choose the appropriate test statistic. For a
hypothesis regarding the population mean with one large sample,
we use
$$Z = \frac{\bar{X} - \mu_0}{\mathrm{S.E.}(\bar{X})}.$$
We use the approximation
$$\mathrm{S.E.}(\bar{X}) \approx \frac{s}{\sqrt{n}} = \frac{12}{\sqrt{100}} = 1.2$$
iii) We calculate the realisation of the test statistic
$$t = \frac{72 - 75}{1.2} = -2.5$$
iv) We can calculate the p-value of the test
$$p = 2P(Z > |t|) = 2P(Z > 2.5) = 2 \times 0.00621 = 0.01242$$
v) Based on this, we can state our conclusion.
Since $p > \alpha = 0.01$, we do not reject $H_0$ at a significance level of
1%.
Hence, we do not reject the hypothesis that the average weight of
students is 75kg.
Instead of calculating the p-value, we can base our conclusion on
the appropriate critical value.
iv) The critical value for a non-directional test is
$Z_{\alpha/2} = t_{\infty,\alpha/2} = t_{\infty,0.005} = 2.576$.
v) Based on this, we make our conclusion. Since |t| = 2.5 < 2.576,
we do not reject $H_0$.
Hence, we do not reject the hypothesis that the average weight of
students is 75kg.
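The arithmetic of Example 3.3.1 can be reproduced with a few lines of Python. This is a sketch using only the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, xbar, s = 100, 72.0, 12.0      # sample size, sample mean, sample standard deviation
mu0, alpha = 75.0, 0.01           # hypothesised mean and significance level

se = s / sqrt(n)                                  # approximate standard error of the mean
t = (xbar - mu0) / se                             # realisation of the test statistic
p_value = 2 * (1 - NormalDist().cdf(abs(t)))      # two-sided p-value
z_crit = NormalDist().inv_cdf(1 - alpha / 2)      # two-sided critical value
reject = p_value < alpha                          # decision based on the p-value

print(se, round(t, 2), round(p_value, 5), round(z_crit, 3), reject)
```

Both routes lead to the same decision: the p-value is above 0.01 and |t| is below the critical value, so the null hypothesis is not rejected at the 1% level.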
Duality between confidence intervals and two-sided tests
Result
Suppose we are testing $H_0: \mu = \mu_0$ against $H_A: \mu \neq \mu_0$. We
should reject $H_0$ at a significance level of $100\alpha\%$ if and only if $\mu_0$
does not belong to the $100(1 - \alpha)\%$ confidence interval for the
population mean.
e.g. we reject $H_0$ at a significance level of 5% if and only if $\mu_0$ does
not belong to the 95% confidence interval for the population mean.
Confidence level + Significance level = 100%.
Intuition: The values in the confidence interval are credible values
of the population mean at the appropriate significance level.
Hence, we can carry out the test $H_0: \mu = \mu_0$ against $H_A: \mu \neq \mu_0$
by calculating the appropriate confidence interval and basing our
conclusion on the confidence interval.
Example 3.3.2
The average weight of a sample of 100 students was 72kg with a
standard deviation of 12kg.
Test the hypothesis that on average students weigh 75kg at a
significance level of 1%.
Since the significance level is 1%, we calculate a 99% confidence
interval for the population mean.
Since we have a large sample, the confidence interval is given by
$$\bar{X} \pm t_{\infty,\alpha/2}\,\mathrm{S.E.}(\bar{X}) \approx \bar{X} \pm \frac{s\,t_{\infty,\alpha/2}}{\sqrt{n}}$$
We have
$\alpha = 0.01$, $t_{\infty,\alpha/2} = t_{\infty,0.005} = 2.576$.
The 99% confidence interval is given by
$$72 \pm \frac{12 \times 2.576}{\sqrt{100}} = 72 \pm 3.1 = [68.9, 75.1]$$
Since 75 belongs to this confidence interval, we do not reject $H_0$
(the hypothesis that the average weight of all students is 75kg).
Use of the duality theorem
Note that using duality we can test a number of different
hypotheses at a given significance level.
e.g. in this case, we would not reject the null hypothesis that the
average weight of students is 70kg at a significance level of 1%,
since 70 also belongs to the confidence interval.
By calculating the p-value (or realisation of the test statistic), we
can test a particular null hypothesis at various significance levels.
i.e. we can give more precise information on the weight of evidence
against a null hypothesis.
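The use of one confidence interval to test several hypothesised means can be sketched as follows. The function name `z_confidence_interval` is illustrative, and the third hypothesised value (68kg) is a hypothetical addition included only to show a rejection.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, s, n, confidence):
    """Large-sample confidence interval for a population mean."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * s / sqrt(n)
    return xbar - half_width, xbar + half_width

low, high = z_confidence_interval(72.0, 12.0, 100, 0.99)
print(round(low, 1), round(high, 1))

# 75 and 70 are discussed in the text; 68 is a hypothetical extra value.
for mu0 in (75.0, 70.0, 68.0):
    decision = "do not reject" if low <= mu0 <= high else "reject"
    print(mu0, decision)
```

Each hypothesised mean inside the interval is not rejected at the 1% level, and each one outside is rejected, without recomputing a test statistic.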
Right-sided tests
We consider two types of one-sided tests. The first are right-sided tests:
$H_0: \mu = \mu_0$; $H_A: \mu > \mu_0$.
In this case, we reject $H_0$ if the sample mean is significantly
greater than $\mu_0$.
The p-value is given by $p = P(Z > t)$.
Note, large positive realisations of the test statistic (associated
with small p-values) occur when the sample mean is significantly
greater than the hypothetical population mean $\mu_0$.
As before, the null hypothesis is rejected if $p < \alpha$.
The critical value is given by $Z_\alpha = t_{\infty,\alpha}$.
We reject $H_0$ if $t > Z_\alpha = t_{\infty,\alpha}$.
Left-sided tests
In this case we test between
$H_0: \mu = \mu_0$; $H_A: \mu < \mu_0$.
In this case, we reject $H_0$ if the sample mean is significantly
lower than $\mu_0$.
The p-value is given by $p = P(Z < t)$.
Note, large negative realisations of the test statistic (associated
with small p-values) occur when the sample mean is significantly
lower than the hypothetical population mean $\mu_0$.
As before, the null hypothesis is rejected if $p < \alpha$.
The critical value is given by $-Z_\alpha = -t_{\infty,\alpha}$. We reject $H_0$ if
$t < -Z_\alpha = -t_{\infty,\alpha}$.
Example 3.3.3
A manufacturer states that his light bulbs function on average for
1000hrs.
The mean working life of a sample of 81 bulbs was measured to be
920hrs with a standard deviation of 360hrs.
Is the manufacturer's claim reasonable at a significance level of 5%?
i) We state our hypotheses
$H_0: \mu = 1000$; $H_A: \mu < 1000$.
Note that this alternative states that the bulbs are worse than the
producer states, i.e. this is a left-sided test.
ii) The appropriate test statistic is
$$Z = \frac{\bar{X} - \mu_0}{\mathrm{S.E.}(\bar{X})}.$$
iii) We calculate the realisation of the test statistic
$$\mathrm{S.E.}(\bar{X}) \approx \frac{s}{\sqrt{n}} = \frac{360}{\sqrt{81}} = 40$$
$$t = \frac{920 - 1000}{40} = -2.$$
iv) The p-value for this test is
$$p = P(Z < t) = P(Z < -2) = P(Z > 2) = 0.02275.$$
v) Conclusion. Since $p < 0.05 = \alpha$, we reject $H_0$.
We have evidence that the statement of the producer is unfounded.
iv) We can also base our conclusion on the appropriate critical
value.
For a left-sided test, the appropriate critical value is
$-Z_\alpha = -t_{\infty,\alpha} = -1.645$.
v) Since $t = -2 < -Z_\alpha$, we reject $H_0$ at the 5% level.
We have evidence that the statement of the producer is unfounded.
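The left-sided test of Example 3.3.3 can be sketched in Python as follows, using only the standard library. The variable names are illustrative. Note that the realisation of the test statistic and the critical value are both negative.

```python
from math import sqrt
from statistics import NormalDist

n, xbar, s = 81, 920.0, 360.0     # sample size, sample mean, sample standard deviation
mu0, alpha = 1000.0, 0.05         # manufacturer's claim and significance level

se = s / sqrt(n)                          # approximate standard error of the mean
t = (xbar - mu0) / se                     # negative, since the sample mean is below mu0
p_value = NormalDist().cdf(t)             # left-sided p-value P(Z < t)
crit = NormalDist().inv_cdf(alpha)        # left-sided critical value (negative)
reject = t < crit                         # reject H0 when t lies below the critical value

print(se, t, round(p_value, 5), round(crit, 3), reject)
```

The decision from the critical value agrees with the decision from comparing the p-value with the significance level.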
3.3.5 Testing hypotheses for a population mean (with a
small sample, n < 30)
In this case we use the test statistic
$$T = \frac{\bar{X} - \mu_0}{\mathrm{S.E.}(\bar{X})},$$
where
$$\mathrm{S.E.}(\bar{X}) = \frac{s}{\sqrt{n}}.$$
Given $H_0$ is true, if the observations come from a normal
distribution, then this statistic has a Student t-distribution with
n - 1 degrees of freedom.
Note: if the observations come from a distribution which is not
normal, then this will not be true.
We cannot calculate p-values by hand using tables.
Hence, inference is based on the appropriate critical value read
from the table for the Student t-distribution (Table 7).
Again, the test statistic is a measure of how far the data are away
from $H_0$.
Two-sided tests
We reject the null hypothesis if and only if
$|t| > t_{n-1,\alpha/2}$, where $t_{n-1,p}$ satisfies $P(T > t_{n-1,p}) = p$ when T has a Student
t-distribution with n - 1 degrees of freedom.
Example 3.3.4
The average weight of a sample of 25 students was 72kg with a
standard deviation of 12kg.
Test the hypothesis that on average students weigh 75kg at a
significance level of 5%.
i) We state the hypotheses
$H_0: \mu = 75$; $H_A: \mu \neq 75$.
ii) We choose the appropriate test statistic
$$T = \frac{\bar{X} - \mu_0}{\mathrm{S.E.}(\bar{X})}.$$
Given $H_0$ is true and the data come from a normal distribution, this
statistic has a Student t-distribution with n - 1 degrees of freedom.
iii) We calculate the realisation of the test statistic
$$\mathrm{S.E.}(\bar{X}) = \frac{s}{\sqrt{n}} = \frac{12}{\sqrt{25}} = 2.4$$
$$t = \frac{72 - 75}{2.4} = -1.25.$$
iv) We read the appropriate critical value from the table for the
Student t-distribution.
Since this is a two-tailed test, the significance level is $\alpha = 0.05$
and the sample size is small, the critical value is
$t_{n-1,\alpha/2} = t_{24,0.025} = 2.064$.
v) We state our conclusion. Since
$|t| = 1.25 < t_{24,0.025} = 2.064$,
we do not reject $H_0$ (the hypothesis that the average weight of
students is 75kg).
It should be noted that weight does not have a normal distribution.
However, its distribution is not highly asymmetrical and the
number of observations is not very low.
Hence, the distribution of the test statistic will be reasonably close
to the Student t-distribution.
Also, the realisation of the test statistic is not particularly close to
the critical value. Hence, our conclusion seems reasonable.
Use of duality for two-sided tests
We can also use the duality between confidence intervals and two-sided
tests.
In this case since the significance level is 5%, the appropriate
confidence level is 95%. The appropriate confidence interval for
the population mean (n < 30) is given by
$$\bar{X} \pm t_{n-1,\alpha/2}\,\mathrm{S.E.}(\bar{X}) = \bar{X} \pm \frac{s\,t_{n-1,\alpha/2}}{\sqrt{n}} = 72 \pm \frac{12 \times 2.064}{\sqrt{25}} = 72 \pm 4.95 = [67.05, 76.95].$$
Since 75 belongs to this confidence interval, we do not reject $H_0$.
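The small-sample test of Example 3.3.4 and the matching confidence interval can be sketched together. Python's standard library has no Student t quantile function, so the critical value 2.064 is entered by hand as read from the table; the variable names are illustrative.

```python
from math import sqrt

n, xbar, s = 25, 72.0, 12.0   # sample size, sample mean, sample standard deviation
mu0 = 75.0                    # hypothesised mean
t_crit = 2.064                # two-sided 5% critical value with 24 df, taken from the table

se = s / sqrt(n)                              # standard error of the mean
t = (xbar - mu0) / se                         # realisation of the test statistic
reject = abs(t) > t_crit                      # two-sided decision
half_width = t_crit * se
ci = (xbar - half_width, xbar + half_width)   # 95% confidence interval

print(se, t, reject, tuple(round(v, 2) for v in ci))
```

The hypothesised mean lies inside the interval exactly when the test does not reject, which is the duality at work. If SciPy is available, `scipy.stats.t.ppf(0.975, 24)` could replace the table lookup.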
Right-sided tests
We consider two types of one-sided tests. The first are right-sided
tests.
These are tests of the form
$H_0: \mu = \mu_0$; $H_A: \mu > \mu_0$.
We reject $H_0$ only if the sample mean is significantly greater than
$\mu_0$.
This corresponds to realisations of the test statistic significantly
greater than 0.
Precisely, we reject the null hypothesis if and only if $t > t_{n-1,\alpha}$.
Left-sided tests
The second type of test is the left-sided test. These are tests of the
form
$H_0: \mu = \mu_0$; $H_A: \mu < \mu_0$.
We reject $H_0$ only if the sample mean is significantly smaller than
$\mu_0$.
This corresponds to realisations of the test statistic significantly
smaller than 0.
Precisely, we reject the null hypothesis if and only if $t < -t_{n-1,\alpha}$.
Example 3.3.5
A car producer states that one of his cars burns 6.2 litres of petrol
per 100km.
10 magazines tested the car. The average of their results was 6.5
litres/100 km with a standard deviation of 0.3 litres/100 km.
Is the statement of the producer reasonable at a 5% significance
level?
i) In this case the hypothesis $H_0: \mu = 6.2$ is from a producer.
The alternative states that the product is worse than the producer
states (i.e. consumes more petrol). Hence, $H_A: \mu > 6.2$.
ii) The test statistic is
$$T = \frac{\bar{X} - \mu_0}{\mathrm{S.E.}(\bar{X})},$$
where $\mathrm{S.E.}(\bar{X}) = \frac{s}{\sqrt{n}}$.
If the observations come from a normal distribution, then this has
a Student t-distribution with n - 1 degrees of freedom.
iii) We calculate the realisation of the test statistic.
$$\mathrm{S.E.}(\bar{X}) = \frac{s}{\sqrt{n}} = \frac{0.3}{\sqrt{10}} \approx 0.0949$$
$$t = \frac{6.5 - 6.2}{0.0949} \approx 3.16.$$
iv) We read the appropriate critical value. Since this is a
right-sided test the critical value is given by
$t_{n-1,\alpha} = t_{9,0.05} = 1.833$.
v) We state our conclusion. This is a right-sided test. Since
$t = 3.16 > t_{n-1,\alpha} = t_{9,0.05} = 1.833$,
we reject $H_0$ at a significance level of 5%.
Hence, there is evidence that the producer's statement is
unfounded.
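A short sketch of the right-sided test in Example 3.3.5 follows. The critical value 1.833 is entered by hand as read from the Student t table (9 degrees of freedom, 5% in the upper tail); the variable names are illustrative.

```python
from math import sqrt

n, xbar, s = 10, 6.5, 0.3     # number of tests, mean and standard deviation (litres/100km)
mu0 = 6.2                     # producer's stated consumption
t_crit = 1.833                # right-sided 5% critical value with 9 df, taken from the table

se = s / sqrt(n)              # standard error of the mean, about 0.0949
t = (xbar - mu0) / se         # realisation of the test statistic, about 3.16
reject = t > t_crit           # right-sided decision

print(round(se, 4), round(t, 2), reject)
```

Since 0.3 divided by the square root of 10 is roughly 0.0949, the statistic comes out at about 3.16, well above the critical value.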
3.3.6 Tests for a population proportion
We only consider such tests with large samples (n > 30).
The null hypothesis is $H_0: p = p_0$.
Under the null hypothesis the standard error of the sample
proportion, $\hat{p}$, is
$$\mathrm{S.E.}(\hat{p}) = \sqrt{\frac{p_0(1 - p_0)}{n}}$$
The test statistic,
$$Z = \frac{\hat{p} - p_0}{\mathrm{S.E.}(\hat{p})},$$
has approximately a standard normal distribution.
Note that this statistic is analogous to the statistic for large
sample tests for a population mean.
The test statistic is a measure of the distance between the sample
proportion and the hypothesised population proportion.
We reject $H_0$ if this difference is significantly large.
The p-values and critical values for such tests can be calculated in
the same way as for tests for the population mean with a large
sample.
Example 3.3.6
100 of 300 people stated that they wanted to vote for Fine Gael at
the next election.
Test the hypothesis that 30% of the population wish to vote for
Fine Gael at a significance level of 5%.
i) We state our hypotheses
$H_0: p = 0.3$; $H_A: p \neq 0.3$
Since we do not know where this hypothesis is from, we use a
two-sided test.
ii) The test statistic is
$$Z = \frac{\hat{p} - p_0}{\mathrm{S.E.}(\hat{p})}.$$
iii) We calculate the realisation of the test statistic
$$\hat{p} = \frac{100}{300} = \frac{1}{3}$$
$$\mathrm{S.E.}(\hat{p}) = \sqrt{\frac{p_0(1 - p_0)}{n}} = \sqrt{\frac{0.3 \times 0.7}{300}} = \sqrt{0.0007} \approx 0.02646.$$
Hence,
$$t = \frac{1/3 - 0.3}{0.02646} \approx 1.26.$$
iv) We can calculate the p-value of the test. For a two-sided test
$$p = 2P(Z > |t|) = 2P(Z > 1.26) = 2 \times 0.1038 = 0.2076.$$
v) Since $p > \alpha = 0.05$, there is no evidence that this proportion
deviates from 30% (we do not reject $H_0$).
Note that this conclusion can also be based on the appropriate
critical value.
For a two-sided test this is
$Z_{\alpha/2} = t_{\infty,\alpha/2} = t_{\infty,0.025} = 1.96$.
Since $|t| = 1.26 < t_{\infty,0.025} = 1.96$, we do not reject $H_0$ at a
significance level of 5%.
There is no evidence that the population proportion deviates from
30%.
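The calculation in Example 3.3.6 can be sketched in Python with the standard library. The variable names are illustrative. The standard error is computed from the hypothesised proportion, not from the sample proportion.

```python
from math import sqrt
from statistics import NormalDist

x, n = 100, 300               # number in favour and sample size
p0, alpha = 0.3, 0.05         # hypothesised proportion and significance level

p_hat = x / n                                   # sample proportion
se0 = sqrt(p0 * (1 - p0) / n)                   # standard error under H0
t = (p_hat - p0) / se0                          # realisation of the test statistic
p_value = 2 * (1 - NormalDist().cdf(abs(t)))    # two-sided p-value
z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # two-sided critical value
reject = abs(t) > z_crit

print(round(se0, 5), round(t, 2), round(p_value, 3), reject)
```

The p-value is a little above 0.2, so the null hypothesis is not rejected at the 5% level. A hand calculation that rounds the statistic to 1.26 before using the table gives almost the same figure.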
The duality between confidence intervals and two-sided tests also
works for tests for the population proportion.
However, when we calculate a confidence interval for a proportion,
the estimate of the standard error is based on the sample
proportion and not (as in the hypothesis test) on the supposed
population proportion.
Hence, the duality in this case is only approximate. For example, if
I base the conclusion of a test on a 99% confidence interval for a
population proportion, then the significance level is approximately
1%.
Example 3.3.7
100 of 300 people stated that they wanted to vote for Fine Gael at
the next election.
Calculate a 95% confidence interval for the proportion of the
population wishing to vote for Fine Gael.
On the basis of this confidence interval test the hypothesis that
30% of the population wish to vote for Fine Gael.
The 95% confidence interval for the population proportion is
$$\hat{p} \pm t_{\infty,\alpha/2}\,\mathrm{S.E.}(\hat{p}),$$
where
$$\mathrm{S.E.}(\hat{p}) \approx \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = \sqrt{\frac{1/3 \times 2/3}{300}} = \sqrt{0.000741} \approx 0.02722$$
$t_{\infty,\alpha/2} = t_{\infty,0.025} = 1.96$.
The confidence interval is given by
$$\hat{p} \pm t_{\infty,\alpha/2}\,\mathrm{S.E.}(\hat{p}) = \frac{1}{3} \pm 1.96 \times 0.02722 = 0.333 \pm 0.053 = [0.280, 0.386]$$
Since 0.3 belongs to this interval, we do not reject the null
hypothesis that 30% of the population wish to vote for Fine Gael.
The significance level of this test is approximately 5%.
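The interval in Example 3.3.7 can be reproduced with the sketch below, which also computes the standard error under the null hypothesis for comparison. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x, n = 100, 300               # number in favour and sample size
p0, confidence = 0.3, 0.95    # hypothesised proportion and confidence level

p_hat = x / n
se_hat = sqrt(p_hat * (1 - p_hat) / n)    # standard error based on the sample proportion
se0 = sqrt(p0 * (1 - p0) / n)             # standard error the hypothesis test would use
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
low, high = p_hat - z * se_hat, p_hat + z * se_hat

print(round(se_hat, 5), round(se0, 5))
print(round(low, 3), round(high, 3), low <= p0 <= high)
```

The two standard errors differ slightly, which is why a decision based on this interval corresponds to a significance level of only approximately 5%. Carrying full precision gives an upper limit closer to 0.387 than to the hand-rounded 0.386.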
