
Hypothesis Testing I

(The Case of a Large Sample)


EM 521

G. Köksal

METU, 2012
G. Köksal, UIUC, 2011
Contents
Construction of hypotheses
Type I and Type II errors
Statistical inference for the mean (large sample)
Confidence interval approach
Critical value approach
p-value approach
Power and sample size (large sample)

Recall the Filling Machine Example
A filling machine is designed to fill each bottle with
16 fl.oz. of beer. An engineer tested 20 bottles
randomly and obtained a sample mean of 15.9. A
normal plot was constructed and data fell
approximately along a straight line. It is known from
past experience that the standard deviation is 0.2
fl.oz.
Test: the true mean filling amount μ = 16 (vs. μ ≠ 16)

Hypothesis
Hypothesis: a statement about a population
parameter (true mean filling amount μ = 16)
Test the hypothesis to see if recalibration is necessary
Info from data: sample mean 15.9
Is the difference due to random variation, or to the fact
that μ ≠ 16?
Hypothesis testing: assess the evidence provided by
the data, reject/not reject the hypothesis

Null and Alternative Hypotheses
The null hypothesis, denoted by H0, is a tentative assumption
about a population parameter.
The alternative hypothesis, denoted by H1, is the opposite of
what is stated in the null hypothesis.
Example:
Null hypothesis H0: μ = 16
Alternative hypothesis H1: μ ≠ 16
Begin with the assumption that H0 is true
If data show strong evidence against H0, reject H0
Otherwise, fail to reject H0
Failing to reject H0 does not mean H0 is true
Failing to reject H0 roughly means that you grudgingly accept H0

U.S. Justice System
An accused person is presumed innocent
H0: not guilty
H1: guilty
To prove the accused person guilty, need strong
evidence against the not guilty hypothesis
Insufficient evidence: Fail to reject H0, not guilty
verdict rendered
This does not necessarily mean the accused is truly innocent

Formulation of Hypotheses
It is not always obvious how the null and alternative
hypotheses should be formulated.
Care must be taken to structure the hypotheses
appropriately so that the test conclusion provides the
information the researcher wants.
The context of the situation is very important in
determining how the hypotheses should be stated.
In some cases it is easier to identify the alternative
hypothesis first. In other cases the null is easier.
Correct hypothesis formulation will take practice.

Alternative Hypothesis as a Research
Hypothesis
Many applications of hypothesis testing involve an
attempt to gather evidence in support of a research
hypothesis.
In such cases, it is often best to begin with the
alternative hypothesis and make it the conclusion that
the researcher hopes to support.
The conclusion that the research hypothesis is true is
made if the sample data provide sufficient evidence to
show that the null hypothesis can be rejected.

Example
A random sample of 100 recorded deaths in a country
during the past year showed an average life span of
71.8 years with a standard deviation of 8.9 years.
Does this seem to indicate that the average life span
today is greater than 70 years?

H0: μ ≤ 70
H1: μ > 70

Null Hypothesis as an Assumption to be
Challenged
We might begin with a belief or assumption that a
statement about the value of a population parameter is
true.
We then use a hypothesis test to challenge the
assumption and determine if there is statistical
evidence to conclude that the assumption is
incorrect.
In these situations, it is helpful to develop the null
hypothesis first.

Example
The label on a soft drink bottle states that it contains 350 ml.
Null Hypothesis:
The label is correct. μ ≥ 350 ml.
Alternative Hypothesis:
The label is incorrect. μ < 350 ml.

Exercise
A new drug is developed with the goal of lowering blood
pressure more than the existing drug.
Consider the following hypotheses:
S1. The new drug lowers blood pressure more than the
existing drug.
S2. The new drug does not lower blood pressure more than
the existing drug.

Which of the following is a correct formulation of the
hypotheses?

A) H0: S1, H1: S2
B) H0: S2, H1: S1

Exercise
It is claimed that emissions from a certain type of
car do not exceed the required level, say μ0. Which
of the following hypotheses is appropriate to test?

A) H0: μ = μ0, H1: μ ≠ μ0
B) H0: μ ≥ μ0, H1: μ < μ0
C) H0: μ ≤ μ0, H1: μ > μ0

Exercise

A manufacturer claims that the mean weight of a filled
package of chicken is 1.5 kg. It is desired that the package is
not too small or too large. Which of the following hypotheses
is appropriate to test?

A) H0: μ = 1.5, H1: μ ≠ 1.5
B) H0: μ = 1.5, H1: μ < 1.5
C) H0: μ = 1.5, H1: μ > 1.5

Type I and Type II Errors

α = P(Type I Error) = P(Reject H0 | H0 is true)
The accused is innocent but judged guilty
β = P(Type II Error) = P(Fail to reject H0 | H0 is false)
The accused is guilty but judged not guilty
1 − β = Power of the test = P(Reject H0 | H0 is false)
The accused is guilty and judged guilty

Type I and Type II Errors
α = P(Type I Error) = P(Reject H0 | H0 is true)
Also called the significance level of the test
Determined by the decision maker
Typically taken as 0.05
Typically the largest accepted value of α is 0.10
β = P(Type II Error) = P(Fail to reject H0 | H0 is false)
Depends on the test statistic used
1 − β = Power of the test = P(Reject H0 | H0 is false)
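
The definitions above can be made concrete with a small computation. The sketch below assumes the two-sided z-test used for the filling machine in these slides (μ0 = 16, σ = 0.2, n = 20, α = 0.05). The true mean of 15.9 and the helper name `two_sided_z_power` are hypothetical choices for illustration, not values given on this slide.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_power(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of the two-sided z-test of H0: mu = mu0 when the mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # When the true mean is mu_true, Z0 is normal with this mean and variance 1.
    shift = (mu_true - mu0) / (sigma / sqrt(n))
    # beta = P(fail to reject H0) = P(-z_crit <= Z0 <= z_crit)
    beta = std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)
    return 1 - beta


# Hypothetical scenario: the machine actually fills 15.9 fl. oz. on average.
power = two_sided_z_power(mu0=16.0, mu_true=15.9, sigma=0.2, n=20)
beta = 1 - power
print(f"beta = {beta:.3f}, power = {power:.3f}")
```

When `mu_true` equals `mu0`, the same function returns α, because rejecting a true H0 is exactly a Type I error.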

Hypothesis Testing Approaches

In testing such hypotheses, we can use:


Confidence intervals
and/or
Critical values
and/or
p-values

Recall the Filling Machine Example
A filling machine is designed to fill each bottle with
16 fl. oz. of beer. The actual amount it fills varies
slightly from bottle to bottle. It is known from past
experience that the standard deviation is 0.2 fl. oz. An
engineer chooses 20 bottles randomly and measures
the amount of beer in each bottle. It gives a mean of
15.9 fl. oz.
Based on this information, can you conclude that the
mean amount of beer of all bottles produced is 16 fl.
oz.?

Solution Using CI
H0: μ = μ0 = 16
H1: μ ≠ μ0 = 16
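
The interval used in this approach can be recomputed as follows. This is a minimal sketch, assuming the usual two-sided 95% interval x̄ ± z_{.025} σ/√n and n = 20, the sample size that appears in the test-statistic computation elsewhere in these slides. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x_bar, sigma, n, mu0 = 15.9, 0.2, 20, 16.0  # n = 20 is an assumption
confidence = 0.95

# Two-sided interval: x_bar +/- z_{alpha/2} * sigma / sqrt(n)
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
half_width = z * sigma / sqrt(n)
lower, upper = x_bar - half_width, x_bar + half_width

covers_mu0 = lower <= mu0 <= upper
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
print("fail to reject H0" if covers_mu0 else "reject H0")
```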

The interval does not cover 16.000 fl. oz. Hence, with 95%
confidence, we can conclude that the process mean is not
16.000 fl. oz.

Testing Using Critical Values
H0: μ = μ0 = 16
H1: μ ≠ μ0 = 16
Assume H0 is true.
Reject H0 only if X̄ is significantly different from μ0.
How do we determine the significance of the difference?
If X̄ < μ0 − A or X̄ > μ0 + A,
we can be suspicious about the correctness of H0 and reject H0.

If we choose A = z_{α/2} σ/√n, then P(Type I error) = α.

Testing Using Critical Values

Z0 = (X̄ − μ0) / (σ/√n)

Testing Using Critical Values

[Figure: sampling distribution of z0 = (x̄ − μ0) / (σ/√n). Reject H0 in the two tails of area α/2 each (z0 < −z_{α/2} or z0 > z_{α/2}); do not reject H0 in between.]

Solution Using Critical Values

1. Determine the hypotheses.
   H0: μ = 16
   H1: μ ≠ 16

2. Specify the level of significance. α = .05

3. Compute the value of the test statistic.

   z0 = (x̄ − μ0) / (σ/√n) = (15.9 − 16.0) / (0.2/√20) = −2.236

Solution Using Critical Values

4. Determine the critical value and rejection rule.


For α/2 = .05/2 = .025, z_{.025} = 1.96
Reject H0 if z0 < −1.96 or z0 > 1.96

5. Determine whether to reject H0.

Because −2.236 < −1.96, we reject H0.


We are at least 95% confident that the mean amount
of beer in the produced bottles is not 16 fl. oz.
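
Steps 1 to 5 can be checked numerically. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the original slides.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu0, sigma, n, alpha = 15.9, 16.0, 0.2, 20, 0.05

# Step 3: test statistic z0 = (x_bar - mu0) / (sigma / sqrt(n))
z0 = (x_bar - mu0) / (sigma / sqrt(n))

# Step 4: two-sided critical value z_{alpha/2}
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Step 5: reject H0 when z0 falls in either tail
reject_h0 = z0 < -z_crit or z0 > z_crit
print(f"z0 = {z0:.3f}, critical value = {z_crit:.2f}, reject H0: {reject_h0}")
```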

Solution Using Critical Values

[Figure: sampling distribution of z0 = (x̄ − μ0) / (σ/√n) with rejection regions of area α/2 = .025 below −1.96 and above 1.96; the observed z0 = −2.24 falls in the lower rejection region.]

Testing Using p-Value
What is the smallest level of significance that would lead to
rejection of H0 with the given data?

[Figure: standard normal curve with two tail areas of (1/2) p-value each, beyond −z_{p/2} and z_{p/2}, compared with the tail areas α/2 beyond −z_{α/2} and z_{α/2}.]
Reject H0 if p-value ≤ α

Solution Using p-Value
4. Compute the p-value.

For z0 = −2.236, cumulative probability = .0125
p-value = 2(.0125) = .025

5. Determine whether to reject H0.

Because p-value = .025 < α = .050, we reject H0.
We are at least 95% confident that the mean amount
of beer in the produced bottles is not 16 fl. oz.
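
The p-value in step 4 can be reproduced without a normal table. This is a minimal sketch, assuming the two-sided p-value 2Φ(−|z0|); the variable names are illustrative. The exact value differs slightly from the table-based .025 because the table lookup rounds z0.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu0, sigma, n, alpha = 15.9, 16.0, 0.2, 20, 0.05

z0 = (x_bar - mu0) / (sigma / sqrt(n))

# Two-sided p-value: probability of a statistic at least as extreme as z0 under H0
p_value = 2 * NormalDist().cdf(-abs(z0))

reject_h0 = p_value < alpha
print(f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```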

Solution Using p-Value

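[Figure: standard normal curve; the observed z = −2.236 and its mirror image z = 2.236 each cut off a tail area of (1/2) p-value = .0125, which lies inside the tail area α/2 = .025 beyond −z_{α/2} = −1.96 and z_{α/2} = 1.96.]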

One p-Value Works For All
Using p-value for testing hypotheses
Can be used by different users with different significance
levels
Testing mean filling amount:
P(|Z0| > |z0| when H0 is true)
= 2Φ(−|z0|) = 2Φ(−2.236) = 2.5%
Reject H0 for α = 4%, 3%;
Fail to reject H0 for α = 2%, 1%

Extremeness
Measures extremeness of data
Area outside of ±z0 = ±2.236 = P(observing a value for Z0 at
least as extreme as z0 under H0)
z0 = −4 vs z0 = −2.236: different levels of extremeness
p-value when z0 = −4:
2Φ(−4) = 0.006%
Much stronger evidence against H0
Not directly shown in previously discussed critical value
based testing procedure
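
To see how one p-value serves every significance level, and how it reflects extremeness, the two test statistics can be compared side by side. This is a minimal sketch; the helper name `two_sided_p_value` is hypothetical, and the list of α values is the one used in the filling-machine discussion.

```python
from statistics import NormalDist


def two_sided_p_value(z0):
    """P(|Z0| >= |z0|) under H0, computed as 2 * Phi(-|z0|)."""
    return 2 * NormalDist().cdf(-abs(z0))


alphas = (0.04, 0.03, 0.02, 0.01)
decisions = {}
for z0 in (-2.236, -4.0):
    p = two_sided_p_value(z0)
    # The same p-value is compared against each user's own alpha.
    decisions[z0] = [p < alpha for alpha in alphas]
    print(f"z0 = {z0}: p-value = {p:.6f}, reject at {alphas}: {decisions[z0]}")
```

A critical-value test at α = 5% rejects H0 in both cases and reports nothing more, whereas the p-values show that z0 = −4 is far stronger evidence against H0.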

Exercise
A company is evaluating the quality of aluminum rods received
in a recent shipment. Diameters of aluminum alloy rods
produced on an extrusion machine are known to be normally
distributed with a standard deviation of 0.0001 in. A random
sample of 25 rods has an average diameter of 0.5046 in. Test
whether or not the mean rod diameter is 0.50455 in. using the
critical value approach and α = 0.05.
A) z0 = 2.50, we fail to reject that the mean is 0.50455
B) z0 = 2.50, we reject that the mean is 0.50455
C) z0 = 1.96, we reject that the mean is 0.50455
D) z0 = 1.96, we fail to reject that the mean is 0.50455

Solution
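
One way to work the exercise is sketched below, assuming the two-sided z-test with known σ from the preceding slides. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu0, sigma, n, alpha = 0.5046, 0.50455, 0.0001, 25, 0.05

# Test statistic and two-sided critical value
z0 = (x_bar - mu0) / (sigma / sqrt(n))
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# The p-value is not required by the critical value approach; shown for comparison.
p_value = 2 * NormalDist().cdf(-abs(z0))

reject_h0 = abs(z0) > z_crit
print(f"z0 = {z0:.2f}, critical value = {z_crit:.2f}, p-value = {p_value:.3f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Since z0 = 2.50 exceeds 1.96, H0 is rejected, which corresponds to choice B.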

Solution Using Minitab

Minitab: Stat>Basic Statistics>1-Sample Z


Solution Using Minitab
Minitab: Stat>Basic Statistics>One-Sample Z

Test of mu = 0.50455 vs not = 0.50455
The assumed standard deviation = 0.0001

 N      Mean   SE Mean                95% CI     Z      P
25  0.504600  0.000020  (0.504561, 0.504639)  2.50  0.012

Reject H0. (α = 0.05 > p = 0.012)

Test of mu = 0.50455 vs not = 0.50455
The assumed standard deviation = 0.0001

 N      Mean   SE Mean             99.73% CI     Z      P
25  0.504600  0.000020  (0.504540, 0.504660)  2.50  0.012

Fail to reject H0. (α = 0.0027 < p = 0.012)

Critical Values for Testing the Mean
(Normal or n ≥ 30, σ known)

p-Values for Testing the Mean
(Normal or n ≥ 30, σ known)

CIs for Testing the Mean
(Normal or n ≥ 30, σ known)

One-Sided Hypothesis Testing Example

One-Sided Hypothesis Testing Example
Critical Value Approach

One-Sided Hypothesis Testing Example
p-Value Approach

One-Sided Hypothesis Testing Example
CI Approach

What If We Test the Opposite?
If we had constructed the hypotheses as the opposite:
H0: μ ≤ 7980
H1: μ > 7980
Then the 99% lower confidence bound would be

x̄ − z_α σ/√n = 7847.16

Then I would not have strong evidence to claim that the mean
is strictly more than 7980, since lower values are also in the
interval [7847.16, ∞).
Hence, I would fail to reject H0. But compared to the previous
conclusion, this would be a weak one.

Power and Sample Size

Benzene in Exit Water Example

Statistical vs. Practical Significance
Filling machine H0: μ = 16, H1: μ ≠ 16
Suppose true mean μ = 15.98
Practically no significant difference
Test 20, 200, 2000 bottles, sample mean 15.98,
α = 5%
How does your decision change?
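
The question can be explored numerically. The sketch below assumes the same known standard deviation as in the filling machine example (σ = 0.2) and a two-sided z-test; the `results` dictionary and other names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, x_bar, sigma, alpha = 16.0, 15.98, 0.2, 0.05  # sigma = 0.2 is an assumption

results = {}
for n in (20, 200, 2000):
    # Same observed difference each time; only the sample size changes.
    z0 = (x_bar - mu0) / (sigma / sqrt(n))
    p_value = 2 * NormalDist().cdf(-abs(z0))
    results[n] = (z0, p_value, p_value < alpha)
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    print(f"n = {n:4d}: z0 = {z0:.3f}, p-value = {p_value:.6f}, {decision}")
```

Under these assumptions only the largest sample rejects H0, even though the 0.02 fl. oz. difference is the same in all three cases. A statistically significant result therefore need not be practically significant.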

Summary
Developing null and alternative hypotheses
α: Type I error probability
Determined by the decision maker
β: Type II error probability
1 − β: Power of the test
Depends on the test procedure
Can be increased by increasing sample size
One-sided and two-sided hypothesis tests
Hypothesis testing for the mean (large sample)
Confidence interval approach
Critical value approach
p-value approach
Power and sample size (large sample)
