
College Of Science, Engineering & Technology

School of Life and Physical Sciences

Foundation Studies

General Mathematics B

Hypothesis Testing Notes
Hypothesis Testing
1 Hypotheses

(i) What is a Hypothesis?

A hypothesis is an assumption: a statement made to explain a set of
facts and to form a basis for further investigation. It is understood that
the statement is subject to proof or checking.

(ii) Examples of hypotheses, or statements, made are:

- 25% of all males over the age of 50 are divorced,
- the average length of time spent combing one's hair is
6 minutes/day,
- trains run on time only 5% of the time,
- the average weekly income per family is $900 ,
- females spend more time watching television than males.

All these hypotheses have one thing in common. The populations of
interest are so large that for various reasons it would not be feasible
to study all the items, or persons, in the population.

(iii) What is Statistical Testing?

One of the major roles of statisticians in practice is to draw conclusions
from a set of data. This process is known as statistical inference but
it must always be borne in mind that, whatever conclusion is reached,
it can always be wrong. However, in many circumstances we can put
a probability on whether our conclusion is correct and so we can make
a decision that we could say is 'beyond reasonable doubt'.
This process is called statistical testing.

Statistical testing begins with a hypothesis - an assumption about the
value of a population parameter (usually the mean).
A sample is taken from the population, and the value of the sample
mean is calculated. A decision then has to be made. If there is no
significant difference between the sample mean and the hypothesised
value, the hypothesis may be accepted. However, if there is a
significant difference, the hypothesis may be rejected.

These decisions are made on the significance (or size) of the
difference.

(iv) In hypothesis testing, the hypothesis is not accepted or rejected with
absolute certainty, but with a definite level of confidence that the error
in the decision is small.

Hypothesis testing starts with an assumed value of the population
mean, and sample data is collected to test the assumption made with a
specified level of confidence.

(v) Null Hypothesis

The null hypothesis is the hypothesis that is to be tested.
The null hypothesis is denoted by H0 .

H - stands for hypothesis
0 - implies nothing has changed.

H0 : µ = µ 0 suggests that the population mean µ is as it is claimed to be;
in other words, there is no difference between what is observed and
what is claimed.

Generally speaking, the null hypothesis is set up for the purpose of
either rejecting or accepting it. Alternatively, it is a statement that will
be accepted if the sample data fail to provide us with convincing
evidence that it is false.

(vi) Alternative Hypothesis

The alternative hypothesis describes what you would believe by
rejecting the null hypothesis. It is denoted H1 , and read as 'H one'.
The alternative hypothesis (H1) will be accepted if the sample data
provide us with evidence that the null hypothesis (H0) should be
rejected.

The alternative hypothesis (H1) is the statement that will be accepted
if the data from the sample provide us with enough evidence that the
null hypothesis should be rejected (i.e. H0 is false).

(vii) Classification of Hypotheses

The specific wording of a hypothesis for a question should
always be expressed in terms of the data for that question.
For example, consider the following question:

Is the population mean µ equal to a specified value?

Hypotheses:

H0 : The population mean µ is equal to the specified value.
versus
H1 : The population mean µ is not equal to the specified value.

Another way of expressing the null and alternative hypotheses is
in the form of symbols. In the case where we have only a single
sample, the null hypothesis states that the population
mean µ takes on a specified value µ 0 . That is:

H0 : µ = µ 0

(viii) Two-tailed Test

In every case, the alternative hypothesis is the complement of the
null hypothesis.

A two-tailed test assumes no preconceived notions about the true
value of µ . That is, the true value of µ can be either above or
below the hypothesized value of µ 0 .

The alternative hypothesis is then written as:

H1 : µ ≠ µ 0

(ix) One-tailed Test

A one-tailed test assumes that there is a stronger conviction about
the true value of µ . That is, the true value of µ can be greater
than µ 0 . In this case:

H0 : µ = µ 0
H1 : µ > µ 0 , or ,

the true value of µ can be less than µ 0 . In this case:

H0 : µ = µ 0
H1 : µ < µ 0

(x) Significance Levels

The null hypothesis is rejected if the probability of obtaining a
result as unlikely as the one which was obtained is small.
How small? This is a question to which the answer is arbitrary
and depends to some extent on the use that is to be made of the
investigation. It is necessary to choose a probability and agree
that probabilities below this are 'unlikely'. The value which is
chosen is called the significance level and it measures the
probability of rejecting the null hypothesis when it is true.

The level of significance is the risk we assume of rejecting
the null hypothesis (H0) when it is actually true.

The level of significance is designated by α (the Greek letter
alpha). It is also known as the level of risk. This may be a
more appropriate term because it is the risk taken if you reject
the null hypothesis when it is really true.

The most common significance level used is 0.05 (often
called the 5% significance level) which is commonly written
as α = 0.05 . Another widely used level is α = 0.01 (or the
1% significance level). Although in theory any significance
level may be used, these two are by far the most popular.
If we use, say, a 5% significance level, what we are saying in
effect is that an event (or sample) that occurs less than 5% of
the time is considered unusual. In this case, we will reject H0
as being false if the probability of obtaining a sample like ours
is less than 0.05 and accept H0 as being true if this
probability is more than 0.05 .

If we use, for example, a 1% significance level then we are
saying that an event (or sample) that occurs less than 1% of
the time is considered unusual. In this case, we will reject
H0 as being false if the probability of obtaining a sample like
ours is less than 0.01 and accept H0 as being true if this
probability is more than 0.01 .

(xi) Errors

In performing a hypothesis test, a statistician must be aware
of the consequences of drawing the wrong conclusion.
These consequences assist in deciding which significance
level to use. In effect there are two possible errors that can
be made when making a conclusion about a null hypothesis.
These are:

(xii) Type I Error

This error occurs when you reject H0 as being false when
H0 is really true. The probability of making a type I error
is the significance level of the test, and is
designated by the Greek letter α .

A type I error occurs if the null hypothesis (H0) is rejected
when it is actually true.

(xiii) Type II Error

This error occurs when you accept H0 as being true when
H0 is really false. The probability of making a type II error
is denoted by the Greek letter β (beta). Of course we
would like to avoid both errors as much as possible.
Unfortunately, for a fixed sample size, in trying to avoid one of them
we increase the chance of making the other one.

A type II error occurs if we accept the null hypothesis (H0)
when it is actually false.

(xiv) Summary

Table 1 below summarises the relationship
between rejecting/accepting H0 and whether or not H0
is true, in terms of type I and type II errors:

decision     H0 true          H0 false
reject H0    type I error     no error made
accept H0    no error made    type II error

table 1
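
The four cells of table 1 can be written as a small lookup. This is only an illustrative sketch; the function name `classify_outcome` and its two boolean arguments are assumptions made for the example and are not part of the notes.

```python
def classify_outcome(reject_h0: bool, h0_true: bool) -> str:
    """Return the cell of table 1 for a given decision and a given truth."""
    if reject_h0 and h0_true:
        return "type I error"        # rejected a true H0
    if not reject_h0 and not h0_true:
        return "type II error"       # accepted a false H0
    return "no error made"           # decision matches the truth


# Walk through all four combinations of decision and truth.
for reject in (True, False):
    for truth in (True, False):
        decision = "reject H0" if reject else "accept H0"
        state = "H0 true" if truth else "H0 false"
        print(f"{decision:10s} {state:9s} -> {classify_outcome(reject, truth)}")
```

In practice we never know which column of the table we are in; the sketch only makes the definitions concrete.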

Example 1

In a courtroom, in a murder case for example, we must test
the hypotheses:

H0 : the defendant is innocent
H1 : the defendant is guilty

It is up to the prosecutor to show reasonable evidence to
convict.

(a) A type I error occurs when we reject H0 when
H0 is true; in other words, a jury convicts an
innocent person.

(b) A type II error occurs when we accept H0
when H0 is false; in other words, the jury finds
a guilty person innocent.

If a jury returns a verdict of not guilty, this means that there is
not sufficient evidence to show the defendant's guilt.

Example 2

A company has developed a drug which it feels may be a cure
for certain types of cancer. It has collected vast amounts of
data as a result of clinical trials and has asked you whether the
drug actually works. The null and alternative hypotheses
(in words) are:

H0 : the drug does not work
H1 : the drug does work

(a) A type I error occurs when it is concluded that the
drug works when in fact it doesn't.

If we want to avoid a type I error then a small value
of α should be chosen, say α = 0.01 .

(b) A type II error occurs when it is concluded that the
drug does not work when in fact it does work.

2 z-test Statistic

(i) What is a Test Statistic?

A test statistic is a value, determined from sample information,
used to accept or reject the null hypothesis H0 : µ = µ 0 .

We will deal with the case of a single sample being chosen from
a population and the question of whether that particular sample
might be consistent with the rest of the population. Exactly
which test statistic is appropriate depends on the information
available. However, it is very important that the correct one is
used since the use of an incorrect test statistic can lead to an
incorrect conclusion.

In calculating the value of a test statistic, it will be assumed
that the following information will always be available:

1 the size (n) of the sample,
2 the mean ( x̄ ) of the sample,
3 the standard deviation (s) of the sample.

(ii) z-test statistic

A z-test statistic is used when the size of the sample is
25 or more, (n ≥ 25) .

(a) If the standard deviation of the population, σ , is
known, then:

$$ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} $$

(b) If the value of σ is unknown, the standard deviation
is approximated by the sample standard deviation s , then:

$$ z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} $$

(iii) Standard Error

The expression $\sigma / \sqrt{n}$ is referred to as the standard error of the
mean.
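
The z-test statistic and the standard error can be sketched in a few lines of Python. The function names and the numbers in the usage lines are made up purely for illustration (they do not come from these notes); pass σ as `sd` when it is known and s otherwise.

```python
from math import sqrt


def standard_error(sd, n):
    """Standard error of the mean: sd / sqrt(n)."""
    return sd / sqrt(n)


def z_statistic(sample_mean, mu0, sd, n):
    """z = (sample mean - hypothesised mean) / standard error."""
    return (sample_mean - mu0) / standard_error(sd, n)


# Hypothetical inputs, chosen only so the arithmetic is easy to follow:
# sample mean 52, hypothesised mean 50, standard deviation 10, n = 100.
se = standard_error(10, 100)        # 10 / 10 = 1.0
z = z_statistic(52, 50, 10, 100)    # (52 - 50) / 1.0 = 2.0
print(f"standard error = {se}, z = {z}")
```

Note how the standard error shrinks as n grows, so the same difference between the sample mean and µ 0 gives a larger z for a larger sample.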

3 The Critical Value

(i) The critical value is the value of the test statistic which is
significant. By significant we mean the value that leads
to the rejection of the null hypothesis.

A critical value for a z-test statistic is denoted by zc .

The critical value (zc) is the dividing point between
the region where the null hypothesis is rejected and
the region where it is not rejected.

(ii) The particular critical value to use depends on two things:

1 whether we are using a one-tailed or two-tailed test, and
2 the significance level used ( α = 0.01 or α = 0.05 ).

There are four cases:
case 1: two-tailed test with α = 0.05
case 2: two-tailed test with α = 0.01
case 3: one-tailed test with α = 0.05
case 4: one-tailed test with α = 0.01

(iii) Case 1: Two-tailed Test with α = 0.05

H0 : µ = µ 0
H1 : µ ≠ µ 0      α = 0.05

The critical values zc = 1.96 and -1.96 are obtained by considering
the z-score when 95% of the region under a normal curve is
acceptable (figure 1):

[figure 1: normal curve on the z scale; region of acceptance (0.95) between -1.96 and 1.96; region of rejection (0.025) in each tail]

(iv) Case 2: Two-tailed Test with α = 0.01

H0 : µ = µ 0
H1: µ ≠ µ 0 α = 0.01

The critical values zc = 2.58 and -2.58 are obtained by considering
the z-score when 99% of the region under a normal curve is
acceptable (figure 2):

[figure 2: normal curve on the z scale; region of acceptance (0.99) between -2.58 and 2.58; region of rejection (0.005) in each tail]

(v) Case 3: One-tailed Test with α = 0.05

(a) H0 : µ = µ 0            (b) H0 : µ = µ 0
    H1 : µ < µ 0     or         H1 : µ > µ 0

with α = 0.05

The critical values are zc = - 1.645 for (a) or zc = 1.645 for (b) .
These values are obtained by considering the z-score when 95% of
the region under a normal curve is acceptable, (figure 3):

[figure 3: normal curves on the z scale. (a) region of rejection (0.05) to the left of -1.645, region of acceptance (0.95) to the right; (b) region of acceptance (0.95) to the left of 1.645, region of rejection (0.05) to the right]

(vi) Case 4: One-tailed Test with α = 0.01

(a) H0 : µ = µ 0            (b) H0 : µ = µ 0
    H1 : µ < µ 0     or         H1 : µ > µ 0

with α = 0.01

The critical values are zc = -2.33 for (a) or zc = 2.33 for (b) .
These values are obtained by considering the z-score when 99% of
the region under a normal curve is acceptable, (figure 4):

[figure 4: normal curves on the z scale. (a) region of rejection (0.01) to the left of zc = -2.33; (b) region of rejection (0.01) to the right of zc = 2.33]
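
The four critical values above come from normal tables. As a cross-check, they can also be computed with the standard library's `statistics.NormalDist`. This is a sketch rather than the method used in these notes, and the helper name `z_critical` is an assumption for the example.

```python
from statistics import NormalDist


def z_critical(alpha, tails):
    """Positive critical value zc for a one-tailed (1) or two-tailed (2) z-test."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    tail_area = alpha / tails                  # rejection area in one tail
    return NormalDist().inv_cdf(1 - tail_area)


# Reproduce cases 1 to 4 (the notes round the last digit in some cases).
for case, (tails, alpha) in enumerate([(2, 0.05), (2, 0.01), (1, 0.05), (1, 0.01)], start=1):
    print(f"case {case}: {tails}-tailed, alpha = {alpha}: zc = {z_critical(alpha, tails):.3f}")
```

For a lower-tail test (H1 : µ < µ 0) the critical value is the negative of the number returned here.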

Example 3

The efficiency ratings of BHP steelworkers at the Newcastle plant have
been studied over a period of many years and found to be normally
distributed. The arithmetic mean ( µ ) of the ratings is 150 , and the
standard deviation ( σ ) is 12 . Recently, however, young employees
have been hired and new training and production methods have
commenced. The latest sample of 100 workers revealed a sample mean
x̄ of 152.7 . Test the hypothesis that the mean of 150 is still correct
at:

(a) α = 0.05
(b) α = 0.01

Solution

(a) H 0 : µ = 150
H1 : µ ≠ 150 α = 0.05

Note, this is a two-tailed test because the alternative hypothesis does
not give a direction of the difference. That is, it does not state whether
the mean is greater than or less than 150 .

sample mean x̄ = 152.7
sample size n = 100
population standard deviation σ = 12
population mean µ 0 = 150

Because we know the population standard deviation ( σ ) we use the
following z-test statistic formula:

$$ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} $$

which gives:

$$ z = \frac{152.7 - 150}{12 / \sqrt{100}} = \frac{2.7}{1.2} = 2.25 $$

From the sample of 100 workers, the z-test statistic is z = 2.25.
Since 2.25 lies outside the region between -1.96 and 1.96 (case 1),
∴ H0 is rejected.

[figure: z scale; region of rejection (0.025) in each tail, beyond zc = -1.96 and zc = 1.96; test statistic 2.25 lies in the upper region of rejection]

(b) H 0 : µ = 150
H1 : µ ≠ 150 α = 0.01

The z-test statistic z = 2.25 (as before)

Since 2.25 is within the region between -2.58 and +2.58 (case 2),
which is the region of acceptance, H0 is not rejected. We cannot conclude
that the population mean is different from 150 . The difference
between 152.7 and 150 can be attributed to the variation due to
sampling (chance). Based on the sample
data we do not reject the null hypothesis, and we continue to assume that
the null hypothesis is true.

We did not reject the null hypothesis that the population mean
efficiency rating is 150 , based on sample evidence. However, we did
not prove beyond doubt that H0 is true. The only way to prove beyond
doubt that it is 150 is to check every efficiency rating in the
population - that is, to take a 100 percent sample, which is really a
census.

[figure: z scale; accept H0 between -2.58 and 2.58, reject H0 in each tail; test statistic z = 2.25 lies in the region of acceptance]

It should be noted that if the z-test statistic for our example had
produced a value that was less than -2.58 or greater than +2.58 (the
critical values) then the null hypothesis would be rejected in favour of
the alternative hypothesis. Also, remember that as
the level of significance changes, so too may the outcome.

It is important to select the significance level before setting up the
hypothesis and sampling the population. As seen in this example the
decision on the null hypothesis changed when the level of significance
changed.
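
The arithmetic of Example 3 can be checked with a short script. The sketch below uses the figures from the example and the tabulated critical values 1.96 and 2.58; the variable names are assumptions made for illustration.

```python
from math import sqrt

# Figures from Example 3.
x_bar, mu0, sigma, n = 152.7, 150.0, 12.0, 100

z = (x_bar - mu0) / (sigma / sqrt(n))      # (152.7 - 150) / 1.2

# Two-tailed critical values from cases 1 and 2.
critical_values = {0.05: 1.96, 0.01: 2.58}

decisions = {}
for alpha, z_c in critical_values.items():
    # Two-tailed rule: reject H0 when z falls outside (-zc, zc).
    decisions[alpha] = "reject H0" if abs(z) > z_c else "do not reject H0"
    print(f"alpha = {alpha}: z = {z:.2f}, zc = {z_c}: {decisions[alpha]}")
```

The same test statistic leads to opposite decisions at the two levels, which is why the significance level should be fixed before the data are examined.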

Example 4

The Myer Department Store issues its own credit card (Myercard).
The finance manager of credit services wants to find out if the mean
monthly unpaid balance is still at $1000 as it was six months ago.
A random check of 172 unpaid balances revealed the sample mean to
be $1017.50 and the standard deviation of the sample $95 . Should
this finance manager conclude that the mean unpaid balance on
Myercards is greater than $1000 , or is it reasonable to assume that the
difference of $17.50 ($1017.50 - $1000 = $17.50) is due to
coincidence (or chance)?

Test the hypothesis that the mean unpaid balance is not different from
the usual amount at:

(a) α = 0.05
(b) α = 0.01

Solution

(a) H 0 : µ = $1000
H1 : µ > $1000 α = 0.05

This is a one-tailed test

x̄ = 1017.5
s = 95 (Note, this is the sample standard deviation)
n = 172
µ 0 = 1000

Because only the sample standard deviation (s) is known, we
use the following z-test statistic formula:

$$ z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} $$

which gives

$$ z = \frac{1017.5 - 1000}{95 / \sqrt{172}} = \frac{17.50}{7.2437} = 2.416 $$

A one-tailed test at the α = 0.05 level has a critical value
zc = 1.645 (case 3(b)).

[figure: z scale; region of rejection (0.05) to the right of the critical value 1.645; test statistic 2.42 lies in the region of rejection]

As the test statistic (z) of 2.42 lies in the region of rejection for
the null hypothesis (i.e. it is greater than the critical value (zc)
of 1.645 ), the null hypothesis (H0) is rejected and the
alternative hypothesis (H1) is accepted.

Therefore the decision is: The mean unpaid balance on
Myercard is greater than the usual amount of $1000 .

(b) H 0 : µ = $1000
H1 : µ > $1000 α = 0.01

The z-test statistic = 2.416 (as before)

A one-tailed test at the α = 0.01 level has a critical value
zc = 2.33 (case 4(b)).

As before this z-test statistic lies in the region of rejection for
the null hypothesis, i.e. z > zc.
The alternative hypothesis H1 is accepted.

Example 5

Cereal packets are meant to contain 500 gm of cereal. To check the
accuracy of this statement, 100 packets were randomly selected and
showed a mean of 497 gm with a standard deviation of 20 gm.
Is the manufacturer under-filling the packets?
Perform a hypothesis test at the 5% level.

Solution

H 0 : µ = 500
H1 : µ < 500
x̄ = 497    µ 0 = 500
s = 20     n = 100

z-test statistic

$$ z = \frac{497 - 500}{20 / \sqrt{100}} = -1.5 $$

A one-tailed test at α = 0.05 has a critical value zc = -1.645
(case 3(a)).
We accept the null hypothesis as the z-test statistic lies in the region
of acceptance: there is not enough evidence that the manufacturer is
under-filling the packets.

[figure: z scale; region of rejection (0.05) to the left of zc = -1.645; test statistic -1.5 lies in the region of acceptance]

Example 6

The personnel department of a company has been surveying employees
and asking them how long it takes for them to travel from home to
work each morning. It found that the distribution of times was skewed
to the right with a mean of 21.6 minutes and a standard deviation of
7.2 minutes.

A random sample of 25 employees in the accounts section took an
average of 24.1 minutes to travel to work. Are these employees
different to other employees in their travel time? Test at significance
level of α = 0.05 .

Solution

H 0 : µ = 21.6
H1 : µ ≠ 21.6

x̄ = 24.1
σ = 7.2
n = 25
µ 0 = 21.6

∴ z-test statistic

$$ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{24.1 - 21.6}{7.2 / \sqrt{25}} = 1.74 $$

Since z = 1.74 lies within the region between -1.96 and 1.96 (case 1),
H0 is accepted.

Example 7

A taxi driver claims to make an average of $12.00 on each fare, but
the Taxation Office believes that the average is higher than that. To
test the driver’s claim, the Taxation Office makes a random sample of
30 fares. The amounts that the taxi driver made on the fares in the
sample had a mean of $13.30 with a standard deviation of $2.50 .
Test the driver’s claim at α = 0.01 .

Solution

H 0 : µ = 12.00
H1 : µ > 12.00

x̄ = 13.30
s = 2.5
n = 30
µ 0 = 12

$$ z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{13.30 - 12.00}{2.50 / \sqrt{30}} = 2.85 $$

A one-tailed test at α = 0.01 has a critical value zc = 2.33 (case 4(b)).
Since the z-test statistic z = 2.85 lies in the region of rejection, H0 is
rejected at α = 0.01 .
We therefore do not believe the taxi driver’s claim and conclude that
there is evidence that the taxi driver makes an average of more than
$12.00 on each fare.
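
The one-tailed decisions in Examples 4, 5 and 7 follow the same rule, which can be sketched as below. The function names and the `tail` labels ("upper" for H1 : µ > µ 0, "lower" for H1 : µ < µ 0) are assumptions made for this illustration; the critical values are the tabulated ones from cases 3 and 4.

```python
from math import sqrt


def z_statistic(x_bar, mu0, sd, n):
    """z = (sample mean - hypothesised mean) / (sd / sqrt(n))."""
    return (x_bar - mu0) / (sd / sqrt(n))


def reject_h0(z, z_c, tail):
    """One-tailed decision rule; z_c is the positive tabulated critical value."""
    if tail == "upper":          # H1: mu > mu0, rejection region in the right tail
        return z > z_c
    if tail == "lower":          # H1: mu < mu0, rejection region in the left tail
        return z < -z_c
    raise ValueError("tail must be 'upper' or 'lower'")


# (sample mean, mu0, s, n, tabulated zc, tail) taken from the examples above.
examples = {
    "Example 4(a)": (1017.5, 1000, 95, 172, 1.645, "upper"),
    "Example 4(b)": (1017.5, 1000, 95, 172, 2.33, "upper"),
    "Example 5": (497, 500, 20, 100, 1.645, "lower"),
    "Example 7": (13.30, 12.00, 2.50, 30, 2.33, "upper"),
}

results = {}
for name, (x_bar, mu0, s, n, z_c, tail) in examples.items():
    z = z_statistic(x_bar, mu0, s, n)
    results[name] = (round(z, 2), reject_h0(z, z_c, tail))
    print(f"{name}: z = {z:.2f}, reject H0: {results[name][1]}")
```

Only Example 5 ends with H0 being accepted, because -1.5 does not fall below -1.645.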

Exercise 1(a)

4 t-test Statistic

(i) A small sample is one of fewer than 25 observations. If the population
standard deviation ( σ ) is unknown then the z distribution is not the
appropriate test statistic. The student t , or the t distribution, as it is
usually called, is used as the test statistic.
The characteristics of Student’s t distribution were developed by
William S Gosset, a brewmaster for the Guinness
Brewery in Ireland, who published his findings in 1908 using the pen
name ‘Student’. Gosset was concerned with the behaviour of the
z-statistic formula:

$$ z = \frac{\bar{x} - \mu}{s / \sqrt{n}} $$

when s had to be used as an estimator of σ . He was especially worried
about the discrepancy between s and σ when s was calculated from a
very small sample. He showed that his t distribution (which is flatter,
more ‘spread out’, than the normal z distribution) gave better or more
correct results for small samples from a population which displayed a
normal distribution.

It is important to remember that the critical value for a given level of
significance is greater for small samples than for larger samples. This is
because there is more variability in sample means computed from small
samples, therefore we have less confidence in the resulting estimates and
are less likely to reject the null hypothesis.

(ii) For small samples we use a t-test statistic defined as:

$$ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} $$

(iii) Unlike the z-test statistic, the t-test statistic has associated with it a
quantity called degrees of freedom. In this case the degrees of freedom
are denoted by the Greek letter ν (nu), written v below, and are defined by:

v = n - 1

(iv) Critical Value for t-test

The critical value in any t-distribution, tc , is found in the student-t
distribution tables.

To use these tables, the following need to be ascertained:

1 the level of significance: α
2 the number of degrees of freedom: v
3 the type of test in question: one-tailed or two-tailed.

(v) To find tc look down the left-hand column for the row with the appropriate
degrees of freedom, and across the top for the appropriate test (either
one-tailed or two-tailed) and the significance level used.
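
These notes read tc from printed tables. For readers without a table to hand, the sketch below approximates the same values numerically, using only the Python standard library: it integrates the t density with Simpson's rule and finds tc by bisection. This is not the method of the notes, and the function names, step counts and search bounds are all assumptions chosen for illustration.

```python
from math import gamma, pi, sqrt


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def upper_tail_area(t, df, steps=1000):
    """P(T > t) for t >= 0, using Simpson's rule on the interval [0, t]."""
    if t == 0:
        return 0.5
    h = t / steps                                # steps must be even
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3                   # half the area lies above 0


def t_critical(alpha, df, tails):
    """Positive critical value tc for a one-tailed (1) or two-tailed (2) test."""
    target = alpha / tails                       # area in a single tail
    lo, hi = 0.0, 100.0
    for _ in range(40):                          # bisection on the tail area
        mid = (lo + hi) / 2
        if upper_tail_area(mid, df) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(f"one-tailed, alpha = 0.01, v = 21: tc = {t_critical(0.01, 21, 1):.3f}")
print(f"two-tailed, alpha = 0.05, v = 8:  tc = {t_critical(0.05, 8, 2):.3f}")
```

The two printed values should agree, to table precision, with the entries 2.518 and 2.306 used in the examples that follow.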

Example 8

The General Insurance Company over a period of years has established
that it costs $70 on average to process the paperwork, pay the assessor
and finalise a claim. This cost, when compared with that claimed by
other insurance firms, is said to be much higher. As a result,
cost-cutting measures were instituted. In order to evaluate the impact of
these new measures a sample of 22 recent claims was chosen at random
and costs were recorded. It was found that the sample mean, x̄ , and
the standard deviation, s , of the sample were $66 and $10 ,
respectively. At the α = 0.01 level of significance is there a reduction
in the average cost, or can the difference of $4 ($70 - $66) be attributed to
chance?

Solution

H 0 : µ = 70
H1 : µ < 70

The test is one-tailed because there is interest only in whether or not
there has been a reduction in cost. The inequality in the alternative
hypothesis points to the region of rejection in the left tail of the
distribution.

x̄ = 66    n = 22     v = 21
s = 10    µ 0 = 70   α = 0.01

t-test statistic

$$ t = \frac{66 - 70}{10 / \sqrt{22}} = -1.876 $$

tc -critical value

One-tailed test

α = 0.01
v = 21 (degrees of freedom)

Using the t-distribution tables:

The tables give 2.518 ; however, as this is a one-tailed, “less than” situation,
tc = -2.518

[figure: t scale; region of rejection (0.01) to the left of tc = -2.52; test statistic -1.876 lies in the region of acceptance]

As the t-test statistic lies in the region of acceptance, we accept the null
hypothesis. Therefore, based on the sample results, there is not enough
evidence that the cost-cutting measures have reduced the mean cost per
claim to less than $70 .

Example 9

Experience has shown that the number of matches in boxes follows a
normal distribution. A manufacturer claims that the average number of
matches in its boxes is 50 .

A customer purchases a random sample of 9 boxes and counts the
contents of each box. They were:

49 50 51 46 48 45 52 47 48

Based on this sample, should the customer believe the manufacturer's
claim?
Use a two-sided test at α = 0.05 .

Solution

H 0 : µ = 50
H1 : µ ≠ 50

x̄ = 48.44   n = 9      v = 8
s = 2.298   µ 0 = 50   α = 0.05

t-test statistic

$$ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{48.44 - 50}{2.298 / \sqrt{9}} = -2.04 $$

tc - critical value

Using the t-distribution tables; with

α = 0.05
v=8 (degrees of freedom)
two-tailed test
tc = 2.306

Since t = -2.04 lies in the region of acceptance, i.e. between -2.306 and 2.306 ,
we accept H0 at the α = 0.05 level of significance.

The claim made by the company that there is an average of 50 matches in
its boxes may well be true.
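
Because Example 9 gives the raw counts, the sample mean, the sample standard deviation and the t-test statistic can all be recomputed from the data. The sketch below uses Python's `statistics` module (whose `stdev` uses the n - 1 divisor) together with the tabulated tc = 2.306 quoted in the solution; the variable names are assumptions for illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Match counts from the nine boxes in Example 9.
matches = [49, 50, 51, 46, 48, 45, 52, 47, 48]
mu0 = 50                      # the manufacturer's claimed mean

n = len(matches)
x_bar = mean(matches)         # sample mean
s = stdev(matches)            # sample standard deviation (n - 1 divisor)
t = (x_bar - mu0) / (s / sqrt(n))

t_c = 2.306                   # table value: two-tailed, alpha = 0.05, v = 8
accept_h0 = abs(t) < t_c      # two-tailed rule

print(f"n = {n}, mean = {x_bar:.2f}, s = {s:.3f}, t = {t:.2f}")
print("accept H0" if accept_h0 else "reject H0")
```

Working at full precision gives a t value of roughly -2.03, slightly different in the last digits from a calculation with rounded inputs, but the decision is the same as in the solution above.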

Example 10

In a random sample of 20 components taken from a production line, the
mean length of each component in this sample is 108.6 millimetres with
a standard deviation of 6.3 millimetres. Given that each component
should measure 105 millimetres long and that the population has
proved to be normal, is there enough statistical evidence to show that the
production line is producing components that are of an incorrect length?
Test at the 5 percent level of significance.

Solution

H 0 : µ = 105
H1 : µ ≠ 105

x̄ = 108.6   n = 20   µ 0 = 105
s = 6.3     v = 19   α = 0.05

t-test statistic

$$ t = \frac{108.6 - 105}{6.3 / \sqrt{20}} = 2.556 $$

This is a two-tailed test at a level of significance of α = 0.05 , with 19
degrees of freedom.

tc - critical value

Using the t-distribution tables:
tc = 2.09

[figure: t scale; region of rejection (0.025) in each tail, beyond -2.09 and 2.09; test statistic 2.556 lies in the upper region of rejection]

As the t-test statistic lies in the region of rejection, we reject the null
hypothesis.
There is evidence that the components produced on the production line
differ in length from the required 105 millimetres.

(vi) Summary of Steps in One Sample Hypothesis Testing

(a) Write down the null hypothesis H0 , and choose an appropriate
form for the alternative hypothesis H1 : either µ ≠ µ 0
(a two-tailed test), or a one-tailed test, either upper tail µ > µ 0 or
lower tail µ < µ 0 .

(b) Use the appropriate test statistic to calculate the value of z or t .

(c) Use a decision rule (at the chosen level of significance) to find
the critical value of the test statistic.

(d) Compare the calculated z or t value with the
critical z or t value, and decide from the decision rule to either accept
or reject the null hypothesis.
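
Steps (a) to (d) can be collected into one function. The sketch below is one possible arrangement, not a definitive implementation: the name `one_sample_test`, the labels "two-sided", "greater" and "less", and the convention of passing a table value through `t_critical` for small samples are all assumptions made for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_test(x_bar, mu0, sd, n, alpha, alternative, t_critical=None):
    """Return (test statistic, positive critical value, reject H0?).

    alternative: "two-sided" (H1: mu != mu0), "greater" (H1: mu > mu0)
                 or "less" (H1: mu < mu0).
    t_critical:  positive value read from t tables for a small-sample test;
                 leave as None to use the normal (z) critical value.
    """
    if alternative not in ("two-sided", "greater", "less"):
        raise ValueError("unknown alternative")
    # Step (b): the test statistic has the same form for z and t.
    statistic = (x_bar - mu0) / (sd / sqrt(n))
    # Step (c): critical value at the chosen significance level.
    tails = 2 if alternative == "two-sided" else 1
    if t_critical is None:
        critical = NormalDist().inv_cdf(1 - alpha / tails)
    else:
        critical = t_critical
    # Step (d): compare the statistic with the critical value.
    if alternative == "two-sided":
        reject = abs(statistic) > critical
    elif alternative == "greater":
        reject = statistic > critical
    else:
        reject = statistic < -critical
    return statistic, critical, reject


# Example 3(a): large sample, sigma known, two-tailed z-test.
print(one_sample_test(152.7, 150, 12, 100, 0.05, "two-sided"))
# Example 10: small sample, two-tailed t-test with the table value 2.09.
print(one_sample_test(108.6, 105, 6.3, 20, 0.05, "two-sided", t_critical=2.09))
```

Step (a), choosing H0 and H1, stays with the analyst: the function only applies whichever alternative it is given.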

So far we have only considered one-sample tests. However, the general
principles apply to all hypothesis testing in statistics for problems
involving larger numbers of samples and other instances where a
conclusion is to be drawn from data collected.

It should be emphasised that statistics is not an exact science - it
doesn’t prove anything. What it does do is provide us with a guide for
making reasonable conclusions based on the evidence before us, and
even provide us with the probability that we have made an error.
However, the chance always remains that our conclusions may be
incorrect!
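
The claim that the significance level is the probability of wrongly rejecting a true null hypothesis can be illustrated by simulation. The sketch below repeatedly samples from a normal population for which H0 really is true and counts how often a two-tailed z-test rejects it. The population values (mean 100, standard deviation 15), the sample size, the number of trials and the function name are all hypothetical choices made for this illustration.

```python
import random
from math import sqrt
from statistics import NormalDist


def type_one_error_rate(alpha, trials=5000, n=30, mu0=100.0, sigma=15.0, seed=1):
    """Estimate how often a two-tailed z-test rejects H0 when H0 is true."""
    rng = random.Random(seed)                     # fixed seed for repeatability
    z_c = NormalDist().inv_cdf(1 - alpha / 2)     # two-tailed critical value
    standard_error = sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population in which H0 (mu = mu0) really holds.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu0) / standard_error
        if abs(z) > z_c:                          # the sample looks "unusual"
            rejections += 1                       # so a type I error is made
    return rejections / trials


rate_5 = type_one_error_rate(0.05)
rate_1 = type_one_error_rate(0.01)
print(f"alpha = 0.05: H0 wrongly rejected in {rate_5:.1%} of samples")
print(f"alpha = 0.01: H0 wrongly rejected in {rate_1:.1%} of samples")
```

The observed rates should be close to 5% and 1% respectively, with some scatter because the simulation itself is a random sample.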

Exercise 1(b)

5 Two sample Hypothesis Testing


(i) Another important use of statistical testing is to see whether there is a
significant difference between the means of samples from two
populations.
For example, a mathematics teacher may wish to know whether students taught
with the aid of a computer have significantly higher grades than those taught
with traditional methods.

(ii) The symbols used to describe aspects of each sample are shown in the
table below (table 2).
Note, the two samples are drawn independently from their populations:

                     sample 1   sample 2
size                 n1         n2
mean                 x̄1         x̄2
standard deviation   s1         s2

table 2

(iii) We wish to examine the difference between the means of the two
samples:

x̄d = x̄1 − x̄2

Generally speaking, when two sample means are different, we have two
hypotheses to explore. First, there is the null hypothesis that the two
populations from which the two samples originate have the same mean
( µ1 = µ2 ) . If this is the case, then the observed difference between the
two sample means is not significant and is attributed to chance or
random sampling fluctuations. The alternative hypothesis to be explored
is that the two samples are drawn from populations which have different
means. If this hypothesis is true, the observed difference between the
two sample means is deemed significant.

When two sample means are different, how can we decide whether or not
the difference between the two means is significant? The standard
procedure is to test the validity of the null hypothesis, which states that
µ1 = µ 2 , utilizing the information from the two samples. On the basis of
the evidence produced by the two samples, we will either accept or reject
the null hypothesis. If the null hypothesis is rejected, the observed
difference between the two sample means is significant. However, the
observed difference is not significant whenever the null hypothesis is
accepted.

Symbolically we write:

Two-Tailed Test        One-Tailed Test
H0 : µ1 = µ2           H0 : µ1 = µ2
H1 : µ1 ≠ µ2           H1 : µ1 > µ2 or µ1 < µ2

Having established the appropriate null and alternative hypotheses, the
appropriate test statistic needs to be used, depending on the sample size.

(iv) We will consider the situation when the sample size is large ( n ≥ 25 ) .
This requires the z-test statistic.

(v) Standard Deviation: σ known

When two samples are large ( n1 , n2 ≥ 25 ) and the population standard
deviation, σ , is known, the standard error σd (where d indicates
“difference”) of x̄d = x̄1 − x̄2 is given by the expression:

$$ \sigma_d = \sigma \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} $$

Note: the standard error for a single sample is given by:

$$ \frac{\sigma}{\sqrt{n}} $$

(vi) Standard Deviation: σ is not known

When two samples are large ( n1 , n2 ≥ 25 ) and the population standard
deviation, σ , is not known, the standard error, sd , of
x̄d = x̄1 − x̄2 is given by the expression:

$$ s_d = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} $$

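(vii) The z-statistic used for one sample hypothesis testing was given by:

$$ z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} $$

When calculating the z-statistic for two sample hypothesis testing, we
replace:

$\bar{x}$ with $\bar{x}_d = \bar{x}_1 - \bar{x}_2$

$s / \sqrt{n}$ with $s_d = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

$\mu_0$ with $\mu_d = \mu_1 - \mu_2$

which gives:

$$ z = \frac{\bar{x}_d - \mu_d}{s_d} $$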

Example 11

To compare the average life of two brands of 9-volt batteries, a sample
of 100 batteries from each brand is tested. The sample selected from the
first brand shows an average life of 47 hours and a standard deviation of
4 hours. A mean life of 48 hours and a standard deviation of 3 hours
are recorded for the sample from the second brand. Is the observed
difference between the means of the two samples significant at the 0.01
level?

Solution

There are two hypotheses:

H0 : µ1 = µ2
H1 : µ1 ≠ µ2

n1 = 100    n2 = 100
x̄1 = 47     x̄2 = 48
s1 = 4      s2 = 3

x̄d = x̄1 − x̄2 = 47 − 48 = −1 and µd = 0 ( i.e. H0 : µ1 = µ2 )

$$ s_d = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{4^2}{100} + \frac{3^2}{100}} = 0.5 $$

Now

$$ z = \frac{\bar{x}_d - \mu_d}{s_d} $$

∴ z-test statistic:

$$ z = \frac{-1 - 0}{0.5} = -2 $$

Now zc = ±2.58 (case 2) at α = 0.01 , and z = −2 lies between −2.58 and 2.58

∴ we accept the null hypothesis

That is, the difference between the means of the two samples is not
significant at the α = 0.01 level.
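
The two-sample z calculation in Example 11 can be sketched as a small function. The name `two_sample_z` and its argument order are assumptions for this illustration; the formula is the large-sample standard error given in (vi).

```python
from math import sqrt


def two_sample_z(x1, s1, n1, x2, s2, n2, mu_d=0.0):
    """Return (z, s_d) for two large independent samples.

    s_d = sqrt(s1^2/n1 + s2^2/n2) and z = ((x1 - x2) - mu_d) / s_d.
    """
    s_d = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    z = ((x1 - x2) - mu_d) / s_d
    return z, s_d


# Example 11: two brands of batteries, 100 of each.
z, s_d = two_sample_z(47, 4, 100, 48, 3, 100)
z_c = 2.58                                  # two-tailed, alpha = 0.01 (case 2)
significant = abs(z) > z_c

print(f"s_d = {s_d:.2f}, z = {z:.2f}, significant at 0.01: {significant}")
```

Swapping the order of the two samples changes only the sign of z, which does not affect a two-tailed decision.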

Example 12

The efficiency of two training centres in a large company is to be
evaluated. The test results of a group of students from each training
centre are given below:

sample               centre I   centre II
size                 50         40
mean                 82.5       77
standard deviation   7.2        9.1

Determine whether there is a significant difference between the centres at
the α = 0.01 level of significance.

Solution

H 0 : μ1 = μ2
H1 : µ1 ≠ µ 2

x̄d = x̄1 − x̄2 = 82.5 − 77 = 5.5 and µd = 0

$$ s_d = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{7.2^2}{50} + \frac{9.1^2}{40}} = 1.763 $$

∴ z-test statistic:

$$ z = \frac{\bar{x}_d - \mu_d}{s_d} = \frac{5.5 - 0}{1.763} = 3.12 $$

Now zc = ±2.575 (two-tailed test at α = 0.01 , case 2). Since z = 3.12 > 2.575 ,
we reject the null hypothesis. There is a significant difference at
α = 0.01 .

Example 13

Two research laboratories have independently produced drugs that
provide relief to arthritis sufferers. The first drug was tested on a group
of 90 arthritis victims and produced an average of 8.5 hours of relief,
with a standard deviation of 1.8 hours. The second drug was tested on
80 arthritis victims, producing an average of 7.9 hours of relief, with a
standard deviation of 2.1 hours. At the 0.05 level of significance, does
the second drug provide a significantly shorter period of relief?

Solution

H 0 : µ1 = µ 2
H1 : µ1 > µ 2

where   x̄1 = 8.5      x̄2 = 7.9
        s1 = 1.8      s2 = 2.1
        n1 = 90       n2 = 80
        first drug    second drug

This is a one-tailed test.

x̄d = x̄1 − x̄2 = 8.5 − 7.9 = 0.6 and µd = 0

$$ s_d = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{1.8^2}{90} + \frac{2.1^2}{80}} = 0.302 $$

Now z-test statistic:

$$ z = \frac{\bar{x}_d - \mu_d}{s_d} = \frac{0.6 - 0}{0.302} = 1.99 $$

Now zc = 1.645 (one tailed test at α = 0.05 ) (case 3)


We therefore reject H0 .
The second drug does provide significantly shorter relief.

Exercise 2(a)

6 t-test Statistic – two samples

(i) Standard Deviation: σ unknown

When two samples are small, (n1, n2 < 25) the sample standard deviation,
sd of xd = x1 − x2 is given by the expression:

$$s_d = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

(ii) Degrees of Freedom

With a t-test , the degrees of freedom, v , is given by:

v = n1 + n2 − 2

(iii) The t-statistic used for one sample hypothesis testing was given by:

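$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

When calculating the t-statistic for two sample hypothesis testing, we replace:

$\bar{x}$ with $\bar{x}_d = \bar{x}_1 - \bar{x}_2$

$\dfrac{s}{\sqrt{n}}$ with $s_d = \sqrt{\dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$

$\mu_0$ with $\mu_d = \mu_1 - \mu_2$

which gives:

$$t = \frac{\bar{x}_d - \mu_d}{s_d}$$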

Example 14

A building society wishes to determine if there is a significant difference between the activity in the cheque accounts of two of its branches.

The following data was obtained:

sample               branch I    branch II
size                 12          10
mean                 $1000       $900
standard deviation   $150        $120

Is there a significant difference between the two branches at the 5% level?

Solution

H 0 : µ1 = µ 2 (two-tailed test)
H1 : µ1 ≠ µ2

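x̄d = 1000 − 900 = 100

$$s_d = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

$$= \sqrt{\frac{(12 - 1)150^2 + (10 - 1)120^2}{12 + 10 - 2}} \sqrt{\frac{1}{12} + \frac{1}{10}}$$

= 58.79 also µd = 0 ( i.e. H0 : µ1 = µ2 )

t-test statistic
$$t = \frac{\bar{x}_d - \mu_d}{s_d}$$

gives:
$$t = \frac{100 - 0}{58.79} = 1.70$$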

Now tc at α = 0.05 (two-tailed test) with v = 12 + 10 − 2 = 20 degrees of freedom is tc = 2.086.

∴ since t = 1.70 is less than 2.086, we accept the null hypothesis.

There is no significant difference between the branches at the 5% level.
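
The small-sample calculation in Example 14 can be reproduced with the short Python sketch below. The function name `pooled_t` is an assumption made for this illustration, and the critical value 2.086 is simply copied from the t tables quoted above rather than computed.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2, mu_d=0.0):
    """Return (s_d, t, v) for a two-sample t-test with a pooled standard deviation."""
    # Pooled variance: ((n1-1)s1^2 + (n2-1)s2^2) / (n1 + n2 - 2)
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    # Standard deviation of the difference between the sample means
    s_d = math.sqrt(pooled_var) * math.sqrt(1 / n1 + 1 / n2)
    t = ((mean1 - mean2) - mu_d) / s_d
    v = n1 + n2 - 2  # degrees of freedom
    return s_d, t, v


# Data from Example 14 (branch I versus branch II)
s_d, t, v = pooled_t(1000, 150, 12, 900, 120, 10)
T_CRITICAL = 2.086  # two-tailed, alpha = 0.05, v = 20 (value quoted from tables)

print("s_d =", round(s_d, 2))
print("t   =", round(t, 2))
print("v   =", v)
print("reject H0" if abs(t) > T_CRITICAL else "accept H0")
```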

Example 15

A reading test is given to an elementary school class that consists of 12 Anglo-American children and 10 Mexican-American children.
The results of the test are:

Anglo-American    Mexican-American
x̄1 = 74           x̄2 = 70
s1 = 8            s2 = 10

Is the difference between the means of the two groups significant at the 0.05 level?

Solution

H 0 : µ1 = µ2
H1 : µ1 ≠ µ 2

Level of significance = 0 .05

To test the null hypothesis, we compute the observed value of t as:

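$$s_d = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

$$s_d = \sqrt{\frac{(12 - 1)(8)^2 + (10 - 1)(10)^2}{12 + 10 - 2}} \sqrt{\frac{1}{12} + \frac{1}{10}}$$

= 3.83

x̄d = 74 − 70 = 4

t-test statistic

$$t = \frac{\bar{x}_d - \mu_d}{s_d}$$

$$t = \frac{4}{3.83} = 1.043$$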

With v = 20 degrees of freedom (v = 12 + 10 − 2) at the α = 0.05 level, the t-critical value tc is:

tc = 2.086 (two tailed test) ∴ we accept the null hypothesis.

The difference between the means is not significant at the 0.05 level.

Example 16

A consumer-research organization routinely selects several car models each year and evaluates their fuel efficiency. In this year’s study of two similar subcompact models from two different automakers, the average gas mileage for twelve cars of brand A was 27.2 miles per gallon, with a standard deviation of 3.8 mpg. The nine brand B cars that were tested averaged 32.1 mpg, with a standard deviation of 4.3 mpg. At α = 0.01 should it conclude that brand B cars have higher average gas mileage than do brand A cars?

Solution

H 0 : µ1 = µ2 (one tailed)
H1 : µ1 < µ 2 α = 0.01

n1 = 12 n2 = 9
x1 = 27.2 x2 = 32.1
s1 = 3.8 s2 = 4.3

xd = x1 − x2 = −4.9

$$s_d = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

sd = 1.77 µd = 0

t-test statistic

$$t = \frac{\bar{x}_d - \mu_d}{s_d} = \frac{-4.9 - 0}{1.77}$$

t = −2.77

t–critical value

ν = 12 + 9 − 2 = 19 d.f

One tailed test at α = 0.01

tc = -2.539

∴ since t is less than tc , reject H0 .

Brand B does have a significantly higher average gas mileage than Brand A at the 1% level of significance.

Exercise 2(b)

7 Hypothesis Testing of Proportions

(i) So far we have discussed hypothesis testing involving the mean (one sample test) of a sample, or two means (two sample test) of different samples.

In each case we have dealt with large samples (z statistic) and small
samples (t statistic).

In this section we are going to discuss hypothesis testing of proportions, that is, proportion of occurrences in a population.

(ii) Normal Approximation to the Binomial Distribution

When dealing with proportions the binomial distribution is the theoretically correct distribution to use, since the data is discrete, not continuous. It can be shown that as a sample size increases, the binomial distribution approaches the normal in its characteristics.

We will use this normal approximation to the binomial when dealing with
the hypothesis testing of proportions.
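
How close the approximation is can be checked numerically. The sketch below compares exact binomial cumulative probabilities with their normal approximation for n = 150 and p = 0.8, the figures used in Example 17. The function names are made up for this illustration, and the continuity correction (adding 0.5 to k) is an assumption of the sketch that these notes do not use.

```python
import math
from statistics import NormalDist


def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_approx_cdf(k, n, p):
    """Normal approximation to P(X <= k), with mean np and variance npq."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    # k + 0.5 is a continuity correction (an assumption of this sketch)
    return NormalDist(mean, sd).cdf(k + 0.5)


n, p = 150, 0.8
for k in (110, 120, 130):
    exact = binomial_cdf(k, n, p)
    approx = normal_approx_cdf(k, n, p)
    print(k, round(exact, 4), round(approx, 4))
```

Running the loop with a smaller n shows larger gaps between the two columns, which is why the approximation is reserved for large samples.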

(iii) Sample proportions

The sample proportion ( p̄ ) represents the proportion of successes in a given sample. The sample proportion is the best estimate of the population proportion ( p ) when the latter is not known.

(iv) Mean and Standard Deviation

The mean or expected proportion of a sample, μp , equals the population proportion.

µp = p

The standard deviation of a sample proportion, σp , is also referred to as the standard error of the mean proportion and is given by:

$$\sigma_p = \sqrt{\frac{pq}{n}}$$

Note: q = 1 − p
n = number of independent binomial trials

(v) Hypotheses

When dealing with testing a single proportion, the null hypothesis is that
the expected proportion equals the population proportion.

H0 : µ p = p alternatively,
H1 : µ p ≠ p

(vi) z-test statistic

To test whether the null hypothesis is accepted or rejected we determine the z statistic and test this value against the critical value (zc) at a given level of significance.

The z-statistic when dealing with proportions is given by:

$$z = \frac{\bar{p} - p}{\sigma_p}$$

p̄ = sample proportion
p = population proportion
σp = standard error

Example 17

Consider a company that is evaluating the promotability of its employees; that is, determining the proportion of them whose ability, training and experience qualify them for promotion.

The company estimates that 80% of their employees are promotable. After interviewing a random sample of 150 employees, a committee finds that only 70% of the sample deserve promotion.
The company wishes to test the hypothesis that 80% of their workforce are promotable at a 5% level of significance.

Solution

The null hypothesis H0 is that the original estimate of the proportion of promotable employees is correct.

H0 : p = 0.8 alternatively,
H1 : p ≠ 0.8 at α = 0.05

Note also that, for the sample, p̄ = 0.7
q̄ = 0.3
n = 150

We are to test the expected proportion of the sample against the actual
sample proportion.

The standard error:
$$\sigma_p = \sqrt{\frac{0.8 \times 0.2}{150}} = 0.0327$$

The z–test statistic:
$$z = \frac{\bar{p} - p}{\sigma_p}$$

$$z = \frac{0.7 - 0.8}{0.0327} = -3.058$$

The critical values zc at the 5% level (two-tailed test): zc = ±1.96

[Figure: normal curve on the z scale with regions of rejection (0.025 each) below zc = −1.96 and above zc = 1.96, the region of acceptance (0.95) between them, and the test statistic z = −3.058 in the lower region of rejection.]

We reject the null hypothesis at α = 0.05 .

The company should conclude that there is a significant difference between the expected (or hypothesized) proportion and the observed or actual proportion at the α = 0.05 level of significance.
The true proportion of promotable employees is not 80% .
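
The single-proportion test of Example 17 can be sketched in Python as follows. The function name `one_proportion_z` is hypothetical, chosen for this illustration only. Because the sketch does not round the standard error before dividing, it gives z of about −3.06 rather than the −3.058 obtained above from the rounded value 0.0327; the conclusion is the same.

```python
import math


def one_proportion_z(p_sample, p0, n):
    """Return (standard error, z) for testing a sample proportion against p0."""
    # The standard error uses the hypothesised population proportion p0
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (p_sample - p0) / se
    return se, z


# Data from Example 17: hypothesised 80%, observed 70%, n = 150
se, z = one_proportion_z(0.7, 0.8, 150)
Z_CRITICAL = 1.96  # two-tailed, alpha = 0.05 (value quoted from tables)

print("standard error =", round(se, 4))
print("z =", round(z, 2))
print("reject H0" if abs(z) > Z_CRITICAL else "accept H0")
```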

Example 18

A member of a public interest group concerned with industrial pollution estimates that less than 60% of all factories comply with pollution standards.

A sample of 60 factories is taken, with 33 complying with the pollution standards.

Test the null hypothesis that 60% are complying with pollution
standards at the 1% level of significance.

Solution

H 0 : p = 0.6
H1 : p < 0.6

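p̄ = 33/60 = 0.55 , q̄ = 27/60 = 0.45 , n = 60

standard error:
$$\sigma_p = \sqrt{\frac{pq}{n}} = \sqrt{\frac{0.6 \times 0.4}{60}} = 0.0632$$

z-test statistic:
$$z = \frac{\bar{p} - p}{\sigma_p} = \frac{0.55 - 0.6}{0.0632}$$

z = −0.791

critical value zc at 1% level: zc = −2.33 (one tailed test).

[Figure: normal curve on the z scale with the region of rejection below the critical value zc = −2.33, the region of acceptance above it, and the z-test statistic z = −0.791 inside the region of acceptance.]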

We accept the null hypothesis. Even though the actual sample proportion is indeed below the expected proportion, it is not significantly below this figure at the 1% level of significance.

Example 19

The sponsor of a weekly television show would like the studio audience to consist of an equal number of men and women. Out of 400 persons attending the show on a given night, 220 are men. Using a level of significance of 0.01 , can the sponsor conclude that the desired sex composition of the audience is not properly maintained?

Solution

H 0 : p = 0.5
H1 : p ≠ 0.5

p̄ = 220/400 = 0.55 n = 400 q̄ = 0.45

standard error:
$$\sigma_p = \sqrt{\frac{(0.5)(0.5)}{400}}$$

σp = 0.025

z-test statistic:
$$z = \frac{\bar{p} - p}{\sigma_p} = \frac{0.55 - 0.5}{0.025}$$

z = 2

critical value: at the 1% level zc = ±2.58 (two tailed test)

∴ since z = 2 is less than 2.58, we accept the null hypothesis at this level of significance.

Example 20

The Department of Health, Education and Welfare reports that only 10%
of all persons over 65 years old are covered by adequate private health
insurance. What would the Australian Medical Association (AMA)
conclude about the Department’s claim if, out of a random sample of 900
elderly persons, 99 possessed adequate private health insurance? Use a
level of significance of .05 .

Solution

H 0 : p = 0.1
H1 : p > 0.1

p̄ = 99/900 = 0.11 n = 900 q̄ = 0.89

standard error:
$$\sigma_p = \sqrt{\frac{(0.1)(0.9)}{900}} = 0.01$$

z-test statistic:
$$z = \frac{\bar{p} - p}{\sigma_p} = \frac{0.11 - 0.1}{0.01} = 1$$

critical value: zc = 1.64 (one-tailed test at α = 0.05)

Since z is 1.0 , which is less than 1.64 , the null hypothesis cannot be
rejected using the .05 level of significance. In other words, the AMA
does not have enough evidence to reject the claim made by the
Department of Health, Education, and Welfare.

Exercise 3(a)

8 Hypothesis Testing Between the Proportions

(i) In this section we will discuss the difference between the proportions of
two samples.

(ii) Sample Proportions

For two samples, containing n1 and n2 data values respectively:

p̄1 is the sample proportion with n1 values
p̄2 is the sample proportion with n2 values

(iii) Mean of Sample Proportions

The mean or expected proportion for each respective sample equals their
population proportions.

μ p1 = p1
μ p 2 = p2

(iv) Hypotheses

If p1 and p2 denote the population proportions then the null hypothesis is that there is no significant difference in their proportions.

H0 : p1 = p2

The alternative hypotheses would be either:

H1 : p1 ≠ p2 (two-tailed) , or
H1 : p1 > p2 or p1 < p2 (one-tailed)

(v) We wish to examine the difference between the two proportions:

pd = p1 − p2

(vi) Standard Error

The standard error (standard deviation) of the difference between the two
proportions p1 and p2 is given by:

$$\sigma_d = \sigma_{p_1 - p_2} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$

However, we do not know the population proportions, and thus we need to estimate them from the sample proportions. So in practice we calculate σd using:

$$\sigma_d = \sqrt{\frac{\bar{p}_1 \bar{q}_1}{n_1} + \frac{\bar{p}_2 \bar{q}_2}{n_2}}$$

(vii) Overall Proportion

If we hypothesize that there is no difference between the two proportions, then our best estimate of the overall proportion of successes is the combined proportion of successes in both samples.

If p̂ is the overall proportion of success for both samples, then:

$$\hat{p} = \frac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2}$$

(viii) The standard error of the difference between the two proportions using
the overall proportion, σ̂d , is given by:

$$\hat{\sigma}_d = \sqrt{\frac{\hat{p}\hat{q}}{n_1} + \frac{\hat{p}\hat{q}}{n_2}}$$

(ix) z-test statistic

To test whether the null hypothesis is accepted or rejected we determine the z-score and then test this value against the critical value (zc) at a given level of significance.

When testing one proportion, we used:

$$z = \frac{\bar{p} - p}{\sigma_p}$$

When calculating the z-score for two proportions hypothesis testing we replace:

$\bar{p}$ with $\bar{p}_d = \bar{p}_1 - \bar{p}_2$

$\sigma_p$ with $\hat{\sigma}_d = \sqrt{\dfrac{\hat{p}\hat{q}}{n_1} + \dfrac{\hat{p}\hat{q}}{n_2}}$

$p$ with $p_d = p_1 - p_2$

which gives:

$$z = \frac{\bar{p}_d - p_d}{\hat{\sigma}_d}$$

Example 21

A drug company tests two compounds intended to reduce blood pressure levels. The compounds are given to different groups of animals.

Group 1 contained 100 animals, with 71 showing lower blood pressure levels with drug A .

Group 2 contained 90 animals, with 58 showing lower blood pressure levels with drug B .

Test to see if there is a difference between the effectiveness of the two drugs at a 0.05 level of significance.

Solution

Group 1                   Group 2
p̄1 = 71/100 = 0.71        p̄2 = 58/90 = 0.644
q̄1 = 29/100 = 0.29        q̄2 = 32/90 = 0.356
n1 = 100                  n2 = 90

The null hypothesis is that there is no difference between their population proportions.

H0 : p1 = p2 with,

H1 : p1 ≠ p2 at α = 0.05

Two-tailed test

(a) Overall Proportion Estimate

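$$\hat{p} = \frac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2}$$

$$\hat{p} = \frac{100(0.71) + 90(0.644)}{100 + 90}$$

p̂ = 0.6789 q̂ = 0.3211

(b) Standard Error

$$\hat{\sigma}_d = \sqrt{\frac{\hat{p}\hat{q}}{n_1} + \frac{\hat{p}\hat{q}}{n_2}}$$

$$\hat{\sigma}_d = \sqrt{\frac{(0.6789)(0.3211)}{100} + \frac{(0.6789)(0.3211)}{90}}$$

σ̂d = 0.0678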

(c) Critical Value

At a 5% level of significance for a two-tailed test the zc critical values are ±1.96 (case 1).

(d) z-test statistic

$$z = \frac{\bar{p}_d - p_d}{\hat{\sigma}_d}$$

p̄d = p̄1 − p̄2 = 0.71 − 0.644 = 0.066 and pd = 0 ( i.e. H0 : p1 = p2 )

$$\therefore z = \frac{0.066 - 0}{0.0678} = 0.973$$

[Figure: normal curve on the z scale with regions of rejection (0.025 each) below zc = −1.96 and above zc = 1.96, the region of acceptance (0.95) between them, and the z-statistic 0.973 inside the region of acceptance.]

The difference between the two sample proportions lies within the acceptance limits. Thus, we accept the null hypothesis and conclude that these two drugs produce effects on blood pressure that are not significantly different (at α = 0.05 ).
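
Steps (a) to (d) of Example 21 can be collected into one short Python sketch. The function name `two_proportion_z` is an assumption made for this illustration. The sketch works from the raw counts (71 of 100 and 58 of 90) instead of the rounded proportions, so it gives z of about 0.97 rather than exactly 0.973; the decision is unchanged.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Return (p_hat, standard error, z) for the difference of two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    # (a) overall proportion of successes in both samples combined
    p_hat = (x1 + x2) / (n1 + n2)
    q_hat = 1 - p_hat
    # (b) standard error of the difference, using the overall proportion
    se = math.sqrt(p_hat * q_hat / n1 + p_hat * q_hat / n2)
    # (d) z statistic, with p_d = 0 under H0 : p1 = p2
    z = ((p1 - p2) - 0) / se
    return p_hat, se, z


# Data from Example 21: drug A (71 of 100) versus drug B (58 of 90)
p_hat, se, z = two_proportion_z(71, 100, 58, 90)
Z_CRITICAL = 1.96  # (c) two-tailed, alpha = 0.05 (value quoted from tables)

print("p_hat =", round(p_hat, 4))
print("standard error =", round(se, 4))
print("z =", round(z, 2))
print("reject H0" if abs(z) > Z_CRITICAL else "accept H0")
```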

Example 22

A dental inspector found that, in area A, 20 out of a random sample of 200 had tooth decay, while in area B, 18 out of a random sample of 150 had tooth decay.

Does this indicate any difference in proportions at a 1% level of significance?

Solution

Area A                    Area B
p̄1 = 20/200 = 0.1         p̄2 = 18/150 = 0.12
q̄1 = 180/200 = 0.9        q̄2 = 132/150 = 0.88
n1 = 200                  n2 = 150

The null hypothesis, H0 , is that there is no difference in the proportion of tooth decay in the two areas.

H0 : p1 = p2
H1 : p1 ≠ p2 (two tailed)

(a) Overall Proportion Estimate

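$$\hat{p} = \frac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2}$$

$$\hat{p} = \frac{200(0.1) + 150(0.12)}{200 + 150}$$

$$\hat{p} = \frac{38}{350} = 0.109$$

∴ q̂ = 1 − 0.109 = 0.891

(b) Standard Error

$$\hat{\sigma}_d = \sqrt{\frac{\hat{p}\hat{q}}{n_1} + \frac{\hat{p}\hat{q}}{n_2}}$$

$$\hat{\sigma}_d = \sqrt{\frac{(0.109)(0.891)}{200} + \frac{(0.109)(0.891)}{150}}$$

σ̂d = 0.0337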

(c) Critical Value

At a 1% level of significance zc = ±2.58

(d) z-test statistic

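$$z = \frac{\bar{p}_d - p_d}{\hat{\sigma}_d} = \frac{(0.1 - 0.12) - 0}{0.0337} = -0.59$$

[Figure: normal curve on the z scale with regions of rejection (0.005 each) below zc = −2.58 and above zc = 2.58, the region of acceptance (0.99) between them, and the z-test statistic z = −0.59 inside the region of acceptance.]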

The difference between the two sample proportions is not significant at the 1% level.

Accept the null hypothesis.

Example 23

A coal-fired power plant is considering two different systems for pollution abatement. The first system has reduced the emission of pollutants to acceptable levels 68 percent of the time, as determined from 200 air samples. The second, more expensive system has reduced the emission of pollutants to acceptable levels 76 percent of the time, as determined from 250 air samples. If the expensive system is significantly more effective than the inexpensive system in reducing pollutants to acceptable levels, then the management of the power plant will install the former system. Which system will be installed if management uses a significance level of 0.01 in making its decision?

Solution

H 0 : p1 = p2
H1 : p1 < p2 (one tailed test at α = 0.01)

p1 = 0.68 p2 = 0.76
q1 = 0.32 q2 = 0.24
n1 = 200 n2 = 250

(a) Overall Proportion Estimate

$$\hat{p} = \frac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2}$$

$$\hat{p} = \frac{200(0.68) + 250(0.76)}{450}$$

p̂ = 0.724

q̂ = 0.276

(b) Standard Error

$$\hat{\sigma}_d = \sqrt{\frac{\hat{p}\hat{q}}{n_1} + \frac{\hat{p}\hat{q}}{n_2}}$$

$$\hat{\sigma}_d = \sqrt{\frac{0.724 \times 0.276}{200} + \frac{0.724 \times 0.276}{250}}$$

σ̂d = 0.0424

(c) Critical Value

zc = −2.33 α = 0.01 (one tailed)

(d) z-test statistic

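$$z = \frac{\bar{p}_d - p_d}{\hat{\sigma}_d} \qquad p_d = 0$$

p̄d = 0.68 − 0.76 = −0.08

$$\therefore z = \frac{-0.08 - 0}{0.0424} = -1.89$$

Since z = −1.89 is greater than zc = −2.33, accept H0 : install the cheaper system.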

Exercise 3(b)

9 Chi-Square Analysis

(i) We have investigated hypothesis tests from either one or two samples.
We used one-sample tests to determine whether a mean or a proportion was significantly different from a hypothesized value. In the case of two-sample tests, we examined the difference between the two means or two proportions, to decide whether this difference was significant.

(ii) Chi–square Tests

Suppose we have more than two proportions to examine. If this is the case the current z-test would not be applicable. Instead we must use the Chi-square test. Chi-square tests enable us to test whether more than two population proportions can be considered equal.

(iii) Contingency Tables

Suppose that in four regions, the National Health Care Company samples its hospital employees’ attitudes toward job performance reviews. Respondents are given a choice between the present method and a proposed new method.

The table below (table 3), which illustrates the responses to this question from the sample polled, is called a contingency table. A table such as this is made up of rows and columns; rows run horizontally, columns vertically. Notice that the four columns in table 3 provide one basis of classification, geographical region, and that the two rows classify the information another way: preference for review methods. Table 3 is called a “2×4 contingency table”, because it consists of two rows and four columns. We describe the dimensions of a contingency table by first stating the number of rows and then the number of columns. The “total” column and the “total” row are not counted as part of the dimensions.

method                        region
           Northeast   Southeast   Central   Westcoast   total
present    68          75          57        79          279
new        32          45          33        31          141
total      100         120         90        110         420

table 3

(iv) Hypotheses

The null hypothesis (H0) in this case is that there is no relationship between the employees’ attitudes to job performance reviews and the region that they live in.

H0 : region and choice of method are independent; alternatively,
H1 : region and choice of method are dependent

(v) Observed and Expected Frequencies

The observed frequencies, f0 , are the actual values obtained, which are
recorded on the original contingency table.

The expected frequencies, fe , are those which are theoretically expected by considering the overall proportions of each classification.

The expected frequencies in a contingency table are determined by using the following formula:

$$f_e = \frac{RT \times CT}{n}$$

where:

fe = the expected frequency in a given cell
RT = the row total for the row containing that cell
CT = the column total for the column containing that cell
n = the total number of observations

For example, the fe value for someone who prefers the present method in the Northeast region is given by:

$$f_e = \frac{100 \times 279}{420} = 66.43$$

The table below (table 4) gives a summary of the observed and expected frequencies from table 3.

method           Northeast   Southeast   Central   Westcoast
present   f0     68          75          57        79
          fe     66.43       79.72       59.79     73.07
new       f0     32          45          33        31
          fe     33.57       40.28       30.21     36.93

table 4
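
The expected frequencies in table 4 follow mechanically from the row and column totals, so they are easy to generate in code. The sketch below is one possible way of doing this in Python; the function name `expected_frequencies` is hypothetical. Because it keeps full precision, some cells may differ from the printed table in the second decimal place.

```python
def expected_frequencies(observed):
    """Return the table of expected frequencies fe = (RT x CT) / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]


# Observed frequencies from table 3 (rows: present, new;
# columns: Northeast, Southeast, Central, Westcoast)
observed = [
    [68, 75, 57, 79],
    [32, 45, 33, 31],
]

expected = expected_frequencies(observed)
for row in expected:
    print([round(fe, 2) for fe in row])
```

The expected frequencies in each row and column add up to the same totals as the observed frequencies, which is a useful check on hand calculations.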

(vi) Chi-square Statistics

The chi-square statistic χ² is given by:

$$\chi^2 = \sum \frac{(f_0 - f_e)^2}{f_e}$$

Using the information in table 4, we can establish the Chi-square statistic (table 5):

f0     fe       f0 − fe    (f0 − fe)²    (f0 − fe)² / fe
68     66.43     1.57       2.46         .0370
75     79.72    -4.72      22.28         .2795
57     59.79    -2.79       7.78         .1301
79     73.07     5.93      35.16         .4812
32     33.57    -1.57       2.46         .0733
45     40.28     4.72      22.28         .5531
33     30.21     2.79       7.78         .2575
31     36.93    -5.93      35.16         .9521
total                                   2.7638

table 5

$$\chi^2 = \sum \frac{(f_0 - f_e)^2}{f_e} = 2.764$$

(vii) Interpretation of Chi-square

The answer of 2.764 is the value for chi-square in our problem comparing preferences for review methods. If this value were as large as, say, 20 , it would indicate a substantial difference between our observed values and our expected values. A chi-square of zero, on the other hand, indicates that the observed frequencies exactly match the expected frequencies. The value of chi-square can never be negative, since the differences between the observed and expected frequencies are always squared.

(viii) Chi-square Distribution

If the null hypothesis is true, then the sampling distribution of the chi-square statistic, χ² , can be closely approximated by a continuous curve known as the chi-square distribution. As in the case of the t distribution, there is a different chi-square distribution for each different number of degrees of freedom.
The chi-square distribution is a probability distribution. Therefore, the total area under the curve in each chi-square distribution is 1.0 .

(ix) Degrees of Freedom

To use the chi-square test, we must calculate the number of degrees of freedom (v) in the contingency table:

v = (r − 1)(c − 1) ,

where r is the number of rows in the problem, and c is the number of columns in the problem.

(x) Chi-square Critical Value

Returning to our example of job-review preferences of National Health Care hospital employees, we use the chi-square test to determine whether attitude about reviews is independent of geographical region. If the company wants to test the null hypothesis at the 0.05 level of significance, our problem can be summarized:

H0 : region and choice are independent
H1 : region and choice are dependent
α = 0.05

Since our contingency table for this problem (table 3) has two rows
and four columns, the appropriate number of degrees of freedom is:

number of degrees of freedom v = (r − 1)(c − 1)
v = (2 − 1)(4 − 1)
v = (1)(3)
v = 3

The chi-square tables reveal that the chi-square critical value, with
α = 0.05 and v = 3 degrees of freedom equals 7.81 .

Thus the acceptance region for the null hypothesis in the figure below (figure 5) goes from the left tail of the curve to the chi-square critical value of 7.81.

[Figure 5: chi-square distribution for 3 degrees of freedom, showing the acceptance region up to the critical value 7.81, 0.05 of the area beyond it, and the sample chi-square value of 2.764 inside the acceptance region.]

The chi-square value calculated earlier, χ² = 2.764 , falls within the acceptance region. Therefore, we accept the null hypothesis that there is no difference between the attitudes about job performance reviews in the four geographical regions.
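
The whole procedure for the job-review example (expected frequencies, chi-square statistic, degrees of freedom and decision) can be sketched in Python as below. The function name `chi_square_statistic` is made up for this illustration, and the critical value 7.81 is copied from the chi-square tables quoted above rather than computed. Since the sketch does not round the expected frequencies, it returns roughly 2.76 instead of the 2.764 in table 5.

```python
def chi_square_statistic(observed):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, f_o in enumerate(row):
            f_e = row_totals[i] * col_totals[j] / n  # expected frequency
            chi2 += (f_o - f_e) ** 2 / f_e

    dof = (len(observed) - 1) * (len(observed[0]) - 1)  # v = (r-1)(c-1)
    return chi2, dof


# Observed frequencies from table 3
observed = [
    [68, 75, 57, 79],  # present method
    [32, 45, 33, 31],  # new method
]

chi2, dof = chi_square_statistic(observed)
CRITICAL = 7.81  # alpha = 0.05, v = 3 (value quoted from tables)

print("chi-square =", round(chi2, 2))
print("v =", dof)
print("reject H0" if chi2 > CRITICAL else "accept H0")
```

The same function can be applied to any of the contingency tables in the examples that follow, provided the matching critical value is looked up for the new degrees of freedom.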

Example 24

Random samples of 160 , 240 , and 200 persons were selected from
Melbourne, Sydney and Brisbane respectively. The persons selected
were asked “What type of television program do you like best: drama,
western, documentary, or comedy?” The responses are summarized
below:

type of                  number of persons
program        Melbourne   Sydney   Brisbane   total
drama          60          100      80         240
western        30          30       30         90
documentary    30          40       50         120
comedy         40          70       40         150
total          160         240      200        600

Test the hypothesis that there is a difference in television preferences among the residents in the three cities, at a level of significance of 0.05 .

Solution

(a) Hypotheses

H0 : the type of program watched is independent of the city.
H1 : the type of program watched depends upon the city.

(b) Observed and Expected Frequencies

Using the formula:
$$f_e = \frac{RT \times CT}{n}$$

we can establish both the observed and expected frequencies in one table.

The expected frequencies are in brackets.

Program        Melbourne   Sydney     Brisbane
drama          60 (64)     100 (96)   80 (80)
western        30 (24)     30 (36)    30 (30)
documentary    30 (32)     40 (48)    50 (40)
comedy         40 (40)     70 (60)    40 (50)

(c) Chi-square Statistic

$$\chi^2 = \sum \frac{(f_0 - f_e)^2}{f_e}$$

f0     fe    f0 − fe    (f0 − fe)²    (f0 − fe)² / fe
60     64     -4         16           0.25
100    96      4         16           0.167
80     80      0          0           0
30     24      6         36           1.5
30     36     -6         36           1
30     30      0          0           0
30     32     -2          4           0.125
40     48     -8         64           1.333
50     40     10        100           2.5
40     40      0          0           0
70     60     10        100           1.667
40     50    -10        100           2
total                                10.542

∴ χ² = 10.542

(d) Degrees of Freedom

v = (r − 1)(c − 1)
= (4 − 1)(3 − 1)
= 6

(e) Critical Value

From tables, χ² at the 0.05 level with v = 6 gives

χ²0.05 = 12.6

[Figure: chi-square distribution showing the acceptance region up to the critical value 12.6, 0.05 of the area beyond it, and the χ² statistic 10.542 inside the acceptance region.]

The χ² statistic falls in the acceptance region.

Accept H0 . There is no connection between the preference for a program and the city that it is watched in.

Example 25

A teacher wished to determine whether the performance in a problem solving test is independent of the students’ year at school.

The teacher selected 120 students, 40 from each of Years 8, 9, and 10 and graded their performance in a test as A or B as shown in the table below:

year     grade awarded     total
         A        B
8        22       18       40
9        26       14       40
10       27       13       40
total    75       45       120

Test the hypothesis that performance in the test is independent of the students’ year at school, using the 5% and 1% level of significance.

Solution

(a) Hypotheses

The hypotheses being tested are:

H0 : there is no relationship between grades and the students’ year at school
H1 : there is a relationship

(b) Observed and Expected Frequencies

The table below sets out the observed and expected frequencies
(in brackets):

year A B total
8 22 (25) 18 (15) 40
9 26 (25) 14 (15) 40
10 27 (25) 13 (15) 40
total 75 45 120

(c) Chi-square Statistic

f0     fe    f0 − fe    (f0 − fe)² / fe
22     25    -3         0.36
18     15     3         0.60
26     25     1         0.04
14     15    -1         0.07
27     25     2         0.16
13     15    -2         0.27
total                   1.50

$$\chi^2 = \sum \frac{(f_0 - f_e)^2}{f_e} = 1.50$$

(d) Degrees of Freedom

v = (r-1)(c-1)
= (3-1)(2-1)
=2

(e) Critical Value

From tables, χ²0.05 with v = 2 is 5.99
also χ²0.01 with v = 2 is 9.21

The calculated χ² is below both critical values. So, we can accept the hypothesis that performance is independent of the students’ year at school at both the 1% and 5% levels of significance.

Example 26

For random samples of 200 people contacted in each of six states, the
number who favoured Australia becoming a republic is recorded in the
table below:

preference                     State                        total
            A      B      C      D      E      F
yes         132    108    128    104    128    120          720
no          68     92     72     96     72     80           480
total       200    200    200    200    200    200          1200

Test the hypothesis that people in the six states are equally in favour at
the 5% level of significance.

Solution

(a) Hypotheses

H0 : people of the states are equally in favour


H1 : people of the states are not equally in favour

(b) Observed and Expected Frequencies

The table below sets out the observed and expected frequencies
(in brackets):

preference State total
A B C D E F
yes 132 (120) 108 (120) 128 (120) 104 (120) 128 (120) 120 (120) 720
no 68 (80) 92 (80) 72 (80) 96 (80) 72 (80) 80 (80) 480
total 200 200 200 200 200 200 1200

(c) Chi-square Statistic

f0     fe     f0 − fe    (f0 − fe)² / fe
132    120     12        1.2
108    120    -12        1.2
128    120      8        0.533
104    120    -16        2.133
128    120      8        0.533
120    120      0        0
68     80     -12        1.8
92     80      12        1.8
72     80      -8        0.8
96     80      16        3.2
72     80      -8        0.8
80     80       0        0
total                    14

∴ χ² = 14

(d) Degrees of Freedom

v = (r-1)(c-1)
v = (2-1)(6-1)
v =5

(e) Critical Value

From tables, χ²0.05 with v = 5 is 11.1

The calculated χ² exceeds this critical value. Reject the null hypothesis. Not all states are equally in favour of a republic.

Exercise 4
