Lecture08 ACCG615 2pp

26/07/2012
ACCG615 Lecture 8: Analysis of Variance Copyright Macquarie University 2012

Comparing means:
more than two groups
Multiple Comparisons:
Bonferroni
Tukey
Lecture 8
Analysis of Variance
Summary of Lecture 7
Samples of interval data selected from two separate
populations can be either dependent or independent.
If the samples are independent:
Use a two-sample pooled t-test if the sigmas are unknown
and the populations are believed to have equal
standard deviations.
Use a two-sample unpooled t-test if the sigmas are
unknown and the populations do not have equal
standard deviations.
Use a two-sample z-test to test differences when the
standard deviations are known.
If the samples are dependent:
Use a paired t-test to determine if there is a specified
difference between the two population means.
Comparing More than Two Population Means
We compare several population means by extending
the method of the pooled two sample test.
When we compare two population means we select
a sample from each population and compare the
sample means. If the difference between the
sample means compared to the variability within
the samples is small, we conclude that the samples
could arise from populations with equal means.
If the difference between the two sample means,
compared to the variability within the samples, is
large enough to produce a small p-value, we will
conclude that there is a statistically significant
difference between the means of the populations
from which the samples are drawn.
To compare more than two population means, we'll
look at the variability between the samples and
compare this to the variability within the samples.
To carry out this procedure we need to check that
the samples may have arisen from normal
populations with equal spreads.
If the variability between the groups (treatments) is
large compared to the variability within the
treatments, we conclude that the populations have
significantly different means.
If the variability between treatments is similar to
that of the variability within the treatments, we
conclude that the population means could be the
same.
Example
Toy shops in Sydney pay the same rate to each of
their employees. The following data represent the
hourly rates, in dollars, paid to the employees of
five toy shops selected from each of three Sydney suburbs:
           Ryde  Bondi  Manly  overall
             11     22     13
             15     24     17
             20     30     22
             26     36     27
             28     38     31
mean         20     30     22       24
std.dev    7.18   7.07   7.28     8.01
Toy Shops
The target population here is toy shops in Ryde,
Bondi and Manly. We note that the variation
between the suburbs (treatments) is not very
different from the variation within the suburbs
indicating that the locations of the treatment means
may not be significantly different from each other.
Another Example
Compare this to the following example which
represents the hourly rates paid to five DVD
shops selected from each of the three suburbs:
           Ryde  Bondi  Manly  overall
             17     27     19
             19     28     20
             20     30     22
             21     32     23
             23     33     26
n_i           5      5      5       15
mean         20     30     22       24
std.dev    2.24   2.55   2.74     5.04
DVD Shops
The target population here is DVD shops in
Ryde, Bondi and Manly. We note that the
variation between the suburbs is larger than the
variation within the suburbs indicating that the
locations of the treatment means appear to be
significantly different from each other.
A Comparison
Remember that the box plots show medians
and we're trying to find differences between
means. However, for roughly symmetric
data the medians and means will be close.
It is easy to see that the means are different
for the DVD shops because we naturally
compare the difference between the treatment
means to the variation within each treatment.
Testing for Differences
We can therefore test for differences between
means by considering the variability in the data.
The total variation in all the data from the overall
pooled mean can be partitioned into two parts:
variation between treatments
and
variation within treatments
The relative size of these two measures of
variation allows us to test the hypothesis that the
samples arise from populations with equal means.
These variations are measured as sums of squares.
Sums of Squares
Total variation about overall mean (SS(Total)) = $\sum_i \sum_j (y_{ij} - \bar{y})^2$
Between treatment variation (SST) = $\sum_i n_i (\bar{y}_i - \bar{y})^2$
Within treatment variation (SSE) = $\sum_i \sum_j (y_{ij} - \bar{y}_i)^2 = \sum_i (n_i - 1) s_i^2$
where
$y_{ij}$ = $j$th observation in $i$th treatment
$\bar{y}_i$ = average of $i$th treatment
$\bar{y}$ = overall average
$n_i$ = sample size for treatment $i$
$s_i$ = standard deviation of $i$th treatment
Note: SS(Total) = SST + SSE
Example
The Chan, Smith and Patel families were asked
to record the number of times their eldest child
washed the dishes each week, with the following
results:
                Chan  Smith  Patel
                   1      2      3
                   3      4      5
                          6      7
                                 9
$n_i$              2      3      4
$\bar{y}_i$        2      4      6
$s_i$          1.414      2  2.582
Note: The Chan family recorded the times for two
weeks, the Smith family for three weeks
and the Patel family for four weeks.
Could the mean number of times that the eldest child
washes dishes per week differ for the three families?
For the analysis we need only the Between and Within
treatment variations.
Between treatment (SST) = $\sum_i n_i (\bar{y}_i - \bar{y})^2$
= $2(2 - 4.44)^2 + 3(4 - 4.44)^2 + 4(6 - 4.44)^2$
= 22.22
Exercise 1: Calculate the Within Treatment variation
(SSE) and hence the Total variation (SS(Total)).
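The sums of squares for the dish-washing example can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the function name `sums_of_squares` and the dictionary layout are illustrative assumptions. It can be used to check an answer to Exercise 1.

```python
from statistics import mean

# Weekly dish-washing counts from the Chan/Smith/Patel example.
groups = {
    "Chan": [1, 3],
    "Smith": [2, 4, 6],
    "Patel": [3, 5, 7, 9],
}


def sums_of_squares(groups):
    """Return (SST, SSE, SS(Total)) for a dict of treatment -> observations."""
    all_obs = [y for obs in groups.values() for y in obs]
    grand_mean = mean(all_obs)
    # Between-treatment variation: n_i * (treatment mean - overall mean)^2
    sst = sum(len(obs) * (mean(obs) - grand_mean) ** 2 for obs in groups.values())
    # Within-treatment variation: (observation - its own treatment mean)^2
    sse = sum((y - mean(obs)) ** 2 for obs in groups.values() for y in obs)
    # Total variation about the overall mean
    ss_total = sum((y - grand_mean) ** 2 for y in all_obs)
    return sst, sse, ss_total


sst, sse, ss_total = sums_of_squares(groups)
print(f"SST = {sst:.2f}, SSE = {sse:.2f}, SS(Total) = {ss_total:.2f}")
```

The printed values should satisfy SS(Total) = SST + SSE up to rounding.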
Degrees of Freedom
To proceed with the comparisons we calculate
Mean Square = Sum of Squares / Degrees of Freedom
Note: Total sample size = n, # treatments = k
Between treatment df: $k - 1$
Within treatment (Error) df: $\sum_i (n_i - 1) = n - k$
Total df: $n - 1$
The ratio of Mean Square Treatment (MST) to Mean
Square Error (MSE) is the F-statistic and compares
the between treatment variation to the within
treatment variation (error).
See Keller: Ch.8 pp301-304
Obtaining the F-statistic
$MST = SST / df_T$
$n = 9$, $k = 3$
$df_T = (k - 1) = (3 - 1) = 2$
$MST = 22.22 / 2 = 11.11$
The F-statistic = MST/MSE and has $df_T$, $df_E$ degrees of freedom.
Exercise 2: Calculate MSE and hence the F-statistic.
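As a rough check on the mean squares, the arithmetic can be sketched in Python. This is an illustrative sketch only: the variable names are assumptions, SST is the value worked out above, and SSE = 30.00 is taken from the MINITAB output shown later in the lecture.

```python
# Sums of squares for the dish-washing example.
sst = 22.22   # between-treatment variation (worked above)
sse = 30.00   # within-treatment variation (from the MINITAB output)
n, k = 9, 3   # total sample size and number of treatments

df_t = k - 1          # between-treatment degrees of freedom
df_e = n - k          # within-treatment (error) degrees of freedom
mst = sst / df_t      # Mean Square Treatment
mse = sse / df_e      # Mean Square Error
f_stat = mst / mse    # F-statistic with (df_t, df_e) degrees of freedom

print(f"MST = {mst:.2f}, MSE = {mse:.2f}, F({df_t},{df_e}) = {f_stat:.2f}")
```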
Analysis of Variance Model
The ANOVA model is given by:
$y_{ij} = \mu_i + \epsilon_{ij}$
where $\mu_i$ is the population mean of the $i$th treatment
and the $\epsilon_{ij}$ are the residual errors, which are $N(0, \sigma^2)$.
So for our example the residuals ($y_{ij} - \bar{y}_i$, with each $\mu_i$
estimated by the sample mean $\bar{y}_i$) are:
Treatment 1: (1 - 2) and (3 - 2), i.e. -1 and +1
Treatment 2: -2, 0, +2
Treatment 3: -3, -1, +1, +3
It is the residuals we look at to check the
assumptions of the model.
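The residuals listed above can be reproduced with a short Python sketch. The names here are illustrative assumptions; each population mean is estimated by its sample mean.

```python
from statistics import mean

# Weekly dish-washing counts for each family (treatment).
groups = {
    "Chan": [1, 3],
    "Smith": [2, 4, 6],
    "Patel": [3, 5, 7, 9],
}

# Residual = observation minus the sample mean of its own treatment.
residuals = {
    name: [y - mean(obs) for y in obs]
    for name, obs in groups.items()
}

for name, res in residuals.items():
    print(name, res)

# The squared residuals add up to the within-treatment variation (SSE).
sse = sum(r ** 2 for res in residuals.values() for r in res)
print("SSE =", sse)
```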
Hypothesis Test
Our null hypothesis is
$H_0$: treatments come from populations with equal means
: $\mu_1 = \mu_2 = \mu_3$   ($\alpha = 0.05$)
$H_1$: not all treatments have the same population mean
The ANOVA procedure requires that the samples
arise from normal populations with equal spreads.
Look at the four-in-one plot to check this.
The test is: F = MST/MSE with df equal to $df_T$, $df_E$
We reject $H_0$ if the p-value < $\alpha$.
We'll use MINITAB output to perform the test.
MINITAB output
Analysis of Variance
Source  DF     SS     MS     F      P
Family   2  22.22  11.11  2.22  0.190
Error    6  30.00   5.00
Total    8  52.22
                        Individual 95% CIs For Mean
                        Based on Pooled StDev
Level  N   Mean  StDev  -------+---------+---------+---------
Chan   2  2.000  1.414   (------------*------------)
Smith  3  4.000  2.000            (---------*----------)
Patel  4  6.000  2.582                    (--------*--------)
                        -------+---------+---------+---------
Pooled StDev = 2.236          0.0       3.0       6.0
[Figure: four-in-one plot]
Four-in-one plot
This plot, which MINITAB provides, can be used to
check the assumptions of the model:
The normal probability plot plots the expected
cumulative percentages of the residuals under
normality against the actual residuals. This plot
will be a straight line if the residuals are
normally distributed. The histogram of the
residuals should also look normal.
The residuals vs. fitted values plot checks for
homogeneity of variance and should show
similar variations within the Treatments.
(We can also check that $s_{max}/s_{min} < 2$ for
homogeneity of variance.)
$H_0$: the mean number of times the eldest child
washes dishes each week is the same for all
three families.
: $\mu_1 = \mu_2 = \mu_3$   ($\alpha = 0.05$)
$H_1$: not all families have the same population mean
Note that our target population here would be weeks.
From the plots the normality assumption seems
reasonable as the normal scores plot is close to a
straight line and the histogram of residuals looks
fairly normal. The residuals vs. fitted values plot
indicates that the spreads are similar ($s_{max}/s_{min} < 2$).
$F_{2,6} = 2.22$
p-value = 0.19
Conclusion
We don't reject $H_0$ and conclude that the mean
number of times the eldest child washes dishes per
week could be the same for the three families.
If we had rejected $H_0$ we would need to go back
and look at the individual treatments to see which
ones differed from each other. We'll look at this in
more detail later.
Two-sample t-test vs ANOVA
Consider the data we looked at in the last lecture
where we compared temperatures between Northern
and Southern Sydney suburbs during a heatwave.
MINITAB Output
Two-sample T for Temerature
       N   Mean  StDev  SE Mean
North  9  37.04   1.60     0.53
South  9  35.51   1.43     0.48
Difference = mu (North) - mu (South)
Estimate for difference: 1.533
95% CI for difference: (0.016, 3.050)
T-Test of difference = 0 (vs not =): T-Value = 2.14  P-Value = 0.048  DF = 16
Both use Pooled StDev = 1.5180
The results of the test indicated that Northern
suburbs were hotter, on average, than Southern
suburbs (t = 2.14, p=0.048)
ANOVA approach
We can now compare the two sample t-test with the
results from an ANOVA performed on the same data:
One-way ANOVA: Temperature versus Location
Source  DF     SS     MS     F      P
Factor   1  10.58  10.58  4.59  0.048
Error   16  36.87   2.30
Total   17  47.45
S = 1.518   R-Sq = 22.30%   R-Sq(adj) = 17.44%
                         Individual 95% CIs For Mean Based on
                         Pooled StDev
Level  N    Mean  StDev  ------+---------+---------+---------+---
North  9  37.044  1.602                  (---------*----------)
South  9  35.511  1.429  (----------*----------)
                         ------+---------+---------+---------+---
                              35.0      36.0      37.0      38.0
Pooled StDev = 1.518
Note that $t_{16}^2 = (2.14)^2 = 4.59 = F_{1,16}$
and the p-value = 0.048, as before.
The four in one plot would allow
us to check the assumptions of
normality and equal spread as
we did in the two sample t-test.
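The link between the pooled t-test and the two-group ANOVA can be checked from the summary statistics alone. The sketch below is illustrative and not part of the original slides; it rebuilds both statistics from the means and standard deviations in the MINITAB output, with variable names chosen here for convenience.

```python
from math import sqrt

# Summary statistics from the MINITAB output (n = 9 in each group).
n = 9
mean_north, sd_north = 37.044, 1.602
mean_south, sd_south = 35.511, 1.429

# Pooled variance; in the ANOVA table this is the MSE.
pooled_var = ((n - 1) * sd_north ** 2 + (n - 1) * sd_south ** 2) / (2 * n - 2)

# Two-sample pooled t statistic with 2n - 2 = 16 degrees of freedom.
t_stat = (mean_north - mean_south) / sqrt(pooled_var * (1 / n + 1 / n))

# One-way ANOVA with two treatments: SST has k - 1 = 1 degree of freedom.
grand_mean = (mean_north + mean_south) / 2   # equal group sizes
sst = n * (mean_north - grand_mean) ** 2 + n * (mean_south - grand_mean) ** 2
f_stat = (sst / 1) / pooled_var

print(f"t = {t_stat:.2f}, t^2 = {t_stat ** 2:.2f}, F = {f_stat:.2f}")
```

With only two treatments the squared t statistic and the F statistic agree exactly, which is why both tests give the same p-value.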
Back to the DVD shops example
Consider the DVD shops example we saw earlier:
Source  DF      SS      MS      F      P
Suburb   2  280.00  140.00  22.11  0.000
Error   12   76.00    6.33
Total   14  356.00
S = 2.517   R-Sq = 78.65%   R-Sq(adj) = 75.09%
                         Pooled StDev
Ryde   5  20.000  2.236  (-----*-----)
Bondi  5  30.000  2.550                           (-----*-----)
Manly  5  22.000  2.739       (-----*-----)
                         ------+---------+---------+---------+---
                              20.0      24.0      28.0      32.0
Exercise 3: Use the MINITAB output on this and
the previous slide to carry out a hypothesis test to
compare the hourly rates in the different suburbs.
Solution to Exercise 3
Conclusions
When we don't reject $H_0$, we conclude that all the
treatments could have come from populations with
the same mean.
When we reject $H_0$ we conclude that at least one
treatment has a significantly different mean from at
least one of the other treatments.
BUT
this does not tell us which treatment or treatments
are different.
To detect which treatments are significantly different
from each other we need to do some more tests.
Comparisons
The tests we perform will be a series of two sample
t-tests comparing means.
Note that the $\alpha$-level of a test determines the
probability of a Type I error (i.e. rejecting $H_0$ when
it is actually true).
However, with many treatments to compare we
must be mindful of the overall error rate.
If we had five treatments to compare after finding
a significant ANOVA result then, two at a time,
there would be 10 comparisons.
These performed at a per comparison error rate of
5% would yield an overall error rate of 40%.
With $X$ the number of Type I errors in the 10 comparisons:
$P(X \ge 1) = 1 - P(X = 0) = 1 - \binom{10}{0} \times 0.05^0 \times 0.95^{10} = 0.40126$
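The overall error rate calculation can be sketched in a few lines of Python. This is an illustration only; like the binomial calculation above, it assumes the comparisons behave as independent tests.

```python
from math import comb

k = 5          # number of treatments
alpha = 0.05   # per-comparison Type I error rate

# Number of pairwise comparisons: k treatments taken two at a time.
n_comparisons = comb(k, 2)

# P(at least one Type I error) = 1 - P(no Type I errors).
overall_error_rate = 1 - (1 - alpha) ** n_comparisons

print(n_comparisons, round(overall_error_rate, 5))
```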
To ensure that we get the overall error rate down
to a reasonable level we can use the
Bonferroni Method of Comparisons
This takes into account the number of comparisons
and adjusts the $\alpha$ value to compensate.
Thus, with our 10 comparisons, if we want an
overall 5% error rate, each test needs to be made
at the 5%/10, i.e. 0.5% or 0.005, significance level.
Note that for the pooled standard deviation ($s_p$) we
use $\sqrt{MSE}$ with its df ($n$ - # treatments) for each
test, as this is our best estimate of the common
population standard deviation.
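A small Python sketch of the Bonferroni adjustment follows. The function name `bonferroni_alpha` is an assumption made for illustration; the final check again treats the comparisons as independent.

```python
def bonferroni_alpha(overall_alpha, n_comparisons):
    """Per-comparison significance level under the Bonferroni method."""
    return overall_alpha / n_comparisons


per_test_alpha = bonferroni_alpha(0.05, 10)

# Resulting overall error rate if the 10 tests were independent.
overall_error_rate = 1 - (1 - per_test_alpha) ** 10

print(f"per-test alpha = {per_test_alpha:.3f}")
print(f"overall error rate = {overall_error_rate:.4f}")
```

The overall rate comes out slightly below the 5% target, which is why the method is described as conservative.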
Income Tax Returns Example
An accounting firm wants to trial three different
methods aimed at training its employees in
preparing income tax returns. In particular
management want to know whether any training
method is effective.
80 employees are randomly assigned to four
treatments as follows:
1 control
2 work with colleague for a week
3 read the firm's prepared document
4 attend a training course
After training, each of the employees is then
timed to determine how long it takes them, in
minutes, to prepare a standard income tax return.
One-way ANOVA: Time versus Method
Source  DF     SS    MS      F      P
Method   3   6442  2147  19.37  0.000
Error   76   8426   111
Total   79  14867
S = 10.53   R-Sq = 43.33%   R-Sq(adj) = 41.09%
                               Pooled StDev
1:control    20  61.30   7.21                             (-----*----)
2:colleague  20  51.80  11.11                 (-----*-----)
3:document   20  62.00  13.38                              (-----*----)
4:course     20  39.85   9.43  (-----*-----)
                               ------+---------+---------+---------+---
                                    40.0      48.0      56.0      64.0
Here we have used
ANOVA to compare
the times for the
four Treatments.
From the normal probability plot and the histogram
of the residuals the normality assumption seems
reasonable. The residuals vs. fitted values plot
indicates that homogeneity of variance is also OK
(note $s_{max}/s_{min} = 13.38/7.21 < 2$).
$F_{3,76} = 19.37$, p-value $\approx 0$, so reject the $H_0$ of no
differences and conclude that not all the treatments
arise from populations with the same mean.
Note that our target population
here would be the employees
of the accounting firm.
Multiple Comparisons
For the income tax returns data we wish to
determine whether any training is effective, so we
compare the control group to each of the training
groups using the Bonferroni method with an overall
significance level of 10%. For each test $\alpha = 0.033$.
Example: control vs colleague
$H_0$: $\mu_{control} = \mu_{colleague}$ vs. $H_1$: $\mu_{control} \ne \mu_{colleague}$
Assumption check: see previous slide
$t_{76} = \frac{\bar{y}_{control} - \bar{y}_{colleague}}{s_p \sqrt{\frac{1}{n_{control}} + \frac{1}{n_{colleague}}}} = \frac{61.3 - 51.8}{10.53 \sqrt{\frac{1}{20} + \frac{1}{20}}} = 2.85$
0.005 < p < 0.01. Reject $H_0$ at $\alpha = 0.033$.
Working with a colleague is effective.
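The test statistic for the control vs colleague comparison can be reproduced with a short Python sketch. The helper name `pooled_t` is an assumption for illustration; the pooled standard deviation is S = 10.53 from the ANOVA output.

```python
from math import sqrt


def pooled_t(mean_a, mean_b, s_p, n_a, n_b):
    """t statistic for comparing two treatment means using the ANOVA pooled s_p."""
    return (mean_a - mean_b) / (s_p * sqrt(1 / n_a + 1 / n_b))


# Control (61.30 minutes) vs working with a colleague (51.80 minutes).
t_stat = pooled_t(61.30, 51.80, s_p=10.53, n_a=20, n_b=20)
print(f"t = {t_stat:.2f} on 76 degrees of freedom")
```

The same function can be reused for the other control comparisons by changing the two means.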
Exercise 4
Carry out the other comparisons for the income tax
returns example and summarise your findings.
Remember $\alpha = 0.1/3 = 0.033$ for each comparison.
Fisher's Least Significant Difference (LSD)
In the income tax returns example we could have
performed all pairwise comparisons using any
specified overall significance level using two-sample
t-tests where:
$t_{76} = \frac{\bar{y}_i - \bar{y}_j}{s_p \sqrt{\frac{1}{n_i} + \frac{1}{n_j}}} = \frac{\bar{y}_i - \bar{y}_j}{10.53 \sqrt{\frac{1}{20} + \frac{1}{20}}}$
The comparison will be statistically significant if the
p-value < $\alpha/6$ (since there are 6 comparisons),
i.e. for $\alpha = 0.05$, $\alpha/6 = 0.00833$ and so
$\frac{|\bar{y}_i - \bar{y}_j|}{10.53 \sqrt{\frac{1}{20} + \frac{1}{20}}} > 2.71$, i.e. if $|\bar{y}_i - \bar{y}_j| > 9.02$
$t_{0.008,76} = 2.71$
LSD = 9.02
Note that this LSD method uses
the Bonferroni correction but
simplifies calculations for equal
sample sizes. This method does
not work for unequal sample sizes.
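The LSD threshold and the resulting pairwise decisions can be sketched in Python. This is an illustration under the values quoted above (critical value 2.71, pooled standard deviation 10.53, 20 employees per treatment); the variable names are assumptions.

```python
from itertools import combinations
from math import sqrt

# Treatment means (minutes) from the ANOVA output.
means = {"control": 61.30, "colleague": 51.80, "document": 62.00, "course": 39.85}
s_p = 10.53      # pooled standard deviation
n = 20           # observations per treatment
t_crit = 2.71    # critical t value quoted for alpha/6 with 76 df

# Least significant difference for equal sample sizes.
lsd = t_crit * s_p * sqrt(1 / n + 1 / n)
print(f"LSD = {lsd:.2f}")

not_significant = []
for (a, mean_a), (b, mean_b) in combinations(means.items(), 2):
    diff = abs(mean_a - mean_b)
    verdict = "significant" if diff > lsd else "not significant"
    if diff <= lsd:
        not_significant.append((a, b))
    print(f"{a} vs {b}: |difference| = {diff:.2f} ({verdict})")
```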
Tukey's Multiple Comparisons
For all pairwise comparisons this is a more powerful method than Bonferroni.
The technique determines a critical number $\omega$ with
$\omega = q_\alpha(k, \nu) \sqrt{\frac{MSE}{n_g}}$
where $k$ = number of treatments
$\nu$ = degrees of freedom for MSE ($n - k$)
$n_g$ = # of observations in each sample
$\alpha$ = significance level
$q_\alpha(k, \nu)$ = critical value of the studentised range
If any pair of sample means has a difference which
is larger than $\omega$ then we conclude that the
corresponding population means are different.
Table 7 in textbook
Income Tax Returns Example
Level         N   Mean  StDev
1:control    20  61.30   7.21
2:colleague  20  51.80  11.11
3:document   20  62.00  13.38
4:course     20  39.85   9.43
We can now use Tukey's method with an overall
error rate of 5% to conduct all the pairwise
comparisons for the income tax returns data.
$\omega = q_\alpha(k, \nu) \sqrt{\frac{MSE}{n_g}} = q_{0.05}(4, 76) \sqrt{\frac{111}{20}} = 3.74 \times 2.36 = 8.81$
control vs. colleague: |61.30 - 51.80| = 9.5
control vs. document: |61.30 - 62.00| = 0.7
control vs. course: |61.30 - 39.85| = 21.45
colleague vs. document: |51.80 - 62.00| = 10.2
colleague vs. course: |51.80 - 39.85| = 11.95
document vs. course: |62.00 - 39.85| = 22.15
Income Tax Returns Example continued
Using Tukey's method with an overall significance
level of $\alpha = 0.05$ we find that all treatments are
significantly different from each other except
control vs. document. Attending a course is shown
to be the most effective, followed by working with
a colleague. Reading a document and having no
training are both less effective, and these two do not
differ significantly from each other.
So long as the sample sizes are not too different
Tukey's method can be used with
$n_g = \frac{k}{\frac{1}{n_1} + \frac{1}{n_2} + \dots + \frac{1}{n_k}}$
Tukey's method is exact
for equal sample sizes and
conservative for sample
sizes which are not equal.
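Tukey's critical number, including the harmonic-mean form of the group size for unequal samples, can be sketched in Python. The function names and the unequal sample sizes below are hypothetical and chosen only for illustration; the critical value q = 3.74 and MSE = 111 are the values used for the income tax returns data.

```python
from math import sqrt


def harmonic_group_size(sizes):
    """n_g = k / (1/n_1 + ... + 1/n_k); equals n when all sizes equal n."""
    return len(sizes) / sum(1 / n for n in sizes)


def tukey_omega(q_crit, mse, sizes):
    """Critical difference: q * sqrt(MSE / n_g)."""
    return q_crit * sqrt(mse / harmonic_group_size(sizes))


# Income tax returns data: four treatments of 20 employees each.
omega_equal = tukey_omega(3.74, 111, [20, 20, 20, 20])
print(f"omega (equal sizes) = {omega_equal:.2f}")

# Hypothetical, mildly unequal sample sizes (not from the lecture).
hypothetical_sizes = [18, 20, 22, 20]
omega_unequal = tukey_omega(3.74, 111, hypothetical_sizes)
print(f"n_g = {harmonic_group_size(hypothetical_sizes):.2f}, omega = {omega_unequal:.2f}")
```

Because the harmonic mean sits below the arithmetic mean when sizes differ, the critical number is slightly larger for the unequal case.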
Exercise 5
Using Tukey's method and $\alpha = 0.05$, carry out all
pairwise comparisons for the DVD shops example.
$q_{0.05}(3, 12) = 3.77$
Grouping Information Using Tukey Method
Area   N    Mean  Grouping
Bondi  5  30.000  A
Manly  5  22.000    B
Ryde   5  20.000    B
Means that do not share a letter are significantly different.
Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons among Levels of Area_DVDs
Individual confidence level = 97.94%
Area = Ryde subtracted from:
Area     Lower  Center   Upper  -------+---------+---------+---------+--
Bondi    5.757  10.000  14.243                           (-----*-----)
Manly   -2.243   2.000   6.243                (-----*-----)
                                -------+---------+---------+---------+--
                                     -7.0       0.0       7.0      14.0
Area = Bondi subtracted from:
Area     Lower  Center   Upper  -------+---------+---------+---------+--
Manly  -12.243  -8.000  -3.757  (-----*-----)
                                -------+---------+---------+---------+--
                                     -7.0       0.0       7.0      14.0
Lecture 8 summary
Analysis of Variance is an extension of the
two-sample pooled t-test which allows us to compare
two or more treatments.
Pairwise comparisons, using $s_p$ from the original
ANOVA, can be made if we determine that there is
a difference in the treatments, but we must adjust
the test to accommodate an overall significance
level for all the comparisons we wish to make.
We can carry out the pairwise comparisons using
Bonferroni or Tukey's method. We need to
consider the comparisons we wish to make to
determine which method is preferable.
