Anova Lecture

ME 311
Engineering Experimentation II

Lecture 7
Basic Statistics and ANOVA
ME 311, Mechanical Engineering
University of Kentucky
Summary of Lecture 6
Regression Model
Linear model coefficients
Model evaluation
Exploit contour and surface plots
Error bars for the $2^2$ example
Single factor multiple level
Make sense of your data
Direct and indirect data analysis
Model Linearization
Curve fitting
Goodness of the fit
$R^2$ definition
Single factor example
Basic Statistical Concepts
Simple comparative experiments
The hypothesis testing framework
The two-sample t-test
Checking assumptions, validity
Comparing more than two factor levels: the analysis of variance
ANOVA decomposition of total variability
Statistical testing & analysis
Checking assumptions, model validity
Post-ANOVA testing of means
Sample size determination
Portland Cement Formulation (page 23)
Graphical View of the Data
Dot Diagram, Fig. 2-1, p. 24
Box Plots, Fig. 2-3, p. 26
The Hypothesis Testing Framework
Statistical hypothesis testing is a useful framework for
many experimental situations
Origins of the methodology date from the early 1900s
We will use a procedure known as the two-sample t-test
The Hypothesis Testing Framework
Sampling from a normal distribution
Statistical hypotheses:
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$
Estimation of Parameters
$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$ estimates the population mean $\mu$
$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2$ estimates the variance $\sigma^2$
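The two estimators above can be sketched in a few lines of Python. This is only an illustration: the function names and the five measurements are made up for the example and are not the mortar data from the text.

```python
import math

def sample_mean(y):
    """Sample mean: estimates the population mean."""
    return sum(y) / len(y)

def sample_variance(y):
    """Sample variance with the n - 1 divisor: estimates the population variance."""
    ybar = sample_mean(y)
    return sum((yi - ybar) ** 2 for yi in y) / (len(y) - 1)

# Hypothetical measurements, chosen only to keep the arithmetic easy to follow
y = [4.0, 5.0, 6.0, 7.0, 8.0]

ybar = sample_mean(y)       # (4 + 5 + 6 + 7 + 8) / 5
s2 = sample_variance(y)     # (4 + 1 + 0 + 1 + 4) / 4
s = math.sqrt(s2)           # sample standard deviation

print(f"mean = {ybar}, variance = {s2}, std dev = {s:.3f}")
```

Note the $n - 1$ divisor in the variance: dividing by $n$ instead would underestimate $\sigma^2$ on average.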

Summary Statistics (pg. 36)
Modified Mortar (new recipe): $\bar{y}_1 = 16.76$, $S_1^2 = 0.100$, $S_1 = 0.316$, $n_1 = 10$
Unmodified Mortar (original recipe): $\bar{y}_2 = 17.04$, $S_2^2 = 0.061$, $S_2 = 0.248$, $n_2 = 10$
How the Two-Sample t-Test Works:
Use the sample means to draw inferences about the population means
Difference in sample means: $\bar{y}_1 - \bar{y}_2 = 16.76 - 17.04 = -0.28$
Standard deviation of the difference in sample means, using $\sigma_{\bar{y}}^2 = \frac{\sigma^2}{n}$
This suggests a statistic:
$Z_0 = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
Use $S_1^2$ and $S_2^2$ to estimate $\sigma_1^2$ and $\sigma_2^2$
The previous ratio becomes
$\frac{\bar{y}_1 - \bar{y}_2}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$
However, we have the case where $\sigma_1^2 = \sigma_2^2 = \sigma^2$
Pool the individual sample variances:
$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$
The test statistic is
$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
Values of $t_0$ that are near zero are consistent with the null hypothesis
Values of $t_0$ that are very different from zero are consistent with the alternative hypothesis
$t_0$ is a distance measure: how far apart the averages are, expressed in standard deviation units
Notice the interpretation of $t_0$ as a signal-to-noise ratio
The Two-Sample (Pooled) t-Test
$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2} = \frac{9(0.100) + 9(0.061)}{10 + 10 - 2} = 0.081$
$S_p = 0.284$
$t_0 = \frac{\bar{y}_1 - \bar{y}_2}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{16.76 - 17.04}{0.284\sqrt{\frac{1}{10} + \frac{1}{10}}} = -2.20$
The two sample means are a little over two standard deviations apart
Is this a "large" difference?
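As a check on the arithmetic, the pooled calculation can be reproduced from the summary statistics alone. The sketch below assumes only the numbers quoted in this lecture (means 16.76 and 17.04, variances 0.100 and 0.061, ten observations each); the function name `pooled_t` is made up for the example.

```python
import math

def pooled_t(ybar1, s1_sq, n1, ybar2, s2_sq, n2):
    """Return (Sp^2, Sp, t0) for the two-sample pooled t-test."""
    # Pooled variance: weighted average of the two sample variances
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    sp = math.sqrt(sp_sq)
    # Difference in means divided by its estimated standard error
    t0 = (ybar1 - ybar2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return sp_sq, sp, t0

# Summary statistics for the modified and unmodified mortar
sp_sq, sp, t0 = pooled_t(16.76, 0.100, 10, 17.04, 0.061, 10)

print(f"Sp^2 = {sp_sq:.4f}, Sp = {sp:.3f}, t0 = {t0:.2f}")
```

Because it starts from rounded summary statistics, the result can differ in the last digit from a calculation on the raw data.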
So far, we haven't really done any statistics
We need an objective basis for deciding how large the test statistic $t_0$ really is
In 1908, W. S. Gosset derived the reference distribution for $t_0$, called the t distribution
Tables of the t distribution: text, page 606
A value of $t_0$ between -2.101 and 2.101 is consistent with equality of means
A value of $t_0$ outside the range -2.101 to 2.101 indicates a significant difference in means
Here $t_0 = -2.20$ falls outside that range (the $\pm t_{0.025,18}$ critical values), so the hypothesis of equal means is rejected at the 5% level
Could also use the P-value approach
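The Two-Sample (Pooled) t-Test
$t_0 = -2.20$
The P-value is the probability, if the null hypothesis of equal means is true, of a test statistic at least as extreme as the one observed (it measures the rareness of the event)
It is also the smallest risk of wrongly rejecting the null hypothesis at which the observed data would still lead to rejection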
The P-value in our problem is P = 0.042
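The two-sided P-value can be approximated without statistical tables by numerically integrating the t density. The sketch below is one way to do it, using only the Python standard library and Simpson's rule; the function names and the number of integration steps are choices made for this example. In practice a statistics package would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t0, df, steps=2000):
    """Two-sided P-value: P(|T| >= |t0|), via Simpson's rule on [0, |t0|]."""
    b = abs(t0)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    central = 2 * (h / 3) * total  # probability that -|t0| < T < |t0|
    return 1 - central

df = 10 + 10 - 2            # n1 + n2 - 2 degrees of freedom
t0 = -2.20                  # test statistic from the lecture
p = two_sided_p(t0, df)
reject = abs(t0) > 2.101    # critical-value approach, same conclusion

print(f"P = {p:.3f}, reject H0 at the 5% level: {reject}")
```

The result should land close to the P = 0.042 quoted above; small differences come from rounding $t_0$ to two decimals.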
The Normal Probability Plot
Importance of the t-Test
Provides an objective framework for simple comparative
experiments
Could be used to test all relevant hypotheses in a two-level
factorial design, because all of these hypotheses
involve the mean response at one side of the cube
versus the mean response at the opposite side of the
cube
What If There Are More Than Two Factor Levels?
The t-test does not directly apply
There are lots of practical situations where there are either more
than two levels of interest, or there are several factors of
simultaneous interest
The analysis of variance (ANOVA) is the appropriate analysis
engine for these types of experiments (Chapter 3, textbook)
The ANOVA was developed by Fisher in the early 1920s, and
initially applied to agricultural experiments
Used extensively today for industrial experiments
An Example (See pg. 60)
An engineer is interested in investigating the relationship
between the RF power setting and the etch rate for this tool. The
objective of an experiment like this is to model the relationship
between etch rate and RF power, and to specify the power
setting that will give a desired target etch rate.
The response variable is etch rate.
She is interested in a particular gas (C2F6) and gap (0.80 cm),
and wants to test four levels of RF power: 160W, 180W, 200W,
and 220W. She decided to test five wafers at each level of RF
power.
The experimenter chooses 4 levels of RF power: 160W, 180W,
200W, and 220W
The experiment is replicated 5 times, with runs made in random
order
An Example (See pg. 62)
Does changing the power
change the mean etch
rate?
Is there an optimum level
for power?
The Analysis of Variance (Sec. 3-2, pg. 63)
In general, there will be $a$ levels of the factor, or $a$ treatments,
and $n$ replicates of the experiment, run in random order: a
completely randomized design (CRD)
$N = an$ total runs
We consider the fixed effects case; the random effects case
will be discussed later
Objective is to test hypotheses about the equality of the $a$
treatment means
The Analysis of Variance
The name "analysis of variance" stems from a partitioning of
the total variability in the response variable into components that
are consistent with a model for the experiment
The basic single-factor ANOVA model is
$y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \quad i = 1, 2, \ldots, a, \quad j = 1, 2, \ldots, n$
$\mu$ = an overall mean, $\tau_i$ = $i$th treatment effect,
$\varepsilon_{ij}$ = experimental error, $NID(0, \sigma^2)$
Models for the Data
There are several ways to write a model for the data:
$y_{ij} = \mu + \tau_i + \varepsilon_{ij}$ is called the effects model
Let $\mu_i = \mu + \tau_i$, then
$y_{ij} = \mu_i + \varepsilon_{ij}$ is called the means model
Regression models can also be employed
Total variability is measured by the total sum of squares:
$SS_T = \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2$
The basic ANOVA partitioning is:
$\sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2 = \sum_{i=1}^{a}\sum_{j=1}^{n} [(\bar{y}_{i.} - \bar{y}_{..}) + (y_{ij} - \bar{y}_{i.})]^2$
$= n\sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{a}\sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2$
$SS_T = SS_{Treatments} + SS_E$
A large value of $SS_{Treatments}$ reflects large differences in treatment
means
A small value of $SS_{Treatments}$ likely indicates no differences in
treatment means
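The partitioning of the total sum of squares can be verified numerically. The sketch below assumes a balanced design (the same number of replicates in every treatment) and uses a small made-up data set with three treatments and three replicates; it is not the etch rate data from the text.

```python
def anova_partition(groups):
    """Return (SS_T, SS_Treatments, SS_E) for a balanced single-factor design."""
    a = len(groups)        # number of treatments
    n = len(groups[0])     # replicates per treatment
    grand = sum(sum(g) for g in groups) / (a * n)   # grand average
    means = [sum(g) / n for g in groups]            # treatment averages
    ss_total = sum((y - grand) ** 2 for g in groups for y in g)
    ss_treat = n * sum((m - grand) ** 2 for m in means)
    ss_error = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    return ss_total, ss_treat, ss_error

# Hypothetical data: one inner list per treatment level
groups = [[1.0, 2.0, 3.0],
          [2.0, 3.0, 4.0],
          [6.0, 7.0, 8.0]]

ss_t, ss_tr, ss_e = anova_partition(groups)
print(f"SS_T = {ss_t}, SS_Treatments = {ss_tr}, SS_E = {ss_e}")
```

For this data the treatment averages are 2, 3 and 7 around a grand average of 4, so most of the total variability sits in the treatment term, which is the pattern described above for a large $SS_{Treatments}$.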
Formal statistical hypotheses are:
$H_0: \mu_1 = \mu_2 = \cdots = \mu_a$
$H_1$: At least one mean is different
While sums of squares cannot be directly compared to test the
hypothesis of equal means, mean squares can be compared.
A mean square is a sum of squares divided by its degrees of freedom:
$df_{Total} = df_{Treatments} + df_{Error}$
$an - 1 = (a - 1) + a(n - 1)$
$MS_{Treatments} = \frac{SS_{Treatments}}{a - 1}, \quad MS_E = \frac{SS_E}{a(n - 1)}$
If the treatment means are equal, the treatment and error mean
squares will be (theoretically) equal.
If treatment means differ, the treatment mean square will be larger than
the error mean square.
Analysis of Variance: Summarized
Computing: see text, pp. 66-70
The test statistic is the ratio of mean squares, $F_0 = \frac{MS_{Treatments}}{MS_E}$
The reference distribution for $F_0$ is the $F_{a-1, a(n-1)}$ distribution
Reject the null hypothesis (equal treatment means) if
$F_0 > F_{\alpha, a-1, a(n-1)}$
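The step from sums of squares to the test decision can be sketched as follows. The sums of squares are made-up values for a design with $a = 3$ treatments and $n = 3$ replicates, and the critical value 5.14 is the tabulated $F_{0.05, 2, 6}$; for any other design it must be looked up for the matching degrees of freedom.

```python
def one_way_f(ss_treat, ss_error, a, n):
    """Return (MS_Treatments, MS_E, F0) for a balanced single-factor ANOVA."""
    ms_treat = ss_treat / (a - 1)         # a - 1 treatment degrees of freedom
    ms_error = ss_error / (a * (n - 1))   # a(n - 1) error degrees of freedom
    return ms_treat, ms_error, ms_treat / ms_error

# Hypothetical sums of squares for a = 3 treatments, n = 3 replicates
ms_tr, ms_e, f0 = one_way_f(ss_treat=42.0, ss_error=6.0, a=3, n=3)

f_crit = 5.14             # F_{0.05, 2, 6} from a standard F table
reject = f0 > f_crit      # reject equal treatment means if F0 exceeds it

print(f"MS_Treatments = {ms_tr}, MS_E = {ms_e}, F0 = {f0}, reject H0: {reject}")
```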
ANOVA Table: Example 3-1
The Reference Distribution:
ANOVA calculations are usually done via computer
Calculations can be done on Minitab, NCSS, Excel,
Matlab, Scilab, etc.
Model Adequacy Checking in the ANOVA
Text reference, Section 3-4, pg. 75
Checking assumptions is important
Normality
Constant variance
Independence
Have we fit the right model?
Later we will talk about what to do if some of these
assumptions are violated
Model Adequacy Checking in the ANOVA
Examination of residuals (see text, Sec. 3-4, pg. 75)
$e_{ij} = y_{ij} - \hat{y}_{ij} = y_{ij} - \bar{y}_{i.}$
NCSS generates the residuals
Residual plots are very useful
Normal probability plot of residuals
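For readers without NCSS, the residuals and the coordinates of a normal probability plot can be computed directly. This is a minimal sketch on made-up data; the plotting position $(k - 0.5)/N$ is one common convention and is an assumption here, since other packages use slightly different formulas.

```python
from statistics import NormalDist

def residuals(groups):
    """Residuals e_ij = y_ij - ybar_i. for a single-factor experiment."""
    out = []
    for g in groups:
        mean = sum(g) / len(g)           # fitted value is the treatment average
        out.append([y - mean for y in g])
    return out

def normal_plot_points(values):
    """Pair each sorted value with a standard normal quantile."""
    ordered = sorted(values)
    n_total = len(ordered)
    std_normal = NormalDist()
    # Plotting position (k - 0.5) / N for the k-th smallest value
    scores = [std_normal.inv_cdf((k - 0.5) / n_total) for k in range(1, n_total + 1)]
    return list(zip(scores, ordered))

# Hypothetical data: one inner list per treatment level
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]]

res = residuals(groups)
flat = [e for row in res for e in row]
points = normal_plot_points(flat)   # (normal score, residual) pairs to plot

for score, e in points:
    print(f"{score:7.3f}  {e:5.1f}")
```

If the normality assumption is reasonable, the plotted pairs fall roughly along a straight line.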
Other Important Residual Plots
Post-ANOVA Comparison of Means
The analysis of variance tests the hypothesis of equal treatment
means
Assume that residual analysis is satisfactory
If that hypothesis is rejected, we don't know which specific means
are different
Determining which specific means differ following an ANOVA is
called the multiple comparisons problem
There are lots of ways to do this: see text, Section 3-5, pg. 87
We will use pairwise t-tests on means, sometimes called Fisher's
Least Significant Difference (or Fisher's LSD) Method
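For a balanced design, Fisher's LSD method declares two treatment means different when their absolute difference exceeds $LSD = t_{\alpha/2, N-a}\sqrt{2 MS_E / n}$. The sketch below applies that rule to made-up treatment averages; the function name, the data, and the tabulated value $t_{0.025, 6} = 2.447$ are assumptions for this example and should be replaced by values for the actual experiment.

```python
import math
from itertools import combinations

def fisher_lsd(means, ms_error, n, t_crit):
    """Return (LSD, {(i, j): differs}) for all pairs of treatment means."""
    lsd = t_crit * math.sqrt(2 * ms_error / n)
    differs = {}
    # Compare every pair of treatments, numbered from 1
    for (i, mi), (j, mj) in combinations(enumerate(means, start=1), 2):
        differs[(i, j)] = abs(mi - mj) > lsd
    return lsd, differs

# Hypothetical results: a = 3 treatments, n = 3 replicates, N - a = 6 error df
means = [2.0, 3.0, 7.0]
lsd, differs = fisher_lsd(means, ms_error=1.0, n=3, t_crit=2.447)

print(f"LSD = {lsd:.3f}")
for pair, flag in differs.items():
    print(pair, "different" if flag else "not significantly different")
```

The comparisons are only meaningful after the ANOVA F-test has rejected equality of means, which is the order of steps described above.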
Two-Factor, Multiple levels Experiment
$a$ levels of factor A; $b$ levels of factor B; $n$ replicates
Extension of the ANOVA to Factorials
$\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (y_{ijk} - \bar{y}_{...})^2 = bn\sum_{i=1}^{a} (\bar{y}_{i..} - \bar{y}_{...})^2 + an\sum_{j=1}^{b} (\bar{y}_{.j.} - \bar{y}_{...})^2$
$+ n\sum_{i=1}^{a}\sum_{j=1}^{b} (\bar{y}_{ij.} - \bar{y}_{i..} - \bar{y}_{.j.} + \bar{y}_{...})^2 + \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (y_{ijk} - \bar{y}_{ij.})^2$
$SS_T = SS_A + SS_B + SS_{AB} + SS_E$
df breakdown:
$abn - 1 = (a - 1) + (b - 1) + (a - 1)(b - 1) + ab(n - 1)$
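The two-factor identity can be checked the same way as the single-factor one. The sketch below assumes a balanced design and uses a made-up $2 \times 2$ layout with two replicates per cell; the function name and the data are illustrative only.

```python
def two_factor_partition(data):
    """Sums of squares for a balanced two-factor design.

    data[i][j] is the list of n replicates at level i of A and level j of B.
    """
    a, b, n = len(data), len(data[0]), len(data[0][0])
    cell = [[sum(data[i][j]) / n for j in range(b)] for i in range(a)]
    row = [sum(cell[i]) / b for i in range(a)]                        # A averages
    col = [sum(cell[i][j] for i in range(a)) / a for j in range(b)]   # B averages
    grand = sum(row) / a
    cells = [(i, j) for i in range(a) for j in range(b)]
    return {
        "A": b * n * sum((r - grand) ** 2 for r in row),
        "B": a * n * sum((c - grand) ** 2 for c in col),
        "AB": n * sum((cell[i][j] - row[i] - col[j] + grand) ** 2 for i, j in cells),
        "E": sum((y - cell[i][j]) ** 2 for i, j in cells for y in data[i][j]),
        "T": sum((y - grand) ** 2 for i, j in cells for y in data[i][j]),
    }

# Hypothetical data: a = 2 levels of A, b = 2 levels of B, n = 2 replicates
data = [[[1.0, 3.0], [4.0, 6.0]],
        [[5.0, 7.0], [12.0, 14.0]]]

ss = two_factor_partition(data)
print(ss)
print("identity holds:", abs(ss["T"] - (ss["A"] + ss["B"] + ss["AB"] + ss["E"])) < 1e-9)
```

Averaging the cell averages to get the row, column and grand averages is only valid because every cell has the same number of replicates.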
ANOVA Table: Fixed Effects Case
NCSS and Minitab will perform the computations
Text gives details of manual computing: see pp.
169 & 170
Analysis of Variance Table
Source                    Sum of      Mean                   Prob         Power
Term               DF     Squares     Square      F-Ratio    Level        (Alpha=0.05)
A: C2              2      900801.2    450400.6    2563.41    0.000000*    1.000000
B: C3              2      420599.2    210299.6    1196.90    0.000000*    1.000000
AB                 4      809992.1    202498      1152.50    0.000000*    1.000000
S                  18     3162.667    175.7037
Total (Adjusted)   26     2134555
Total              27
* Term significant at alpha = 0.05

Means and Effects Section
                                 Standard
Term         Count    Mean       Error       Effect
All          27       478.2592               478.2592
A: C2
1            9        468.7778   4.418442    -9.481482
2            9        706.5555   4.418442    228.2963
3            9        259.4445   4.418442    -218.8148
B: C3
1            9        305.4445   4.418442    -172.8148
2            9        595.7778   4.418442    117.5185
3            9        533.5555   4.418442    55.2963
AB: C2,C3
1,1          3        16.33333   7.652967    -279.6296
1,2          3        796.6667   7.652967    210.3704
1,3          3        593.3333   7.652967    69.25926
2,1          3        538.6667   7.652967    4.925926
2,2          3        708        7.652967    -116.0741
2,3          3        873        7.652967    111.1481
3,1          3        361.3333   7.652967    274.7037
3,2          3        282.6667   7.652967    -94.2963
3,3          3        134.3333   7.652967    -180.4074
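The entries of the ANOVA table above follow from the sums of squares and degrees of freedom alone, which makes the printout easy to sanity-check. The sketch below recomputes the mean squares and F-ratios from the table values; the variable names are made up for the example.

```python
# Degrees of freedom and sums of squares copied from the ANOVA table
terms = {
    "A":  (2, 900801.2),
    "B":  (2, 420599.2),
    "AB": (4, 809992.1),
}
df_error, ss_error = 18, 3162.667   # the error term is labelled "S" in the output

ms_error = ss_error / df_error      # error mean square

# Mean square = SS / DF; F-ratio = mean square / error mean square
table = {}
for name, (df, ss_term) in terms.items():
    ms = ss_term / df
    table[name] = (ms, ms / ms_error)

ss_total = sum(ss_term for _, ss_term in terms.values()) + ss_error
df_total = sum(df for df, _ in terms.values()) + df_error

print(f"MS_E = {ms_error:.4f}")
for name, (ms, f_ratio) in table.items():
    print(f"{name:3s} MS = {ms:10.1f}  F = {f_ratio:8.2f}")
print(f"Total SS = {ss_total:.1f} on {df_total} df")
```

The recomputed values should agree with the printed mean squares, F-ratios and adjusted total to within rounding.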
Factorials with More Than Two Factors
Basic procedure is similar to the two-factor case; all $abc \cdots kn$
treatment combinations are run in random order
ANOVA identity is also similar:
$SS_T = SS_A + SS_B + \cdots + SS_{AB} + SS_{AC} + \cdots + SS_{ABC} + \cdots + SS_{AB \cdots K} + SS_E$
Complete three-factor example in text, Example 5-5
Readings
Chapters 3 and 5
