
The Assumptions of ANOVA

Dennis Monday, Gary Klein, and Sunmi Lee

May 10, 2005

Major Assumptions of Analysis of Variance


The Assumptions
- Independence
- Normally distributed errors
- Homogeneity of variances

Our Purpose
- Examine these assumptions
- Provide various tests for these assumptions
  - Theory
  - Sample SAS code (SAS, Version 8.2)
- Consequences when these assumptions are not met
- Remedial measures

Normality

Why normal?
- ANOVA is an Analysis of Variance: more specifically, an analysis of the ratio of two variances
- Statistical inference is based on the F distribution, which is the ratio of two independent chi-squared random variables, each divided by its degrees of freedom
- It is no surprise, then, that each variance in the ANOVA ratio comes from a parent normal distribution
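Concretely, the F distribution has the standard textbook form (not specific to these slides):

\[ F_{d_1,\,d_2} = \frac{\chi^2_{d_1} / d_1}{\chi^2_{d_2} / d_2} \]

where the two chi-squared variables are independent. Under normal errors (and the null hypothesis for the numerator), the between- and within-treatment mean squares, each divided by σ², are exactly such chi-squared variables over their degrees of freedom.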

- The calculations can always be carried out no matter what the distribution is; separating the sums of squares is a purely algebraic exercise
- Normality is only needed for statistical inference

Normality Tests
Wide variety of tests we can perform to test if the data follows a normal distribution. Mardia (1980) provides an extensive list for both the univariate and multivariate cases, categorizing them into two types
Properties of normal distribution, more specifically, the first four moments of the normal distribution
Shapiro-Wilks W (compares the ratio of the standard deviation to the variance multiplied by a constant to one)

Goodness-of-fit tests,
Kolmogorov-Smirnov D Cramer-von Mises W2 Anderson-Darling A2
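For the record, the W statistic has the standard textbook form (not taken from these slides):

\[ W = \frac{\left(\sum_{i=1}^{n} a_i\, x_{(i)}\right)^{2}}{\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^{2}} \]

where the x_(i) are the ordered observations and the a_i are constants derived from the expected values and covariances of normal order statistics; W near 1 supports normality.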

Normality Tests

proc univariate data=temp normal plot;
  var expvar;
run;

proc univariate data=temp normal plot;
  var normvar;
run;


Tests for Normality (expvar)

Test                  Statistic        p Value
Shapiro-Wilk          W     0.731203   Pr < W      <0.0001
Kolmogorov-Smirnov    D     0.206069   Pr > D      <0.0100
Cramer-von Mises      W-Sq  1.391667   Pr > W-Sq   <0.0050
Anderson-Darling      A-Sq  7.797847   Pr > A-Sq   <0.0050

[Normal probability plot for expvar omitted: the points bow sharply away from the reference line, indicating non-normality.]

Tests for Normality (normvar)

Test                  Statistic        p Value
Shapiro-Wilk          W     0.989846   Pr < W       0.6521
Kolmogorov-Smirnov    D     0.057951   Pr > D      >0.1500
Cramer-von Mises      W-Sq  0.03225    Pr > W-Sq   >0.2500
Anderson-Darling      A-Sq  0.224264   Pr > A-Sq   >0.2500

[Plot output omitted: stem-and-leaf plot and boxplot for expvar (strongly right-skewed, with high outliers) and normal probability plot, stem-and-leaf plot, and boxplot for normvar (roughly symmetric, tracking the normal reference line).]

Consequences of Non-Normality
- The F-test is very robust against non-normal data, especially in a fixed-effects model
- Large samples approximate normality by the Central Limit Theorem (recommended sample size > 50)
- Simulations have shown that unequal sample sizes between treatment groups magnify any departure from normality
- A large deviation from normality leads to hypothesis-test conclusions that are too liberal, and to a decrease in power and efficiency

Remedial Measures for Non-Normality
- Data transformation
  - Be aware: transformations may fundamentally change the relationship between the dependent and independent variables and are not always recommended
- Don't use the standard F-test; instead:
  - Modified F-tests
    - Adjust the degrees of freedom
    - Rank F-test (capitalizes on the F-test's robustness)
  - Randomization test on the F-ratio
  - Other non-parametric tests, if the distribution is unknown
  - A custom likelihood-ratio test, if the distribution is known

Independence
- Independent observations
  - No correlation between error terms
  - No correlation between the independent variables and the errors
- Positively correlated data inflate the standard error: the treatment means are estimated more accurately than the standard error shows

Independence Tests
- If we have some notion of how the data were collected, we can check whether any autocorrelation exists
- The Durbin-Watson statistic looks at the correlation between each value and the value before it (see the formula below)
  - Data must be sorted in the correct order for meaningful results
  - For example, if we suspect the results could depend on time, samples would be ordered by collection time
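With residuals e_t taken in collection order, the Durbin-Watson statistic has the standard textbook form (not specific to these slides):

\[ D = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2} \]

Values near 2 indicate no first-order autocorrelation; values near 0 indicate strong positive autocorrelation.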

Independence Tests

proc glm data=temp;
  class trt;
  model y = trt / p;
  output out=out_ds r=resid_var;
run; quit;

data out_ds;
  set out_ds;
  time = _n_;
run;

proc gplot data=out_ds;
  plot resid_var * time;
run; quit;
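PROC GLM does not print a Durbin-Watson statistic itself (PROC REG's DW model option does). A minimal data-step sketch, not from the original slides, assuming the out_ds and resid_var names above with residuals already in time order and no missing values:

/* Compute Durbin-Watson D from the time-ordered residuals */
data dw;
  set out_ds end=last;
  lagres = lag(resid_var);                        /* previous residual */
  if _n_ > 1 then num + (resid_var - lagres)**2;  /* running numerator */
  den + resid_var**2;                             /* running denominator */
  if last then do;
    d = num / den;     /* near 2 => no first-order autocorrelation */
    put 'Durbin-Watson D = ' d;
    output;
  end;
  keep d;
run;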

Autocorrelated sample:
  First Order Autocorrelation   0.90931
  Durbin-Watson D               0.12405

Independent sample:
  First Order Autocorrelation   0.00479029
  Durbin-Watson D               1.96904290

Remedial Measures for Dependent Data
- The first defense against dependent data is proper study design and randomization
  - Designs can be implemented that take correlation into account, e.g., a crossover design
- Look for unaccounted-for environmental factors
  - Add covariates to the model if they are causing the correlation, e.g., quantified learning curves
- If no underlying factors can be found to explain the autocorrelation:
  - Use a different model, e.g., a random-effects model
  - Transform the independent variables using the correlation coefficient

Homogeneity of Variances
Eisenhart (1947) describes the problem of unequal variances as follows:
- The ANOVA model is based on the ratio of the treatment mean square to the residual mean square
- The residual mean square is the unbiased estimator of σ², the variance of a single observation
- The between-treatment mean square takes into account not only the differences between observations, σ², just like the residual mean square, but also the variance between treatments
- If there is non-constant variance among treatments, we can replace the residual mean square with some overall variance, σa², and a treatment variance, σt², which is some weighted version of σa²
- The neatness of ANOVA is lost
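To see that neatness concretely, recall the expected mean squares for the balanced one-way fixed-effects model with t treatments, n observations per treatment, and treatment effects τ_i (standard textbook results, not taken from the slides):

\[ E(MS_{error}) = \sigma^2, \qquad E(MS_{treatment}) = \sigma^2 + \frac{n \sum_{i=1}^{t} \tau_i^2}{t - 1} \]

Under H0 (all τ_i = 0) the two mean squares estimate the same σ²; heterogeneous variances destroy this common baseline.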

Homogeneity of Variances
The omnibus (overall) F-test is very robust against heterogeneity of variances, especially with fixed effects and equal sample sizes. Tests for treatment differences like t-tests and contrasts are severely affected, resulting in inferences that may be too liberal or conservative.

Tests for Homogeneity of Variances
- Levene's Test
  - Computes a one-way ANOVA on the absolute values (or sometimes the squares) of the residuals, |y_ij − ȳ_i|, with t − 1 and N − t degrees of freedom
  - Considered robust to departures from normality, but too conservative
- Brown-Forsythe Test
  - A slight modification of Levene's test, where the median is substituted for the mean (Kuehl (2000) refers to it as the Levene (med) test)
- The Fmax Test
  - Takes the ratio of the largest treatment-group variance to the smallest and compares it to a table of critical values (see the formula below)
  - Tabachnik and Fidell (2001) use the Fmax ratio more as a rule of thumb than with a table of critical values, requiring that:
    - The Fmax ratio is no greater than 10
    - Sample sizes of the groups are approximately equal (ratio of smallest to largest no greater than 4)
  - No matter how the Fmax test is used, normality must be assumed
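Written out, with s_i² denoting the sample variance of treatment group i (standard definition, not from the slides):

\[ F_{max} = \frac{\max_i s_i^2}{\min_i s_i^2} \]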

Tests for Homogeneity of Variances

proc glm data=temp;
  class trt;
  model y = trt;
  means trt / hovtest=levene hovtest=bf;
run; quit;

Levene's Test for Homogeneity of Variance (ANOVA of Squared Deviations from Group Means)

Homogeneous variances:
Source   DF   Sum of Squares   Mean Square   F Value   Pr > F
TRT       1       10.2533         10.2533       0.60    0.4389
Error    98       1663.5          16.9747

Heterogeneous variances:
Source   DF   Sum of Squares   Mean Square   F Value   Pr > F
trt       1       10459.1         10459.1      36.71    <.0001
Error    98       27921.5           284.9

Brown and Forsythe's Test for Homogeneity of Variance (ANOVA of Absolute Deviations from Group Medians)

Homogeneous variances:
Source   DF   Sum of Squares   Mean Square   F Value   Pr > F
TRT       1        0.7087          0.7087       0.56    0.4570
Error    98       124.6            1.2710

Heterogeneous variances:
Source   DF   Sum of Squares   Mean Square   F Value   Pr > F
trt       1       318.3           318.3        93.45    <.0001
Error    98       333.8            3.4065

Tests for Homogeneity of Variances
- SAS (as far as I know) does not have a procedure to obtain Fmax, but it is easy to calculate (see the sketch below)
- More importantly: VARIANCE TESTS ARE ONLY FOR ONE-WAY ANOVA
  WARNING: Homogeneity of variance testing and Welch's ANOVA are only available for unweighted one-way models.
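A minimal sketch of the Fmax calculation, not from the original slides, using the same hypothetical temp/trt/y names as the earlier examples:

/* Step 1: variance of y within each treatment group */
proc means data=temp noprint;
  class trt;
  var y;
  output out=vars var=grpvar;
run;

/* Step 2: largest and smallest group variance (_type_=1 keeps per-group rows) */
proc means data=vars noprint;
  where _type_ = 1;
  var grpvar;
  output out=fmax max=maxvar min=minvar;
run;

/* Step 3: the Fmax ratio (rule of thumb: worry if it exceeds 10) */
data fmax;
  set fmax;
  fmax = maxvar / minvar;
  put 'Fmax = ' fmax;
run;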

Tests for Homogeneity of Variances (Randomized Complete Block Design and/or Factorial Design)
- In a CRD, the variance of each treatment group is checked for homogeneity
- In a factorial/RCBD design, each cell's variance should be checked:
  H0: σ²ij = σ²i′j′ for all (i, j) ≠ (i′, j′)

Tests for Homogeneity of Variances (Randomized Complete Block Design and/or Factorial Design)

Approach 1
- Code each block/treatment cell as its own group
- Run the HOVTESTs as before

data newgroup;
  set oldgroup;
  if block = 1 and treat = 1 then newgroup = 1;
  if block = 1 and treat = 2 then newgroup = 2;
  if block = 2 and treat = 1 then newgroup = 3;
  if block = 2 and treat = 2 then newgroup = 4;
  if block = 3 and treat = 1 then newgroup = 5;
  if block = 3 and treat = 2 then newgroup = 6;
run;

proc glm data=newgroup;
  class newgroup;
  model y = newgroup;
  means newgroup / hovtest=levene hovtest=bf;
run; quit;

Approach 2
- Recall that Levene's test and the Brown-Forsythe test are ANOVAs based on residuals
- Find the residual for each observation, then run the ANOVA yourself

proc sort data=oldgroup;
  by treat block;
run;

proc means data=oldgroup noprint;
  by treat block;
  var y;
  output out=stats mean=mean median=median;
run;

data newgroup;
  merge oldgroup stats;
  by treat block;
  resid = abs(mean - y);   /* absolute deviation from the cell mean */
  if block = 1 and treat = 1 then newgroup = 1;
  /* ...and so on for each cell, as in Approach 1 */
run;

proc glm data=newgroup;
  class newgroup;
  model resid = newgroup;
run; quit;

Tests for Homogeneity of Variances (Repeated-Measures Design)

Recall the repeated-measures set-up, in which each subject is measured under every treatment level:

Treatment   a1   a2   a3
            s1   s1   s1
            s2   s2   s2
            s3   s3   s3
            s4   s4   s4

Tests for Homogeneity of Variances (Repeated-Measures Design)

As there is only one score per cell, the variance of each cell cannot be computed. Instead, four assumptions need to be tested/satisfied:
- Compound symmetry
  - Homogeneity of variance in each column: σ²a1 = σ²a2 = σ²a3
  - Homogeneity of covariance between columns: σa1a2 = σa2a3 = σa3a1
- No A x S interaction (additivity)
- Sphericity
  - The variances of the difference scores between pairs are equal: σ²(Ya1−Ya2) = σ²(Ya1−Ya3) = σ²(Ya2−Ya3)

Tests for Homogeneity of Variances (Repeated-Measures Design)
- Usually, testing sphericity will suffice
- Sphericity can be tested using the Mauchly test in SAS

proc glm data=temp;
  model a1 a2 a3 = / nouni;   /* one row per subject, with columns a1-a3 */
  repeated as 3 (1 2 3) polynomial / summary printe;
run; quit;

Sphericity Tests

Variables               Mauchly's Criterion   DF   Chi-Square   Pr > ChiSq
Transformed Variates    Det = 0                2         6.01        .056
Orthogonal Components   Det = 0                2         6.03        .062

Tests for Homogeneity of Variances (Latin-Squares/Split-Plot Design)
- If there is only one score per cell, homogeneity of variances needs to be shown for the marginals of each column and each row
  - Each factor for a Latin square
  - Whole plots and subplots for a split-plot
- If there are repetitions, homogeneity is to be shown within each cell, as in the RCBD
- If there are repeated measures, follow the guidelines for sphericity, compound symmetry, and additivity as well

Remedial Measures for Heterogeneous Variances

Studies that do not involve repeated measures:
- If normality is not violated, a weighted ANOVA is suggested (e.g., Welch's ANOVA; see the sketch after this list)
- If normality is violated, the data transformation necessary to normalize the data will usually stabilize the variances as well
- If the variances are still not homogeneous, non-ANOVA tests might be your option

Studies with repeated measures:
- For violations of sphericity, modifications to the degrees of freedom have been suggested:
  - Greenhouse-Geisser
  - Huynh and Feldt
- Only do specific pairwise comparisons (sphericity does not apply to two groups; it is only at issue with more than two)
- MANOVA
- Use an MLE procedure to specify the variance-covariance matrix
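In SAS, Welch's ANOVA is requested with the WELCH option on the MEANS statement of PROC GLM. A minimal sketch, using the same hypothetical temp/trt/y names as the earlier examples:

proc glm data=temp;
  class trt;
  model y = trt;
  means trt / welch hovtest=levene;   /* Welch's variance-weighted ANOVA */
run; quit;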

Other Concerns
- Outliers and influential points
  - Data should always be checked for influential points that might bias statistical inference
  - Use scatterplots of residuals
  - Use regression-based diagnostics to detect outliers:
    - DFBETAS
    - Cook's D
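Both diagnostics are available from PROC REG. A minimal sketch with a hypothetical continuous predictor x (PROC REG has no CLASS statement, so for a one-way ANOVA the treatment would enter as dummy variables):

proc reg data=temp;
  model y = x / influence r;   /* prints DFBETAS and studentized residuals */
  output out=diag cookd=cd rstudent=rstud;
run; quit;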

References

Casella, G. and Berger, R. (2002). Statistical Inference. United States: Duxbury.
Cochran, W. G. (1947). Some Consequences When the Assumptions for the Analysis of Variance are not Satisfied. Biometrics, Vol. 3, 22-38.
Eisenhart, C. (1947). The Assumptions Underlying the Analysis of Variance. Biometrics, Vol. 3, 1-21.
Ito, P. K. (1980). Robustness of ANOVA and MANOVA Test Procedures. In Handbook of Statistics 1: Analysis of Variance (P. R. Krishnaiah, ed.), 199-236. Amsterdam: North-Holland.
Kaskey, G., et al. (1980). Transformations to Normality. In Handbook of Statistics 1: Analysis of Variance (P. R. Krishnaiah, ed.), 321-341. Amsterdam: North-Holland.
Kuehl, R. (2000). Design of Experiments: Statistical Principles of Research Design and Analysis, 2nd edition. United States: Duxbury.
Kutner, M. H., et al. (2005). Applied Linear Statistical Models, 5th edition. New York: McGraw-Hill.
Mardia, K. V. (1980). Tests of Univariate and Multivariate Normality. In Handbook of Statistics 1: Analysis of Variance (P. R. Krishnaiah, ed.), 279-320. Amsterdam: North-Holland.
Tabachnik, B. and Fidell, L. (2001). Computer-Assisted Research Design and Analysis. Boston: Allyn & Bacon.
