Analysis of Variance and
Covariance
Level : E2
Contents
Key concepts
Analysis of Variance
Analysis of Covariance
GLM Procedure
Analysis of Variance
ANOVA
Used to uncover the main and interaction effects of categorical
independent variables (called "factors") on an interval-scaled dependent
variable (or variables).
Example:
An experiment may measure weight change (the dependent variable) for men
and women who participated in two different weight-loss programs. The 4
cells of the design are formed by the 4 combinations of sex (men, women) and
program (A, B).
Example:
Suppose we have three group means A, B, and C. We want to test
H0 : A = B = C
against H1 : at least one of the means differs from the others.
Assumptions
The scale on which the dependent variable is measured has the properties of an equal interval
scale.
The k samples are independently and randomly drawn from the source population(s)
Main Effect
The direct effect of a single factor on the dependent variable.
Interaction Effect
The effect of one factor differs according to the levels of another factor.
The key statistic in ANOVA is the F-test of difference of group means, testing if the
means of the groups formed by values of the independent variable (or combinations of
values for multiple independent variables) are different enough not to have occurred by
chance.
However, when ANOVA is used for comparing two or more different samples, the real
means are unknown. The researcher wants to know if the difference in sample means is
enough to conclude the real means do in fact differ among two or more groups.
If the group means do not differ significantly then it is inferred that the independent
variable(s) did not have an effect on the dependent variable.
If the F test shows that overall the independent variable(s) is (are) related to the
dependent variable, then multiple comparison tests of significance are used to explore
just which values of the independent variable(s) have the most to do with the relationship.
Post-hoc Comparisons
If the null hypothesis in ANOVA is rejected, then proceed to a multiple comparison
(post-hoc comparison) test.
Duncan
Dunnett
Bonferroni, Scheffe
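In SAS, these post-hoc comparisons can be requested on the MEANS statement of PROC GLM. A minimal sketch, in which the data set Trial, factor Trt, response Y, and control level 'Control' are all hypothetical names:

```sas
proc glm data=Trial;
   class Trt;
   model Y = Trt;
   means Trt / duncan bon scheffe;   /* Duncan, Bonferroni, and Scheffe comparisons */
   means Trt / dunnett('Control');   /* Dunnett: each treatment vs. the control level */
run;
```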
Suppose we are testing the null hypothesis that the four sample means are equal,
H0 : μ1 = μ2 = μ3 = μ4
against H1 : not all of μ1, μ2, μ3, μ4 are equal,
and this hypothesis is rejected.
The F test in ANOVA tells us that at least one mean is not the same as the others, but it
does not specify which particular mean it is.
One possible way to detect which particular sample mean is different is to conduct the
following six pairwise tests: μ1 vs. μ2, μ1 vs. μ3, μ1 vs. μ4, μ2 vs. μ3, μ2 vs. μ4, and μ3 vs. μ4.
Unbalanced Designs
A design is unbalanced if the sample sizes for the treatment combinations are not all equal.
Confounding is the condition in which the effects of two (or more) explanatory variables
cannot be distinguished from each other.
Type II sums of squares are the reduction in SSE due to adding the effect to a model that
contains all other effects except those that contain the effect being tested.
Type III sums of squares are each adjusted for all other effects in the model.
If the model does not contain any interaction term, both types lead to the same output.
For the highest-order interaction term, the two methods always provide the same estimate.
If interaction can safely be ignored, Type II provides a more powerful test of the
significance of a main effect than Type III.
If there is not sufficient reason to ignore interactions, Type III should be used. This is the
default in most statistical software.
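In PROC GLM, the different sums of squares can be requested explicitly as MODEL statement options; a sketch with hypothetical data set and variable names:

```sas
proc glm data=Trial;
   class A B;
   model Y = A B A*B / ss1 ss2 ss3;   /* print Type I, II, and III sums of squares */
run;
```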
SAS Implementation
Example
title1 'Nitrogen Content';

data Clover;
   input Strain $ Nitrogen @@;
   datalines;
1 19.4   1 32.6   1 27.0   1 32.1   1 33.0
5 17.7   5 24.8   5 27.9   5 25.2   5 24.3
4 17.0   4 19.4   4  9.1   4 11.9   4 15.8
7 20.7   7 21.0   7 20.5   7 18.8   7 18.6
13 14.3  13 14.4  13 11.8  13 11.6  13 14.2
15 17.3  15 19.4  15 19.1  15 16.9  15 20.8
;
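The analysis step itself is not shown above; a one-way analysis of the Clover data, consistent with the output discussed below, can be run as:

```sas
proc anova data=Clover;
   class Strain;              /* six bacterial strains */
   model Nitrogen = Strain;   /* one-way analysis of variance */
run;
```

PROC GLM with the same CLASS and MODEL statements would give identical results for this balanced design.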
Partial output:

Source    F Value    Pr > F
Strain      14.37    <.0001
The degrees of freedom (DF) column should be used to check the analysis
results. The model degrees of freedom for a one-way analysis of variance
are the number of levels minus 1; in this case, 6-1=5. The Corrected Total
degrees of freedom are always the total number of observations minus one;
in this case, 30-1=29. The sum of the Model and Error degrees of freedom
equals the Corrected Total degrees of freedom.
The overall F test is significant (F=14.37, p<0.0001), indicating that the
model as a whole accounts for a significant portion of the variability in the
dependent variable. The F test for Strain is significant, indicating that some
contrast between the means for the different strains is different from zero.
Notice that the Model and Strain F tests are identical, since Strain is the
only term in the model.
The F test for Strain (F=14.37, p<0.0001) suggests that there are
differences among the bacterial strains, but it does not reveal any
information about the nature of the differences. Mean comparison methods
can be used to gather further information.
Analysis of Covariance
A combination of linear regression and ANOVA: categorical factors are tested while the
dependent variable is adjusted for one or more continuous covariates.
If a covariate is correlated with a predictor, some portion of the effect of the predictor is
removed from the dependent variable when the covariate adjustment is calculated.
The rules for sums of squares, degrees of freedom, etc. remain as they were in the case of
ANOVA.
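In PROC GLM, an ANCOVA model is fit by adding the continuous covariate to the MODEL statement; a sketch in which the factor Trt and covariate BaselineX are hypothetical names:

```sas
proc glm data=Trial;
   class Trt;                 /* only the categorical factor is listed in CLASS */
   model Y = Trt BaselineX;   /* BaselineX enters as a continuous covariate */
   lsmeans Trt / pdiff;       /* covariate-adjusted treatment means and pairwise tests */
run;
```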
GLM Procedures
The general linear model (GLM) is a statistical
linear model. It may be written as
Y = XB + U
where Y is a matrix of multivariate measurements, X is a design matrix, B is a matrix of parameters
that are usually to be estimated, and U is a matrix of residuals (i.e., errors or noise). The residuals
are usually assumed to follow a multivariate normal distribution. If they do not, generalized linear
models may be used to relax the assumptions about Y and U.
The GLM procedure uses the method of least squares to fit general linear models.
GLM handles models relating one or several continuous dependent variables to one or several
independent variables. The independent variables may be either classification variables, which divide the
observations into discrete groups, or continuous variables.
Thus, the GLM procedure can be used for many different analyses, including
simple regression
multiple regression
response-surface models
weighted regression
polynomial regression
partial correlation
PROC GLM handles models relating one or several continuous dependent variables
to one or several independent variables.
MEANS computes means of the dependent variable for each value of the specified
effect
LSMEANS produces means for the outcome variable, broken out by the variable
specified and adjusting for any other explanatory variables included on the MODEL
statement.
OUTPUT specifies an output data set that contains all variables from the input data
set and variables representing statistics from the analysis.
Example
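The example code itself is not shown; a data step and PROC GLM call consistent with the output that follows (these seven cell values reproduce the printed sums of squares; the level labels are assumptions):

```sas
data Unbalanced;
   input A $ B $ Y @@;
   datalines;
A1 B1 12  A1 B1 14  A1 B2 11  A1 B2 9
A2 B1 20  A2 B1 18  A2 B2 17
;

proc glm data=Unbalanced;
   class A B;
   model Y = A B A*B / ss1 ss3;   /* request Type I and Type III sums of squares */
run;
```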
Result
Analysis of Unbalanced 2-by-2 Factorial

The GLM Procedure
Dependent Variable: Y

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               3       91.71428571    30.57142857      15.29    0.0253
Error               3        6.00000000     2.00000000
Corrected Total     6       97.71428571

R-Square    Coeff Var    Root MSE    Y Mean
0.938596     9.801480    1.414214    14.42857

Source    DF      Type I SS    Mean Square    F Value    Pr > F
A          1    80.04761905    80.04761905      40.02    0.0080
B          1    11.26666667    11.26666667       5.63    0.0982
A*B        1     0.40000000     0.40000000       0.20    0.6850

Source    DF    Type III SS    Mean Square    F Value    Pr > F
A          1    67.60000000    67.60000000      33.80    0.0101
B          1    10.00000000    10.00000000       5.00    0.1114
A*B        1     0.40000000     0.40000000       0.20    0.6850
Interpretation
The degrees of freedom may be used to check your data. The Model degrees of freedom for a 2×2
factorial design with interaction are (ab - 1), where a is the number of levels of A and b is the
number of levels of B; in this case, (2×2 - 1) = 3. The Corrected Total degrees of freedom are always
one less than the number of observations used in the analysis; in this case, 7-1=6.
The overall F test is significant (F=15.29, p=0.0253), indicating strong evidence that the means for
the four different AB cells are different. You can further analyze this difference by examining the
individual tests for each effect.
Four types of estimable functions of parameters are available for testing hypotheses in PROC GLM.
For data with no missing cells, the Type III and Type IV estimable functions are the same and test
the same hypotheses that would be tested if the data were balanced. Type I and Type III sums of
squares are typically not equal when the data are unbalanced; Type III sums of squares are
preferred in testing effects in unbalanced cases because they test a function of the underlying
parameters that is independent of the number of observations per treatment combination.
At the 5% significance level, the A*B interaction is not significant (F=0.20, p=0.6850).
This indicates that the effect of A does not depend on the level of B and vice versa. Therefore, the
tests for the individual effects are valid, showing a significant A effect (F=33.80, p=0.0101) but no
significant B effect (F=5.00, p=0.1114).
Questions?
Key Concepts
Non-Parametric Tests
Parametric
These methods need a distributional assumption about the population
from which samples are drawn.
They require a sufficiently large sample size.
Non-Parametric
These methods need no distributional assumption about the population from which
samples are drawn; that is, they are distribution-free tests.
They should be used when the sample size is small.
Assumptions
Dependent variable is continuous, capable of producing measures carried out to the nth decimal
place.
Measures within the two samples have the properties of at least an ordinal scale of measurement, so
that it is meaningful to speak of "greater than," "less than," and "equal to."
Data can be ranked, including tied rank values wherever appropriate. Ranking helps to focus only on
the ordinal relationships among the raw measures: "greater than," "less than," and "equal to."
Options on the PROC NPAR1WAY statement are:

Task                            Option      Description
Specify the input data set      DATA=       names the SAS data set to be analyzed
Handle missing CLASS values     MISSING     treats missing values of the CLASS variable as a valid class level
Suppress displayed output       NOPRINT     suppresses the display of all output
Request analyses                WILCOXON    requests an analysis of Wilcoxon scores (the rank-sum test for two samples; the Kruskal-Wallis test for more than two)
The CLASS variable identifies groups (or samples) in the data. The variable
can be character or numeric.
The FREQ statement names a numeric variable that provides a frequency for
each observation in the DATA= data set.
The VAR statement names the response or dependent variables to be analyzed. These
variables must be numeric. If the VAR statement is omitted, the procedure analyzes all
numeric variables in the data set except for the CLASS variable, the FREQ variable,
and the BY variables.
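Putting these statements together, a minimal PROC NPAR1WAY call for a two-group comparison might look like the following; the data set BackPain and variables Drug and PainRating are hypothetical names:

```sas
proc npar1way data=BackPain wilcoxon;
   class Drug;        /* identifies the treatment groups */
   var PainRating;    /* the response variable to be analyzed */
run;
```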
Computation options are:

Options          Description
ALPHA=value      specifies the level of the confidence limits for Monte Carlo p-value
                 estimates. The value of the ALPHA= option must be between 0 and 1;
                 the default is 0.01, which produces 99% confidence limits
                 for the Monte Carlo estimates.
MAXTIME=value    specifies the maximum clock time (in seconds) that PROC NPAR1WAY
                 can use to compute an exact p-value. If the procedure does not complete
                 the computation within the specified time, the computation terminates.
MC               requests Monte Carlo estimation of exact p-values instead of direct
                 exact computation.
N=n              specifies the number of samples for Monte Carlo estimation. The value of
                 the N= option must be a positive integer; the default is 10,000
                 samples. Larger values of n produce more precise estimates of exact p-values.
POINT            requests exact point probabilities for the test statistics.
SEED=number      specifies the initial seed for random number generation for Monte Carlo
                 estimation. The value of the SEED= option must be an integer.
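These computation options appear on the EXACT statement; a sketch requesting a Monte Carlo estimate of the exact Wilcoxon p-value (the data set and variable names are hypothetical):

```sas
proc npar1way data=BackPain wilcoxon;
   class Drug;
   var PainRating;
   exact wilcoxon / mc n=20000 alpha=0.01 seed=12345 maxtime=60;   /* Monte Carlo exact p-value */
run;
```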
Examples
Global evaluations of drug A and drug B in back pain: In a treatment study it was found that
patients with low back pain experienced a decrease in pain after 6 to 8 weeks of daily
treatment. A study was therefore conducted to determine whether this phenomenon is a
drug-related response or coincidental. For this, patients were asked to provide a global
rating of their pain, relative to baseline, on the following scale
Kruskal-Wallis Test
Introduction
Used to compare population location parameters among two or more groups based on independent
samples.
Used to test the null hypothesis that all populations have identical distribution functions against the
alternative hypothesis that at least two of the samples differ only with respect to location.
Assumptions
Friedman Test
Introduction
Models the ratings of n judges (rows) on k treatments (columns).
A generalization of the sign test and the Spearman rank correlation test: it reduces to the
sign test if there are two columns and to the Spearman rank correlation test if there are
two rows.
Also called two-way analysis on ranks, as it is used for two-way repeated-measures
analysis of variance by ranks.
Used to test the null hypothesis that the treatments have identical effects against the
alternative hypothesis that at least one treatment differs from at least one other
treatment.
Assumptions
n rows are mutually independent. (i.e. results within one row do not affect the results within other
rows)
SAS Implementation
Use PROC FREQ with the CMH2 option in the TABLES statement.
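A sketch of the Friedman test via PROC FREQ, in which the block variable Judge, treatment variable Treatment, and response Rating are hypothetical names; with rank scores, the CMH row mean scores (ANOVA) statistic is Friedman's chi-square:

```sas
proc freq data=Ratings;
   /* blocks * treatments * response, with rank scores */
   tables Judge*Treatment*Rating / cmh2 scores=rank noprint;
run;
```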
Friedman Test
Syntax
BY
calculates separate frequency or crosstabulation tables for each BY group.
EXACT requests exact tests for specified statistics.
OUTPUT
creates an output data set that contains specified statistics.
TABLES specifies frequency or crosstabulation tables and requests tests and measures of
association.
TEST requests asymptotic tests for measures of association and agreement.
WEIGHT
identifies a variable with values that weight each observation.
Friedman Test
Options
AGREE       McNemar's test for 2×2 tables, simple kappa coefficient, and weighted kappa
            coefficient
BINOMIAL    binomial proportion test for one-way tables
CHISQ       chi-square goodness-of-fit test for one-way tables; Pearson chi-square,
            likelihood-ratio chi-square, and Mantel-Haenszel chi-square tests for two-way tables
COMOR       confidence limits for the common odds ratio for h 2×2 tables; common odds ratio
            test
FISHER      Fisher's exact test
JT          Jonckheere-Terpstra test
KAPPA       test for the simple kappa coefficient
LRCHI       likelihood-ratio chi-square test
MCNEM       McNemar's test
MEASURES    tests for the Pearson correlation and the Spearman correlation, and the odds ratio
            confidence limits for 2×2 tables
MHCHI       Mantel-Haenszel chi-square test
OR          confidence limits for the odds ratio for 2×2 tables
PCHI        Pearson chi-square test
PCORR       test for the Pearson correlation coefficient
SCORR       test for the Spearman correlation coefficient
TREND       Cochran-Armitage test for trend
WTKAP       test for the weighted kappa coefficient
McNemar Test
Introduction
Determines whether the row and column marginal frequencies are equal.
Used when dichotomous outcomes are recorded twice for each patient under different conditions
(e.g., different treatments or different measurement times).
Assumptions
The data consist of paired observations of labels (A, B).
Applied to 2×2 contingency tables with a dichotomous trait and matched pairs of subjects.
Used only when the conditions for the normal approximation apply.
SAS Implementation
Example
Comparing response rates (e.g., normal and abnormal laboratory results for a group of
patients, collected pre- and post-study) when patients are treated with a particular drug,
say A. (Here, we need to test whether there is a change in the pre- to post-treatment
rates of abnormalities.)
Suppose the following program has been run, where the aim is to compare response
rates (yes/no) of cases and controls.
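A sketch of such a program, in which the data set, variable names, and counts are all hypothetical; the AGREE option on the TABLES statement produces McNemar's test:

```sas
data PrePost;
   input Pre $ Post $ Count @@;
   datalines;
Yes Yes 20  Yes No 5  No Yes 15  No No 60
;

proc freq data=PrePost;
   tables Pre*Post / agree;   /* AGREE requests McNemar's test (and kappa) */
   weight Count;              /* Count gives the frequency of each cell */
run;
```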
Log-Rank Test
Introduction
Used for comparing distributions of the time until an event of interest (e.g., death, cure, failure,
relapse) occurs among independent groups.
Used to test the null hypothesis that there is no difference between the populations in the
probability of an event at any time point.
Used when the Wilcoxon test is not applicable (i.e., its censoring condition is not satisfied).
Most likely to detect a difference between groups when the risk of an event is consistently greater
for one group than for the other.
Assumptions
Survival probabilities are the same for subjects recruited early and late in the study, and the events
happened at the times specified.
SAS Implementation
PROC LIFETEST
The output shows a chi-square statistic and p-value for each test.
PROC LIFETEST < options > ;
TIME variable < *censor(list) > ;
BY variables ;
FREQ variable ;
ID variables ;
STRATA variable < (list) > < ... variable < (list) > > ;
SURVIVAL options ;
TEST variables ;
Run;
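A concrete sketch of a two-group log-rank comparison using this syntax; the data set Surv and variables Weeks, Censor, and Drug are hypothetical names:

```sas
proc lifetest data=Surv;
   time Weeks*Censor(0);   /* event/censoring times; Censor=0 flags censored values */
   strata Drug;            /* prints log-rank and Wilcoxon tests across the groups */
run;
```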
The variable in the FREQ statement identifies a variable containing the frequency
of occurrence of each observation.
The ID variable values are used to label the observations of the product-limit
survival function estimates.
The STRATA statement indicates which variables determine strata levels for
the computations. The strata are formed according to the nonmissing values of
the designated strata variables.
Options available with STRATA statement
MISSING           allows missing values as a valid stratum level.
GROUP=variable    specifies the variable whose formatted values identify the various
                  samples whose underlying survival curves are to be compared.
NODETAIL          suppresses the display of the rank statistics and the corresponding
                  covariance matrices for the various strata.
NOTEST            suppresses the k-sample tests, stratified tests, and trend tests.
TREND             computes trend tests for testing the null hypothesis that the k
                  population hazard rates are the same against ordered alternatives.
TEST=(list)       enables you to select the weight functions for the k-sample tests,
                  stratified tests, or trend tests. You can specify a list containing one
                  or more of the following keywords