
Analysis of Variance

Introduction: The analysis of variance is a powerful statistical tool for tests of significance.

The test of significance based on the t-distribution is an adequate procedure only for testing the significance of the difference between two sample means. When we have three or more samples to consider at a time, an alternative procedure is needed for testing the hypothesis that all the samples are drawn from the same population, i.e., that they have the same mean. For example, suppose five fertilizers are applied to four plots each of wheat and the yield of wheat on each plot is recorded. We may be interested in finding out whether the effects of these fertilizers on the yields differ significantly, or in other words, whether the samples have come from the same normal population. The answer to this problem is provided by the technique of analysis of variance. Thus the basic purpose of the analysis of variance is to test the homogeneity of several means.

The term Analysis of Variance was introduced by Prof. R. A. Fisher in the 1920s to deal with problems in the analysis of agronomical data. Variation is inherent in nature. The total variation in any set of numerical data is due to a number of causes, which may be classified as:
1. Assignable causes
2. Chance causes
The variation due to assignable causes can be detected and measured, whereas the variation due to chance causes is beyond the control of the human hand and cannot be traced separately.

Definition: According to Prof. R. A. Fisher, Analysis of Variance (ANOVA) is the "separation of variance ascribable to one group of causes from the variance ascribable to other groups." By this technique the total variation in the sample data is expressed as the sum of its non-negative components, where each component is a measure of the variation due to some specific independent source, factor or cause.

ANOVA for One-Way Classification

When the numerical measurements across the groups are continuous and certain assumptions are met, a methodology known as analysis of variance (ANOVA) is used to compare the means of the groups.
In a sense, the term analysis of variance appears to be inappropriate, because the objective is to analyze differences among the group means. However, through the analysis of variation in the data, both among and within the groups, conclusions can be drawn about possible differences in group means. In ANOVA, the total variation in the outcome measurements is subdivided into variation that is attributable to differences among the groups and variation that is attributable to inherent variation within the groups. Within-group variation is considered experimental error; among-group variation is attributable to treatment effects. The symbol c is used to indicate the number of groups. Assuming that the c groups represent populations whose measurements are randomly and independently drawn, follow a normal distribution, and have equal variances, the null hypothesis of no differences in the population means


H0: μ1 = μ2 = ... = μc

is tested against the alternative that not all the c population means are equal:

H1: not all μj (j = 1, 2, ..., c) are equal.

With the above hypotheses, we should:
1. calculate the sum of squares of variation between the samples
2. calculate the sum of squares of variation within the samples
3. calculate the total sum of squares of variation; this will be the sum of (1) and (2), and should also be calculated directly as a verification of the calculations
4. calculate the F-ratio
5. compare the F-ratio so calculated with the critical value of the F-ratio as given in Snedecor's tables
6. draw the inference whether the null hypothesis is accepted or rejected

Short-Cut Method

If the previously mentioned method proves tedious and time-consuming, it is advisable to use a short-cut method. The various steps in the calculation of the variance ratio by the short-cut method are as follows:
1. Find the sum of the values of all the items of all the samples. Let this total be represented by T.
2. Calculate the correction factor (CF), which is equal to T^2 / N, where N is the total number of items.

3. Find the squares of all the items (not deviations) of all the samples and add them together.
4. Find the total sum of squares (SST) by subtracting the correction factor from the sum of squares of all the items of the samples [(3) - (2)].
5. Find the sum of squares between the samples (SSC) by the following method:
   a. Square each sample total and divide by the number of items in that sample. Add these figures.
   b. Subtract the correction factor from (a); the resulting figure is the sum of squares between the samples (SSC).
6. Find the sum of squares within the samples (SSE) by subtracting the sum of squares between the samples (SSC) from the total sum of squares (SST).
7. Set up the table of analysis of variance and calculate F.

Analysis of Variance Table (One-way classification)

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Squares          F-Ratio
Between Samples       SSC              c - 1                MSC = SSC / (c - 1)   F = MSC / MSE
Within Samples        SSE              N - c                MSE = SSE / (N - c)

where
SSC = Sum of Squares between samples
SSE = Sum of Squares within samples
SST = Total Sum of Squares
MSC = Mean Sum of Squares between samples
MSE = Mean Sum of Squares within samples
F = Ratio of MSC to MSE
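The short-cut steps above can be sketched in a few lines of Python. The three samples below are made-up illustrative figures, not data from the text; the variable names mirror the table (T, CF, SST, SSC, SSE, MSC, MSE, F).

```python
# Short-cut method for one-way ANOVA on three hypothetical samples.
samples = [
    [8, 10, 12, 10],   # sample 1
    [12, 11, 9, 14],   # sample 2
    [18, 12, 16, 6],   # sample 3
]

N = sum(len(s) for s in samples)            # total number of items
T = sum(sum(s) for s in samples)            # grand total of all items
CF = T ** 2 / N                             # correction factor T^2 / N

# Total sum of squares: sum of squared items minus the correction factor
SST = sum(x ** 2 for s in samples for x in s) - CF

# Between-sample sum of squares: (sample total)^2 / n_j, summed, minus CF
SSC = sum(sum(s) ** 2 / len(s) for s in samples) - CF

# Within-sample sum of squares, obtained by subtraction
SSE = SST - SSC

c = len(samples)                            # number of samples
MSC = SSC / (c - 1)                         # mean square between samples
MSE = SSE / (N - c)                         # mean square within samples
F = MSC / MSE                               # variance ratio
print(SST, SSC, SSE, F)
```

The resulting F would then be compared with the tabulated critical value for (c - 1, N - c) degrees of freedom to accept or reject the null hypothesis.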

SST = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2        (1)

where

\bar{X} = \frac{1}{n} \sum_{j=1}^{c} \sum_{i=1}^{n_j} X_{ij} = overall or grand mean

X_{ij} = ith observation in group j
n_j = number of observations in group j
n = total number of observations in all groups combined
c = number of groups
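The deviation form of equation (1) and the short-cut (correction-factor) computation of SST give identical results. A quick check with arbitrary illustrative numbers:

```python
# Verify that the sum of squared deviations from the grand mean
# equals the sum of squared items minus the correction factor T^2 / n.
groups = [[3, 5, 7], [2, 4, 6, 8], [9, 1]]   # arbitrary illustrative data

values = [x for g in groups for x in g]
n = len(values)
grand_mean = sum(values) / n                  # overall (grand) mean

# Equation (1): sum of squared deviations from the grand mean
sst_deviation = sum((x - grand_mean) ** 2 for x in values)

# Short-cut form: sum of squares of the items minus T^2 / n
T = sum(values)
sst_shortcut = sum(x ** 2 for x in values) - T ** 2 / n

print(sst_deviation, sst_shortcut)
```

Both expressions evaluate to the same total sum of squares, which is why the short-cut method may be used in place of the deviation form.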

VARIANCE ANALYSIS IN TWO-WAY CLASSIFICATION

In a one-way classification, we take into account the effect of only one variable. In a two-way classification, the effect of two variables can be studied. The procedure of analysis of variance in a two-way classification is to obtain totals for both the columns and the rows. The effect of one factor is studied through the column-wise figures and totals, and that of the other through the row-wise figures and totals. The variances are calculated for both the columns and the rows, and they are compared with the residual variance or error. The table of variance analysis in a two-way classification takes the following form:

Analysis of Variance Table (Two-way classification)

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Squares                    F-Ratio
Between Columns       SSC              c - 1                MSC = SSC / (c - 1)             F = MSC / MSE
Between Rows          SSR              r - 1                MSR = SSR / (r - 1)             F = MSR / MSE
Residual or Error     SSE              (c - 1)(r - 1)       MSE = SSE / [(c - 1)(r - 1)]
Total                 SST              n - 1

where
SSC = Sum of Squares between columns
SSR = Sum of Squares between rows
SSE = Sum of Squares due to error

SST = Total Sum of Squares
MSC = Mean Sum of Squares between columns
MSR = Mean Sum of Squares between rows
MSE = Mean Sum of Squares due to error
F = Ratio of MSC to MSE (for columns)
F = Ratio of MSR to MSE (for rows)

In a two-way classification, one should be careful in finding the degrees of freedom. For the column figures it is (c-1), for the rows it is (r-1), and for the residual it is (c-1)(r-1). While calculating the F-ratios for columns and for rows, the degrees of freedom for the numerator may not be the same: in one case it is (c-1) and in the other (r-1).
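The two-way procedure can be sketched as follows for a hypothetical r x c table of observations (one observation per row-column cell, all figures made up for illustration). Both row and column sums of squares are taken out before the residual is found by subtraction.

```python
# Two-way ANOVA (one observation per cell) on a hypothetical 4 x 3 table.
table = [
    [10, 12, 14],
    [9, 11, 10],
    [11, 15, 13],
    [8, 10, 11],
]
r = len(table)                         # number of rows
c = len(table[0])                      # number of columns
n = r * c                              # total number of observations

values = [x for row in table for x in row]
T = sum(values)
CF = T ** 2 / n                        # correction factor

SST = sum(x ** 2 for x in values) - CF

col_totals = [sum(row[j] for row in table) for j in range(c)]
SSC = sum(t ** 2 / r for t in col_totals) - CF      # between columns

row_totals = [sum(row) for row in table]
SSR = sum(t ** 2 / c for t in row_totals) - CF      # between rows

SSE = SST - SSC - SSR                  # residual or error, by subtraction

MSC = SSC / (c - 1)
MSR = SSR / (r - 1)
MSE = SSE / ((c - 1) * (r - 1))
F_columns = MSC / MSE                  # tested on (c-1, (c-1)(r-1)) d.f.
F_rows = MSR / MSE                     # tested on (r-1, (c-1)(r-1)) d.f.
print(F_columns, F_rows)
```

Note that, as the text warns, the two F-ratios share a denominator (MSE) but have different numerator degrees of freedom, so each must be compared against its own tabulated critical value.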
