Chi-squared test: values of χ²


A significant difference from your null hypothesis (i.e. a difference from your expectation) is indicated when your calculated X² value is greater than the χ² value shown in the 0.05 column of this table (i.e. there is only a 5% probability that your calculated X² value would occur by chance). You can be even more confident if your calculated value exceeds the χ² values in the 0.01 or 0.001 probability columns. If your calculated X² value is equal to, or less than, the tabulated χ² value for 0.95 then your results give you no reason to reject the null hypothesis. In a few special circumstances (though not generally) a calculated X² value lower than the χ² value in the 0.95 or 0.99 columns provides evidence that your results agree unusually well with a null hypothesis.

Degrees of                 Probability, p
freedom      0.99     0.95     0.05     0.01    0.001

 1          0.000    0.004     3.84     6.64    10.83
 2          0.020    0.103     5.99     9.21    13.82
 3          0.115    0.352     7.82    11.35    16.27
 4          0.297    0.711     9.49    13.28    18.47
 5          0.554    1.145    11.07    15.09    20.52
 6          0.872    1.635    12.59    16.81    22.46
 7          1.239    2.167    14.07    18.48    24.32
 8          1.646    2.733    15.51    20.09    26.13
 9          2.088    3.325    16.92    21.67    27.88
10          2.558    3.940    18.31    23.21    29.59
11          3.05     4.58     19.68    24.73    31.26
12          3.57     5.23     21.03    26.22    32.91
13          4.11     5.89     22.36    27.69    34.53
14          4.66     6.57     23.69    29.14    36.12
15          5.23     7.26     25.00    30.58    37.70
16          5.81     7.96     26.30    32.00    39.25
17          6.41     8.67     27.59    33.41    40.79
18          7.02     9.39     28.87    34.81    42.31
19          7.63    10.12     30.14    36.19    43.82
20          8.26    10.85     31.41    37.57    45.32
21          8.90    11.59     32.67    38.93    46.80
22          9.54    12.34     33.92    40.29    48.27
23         10.20    13.09     35.17    41.64    49.73
24         10.86    13.85     36.42    42.98    51.18
25         11.52    14.61     37.65    44.31    52.62
26         12.20    15.38     38.89    45.64    54.05
27         12.88    16.15     40.11    46.96    55.48
28         13.57    16.93     41.34    48.28    56.89
29         14.26    17.71     42.56    49.59    58.30
30         14.95    18.49     43.77    50.89    59.70
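If you want to check a table entry in software, the upper-tail probability of the χ² distribution has simple closed forms for one and two degrees of freedom, so a minimal sketch in plain Python (standard library only; for higher df you would normally use a statistics library such as scipy) is:

```python
import math

def chi2_pvalue(x2, df):
    """Upper-tail probability of the chi-squared distribution.

    Closed forms exist for df = 1 (via the complementary error function)
    and df = 2; higher df would need a statistics library.
    """
    if df == 1:
        return math.erfc(math.sqrt(x2 / 2))
    if df == 2:
        return math.exp(-x2 / 2)
    raise ValueError("only df = 1 or 2 are handled in this sketch")

# Checking table entries:
print(round(chi2_pvalue(3.84, 1), 3))   # ~0.05  (0.05 column, 1 df)
print(round(chi2_pvalue(10.83, 1), 4))  # ~0.001 (0.001 column, 1 df)
print(round(chi2_pvalue(5.99, 2), 3))   # ~0.05  (0.05 column, 2 df)
```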

Chi-squared test for categories of data


Background: The Student's t-test and Analysis of Variance are used to analyse measurement data which, in theory, are continuously variable. Between a measurement of, say, 1 m and 2 m there is a continuous range from 1.0001 to 1.9999 m. But in some types of experiment we wish to record how many individuals fall into a particular category, such as blue eyes or brown eyes, motile or non-motile cells, etc. These counts, or enumeration data, are discontinuous (1, 2, 3 etc.) and must be treated differently from continuous data. Often the appropriate test is chi-squared (χ²), which we use to test whether the numbers of individuals in different categories fit a null hypothesis (an expectation of some sort). Chi squared analysis is simple, and valuable for all sorts of things - not just Mendelian crosses! On this page we build from the simplest examples to more complex ones.

When you have gone through the examples you should consult the checklist of procedures and potential pitfalls.

A simple example

Suppose that the ratio of male to female students in the Science Faculty is exactly 1:1, but in the Pharmacology Honours class over the past ten years there have been 80 females and 40 males. Is this a significant departure from expectation? We proceed as follows (but note that we are going to overlook a very important point that we shall deal with later). Set out a table as shown below, with the "observed" numbers and the "expected" numbers (i.e. our null hypothesis). Then subtract each "expected" value from the corresponding "observed" value (O-E). Square the "O-E" values, and divide each by the relevant "expected" value to give (O-E)²/E. Add all the (O-E)²/E values and call the total "X²".
                       Female    Male     Total
Observed numbers (O)     80       40       120
Expected numbers (E)     60*3     60*3     120 *1
O-E                      20      -20         0 *2
(O-E)²                  400      400
(O-E)²/E                 6.67     6.67    13.34 = X²

Notes:
*1 This total must always be the same as the observed total.
*2 This total must always be zero.
*3 The null hypothesis was obvious here: we are told that there are equal numbers of males and females in the Science Faculty, so we might expect that there will be equal numbers of males and females in Pharmacology. So we divide our total number of Pharmacology students (120) in a 1:1 ratio to get our expected values.

Now we must compare our X² value with a χ² (chi squared) value in a table of χ² with n-1 degrees of freedom (where n is the number of categories, i.e. 2 in our case - males and females). We have only one degree of freedom (n-1). From the χ² table, we find a critical value of 3.84 for p = 0.05. If our calculated value of X² exceeds the critical value of χ² then we have a significant difference from the expectation. In fact, our calculated X² (13.34) exceeds even the tabulated χ² value (10.83) for p = 0.001. This shows an extreme departure from expectation. It is still possible that we could have got this result by chance - a probability of less than 1 in 1000. But we could be 99.9% confident that some factor leads to a "bias" towards females entering Pharmacology Honours. [Of course, the data don't tell us why this is so - it could be self-selection or any other reason.]

Now repeat this analysis, but knowing that 33.5% of all students in the Science Faculty are males.
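The calculation above is easy to reproduce in software. Here is a minimal sketch in plain Python (the helper name is our own, not from any library); it covers both the 1:1 expectation and the second analysis with a 66.5% : 33.5% expectation:

```python
def chi_squared(observed, expected):
    """Sum of (O-E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# First analysis: 80 females, 40 males against a 1:1 expectation
x2_equal = chi_squared([80, 40], [60, 60])
print(round(x2_equal, 2))  # 13.33 (the text's 13.34 comes from rounding each term)

# Second analysis: expectation of 66.5% female, 33.5% male
total = 120
x2_weighted = chi_squared([80, 40], [0.665 * total, 0.335 * total])
print(round(x2_weighted, 4))  # 0.0015
```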
                       Female    Male      Total
Observed numbers (O)     80       40        120
Expected numbers (E)     79.8*3   40.2      120*1
O-E                       0.2     -0.2        0*2
(O-E)²                    0.04     0.04
(O-E)²/E                  0.0005   0.001    0.0015 = X²

Note *1: We know that the expected total must be 120 (the same as the observed total), so we can calculate the expected numbers as 66.5% and 33.5% of this total.
Note *2: This total must always be zero.
Note *3: Although the observed values must be whole numbers, the expected values can be (and often need to be) decimals.

Now, from a χ² table we see that our data do not depart from expectation (the null hypothesis). They agree remarkably well with it, and might lead us to suspect that there was some design behind this! In most cases, though, we might get intermediate X² values, which neither agree strongly with expectation nor disagree with it. Then we conclude that there is no reason to reject the null hypothesis.

Some important points about chi-squared

Chi squared is a mathematical distribution with properties that enable us to equate our calculated X² values to χ² values. The details need not concern us, but we must take account of some limitations so that χ² can be used validly for statistical tests.

(i) Yates correction for two categories of data (one degree of freedom)

When there are only two categories (e.g. male/female) or, more correctly, when there is only one degree of freedom, the χ² test should not, strictly, be used. There have been various attempts to correct this deficiency, but the simplest is to apply Yates correction to our data. To do this, we simply subtract 0.5 from each calculated value of "O-E", ignoring the sign (plus or minus). In other words, an "O-E" value of +5 becomes +4.5, and an "O-E" value of -5 becomes -4.5. To signify that we are reducing the absolute value, ignoring the sign, we use vertical lines: |O-E|-0.5. Then we continue as usual but with these new (corrected) O-E values: we calculate (|O-E|-0.5)², then (|O-E|-0.5)²/E, and sum these values to get X². Yates correction only applies when we have two categories (one degree of freedom). We ignored this point in our first analysis of student numbers (above). So here is the table again, using Yates correction:
                       Female    Male      Total
Observed numbers (O)     80       40        120
Expected numbers (E)     60*3     60*3      120 *1
O-E                      20      -20          0 *2
|O-E|-0.5                19.5    -19.5        0
(|O-E|-0.5)²            380.25   380.25
(|O-E|-0.5)²/E            6.338    6.338   12.676 = X²

In this case, the observed numbers were so different from the expected 1:1 ratio that Yates correction made little difference - it only reduced the X² value from 13.34 to 12.67. But there would be other cases where Yates correction would make the difference between acceptance and rejection of the null hypothesis.

(ii) Limitations on numbers in "expected" categories

Again, to satisfy the mathematical assumptions underlying χ², the expected values should be relatively large. The following simple rules are applied:
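The corrected calculation can be sketched in plain Python in the same way as before (the helper name is ours, not from a library):

```python
def chi_squared_yates(observed, expected):
    """Chi-squared with Yates correction: sum of (|O-E| - 0.5)^2 / E.

    Only appropriate when there is one degree of freedom (two categories).
    """
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))

x2 = chi_squared_yates([80, 40], [60, 60])
print(round(x2, 3))  # 12.675
```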

- no expected category should be less than 1 (it does not matter what the observed values are), AND
- no more than one-fifth of expected categories should be less than 5.
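These two rules are easy to check mechanically. A minimal sketch (the function name and thresholds are simply the rules stated above, coded directly):

```python
def expected_counts_ok(expected):
    """Check the two rules for expected category sizes:
    no expected value below 1, and at most one-fifth of them below 5."""
    if any(e < 1 for e in expected):
        return False
    n_small = sum(1 for e in expected if e < 5)
    return n_small <= len(expected) / 5

print(expected_counts_ok([45, 15, 15, 5]))                  # True
print(expected_counts_ok([39.375, 13.125, 13.125, 4.375]))  # False
```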

What can we do if our data do not meet these criteria? We can either collect larger samples so that we satisfy the criteria, or we can combine the data for the smaller "expected" categories until their combined expected value is 5 or more, then do a χ² test on the combined data. We will see an example below.

Chi squared with three or more categories

Suppose that we want to test the results of a Mendelian genetic cross. We start with 2 parents of genotype AABB and aabb (where A and a represent the dominant and recessive alleles of one gene, and B and b represent the dominant and recessive alleles of another gene). We know that all the F1 generation (first-generation progeny of these parents) will have genotype AaBb and that their phenotype will display both dominant alleles (e.g. in fruit flies all the F1 generation will have red eyes rather than white eyes, and normal wings rather than stubby wings). This F1 generation will produce 4 types of gamete (AB, Ab, aB and ab), and when we self-cross the F1 generation we will end up with a variety of F2 genotypes (see the table below).
Gametes    AB     Ab     aB     ab
AB        AABB   AABb   AaBB   AaBb
Ab        AABb   AAbb   AaBb   Aabb
aB        AaBB   AaBb   aaBB   aaBb
ab        AaBb   Aabb   aaBb   aabb

All these genotypes fall into 4 phenotypes: double dominant, single dominant A, single dominant B and double recessive. We know that in classical Mendelian genetics the expected ratio of these phenotypes is 9:3:3:1. Suppose we got observed counts as follows:
Phenotype               AB      Ab      aB      ab     Total
Observed numbers (O)    40      20      16       4      80
Expected numbers (E)    45      15      15       5      80*1
O-E                     -5       5       1      -1       0
(O-E)²                  25      25       1       1
(O-E)²/E                 0.56    1.67    0.07    0.20   2.50 = X²

[Note: *1. From our expected total of 80 we can calculate our expected values for the categories in the ratio 9:3:3:1.]

From a χ² table with 3 df (we have four categories, so 3 df) at p = 0.05, we find that a χ² value of 7.82 is necessary to reject the null hypothesis (expectation of ratio 9:3:3:1). So our data are consistent with the expected ratio.

Combining categories

Look at the table above. We only just collected enough data to be able to test a 9:3:3:1 expected ratio. If we had only counted 70 (or 79) fruit flies then our lowest expected category would have been less than 5, and we could not have done the test as shown. We would break one of the "rules" for χ² - that no more than one-fifth of expected categories should be less than 5. We could still do the analysis, but only after combining the smaller categories and testing against a different expectation. Here is an illustration of this, assuming that we had used 70 fruit flies and obtained the following observed numbers of phenotypes.
Phenotype               AB       Ab       aB       ab      Combined   Total
                                                           aB + ab
Observed numbers (O)    34       18       15        3        18        70
Expected numbers (E)    39.375   13.125   13.125    4.375    17.5      70*1
O-E                     -5.375    4.875                       0.5       0
(O-E)²                  28.891   23.766                       0.25
(O-E)²/E                 0.734    1.811                       0.014    2.559 = X²
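Both genetic analyses (80 flies against 9:3:3:1, and 70 flies against 9:3:4 after combining the aB and ab categories, as discussed below the table) can be sketched with one small Python helper; the function name is ours, and it simply divides the observed total in the expected ratio:

```python
def chi_squared_ratio(observed, ratio):
    """Chi-squared of observed counts against an expected ratio,
    e.g. ratio=(9, 3, 3, 1) for a Mendelian dihybrid cross."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# 80 flies against 9:3:3:1 (critical value 7.82 at p = 0.05, 3 df)
print(round(chi_squared_ratio([40, 20, 16, 4], (9, 3, 3, 1)), 2))  # 2.49

# 70 flies, aB and ab combined, against 9:3:4 (2 df)
print(round(chi_squared_ratio([34, 18, 18], (9, 3, 4)), 3))  # 2.559
```

(The exact value 2.49 differs slightly from the 2.50 in the first table, which comes from rounding each term before summing.)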

One of our expected categories (ab) is less than 5. So we have combined this category with one of the others, and we must then analyse the results against an expected ratio of 9:3:4. The numbers in the expected categories were entered by dividing the total (70) in this ratio. Now, with 3 categories, we have only 2 degrees of freedom. The rest of the analysis is done as usual, and we still have no reason to reject the null hypothesis. But it is a different null hypothesis: the expected ratio is 9:3:4 (double dominant : single dominant Ab : single dominant aB plus double recessive ab).

Chi-squared: double classifications

Suppose that we have a population of fungal spores which clearly fall into two size categories, large and small. We incubate these spores on agar and count the number of spores that germinate by producing a single outgrowth or multiple outgrowths. Spores counted:

- 120 large spores, of which 80 form multiple outgrowths and 40 produce single outgrowths
- 60 small spores, of which 18 form multiple outgrowths and 42 produce single outgrowths

Is there a significant difference in the way that large and small spores germinate?

Procedure:

1. Set out a table as follows.
                      Large spores   Small spores   Total
Multiple outgrowth         80             18          98
Single outgrowth           40             42          82
Total                     120             60         180

2. Decide on the null hypothesis.

In this case there is no "theory" that gives us an obvious null hypothesis. For example, we have no reason to suppose that 55% or 75% or any other particular percentage of large spores will produce multiple outgrowths. So the most sensible null hypothesis is that large and small spores behave similarly - that the proportion of multiple versus single outgrowths is the same for both types of spore. Then, if our data do not agree with this expectation, we will have evidence that spore size affects the type of germination.

3. Calculate the expected frequencies, based on the null hypothesis.

This step is complicated by the fact that we have different numbers of large and small spores, and different numbers of multiple versus single outgrowths. But we can find the expected frequencies (a, b, c and d) by using the grand total (180) and the column and row totals (see table below).
                                   Large spores   Small spores   Row totals
Multiple outgrowth   Observed (O)       80             18         98
                     Expected (E)        a              b         (expected 98)
Single outgrowth     Observed (O)       40             42         82
                     Expected (E)        c              d         (expected 82)
Column totals                          120             60        180

To find the expected value "a" we know that a total of 98 spores had multiple outgrowths, and that 120 of the total 180 spores were large. So a is 98 × (120/180) = 65.33. Similarly, to find b we know that 98 spores had multiple outgrowths and that 60 of the total 180 spores were small. So b is 98 × (60/180) = 32.67. [Actually, we could have found b simply by subtracting a from the expected row total of 98 - the expected total must always be the same as the observed total.] To find c we know that 82 spores had single outgrowths and that 120 of the total 180 spores were large. So c is 82 × (120/180) = 54.67. To find d we know that 82 spores had single outgrowths and that 60 of the total 180 spores were small. So d is 82 × (60/180) = 27.33. [This value also could have been obtained by subtraction.]
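The "row total × column total / grand total" rule is easy to code. A minimal sketch in plain Python for the spore data:

```python
# Expected frequencies for a 2x2 table, computed from the marginal totals
# as row_total * column_total / grand_total for each cell.
observed = [[80, 18],   # multiple outgrowths: large, small
            [40, 42]]   # single outgrowths:   large, small

row_totals = [sum(row) for row in observed]        # [98, 82]
col_totals = [sum(col) for col in zip(*observed)]  # [120, 60]
grand = sum(row_totals)                            # 180

expected = [[r * c / grand for c in col_totals] for r in row_totals]
print([[round(e, 2) for e in row] for row in expected])
# [[65.33, 32.67], [54.67, 27.33]]
```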

4. Decide the number of degrees of freedom.

You might think that there are 3 degrees of freedom (because there are 4 categories). But there is actually only one degree of freedom! We lose one degree of freedom because we have 4 categories, and we lose a further 2 degrees of freedom because we used two pieces of information to construct our null hypothesis - a column total and a row total. Once we had used these, we would have needed only one data entry in order to fill in the rest of the values (therefore we have one degree of freedom). Of course, with one degree of freedom we must use Yates correction (subtract 0.5 from each O-E value, ignoring the sign).

5. Run the analysis as usual.

Calculate O-E and |O-E|-0.5 for each category, then sum the (|O-E|-0.5)²/E values to obtain X² and test this against χ². The following table shows some of the working; summing the four (|O-E|-0.5)²/E values gives an X² of 20.23.
                                   Large spores   Small spores   Row totals
Multiple outgrowth   Observed (O)       80             18          98
                     Expected (E)      65.33          32.67        98
                     O-E              +14.67         -14.67
                     |O-E|-0.5         14.17          14.17
                     (|O-E|-0.5)²/E     3.07           6.14
Single outgrowth     Observed (O)       40             42          82
                     Expected (E)      54.67          27.33        82
                     O-E              -14.67         +14.67
                     |O-E|-0.5         14.17          14.17
                     (|O-E|-0.5)²/E     3.67           7.35
Column totals                          120             60         180

X² = 20.23
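The whole double-classification analysis can be assembled from the steps above into one short sketch (the function name is ours; statistics libraries such as scipy provide an equivalent ready-made):

```python
def chi2_2x2_yates(table):
    """X^2 for a 2x2 contingency table, with Yates correction.

    Expected values come from row_total * column_total / grand_total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand
            x2 += (abs(o - e) - 0.5) ** 2 / e
    return x2

x2 = chi2_2x2_yates([[80, 18], [40, 42]])
print(round(x2, 2))  # 20.23
```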

We compare the X² value with a tabulated χ² with one degree of freedom. Our calculated X² exceeds the tabulated χ² value (10.83) for p = 0.001. We conclude that there is a highly significant departure from the null hypothesis - we have very strong evidence that large spores and small spores show different germination behaviour.

Checklist: procedures and potential pitfalls
Chi squared is a very simple test to use. The only potentially difficult things about it are:

- calculating the expected frequencies when we have double classifications - use the marginal subtotals and totals to work out these frequencies
- determining the number of degrees of freedom, especially when we have to use some of the data to construct the null hypothesis.

If you follow the examples given on this page you should not have too many difficulties.

Some points to watch:

- Always work with "real numbers" in the observed categories, not with proportions. To illustrate this, consider a simple chi squared test on tossing of coins. Suppose that in 100 throws you get 70 "heads" and 30 "tails". Using Yates correction (for one degree of freedom) you would find an X² value of 15.21, equating to a χ² probability of less than 0.001. But if you got 7 "heads" and 3 "tails" in a test of 10 throws, the result would be entirely consistent with random chance. The ratio is the same (7:3), but the actual numbers determine the level of significance in a chi squared test.
- Observed categories must contain whole numbers, but expected categories can contain decimals.
- Follow the rules about the minimum numbers in expected categories. These rules do not apply to the observed categories.
- Remember Yates correction for one degree of freedom.
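The coin-tossing point is easy to demonstrate: the ratio is identical but the sample size changes X² dramatically. A minimal sketch in plain Python (Yates correction applied, since there is one degree of freedom):

```python
def chi_squared_yates(observed, expected):
    """Chi-squared with Yates correction: sum of (|O-E| - 0.5)^2 / E."""
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))

# 100 throws: 70 heads, 30 tails against a 1:1 expectation
print(round(chi_squared_yates([70, 30], [50, 50]), 2))  # 15.21 - significant

# 10 throws: same 7:3 ratio
print(round(chi_squared_yates([7, 3], [5, 5]), 2))      # 0.9 - not significant
```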
