
A common statistical technique for determining if differences exist between three or more "groups" is Oneway Analysis of Variance (ANOVA) and the associated F test. The F test and subsequent ANOVA methodology involve the determination of differences for:

1. one group with multiple (typically, three or more) variations, as well as
2. one variable, compared to multiple groups.

When using Oneway ANOVA for three or more groups, an immediate concern is how to interpret findings if the null hypothesis is not accepted (consult an appropriate statistics text to review why there are those who consider it more appropriate to declare "The null hypothesis was not accepted" instead of "The null hypothesis was rejected").

When only two groups are compared and the null hypothesis is not accepted, you know that the difference between Group #1 and Group #2 is a true difference (at the declared level of significance, or p level). What happens, however, if you reject the null hypothesis for a Oneway ANOVA design involving three groups?

-- Is the difference between Group A and Group B the reason for failure to accept the null hypothesis?
-- Is the difference between Group A and Group C the reason for failure to accept the null hypothesis?
-- Is the difference between Group B and Group C the reason for failure to accept the null hypothesis?

There are many techniques for making multiple comparisons between the means of each group. The following mean comparison tests are found in SPSS for the purpose of comparing differences between means in a Oneway ANOVA design:

1. LSD ............ Least-significant difference
2. DUNCAN ......... Duncan's multiple range test
3. SNK ............ Student-Newman-Keuls
4. TUKEYB ......... Tukey's alternate procedure
5. TUKEY .......... Honestly significant difference
6. LSDMOD ......... Modified LSD
7. SCHEFFE ........ Scheffe's test
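The three questions raised for Groups A, B, and C are simply the pairwise comparisons among three groups. As a minimal sketch (Python is used only for illustration and is not part of the SPSS or MINITAB sessions in this handout; the group labels are placeholders), the pairs that any multiple comparison test has to examine can be listed like this:

```python
from itertools import combinations

def pairwise_comparisons(groups):
    """Return every unordered pair of groups that a multiple
    comparison test would need to examine."""
    return list(combinations(groups, 2))

# Three groups yield 3 pairs; in general k groups yield k * (k - 1) / 2 pairs.
pairs = pairwise_comparisons(["A", "B", "C"])
for first, second in pairs:
    print(f"Group {first} vs. Group {second}")
```

With the four instructional methods used in the scenario that follows, the same function returns six pairs, which is why a single F test cannot say which pair is responsible for a significant result.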

Be sure to remember that Oneway ANOVA methodology, as opposed to Student's t-test, can serve as a useful tool in the development of processes for understanding "real-world" problems. Most "real-world" problems are related to complex issues. Statistical tests that can account for this complexity are needed if meaningful decisions are to be made.

Scenario: This study examines if there are differences in final examination test scores between four groups of students in a software engineering course:

-- The first group of students was taught by traditional lecture.
-- The second group of students was taught by Computer Based Training.
-- The third group of students was taught by the use of instructional videotapes.
-- The last group of students was enrolled through independent study.

Students were all from a university senior-level software engineering course and were assigned, through random selection, to one of four groups: instruction by traditional lecture, instruction by CBT (Computer Based Training), instruction by the use of instructional videotapes, and independent study.

Because the teacher was confident that final examination scores represented interval data (i.e., the data are parametric, with the difference between "89" and "90" equal to the difference between "75" and "76"), Oneway ANOVA (Analysis of Variance) was correctly judged to be the appropriate test for this analysis of summative differences in final examination scores between three or more groups. Final examination test scores are summarized in Table 1.

Table 1

Final Examination Test Scores in a Senior-Level Software Engineering Course by Instructional Method: Traditional Lecture, Computer Based Training, Instructional Videotape, and Independent Study

====================================================
                   Instructional
                   Method
                   =============
                   1 = Lecture
                   2 = CBT
                   3 = Video
Student            4 = Independent
Number                 Study (IDS)     Final Score
----------------------------------------------------
01                 1                   089
02                 1                   081
03                 1                   073
04                 1                   084
05                 1                   070
06                 1                   056
07                 1                   070
08                 1                   081
09                 1                   078
10                 1                   069
11                 1                   089
12                 1                   088
13                 1                   045
14                 1                   083
15                 1                   095
16                 1                   077
17                 1                   069
18                 1                   080
19                 2                   093
20                 2                   086
21                 2                   089
22                 2                   095
23                 2                   089
24                 2                   088
25                 2                   098
26                 2                   089
27                 2                   094
28                 2                   095
29                 2                   095
30                 2                   098
31                 2                   087
32                 2                   085
33                 2                   098
34                 2                   093
35                 2                   087
36                 2                   095
37                 2                   093
38                 2                   093
39                 3                   095
40                 3                   096
41                 3                   083
42                 3                   089
43                 3                   088
44                 3                   087
45                 3                   094
46                 3                   097
47                 3                   095
48                 3                   093
49                 3                   085
50                 3                   095
51                 3                   092
52                 3                   082
53                 3                   086
54                 3                   087
55                 3                   089
56                 3                   097
57                 3                   100
58                 3                   093
59                 3                   096
60                 4                   084
61                 4                   085
62                 4                   073
63                 4                   092
64                 4                   057
65                 4                   063
66                 4                   069
67                 4                   073
68                 4                   091
69                 4                   065
70                 4                   074
71                 4                   071
72                 4                   068
73                 4                   062
74                 4                   056
75                 4                   085
----------------------------------------------------

Note. Notice how the N (i.e., number of subjects or group members) for each instructional group does not have to be equal.

Ho:       Null Hypothesis: There is no difference in the final examination test scores of students enrolled in a university senior-level software engineering course after students were assigned, through random selection, to placement into one of four groups: instruction by traditional lecture, instruction by CBT (Computer Based Training), instruction by the use of instructional videotapes, and independent study (p <= .01).

          Note. The p (i.e., probability or alpha level) value is declared as p <= .01 instead of the more liberal p <= .05.

Files:    1. one_anov.doc
          2. one_anov.dat
          3. one_anov.r01
          4. one_anov.o01
          5. one_anov.con
          6. one_anov.lis

Command:  At the Unix prompt (%), key:

          %spss -m < one_anov.r01 > one_anov.o01

************
one_anov.dat
************

                   01             1              089
                   02             1              081
                   03             1              073
                   04             1              084
                   05             1              070
                   06             1              056
                   07             1              070
                   08             1              081
                   09             1              078
                   10             1              069
                   11             1              089
                   12             1              088
                   13             1              045
                   14             1              083
                   15             1              095
                   16             1              077
                   17             1              069
                   18             1              080
                   19             2              093
                   20             2              086
                   21             2              089
                   22             2              095
                   23             2              089
                   24             2              088
                   25             2              098
                   26             2              089
                   27             2              094
                   28             2              095
                   29             2              095
                   30             2              098
                   31             2              087
                   32             2              085
                   33             2              098
                   34             2              093
                   35             2              087
                   36             2              095
                   37             2              093
                   38             2              093
                   39             3              095
                   40             3              096
                   41             3              083
                   42             3              089
                   43             3              088
                   44             3              087
                   45             3              094
                   46             3              097
                   47             3              095
                   48             3              093
                   49             3              085
                   50             3              095
                   51             3              092
                   52             3              082
                   53             3              086
                   54             3              087
                   55             3              089
                   56             3              097
                   57             3              100
                   58             3              093
                   59             3              096
                   60             4              084
                   61             4              085
                   62             4              073
                   63             4              092
                   64             4              057
                   65             4              063
                   66             4              069
                   67             4              073
                   68             4              091
                   69             4              065
                   70             4              074
                   71             4              071
                   72             4              068
                   73             4              062
                   74             4              056
                   75             4              085

************
one_anov.r01
************

SET WIDTH   = 80
SET LENGTH  = NONE
SET CASE    = UPLOW
SET HEADER  = NO
TITLE       = Oneway Analysis of Variance (ONEWAY ANOVA)
COMMENT     = This file examines if there are differences in final
              examination test scores between four groups of students
              in a software engineering course: the first group of
              students was taught by traditional lecture, the second
              group of students was taught by Computer Based Training,
              the third group of students was taught by the use of
              instructional videotapes, and the last group of students
              were enrolled through independent study.

              Students were all from a university senior-level software
              engineering course who were assigned, through random
              selection, to placement into one of four groups:
              instruction by traditional lecture, instruction by CBT
              (Computer Based Training), instruction by the use of
              instructional videotapes, and independent study.

              Because the teacher was confident that final examination
              scores represented interval data (i.e., the data are
              parametric, with the difference between "89" and "90"
              equal to the difference between "75" and "76"), Oneway
              ANOVA (Analysis of Variance) was correctly judged to be
              the appropriate test for this analysis of summative
              differences in final examination scores between three
              or more groups.

DATA LIST FILE = 'one_anov.dat' FIXED
   / Stu_Code       20-21
     Method         35
     Score          50-52

Variable Labels
     Stu_Code       "Student Code"
   / Method         "Method: Lecture, CBT, Video, IDS"
   / Score          "Final Examination Score"

Value Labels
     Method         1 'Lecture: Traditional Lecture'
                    2 'CBT: Computer-Based Training'
                    3 'Video: Instructional Videotape'
                    4 'IDS: Independent Study'

ONEWAY              Score BY Method(1,4)
   /                STATISTICS = ALL
   /                RANGES     = SCHEFFE (.01)
   /                FORMAT     = LABELS
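The DATA LIST command reads each record by fixed column position: Stu_Code from columns 20-21, Method from column 35, and Score from columns 50-52. The following sketch shows the same slicing in Python, for readers who want to check the data file outside SPSS. The helper make_record is hypothetical and exists only to build demonstration records in that layout; it is not part of the handout's files.

```python
def parse_record(line):
    """Split one fixed-column record using the DATA LIST layout.

    SPSS columns are 1-based and inclusive, so columns 20-21 become
    the Python slice [19:21], column 35 becomes [34:35], and so on.
    """
    stu_code = int(line[19:21])
    method = int(line[34:35])
    score = int(line[49:52])
    return stu_code, method, score

def make_record(stu_code, method, score):
    """Hypothetical helper: build a record in the same 52-column layout."""
    return (" " * 19 + f"{stu_code:02d}"
            + " " * 13 + str(method)
            + " " * 14 + f"{score:03d}")

# Three records taken from Table 1 (students 01, 19, and 57).
sample = [make_record(1, 1, 89), make_record(19, 2, 93), make_record(57, 3, 100)]
parsed = [parse_record(line) for line in sample]
for row in parsed:
    print(row)
```

Reading the real file would amount to calling parse_record on each line of one_anov.dat, assuming the file keeps the column positions named in the DATA LIST command.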

************
one_anov.o01
************

SET WIDTH   = 80
SET LENGTH  = NONE
SET CASE    = UPLOW
SET HEADER  = NO
TITLE       = Oneway Analysis of Variance (ONEWAY ANOVA)
COMMENT     = This file examines if there are differences in final
              examination test scores between four groups of students
              in a software engineering course: the first group of
              students was taught by traditional lecture, the second
              group of students was taught by Computer Based Training,
              the third group of students was taught by the use of
              instructional videotapes, and the last group of students
              were enrolled through independent study.

              Students were all from a university senior-level software
              engineering course who were assigned, through random
              selection, to placement into one of four groups:
              instruction by traditional lecture, instruction by CBT
              (Computer Based Training), instruction by the use of
              instructional videotapes, and independent study.

              Because the teacher was confident that final examination
              scores represented interval data (i.e., the data are
              parametric, with the difference between "89" and "90"
              equal to the difference between "75" and "76"), Oneway
              ANOVA (Analysis of Variance) was correctly judged to be
              the appropriate test for this analysis of summative
              differences in final examination scores between three
              or more groups.

DATA LIST FILE = 'one_anov.dat' FIXED
   / Stu_Code       20-21
     Method         35
     Score          50-52

This command will read 1 records from one_anov.dat

Variable   Rec   Start     End         Format

STU_CODE     1      20      21         F2.0
METHOD       1      35      35         F1.0
SCORE        1      50      52         F3.0

Variable Labels
     Stu_Code       "Student Code"
   / Method         "Method: Lecture, CBT, Video, IDS"
   / Score          "Final Examination Score"

Value Labels
     Method         1 'Lecture: Traditional Lecture'
                    2 'CBT: Computer-Based Training'
                    3 'Video: Instructional Videotape'
                    4 'IDS: Independent Study'

ONEWAY              Score BY Method(1,4)
   /                STATISTICS = ALL
   /                RANGES     = SCHEFFE (.01)
   /                FORMAT     = LABELS

ONEWAY problem requires 504 bytes of memory.

- - - - -  O N E W A Y  - - - - -

      Variable  SCORE     Final Examination Score
   By Variable  METHOD    Method: Lecture, CBT, Video, IDS

                          Analysis of Variance

                            Sum of         Mean          F        F
Source            D.F.     Squares       Squares       Ratio    Prob.

Between Groups      3     5372.3343     1790.7781    23.6178    .0000
Within Groups      71     5383.4524       75.8233
Total              74    10755.7867
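The entries in the Analysis of Variance table can be recomputed directly from the Table 1 scores. The sketch below is a plain-Python check, not the SPSS algorithm itself: it uses the textbook definitions of the between-groups and within-groups sums of squares, and the dictionary keys are short labels chosen here for readability.

```python
from statistics import mean

# Final examination scores from Table 1, keyed by instructional method.
scores = {
    "Lecture": [89, 81, 73, 84, 70, 56, 70, 81, 78, 69, 89, 88, 45, 83, 95, 77, 69, 80],
    "CBT": [93, 86, 89, 95, 89, 88, 98, 89, 94, 95, 95, 98, 87, 85, 98, 93, 87, 95, 93, 93],
    "Video": [95, 96, 83, 89, 88, 87, 94, 97, 95, 93, 85, 95, 92, 82, 86, 87, 89, 97, 100, 93, 96],
    "IDS": [84, 85, 73, 92, 57, 63, 69, 73, 91, 65, 74, 71, 68, 62, 56, 85],
}

def oneway_anova(groups):
    """Return (ss_between, ss_within, df_between, df_within, f_ratio)."""
    all_scores = [x for g in groups.values() for x in g]
    grand_mean = mean(all_scores)
    # Between groups: squared distance of each group mean from the grand mean,
    # weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Within groups: squared distance of each score from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio

ss_b, ss_w, df_b, df_w, f_ratio = oneway_anova(scores)
print(f"Between Groups  {df_b:3d}  {ss_b:11.4f}  {ss_b / df_b:10.4f}  F = {f_ratio:.4f}")
print(f"Within Groups   {df_w:3d}  {ss_w:11.4f}  {ss_w / df_w:10.4f}")
print(f"Total           {df_b + df_w:3d}  {ss_b + ss_w:11.4f}")
```

The F Prob. column is not reproduced here because it requires the F distribution, which the Python standard library does not provide.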

                                   Standard   Standard
Group       Count        Mean     Deviation      Error    95 Pct Conf Int for Mean

Lecture:       18     76.5000       12.2774     2.8938     70.3946  TO   82.6054
CBT: Com       20     92.0000        4.1675      .9319     90.0495  TO   93.9505
Video: I       21     91.3810        5.1037     1.1137     89.0578  TO   93.7041
IDS: Ind       16     73.0000       11.4601     2.8650     66.8934  TO   79.1066

Total          75     84.0533       12.0561     1.3921     81.2795  TO   86.8272

            Fixed Effects Model      8.7077     1.0055     82.0485  TO   86.0582
            Random Effects Model                4.9191     68.3987  TO   99.7080

Random Effects Model - estimate of between component variance   91.79

GROUP         MINIMUM      MAXIMUM

Lecture:      45.0000      95.0000
CBT: Com      85.0000      98.0000
Video: I      82.0000     100.0000
IDS: Ind      56.0000      92.0000

TOTAL         45.0000     100.0000
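Three of the model-level figures above can be traced back to the mean squares in the Analysis of Variance table. The sketch below is an assumption about how they arise rather than a statement of SPSS internals: it uses the usual method-of-moments estimator for the between component variance with unequal group sizes, and takes the Fixed Effects Model standard deviation to be the square root of the within-groups mean square. Under those assumptions the printed values are reproduced to rounding.

```python
from math import sqrt

def between_component_variance(ms_between, ms_within, group_sizes):
    """Method-of-moments estimate of the between-groups variance component.

    n0 is the 'average' group size adjusted for unequal N.
    """
    k = len(group_sizes)
    n_total = sum(group_sizes)
    n0 = (n_total - sum(n * n for n in group_sizes) / n_total) / (k - 1)
    return (ms_between - ms_within) / n0

# Mean squares and group counts copied from the SPSS output.
MS_BETWEEN = 1790.7781
MS_WITHIN = 75.8233
GROUP_SIZES = [18, 20, 21, 16]

estimate = between_component_variance(MS_BETWEEN, MS_WITHIN, GROUP_SIZES)
pooled_sd = sqrt(MS_WITHIN)                    # Fixed Effects Model standard deviation
fixed_se = pooled_sd / sqrt(sum(GROUP_SIZES))  # Fixed Effects Model standard error

print(f"between component variance: {estimate:.2f}")
print(f"pooled standard deviation:  {pooled_sd:.4f}")
print(f"fixed effects std. error:   {fixed_se:.4f}")
```

The same pooled standard deviation appears later in the MINITAB output as POOLED STDEV = 8.708.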

Levene Test for Homogeneity of Variances

   Statistic     df1     df2     2-tail Sig.
      6.5470       3      71        .001

- - - - -  O N E W A Y  - - - - -

      Variable  SCORE     Final Examination Score
   By Variable  METHOD    Method: Lecture, CBT, Video, IDS

Multiple Range Tests:  Scheffe test with significance level .01

The difference between two means is significant if
   MEAN(J)-MEAN(I) >= 6.1572 * RANGE * SQRT(1/N(I) + 1/N(J))
   with the following value(s) for RANGE: 4.94

(*) Indicates significant differences which are shown in the lower triangle

                           I L V C
                           D e i B
                           S c d T
                           : t e :
                             u o
                           I r : C
                           n e   o
                           d : I m
    Mean      METHOD

   73.0000    IDS: Ind
   76.5000    Lecture:
   91.3810    Video: I     * *
   92.0000    CBT: Com     * *
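The rule printed with the Scheffe test can be applied by hand to each pair of groups. The sketch below does so in Python using only numbers copied from the output (group means, group counts, the multiplier 6.1572, and the RANGE value 4.94). As a side observation, 6.1572 agrees with the square root of half the within-groups mean square (75.8233 / 2); how SPSS arrives at RANGE is not shown in the output, so that value is taken as given.

```python
from itertools import combinations
from math import sqrt

# (mean, N) for each group, in the ascending-mean order used by SPSS.
groups = {
    "IDS": (73.0000, 16),
    "Lecture": (76.5000, 18),
    "Video": (91.3810, 21),
    "CBT": (92.0000, 20),
}
MULTIPLIER = 6.1572   # printed by SPSS in the significance rule
RANGE = 4.94          # Scheffe range value at the .01 level, from the output

def scheffe_pairs(groups, multiplier=MULTIPLIER, range_value=RANGE):
    """Apply MEAN(J)-MEAN(I) >= multiplier * RANGE * SQRT(1/N(I) + 1/N(J))
    to every pair and return (group_i, group_j, difference, critical, significant)."""
    results = []
    for (name_i, (mean_i, n_i)), (name_j, (mean_j, n_j)) in combinations(groups.items(), 2):
        difference = abs(mean_j - mean_i)
        critical = multiplier * range_value * sqrt(1 / n_i + 1 / n_j)
        results.append((name_i, name_j, difference, critical, difference >= critical))
    return results

results = scheffe_pairs(groups)
for name_i, name_j, difference, critical, significant in results:
    flag = "*" if significant else " "
    print(f"{name_i:8s} vs {name_j:8s}  diff = {difference:7.4f}  critical = {critical:7.4f}  {flag}")
```

Because the group sizes differ, each pair has its own critical difference; the four pairs flagged by the rule are the same four marked with asterisks in the lower triangle.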

************
one_anov.con
************

Outcome:  Computed F  = 23.6178

          Criterion F = 4.13 (alpha = .01, df = 3,60)

          Note. Although df = 3,71, the table for the F distribution steps from df = 3,40 (F = 4.31), to df = 3,60 (F = 4.13), to df = 3,120 (F = 3.95), so there is no entry for df = 3,71. Occasionally, it is necessary to interpolate between tabled values when determining the Criterion F statistic. Using the entry for df = 3,60 is the conservative choice, because its criterion is slightly larger than the criterion for df = 3,71.

          Computed F (23.62) > Criterion F (4.13)

          Therefore, the null hypothesis is rejected. That is to say, there are differences in final examination test scores in a senior-level software engineering course, based on instructional method (p <= .01).

          The p value is another way to view differences in the final examination test scores:

          -- The calculated p value is reported as .0000 (that is, less than .00005).
          -- The declared p value is .01.

          The calculated p value is less than the declared p value and there is, accordingly, a difference in test scores.

Conclusion:  Although you now know that differences exist, the F statistic does not tell you where the difference(s) exist between instructional methods.
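The Note on the Criterion F mentions working between tabled values. A minimal sketch of one way to do that follows. It interpolates linearly in the reciprocal of the denominator degrees of freedom, which is a common convention for F tables but is an assumption here, since the handout does not say which method was intended. The two tabled entries are the ones quoted in the Note.

```python
def interpolate_critical_f(df, lower, upper):
    """Interpolate a criterion F for `df` denominator degrees of freedom.

    `lower` and `upper` are (df, F) pairs from a printed table that
    bracket the wanted df. Interpolation is linear in 1/df.
    """
    (df_lo, f_lo), (df_hi, f_hi) = lower, upper
    weight = (1 / df_lo - 1 / df) / (1 / df_lo - 1 / df_hi)
    return f_lo + weight * (f_hi - f_lo)

# Tabled values at alpha = .01 with 3 numerator degrees of freedom.
criterion_f = interpolate_critical_f(71, (60, 4.13), (120, 3.95))
computed_f = 23.6178

reject_null = computed_f > criterion_f
print(f"Interpolated Criterion F (df = 3,71): {criterion_f:.2f}")
print(f"Reject the null hypothesis: {reject_null}")
```

Whichever criterion is used (the tabled 4.13 or the interpolated value), the computed F is far larger, so the decision about the null hypothesis does not change.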

To find where the differences lie, review the following section of the SPSS output file:

(*) Indicates significant differences which are shown in the lower triangle

                           I L V C
                           D e i B
                           S c d T
                           : t e :
                             u o
                           I r : C
                           n e   o
                           d : I m
    Mean      METHOD

   73.0000    IDS: Ind
   76.5000    Lecture:
   91.3810    Video: I     * *
   92.0000    CBT: Com     * *

Using traditional methodology, you could also visually present on your own the mean comparisons among groups by using underscores, as presented below:

      IDS     Lecture          Video      CBT

     73.00     76.50           91.38     92.00
     _______________           _______________

Although it is not possible at this point to suggest "why" differences exist, there is sufficient evidence from this one-time study to conclude the following:

-- There is no significant difference in final examination test scores between students who received instruction through IDS and Lecture.
-- There is no significant difference in final examination test scores between students who received instruction through Video and CBT.
-- There is a difference in final examination test scores between students who received instruction through CBT and either IDS or Lecture, with CBT students receiving a higher score.
-- There is a difference in final examination test scores between students who received instruction through Video and either IDS or Lecture, with Video students receiving a higher score.

You will notice that these complex outcomes have a graphic representation in MINITAB that is fairly easy to understand, as opposed to the more complex graphical representation in SPSS.

************
one_anov.lis
************

% minitab
MTB > outfile 'one_anov.lis'
Collecting Minitab session in file: one_anov.lis
MTB > # MINITAB Addendum to 'one_anov.dat'
MTB > #
MTB > read 'one_anov.dat' c1 c2 c3
Entering data from file: one_anov.dat
     75 rows read.
MTB > print c1 c2 c3

 ROW   C1   C2   C3

   1    1    1   89
   2    2    1   81
   3    3    1   73
   4    4    1   84
   5    5    1   70
   6    6    1   56
   7    7    1   70
   8    8    1   81
   9    9    1   78
  10   10    1   69
  11   11    1   89
  12   12    1   88
  13   13    1   45
  14   14    1   83
  15   15    1   95
  16   16    1   77
  17   17    1   69
  18   18    1   80
Continue? y
  19   19    2   93
  20   20    2   86
  21   21    2   89
  22   22    2   95
  23   23    2   89
  24   24    2   88
  25   25    2   98
  26   26    2   89
  27   27    2   94
  28   28    2   95
  29   29    2   95
  30   30    2   98
  31   31    2   87
  32   32    2   85
  33   33    2   98
  34   34    2   93
  35   35    2   87
  36   36    2   95
  37   37    2   93
  38   38    2   93
  39   39    3   95
  40   40    3   96
  41   41    3   83
Continue? y
  42   42    3   89
  43   43    3   88
  44   44    3   87
  45   45    3   94
  46   46    3   97
  47   47    3   95
  48   48    3   93
  49   49    3   85
  50   50    3   95
  51   51    3   92
  52   52    3   82
  53   53    3   86
  54   54    3   87
  55   55    3   89
  56   56    3   97
  57   57    3  100
  58   58    3   93
  59   59    3   96
  60   60    4   84
  61   61    4   85
  62   62    4   73
  63   63    4   92
  64   64    4   57
Continue? y
  65   65    4   63
  66   66    4   69
  67   67    4   73
  68   68    4   91
  69   69    4   65
  70   70    4   74
  71   71    4   71
  72   72    4   68
  73   73    4   62
  74   74    4   56
  75   75    4   85

MTB > # I'll unstack the data in c3 and c2 and then use
MTB > # the two commands to effect the Oneway ANOVA calculations.
MTB > #
MTB > # If at all possible, stack and or unstack data but
MTB > # never re-key data.
MTB > #
MTB > unstack (c2-c3) (c5-c6) (c7-c8) (c9-c10) (c11-c12);
SUBC> subscripts c2.
MTB > print c1-c12

 ROW   C1   C2   C3   C5   C6   C7   C8   C9  C10  C11  C12

   1    1    1   89    1   89    2   93    3   95    4   84
   2    2    1   81    1   81    2   86    3   96    4   85
   3    3    1   73    1   73    2   89    3   83    4   73
   4    4    1   84    1   84    2   95    3   89    4   92
   5    5    1   70    1   70    2   89    3   88    4   57
   6    6    1   56    1   56    2   88    3   87    4   63
   7    7    1   70    1   70    2   98    3   94    4   69
   8    8    1   81    1   81    2   89    3   97    4   73
   9    9    1   78    1   78    2   94    3   95    4   91
  10   10    1   69    1   69    2   95    3   93    4   65
  11   11    1   89    1   89    2   95    3   85    4   74
  12   12    1   88    1   88    2   98    3   95    4   71
  13   13    1   45    1   45    2   87    3   92    4   68
  14   14    1   83    1   83    2   85    3   82    4   62
  15   15    1   95    1   95    2   98    3   86    4   56
  16   16    1   77    1   77    2   93    3   87    4   85
  17   17    1   69    1   69    2   87    3   89
  18   18    1   80    1   80    2   95    3   97
Continue? y
  19   19    2   93              2   93    3  100
  20   20    2   86              2   93    3   93
  21   21    2   89                        3   96
  22   22    2   95
  23   23    2   89
  24   24    2   88
  25   25    2   98
  26   26    2   89
  27   27    2   94
  28   28    2   95
  29   29    2   95
  30   30    2   98
  31   31    2   87
  32   32    2   85
  33   33    2   98
  34   34    2   93
  35   35    2   87
  36   36    2   95
  37   37    2   93
  38   38    2   93
  39   39    3   95
  40   40    3   96
  41   41    3   83
Continue? y
  42   42    3   89
  43   43    3   88
  44   44    3   87
  45   45    3   94
  46   46    3   97
  47   47    3   95
  48   48    3   93
  49   49    3   85
  50   50    3   95
  51   51    3   92
  52   52    3   82
  53   53    3   86
  54   54    3   87
  55   55    3   89
  56   56    3   97
  57   57    3  100
  58   58    3   93
  59   59    3   96
  60   60    4   84
  61   61    4   85
  62   62    4   73
  63   63    4   92
  64   64    4   57
Continue? y
  65   65    4   63
  66   66    4   69
  67   67    4   73
  68   68    4   91
  69   69    4   65
  70   70    4   74
  71   71    4   71
  72   72    4   68
  73   73    4   62
  74   74    4   56
  75   75    4   85

* NOTE   * One or more variables are undefined.
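The unstack step splits the single score column into one column per instructional method, using the method code as the subscript. A rough Python equivalent is sketched below for comparison. The function name unstack is borrowed from the MINITAB command for readability, and the result is a dictionary keyed by method code rather than numbered worksheet columns.

```python
from collections import defaultdict

def unstack(values, subscripts):
    """Group `values` into separate lists according to `subscripts`,
    preserving the original row order within each group."""
    columns = defaultdict(list)
    for value, key in zip(values, subscripts):
        columns[key].append(value)
    return dict(columns)

# Stacked data as read from one_anov.dat: method code (c2) and score (c3).
method = [1] * 18 + [2] * 20 + [3] * 21 + [4] * 16
score = [
    89, 81, 73, 84, 70, 56, 70, 81, 78, 69, 89, 88, 45, 83, 95, 77, 69, 80,
    93, 86, 89, 95, 89, 88, 98, 89, 94, 95, 95, 98, 87, 85, 98, 93, 87, 95, 93, 93,
    95, 96, 83, 89, 88, 87, 94, 97, 95, 93, 85, 95, 92, 82, 86, 87, 89, 97, 100, 93, 96,
    84, 85, 73, 92, 57, 63, 69, 73, 91, 65, 74, 71, 68, 62, 56, 85,
]

by_method = unstack(score, method)
for code, group_scores in by_method.items():
    print(code, len(group_scores), group_scores[:3])
```

As in the MINITAB session, nothing is re-keyed: the unstacked groups are derived from the stacked columns, so the two forms cannot drift apart.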

MTB > describe c6 c8 c10 c12

                N     MEAN   MEDIAN   TRMEAN    STDEV   SEMEAN
C6             18    76.50    79.00    77.31    12.28     2.89
C8             20   92.000   93.000   92.056    4.168    0.932
C10            21    91.38    93.00    91.42     5.10     1.11
C12            16    73.00    72.00    72.86    11.46     2.87

              MIN      MAX       Q1       Q3
C6          45.00    95.00    69.75    85.00
C8         85.000   98.000   88.250   95.000
C10         82.00   100.00    87.00    95.50
C12         56.00    92.00    63.50    84.75
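The describe statistics can be approximated with Python's statistics module. Two choices in the sketch below are assumptions made to match the printed values rather than documented MINITAB behavior: TRMEAN is computed by dropping 5% of the observations (rounded to the nearest whole number) from each end, and Q1 and Q3 use the "exclusive" quantile method, which is the default for statistics.quantiles (Python 3.8 or later).

```python
from math import sqrt
from statistics import mean, median, quantiles, stdev

scores = {
    "C6 (Lecture)": [89, 81, 73, 84, 70, 56, 70, 81, 78, 69, 89, 88, 45, 83, 95, 77, 69, 80],
    "C8 (CBT)": [93, 86, 89, 95, 89, 88, 98, 89, 94, 95, 95, 98, 87, 85, 98, 93, 87, 95, 93, 93],
    "C10 (Video)": [95, 96, 83, 89, 88, 87, 94, 97, 95, 93, 85, 95, 92, 82, 86, 87, 89, 97, 100, 93, 96],
    "C12 (IDS)": [84, 85, 73, 92, 57, 63, 69, 73, 91, 65, 74, 71, 68, 62, 56, 85],
}

def describe(data):
    """Return the summary statistics shown by the describe command."""
    ordered = sorted(data)
    n = len(ordered)
    trim = round(0.05 * n)                 # assumed trimming rule, see lead-in
    trimmed = ordered[trim:n - trim] if trim else ordered
    q1, _, q3 = quantiles(ordered, n=4)    # default method is "exclusive"
    sd = stdev(ordered)                    # sample standard deviation (N - 1)
    return {
        "n": n, "mean": mean(ordered), "median": median(ordered),
        "trmean": mean(trimmed), "stdev": sd, "semean": sd / sqrt(n),
        "min": ordered[0], "max": ordered[-1], "q1": q1, "q3": q3,
    }

for label, data in scores.items():
    d = describe(data)
    print(f"{label:13s} N={d['n']:2d} MEAN={d['mean']:.3f} MEDIAN={d['median']:.2f} "
          f"TRMEAN={d['trmean']:.3f} STDEV={d['stdev']:.3f} Q1={d['q1']:.2f} Q3={d['q3']:.2f}")
```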

MTB > histogram c6 c8 c10 c12

Histogram of C6   N = 18

Midpoint   Count
      45       1  *
      50       0
      55       1  *
      60       0
      65       0
      70       4  ****
      75       2  **
      80       4  ****
      85       2  **
      90       3  ***
      95       1  *

Continue? y

Histogram of C8   N = 20

Midpoint   Count
      85       1  *
      86       1  *
      87       2  **
      88       1  *
      89       3  ***
      90       0
      91       0
      92       0
      93       4  ****
      94       1  *
      95       4  ****
      96       0
      97       0
      98       3  ***

Continue? y

Histogram of C10   N = 21

Midpoint   Count
      82       1  *
      84       1  *
      86       2  **
      88       3  ***
      90       2  **
      92       1  *
      94       3  ***
      96       5  *****
      98       2  **
     100       1  *

Continue? y

Histogram of C12   N = 16

Midpoint   Count
      55       2  **
      60       1  *
      65       2  **
      70       3  ***
      75       3  ***
      80       0
      85       3  ***
      90       2  **
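The character histograms assign each score to the nearest midpoint and print one asterisk per observation. The sketch below reproduces the counts for C12 (independent study). The tie rule (a score exactly halfway between two midpoints goes to the higher one) is inferred from the printed counts for C10 and should be read as an assumption about the display, not as documented MINITAB behavior.

```python
from math import floor

def histogram_counts(data, first_midpoint, width, bins):
    """Count observations per bin, assigning each value to the nearest
    midpoint; a value exactly halfway between two midpoints goes up."""
    counts = [0] * bins
    for x in data:
        index = floor((x - first_midpoint) / width + 0.5)
        counts[index] += 1
    return counts

# C12: independent study scores, with midpoints 55, 60, ..., 90.
ids = [84, 85, 73, 92, 57, 63, 69, 73, 91, 65, 74, 71, 68, 62, 56, 85]
midpoints = list(range(55, 95, 5))
counts = histogram_counts(ids, first_midpoint=55, width=5, bins=len(midpoints))

print("Midpoint   Count")
for midpoint, count in zip(midpoints, counts):
    print(f"{midpoint:8d} {count:7d}  {'*' * count}")
```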

MTB > # I will now use the MINITAB command for STACKED data.
MTB > #
MTB > oneway c3 c2

ANALYSIS OF VARIANCE ON C3
SOURCE     DF        SS        MS        F        p
C2          3    5372.3    1790.8    23.62    0.000
ERROR      71    5383.5      75.8
TOTAL      74   10755.8
                                  INDIVIDUAL 95 PCT CI'S FOR MEAN
                                  BASED ON POOLED STDEV
LEVEL      N      MEAN     STDEV  -----+---------+---------+---------+-
    1     18    76.500    12.277        (----*----)
    2     20    92.000     4.168                           (----*----)
    3     21    91.381     5.104                          (----*----)
    4     16    73.000    11.460   (----*-----)
                                  -----+---------+---------+---------+-
                                     72.0      80.0      88.0      96.0
POOLED STDEV =    8.708

MTB > #
MTB > # And I will now use the MINITAB command for UNSTACKED data.
MTB > aovoneway c6 c8 c10 c12

ANALYSIS OF VARIANCE
SOURCE     DF        SS        MS        F        p
FACTOR      3    5372.3    1790.8    23.62    0.000
ERROR      71    5383.5      75.8
TOTAL      74   10755.8
                                  INDIVIDUAL 95 PCT CI'S FOR MEAN
                                  BASED ON POOLED STDEV
LEVEL      N      MEAN     STDEV  -----+---------+---------+---------+-
C6        18    76.500    12.277        (----*----)
C8        20    92.000     4.168                           (----*----)
C10       21    91.381     5.104                          (----*----)
C12       16    73.000    11.460   (----*-----)
                                  -----+---------+---------+---------+-
                                     72.0      80.0      88.0      96.0
POOLED STDEV =    8.708

MTB > #
MTB > # Although I'm keen on the use of SPSS, the graphic
MTB > # output with MINITAB on a Oneway ANOVA is very
MTB > # useful and easy to understand.
MTB > #
MTB > # Here, you can easily see that C6 (lecture) and c12
MTB > # (independent study) share the same pooled mean score
MTB > # on the final examination. Equally, you can also see
MTB > # that c8 (CBT) and c10 (videotape instruction) also
MTB > # share the same pooled mean.
MTB > #
MTB > # Finally, you can also see that lecture and independent
MTB > # study final examination scores are totally different,
MTB > # with no overlap, from CBT and videotape instruction
MTB > # final examination scores.
MTB > stop

--------------------------
