
Chapter 14: Repeated Measures Analysis of Variance (ANOVA)

First of all, you need to recognize the difference between a repeated measures (or dependent groups) design and a between groups (or independent groups) design. In an independent groups design, each participant is exposed to only one of the treatment levels and then provides one response on the dependent variable. However, in a repeated measures design, each participant is exposed to every treatment level and provides a response on the dependent variable after each treatment. Thus, if a participant has provided more than one score on the dependent variable, you know that you're dealing with a repeated measures design.

Comparing the Independent Groups ANOVA and the Repeated Measures ANOVA

An Independent Groups Analysis

The fact that the scores in each treatment condition come from the same participants has an important impact on the between-treatment variability found in the MSBetween (MSTreatment). In an independent groups design, the variability in the MSBetween arises from three sources: treatment effects, individual differences, and random variability. Imagine, for instance, a single-factor independent groups design with three levels of the factor. As seen below, the three group means vary.
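|      | a1   | a2   | a3   |
|------|------|------|------|
|      | 3    | 7    | 9    |
|      | 5    | 6    | 8    |
|      | 2    | 9    | 9    |
|      | 6    | 7    | 7    |
|      | 4    | 8    | 9    |
|      | 3    | 7    | 8    |
| Mean | 3.83 | 7.33 | 8.33 |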

As you should recall, the variability among the group means determines the MSBetween. In this case, MSBetween = 33.5, which is the variance of the group means (5.583) times the sample size (6). Why do the group means differ? One source of variability (individual differences) emerges because the scores in each group come from different people. Thus, even with random assignment to conditions, the group means could differ from one another because of individual differences. And the more variability due to individual differences in the population, the greater the variability both within groups and between groups. Another source of variability (random effects) should play a fairly small role. Nonetheless, because there will be some random variability, it could influence the three group means. Finally, you should imagine that your treatment will have an impact on the means, which is the treatment effect that you set out to examine in your experiment. Given the sources of variability in the MSBetween, you need to construct a MSError that involves individual differences and random variability. Thus, your F-ratio would be:
$$F = \frac{\text{Treatment Effect} + \text{Individual Differences} + \text{Random Variability}}{\text{Individual Differences} + \text{Random Variability}}$$
Ch. 14 Repeated Measures ANOVA - 1

When treatment effects are absent, your F-ratio would be roughly 1.0. As the treatment effects increased, your F-ratio would grow larger than 1.0. In the case of these data, the F-ratio would be fairly large, as seen in the SPSS source table below:
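The entries in that source table can also be checked with a short script. The sketch below is only an illustration in Python (the handout itself uses hand computation and SPSS), and the function and variable names are made up for this example. It applies the usual computational formulas to the three groups of scores shown earlier.

```python
# Scores from the handout; each inner list is one treatment group (a1, a2, a3).
groups = [
    [3, 5, 2, 6, 4, 3],
    [7, 6, 9, 7, 8, 7],
    [9, 8, 9, 7, 9, 8],
]


def independent_groups_anova(groups):
    """Return (ss_between, ss_within, df_between, df_within, f_ratio)."""
    all_scores = [x for g in groups for x in g]
    big_n = len(all_scores)
    correction = sum(all_scores) ** 2 / big_n                # G^2 / N
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_total = sum(x ** 2 for x in all_scores) - correction
    ss_within = ss_total - ss_between                        # error term
    df_between = len(groups) - 1
    df_within = big_n - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio


ss_b, ss_w, df_b, df_w, f = independent_groups_anova(groups)
print(f"SS_between={ss_b:.2f}  SS_within={ss_w:.2f}  F({df_b},{df_w})={f:.2f}")
```

The F-ratio printed here should agree, within rounding, with the independent groups value of F = 25.8 quoted later in the handout.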

A Repeated Measures Analysis

Imagine, now, that you have the same three conditions and the same 18 scores, but now presume that they come from only 6 participants in a repeated measures design. Even though the MSBetween would be identical, in a repeated measures design that variability is not influenced by individual differences. Thus, the MSBetween of 33.5 would come from treatment effects and random effects. In order to construct an appropriate F-ratio, you now need to develop an error term that contains only random variability. The logic of the procedure we will use is to take the error term that would be constructed were these data from an independent groups design (and would include individual differences and random variability) and remove the portion due to individual differences, which leaves behind the random variability that we want in our error term. The process is illustrated schematically in the pie charts below.

[Figure: two pie charts partitioning the variability, one for the Independent Groups design and one for the Repeated Measures design]


Conceptually, then, our F-ratio would be comprised of the components seen below:
$$F = \frac{\text{Treatment Effect} + \text{Random Variability}}{\text{Random Variability}}$$

Remember, however, that even though the components in the numerator of the F-ratio differ in the independent groups and repeated measures ANOVAs, the computations are identical. That is, regardless of the nature of the design, the formula for SSBetween is:

$$SS_{Treatment} = \sum \frac{T^2}{n} - \frac{G^2}{N}$$

And the formula for dfBetween is:

$$df_{Treatment} = k - 1$$

Furthermore, you'll still need to compute the SSError for the independent groups ANOVA (which is just the sum of the SS for each condition) and the dfError for the independent groups ANOVA (which is just n - 1 for each condition times the number of conditions). However, because this old error term contains both individual differences and random variability, we need to estimate and remove the contribution of individual differences. We estimate the contribution of individual differences using the same logic as we use when computing the variability among treatments. That is, we treat each participant as the level of a factor (think of the factor as Subject or Participant). If you think of the computation this way, you'll immediately notice that the formulas for SSBetween and SSSubject are identical, with the SSBetween working on columns while the SSSubject works on rows. The actual formula would be:

$$SS_{Subject} = \sum \frac{P^2}{k} - \frac{G^2}{N}$$

If you'll look at our data again, to complete your computation you would need to sum across each of the participants and then square those sums before adding them and dividing by the number of treatments.

|      | a1   | a2   | a3   | P  |
|------|------|------|------|----|
|      | 3    | 7    | 9    | 19 |
|      | 5    | 6    | 8    | 19 |
|      | 2    | 9    | 9    | 20 |
|      | 6    | 7    | 7    | 20 |
|      | 4    | 8    | 9    | 21 |
|      | 3    | 7    | 8    | 18 |
| Mean | 3.83 | 7.33 | 8.33 |    |


Your computation of SSSubject would be:

$$SS_{Subject} = \frac{19^2 + 19^2 + 20^2 + 20^2 + 21^2 + 18^2}{3} - \frac{117^2}{18} = \frac{2287}{3} - 760.5 = 1.83$$

You would then enter the SSSubject into the source table and subtract it from the SSWithin (which is the error term from the independent groups design). As seen in the source table below, when you subtract that SSSubject, you are left with SSError = 17.67. The SS in the denominator of the repeated measures design will always be less than that found in an independent groups design for the same scores.

| Source        | SS    | df | MS   | F     |
|---------------|-------|----|------|-------|
| Between       | 67    | 2  | 33.5 | 18.93 |
| Within Groups | 19.5  | 15 |      |       |
| Subject       | 1.83  | 5  |      |       |
| Error         | 17.67 | 10 | 1.77 |       |
| Total         | 86.5  | 17 |      |       |

Of course, you need to apply the same procedure to the degrees of freedom. The dfWithinGroups for the independent groups design must be reduced by the dfSubject. The dfSubject is simply:

$$df_{Subjects} = n - 1$$

Just as you should note the parallel between the SSBetween and the SSSubject, you should also note the parallel between the dfBetween and the dfSubject. Because you remove the dfSubject, the df in the error term for the repeated measures design will always be less than the df in the error term for an independent groups design for the same scores. Furthermore, it will always be true that the dfError in a repeated measures design is the product of the dfBetween and the dfSubject. For completeness, below is the source table that SPSS would generate for these data using a repeated measures ANOVA:
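The partition of SS and df described above can be sketched in a few lines of Python. This is only an illustration under the assumption that rows are participants and columns are treatments. The function name and dictionary keys are invented for this example and are not part of SPSS or the textbook.

```python
# Rows are participants, columns are treatments a1, a2, a3 (scores from the handout).
scores = [
    [3, 7, 9],
    [5, 6, 8],
    [2, 9, 9],
    [6, 7, 7],
    [4, 8, 9],
    [3, 7, 8],
]


def repeated_measures_anova(scores):
    """Partition SS and df as described in the handout; returns source-table entries."""
    n = len(scores)                      # number of participants
    k = len(scores[0])                   # number of treatments
    correction = sum(map(sum, scores)) ** 2 / (n * k)        # G^2 / N
    treatment_totals = [sum(col) for col in zip(*scores)]    # T for each column
    participant_totals = [sum(row) for row in scores]        # P for each row
    ss_total = sum(x * x for row in scores for x in row) - correction
    ss_between = sum(t * t for t in treatment_totals) / n - correction
    ss_within = ss_total - ss_between    # error term of the independent groups design
    ss_subject = sum(p * p for p in participant_totals) / k - correction
    ss_error = ss_within - ss_subject    # individual differences removed
    df_between, df_subject = k - 1, n - 1
    df_error = df_between * df_subject
    ms_between = ss_between / df_between
    ms_error = ss_error / df_error
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "ss_subject": ss_subject, "ss_error": ss_error,
        "df_error": df_error, "ms_error": ms_error,
        "f": ms_between / ms_error,
    }


table = repeated_measures_anova(scores)
for name, value in table.items():
    print(f"{name:>11}: {value:.2f}")
```

Because the script carries full precision, its F-ratio can differ in the second decimal place from a hand computation that rounds MSError to 1.77.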


You should note the differences between the source tables that you would generate doing the analyses as shown in your Gravetter & Wallnau textbook and that generated by SPSS. Note that SPSS doesn't show the Subject effect, but just the Treatment effect (A) and the Error term. Moreover, it produces a whole set of F-ratios. You can ignore the lower three lines (Greenhouse-Geisser, Huynh-Feldt, and Lower-bound). Thus, you would focus on the Sphericity Assumed line. I'll say more about the SPSS output later in the handout.

You should also note a perplexing result. Generally speaking, the repeated measures design is more powerful than the independent groups design. Thus, you should expect that the F-ratio would be larger for the repeated measures design than it is for the independent groups design. For these data, however, that's not the case. Note that for the independent groups ANOVA, F = 25.8 and for the repeated measures ANOVA, F = 18.9. (For the repeated measures analysis, the difference between the SPSS F and the calculator-computed F is due to rounding error.) What happened?

Think, first of all, of the formula for the F-ratio. The numerator is identical, whether the analysis is for an independent groups design or a repeated measures design. So for any difference in the F-ratio to emerge, it has to come from the denominator. Generally speaking, as seen in the formula below, larger F-ratios would come from larger dfError and smaller SSError.

$$F = \frac{MS_{Treatment}}{SS_{Error} / df_{Error}}$$

But, for identical data, the dfError will always be smaller for a repeated measures analysis! So, how does the increased power emerge? Again, for identical data, it's also true that the SSError will always be smaller for a repeated measures analysis. As long as the SSSubject is substantial, the F-ratio will be larger for the repeated measures analysis. For these data, however, the SSSubject is actually fairly small, resulting in a smaller F-ratio. Thus, the power of the repeated measures design emerges from the presumption that people will vary. That is, you're betting on substantial individual differences. As you look at the people around you, that presumption is not all that unreasonable.

Use the source table below to determine the break-even point for this data set. What SSSubject would need to be present to give you the exact same F-ratio as for the independent groups ANOVA?

| Source        | SS   | df | MS   | F    |
|---------------|------|----|------|------|
| Between       | 67   | 2  | 33.5 | 25.8 |
| Within Groups | 19.5 | 15 |      |      |
| Subject       |      | 5  |      |      |
| Error         |      | 10 |      |      |
| Total         | 86.5 | 17 |      |      |
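One way to reason about the break-even point: the numerator of F is the same in both analyses, so the two F-ratios match when the repeated measures MSError equals the independent groups MSWithin. The sketch below works through that arithmetic in Python. The variable names are illustrative only, and you may want to try the problem by hand before reading it.

```python
# Values taken from the source table above.
ss_within = 19.5    # error SS for the independent groups analysis
df_within = 15      # error df for the independent groups analysis
df_error = 10       # error df for the repeated measures analysis

# The F-ratios match when MS_error (repeated) equals MS_within (independent).
ms_within = ss_within / df_within
ss_error_needed = ms_within * df_error

# Whatever is not left in SS_error must have been removed as SS_subject.
ss_subject_break_even = ss_within - ss_error_needed

print(f"MS_within             = {ms_within:.2f}")
print(f"SS_error needed       = {ss_error_needed:.2f}")
print(f"SS_subject break-even = {ss_subject_break_even:.2f}")
```

Under this reasoning the break-even SSSubject is one third of SSWithin, because dfSubject / dfWithin = 5 / 15.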


So, as long as you had more than that level of SSSubject you would achieve a larger F-ratio using the repeated measures design.

Testing the Null Hypothesis and Post Hoc Tests for Repeated Measures ANOVAs

You would set up and test the null hypothesis for a repeated measures design just as you would for an independent groups design. That is, for this example, the null and alternative hypotheses would be identical for the two designs:

$$H_0: \mu_1 = \mu_2 = \mu_3 \qquad H_1: \text{Not } H_0$$

To test the null hypothesis for a repeated measures design, you would look up the FCritical with the dfBetween and the dfError found in your source table. That is, for this example, FCrit(2,10) = 4.10. If you reject H0, as you would in this case, you would then need to compute a post hoc test to determine exactly which of the conditions differed. Again, the computation of Tukey's HSD would parallel the procedure you used for an independent groups analysis. In this case, for the independent groups design, your Tukey's HSD would be:

$$HSD = 3.67\sqrt{\frac{1.3}{6}} = 1.71$$

For the repeated measures design, your Tukey's HSD would be:

$$HSD = 3.88\sqrt{\frac{1.77}{6}} = 2.1$$
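The two HSD values, and the pairwise comparisons they imply, can be sketched as follows. The q values (3.67 and 3.88) and the MSError values are copied from the computations above. The helper name `tukey_hsd` and the dictionary of rounded means are assumptions made for this illustration.

```python
import math
from itertools import combinations


def tukey_hsd(q, ms_error, n):
    """HSD = q * sqrt(MS_error / n), with q read from a studentized range table."""
    return q * math.sqrt(ms_error / n)


hsd_independent = tukey_hsd(3.67, 1.3, 6)    # q for k = 3, df_error = 15
hsd_repeated = tukey_hsd(3.88, 1.77, 6)      # q for k = 3, df_error = 10

# Treatment means from the data table (rounded as in the handout).
means = {"a1": 3.83, "a2": 7.33, "a3": 8.33}

# A pair of conditions differs if their mean difference is at least the HSD.
significant_pairs = [
    (a, b)
    for a, b in combinations(means, 2)
    if abs(means[a] - means[b]) >= hsd_repeated
]

print(f"HSD (independent groups) = {hsd_independent:.2f}")
print(f"HSD (repeated measures)  = {hsd_repeated:.2f}")
print("Pairs that differ (repeated measures):", significant_pairs)
```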

Ordinarily, of course, your HSD would be smaller for the repeated measures design, due to the typical reduction in the MSError. For this particular data set, given the lack of individual differences, that's not the case.

Estimating Effect Size

The measure of effect size is computed slightly differently for the repeated measures design. The numerator stays the same (which should make sense to you), but the denominator changes (just as is true for the F-ratio), so that it has no variability due to individual differences.

$$\eta^2 = \frac{SS_{Treatment}}{SS_{Total} - SS_{Subjects}} = \frac{67}{86.5 - 1.83} = .79$$


A Computational Example

RESEARCH QUESTION: Does behavior modification (response-cost technique) reduce the outbursts of unruly children?

EXPERIMENT: Randomly select 6 participants, who are tested before treatment, then one week, one month, and six months after treatment. The IV is the duration of the treatment. The DV is the number of unruly acts observed.

STATISTICAL HYPOTHESES: $H_0: \mu_{Before} = \mu_{1\,Week} = \mu_{1\,Month} = \mu_{6\,Months}$; $H_1$: Not $H_0$

DECISION RULE: If $F_{Obt} \geq F_{Crit}$, Reject $H_0$. FCrit(3,15) = 3.29

DATA:

|        | Before | 1 Week | 1 Month | 6 Months | P  | SUM  |
|--------|--------|--------|---------|----------|----|------|
| P1     | 8      | 2      | 1       | 1        | 12 |      |
| P2     | 4      | 1      | 1       | 0        | 6  |      |
| P3     | 6      | 1      | 0       | 2        | 9  |      |
| P4     | 8      | 3      | 4       | 1        | 16 |      |
| P5     | 7      | 4      | 3       | 2        | 16 |      |
| P6     | 6      | 2      | 1       | 1        | 10 |      |
| Mean   | 6.5    | 2.17   | 1.67    | 1.17     |    |      |
| T (ΣX) | 39     | 13     | 10      | 7        |    | 69   |
| ΣX²    | 265    | 35     | 28      | 11       |    | 339  |
| SS     | 11.5   | 6.8    | 11.3    | 2.8      |    | 32.4 |

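SOURCE TABLE:

| SOURCE        | SS Formula                               | SS | df | MS |
|---------------|------------------------------------------|----|----|----|
| Between       | $\sum \frac{T^2}{n} - \frac{G^2}{N}$     |    |    |    |
| Within grps   | $\sum SS$ in each group                  |    |    |    |
| Between subjs | $\sum \frac{P^2}{k} - \frac{G^2}{N}$     |    |    |    |
| Error         | (SSWithin Groups - SSBetween subjects)   |    |    |    |
| Total         | $\sum X^2 - \frac{G^2}{N}$               |    |    |    |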


DECISION:

POST HOC TEST:

INTERPRETATION:

EFFECT SIZE:
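Once you have filled in the source table by hand, a short script is one way to check your arithmetic. The sketch below is an assumption-laden illustration in Python: it treats rows as participants P1 to P6 and columns as Before, 1 Week, 1 Month, and 6 Months, and the function name is invented here. Because it does not round the SS for each group, its SSWithin comes out as 32.5 rather than the 32.4 obtained by summing the rounded values in the data table.

```python
# Rows: P1..P6. Columns: Before, 1 Week, 1 Month, 6 Months (handout data).
scores = [
    [8, 2, 1, 1],
    [4, 1, 1, 0],
    [6, 1, 0, 2],
    [8, 3, 4, 1],
    [7, 4, 3, 2],
    [6, 2, 1, 1],
]


def source_table(scores):
    """Return the repeated measures source-table entries as a dict."""
    n, k = len(scores), len(scores[0])
    correction = sum(map(sum, scores)) ** 2 / (n * k)             # G^2 / N
    ss_total = sum(x * x for row in scores for x in row) - correction
    ss_between = sum(sum(col) ** 2 for col in zip(*scores)) / n - correction
    ss_subject = sum(sum(row) ** 2 for row in scores) / k - correction
    ss_within = ss_total - ss_between
    ss_error = ss_within - ss_subject
    df_between, df_error = k - 1, (k - 1) * (n - 1)
    f_ratio = (ss_between / df_between) / (ss_error / df_error)
    eta_squared = ss_between / (ss_total - ss_subject)
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "ss_subject": ss_subject, "ss_error": ss_error,
        "ss_total": ss_total, "df_between": df_between,
        "df_error": df_error, "f": f_ratio, "eta_squared": eta_squared,
    }


result = source_table(scores)
for name, value in result.items():
    print(f"{name:>11}: {value:.3f}")
```

Compare the printed F with FCrit(3,15) = 3.29 from the decision rule to reach your decision.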


Suppose that you continued to assess the amount of unruly behavior in the children after the treatment was withdrawn. You assess the number of unruly acts after 12 months, 18 months, 24 months and 30 months. Suppose that you obtain the following data. What could you conclude?
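|        | 12 Months | 18 Months | 24 Months | 30 Months | P  |
|--------|-----------|-----------|-----------|-----------|----|
| P1     | 1         | 2         | 2         | 5         | 10 |
| P2     | 2         | 2         | 3         | 4         | 11 |
| P3     | 1         | 3         | 3         | 4         | 11 |
| P4     | 3         | 4         | 4         | 6         | 17 |
| P5     | 2         | 2         | 3         | 5         | 12 |
| P6     | 1         | 2         | 4         | 4         | 11 |
| T (ΣX) | 10        | 15        | 19        | 28        | 72 |
| ΣX²    | 20        | 41        | 63        | 134       |    |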

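| SOURCE        | SS Formula                               | SS | df | MS |
|---------------|------------------------------------------|----|----|----|
| Between       | $\sum \frac{T^2}{n} - \frac{G^2}{N}$     |    |    |    |
| Within grps   | $\sum SS$ in each group                  |    |    |    |
| Between subjs | $\sum \frac{P^2}{k} - \frac{G^2}{N}$     |    |    |    |
| Error         | (SSWithin Groups - SSBetween subjects)   |    |    |    |
| Total         | $\sum X^2 - \frac{G^2}{N}$               |    |    |    |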

DECISION:

POST HOC TEST:

INTERPRETATION:

EFFECT SIZE:


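An Example to Compare Independent Groups and Repeated Measures ANOVAs

Independent Groups ANOVA

|        | A1 | A2  | A3  | A4  | Sum    |
|--------|----|-----|-----|-----|--------|
|        | 1  | 2   | 3   | 4   |        |
|        | 1  | 3   | 4   | 5   |        |
|        | 2  | 3   | 4   | 6   |        |
|        | 4  | 3   | 5   | 6   |        |
| T (ΣX) | 8  | 11  | 16  | 21  | 56 (G) |
| ΣX²    | 22 | 31  | 66  | 113 |        |
| SS     | 6  | .75 | 2   | 2.75| 11.5   |
| s²     | 2  | .25 | .67 | .92 |        |

| SOURCE  | SS | df | MS |
|---------|----|----|----|
| Between |    |    |    |
| Error   |    |    |    |
| Total   |    |    |    |

Repeated Measures ANOVA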


A1, A2, A3, A4: Exactly the same as above

| SOURCE        | SS | df | MS |
|---------------|----|----|----|
| Between       |    |    |    |
| Within Groups |    |    |    |
| Between Subjs |    |    |    |
| Error         |    |    |    |
| Total         |    |    |    |
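After completing both source tables by hand, you can compare the two F-ratios with a sketch like the one below. It is an illustration only, written in Python with made-up names, and it assumes that in the repeated measures version each row of the data table (read top to bottom) comes from the same participant.

```python
# Columns: A1..A4. In the repeated measures reading, each row is one participant
# (an assumption about how the rows of the data table line up).
scores = [
    [1, 2, 3, 4],
    [1, 3, 4, 5],
    [2, 3, 4, 6],
    [4, 3, 5, 6],
]

n, k = len(scores), len(scores[0])
correction = sum(map(sum, scores)) ** 2 / (n * k)                 # G^2 / N
ss_total = sum(x * x for row in scores for x in row) - correction
ss_between = sum(sum(col) ** 2 for col in zip(*scores)) / n - correction
ss_within = ss_total - ss_between
ss_subject = sum(sum(row) ** 2 for row in scores) / k - correction
ss_error = ss_within - ss_subject

ms_between = ss_between / (k - 1)

# Independent groups: the error term keeps the individual differences.
f_independent = ms_between / (ss_within / (n * k - k))

# Repeated measures: the individual differences are removed from the error term.
f_repeated = ms_between / (ss_error / ((k - 1) * (n - 1)))

print(f"Independent groups: F({k - 1},{n * k - k}) = {f_independent:.2f}")
print(f"Repeated measures:  F({k - 1},{(k - 1) * (n - 1)}) = {f_repeated:.2f}")
```

For these scores the participant totals differ considerably, so removing SSSubject shrinks the error term by more than the loss of df costs.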


Repeated Measures Analyses: The Error Term

In a repeated measures analysis, the MSError is actually the interaction between participants and treatment. However, that won't make much sense to you until we've talked about two-factor ANOVA. For now, we'll simply look at the data that would produce different kinds of error terms in a repeated measures analysis, to give you a clearer understanding of the factors that influence the error term. These examples are derived from the example in your textbook (G&W, 14.4).

Imagine a study in which rats are given each of three types of food rewards (2, 4, or 6 grams) when they complete a maze. The DV is the time to complete the maze. As you can see in the graph below, Participant1 is the fastest and Participant6 is the slowest. The differences in average performance represent individual differences. If the 6 lines were absolutely parallel, the MSError would be 0, so an F-ratio could not be computed. So, I've tweaked the data to be sure that the lines were not perfectly parallel. Nonetheless, if performance was as illustrated below, the MSError would be quite small. The data are seen below in tabular form and then in graphical form.
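|      | 2 grams | 4 grams | 6 grams | P    |
|------|---------|---------|---------|------|
| P1   | 1.0     | 1.5     | 2.0     | 4.5  |
| P2   | 2.0     | 2.5     | 3.5     | 8.0  |
| P3   | 3.0     | 3.5     | 5.0     | 11.5 |
| P4   | 4.0     | 5.0     | 6.0     | 15.0 |
| P5   | 5.0     | 6.5     | 7.0     | 18.5 |
| P6   | 6.0     | 7.5     | 9.0     | 22.5 |
| Mean | 3.5     | 4.42    | 5.42    |      |
| s²   | 3.5     | 5.44    | 6.24    |      |

[Figure: Small MSError. Line graph of Speed of Response (0 to 10) against Amount of Reward (2, 4, 6 grams), with one line for each of Participant1 through Participant6]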

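The ANOVA on these data would be as seen below. Note that the F-ratio would be significant (FCrit(2,10) = 4.1).

| Source        | SS    | df | MS    | F     |
|---------------|-------|----|-------|-------|
| Between Treat | 11.03 | 2  | 5.51  | 37.45 |
| Within        | 75.9  | 15 |       |       |
| Subject       | 74.43 | 5  |       |       |
| Error         | 1.47  | 10 | 0.147 |       |
| Total         | 86.93 | 17 |       |       |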

Moderate MSError

Next, keeping all the data the same (so SSTotal would be unchanged), and only rearranging data within a treatment (so that the s² for each treatment would be unchanged), I've created greater interaction between participants and treatment. Note that the participant means would now be closer together, which means that the SSSubject is smaller. In the data table below, you'll note that the sums across participants (P) are more similar than in the earlier example.

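|      | 2 grams | 4 grams | 6 grams | P    |
|------|---------|---------|---------|------|
| P1   | 1.0     | 1.5     | 3.5     | 6.0  |
| P2   | 2.0     | 3.5     | 5.0     | 10.5 |
| P3   | 3.0     | 2.5     | 2.0     | 7.5  |
| P4   | 4.0     | 6.5     | 6.0     | 16.5 |
| P5   | 5.0     | 5.0     | 9.0     | 19.0 |
| P6   | 6.0     | 7.5     | 7.0     | 20.5 |
| Mean | 3.5     | 4.42    | 5.42    |      |
| s²   | 3.5     | 5.44    | 6.24    |      |

[Figure: Moderate MSError. Line graph of Speed of Response (0 to 10) against Amount of Reward (2, 4, 6), with one line for each of Participant1 through Participant6]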

Note that the F-ratio is still significant (FCrit(2,10) = 4.1), though it is much reduced. Note, also, that the MSTreatment is the same as in the earlier example.

| Source        | SS    | df | MS   | F    |
|---------------|-------|----|------|------|
| Between Treat | 11.03 | 2  | 5.51 | 4.31 |
| Within        | 75.9  | 15 |      |      |
| Subject       | 63.09 | 5  |      |      |
| Error         | 12.81 | 10 | 1.28 |      |
| Total         | 86.93 | 17 |      |      |

Large MSError

Next, using the same procedure, I'll rearrange the scores even more, which will produce an even larger MSError. Note, again, that the SSSubject grows smaller (as the Participant means grow closer to one another) and the SSError grows larger.
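|      | 2 grams | 4 grams | 6 grams | P    |
|------|---------|---------|---------|------|
| P1   | 1.0     | 3.5     | 6.0     | 10.5 |
| P2   | 2.0     | 6.5     | 9.0     | 17.5 |
| P3   | 3.0     | 7.5     | 3.5     | 14.0 |
| P4   | 4.0     | 1.5     | 5.0     | 10.5 |
| P5   | 5.0     | 2.5     | 7.0     | 14.5 |
| P6   | 6.0     | 5.0     | 2.0     | 13.0 |
| Mean | 3.5     | 4.42    | 5.42    |      |
| s²   | 3.5     | 5.44    | 6.24    |      |

[Figure: Large MSError. Line graph of Speed of Response (0 to 10) against Amount of Reward (2, 4, 6), with one line for each of Participant1 through Participant6]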

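| Source        | SS    | df | MS   | F   |
|---------------|-------|----|------|-----|
| Between Treat | 11.03 | 2  | 5.51 | .86 |
| Within        | 75.9  | 15 |      |     |
| Subject       | 11.76 | 5  |      |     |
| Error         | 64.14 | 10 | 6.41 |     |
| Total         | 86.93 | 17 |      |     |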
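The three arrangements can be run through the same partition to see how moving scores between participants, within a treatment, shifts variability from SSSubject into SSError while the treatment effect stays fixed. The sketch below is illustrative Python with invented names. Rows are participants and columns are the 2, 4, and 6 gram conditions.

```python
def error_partition(scores):
    """Return (ss_subject, ss_error, f_ratio); rows = participants, cols = treatments."""
    n, k = len(scores), len(scores[0])
    correction = sum(map(sum, scores)) ** 2 / (n * k)             # G^2 / N
    ss_total = sum(x * x for row in scores for x in row) - correction
    ss_between = sum(sum(col) ** 2 for col in zip(*scores)) / n - correction
    ss_subject = sum(sum(row) ** 2 for row in scores) / k - correction
    ss_error = ss_total - ss_between - ss_subject
    f_ratio = (ss_between / (k - 1)) / (ss_error / ((k - 1) * (n - 1)))
    return ss_subject, ss_error, f_ratio


# Same scores within each column; only their assignment to participants differs.
arrangements = {
    "small": [[1.0, 1.5, 2.0], [2.0, 2.5, 3.5], [3.0, 3.5, 5.0],
              [4.0, 5.0, 6.0], [5.0, 6.5, 7.0], [6.0, 7.5, 9.0]],
    "moderate": [[1.0, 1.5, 3.5], [2.0, 3.5, 5.0], [3.0, 2.5, 2.0],
                 [4.0, 6.5, 6.0], [5.0, 5.0, 9.0], [6.0, 7.5, 7.0]],
    "large": [[1.0, 3.5, 6.0], [2.0, 6.5, 9.0], [3.0, 7.5, 3.5],
              [4.0, 1.5, 5.0], [5.0, 2.5, 7.0], [6.0, 5.0, 2.0]],
}

results = {name: error_partition(data) for name, data in arrangements.items()}
for name, (ss_subject, ss_error, f_ratio) in results.items():
    print(f"{name:>8}: SS_subject={ss_subject:6.2f}  SS_error={ss_error:6.2f}  F={f_ratio:6.2f}")
```

The unrounded values can differ from the source tables in the second decimal place, since the tables were built from rounded intermediate figures.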


Varying Individual Differences

It is possible to keep the MSError constant, while increasing the MSSubject, as the two examples below illustrate. As you see in the first example, the SSSubject is fairly small and the MSError is quite small.
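|         | 2 grams | 4 grams | 6 grams | P     |
|---------|---------|---------|---------|-------|
| P1      | 2.0     | 3.0     | 4.0     | 9.0   |
| P2      | 3.0     | 4.0     | 5.5     | 12.5  |
| P3      | 4.0     | 5.0     | 6.0     | 15.0  |
| P4      | 5.0     | 6.0     | 7.5     | 18.5  |
| P5      | 6.0     | 7.0     | 8.0     | 21.0  |
| P6      | 7.0     | 8.0     | 9.5     | 24.5  |
| M       | 4.5     | 5.5     | 6.75    |       |
| Sum (T) | 27.0    | 33.0    | 40.5    | 100.5 |
| SS      | 17.5    | 17.5    | 19.375  |       |

[Figure: Small Individual Differences. Line graph of Speed of Response (0 to 10) against Amount of Reward (2, 4, 6 grams), with one line for each of Participant1 through Participant6]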

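| Source        | SS     | df | MS    | F   |
|---------------|--------|----|-------|-----|
| Between Treat | 15.25  | 2  | 7.625 | 305 |
| Within        | 54.375 | 15 |       |     |
| Subject       | 54.125 | 5  |       |     |
| Error         | .25    | 10 | 0.025 |     |
| Total         | 69.625 | 17 |       |     |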

Next, I've decreased the first two participants' scores by a constant amount and increased the last two participants' scores by a constant amount. Because the interaction between participant and treatment is the same, the MSError is unchanged. However, because the means for the 6 participants are more different than before (greater individual differences), the SSSubject increases. Nonetheless, the F-ratio is the same, because those individual differences are removed from the error term.
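|         | 2 grams | 4 grams | 6 grams | P     |
|---------|---------|---------|---------|-------|
| P1      | 1.0     | 2.0     | 3.0     | 6.0   |
| P2      | 2.0     | 3.0     | 4.5     | 9.5   |
| P3      | 4.0     | 5.0     | 6.0     | 15.0  |
| P4      | 5.0     | 6.0     | 7.5     | 18.5  |
| P5      | 7.0     | 8.0     | 9.0     | 24.0  |
| P6      | 8.0     | 9.0     | 10.5    | 27.5  |
| M       | 4.5     | 5.5     | 6.75    |       |
| Sum (T) | 27.0    | 33.0    | 40.5    | 100.5 |
| SS      | 37.5    | 37.5    | 39.375  |       |

[Figure: Moderate Individual Differences. Line graph of Speed of Response (0 to 12) against Amount of Reward (2, 4, 6 grams), with one line for each of Participant1 through Participant6]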

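| Source        | SS      | df | MS    | F   |
|---------------|---------|----|-------|-----|
| Between Treat | 15.25   | 2  | 7.625 | 305 |
| Within        | 114.375 | 15 |       |     |
| Subject       | 114.125 | 5  |       |     |
| Error         | .25     | 10 | 0.025 |     |
| Total         | 129.625 | 17 |       |     |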
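The point of these two examples, that adding a constant to every score of a participant changes SSSubject but leaves SSError and F alone, can be checked with a short sketch. The Python below is illustrative only. The names are invented here, and the shifts of -1 (first two participants) and +1 (last two participants) are read off by comparing the two data tables.

```python
def partition(scores):
    """Return (ss_subject, ss_error, f_ratio); rows = participants, cols = treatments."""
    n, k = len(scores), len(scores[0])
    correction = sum(map(sum, scores)) ** 2 / (n * k)             # G^2 / N
    ss_total = sum(x * x for row in scores for x in row) - correction
    ss_between = sum(sum(col) ** 2 for col in zip(*scores)) / n - correction
    ss_subject = sum(sum(row) ** 2 for row in scores) / k - correction
    ss_error = ss_total - ss_between - ss_subject
    f_ratio = (ss_between / (k - 1)) / (ss_error / ((k - 1) * (n - 1)))
    return ss_subject, ss_error, f_ratio


# First example (small individual differences): 2, 4, and 6 gram columns.
base = [
    [2.0, 3.0, 4.0], [3.0, 4.0, 5.5], [4.0, 5.0, 6.0],
    [5.0, 6.0, 7.5], [6.0, 7.0, 8.0], [7.0, 8.0, 9.5],
]

# One constant per participant, applied to all of that participant's scores.
shifts = [-1.0, -1.0, 0.0, 0.0, 1.0, 1.0]
shifted = [[x + s for x in row] for row, s in zip(base, shifts)]

base_subject, base_error, base_f = partition(base)
new_subject, new_error, new_f = partition(shifted)

print(f"before: SS_subject={base_subject:.3f}  SS_error={base_error:.3f}  F={base_f:.1f}")
print(f"after:  SS_subject={new_subject:.3f}  SS_error={new_error:.3f}  F={new_f:.1f}")
```

Because the shifts sum to zero, the treatment totals (and so the SSBetween) are also unchanged in this particular example.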


SPSS for Repeated Measures ANOVA: G&W 457

First, enter as many columns (variables) as you have levels of your independent variable. Below left are the data, with each column containing scores for a particular level of the IV. For the analysis, choose General Linear Model->Repeated Measures from the Analyze menu. Doing so will produce the window seen below right. Note that I've given the Within-Subject Factor Name (sleepdep) and the number of levels (3). Once I click on Add, I would click on the Define button.

The next window that appears has all your variables on the left. I've moved the appropriate ones to the right, as seen below left. As was true for the independent groups ANOVA, you'd probably want to know the group means, etc. Thus, you'd click on the Options button and check the Descriptive Statistics box. As you see in the window below right, I've also checked the boxes for effect size and power.

Clicking on the OK button will produce the analysis seen below. The first information will be the descriptive statistics.


Next will be some output (multivariate analyses, sphericity test) that you can ignore.

Next will be the actual source table for the ANOVA. As we noted earlier, the source table appears to be relatively complicated, but you can simplify the output with the proper focus. First, note that there are two basic rows of interest: the Treatment row (sleepdep) containing the F-ratio and the Error row. Within each row will be other information you can ignore. For instance, for our purposes, you can focus entirely on the Sphericity Assumed line.

Finally, there are some other parts of the output that you can ignore, as seen below:


Practice Problems

Drs. Dewey, Stink, & Howe were interested in memory for various odors. They conducted a study in which 6 participants were exposed to 10 common food odors (orange, onion, etc.) and 10 common non-food odors (motor oil, skunk, etc.) to see if people are better at identifying one type of odorant or the other. The 20 odors were presented in a random fashion, so that both classes of odors occurred equally often at the beginning of the list, at the end of the list, etc. (Thus, this randomization is a strategy that serves the same function as counterbalancing.) The dependent variable is the number of odors of each class correctly identified by each participant. The data are seen below. Analyze the data and fully interpret the results of this study.

|        | Food Odors | Non-Food Odors |
|--------|------------|----------------|
|        | 7          | 4              |
|        | 8          | 6              |
|        | 6          | 4              |
|        | 9          | 7              |
|        | 7          | 5              |
|        | 5          | 3              |
| ΣX (T) | 42         | 29             |
| ΣX²    | 304        | 151            |
| SS     | 10         | 10.8           |


Suppose that Dr. Belfry was interested in conducting a study about the auditory capabilities of bats, looking at bats' abilities to avoid wires of varying thickness as they traverse a maze. The DV is the number of times that the bat touches the wires. (Thus, higher numbers indicate an inability to detect the wire.) Complete the source table below and fully interpret the results.


Dr. Richard Noggin is interested in the effect of different types of persuasive messages on a person's willingness to engage in socially conscious behaviors. To that end, he asks his participants to listen to each of four different types of messages (Fear Invoking, Appeal to Conscience, Guilt, and Information Laden). After listening to each message, the participant rates how effective the message was on a scale of 1-7 (1 = very ineffective and 7 = very effective). Complete the source table and analyze the data as completely as you can.


Dr. Beau Peep believes that pupil size increases during emotional arousal. He was interested in testing if the increase in pupil size was a function of the type of arousal (pleasant vs. aversive). A random sample of 5 participants is selected for the study. Each participant views all three stimuli: neutral, pleasant, and aversive photographs. The neutral photograph portrays a plain brick building. The pleasant photograph consists of a young man and woman sharing a large ice cream cone. Finally, the aversive stimulus is a graphic photograph of an automobile accident. Upon viewing each photograph, the pupil size is measured in millimeters. An incomplete source table resulting from analysis of these data is seen below. Complete the source table and analyze the data as completely as possible.


As before, given that old exams use StatView, here is an example of a repeated measures analysis in StatView. Note that there are several differences between the SPSS output and StatView output (reversal of df and SS columns, inclusion of a row for the Subject effect).

Suppose you are interested in studying the impact of duration of exposure to faces on the ability of people to recognize faces. To finesse the issue of the actual durations used, I'll call them Short, Medium, and Long durations. Participants are first exposed to a set of 30 faces for one duration and then tested on their memory for those faces. Then they are exposed to another set of 30 faces for a different duration and then tested. Finally, they are given a final set of 30 faces for the final duration and then tested. The DV for this analysis is the percent Hits (saying "Old" to an Old item). Suppose that the results of the experiment come out as seen below. Complete the analysis and interpret the results as completely as you can. If the results turned out as seen below, what would they mean to you?

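Means Table for Duration
Effect: Category for Duration

|        | Count | Mean   | Std. Dev. | Std. Err. |
|--------|-------|--------|-----------|-----------|
| Short  | 24    | 43.833 | 7.257     | 1.481     |
| Medium | 24    | 47.792 | 7.342     | 1.499     |
| Long   | 24    | 49.917 | 6.978     | 1.424     |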
