
EUROPEAN JOURNAL OF COGNITIVE PSYCHOLOGY, 2002, 14 (4), 521–547

The induction of solution rules in Raven's Progressive Matrices Test


Tom Verguts and Paul De Boeck
K.U. Leuven, Belgium
In this paper, we study the rule induction process in a popular intelligence test, Raven's Advanced Progressive Matrices test (RPM; Raven, 1962). Carpenter, Just, and Shell (1990) have shown that only a few rule types are necessary to describe all items of the test. An independent question that has not been investigated is whether participants profit from this, that is, do they learn to apply these rules more fluently throughout the test? We show that this is the case and look in detail at different aspects of this learning process. The relevance of our findings for a process theory of solving RPM items is discussed.

The Raven's Advanced Progressive Matrices (RPM) test is an intelligence test widely used in both applied and research settings (e.g., Arthur & Day, 1994; Jensen, 1987, respectively). The test correlates with many (other) indices of intellectual functioning (Marshalek, Lohman, & Snow, 1983). Therefore, it is often treated as a measure of g, or general intelligence. For example, in the cognitive correlates tradition, some researchers have investigated the role of processing speed in elementary cognitive tasks (ECTs) in intelligence by correlating the reaction time on these ECTs with RPM performance (e.g., Carlson, Jensen, & Widaman, 1983; Jensen, 1987). Other researchers have added working memory measures to this battery and compared correlations between, on the one hand, ECT performance and working memory measures and, on the other hand, RPM performance (e.g., Fry & Hale, 1996; Salthouse, 1991). Still others have investigated the role of nerve conduction velocity (NCV) in intelligence by correlating NCV with RPM scores (Reed & Jensen, 1992). Still, despite its presence in many studies directly or indirectly related to intelligence, few attempts have been made to study the RPM solution process in great detail.
Requests for reprints should be addressed to Tom Verguts, who is now at the Department of Psychology, Ghent University, H. Dunantlaan 2, 9000 Ghent, Belgium. Email: Tom.Verguts@rug.ac.be
We wish to thank Frank Rijmen, Gilbert Vander Steene, and Johan Wagemans for their useful comments.
© 2002 Psychology Press Ltd
http://www.tandf.co.uk/journals/pp/09541446.html
DOI: 10.1080/09541440143000230


Such an account would be useful, however, since it could shed light on why and under what circumstances one should expect the RPM test to correlate with these other measures. Two exceptions are the theories provided by Carpenter, Just, and Shell (1990) and Embretson (1995). We will here focus on Carpenter et al.'s work. Carpenter et al.'s task analysis of the RPM test reveals that a few (specifically, five) rules can be used to solve all items of the test. Starting from that idea, the authors propose a computer simulation model that can find the relevant rules in an item by trying out these different rules one after another. Since many rules are often required in order to solve one particular item, sub-results must also be stored during the problem-solving process. The authors propose two factors of individual differences in RPM solving ability: first, the number of rules a person has available; and second, the number of sub-results that can be stored simultaneously by a person.

An independent question that arises naturally from Carpenter et al.'s (1990) study is whether participants also actually use these same rules repeatedly while solving the RPM test. Furthermore, given that this is the case, it might be hypothesised that these rules become activated during test taking and that people therefore become biased toward using these particular rules. Such an effect would be in line with the Einstellung, or mental set, effect that has been found in the water jars task (e.g., Dover & Shore, 1991; Luchins & Luchins, 1954) and similar tasks (Lovett & Anderson, 1996). Following Luchins and Luchins' study, the influence of a problem-solving set on current problem solving has been extended to a variety of tasks, comprising anagrams (Kaplan & Schoenfeld, 1966; Lemay, 1972; White, 1988), insight problems (Duncker, 1945; Maier, 1931), and baseball knowledge (Wiley, 1998).

The present paper is an attempt to investigate the mental set effect in RPM-like items. As noted in the previous paragraph, this is not the first paper to report a mental set effect in cognitive tasks. However, what is new is that (1) the effect is investigated and obtained with RPM (or RPM-like) items, and (2) a formal model is described that allows testing different mental-set aspects in the data, such as the existence of a lag effect (i.e., a differential influence of earlier versus more recent items on current item solving).

Given that the mental set effect has been reported before, it might seem obvious to expect one in the RPM data as well. However, almost all research in the (individual difference) literature concerned with the RPM assumes, usually implicitly, that no learning effects occur. For example, different factor analyses have been performed on the RPM test to discover its underlying dimensionality (e.g., Alderton & Larson, 1990). Such an analysis assumes that one or more constant person abilities and item difficulties are involved during test taking. The presence of learning in the RPM would violate these assumptions.

Although this mental set effect may sometimes hinder finding the correct rule, we think a mental set can often actually be beneficial, for the following reason. The rules of the first items of the test are usually very easy to find. It is believed that these rules are then activated in the mind of the testee.


Progressing through the test, the items become more and more difficult,¹ and it is no longer the case that the correct rule is immediately elicited. However, this situation only occurs after some solution rules have already been activated, namely, those of the earlier items. We assume that the testee will try out the rules she has tried before, and the probability (or insistence) of trying out a certain rule depends on the activation of that rule, that is, on its occurrence in previous trials. This positive outlook on the mental set effect is in line with the sequence effect (Sweller & Gee, 1978). This is the effect that if problems are ordered from easy to hard, they are much easier to solve than when they are ordered from hard to easy. More generally, Ross (1984, 1987) has given a similar account of learning a cognitive skill, in which he emphasises the role of remindings of earlier problem-solving episodes in current problem solving. Lovett and Anderson (1996), in the context of their building sticks task, emphasise that both past and present situation aspects determine the probability that a particular rule is chosen. The present paper concentrates on the first aspect, namely, the influence of past experience on current problem solving. More specifically, we will study the influence of previous items (in the same test) on the current item. The second aspect, the influence of the present situation aspects (i.e., stimulus features in the presented item), will be discussed briefly in Experiment 2, but it is not the focus of our empirical attention.

The remainder of this paper is built up as follows. First, we explain the structure of typical RPM items in some detail. Second, we develop a formal model that will allow us to investigate different aspects of the solution process. Third is a description of the first application of the theory in Experiment 1. Finally, we describe a second application in Experiment 2, which is followed by a General Discussion.

RPM ITEMS
An example of an RPM item is given in Figure 1. (For copyright reasons, the item is invented, but it is similar to a real RPM item.) The goal is to pick one of the eight response alternatives in the lower part that, according to a logical rule, completes the 3 × 3 matrix of elements in the upper part. There are 36 such items. Usually, a time limit of 40 minutes is imposed on the total test. In principle, an item could be solved rowwise, columnwise, or by a combination of the two. However, we assume here and in the following that each item is solved rowwise. Also, participants in our experiments are always instructed to solve items rowwise. The correct answer alternative is 3 in this case; it is the addition of the black parts of the first and second elements in the third row of the matrix.
¹ As an illustration, in a previous study of ours concerned with the RPM (Verguts, De Boeck, & Maris, 1999), the full RPM test was administered with a time limit of 40 minutes (which is usual). The first 10 items had an average probability of success of .957, so they are quite easy, as claimed. The last 10 items had a probability of success of only .442. The total test consists of 36 items.


Figure 1. Example of an easy RPM item.

The relevant (logical) rule in this item can therefore be stated as "add the parts of the first and second element to obtain the third one". A second, and slightly more difficult, item is shown in Figure 2. This item obeys the following rule: In each row, the fork is making a rotational movement. If this rule is applied to the third row, it is clear that answer alternative 3 is the correct one. In this case, the relevant rule may be stated as "a figure is rotating over the columns".


Figure 2. Second example of an easy RPM item.

Yet a third and more difficult item is given in Figure 3. This item resembles the last, and most difficult, item of the RPM test. The item can be solved as follows. For the outer lines (i.e., the diagonal lines), the lines that elements 1 and 2 (within a row) have in common also appear in element 3. For the inner lines (i.e., the ones attached to the central dot), only the lines that are unique to element 1 or 2 appear in element 3. Applying this rule to the third row, one sees that alternative 2 is the correct response.


Figure 3. Example of a difficult RPM item.

This third item will serve as the basis of all items in our first experiment, as will be explained later in the paper.
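The addition, common, and unique relations introduced above can be read as operations on the sets of line segments that make up two elements of a row. The following minimal sketch (Python; the segment labels are hypothetical and only serve as an illustration of this reading) is not part of the original task materials:

```python
# Each element of a row is represented as a set of line segments.
# The segment names are hypothetical labels, chosen only for illustration.
element_1 = {"left_diagonal", "right_diagonal", "vertical"}
element_2 = {"vertical", "horizontal"}

addition = element_1 | element_2   # all lines of elements 1 and 2 appear in element 3
common   = element_1 & element_2   # only the lines shared by both elements remain
unique   = element_1 ^ element_2   # only the lines appearing in exactly one element remain

print(addition)
print(common)
print(unique)
```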

A MODEL FOR MENTAL SET


Denote by Y_pi the variable that indicates, for item i, which rule was chosen by participant p. Further, let the value x_ik indicate whether item i is of rule type k and this information was available to the participant after having responded.


Hence, if both item i was of type k and this information was available, then x_ik = 1; otherwise, x_ik = 0. The relevant information (i.e., that item i was of type k) can, for example, become available by the participant finding the appropriate rule k. Another way this information can become available is when the correct rule is told by someone else (e.g., the experimenter). We then make the following assumption on the probability that rule k is chosen:
$$\Pr(Y_{pi} = k) = \frac{\exp\left[\lambda_k + b_p \sum_{j=1}^{i-1} x_{jk}\right]}{\sum_{m=1}^{K} \exp\left[\lambda_m + b_p \sum_{j=1}^{i-1} x_{jm}\right]} \qquad (1)$$
if K rules are involved. The parameter λ_k denotes the initial strength of rule k. The parameter b_p is a (person-dependent) learning parameter and scales the effect of previous usage of rule k (the factor $\sum_{j=1}^{i-1} x_{jk}$) on the current probability of using rule k. Note that one expects b_p to be non-negative. If b_p equals zero, then no learning occurs. Each person is assigned his/her own learning parameter b_p. The sizes of the estimated b_p's will inform us about the extent of learning in the data and about individual differences in learning rate. The numerator may be regarded as the activation value of rule k, for person p, upon presentation of item i. In the denominator, the summation is taken over all such rule activation values. Hence, the probability that rule k is chosen depends on the activation of that rule relative to all other rules. It can be noted that the present model (1) is similar (but not identical) to the dynamic Rasch model proposed by Verhelst and Glas (1993). To investigate a lag effect (i.e., a differential influence of items that were shown recently versus some time ago), the model can be extended as follows:
$$\Pr(Y_{pi} = k) = \frac{\exp\left[\lambda_k + b_p \sum_{j=1}^{i-1} x_{jk}\,(i-j)^{g}\right]}{\sum_{m=1}^{K} \exp\left[\lambda_m + b_p \sum_{j=1}^{i-1} x_{jm}\,(i-j)^{g}\right]} \qquad (2)$$
In this extended model, if the parameter g is smaller than zero, then a recency effect appears, in that recent items have more influence than earlier items (note that i > j, so i − j > 0). If the parameter g is larger than zero, a primacy effect occurs because, in that case, earlier items have a stronger influence than recently presented items. If the parameter g equals zero, then model (2) reduces to model (1). There is only one g parameter for all persons. All parameters are unrestricted, that is, they can in principle take any value between minus and plus infinity.
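For concreteness, a minimal computational sketch of the choice probabilities in equations (1) and (2) is given below (Python; the function and variable names are ours and are not part of the original model description). Setting gamma to None reproduces model (1); any other value gives the lag model (2).

```python
import numpy as np

def choice_probs(lam, b_p, x_hist, i, gamma=None):
    """Probability that each rule is chosen first on item i (items are 1-indexed).

    lam    : initial strengths, one per rule (lambda_k)
    b_p    : learning parameter of this participant
    x_hist : x_hist[j-1, k] = 1 if item j was of rule type k and that
             information became available to the participant
    gamma  : lag parameter; None gives model (1), otherwise past items are
             weighted by (i - j) ** gamma as in model (2)
    """
    past = np.arange(1, i)                           # items j = 1, ..., i-1
    if gamma is None:
        weights = np.ones_like(past, dtype=float)
    else:
        weights = (i - past).astype(float) ** gamma
    # activation of rule k: lambda_k + b_p * sum_j w_j * x_jk
    activation = lam + b_p * (weights @ x_hist[: i - 1])
    expo = np.exp(activation - activation.max())     # subtract max for stability
    return expo / expo.sum()

# Example: three rules; the participant has seen two rule-0 items and one rule-1 item.
x_hist = np.array([[1, 0, 0],
                   [1, 0, 0],
                   [0, 1, 0]])
print(choice_probs(lam=np.zeros(3), b_p=0.5, x_hist=x_hist, i=4, gamma=-1.0))
```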


In reality, a solution is often found at the end of a string of (preliminary) responses, i.e., a sequence of rules (or temporary hypotheses) that are temporarily accepted, tried out, and possibly rejected. With equations (1) and (2), we will model only the very first response on every item, so that every item effectively consists of one trial only (see Lovett & Anderson, 1996). These responses are obtained by examining the think-aloud protocol of each participant (see the Procedure for the first experiment, later). The reason is that studying the other responses as well would require introducing dependencies between responses, which would make an interpretation of the data less straightforward. However, we will also analyse the accuracy data (correct/incorrect scores) without trying to formally model these, as described later on. It should be emphasised that these models are not so much intended to be new cognitive models. Instead, they are intended to be the simplest models that incorporate learning and that can inform us about learning aspects in the current data, with or without a lag effect (models 2 and 1, respectively).

EXPERIMENT 1
Administering the RPM itself to test the learning effect yields a number of problems. First, the items of the RPM are not designed to have a balanced or systematic representation of the rules, nor are they designed to test the kind of hypothesis we want to test. Second, the solution process in the RPM entails more than just finding the correct rule, whereas our focus is only on the latter process. For example, in the RPM, once the rule is found it needs to be applied correctly. For these two reasons, we will devise a test with a modified response design in order to concentrate on the aspects that are most relevant for our purpose. This makes the model more realistic for the data, whereas the essential learning aspects of solving the RPM still apply. Similar modifications were made by Lovett and Anderson (1996) (also Lovett & Schunn, 1999) to the water jars task in order to study that particular task more clearly. The four modifications in our case are the following.

(1) In the real RPM items, a rule should be found for two rows, which should then be applied to the third row (see Figures 1, 2, and 3). In the items of this experiment, on the other hand, the participant is given only one row, in which the correct rule is to be found. The advantage is that the emphasis of solving an item is directed toward finding the correct rule.

(2) We will ask participants to talk aloud while thinking, so that we can know what the first activated rule is. Ericsson and Simon (1984) have argued that, as long as participants are not questioned during the problem-solving process, the talk-aloud technique is a useful one to find out how people solve complex tasks (see also Veenman, Elshout, & Groen, 1993).


On the other hand, DeShon, Chan, and Weissbein (1995) have shown that concurrent verbalisation may have a (small) effect on the performance of some RPM items. Nevertheless, many studies have profitably used the concurrent speech procedure (e.g., Carpenter et al., 1990; Kotovsky & Simon, 1973), and it is, in our opinion, one of the most direct ways to discover how participants solve such items.

(3) The RPM test is given either without a time limit (e.g., Jensen, 1987) or with a general time limit over items (usually 40 minutes, e.g., Raven, 1962). In the present test, on the other hand, a limit of 30 seconds per item was imposed for purposes of standardisation.

(4) If the item is solved correctly in time, the experimenter indicates that the item was solved correctly (i.e., "That's right, go on"). If the item is not solved, or not solved correctly, in the allotted 30 seconds, the experimenter tells the participant the correct rule. This way, all participants can be assumed to receive the same information after every item, either from solving the item themselves or from the solution given by the experimenter. This will be called an explicit feedback procedure. More specifically, we can assume that for all items i of type k it holds that x_ik = 1.

Method
Description of the items. The test consists of 5 series of items. The idea here is to gradually introduce the different possible rules in different amounts for two conditions. This allows us to investigate the role of pre-exposure to the different rules.

Series 1 consists of three types of items, denoted Unique items, Common items, and Addition items. An example of a unique item is given in Figure 4(a). This is denoted a unique item because the third element keeps the unique parts (lines) from elements 1 and 2. This is the same rule as the one for the interior lines of the item in Figure 3, as discussed earlier in the text. Similarly, in a common item of series 1, only the common parts of elements 1 and 2 appear in element 3. This is analogous to the rule for the exterior lines of the item in Figure 3. Finally, we have addition items, where all lines of elements 1 and 2 are added in element 3. Series 1 contains 10 items, namely, two addition, four unique, and four common items. Common and unique items appear more often (four exemplars each) because they are used in the critical series 5 (see discussion later). Items of series 1 are used to familiarise the participant with the rules addition, unique, and common.

A typical series 2 item is given in Figure 4(b). Here, a distinction should be made between the outer and the inner lines. On the outside, the unique rule holds, while on the inside, the common rule is valid. Items where the inside/outside distinction is relevant will be coined inside items. Moreover, the item in Figure 4(b) will be called an inside-common-unique item because rule common is necessary on the inside, unique on the outside. With this terminology, the item of Figure 3 is an inside-unique-common item because unique is the appropriate rule on the inside, common on the outside.

Figure 4. Examples of an item in series 1 (a), series 2 (b) and (c), and series 3, 4, or 5 (d) in Experiment 1.


Note, however, the important difference between the item of Figure 3 and the item in Figure 4(b): the latter strongly prompts the inside/outside distinction by the separation of the inner and the outer parts, while the same is not true for the item of Figure 3. The idea here is that the items of series 1 and 2 activate the rules needed in the later series, where the items are more difficult.

A second item of series 2 is given in Figure 4(c). Here, the upper/lower distinction is the relevant one, as is clearly indicated in the item by the separation of upper and lower parts. We will denote such items as above items. The reader can determine that, analogously to what has been mentioned before, this is an above-common-addition item. Note that the middle horizontal lines should be treated as upper lines, rather than as lower lines; the choice, of course, is arbitrary, but they are consistently treated as upper lines throughout the test. Series 2 consists of 10 items, namely, 5 above and 5 inside items. The focus of our analysis is on the rules inside versus above. However, a pure above or inside item is not possible, and these items are therefore always completed with two rules for the two parts. These completing rules are of the type unique, common, or addition (e.g., the first item of series 2 is inside-addition-unique, the second inside-addition-common, and so on).

Items of series 3, 4, and 5 look similar to one another. A typical item is given in Figure 4(d). This is an inside-unique-common item, but the difference with series 2 items is that the correct distinction is not indicated by a gap separating the interior and exterior, or the upper and lower, parts of the item. Hence, these items are analogous to the difficult items in the real RPM, in which the correct rule is not easy to find, but in which one profits from the rules used earlier on in the test. Series 3 and 4 each contain 10 items; series 5 contains 12 items. How many items of the different types there are in each series is discussed next.

Design. There are no inside or above rules in series 1. Each of the rules inside and above is presented five times in series 2. Series 3 and 4 are most critical. In condition 1, participants receive a large number of inside items in series 3 and 4. In condition 2, they are given a large number of above items in series 3 and 4. More specifically, participants in condition 1 receive 8 inside items and 2 above items for the total of series 3 and 4 (and 10 filler items, items in which neither rule is necessary), while persons in condition 2 receive 2 inside items and 8 above items for the total of series 3 and 4 (and 10 filler items). These filler items are always (complete) unique items and are the same for the two conditions. Series 5 consists of an equal number of inside and above items, six items each, in each condition. The crucial prediction is that people in condition 1 will more often think of applying the inside rule when solving the items of series 5, whereas people in condition 2 will more often think of the above rule.


For the analysis and modeling of the data, we will consider three kinds of rules: inside, above, and a rest category. The rest category contains all other responses given by participants. Hence, the variable Y_pi can take on three values (inside, above, and other).

Participants. Six persons participated in each condition (so N = 6 × 2 = 12). Each received a small amount of money for participation.

Procedure. The test is a computerised one. After the instructions are given, participants solve series 1 to 5 (each separated by a short break) by thinking and talking aloud into a microphone. If the item is solved (i.e., the rule is found) within 30 seconds, the experimenter notes that the item was correctly solved (by telling them "That's right"), and the participant is asked to press the space bar to proceed to the next item. If the item is not solved in the allotted 30 seconds, the rule involved in the item is explained. Then, the participant is asked to press the space bar in order to go to the next item. The responses are written down afterwards by listening to the audiotape.

Data analysis. Four types of analysis will be presented. First, since we report both choice data (cf. Equation 1) and accuracy data, we check whether it is useful to analyse choice and accuracy data separately. Therefore, we calculate the chi-square test for association (Siegel & Castellan, 1989, p. 111) between the binary variables "Is the item solved correctly?" (i.e., success on a particular item, or accuracy) and "Is the first rule chosen the correct one?" Remember that we model the very first choice on any particular item. Hence, if choice data and accuracy data contain the same information, the correlation between these two variables (accuracy and accuracy of the first choice) should be close to 1. We calculate the correlation between these two variables as a descriptive index of the amount of association between choice and accuracy. This analysis will be called the association data analysis.

Second, we report the results for the accuracy data. Here, we investigate the condition effect by studying the interaction between item type (inside or above rule) and condition, with accuracy as a dependent variable.

Third, we report whether participants in different conditions consider different rules on their first confrontation with an item. This will be done by checking the main effect of condition on the frequency of inside responses. The frequency of above responses is not included in this analysis since it is almost linearly dependent on the number of inside responses. Hence, taking (inside, above) as a within-subject variable would be problematic because of the very strong dependence between the number of inside and above responses.


Since inside and above are almost linearly dependent (almost, because one can also choose an "other" response; see Table 2 and its discussion later), a preference of condition 1 for inside relative to condition 2 should show up as a main effect of condition with the number of inside choices as a dependent variable. This analysis may be misleading in case the number of other responses differs greatly between conditions. However, this number is low and exactly equal for both conditions (see Table 2 and its discussion). This analysis will be referred to as the choice data analysis. Note that in principle the choice and accuracy data provide different types of information, because we investigate the first choice upon presentation of an item (although there may be a correlation between first-choice accuracy and final accuracy, as discussed earlier in the first paragraph of Data analysis). If we were to investigate the last choice, these two types of information would be redundant. The fourth type of analysis is discussed in the following subsection.

Model-based analysis. As noted earlier, equations (1) and (2) only incorporate information from the past, not from the item itself. This is quite unrealistic for items of series 1 and 2, since the rule is easily seen there, based upon the information in the item. On the other hand, for items of series 3 to 5, it may very well be plausible that stimulus information (from the present item) does not influence the rule sampling probabilities; as the reader can check, it is difficult to use stimulus information in these items to find the correct rule. Hence, the model as given in (1) and (2) will be used to analyse items of series 3, 4, and 5 only. Nevertheless, feedback obtained in series 2 will be incorporated into the model via the factor $\sum_{j=1}^{i-1} x_{jk}$ (in equation 1) or via the factor $\sum_{j=1}^{i-1} x_{jk}\,(i-j)^{g}$ (in equation 2), since the index j is taken to start from the first item in series 2. Data will be analysed for each condition separately, so there are (10 + 10 + 12) × 6 = 192 data points per analysis. Parameters will be estimated by maximising the likelihood function, and the standard error of each parameter will be estimated.² Further, for each model the Akaike Information Criterion (AIC; Akaike, 1974) will be calculated, which is defined as AIC = −2 ln(L) + 2M, where L denotes the likelihood function (evaluated at the estimated parameters) and M the number of freely estimated parameters. It is common to select the model with the lowest AIC value as the best fitting one. Further, since models (1) and (2) are nested (model 1 equals model 2 with g restricted to zero), it is possible to test the relative fit of the two models. Specifically, in the present case, the variable X = −2 ln(L1) − [−2 ln(L2)] should follow a chi-square distribution with one degree of freedom under model (1), where L1 and L2 denote the likelihood functions for models (1) and (2), respectively. A high value of X is indicative of the presence of a lag effect (that is, g ≠ 0).
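A sketch of how this estimation and model comparison could be carried out, building on the choice_probs function sketched earlier, is given below; the data format, parameter layout, and starting values are our own assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

def neg_log_lik(theta, data, n_persons, lag_model):
    """Negative log-likelihood of all modelled first choices.

    theta : [lambda_1, lambda_2, b_1, ..., b_P] for model (1); model (2) adds a
            trailing gamma.  lambda_3 is fixed at zero for identification.
    data  : list of (person_index, item_number, chosen_rule, x_hist) tuples,
            with person_index in 0..P-1 and chosen_rule in {0, 1, 2}.
    """
    lam = np.append(theta[:2], 0.0)
    b = theta[2:2 + n_persons]
    gamma = theta[-1] if lag_model else None
    ll = 0.0
    for p, i, k, x_hist in data:
        ll += np.log(choice_probs(lam, b[p], x_hist, i, gamma)[k])
    return -ll

def fit_and_compare(data, n_persons):
    theta1 = np.zeros(2 + n_persons)              # model (1) start values
    theta2 = np.zeros(2 + n_persons + 1)          # model (2) adds gamma
    fit1 = minimize(neg_log_lik, theta1, args=(data, n_persons, False))
    fit2 = minimize(neg_log_lik, theta2, args=(data, n_persons, True))
    aic1 = 2 * fit1.fun + 2 * len(theta1)         # AIC = -2 ln(L) + 2M
    aic2 = 2 * fit2.fun + 2 * len(theta2)
    X = 2 * fit1.fun - 2 * fit2.fun               # likelihood-ratio statistic
    p_value = chi2.sf(X, df=1)                    # 1 df: gamma restricted to zero
    return aic1, aic2, X, p_value
```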

² Standard errors can be estimated by evaluating the (Hessian) matrix of second-order derivatives at the maximum likelihood estimators. The square root of the diagonal values of minus the inverse of this matrix provides an estimate of the standard errors (e.g., Schervisch, 1995).
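In code, this recipe can be approximated directly from the optimiser output; the sketch below assumes the negative log-likelihood was minimised with BFGS (as in the fitting sketch above), so that the returned inverse Hessian already estimates the covariance matrix.

```python
import numpy as np

def standard_errors(fit):
    # fit.hess_inv is BFGS's approximate inverse Hessian of the minimised
    # negative log-likelihood, i.e., an estimate of the covariance matrix.
    cov = np.asarray(fit.hess_inv)
    return np.sqrt(np.diag(cov))      # 95% CI: estimate +/- 1.96 * SE
```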


Results
Association data. In condition 1, the Pearson chi-square value for the association between accuracy on the first response and (final) accuracy equals χ² = 9.852, p = .002. This is statistically significant, but the correlation equals only .273. Hence, it seems warranted to study choice and accuracy separately. In condition 2, we obtain χ² = 0.738, p = .394, with a correlation of .075. Hence, a separate analysis is required here as well.

Accuracy data. Table 1 shows the mean accuracies for both response types and both conditions. The interaction between condition and item type is significant, F(1, 10) = 12.500, p = .005.

Choice data. Table 2 shows, for all participants and both conditions, the number of inside, above, and other responses. It can be seen that participants in condition 1 prefer the response inside, whereas participants in condition 2 prefer the response above. The number of other responses is low and is equal for both conditions (4/72). An independent samples t-test on the inside responses yields an effect of condition, t(10) = 4.969, p = .001.

TABLE 1
Proportions of success (accuracy data) in series 5, Experiment 1

                Inside    Above
Condition 1      .94       .72
Condition 2      .89       .94

TABLE 2
Frequencies of responses (choice data) in series 5, Experiment 1

Condition 1                              Condition 2
Participant  Inside  Above  Other        Participant  Inside  Above  Other
1               9      2      1          1               3      7      2
2              10      2      0          2               3      8      1
3               7      5      0          3               4      8      0
4               8      4      0          4               6      6      0
5               8      4      0          5               5      7      0
6               7      2      3          6               6      5      1
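The association index reported above amounts to a Pearson chi-square test on the 2 × 2 cross-table of first-choice accuracy against final accuracy, with the corresponding (phi) correlation as a descriptive measure. A minimal sketch with purely hypothetical counts (not the data of this experiment):

```python
import numpy as np
from scipy.stats import chi2_contingency

# Rows: first rule chosen incorrect/correct; columns: item finally failed/solved.
# The counts are hypothetical and only illustrate the computation.
table = np.array([[ 8, 20],
                  [ 4, 40]])
chi2_stat, p_value, dof, _ = chi2_contingency(table, correction=False)
phi = np.sqrt(chi2_stat / table.sum())   # magnitude of the phi correlation
print(chi2_stat, p_value, phi)
```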


Model-based analysis. A first point of interest is whether the lag parameter is needed: Do people just add all information from previous items, or do they weigh previous items according to recency? To investigate this, we calculated the AIC value for models (1) and (2) for the condition 1 data. These are 331.428 and 329.244 respectively, suggesting that the lag model performs better. Further, since the models are nested, it is possible to statistically test the necessity of the lag parameter as described earlier. The corresponding statistic X equals 4.184, p = .041. In condition 2, the AIC values are 385.800 and 364.146 for models (1) and (2), respectively. Further, the statistic X equals 23.654, p < .001. Hence, in both conditions a lag parameter seems to be needed. In the following, the estimated parameters of this lag model will be considered.

Table 3 shows the estimated parameters of the lag model together with their estimated confidence intervals, for each condition separately. Several things are to be noted about this table. First, regarding the initial strength parameters λ, note that we have restricted one of these for purposes of identification. Second, the experimental analysis already indicated that there is a learning effect in the data. The present model-based analysis suggests that learning occurs for all participants, since no b confidence interval contains zero. Furthermore, the analysis shows that there are also individual differences in learning speed, since some estimated b parameters are outside the confidence intervals of other b parameters. However, comparison of the magnitudes of the b values over conditions is not meaningful since the g parameter is very different across the conditions, which is discussed next.

TABLE 3
Estimated parameters and confidence intervals for lag model, Experiment 1

Condition 1                                    Condition 2
Parameter  Estimate   95% CI                   Parameter  Estimate   95% CI
λ1          −0.655    (−1.023, −0.287)         λ1          −1.026    (−1.377, −0.675)
λ2          −1.072    (−1.511, −0.633)         λ2          −1.043    (−1.351, −0.735)
λ3           0*       (0, 0)*                  λ3           0*       (0, 0)*
b1           0.012    (0.006, 0.018)           b1           0.472    (0.125, 0.819)
b2           0.028    (0.012, 0.044)           b2           0.961    (0.557, 1.365)
b3           0.019    (0.009, 0.029)           b3           1.347    (0.835, 1.859)
b4           0.017    (0.007, 0.027)           b4           1.317    (0.815, 1.819)
b5           0.012    (0.006, 0.018)           b5           0.822    (0.444, 1.200)
b6           0.006    (0.002, 0.010)           b6           1.075    (0.644, 1.506)
g            1.006    (0.928, 1.084)           g           −0.872    (−1.001, −0.743)

* Restricted.


Third, the g parameter has a positive value in condition 1, which seems to indicate that earlier items have a stronger impact than more recent items. This seems counterintuitive, but it is perhaps best explained by the fact that the very first two items over which the index j in equation (2) ranges are of type inside, and so the model seems to pick up the influence of these two items. In condition 2, on the other hand, the parameter g is negative and thus a recency effect occurs, in the sense that more recent items have a larger influence than earlier items.

Discussion
We have found a reasonably strong learning, or mental set, phenomenon in these simplified RPM items using the rules inside and above, in both choice and accuracy data. Both conditions of the experiment had received both rules earlier (7 and 13 times for the infrequent and frequent rule, respectively), so the effect cannot be due to the fact that people in each group knew only one of the particular rules (inside or above in condition 1 and 2, respectively). Our finding is that, if one rule is used more often than the other, the former will later be considered more frequently (though not always) than the latter. An account of this finding in terms of activation of the rules, as given in this paper, seems a plausible one. Moreover, it was found that earlier items may be weighted differently than more recent ones.

It is useful to mention that this experiment rules out the possibility that people learn in the RPM test simply because they get used to the general test format. If that were so, then it would not matter exactly which rules people were trained on, and no main effect in the choice data or interaction in the accuracy data would have been found. The fact that there is an effect indicates that learning in this test is rule-specific, that is, it is due to getting used to using specific rules.

A possible drawback of the current experiment is that the items and the procedure differ from the RPM. We will concentrate on two points. First, the testee always receives explicit information about whether or not the item was solved correctly. This is different from the procedure in the real RPM: There, one solves the items in silence, without ever receiving explicit feedback about the correctness of the chosen alternative. Nevertheless, we think that in the RPM test as well, partial feedback is operating, namely for items where a correct response was given. Indeed, choosing a certain rule to solve the item and finding a response alternative that matches this rule provides the information that the rule chosen is the correct one. The idea is that, if one chooses an incorrect rule, the probability is low that one also finds a matching response alternative below. Hence, we believe that the results of Experiment 1 transfer to the RPM situation since, in case the response is correct, implicit feedback is given about the correctness of the rule (note that the presence of feedback is critical in learning the rules). Still, it remains to be seen whether our results generalise to situations without explicit feedback.


Second, the items in Experiment 1 were chosen such that it was very difficult to determine, from a quick glance at the item, which rule (inside or above) is the relevant one. This made model development quite easy, since no perceptual components were required in the model to account for the effect that features of the items may have. However, the items in the RPM often give a clue as to which rule is the relevant one, so that our "no perceptual cues" assumption may not be valid for real RPM items. The models (1) and (2) cannot adequately handle this situation. More generally, the items of Experiment 1 were inspired by the RPM test, but they were different in important ways. In Experiment 2, we will start from the 36 real RPM items and only make small adjustments to these items if necessary. More specifically, concerning the two points noted earlier, no explicit feedback is given in the new items and some of the items hint clearly as to which rule should be used. We expect a similar pattern of results in this new situation as in the previous experiment, even though the testing situation is a more complex one.

EXPERIMENT 2
Here, we start from the task analysis performed by Carpenter et al. (1990). These authors have distinguished five rule types that are used in different RPM items: distribution of 3, distribution of 2, progression, constant in a row, and addition. The addition rule was divided into three different rules in the previous experiment, namely, addition, unique, and common. Almost all RPM items can be described using this set of principles. The constant rule will not be used in our experiment. The other rules not mentioned before will be explained in the Description of the items, later on.

Starting from the RPM, we make two conditions, one in which one set of rules is learned (e.g., addition, unique) and the other in which the complementary set of rules is learned (e.g., distribution of 3, progression). This is done by adapting real RPM items as needed (see Method, later). After both groups have received their (different) set of items, we present items of both types (e.g., both addition and distribution of 3 items) to both groups. Then, we predict that there will be an effect on the dependent variables (choice of rule and probability of success) as before. Specifically, each group will choose principles they have used before and will be better on the items that follow these principles.

Since in this case the items differ widely in their complexity, and hence in how easy it is to see the rule, an item complexity factor (or perceptual factor) will be included. This will be done by adding to equations (1) and (2) a factor that takes complexity effects into account. We extend model (1) in the following way:


" # i 1 P exp lk bp xpjk aI k; i j1 " # PrYi k K i 1 P P exp lm bp xpjm aI m ; i


m1 j1

Since we do not give explicit feedback in the present experiment, the variable x_pik is now made person dependent, referring to whether item i was of type k and person p solved it correctly. The indicator variable I(k, i) takes the value of 1 if only rule k has to be applied and that rule has to be applied only once in the item in question. For example, if rule 1 is applied once in item i and there are no other rules involved in this item, then I(1, i) = 1 and I(k, i) = 0 for k ≠ 1. If rule 1 is applied twice in item i, then I(k, i) = 0 for all k. If rule 1 and rule 2 are both necessary in item i, then again I(k, i) = 0 for all k. The idea here is that, if many rule tokens appear in the same item, the correct rule is less easily noticeable, so the probability of response k is lower if other rule tokens are present. Hence, this extra factor introduces a kind of perceptual component into the model, which is scaled by the parameter a. Suppose that a is larger than zero: Then, if I = 1, the probability of success will be higher than if I = 0. Admittedly, the indicator variable I does not refer to explicit item features hinting at the correct rule, but the feature effects are nevertheless modelled indirectly with this procedure. Indeed, our focus is on the learning effects (b_p), so we thought it would not be necessary to analyse the item features in too much detail. To introduce the lag effect, model (2) is extended as follows:
$$\Pr(Y_{pi} = k) = \frac{\exp\left[\lambda_k + b_p \sum_{j=1}^{i-1} x_{pjk}\,(i-j)^{g} + a\,I(k,i)\right]}{\sum_{m=1}^{K} \exp\left[\lambda_m + b_p \sum_{j=1}^{i-1} x_{pjm}\,(i-j)^{g} + a\,I(m,i)\right]} \qquad (4)$$
in complete analogy with equation (3).
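A sketch of how the perceptual term changes the earlier computation is given below (again with our own function names); I_i is the vector of indicator values I(k, i) for the current item, and x_hist now holds the person-specific values x_pjk.

```python
import numpy as np

def choice_probs_ext(lam, b_p, x_hist, i, I_i, a, gamma=None):
    """Rule-choice probabilities under equations (3) and (4).

    x_hist[j-1, k] = 1 only if item j was of type k AND this person solved it;
    I_i[k]         = 1 only if rule k is the single rule token in the current item;
    a              = weight of the perceptual component;
    gamma          = lag parameter (None gives equation (3), a value gives (4)).
    """
    past = np.arange(1, i)
    if gamma is None:
        weights = np.ones_like(past, dtype=float)
    else:
        weights = (i - past).astype(float) ** gamma
    activation = lam + b_p * (weights @ x_hist[: i - 1]) + a * np.asarray(I_i, dtype=float)
    expo = np.exp(activation - activation.max())
    return expo / expo.sum()
```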

Method
Description of the items. We will now discuss the Carpenter et al. (1990) RPM rule system in more detail and discuss the adaptations we made for our test. We made those adaptations based on informal assessments concerning the level at which participants generally describe a rule. For example, if two rules R1 and R2 are not usually distinguished by participants, they are treated as one rule. However, the exact formulation of a rule is not too critical since participants were pre-exposed to either one of two sets of rules, and two rules of different sets were always conceptually very different.


This will become clearer later on.

The distribution of 3 (D3) rule involves the fact that three figures (e.g., a circle, a square, and a triangle) appear in the three elements of every row: One of these figures appears in each element. For example, if the first row consists of the sequence circle-square-triangle, the second row of the sequence triangle-square-circle, and the third row of the sequence square-circle-triangle, this might be an instantiation of the D3 rule. The distribution of 2 (D2) rule means that the same figures appear in only two out of three elements, as in the sequence square-triangle-∅, where ∅ denotes that no figure is shown. The D2 and D3 rules will not be distinguished in the following analysis and will (together) be referred to as the D3 rule. Indeed, the ∅ element may be considered a third figure, so that, formally speaking, D2 and D3 can be seen as equivalent.

Another type of rule is progression, which involves the fact that one of the figures is undergoing some kind of transformation throughout the row (e.g., it becomes progressively smaller). This rule is divided into two separate rules, rotation and progression (where progression covers all Carpenter et al. (1990) progression instances except rotation). Finally, we have the rules that were already used in the previous experiment, namely, addition, unique, and common, the last two of which are treated as one rule by Carpenter et al. (1990). Hence, we will work with the set of rules {addition, common, unique, progression, rotation, D3}. See Figures 1, 2, and 3 for examples; Figures 1 and 2 show actual items of the newly constructed test, and Figure 3 shows an item that is very similar to an actual item (see also the discussion of these items earlier in the paper).

Design. Items of condition 1 are governed by the following three principles: progression, rotation, or D3. If the ith item of the (original) RPM test is a progression, rotation, or D3 item, then this item is taken as the ith item of condition 1. If none of these three rules is needed in the original RPM item, a new item is constructed from the same figural parts as item i, but governed by one of these three rules. Similarly, condition 2 items are governed by the principles addition, common, and unique. Also, if the ith item of the (real) RPM belongs to this class, it is incorporated as the ith item of condition 2. If none of these three rules is needed (in the original RPM item), a new item is constructed from the same figural parts as item i, but governed by the addition, common, or unique rule. If an item is governed by rules from both the sets {progression, rotation, D3} and {addition, common, unique}, the item is split in two, such that each split item consists of rules from only one of the two sets. The split items are then assigned to the corresponding condition.


All participants are first given 34 condition-specific items. Afterwards, the same four items are presented to every participant. Two of these four are built using the unique and common rules, whereas the other two items follow the D3 and rotation principles. The unique and common items are items 35 and 36 of the (original) RPM test. The other two items were constructed by ourselves, following the principles discussed earlier. One of these is the rotation item shown in Figure 2. We expect that participants will be better on the items governed by rule types that have been activated for them. Two matching items of conditions 1 and 2 are always built from the same figural elements. In condition 1, there are 17 items with single rule instantiations, i.e., items such that I(k, i) = 1; in condition 2, there are 34 such items (the four common items are included in this count). This asymmetry is the price we pay for staying as close as possible to the real RPM items.

Participants in condition 1 have been primed toward the progression, rotation, and D3 rules, which are also involved in items 36 and 38. Participants in condition 2, on the other hand, have received training in the addition, common, and unique rules, which are also involved in items 35 and 37. Hence, we predict that participants in condition 1 are better on items 36 and 38 (than participants in condition 2), whereas participants in condition 2 are better on items 35 and 37 (than participants in condition 1). Hence, we again predict an interaction effect. In a similar vein, we expect a main effect in the choice data.

The set effect in items 35 and 36 is not necessarily equal to that in items 37 and 38. The difference is that, while solving items 35 and 36, the problem-solving set is probably stronger than while solving items 37 and 38. This is because, while solving items 35 and 36, the participants have seen only 0 or 1 item(s) that do not conform to the created set, whereas they have seen 1 or 2 item(s) that do not conform to the set while solving items 37 and 38. Indeed, the problem-solving set literature indicates that presenting even one non-set item can be enough to eliminate its effect (e.g., Luchins & Luchins, 1954). Therefore, we will investigate the expected interaction for items 35 and 36 and for items 37 and 38 separately.

Participants. Eight persons participated in each condition (so N = 8 × 2 = 16). Each of these received course credit.

Procedure. Participants received a computerised test with either the 34 items of condition 1 or the 34 items of condition 2, followed by the same last four items, as described previously. In order to choose a response alternative, participants had to move the mouse arrow to a response alternative and click it. Then they move the arrow to a button that says "Next". Clicking this button brings up the next item. Participants were selected to have at least a minimal mouse clicking ability.


Contrary to the previous test, the items were not presented in series; that is, they are presented one after the other, without breaks. Also in contrast with the previous experiment, the experimenter never intervened during the test taking. Finally, no time limits were imposed, since there is no (item-specific) time limit in the real RPM test either. Each participant solved 38 items, as described earlier. Participants were required to do two things: think aloud about the solution and click the answer alternative they think is the correct one. The former process is recorded on a tape recorder. Furthermore, the computer recorded which completion response was chosen.

Data analysis. Again, four types of analysis are performed. First, the association data results are calculated as before. Second, for the accuracy data, define A1 to be the number of successes on items 35 and 37 together (a score ranging from 0 to 2), and A2 to be the number of successes on items 36 and 38 (a score from 0 to 2). Then, we predict an interaction between the (within-subject) variable (A1, A2) and condition. We can also investigate the effect for items 35 and 36 only, or for items 37 and 38 only. This makes sense because items 35 and 36 are the ones appearing immediately after the set-inducing items (1, ..., 34). Items 37 and 38 appear only later, after the set is possibly broken. Since in the latter case we have a binary dependent variable, it is not very appropriate to use an ANOVA. Therefore, we will also use a permutation test (in addition to the ANOVA). Let us say the test is to be performed for items 35 and 36. (The analysis for items 37 and 38 is similar.) We first calculate the statistic T = (n1,35 × n2,36)/(n1,36 × n2,35), where nk,i denotes the number of 1s in condition k on item i. The total number of 1s and 0s is fixed and a random permutation of the data is generated. This procedure is repeated 1000 times, and the proportion of T's calculated in the replicated data that is smaller than the observed T is the resultant p-value.

Third, for the choice data, we will only consider the very first rule response on every item, as in the previous experiment. The range of possible rule responses is potentially much larger in this experiment, since we did not intervene during the testing process. We restrict our attention to seven possible rules: the six rules mentioned previously, plus a rest category, which applies to all remaining responses. We calculate, per item, the number of responses coming from set 1, that is, from the set {progression, rotation, D3}. Similarly, set 2 contains the rules {addition, common, unique}. Again, there is an almost linear dependence between the numbers of rules chosen from sets 1 and 2, so we do not incorporate set 2 in the analysis and simply look at the main effect of condition on the number of responses from set 1 per item. These numbers can be aggregated (over items 35, 36, 37, and 38), resulting in a score of 0 to 4 per person. Finally, the fourth type of analysis is discussed in the following paragraph.
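One way to implement the permutation test just described is to reshuffle the condition labels, which keeps the total numbers of 1s and 0s fixed. The sketch below follows that reading; the small guard against empty cells is our own addition, since the paper does not specify how zero counts are handled.

```python
import numpy as np

rng = np.random.default_rng(0)

def permutation_p(acc35, acc36, cond, n_perm=1000):
    """Permutation p-value for T = (n_{1,35} * n_{2,36}) / (n_{1,36} * n_{2,35}).

    acc35, acc36 : 0/1 accuracy per participant on items 35 and 36
    cond         : condition label (1 or 2) per participant
    """
    acc35, acc36, cond = map(np.asarray, (acc35, acc36, cond))

    def T(labels):
        def n(k, acc):                                 # number of 1s in condition k
            return max(acc[labels == k].sum(), 0.5)    # guard against empty cells
        return (n(1, acc35) * n(2, acc36)) / (n(1, acc36) * n(2, acc35))

    t_obs = T(cond)
    t_rep = np.array([T(rng.permutation(cond)) for _ in range(n_perm)])
    return np.mean(t_rep < t_obs)   # proportion of replicated T's below the observed T
```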


Model-based analysis. We will estimate the parameters and their confidence intervals for models (3) and (4) in the way described earlier. However, since there are more possible responses here than in the first experiment, and since these data indicated that initial strength estimation is unstable, the initial strengths of all rules except the remainder category were assumed to be equal. Hence, there are now two initial strength parameters to be estimated. Concerning model fit, we again calculate the AIC for both models and investigate their relative fit by comparing their respective values of −2 ln(L) with the statistic X introduced earlier.

Results and discussion


Association data. For condition 1, the Pearson association measure between final success and performance on the first response on an item equals χ² = 28.34, df = 1, p < .001. The correlation equals r = .305. In condition 2, χ² = 100.38, df = 1, p < .001, r = .575. Hence, the accuracy on the first response is related to success on the item, but the association is certainly not perfect.

Accuracy data. Table 4 shows the accuracies for both conditions and all four items common to conditions 1 and 2. As noted earlier, A1 is the variable denoting the number of successes on items 35 and 37 per person. A2 is the analogous variable for items 36 and 38. The interaction between the variables (A1, A2) and condition is not significant, F(1, 14) = 1.000, p = .334. However, if the effect is investigated for items 35 and 36 only, this does result in a significant effect, F(1, 14) = 7.000, p = .019. The corresponding permutation test yields p = .047. For items 37 and 38, the ANOVA yields F(1, 14) = 0.226, p = .642, and the permutation test p = .233. Hence, the effect seems to be present in items 35 and 36 but not in 37 and 38. This discrepancy is not unexpected; it is why we analysed items 35 and 36 and items 37 and 38 separately: set-breaking items have already appeared by the time the latter two are being solved. It appears that one or two items not conforming to the rule may break a set.

TABLE 4
Proportions of success (accuracy), aggregated over persons, on items 35–38, Experiment 2

                Item 35   Item 36   Item 37   Item 38
Condition 1       .375      .875      .125      .625
Condition 2       .750      .750      .375     1.00


Such an effect, however, was not present in Experiment 1. We discuss this discrepancy more fully in the General Discussion.

Choice data. Table 5 shows the number of times that a first response was from the set {progression, rotation, D3} (denoted condition 1 rules in Table 5 because condition 1 was pre-exposed to these) or from the set {addition, common, unique} (denoted condition 2 rules). The number of other responses is higher than in Experiment 1, but about equal for both conditions (10/32 and 8/32 for conditions 1 and 2, respectively). Define S1 to be the number of responses from the set {progression, rotation, D3}. Then, with S1 as a dependent measure, an independent samples t-test for condition 1 versus condition 2 yields t(14) = 2.729, p = .016. Since there is a difference between items 35 and 36 on the one hand and 37 and 38 on the other in the accuracy data, it is worthwhile to study this difference here also. Hence, S1 is restricted to items 35 and 36. This results in t(14) = 2.017, p = .063. For items 37 and 38, t(14) = 2.546, p = .023. Hence, for the choice data the effect seems to be present for both item sets {35, 36} and {37, 38}.

Model-based analysis. First of all, models without the a parameter have a much worse fit than either model (3) or (4) (with an a parameter), and will not be discussed here. In condition 1, the AIC values for models (3) and (4) are 772.092 and 771.802, respectively. Hence, according to this criterion, the lag model should be chosen again. However, the statistic X equals 2.290, p = .130, so the addition of the lag parameter is probably not that important in this condition. In condition 2, the corresponding AIC values are 756.996 and 753.416. The statistic X reaches a value of 5.580, p = .018. So the conclusion here again is that it makes sense to introduce the lag parameter. Finally, we discuss the parameter estimates of the lag model, displayed in Table 6.
TABLE 5
Frequencies of responses (choice), aggregated over persons, on items 35–38, Experiment 2

               Condition 1 rules                   Condition 2 rules
               {Progression, Rotation, D3}         {Addition, Common, Unique}
Item:           35    36    37    38                35    36    37    38
Condition 1      4     6     3     8                 0     0     1     0
Condition 2      0     5     0     6                 6     1     4     2


TABLE 6
Estimated parameters and confidence intervals for lag model, Experiment 2

Condition 1                                    Condition 2
Parameter  Estimate   95% CI                   Parameter  Estimate   95% CI
λ1           0.229    (−0.043, 0.501)          λ1           1.068    (0.809, 1.327)
λ2           1.889    (1.617, 2.161)           λ2           1.358    (1.099, 1.617)
b1           0.467    (0.304, 0.630)           b1           0.605    (0.384, 0.827)
b2           0.225    (0.090, 0.360)           b2           0.377    (0.152, 0.602)
b3           0.040    (−0.231, 0.311)          b3           0.062    (−0.252, 0.376)
b4           0.484    (0.276, 0.692)           b4           0.513    (0.313, 0.713)
b5           0.192    (0.031, 0.353)           b5           0.542    (0.285, 0.800)
b6           0.025    (−0.148, 0.198)          b6           0.476    (0.286, 0.666)
b7           0.650    (0.427, 0.873)           b7           0.310    (0.038, 0.582)
b8           0.432    (0.189, 0.675)           b8           0.083    (−0.191, 0.357)
g           −3.180    (−3.578, −2.782)         g           −2.523    (−2.776, −2.270)
a            0.227    (0.143, 0.311)           a            0.440    (0.346, 0.534)

As in the previous experiment, learning occurs and there are individual differences in learning, as is evident from the fact that some of the b parameters are not contained in the confidence intervals of some other b's. Further, a few b confidence intervals contain the value of zero, suggesting that these participants did not learn at all in this test. Also, in this experiment, the lag parameter g is negative in both conditions, which is indicative of a recency effect.

GENERAL DISCUSSION
In this paper, we have analysed data of participants solving RPM-like items. Specifically, we have looked at whether people profit from rules used earlier in the test and whether a lag effect occurs. In Experiment 1, it was shown that people will try out a rule with higher probability if it has been activated in previous items than when it has not. The probability of success is influenced similarly. In Experiment 2, a test which closely followed the RPM format was constructed. It was shown that the learning effect was present here too, but less strongly so than in Experiment 1. Indeed, the effect was no longer visible in the accuracies of items 37 and 38, although the choice data of these items still showed the expected trend. This suggests that the set effect is still present, but more weakly so. So why is the set so easily broken (or weakened) in Experiment 2 but not in Experiment 1? One possible reason is that explicit feedback was provided by the experimenter in Experiment 1 but not in Experiment 2.


Participants are probably more confident about rules that are provided by the experimenter than about rules they have discovered themselves. Hence, if such a self-discovered rule no longer appears to apply, participants may reject this rule, or at least consider other rules as well, after the initial consideration of the earlier rules. This is consistent with our finding that, for items 37 and 38, the set effect in Experiment 2 is present in the choice data but not in the accuracy data. Although these data are too limited in scope to allow generalisations, this result suggests that at least in some cases people may profit more from rules that are provided explicitly rather than found by themselves, because the former rules are used more confidently. It remains an open question whether this holds in other problem solving tasks as well.

We started this paper by mentioning the research of Carpenter et al. (1990). Our findings seem to support their assumption: A small set of rules is repeatedly applied (over items) by participants. Moreover, participants become more fluent over repeated applications. Also, in three (out of four) conditions, more recent items were more active than earlier ones. In the fourth condition, earlier items seemed to have a stronger effect than later ones, although this may have been due to the artefact that the model tried to pick up the influence of the very first two items (see the discussion at that point in Experiment 1). Carpenter et al. (1990) also suggested that two factors are important in solving the RPM test: the ability to induce abstract relations (rules) and working memory capacity. Earlier, we (Verguts et al., 1999) conceptualised the rule induction process as a sequential sampling of rules, where each rule has a certain probability of being sampled. In that paper, we investigated one source of individual differences in this rule-finding process, namely, the speed at which a participant can sample rules. In the present paper, we have investigated previous experience as a factor which influences the sampling probabilities.

In the intelligence literature, there is a research line that considers dynamic test situations, that is, test situations in which people learn something while solving the test (Ferrara, Brown, & Campione, 1986; Ferretti & Butterfield, 1992). These authors advocate the study of the role of transfer, or learning, in intelligence. Specifically, they found relations between IQ and the number of hints needed to apply an earlier used principle to later items (i.e., transfer), also in the type of material we considered in this paper (RPM items). Ferrara et al. and Ferretti and Butterfield defend the position that a large part of intelligence is the possibility of transfer, an aspect that has been neglected, possibly because it is difficult to assess. The same aspect, the possibility of transfer, was incorporated in our model in the learning rate parameter b. It was suggested by our findings that there might be individual differences in this learning rate parameter. However, we did not investigate the relation of this learning rate variable to other measures, such as IQ or working memory capacity (Kyllonen & Christal, 1990). Investigation of such relations is a psychometric issue and stands out as a future concern.
The main issue in the present paper was whether, and how, people recycle rules over items, with the aim of understanding more clearly the item solution processes involved.
Manuscript received March 2001
Revised manuscript received August 2001

REFERENCES
Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19, 716–723.
Alderton, D.L., & Larson, G.E. (1990). Dimensionality of Raven's advanced progressive matrices items. Educational and Psychological Measurement, 50, 887–900.
Arthur, W., Jr., & Day, D.V. (1994). Development of a short form for the Raven advanced progressive matrices test. Educational and Psychological Measurement, 54, 394–403.
Carlson, J.S., Jensen, C.M., & Widaman, K.F. (1983). Reaction time, intelligence, and attention. Intelligence, 7, 329–334.
Carpenter, P.A., Just, M.A., & Shell, P. (1990). What one intelligence test measures: A theoretical account of processing in the Raven progressive matrices test. Psychological Review, 97, 404–431.
DeShon, R.P., Chan, D., & Weissbein, D.A. (1995). Verbal overshadowing effects on Raven's progressive matrices: Evidence for multidimensional performance determinants. Intelligence, 21, 135–155.
Dover, A., & Shore, B.M. (1991). Giftedness and flexibility on a mathematical set-breaking task. Gifted Child Quarterly, 35, 99–105.
Duncker, K. (1945). On problem solving. Psychological Monographs, 58(5).
Embretson, S.E. (1995). The role of working memory capacity and general control processes. Intelligence, 20, 175–186.
Ericsson, K.A., & Simon, H.A. (1984). Protocol analysis: Verbal reports as data. Cambridge, MA: MIT Press.
Ferrara, R.A., Brown, A.L., & Campione, J.C. (1986). Children's learning and transfer of inductive reasoning rules: Studies of proximal development. Child Development, 57, 1087–1099.
Ferretti, R.P., & Butterfield, E.C. (1992). Intelligence-related differences in the learning, maintenance, and transfer of problem-solving strategies. Intelligence, 16, 207–223.
Fry, A.F., & Hale, S. (1996). Processing speed, working memory, and fluid intelligence: Evidence for a developmental cascade. Psychological Science, 7, 237–241.
Jensen, A.R. (1987). Process differences and individual differences in some cognitive tasks. Intelligence, 11, 107–136.
Kaplan, I.T., & Schoenfeld, W.N. (1966). Oculomotor patterns during the solution of visually displayed anagrams. Journal of Experimental Psychology, 72, 447–451.
Kotovsky, K., & Simon, H.A. (1973). Empirical tests of a theory of human acquisition of concepts for sequential patterns. Cognitive Psychology, 4, 399–424.
Kyllonen, P., & Christal, R. (1990). Reasoning ability is (little more than) working memory capacity?! Intelligence, 14, 389–434.
Lemay, E.H. (1972). Anagram solutions as a function of task variables and solution word models. Journal of Experimental Psychology, 92, 65–68.
Lovett, M.C., & Anderson, J.R. (1996). History of success and current context in problem solving. Cognitive Psychology, 31, 168–217.
Lovett, M.C., & Schunn, C.D. (1999). Task representations, strategy variability, and base-rate neglect. Journal of Experimental Psychology: General, 128, 107–130.
Luchins, A.S., & Luchins, E.H. (1954). The Einstellung phenomenon and effortfulness of task. Journal of General Psychology, 50, 15–27.
Maier, N.R.F. (1931). Reasoning in humans: II. The solution of a problem and its appearance in consciousness. Journal of Comparative Psychology, 12, 181–194.
Marshalek, B., Lohman, D.F., & Snow, R.E. (1983). The complexity continuum in the radex and hierarchical models of intelligence. Intelligence, 7, 107–127.
Raven, J.C. (1962). Advanced progressive matrices, set II. London, UK: H.K. Lewis.
Reed, T.E., & Jensen, A.R. (1992). Conduction velocity in a brain nerve pathway of normal adults correlates with intelligence level. Intelligence, 16, 259–272.
Ross, B.H. (1984). Remindings and their effects in learning a cognitive skill. Cognitive Psychology, 16, 371–416.
Ross, B.H. (1987). This is like that: The use of earlier problems and the separation of similarity effects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 13, 629–639.
Salthouse, T.A. (1991). Mediation of adult age differences in cognition by reductions in working memory and speed of processing. Psychological Science, 2, 179–183.
Schervish, M.J. (1995). Theory of statistics. New York: Springer-Verlag.
Siegel, S., & Castellan, N.J., Jr. (1989). Nonparametric statistics for the behavioral sciences. New York: McGraw-Hill.
Sweller, J., & Gee, W. (1978). Einstellung, the sequence effect, and hypothesis theory. Journal of Experimental Psychology, 4, 513–526.
Veenman, M.V.J., Elshout, J.J., & Groen, M.G.M. (1993). Thinking aloud: Does it affect regulatory processes in learning? Tijdschrift voor Onderwijsresearch, 18, 322–330.
Verguts, T., De Boeck, P., & Maris, E. (1999). Generation speed in Raven's Progressive Matrices Test. Intelligence, 27, 329–345.
Verhelst, N.D., & Glas, C.A.W. (1993). A dynamic generalization of the Rasch model. Psychometrika, 58, 395–415.
White, H. (1988). Semantic priming of anagram solutions. American Journal of Psychology, 101, 383–399.
Wiley, J. (1998). Expertise as mental set: The effects of domain knowledge in creative problem solving. Memory and Cognition, 26, 716–730.
