
Studies in Second Language Acquisition, 2016, 38, 65–96.

doi:10.1017/S0272263115000212

THE ROLE OF EXPOSURE CONDITION IN THE EFFECTIVENESS OF EXPLICIT CORRECTION

Yucel Yilmaz
Indiana University

This article reports on a study that investigated the effects of two
feedback exposure conditions on the acquisition of two Turkish morphemes. The study followed a randomized experimental design with
an immediate and a delayed posttest. Forty-two Chinese-speaking
learners of Turkish were randomly assigned to one of three groups:
receivers, nonreceivers, and control. All learners performed three
communication games with a Turkish native speaker in which their
errors on the Turkish plural and locative morphemes were treated
according to their group assignment. The receivers' errors were corrected through explicit correction. The nonreceivers were allowed to
hear the feedback provided to the receivers; however, they did not
receive feedback on their own errors. The learners in the control
group neither received feedback on their own errors nor were allowed
to hear the feedback other learners received. Results indicated that
feedback exposure condition has an effect on the extent to which
learners benefit from feedback but that this effect may be moderated
by linguistic structure.

I would like to thank Gisela Granena, Kathleen Bardovi-Harlig, and Aileen Bach for their
insightful suggestions on an earlier version of this article. I also thank Yılmaz Köylü for his
assistance with data collection and coding. I am grateful to the anonymous SSLA reviewers
for their helpful comments. Final responsibility for any errors remains my own.
Correspondence concerning this article should be sent to Yucel Yilmaz, Indiana
University, Memorial Hall 303, Bloomington, IN 47408. E-mail: yyilmaz@indiana.edu
© Cambridge University Press 2015

Corrective feedback research has recently received a great deal of
attention in the field of SLA, as evidenced by the significant number of
review articles and meta-analyses published on the topic within the last
decade (Goo & Mackey, 2013; Li, 2010; Long, 2007; Lyster & Ranta, 2013;
Lyster & Saito, 2010; Mackey & Goo, 2007; Russell & Spada, 2006). Previous research on corrective feedback has concentrated on the relative
effects of corrective feedback on noticing and/or second language (L2)
development. Recently, researchers have started to investigate the role
of a wide array of factors in moderating the effectiveness of corrective
feedback. Studies contributing to this line of research have focused on
cognitive (e.g., working memory), affective (e.g., anxiety), and task-related
(e.g., contextual support) factors. Another factor that can be expected
to moderate feedback effectiveness is exposure condition. Previous
research has investigated the effectiveness of corrective feedback in
contexts in which learners experience corrective feedback by directly
receiving the feedback on producing inaccurate utterances. However,
this is not the only condition under which learners experience feedback. In classroom settings, learners also experience feedback indirectly by hearing the feedback that is provided to other learners. It
could even be the case that, in a classroom setting, learners hear the
feedback provided to other learners more often than they receive feedback addressed directly to them. Despite this variability in exposure
condition, no studies so far have investigated whether learners who
directly receive feedback on producing an inaccurate utterance benefit
more than learners who are allowed to hear the feedback provided to
other learners.
Exploring this issue has important implications for corrective feedback research and practice. Research on this topic could help maximize
the benefits of feedback in pedagogical practice by providing information about which exposure condition (direct, indirect, or both) should
be promoted for designing various classroom activities. In addition, this
research could contribute to the theory of corrective feedback by shedding light on the cognitive processes learners go through when decoding corrective feedback. Finally, the results of this research could inform
the corrective feedback research methodology as to whether feedback
should be categorized and reported separately depending on exposure
condition.
BACKGROUND LITERATURE
Corrective Feedback in L2 Acquisition
Although there is no dispute over the contribution of positive evidence
(i.e., targetlike exemplars of language) to L2 acquisition, there is disagreement over the role of negative evidence (i.e., information that
indicates what is not targetlike in the L2). A group of SLA researchers
(e.g., Krashen, 1981; Schwartz, 1993) has argued (with some variation
in terminology) that the knowledge obtained through negative evidence
is not the same as the knowledge that matters for L2 acquisition, and,
therefore, negative evidence cannot play a major role in SLA. Despite
this opposition, many SLA researchers (e.g., Ellis, 1991; Gass, 1997;
Long, 1996; Mackey, 1999; Pica, 1988) advocate that negative evidence
has a facilitative role in L2 acquisition, and some (e.g., White, 1991)
believe that negative evidence is necessary for the acquisition of certain L2 targets that cannot be learned through positive evidence
alone. Oral corrective feedback is a form of negative evidence, defined
as "any reaction of the teacher which clearly transforms, disapprovingly refers to, or demands improvement of the learner utterance"
(Chaudron, 1977, p. 31). Previous research has shown that oral corrective feedback is effective as a pedagogical technique. Comparisons
of oral corrective feedback groups with no-feedback control groups
(Ammar & Spada, 2006; Carroll & Swain, 1993; Doughty & Varela, 1998;
Ellis, Loewen, & Erlam, 2006; Goo, 2012; Loewen & Nabei, 2007; Lyster,
2004; Mackey & Philp, 1998; Sheen, 2007) or with positive-evidence-only
groups (Leeman, 2003; Ortega & Long, 1997) have shown that at least
one of the feedback groups in at least one of the investigated target
forms outperformed the nonfeedback groups. The overall magnitude
of the effect of corrective feedback on L2 acquisition has been reported
as medium to large (Li, 2010; Lyster & Saito, 2010; Mackey & Goo,
2007; Russell & Spada, 2006).
Previous research has also shown interest in the relative effectiveness of different corrective feedback types. The interaction hypothesis
(IH) and the focus-on-form (FonF) perspective have motivated some of
the studies comparing different types of feedback. According to the IH
(Long, 1981, 1996), conversational interaction facilitates language acquisition by triggering useful cognitive processes for L2 learning, such as
noticing (Schmidt, 2001). Focus on form (Long, 1991; Long & Robinson,
1998) suggests that learners' attention should be drawn to language as
an object when the need for it arises during a language activity whose
primary focus is on meaning. Inspired by these theoretical perspectives, some researchers (e.g., Doughty, 2001; Gass & Mackey, 2006; Long,
2007) have defended the position that recasts (e.g., targetlike reformulations of L2 utterances) are an ideal form of corrective feedback because
learners can benefit from the interaction-induced corrective feedback
without compromising the meaning-based nature of the task. Some
other researchers (e.g., Carroll, 2001; Lyster & Ranta, 1997) have put
forward different views, arguing that learners can have difficulty identifying the negative evidence conveyed through recasts because the recasts
lack salience. These researchers have suggested alternative forms of
feedback that are potentially more salient either because they provide
metalinguistic information (Carroll, 2001) or because they push learners
to self-repair their errors (Lyster & Ranta, 1997).

The two opposing views on the effectiveness of feedback types have
given impetus to empirical comparisons of the relative effectiveness of
recasts versus other feedback types, such as metalinguistic feedback
(i.e., metalinguistic comments and information about the accuracy of
learners' utterances), explicit correction (i.e., explicit provision of the
targetlike form), and prompts (i.e., a general category covering feedback moves that push learners to self-repair). The studies comparing
recasts versus metalinguistic feedback and prompts have produced
mixed results. Some studies have shown an advantage for metalinguistic
feedback (Carroll & Swain, 1993; Ellis, 2007; Ellis et al., 2006; Sheen,
2007) or prompts (Ammar & Spada, 2006; Lyster, 2004), whereas others
have shown no difference between feedback groups (recasts vs. metalinguistic feedback: e.g., Goo, 2012; recasts vs. prompts: e.g., Lyster &
Izquierdo, 2009). Relatively fewer studies have compared recasts versus
explicit correction. The results of these studies have shown an advantage
for explicit correction over recasts (Yilmaz, 2012, 2013a). It has also
been suggested that, regardless of the results of the individual studies,
one should be cautious about making generalizations from the existing findings because of the methodological variability among studies
(e.g., treatment tasks employed, operationalizations of feedback types,
outcome measures, target structure choice, and duration of the treatment
sessions; see Goo & Mackey, 2013, and Lyster & Ranta, 2013, for some
important methodological concerns in feedback research).
A second line of research has investigated whether the effectiveness
of corrective feedback changes depending on moderating variables.
The majority of the studies in this category have focused on one feedback type, unlike most comparative studies, which have included at
least two feedback types. Some of these studies have exclusively focused
on recasts and have investigated the various naturally occurring differences in feedback delivery (e.g., length of recasts and intonation) and
their effects on the noticing of feedback as measured by custom-made
tests or evidence of uptake (e.g., Egi, 2007; Loewen & Philp, 2006; Sheen,
2006). One finding from these studies has been that shorter recasts with
interrogative intonation are more noticeable than longer recasts with
declarative intonation.
Another group of studies has looked into whether cognitive factors
moderate the effectiveness of various feedback types. Cognitive factors
such as phonological short-term memory (Mackey, Philp, Egi, Fujii, &
Tatsumi, 2002; Trofimovich, Ammar, & Gatbonton, 2007), working memory
capacity (Goo, 2012; Mackey et al., 2002; Yilmaz, 2013a), attention control (Trofimovich et al., 2007), and language analytic ability (Sheen, 2007;
Yilmaz, 2013b) have been shown to be significantly related to L2 outcomes after feedback treatments. Another potential moderating factor
that has received little attention in the corrective feedback literature
to date is feedback exposure condition. Despite the current lack of
research into this variable, several previous studies have investigated
the effects of a similar variable. These studies are summarized in the
following section.

Feedback Exposure Condition


Feedback exposure condition refers to whether a learner's exposure to
corrective feedback is direct or indirect. It is considered direct if the
learner receives feedback on his or her own incorrect utterance, and it
is considered indirect if he or she witnesses another learner receiving
feedback on an incorrect utterance. To the best of the researcher's
knowledge, no studies have isolated exposure condition as a variable in
a study in which a predetermined feedback type is consistently provided on learner errors. Nevertheless, some previous studies motivated
by the IH (Long, 1981, 1996) investigated a similar variable, which
could be called active participation in interaction, specifically exploring
its effect on learners' listening comprehension, vocabulary acquisition,
and morphosyntactic development. In these studies, researchers compared the performance of a group that was allowed to interact with a
native speaker (NS) interlocutor (interactors) to the performance of
another group that was allowed to observe the interaction between the
interactor and a NS. Observing the interaction was defined as having
the opportunity to listen to the input produced during the interactor-NS
interaction. Ellis, Tanaka, and Yamazaki (1994) and Pica (1992) investigated the effect of active participation in interaction on learners' listening comprehension.1 In Ellis et al. (1994), vocabulary acquisition was
an additional dependent variable. In both studies, learners were asked
to follow the directions given by a NS to complete a puzzle. Learners
selected pictures of objects from many alternatives and positioned
them in their correct locations in a bigger picture. Learners' comprehension scores were calculated on the basis of the accuracy with which
they placed the pictures into their correct positions during the treatment tasks (Ellis et al., 1994; Pica, 1992). Vocabulary acquisition was
measured through a translation and picture-identification test in Ellis
et al. (1994). Neither on comprehension (Ellis et al., 1994; Pica, 1992)
nor on vocabulary acquisition (Ellis et al., 1994) did the results reveal a
difference between the groups. Because the treatment tasks in these
studies did not require learners to produce language forms orally, feedback on oral production was not a feature of the interactor-NS interaction in these studies. In two additional studies (Mackey, 1999; Muranoi,
2000), however, the treatment tasks did give learners the opportunity
to interact orally and receive feedback. These studies are examined
more closely in the following paragraphs.

Mackey's (1999) study, following a pretest-posttest design with two
delayed posttests, looked into the effects of different types of conversational interaction on the acquisition of English question formation.
Thirty-four English as a L2 learners were randomly assigned to one of
five groups: interactor readies, interactor unreadies, observers, scripteds, and control. The interactors were allowed to interact with the NS,
whereas the observers were not. The observers, however, were permitted to listen to the input that was given to the interactors. The interactor readies and interactor unreadies differed in their developmental
readiness to acquire the target. The scripteds received canned input
modified in such a way that would render negotiation for meaning
unlikely. In both the tests and treatment tasks, learners carried out
information gap tasks with a NS of English in which the learners asked
questions to get the information held by the NS. With regard to the
effects of active participation in interaction, the results of the study
were positive. The interactors were more likely than the observers
to demonstrate sustained stage increase (i.e., the production of at
least two developmentally advanced questions). In addition, the mean
number of question forms that were more advanced than the learners'
current level was calculated. The interactors increased their scores
from pretest to posttest, whereas the observers did not.
Another study that reported findings relevant to the current study is
Muranoi's (2000) quasi-experimental study, whose main focus was the
relative effectiveness of three types of instruction: interaction enhancement with meaning-oriented debriefing, interaction enhancement with
form-oriented debriefing, and no interaction enhancement with meaning-oriented debriefing. The target structure was the English indefinite
article. Interaction enhancement involved the provision of a variety
of interactional features, such as confirmation checks, clarification
requests, requests for repetition, recasts, and models (i.e., positive
evidence), during communicative (i.e., role-play) activities. Ninety-one
Japanese English as a foreign language learners from three intact classes
participated in the study. In each class, 10 students interacted with
the instructor to perform the role-play activities, and the rest observed the
interaction. The results of the study showed no differences between the
observers and interactors in the posttest measures (oral and written
production and grammaticality judgment tasks).
To summarize, previous literature shows that active participation in
interaction does not make a difference in terms of learners' comprehension ability and vocabulary acquisition when interaction is operationalized as understanding a NS's instructions. However, the results of the
studies that are more closely related to the current study, in which
interaction involved oral production and corrective feedback, were
mixed. One study showed positive results for active participation
(Mackey, 1999), whereas another study showed no advantage for it
(Muranoi, 2000). It is important to note the key differences between the
research reviewed and the current study with regard to their focus variables. The previous studies, as required by their research questions,
treated interaction as a global variable, which was a combination of
many different interactional features such as clarification requests, confirmation checks, comprehension checks, and recasts. The interactors
in these studies were given not only the opportunity to be exposed to
these features but also the opportunity to produce output and hear the
positive evidence provided by the NS. Consequently, as the learners in
the observer and interactor groups in the previous studies differed with
respect to many interactional features, these studies cannot be taken
as tests of the effects of feedback exposure conditions (nor were they
intended to do so). To determine the potential differential effects of being
directly versus indirectly exposed to feedback, studies should isolate a
specific feedback type from other interactional features to minimize the
effects of other potential sources of corrective feedback (e.g., clarification requests and confirmation checks). Similarly, observers and interactors should be matched on their opportunities for output production so
that output production can be ruled out as a causal factor.
THE PRESENT STUDY
As laid out in the previous section, to date, research has not examined
the effect of feedback exposure condition on the extent to which
learners benefit from feedback. To address this gap, the present study
aimed to compare the performance of a group of learners, called receivers,
who received corrective feedback after making an error to the performance of another group, called nonreceivers, whose members were
permitted to hear the feedback provided to the receivers but did not
receive feedback on their own errors.2 The feedback type chosen for
this study was explicit correction, defined as the explicit provision of
the reformulation of the learner's error.
Despite the lack of research in this area, there are grounds for predicting that feedback exposure condition may play a role in feedback
effectiveness. Previous attempts to explicate the mechanisms through
which learners benefit from corrective feedback have underlined that a
brief attention shift from meaning to form, or an interpretation of the
corrective feedback as a comment on the formal aspects of the language,
is a critical step (Carroll, 2001; Doughty, 2001; Long, 2007). Long (2007),
targeting recasts, has argued that being the producer of the corrected
utterance plays a crucial role in this brief attention shift from meaning
to form as well as in any higher order noticing resulting from this initial
noticing. He has stated that in a recast situation, "learners are vested in
the exchange, as it is their message that is at stake, and so will probably
be motivated and attending, which are conditions to facilitate noticing of
any new linguistic information in the input" (p. 114). One can extend
this argument to other feedback types, such as explicit correction,
because it is likely that having a message at stake during an explicit
correction instance, similar to a recast instance, can function as an
attention-enhancing factor. As a result, it can be expected that the learner
whose erroneous utterance is corrected through explicit correction
would benefit more from explicit correction than the learner who hears
the feedback provided to the first learner but does not receive feedback
on his or her own utterance. This study seeks to address the following
question: Is there a difference between the performance of a group that
receives corrective feedback after producing an error (i.e., receivers)
and the performance of a group that does not receive corrective feedback on their own errors but is exposed to the corrective feedback provided to the receivers (i.e., nonreceivers), as measured by recognition
and oral production test scores?
In light of the arguments previously outlined, the following hypothesis was formulated: The receivers will benefit more from corrective
feedback than the nonreceivers.

Method
The present study followed a randomized experimental design with an
immediate and delayed posttest (see Figure 1). A pretest was not administered because it was possible to assume that learners had zero knowledge of the target forms, as they were absolute beginners with no previous
exposure to or knowledge of the target language (i.e., Turkish). Prior to
the beginning of the experiment, all learners were asked to study key
vocabulary using an instructional Web site to ensure that they had
enough vocabulary to begin the experiment. Learners were randomly
assigned to one of three groups: receivers, nonreceivers, and control.3

Figure 1. Study design. The shaded parts indicate that paired participants from these groups were present during each other's task
performance.

Participants
Participants were recruited at a large university in the midwestern
United States. The study was advertised in two ways: (a) via an electronic advertisement using the online classified ads service of the university and various university-related listservs and (b) via flyers posted
on bulletin boards around campus. Two hundred and twenty participants volunteered for the study on the basis of the following criteria:
(a) being a native speaker of Mandarin Chinese, (b) not having been
exposed to Turkish previously, and (c) not having taken any linguistics
courses. The researcher explained that they had to learn 39 Turkish
words to qualify for the study and provided them with a link to the instructional module for the vocabulary learning activities (see the Tasks and
Materials section for a description of the preexperimental stage materials). He asked them to contact him again when they had studied the
words and had passed an online vocabulary test by scoring 95%. Forty-two participants (38 females and six males) contacted the researcher
after passing the test and constituted the final pool of participants. The
reason for targeting a group of participants speaking the same first
language (L1; i.e., Mandarin) was to prevent L1 differences from confounding the results. The target population was required to not have any
previous exposure to Turkish for the following reasons: (a) to represent
the learning processes of learners who are exposed to a L2 for the first
time, (b) to better detect differences among the groups (i.e., learners
with no prior knowledge would have more room to improve than
learners who have some knowledge), (c) to prevent prior knowledge
from interacting with treatment effects, and (d) to avoid testing effects
that might arise due to the administration of a pretest. It is important
to note that learners were not taking any Turkish language courses
at the time of the study, which served as an additional control feature to minimize the possibility of learners receiving extra input on the
target structures from resources outside the experiment. A background questionnaire revealed that no participant had ever been to
Turkey. The average age of the group was 23.58 years (SD = 5.31),
and the majority (86%) were university students. All of the participants spoke English as a L2 with an average TOEFL iBT score of 93.48
(SD = 13.72). Participants had received 9.76 years (SD = 3.63) of English
language instruction and had lived in the United States for 2.68 years
(SD = 1.82) on average. Nine participants had beginner-level knowledge
of a third language (L3; Japanese, n = 4; French, n = 2; German, n = 2;
Spanish, n = 1).

Target Structures
Two Turkish structures were selected for the study: the plural morpheme
/-lAr/, and the locative case morpheme /-DA/. These morphemes were
selected because of their predicted low form-meaning salience due to
allomorphy. Structures that have allomorphic variation can be a challenge for learners because the learners cannot rely on their tendency to
look for one-to-one form-meaning relationships to learn the structure
(Andersen, 1984). Turkish, being an agglutinating language, is rich in
inflectional morphology, with most suffixes having phonologically conditioned allomorphs. Vowel harmony and devoicing determine the
allomorphs of the morphemes under study. Vowel harmony specifies
that a native Turkish word should include either exclusively nonfront
(i.e., central and back) vowels /a, ɯ, o, u/ or exclusively front vowels
/e, i, ø, y/. As shown in examples (1) and (2), the plural morpheme
/-lAr/ becomes [-ler] or [-lar] depending on vowel harmony; the choice
between /e/ and /a/ is determined by the preceding stem vowel. It is /e/
after front vowels and /a/ after nonfront vowels.
(1) kemer-ler
belt-PL
belts
(2) tabak-lar
plate-PL
plates
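The plural rule just described can be stated procedurally. The Python sketch below is an illustration only and is not part of the study materials. It assumes lowercase input in standard Turkish spelling, it maps the front and nonfront vowel classes onto orthographic letters, and it ignores loanwords that violate vowel harmony. The letter sets and the names `last_vowel` and `pluralize` are hypothetical and introduced here for illustration.

```python
# Assumed orthographic counterparts of the two vowel classes described above.
FRONT_VOWELS = set("eiöü")
NONFRONT_VOWELS = set("aıou")


def last_vowel(stem: str) -> str:
    """Return the final vowel letter of the stem (assumes lowercase input)."""
    for ch in reversed(stem):
        if ch in FRONT_VOWELS or ch in NONFRONT_VOWELS:
            return ch
    raise ValueError(f"no vowel found in stem: {stem!r}")


def pluralize(stem: str) -> str:
    """Attach -ler after a front stem vowel and -lar after a nonfront one."""
    suffix = "ler" if last_vowel(stem) in FRONT_VOWELS else "lar"
    return stem + suffix


if __name__ == "__main__":
    for noun in ("kemer", "tabak"):
        print(noun, "->", pluralize(noun))
```

Applied to the stems in examples (1) and (2), the function returns kemerler and tabaklar.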

The suffix /-DA/ expresses the locative case in Turkish, and it has four
allomorphs, as shown in examples (3)–(6). It becomes [-de], [-da], [-te],
or [-ta] depending on vowel harmony and devoicing. The locative
/-DA/ becomes [-te] or [-ta] after voiceless consonants and [-de] or
[-da] after vowels or voiced consonants. The preceding stem vowel
determines the vowel in the suffix, as in the plural morpheme. The
meaning of the locative case corresponds to the English prepositions
in, on, at, and by.
(3) ev-de
house-LOC
in the house
(4) masa-da
table-LOC
on the table
(5) sepet-te
basket-LOC
in the basket
(6) raf-ta
shelf-LOC
on the shelf
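The locative rule combines two independent choices: the suffix consonant depends on the final segment of the stem, and the suffix vowel depends on the last stem vowel. The Python sketch below is an illustration only and is not part of the study materials. It assumes lowercase input in standard Turkish spelling. The letter sets, including the set of voiceless consonant letters, and the function name `locative` are hypothetical and introduced here for illustration.

```python
# Assumed orthographic letter classes (not taken from the study materials).
FRONT_VOWELS = set("eiöü")
NONFRONT_VOWELS = set("aıou")
VOICELESS_CONSONANTS = set("çfhkpsşt")


def locative(stem: str) -> str:
    """Attach the locative allomorph selected by devoicing and vowel harmony."""
    vowels = [ch for ch in stem if ch in FRONT_VOWELS or ch in NONFRONT_VOWELS]
    if not vowels:
        raise ValueError(f"no vowel found in stem: {stem!r}")
    # Devoicing: t after a voiceless consonant, d after a vowel or voiced consonant.
    consonant = "t" if stem[-1] in VOICELESS_CONSONANTS else "d"
    # Vowel harmony: e after a front stem vowel, a after a nonfront one.
    vowel = "e" if vowels[-1] in FRONT_VOWELS else "a"
    return stem + consonant + vowel


if __name__ == "__main__":
    for noun in ("ev", "masa", "sepet", "raf"):
        print(noun, "->", locative(noun))
```

Applied to the stems in examples (3) through (6), the function returns evde, masada, sepette, and rafta.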

Tasks and Materials


Preexperimental Stage Materials and Vocabulary Tests. The purpose of
the preexperimental stage was to make sure that learners had enough
vocabulary knowledge to describe the pictures used in the treatment
and posttests. Learners were asked to learn the orthographic and phonological forms of 39 Turkish words (30 concrete nouns and nine adjectives; see Appendix A) through flashcard and matching activities, which
were parts of a Web-based instructional module. Learners were allowed
to study the words at their own pace and at their convenience in their
preferred location. After studying the words, learners took an online
vocabulary test through the same instructional module, during which
they matched the orthographic and phonological forms of the words
with pictures. Learners who reached a 95% cutoff level qualified to meet
with the researcher to carry out the rest of the study. At the beginning
of each treatment session, to make sure that learners still remembered
the required words, a picture-naming task requiring learners to say
the Turkish words for the objects shown in pictures was administered.
A perfect score was required on this test. If learners failed, they were
allowed to take the test again until they had a perfect score.

Treatment Tasks. Three versions of a communication game were
used in the treatment. The versions shared the same procedure but
differed in content. All learners carried out one version in the first session and two in the second. The versions of the tasks were counterbalanced across experimental units (i.e., receiver-nonreceiver pairs and
control participants). Each task included eight critical items creating
obligatory contexts for use of each target structure and eight distractors. For each task, three folders were created. Two of the folders were
exactly the same so that the receivers and nonreceivers could follow
each other's interaction with the researcher. One folder was different
from the others and was given to the experimenter. Each sheet in the
learner's folder contained one picture, whereas the corresponding sheet
in the experimenter's folder contained three pictures, one of which was
identical to the picture in the learner's folder. The pictures for the critical
items showed more than one object located on top of a different object.
The pictures for the distractor items showed two different objects side
by side, not touching each other. Learners were asked to describe the
pictures on each sheet of their folder to help the experimenter select the
same picture on his sheet. During the receiver-nonreceiver sessions, each
learner was given one of the two identical folders and was told to pay
attention to the pictures being described and the interaction between
their partner and the experimenter. Because the learner's and the experimenter's pictures differed by more than one characteristic, no one
statement by the learner immediately identified the pictures. Thus, if
the learner did not exhaustively describe his or her picture, the experimenter would ask clarification questions, such as Masa ne renk? "What
color is the table?" or Masa büyük mü küçük mü? "Is the table big or
small?" The experimenter explained the meanings of these questions
prior to the beginning of the treatment tasks by writing them on the
board, modeling their pronunciation, and translating them into English.
The tasks were designed in such a way that the receivers' selection of
the correct pictures did not depend on whether the descriptions included the target structures. For example, in one of the target items, the
experimenter saw green apples on a red book in Picture A, green apples
on a grey book in Picture B, and red apples on a red book in Picture C.
Here, the selection of the correct picture did not depend on the distinction between bare and inflected nouns. Regardless of the picture, the
apples should be in the plural form (inflected), and the book should be
inflected for location. This feature was incorporated into the task
design so that it could be possible for the experimenter to know whether
the learner's utterance was erroneous. The administration of each task
took 11 min 43 s on average.
Posttest Measures. Two types of tests were used to measure learners'
linguistic knowledge gained from the treatment: a multiple-choice
recognition test and an oral production test. The recognition test was
designed to measure learners' knowledge about the pattern behind the
allomorphic variation for each morpheme under conditions favorable
for the use of explicit knowledge; these conditions were as follows:
(a) learners had enough time to plan their responses and (b) learners
could give a correct response by attending to the form (Ellis, 2005). The
test included 48 items: 32 distractors and eight critical items for each
target structure. The items were presented randomly through a Web-based testing tool. Each item included a picture and an incomplete two-word sentence (e.g., adam _____), in which one of the words was always
provided. The learner's task was to fill in the gap by selecting from five
options (e.g., a. masada, b. masata, c. masate, d. none, e. masade). The
options presented the orthographic and phonological forms (through
clickable links) of the noun-allomorph combinations. The correct option
included the correct noun-allomorph combination (e.g., masada), whereas
the incorrect options included incorrect noun-allomorph combinations
(e.g. *masata). The items were balanced with respect to which allomorph
is considered correct.
The oral production test was designed to measure learners' ability to
mark nouns for plurality or location when necessary and to use the
correct allomorphs of the morphemes. The task involved a time limit,
and its primary focus was on message creation. In this task, learners
were asked to describe the location of the object(s) they saw in the
picture. In each version, there were 40 items: 16 critical items for each
target form, creating contexts for its production, and eight distractor items.
The test was administered using a stimulus presentation computer program.
The presentation of the items was random and automatic with an
8-s delay between the items. The items were balanced with respect to
which allomorph was considered correct. There were two versions of
each test. One version was used on the immediate posttest and
the other on the delayed posttest. The versions of the tasks
were counterbalanced across experimental units.

Feedback Treatment
A native Turkish-speaking research assistant (hereafter referred to as
the experimenter) provided explicit correction to the receivers whenever they omitted marking on nouns for plurality and/or location when
it was necessary to do so, marked the incorrect noun, or used incorrect
allomorphs during the treatment. At no point during the treatment
were learners told that they would receive feedback. The feedback
type choice was motivated by the results of previous studies showing
that explicit correction is an effective form of feedback (Li, 2010).4

Explicit correction was operationalized as a statement indicating what
learners should have said. Because learners were absolute beginners,
English was used in the feedback to make sure that they understood
what was said (e.g., "You should say X"). The explicit correction could
be simple, as in (7), or complex, as in (8), depending on whether learners
had one or two errors in their utterances. Because learners
were absolute beginners and knew only 39 Turkish words, their initial
utterances included only the uninflected nouns, as in (8). After receiving
feedback, some learners started to produce sentences, as in (7), that
included the plural and/or locative markers but involved the use of an
incorrect allomorph. It should also be noted that no additional knowledge
of Turkish grammar was necessary to describe the pictures. In
other words, it was possible for learners to produce grammatical utterances
as long as they used the inflections presented in the feedback.5
(7) Simple explicit correction:
    Learner:     *Sinekler kitapte.
                 *Fly-PL book-LOC.
                 "The flies [are] on the book."
    Researcher:  You should say kitap-ta.
                 book-LOC
                 "on the book"
    Researcher:  Kitap ne renk?
                 Book what color?
                 "What color is the book?"
(8) Complex explicit correction:
    Learner:     *Kedi gemi.
                 *Cat boat.
                 "The cat [is] the boat."
    Researcher:  You should say kedi-ler gemi-de.
                 Cat-PL boat-LOC
                 "The cats [are] in the boat."
    Researcher:  Gemi ne renk?
                 Boat what color?
                 "What color is the boat?"

The receivers were provided feedback on their own errors during the
tasks. The nonreceivers were allowed to hear the feedback provided to
the receivers. The control group learners neither received feedback on
their own errors nor heard the feedback provided to other learners. To
expose the nonreceivers to feedback, each nonreceiver was matched
with a receiver and was allowed to be present in the same room with the
receiver during the receiver's task performance. All groups were matched
on output opportunities: Each learner interacted with the experimenter
by performing three tasks in which he or she described pictures (see
the Procedures section for more information). This ensured that all
groups had equal opportunities to produce output and to hear input
not containing the target structures (i.e., target-irrelevant input).
In addition, the practice of pairing a receiver with a nonreceiver
resulted in pairs being automatically matched on the following variables:
(a) quantity and type of target-relevant input from the experimenter
(i.e., feedback and its idiosyncratic features, e.g., simple or complex),
(b) quantity and type of target-irrelevant input from the experimenter
(i.e., input not containing negative or positive evidence on the target
forms), (c) beginning and end times of the sessions, (d) duration of
the sessions, and (e) version of posttest tasks.
Matching the experimental groups on output opportunities required
the researcher to make two important experimental decisions. First,
learners receiving feedback may naturally attempt to repeat the reformulated
part of their utterances immediately after the feedback. This
move is known as modified output (or repair), a discourse feature
shown to be related to L2 acquisition (e.g., McDonough, 2005). Because
modified output was natural only for the receiver role, in order not to
give an advantage to the receivers, opportunities for modified output
were blocked by asking a question relevant to task completion
immediately after the feedback (see the feedback examples in [7] and [8]).6
The second, related issue was that learners' output was likely to include
attempts to use the target forms in subsequent turns. These attempts
could function as extra input for the listener (i.e., either the receiver or
the nonreceiver). The present study allowed participants to hear each
other's attempts on the following grounds. First, it was predicted that,
unlike modified output, which was natural only for the receivers,
attempts could be made by both experimental groups, and, therefore,
the study would not favor one experimental group over the other. Second,
when learners engage in pair work in classroom contexts, they often
have the chance to hear each other's attempts. That is, not separating
feedback and attempts in the laboratory setting is a reasonable choice
if one aims to simulate the phenomenon under investigation with greater
fidelity to the classroom context, as long as hearing these attempts
does not pose a threat to internal validity.

Procedures
The procedures of the study were as follows. The researcher set up
three meetings with the learners who had studied the vocabulary items
and passed the online test. All meetings took place in a research lab.
The learners in the control group met with the experimenter individually, whereas the learners in the receiver and nonreceiver groups were
paired and met with the experimenter together. These pairs were kept
intact until the end of the study. During the receiver-nonreceiver treatment
sessions, the receiver and the nonreceiver sat side by side, facing
the experimenter. All sessions started with the administration of the
picture-naming task. Learners had to score 100% on the test to proceed
to the treatment tasks. After passing the test, learners carried out the
treatment tasks with the experimenter. In the first session, learners in
all groups carried out one treatment task with the experimenter. The
study was designed such that the receivers carried out the task before
the nonreceivers to match the experimental groups on opportunities to
apply the knowledge gained from feedback. One week after the first
meeting, the groups met with the experimenter again to carry out two
more treatment tasks.7 In the first of these tasks, the nonreceivers
were the first to carry out the task. In the second task, the order in which
the learners performed the task was counterbalanced across all
receiver-nonreceiver pairs. At the end of the session, the learners took the
immediate posttest individually in the presence of the experimenter. Two
weeks after the second session, the learners met again with the experimenter
alone and took the delayed posttest. The meetings of the learners
in the receiver-nonreceiver pairs with the experimenter were on the
same day and immediately followed each other. In each posttest, the
oral production test was administered before the recognition test. Next, all
participants responded to a background questionnaire, and the receivers
and nonreceivers responded to three manipulation check statements
by rating them on a scale from 1 to 9. The manipulation check statements
were (1) "When my partner was doing the activity with the researcher, I was
paying attention to their conversation"; (2) "I noticed that the experimenter
corrected my errors"; and (3) "I noticed that the experimenter corrected my
partner's errors." Learners' performance in all treatment tasks and oral
production tests was audio recorded.

Scoring and Analysis


Learners' oral responses to each item in the oral production test were
first transcribed and then coded for correct suppliance in obligatory
contexts, suppliance in nonobligatory contexts, and misformations in
obligatory contexts. The provision of the correct allomorph was coded
as correct suppliance, the suppliance of an incorrect allomorph as
misformation, and the provision of a morpheme in the wrong context as
suppliance in nonobligatory contexts. An independent rater coded 15%
of the data for these categories. Interrater reliability for the plural and
locative oral production tests was acceptable as indexed by Cohen's
kappa (κ = .86 for the locative; κ = .92 for the plural). The percentage
agreement between the two independent coders was 91.3% for the locative
and 95.1% for the plural. Disagreements in scoring were then discussed
and resolved. Next, for each learner an adjusted target language use
(ATLU) score was calculated per target structure using the formula in
Ono and Witzel (2002):
\[
\mathrm{ATLU} = \frac{(n_{\text{correct suppliance in obligatory contexts}} \times 2) + (n_{\text{misformations in obligatory contexts}} \times 1)}{(n_{\text{obligatory contexts}} \times 2) + (n_{\text{suppliance in nonobligatory contexts}} \times 2)}
\]
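The scoring rule above can be sketched in a few lines of Python. This is only an illustration, assuming the weights read off the formula (2 for a correct suppliance, 1 for a misformation, 2 for each obligatory context and each suppliance in a nonobligatory context). The function name `atlu` and the example counts are hypothetical and are not taken from the study's data.

```python
def atlu(correct, misformed, obligatory, nonobligatory):
    """Adjusted target language use (ATLU) score for one learner and one structure.

    correct       -- correct suppliances in obligatory contexts
    misformed     -- misformations (incorrect allomorph) in obligatory contexts
    obligatory    -- total number of obligatory contexts
    nonobligatory -- suppliances in nonobligatory contexts
    """
    denominator = obligatory * 2 + nonobligatory * 2
    if denominator == 0:
        raise ValueError("no scorable contexts")
    return (correct * 2 + misformed * 1) / denominator


# Hypothetical learner: 16 obligatory contexts, 6 correct allomorphs,
# 4 incorrect allomorphs, and 2 suffixes supplied where none was required.
score = atlu(correct=6, misformed=4, obligatory=16, nonobligatory=2)
print(round(score, 3))
```

Under this weighting, a learner who always supplies the morpheme but always picks the wrong allomorph would score .50, which is why the oral production score gives partial credit for marking the noun at all.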

To calculate learners' scores on the recognition test, first, learners'
responses to each item were coded as right or wrong. Then a score was
calculated for each learner by dividing the sum of correct responses by
the total possible score. Cronbach's alpha coefficient was computed to
estimate the internal consistency of each of the two versions of the test
(version A, α = .70; version B, α = .71).
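As a rough illustration of these two steps, the sketch below computes a proportion-correct score per learner and Cronbach's alpha for a small matrix of right/wrong responses. The response matrix is invented (four learners, four items, for brevity), the function names are hypothetical, and the use of the sample variance is an assumption; the article does not report how alpha was computed.

```python
from statistics import variance


def proportion_correct(responses):
    """Recognition score: share of items answered correctly (1 = right, 0 = wrong)."""
    return sum(responses) / len(responses)


def cronbach_alpha(matrix):
    """Cronbach's alpha for a learners-by-items matrix of item scores."""
    k = len(matrix[0])                       # number of items
    items = list(zip(*matrix))               # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in matrix])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: 4 learners x 4 items (not study data).
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
scores = [proportion_correct(row) for row in data]
alpha = cronbach_alpha(data)
print(scores, round(alpha, 2))
```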
RESULTS
The descriptive statistics for the mean number of explicit corrections
across the three tasks appear in Table 1. As shown, there is a gradual
decrease from Task 1 to Task 3 in the mean number of feedback instances.
This indicates that the receivers produced increasingly more targetlike
utterances as the experiment progressed. In addition, the mean number
of feedback instances provided for the plural was lower than the mean
number of feedback instances provided for the locative. This shows
that the receivers were slightly more targetlike with the plural than with
the locative.
Table 1. Mean number of explicit correction instances by task and structure

             Task 1         Task 2         Task 3         Total
Structure    M      SD      M      SD      M      SD      M       SD
Plural       7.43   1.09    6.00   2.48    5.71   2.13    19.14   5.25
Locative     7.64    .63    7.36    .93    6.57   1.55    21.57   2.59

Next, all data were checked for normality of distribution. As shown
by the results of a series of Shapiro-Wilk tests run on each test, target
form, and time (see Appendix B), the data failed to demonstrate a
normal distribution. The values for skewness and kurtosis were also
inspected to better understand the reasons for this nonnormal
distribution. The skewness values for the oral production test (immediate
plural = .98, immediate locative = .23; delayed plural = 1.31, delayed
locative = .76) and for the recognition test (immediate plural = .46,
immediate locative = .63; delayed plural = .12, delayed locative = .06)
revealed that the distribution of the scores, especially in the oral production tests, tended to be positively skewed. This means that most of the
scores were at the lower end of the scale. The low scores of the control
group on both tests might have contributed to these skewness values.
The kurtosis values for the oral production test (immediate plural = .53,
immediate locative = 1.84; delayed plural = .74, delayed locative =
1.01) and for the recognition test (immediate plural = .49, immediate
locative = .11; delayed plural = 1.48, delayed locative = .10) revealed
that the distribution was slightly platykurtic in many cases, indicating large variation within scores. Cognitive or affective individual
differences among learners, the difficulty level of the tests, or the
interaction between these two factors may have given rise to these
large variations.
Next, given the nonnormal distribution of the data, various nonparametric tests were carried out to test the hypothesis. First, the performance of each experimental group (i.e., receivers and nonreceivers)
was compared to the performance of the control group. A Kruskal-Wallis
test, the nonparametric equivalent of a one-way ANOVA, was conducted
for each time and outcome measure. Post hoc Mann-Whitney tests were
conducted to find out if each of the groups was different from the control group. Second, the performances of the two experimental groups
were compared against each other. In this analysis, the observations
that came from the receiver-nonreceiver pairs that participated in
the same sessions were treated as related because they were matched
on many different variables (e.g., quantity and type of feedback; see
the Method section). Given that treating the groups as independent
when they are actually related may lead to Type II errors (Field,
2009), a Wilcoxon signed-ranks test, the nonparametric analogue of
the paired-samples t test, was carried out for each time and outcome
measure. The results of the nonparametric tests are reported for
each structure separately.
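The omnibus step of this analysis can be sketched without any statistics package. The code below is a minimal, self-written Kruskal-Wallis test for exactly three groups, with the usual tie correction; it is not the software the study used, the function names are hypothetical, and the scores are invented. It relies on the fact that, with two degrees of freedom, the chi-square upper-tail probability reduces to exp(-H/2).

```python
import math
from collections import Counter


def average_ranks(values):
    """Return the rank of each value, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1           # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis(*groups):
    """Tie-corrected Kruskal-Wallis H and its chi-square p value for three groups."""
    if len(groups) != 3:
        raise ValueError("the closed-form p value used here holds only for df = 2")
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    h, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        h += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * h - 3 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h /= correction
    p = math.exp(-h / 2)                       # chi-square upper tail, df = 2
    return h, p


# Hypothetical proportion scores for three groups of three learners (not study data).
receivers = [0.70, 0.80, 0.90]
nonreceivers = [0.40, 0.50, 0.60]
control = [0.10, 0.20, 0.30]
h, p = kruskal_wallis(receivers, nonreceivers, control)
print(f"H(2) = {h:.2f}, p = {p:.3f}")
```

The same identity gives a quick plausibility check on reported values: H(2) = 2.72 corresponds to exp(-1.36), or about .257. The post hoc Mann-Whitney and Wilcoxon signed-ranks tests are not implemented in this sketch.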

The Plural Morpheme


The descriptive statistics for the plural items shown in Table 2 reveal
that the receiver group scored higher than the nonreceiver and control
groups on the recognition test. On the oral production test, however, the
receivers scored higher than the other two groups only at the immediate
posttest. Kruskal-Wallis tests showed that there was an effect for feedback
exposure condition on both the immediate recognition, H(2) = 11.34,
p < .001, and the delayed recognition, H(2) = 8.98, p = .011, posttests.
The Kruskal-Wallis tests also revealed an effect for feedback exposure
condition on both the immediate oral production, H(2) = 18.05, p < .001,
and the delayed oral production, H(2) = 13.44, p = .001, posttests. Post
hoc Mann-Whitney tests, reported in Table 3, revealed that the receivers
statistically outperformed the control group on both test types regardless
of time (i.e., immediate and delayed). The Mann-Whitney tests (see
Table 3) also revealed that the nonreceivers statistically outperformed
the control group on the oral production test (regardless of time) but not
on the recognition test (regardless of time). Wilcoxon signed-ranks tests
revealed that the receivers statistically outperformed the nonreceivers
on the immediate and delayed recognition posttests and on the immediate
oral production posttest but not on the delayed oral production
posttest.

Table 2. Descriptive statistics for test scores

                                        Receivers         Nonreceivers      Control
Structure  Test             Time        M    Mdn   SD     M    Mdn   SD     M    Mdn   SD
Plural     Recognition      Immediate   .49  .50   .25    .28  .25   .23    .15  .06   .20
                            Delayed     .46  .50   .22    .22  .13   .20    .22  .13   .26
           Oral production  Immediate   .43  .42   .35    .32  .30   .33    .00  .00   .00
                            Delayed     .25  .03   .34    .31  .16   .33    .00  .00   .00
Locative   Recognition      Immediate   .27  .25   .17    .27  .25   .13    .23  .25   .11
                            Delayed     .31  .25   .14    .30  .25   .14    .21  .25   .15
           Oral production  Immediate   .47  .59   .32    .44  .60   .30    .00  .00   .00
                            Delayed     .33  .31   .29    .35  .42   .29    .00  .00   .00

The Locative Morpheme


Table 2 reveals that the mean scores of all groups on the locative items
were very close to one another in the recognition tests, whereas the
mean scores of the receivers and nonreceivers were considerably
higher than the mean scores of the control group in the oral production
tests. Kruskal-Wallis tests showed that there was an effect for feedback
exposure condition in the oral production test (immediate: H(2) = 17.69,
p < .001; delayed: H(2) = 16.32, p < .001) but not in the recognition test
(immediate: H(2) = 0.31, p = .856; delayed: H(2) = 2.72, p = .257). Post hoc
Mann-Whitney tests carried out for the oral production test scores revealed
that both the receivers and the nonreceivers statistically outperformed
the control group regardless of time (see Table 3). Wilcoxon signed-ranks
tests showed that neither in the recognition test nor in the oral production
test did the receivers and nonreceivers differ from each other (see
Table 3).

Table 3. Post hoc results

                                        Receivers vs.          Nonreceivers vs.       Receivers vs.
                                        control                control                nonreceivers
Structure  Test             Time        U      p        r      U      p        r      Z     p      r
Plural     Recognition      Immediate   29.00  .001*    .61    67.00  .139     .28    2.45  .014*  .46
                            Delayed     47.00  .016*    .46    94.00  .848     .04    2.85  .004*  .54
           Oral production  Immediate   21.00  < .001*  .76    28.00  < .001*  .71    2.67  .008*  .50
                            Delayed     49.00  .003*    .56    28.00  < .001*  .71     .13  .894   .02
Locative   Recognition      Immediate   N/A    N/A      N/A    N/A    N/A      N/A     .32  .975   .06
                            Delayed     N/A    N/A      N/A    N/A    N/A      N/A     .21  .832   .04
           Oral production  Immediate   21.00  < .001*  .76    28.00  < .001*  .71     .97  .331   .18
                            Delayed     28.00  < .001*  .71    28.00  < .001*  .71     .27  .789   .05

Note. * = statistically significant, N/A = not applicable.
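The third value reported for each receivers-versus-nonreceivers comparison in Table 3 appears to follow a common convention for rank-based tests, in which the test statistic Z is divided by the square root of the total number of observations. The sketch below checks that reading against the plural values. Both the formula and N = 28 (14 matched pairs, two scores per pair) are assumptions inferred from the reported figures; the article does not state them in this section.

```python
import math


def effect_size_r(z, n_observations):
    """Effect size r for a rank-based test: |Z| divided by sqrt(N)."""
    return abs(z) / math.sqrt(n_observations)


# Wilcoxon values for the plural from Table 3 (receivers vs. nonreceivers).
# N = 28 is an assumption: 14 receiver-nonreceiver pairs, two scores per pair.
plural_z = {
    "recognition, immediate": 2.45,
    "recognition, delayed": 2.85,
    "oral production, immediate": 2.67,
    "oral production, delayed": 0.13,
}
for label, z in plural_z.items():
    print(f"{label}: r = {effect_size_r(z, 28):.2f}")
```

Under these assumptions the computed values round to .46, .54, .50, and .02, which agree with the tabled effect sizes for the plural.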
As explained in the Method section, the receivers and nonreceivers
were allowed to hear each other's attempts to produce the target morphemes
in their subsequent turns. One consideration with this design
decision is that an imbalance in the number of attempts (i.e., extra input
for the other learner) between the groups could make it difficult to attribute
any differences between the groups to feedback exposure. To account
for this, the type and amount of input learners could hear from each
other were analyzed. Table 4 shows the descriptive statistics for the
type and amount of input to which each experimental group was exposed.
As can be seen from Table 4, nontargetlike productions (considering
suppliance in nonobligatory contexts and misformations together)
were more frequent than targetlike productions. Paired-samples t tests
conducted for each morpheme in each category revealed no significant
differences between the groups (plural: misformations, t(13) = .29,
p = .77; correct suppliance, t(13) = .78, p = .45; oversuppliance, t(13) = .45,
p = .66; locative: misformations, t(13) = .27, p = .79; correct suppliance,
t(13) = .58, p = .57; oversuppliance, t(13) = 1.10, p = .29). Therefore, it is
possible to assume that the receivers and nonreceivers heard a comparable
amount of input from each other.
Table 4. Type and mean frequency of attempts

                                  Receivers       Nonreceivers
Structure  Production type        M      SD       M      SD
Plural     Misformations          1.29   1.45     1.17   1.39
           Correct suppliance     1.55   1.65     1.31   1.49
           Oversuppliance          .86   1.15      .74   1.11
Locative   Misformations          1.45   1.44     1.57   2.04
           Correct suppliance      .93   1.20      .83    .98
           Oversuppliance          .14    .39      .02    .09

Next, various t tests were conducted on the learners' self-ratings of
the extent to which they paid attention to the interaction between their
partner and the experimenter. To determine whether learners' self-ratings
were significantly different from chance, each group's mean score was
compared to the median value of the rating scale (i.e., five) using
one-sample t tests. The tests revealed significant differences (nonreceivers:
t(13) = 14.92, p < .001; receivers: t(13) = 6.41, p < .001). Paired-samples
t tests conducted to determine whether the self-ratings of the two
experimental groups differed from each other revealed no differences
between the groups (nonreceivers: M = 8.35, SD = .84; receivers: M = 7.78,
SD = 1.61; t(13) = 1.29, p = .165). These results show that both groups paid
attention to their partner's interaction with the experimenter, and that
the degree to which they paid attention to it did not differ between the
groups. Finally, two paired-samples t tests were conducted on learners'
self-ratings of the two additional manipulation-check statements. For
the statement "I noticed that the experimenter corrected my errors,"
the receivers' ratings were significantly higher than the nonreceivers'
ratings (nonreceivers: M = 5.29, SD = 2.92; receivers: M = 8.14, SD = 1.51;
t(13) = 3.68, p = .003). For the statement "I noticed that the experimenter
corrected my partner's errors," there was a significant difference between
the ratings in favor of the nonreceivers (nonreceivers: M = 6.43,
SD = 2.98; receivers: M = 3.43, SD = 2.62; t(13) = 2.88, p = .013). These
results show that the receivers and nonreceivers not only paid attention
to the interaction between their partners and the experimenter but
also noticed that the receivers were the ones that were corrected. Overall,
the analyses of the manipulation check statements revealed that it
was unlikely that the difference between the nonreceivers and receivers
in performance is attributable to the nonreceivers' failure to pay attention
to the interaction and feedback between the receivers and the
experimenter.
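The one-sample comparison against the scale midpoint can be retraced from the reported summary statistics. The sketch below is a hedged illustration: the function name is hypothetical, and n = 14 is inferred from the reported 13 degrees of freedom rather than stated in this section.

```python
import math


def one_sample_t(mean, sd, n, reference):
    """t statistic for testing a sample mean against a fixed reference value."""
    standard_error = sd / math.sqrt(n)
    return (mean - reference) / standard_error


# Nonreceivers' attention self-ratings as reported: M = 8.35, SD = .84.
# n = 14 is inferred from df = 13; 5 is the midpoint of the 1-9 scale.
t_nonreceivers = one_sample_t(mean=8.35, sd=0.84, n=14, reference=5)
print(round(t_nonreceivers, 2))
```

Because the inputs are rounded summary statistics, values recomputed this way can differ slightly from those obtained from the raw ratings.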
DISCUSSION
The hypothesis of the study predicted that the receivers would outperform
the nonreceivers as measured by their recognition and oral production
test scores. The hypothesis was confirmed for the plural
morpheme because the receivers outperformed the nonreceivers on all
the tests (immediate oral production and immediate and delayed recognition)
except for the delayed oral production test. Additional indirect
evidence confirming the hypothesis came from the comparisons of each
group with the control group. The receivers outperformed the control
group on all the tests, whereas the nonreceivers outperformed the control
group only on the oral production tests, not on the recognition tests.
The hypothesis, however, was not supported for the locative morpheme
because on neither of the tests were there differences between
the receivers and nonreceivers. The comparisons between each of the
groups and the control did not provide any support for the hypothesis
either. The unexpected finding that neither experimental
group outperformed the control group on the recognition test raises
doubts as to whether any substantial learning took place for the locative morpheme.
At least two factors may have contributed to the finding that the
receivers outperformed the nonreceivers on the plural morpheme.
The first is the communicative pressure the receivers might have
experienced due to their involvement in an information exchange
during which conveying their messages accurately was important. As
hypothesized by Long (2007), this type of communicative involvement
might have facilitated the noticing of new linguistic information in the
input by increasing learners' motivation to attend to the corrective feedback.
In the same vein, communicative pressure might have increased
learners' alertness to the incoming corrective feedback instances. According
to Tomlin and Villa (1994), alertness is one of the functions that is
involved in attention, along with orientation and detection. Alertness
refers to a general readiness to receive incoming stimuli. Orientation
means directing one's attentional resources to a particular type of
stimuli. Tomlin and Villa (1994) claimed that even though orientation and
alertness are not required for L2 learning, they could facilitate detection,
the third and most crucial function of attention in L2 learning,
which refers to "the cognitive registration of sensory stimuli" (p. 192).
Applying these terms to the current study, one can argue that participating
in and observing the interaction including feedback may have
created different levels of alertness and made the receivers more ready
to take advantage of the feedback. The feedback in this context could
be considered an orientation device with the potential to promote the
detection of the nontargetlike form and its targetlike version.
The second factor that may be at work simultaneously with or independent
of the first factor is the customized nature of the feedback for
the receivers' needs. Even though both experimental groups were given
equal production opportunities in this study, when feedback was provided,
it was the receiver's turn to describe the pictures. Having an
opportunity to receive feedback and to produce output immediately
after the feedback may have enhanced the effectiveness of the receivers'
hypothesis testing. They could use the information in the feedback to
modify their hypotheses about how the target form worked. Thanks to
this opportunity, the receivers could form new hypotheses by refining
and modifying the previous ones and could submit them to further
testing. It may be that the nonreceivers were not at an equally advantageous
position in fine-tuning their hypotheses. They lacked the opportunity
to put their newly formed hypotheses to the test. It is well
documented in previous studies that learners engage in hypothesis
testing as a result of producing output (Swain, 1995) and negotiating
meaning (Mackey, Gass, & McDonough, 2000; Pica, Holliday, Lewis, &
Morgenthaler, 1989). Nevertheless, the claim that more effective hypothesis
testing and increased alertness played a role in the receivers'
better performance should be seen as tentative. Direct evidence through
introspective methods (e.g., stimulated recalls) of data collection is
needed to confirm this claim. In addition, despite the overall advantage
of the receivers in regard to the plural morpheme, the fact that the
difference between the feedback exposure conditions disappeared on the
delayed posttest indicates that continuous exposure to feedback or
input may be necessary to help the receivers maintain their advantage.
With the locative morpheme, receiving feedback on one's own errors
offered no advantage over hearing the feedback given on someone
else's errors. This suggests that the effect of exposure condition on
learning varies depending on the target structure. One factor that may
explain this result is the difference between the structures' morphophonological
regularity. Morphophonological regularity is defined as "the
degree to which the functors are (or are not) affected by their phonological
environment" (Goldschneider & DeKeyser, 2001, p. 26). The locative
morpheme is morphophonologically less regular than the plural because
it has four allomorphs, whereas the plural has two. Learners may have
found the learning of the locative relatively more challenging because
they had to link four forms to one meaning. Another factor that could
help explain the interaction between exposure condition and target structure
is the similarity between learners' previously learned languages
(i.e., L1 and L2) and Turkish (i.e., L3) with respect to how plurality and
location are expressed. One cannot claim any similarity between Turkish
and Mandarin in this regard because Mandarin expresses plurality
and location through free morphemes, whereas Turkish does so through
inflectional suffixes.8 However, a different pattern emerges if English,
the L2 of the learners, is taken into consideration. English and Turkish
are similar in that both languages express plurality by inflectional suffixes.
However, they differ in the expression of location. Turkish uses a
suffix, whereas English uses free morphemes (i.e., prepositions) to
express location. The lack of correspondence between learners' L2 and
L3 (i.e., Turkish) as well as the low morphophonological regularity of
the locative morpheme may have put learners at a disadvantage in
learning the locative. These factors may have made this suffix more difficult
to notice and increased the number of hypotheses to be tested.
The experimental groups' scores for the locative items were significantly
higher than the control group's scores for the locative items only
in the oral production test; they were not significantly different in the
recognition test. This discrepancy could be attributed to the different
levels of knowledge measured by each test. In the oral production test,
learners were awarded credit not only for choosing the correct allomorph
(i.e., correct suppliance) but also for marking the nouns for location even
if they chose the incorrect allomorph (i.e., misformations). In the recognition
test, however, the only way to receive credit was to select the correct
allomorph. It is possible, then, that learners gained some knowledge
as to how to mark the nouns for location but did not gain the knowledge
necessary to mark the nouns with the correct allomorph.9
The most likely reasons for the limited learning of the locative morpheme
are the greater learning challenge the structure poses and the
ineffectiveness of the treatment. The duration of the treatment, type of
feedback, and the fact that learners learned the morpheme only through
corrective feedback may be some of the factors that contributed to the
ineffectiveness of the treatment. It could be that absolute beginners
need to be exposed to the locative for a longer period of time to distinguish
themselves from a control group. It is equally possible that
learners need to be exposed to feedback types that not only provide the
correct form but also explain the rule behind the structure. In addition,
the learners in this study were not provided any additional positive
evidence on the target forms (other than the positive evidence provided
through corrective feedback). Although this is an essential experimental
control feature for attributing the results to exposure condition alone, it
may have increased the difficulty of mapping form and meaning. Similarly,
these factors may have contributed to the fact that the effect of
exposure condition in the plural morpheme was less durable. It could
be that these factors decreased the effectiveness of the treatment and
prevented the receivers from keeping their advantages on the delayed
posttest. Future studies could consider increasing the length and number
of treatment sessions as well as the amount of feedback to increase the
effectiveness of feedback in the acquisition of the locative. In addition,
in future first-exposure studies, a controlled amount of positive evidence
could be presented prior to the administration of feedback tasks
to facilitate form-meaning mapping. It would be necessary to test
learners' knowledge after this exposure stage to make sure that the
amount of learning across conditions is comparable.
In this study, exposure condition was not isolated from the learners'
attempts to produce the target forms. The learners in each of the feedback
groups heard their partner's correct and incorrect attempts to
incorporate the knowledge he or she had gained from the corrective feedback.
Therefore, the feedback exposure variable included the ways that
learners attempted to use the correct form in their subsequent turns.
Nonequivalent numbers of attempts would potentially confound the
results. However, this was not the case, as documented by the follow-up
analysis showing that learners were exposed to comparable amounts of
targetlike and nontargetlike input. Although this allows one to claim
that the effect of exposure condition for the plural morpheme and the
lack of it for the locative morpheme cannot be attributed to differing
amounts of extra input learners heard from each other, it does not allow
one to make any claims about the effect of feedback in the absence of
this extra input. In other words, the results should be considered as
shedding light on receiver-nonreceiver differences in contexts in which
learners are allowed to hear each other's targetlike and nontargetlike
productions.
When interpreting the results, one needs to pay attention to the
specific features of the feedback used in this study. It is generally
accepted that feedback types differ along an implicit-explicit continuum.
Feedback types that present metalinguistic rules or indicate what
learners should have said have been considered explicit, whereas the
ones that do not present this type of information have been considered
implicit. According to this distinction, the feedback used in the present
study is relatively more explicit than feedback types, such as clarification requests or recasts, that do not directly indicate what the learners
should have said and is different from other explicit feedback types that
provide metalinguistic rules. It is possible that the unique nature of
explicit correction contributed to the exposure effect observed for the
plural. A more implicit feedback type or a feedback type that provides
metalinguistic rules, therefore, could lead to different results.
Moreover, Lyster (2004), using a different classification criterion,
has distinguished between feedback types that prompt learners to self-repair their errors and feedback types that provide the correct form.
Because the learners in this study were absolute beginners, it was not
possible to use feedback types that do not provide correct reformulations. It should be noted, however, that future studies targeting learners
who are more proficient and using feedback types that do not provide
correct reformulations may produce different results from the results of
this study. In addition, the feedback used in this study was intensive,
provided on a specific set of target structures, and planned (i.e., the
provision of feedback was determined before the task). Therefore, the
findings of this study may not be readily generalized to contexts in
which feedback is extensive (i.e., provided on many different linguistic
targets with incidental feedback).¹⁰
A methodological issue that deserves attention is the question of
which approach should be used to determine whether the nonreceivers
paid attention to the interaction or feedback between the receivers and
the experimenter. The technique used in this study relied on learners'
subjective evaluations of their experience because it involved learners'
self-ratings of three manipulation-check statements. To obtain a more
complete picture of learners' experience, the data from this technique
can be triangulated with data collected with techniques that are more
objective. For example, Mackey (1999) administered a post hoc comprehension check to the observers (nonreceivers) in their L1 to make sure
that the observers paid attention to the input. After the experiment, the
observers were asked to recall some of the information provided by the
NS to the interactors (receivers) during the experiment. The type of
tasks used in Mackey (1999) allowed her to use such a technique. In her
tasks, the NS was in the role of an information supplier, and task completion depended on the comprehension of this information by the
learners. In contrast, in the present study, the learners were the information suppliers during the tasks; therefore, there was no context for
the learners to comprehend the information provided by the NS. Future
research could incorporate Mackey's (1999) technique by designing
tasks in which learners need to comprehend the information provided
by the NS.
The findings of this study can inform classroom contexts in which
feedback is given during task-based pair work interactions more than
contexts in which feedback is given during whole-class activities. It has
been shown that task-based pair work interactions in classrooms are
comparable to task-based pair work interactions in lab contexts (Gass,
Mackey, & Ross-Feldman, 2011). However, one needs to be careful when
applying the findings of this study to whole-class activities. In large
language classes in which feedback is given to one learner during a
whole-class activity, there is usually more than one nonreceiver, and the
chances of these nonreceivers perceiving and processing the feedback
vary depending on their physical distance from the receiver. Another
difference between the receiver and nonreceiver roles in this study and
those in classroom contexts is that in classroom contexts learners
could be in both of the roles, whereas in this study learners were always
in only one role. Future research comparing groups in which learners
are allowed to switch their roles to groups that are allowed to be in only
one role can provide valuable insights as to whether being in the
receiver role all the time is necessary to take advantage of the feedback
to its fullest extent.
CONCLUSION
The results of this study showed that when all other things are equal,
learners benefit from feedback most when feedback is given on their
own errors. This was found to be the case specifically in the context of
acquiring the structure that is morphophonologically regular and similar in bound or free status to the form that expresses plurality in their
L2 (i.e., English). It was speculated that communicative pressure and
the customized nature of the feedback could be responsible for this
finding. Regarding the locative, however, there was no effect for exposure condition. The fact that learners in each experimental group outperformed the control group for the locative morpheme on only one of
the tests suggests that a limited amount of learning of this structure
took place. Thus, it could not be determined whether the lack of effect
was genuine or a byproduct of the difficulty of the learning task. Overall,
it seems that there is an interaction between exposure condition and
target structure.
Future research in this area is necessary to confirm or disconfirm the
findings of this study. There are several benefits to pursuing this line of
research. For instance, it would help the field determine whether
learning activities should maximize learners' oral production opportunities
to improve their chances of receiving feedback on their own errors.
In addition, this line of research can help identify the cognitive processes that are involved in L2 learning through feedback. In this study,
the receivers' higher performance in the plural morpheme was attributed to more effective hypothesis testing and increased alertness. If this
interpretation is confirmed by future research, the proposed cognitive
processes could be used to explain how learners take advantage of
feedback when they receive feedback on their own errors. Finally, the
findings of this line of research have important implications for corrective feedback research methodology. A recurrent finding showing an
advantage for the receiver role would suggest that studies in which
both receiver and nonreceiver roles are present should control for the
number of feedback instances learners receive in each role to minimize
the potential confounding effects of feedback exposure condition on the
variables under investigation.
Received 6 February 2014
Accepted 26 September 2014
Final Version Received 17 October 2014
NOTES
1. These studies also had premodified input conditions in which directions were read
to learners from a text containing repetitions and paraphrases of the task items.
2. Unlike in the previous studies, the groups were not called interactors/participators/performers and observers because the groups were not different with respect to
whether they participated in interaction (both groups produced output and interacted
with the researcher). Rather, the groups were called receivers and nonreceivers because
what made these groups different was receiving or not receiving feedback on their own
errors.
3. Following Norris and Ortega's (2000) research design recommendations, a true
control group was used. Norris and Ortega (2000) defined a true control group as a group
receiving neither instruction nor exposure related to the target structure except in pretests and posttest (p. 446). They also argued that a true control group is a more powerful
research tool in identifying effects attributable to instruction than a comparison group in
which learners receive alternative treatments. The members of the control group in this
study were not exposed to the target structure, but they were allowed to carry out the
tasks the receivers and nonreceivers carried out. Therefore, the difference between the
control group and the feedback groups can be attributed to feedback because output
opportunities were controlled for.
4. In addition, it was assumed that a feedback type such as explicit correction would
be relatively less likely to favor one of the conditions because it is unambiguous in its
corrective intent and focus.
5. Note that Turkish does not require a copula in nominal sentences.
6. The experimenter held the floor after the feedback by asking a task-relevant question as quickly as possible.
7. A 1-week interval was given between the first and second sessions for the following reasons: (a) It was assumed that giving an interval between the sessions would
better represent classroom contexts because learners usually receive feedback over
several different lessons rather than in a single lesson, and (b) given that learners were
absolute beginners, it was hypothesized that this interval length would maximize
learning by giving them enough time to fine-tune their working hypotheses about the
target structures.
8. To be more precise, Mandarin uses adverbs in the clause or numerals, adjectives,
or noun modifiers in the noun phrase to express plurality (Xu, 2012). In addition, to describe
the location of an object with respect to a reference point, the preposition zài is used
together with a location noun, which is followed by a locative particle (Sun, 2006).
9. This interpretation is supported by the comparison of misformation and correct
suppliance scores. Separate proportion scores for misformations and correct suppliance
were calculated by dividing the instances of misformations or correct suppliance by the
total number of obligatory contexts. The comparison revealed that the misformation
scores (immediate: receivers M = .42, SD = .30, nonreceivers M = .48, SD = .32; delayed:
receivers M = .31, SD = .27, nonreceivers M = .36, SD = .31) were much higher than the correct
suppliance scores (immediate: receivers M = .26, SD = .23, nonreceivers M = .22, SD = .16;
delayed: receivers M = .17, SD = .18, nonreceivers M = .16, SD = .17), indicating that the use
of incorrect allomorphs was more frequent than the use of correct allomorphs.
10. In this study, the receivers were not given the opportunity to modify their output
after feedback, in an attempt to match the receivers and nonreceivers on as many factors
as possible. It was assumed that allowing the receivers to modify their output might give
them an advantage, because it was not natural for the nonreceivers to produce modified
output. According to an alternative view pointed out by an anonymous reviewer, allowing
receivers to respond to the explicit feedback could benefit nonreceivers in understanding
the rules underlying the structures because their attentional resources would not be consumed by the interaction. Future research could consider manipulating modified output
opportunities for receivers to find out the most favorable conditions for nonreceivers to
benefit from feedback.
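The proportion scores described in Note 9 (instances of a response type divided by the total number of obligatory contexts) can be sketched as follows. This is a minimal illustration in Python, not the analysis script used in the study; the function name `proportion_score` and the learner counts are hypothetical.

```python
def proportion_score(instances: int, obligatory_contexts: int) -> float:
    """Return the share of obligatory contexts in which a response type occurred."""
    if obligatory_contexts <= 0:
        raise ValueError("obligatory_contexts must be positive")
    if not 0 <= instances <= obligatory_contexts:
        raise ValueError("instances must lie between 0 and obligatory_contexts")
    return instances / obligatory_contexts


# Hypothetical learner: 20 obligatory contexts for the locative on one posttest.
obligatory = 20
misformation = proportion_score(9, obligatory)        # incorrect allomorph supplied
correct_suppliance = proportion_score(5, obligatory)  # correct allomorph supplied

print(f"misformation: {misformation:.2f}, correct suppliance: {correct_suppliance:.2f}")
```

A misformation score that exceeds the correct suppliance score for the same learner indicates that incorrect allomorphs were used more often than correct ones, which is the pattern Note 9 reports at the group level.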
REFERENCES
Ammar, A., & Spada, N. (2006). One size fits all? Recasts, prompts, and L2 learning. Studies
in Second Language Acquisition, 28, 543–574.
Andersen, R. W. (1984). The one to one principle of interlanguage construction. Language
Learning, 34, 77–95.
Carroll, S. (2001). Input and evidence: The raw material of second language acquisition.
Philadelphia, PA: Benjamins.
Carroll, S., & Swain, M. (1993). Explicit and implicit negative feedback: An empirical study
of the learning of linguistic generalizations. Studies in Second Language Acquisition,
15, 357–386.
Chaudron, C. (1977). A descriptive model of discourse in the corrective treatment of
learners' errors. Language Learning, 27, 29–46.
Doughty, C. J. (2001). Cognitive underpinnings of focus on form. In P. Robinson (Ed.),
Cognition and second language instruction (pp. 206–257). Cambridge, UK: Cambridge
University Press.
Doughty, C., & Varela, E. (1998). Communicative focus on form. In C. Doughty &
J. Williams (Eds.), Focus on form in classroom second language acquisition (pp. 114–138).
New York, NY: Cambridge University Press.
Egi, T. (2007). Interpreting recasts as linguistic evidence: The roles of linguistic target,
length, and degree of change. Studies in Second Language Acquisition, 29, 511–537.
Ellis, R. (1991). Grammar teaching practice or consciousness-raising? In R. Ellis (Ed.), Second language acquisition and second language pedagogy (pp. 232–241). Clevedon, UK:
Multilingual Matters.
Ellis, R. (2005). Measuring implicit and explicit knowledge of a second language: A psychometric study. Studies in Second Language Acquisition, 27, 141–172.
Ellis, R. (2007). The differential effects of corrective feedback on two grammatical structures. In A. Mackey (Ed.), Conversational interaction in second language acquisition
(pp. 339–360). New York, NY: Oxford University Press.
Ellis, R., Loewen, S., & Erlam, R. (2006). Implicit and explicit corrective feedback and the
acquisition of L2 grammar. Studies in Second Language Acquisition, 28, 339–368.
Ellis, R., Tanaka, Y., & Yamazaki, A. (1994). Classroom interaction, comprehension, and the
acquisition of L2 word meanings. Language Learning, 44, 449–491.
Field, A. P. (2009). Discovering statistics using SPSS. London, UK: Sage.
Gass, S. (1997). Input, interaction and the second language learner. Mahwah, NJ: Erlbaum.
Gass, S. M., & Mackey, A. (2006). Input, interaction and output: An overview. AILA Review,
19, 3–17.
Gass, S., Mackey, A., & Ross-Feldman, L. (2011). Task-based interactions in classroom and
laboratory settings. Language Learning, 61, 189–220.
Goldschneider, J. M., & DeKeyser, R. M. (2001). Explaining the natural order of L2 morpheme acquisition in English: A meta-analysis of multiple determinants. Language
Learning, 51, 1–50.
Goo, J. (2012). Corrective feedback and working memory capacity in interaction-driven
L2 learning. Studies in Second Language Acquisition, 34, 445–474.
Goo, J., & Mackey, A. (2013). The case against the case against recasts. Studies in Second
Language Acquisition, 35, 127–165.
Krashen, S. (1981). Second language acquisition and second language learning. Oxford, UK:
Oxford University Press.
Leeman, J. (2003). Recasts and second language development. Studies in Second Language
Acquisition, 25, 37–63.
Li, S. (2010). The effectiveness of corrective feedback in SLA: A meta-analysis. Language
Learning, 60, 309–365.
Loewen, S., & Nabei, T. (2007). Measuring the effects of oral corrective feedback on L2
knowledge. In A. Mackey (Ed.), Conversational interaction in second language acquisition (pp. 361–377). New York, NY: Oxford University Press.
Loewen, S., & Philp, J. (2006). Recasts in the adult L2 classroom: Characteristics, explicitness and effectiveness. Modern Language Journal, 90, 536–556.
Long, M. H. (1981). Input, interaction, second-language acquisition. In H. Winitz (Ed.),
Native language and foreign language acquisition (pp. 259–278). New York, NY:
New York Academy of Sciences.
Long, M. (1991). Focus on form: A design feature in language teaching methodology. In
K. de Bot, R. Ginsberg, & C. Kramsch (Eds.), Foreign language research in cross-cultural
perspective (pp. 39–52). Amsterdam, the Netherlands: Benjamins.
Long, M. H. (1996). The role of the linguistic environment in second language acquisition. In W. Ritchie & T. K. Bhatia (Eds.), Handbook of second language acquisition
(pp. 413–468). New York, NY: Academic Press.
Long, M. H. (2007). Problems in SLA. Mahwah, NJ: Erlbaum.
Long, M. H., & Robinson, P. (1998). Focus on form: Theory, research, and practice. In
C. Doughty & J. Williams (Eds.), Focus on form in classroom SLA (pp. 15–41). New York,
NY: Cambridge University Press.
Lyster, R. (2004). Differential effects of prompts and recasts in form-focused instruction.
Studies in Second Language Acquisition, 26, 399–432.
Lyster, R., & Izquierdo, J. (2009). Prompts versus recasts in dyadic interaction. Language
Learning, 59, 453–498.
Lyster, R., & Ranta, L. (1997). Corrective feedback and learner uptake. Studies in Second
Language Acquisition, 19, 37–66.
Lyster, R., & Ranta, L. (2013). Counterpoint piece: The case for variety in corrective feedback research. Studies in Second Language Acquisition, 35, 167–184.
Lyster, R., & Saito, K. (2010). Oral feedback in classroom SLA: A meta-analysis. Studies in
Second Language Acquisition, 32, 265–302.
Mackey, A. (1999). Input, interaction, and second language development: An empirical
study of question formation in ESL. Studies in Second Language Acquisition, 21,
557–587.
Mackey, A., Gass, S., & McDonough, K. (2000). How do learners perceive interactional
feedback? Studies in Second Language Acquisition, 22, 471–497.
Mackey, A., & Goo, J. (2007). Interaction research in SLA: A meta-analysis and research
synthesis. In A. Mackey (Ed.), Conversational interaction in SLA: A collection of empirical studies (pp. 408–452). New York, NY: Oxford University Press.
Mackey, A., & Philp, J. (1998). Conversational interaction and second language development: Recasts, responses, and red herrings? Modern Language Journal, 82,
338–356.
Mackey, A., Philp, J., Egi, T., Fujii, A., & Tatsumi, T. (2002). Individual differences in working
memory, noticing of interactional feedback and L2 development. In P. Robinson (Ed.),
Individual differences and instructed language learning (pp. 181–209). Philadelphia, PA:
Benjamins.
McDonough, K. (2005). Identifying the impact of negative feedback and learners'
responses on ESL question development. Studies in Second Language Acquisition,
27, 79–103.
Muranoi, H. (2000). Focus on form through interaction enhancement: Integrating formal
instruction into a communicative task in EFL classrooms. Language Learning, 50,
617–673.
Norris, J. M., & Ortega, L. (2000). Effectiveness of instruction: A research synthesis and
quantitative meta-analysis. Language Learning, 50, 417–528.
Ono, L., & Witzel, J. (2002). Recasts, salience, and morpheme acquisition. Unpublished manuscript, University of Hawaii, Honolulu, HI.
Ortega, L., & Long, M. H. (1997). The effects of models and recasts on the acquisition of
object topicalization and adverb placement in L2 Spanish. Spanish Applied Linguistics,
1, 65–86.
Pica, T. (1988). Interlanguage adjustments as outcome of NS-NNS negotiated interaction.
Language Learning, 38, 45–73.
Pica, T. (1992). The textual outcomes of native speaker–nonnative speaker negotiation:
What do they reveal about second language learning? In C. Kramsch & S. McConnell-Ginet (Eds.), Text and context: Cross-disciplinary perspectives on language study (pp.
198–237). Lexington, MA: D. C. Heath.
Pica, T., Holliday, L., Lewis, N., & Morgenthaler, L. (1989). Comprehensible output as an
outcome of linguistic demands on the learner. Studies in Second Language Acquisition,
11, 63–90.
Russell, J., & Spada, N. (2006). The effectiveness of corrective feedback for second
language acquisition: A meta-analysis of the research. In J. Norris & L. Ortega (Eds.),
Synthesizing research on language learning and teaching (pp. 133–164). Amsterdam,
the Netherlands: Benjamins.
Schmidt, R. (2001). Attention. In P. Robinson (Ed.), Cognition and second language instruction (pp. 3–32). New York, NY: Cambridge University Press.
Schwartz, B. D. (1993). On explicit and negative evidence effecting and affecting competence and linguistic behavior. Studies in Second Language Acquisition, 15, 147–163.
Sheen, Y. (2006). Exploring the relationship between characteristics of recasts and learner
uptake. Language Teaching Research, 10, 361–392.
Sheen, Y. (2007). The effects of corrective feedback, language aptitude, and learner attitudes on the acquisition of English articles. In A. Mackey (Ed.), Conversational interaction in second language acquisition (pp. 301–322). New York, NY: Oxford University
Press.
Sun, C. (2006). Chinese: A linguistic introduction. Cambridge, UK: Cambridge University
Press.
Swain, M. (1995). Three functions of output in second language learning. In G. Cook &
B. Seidlhofer (Eds.), Principle and practice in applied linguistics: Studies in honor of
H. G. Widdowson (pp. 125–144). Oxford, UK: Oxford University Press.
Tomlin, R., & Villa, V. (1994). Attention in cognitive science and second language acquisition. Studies in Second Language Acquisition, 16, 183–204.
Trofimovich, P., Ammar, A., & Gatbonton, E. (2007). How effective are recasts? The role of
attention, memory, and analytical ability. In A. Mackey (Ed.), Conversational interaction in second language acquisition (pp. 144–171). Oxford, UK: Oxford University
Press.
White, L. (1991). Adverb placement in second language acquisition: Some positive and
negative evidence in the classroom. Second Language Research, 7, 133–161.
Xu, D. (2012). Plurality and classifiers across languages in China. Berlin, Germany: de
Gruyter.
Yilmaz, Y. (2012). The relative effects of explicit correction and recasts on two target
structures via two communication modes. Language Learning, 62, 1134–1169.
Yilmaz, Y. (2013a). The relative effectiveness of mixed, explicit and implicit feedback.
System, 41, 691–705.
Yilmaz, Y. (2013b). Relative effects of explicit and implicit feedback: The role of working
memory capacity and language analytic ability. Applied Linguistics, 34, 344–368.

APPENDIX A
LIST OF WORDS
Balon 'balloon', bavul 'suitcase', bere 'hat', beyaz 'white', büyük 'big',
defter 'notebook', ekmek 'bread', elma 'apple', etek 'skirt', gemi
'boat', gri 'grey', inek 'cow', kafes 'cage', kahve 'brown', kama 'knife',
kamyon 'truck', kavun 'melon', kazak 'sweater', kedi 'cat', kemer
'belt', kemik 'bone', kirmizi 'red', kitap 'book', küçük 'small', masa
'table', mavi 'blue', miki 'mickey', motor 'motorbike', sapan 'slingshot', sedir 'armchair', sepet 'basket', sinek 'fly', siyah 'black', sopa
'stick', tabak 'plate', tepsi 'tray', torba 'trash bag', yatak 'bed', yesil
'green'.

APPENDIX B
NORMALITY TEST RESULTS
Structure   Test              Time        Statistic   df   P value
Plural      Recognition       Immediate   .91         42   .002
Plural      Recognition       Delayed     .87         42   .000
Plural      Oral production   Immediate   .76         42   .000
Plural      Oral production   Delayed     .66         42   .000
Locative    Recognition       Immediate   .90         42   .001
Locative    Recognition       Delayed     .93         42   .013
Locative    Oral production   Immediate   .74         42   .000
Locative    Oral production   Delayed     .76         42   .000
