
Operant Conditioning (B.F. Skinner)
Overview:
The theory of B.F. Skinner is based upon the idea that learning is a function of change in overt behavior. Changes in behavior are the result of an individual's response to events (stimuli) that occur in the environment. A response produces a consequence such as defining a word, hitting a ball, or solving a math problem. When a particular Stimulus-Response (S-R) pattern is reinforced (rewarded), the individual is conditioned to respond. The distinctive characteristic of operant conditioning relative to previous forms of behaviorism (e.g., Thorndike, Hull) is that the organism can emit responses instead of only eliciting responses due to an external stimulus.
Reinforcement is the key element in Skinner's S-R theory. A reinforcer is anything that strengthens the desired response. It could be verbal praise, a good grade, or a feeling of increased accomplishment or satisfaction. The theory also covers negative reinforcers -- any stimulus that results in the increased frequency of a response when it is withdrawn (different from aversive stimuli -- punishment -- which result in reduced responses). A great deal of attention was given to schedules of reinforcement (e.g., interval versus ratio) and their effects on establishing and maintaining behavior.
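The difference between ratio and interval schedules can be sketched in a few lines of Python. This is a toy model invented for illustration (the function name and the one-reinforcer-per-criterion logic are assumptions, not anything from Skinner's apparatus):

```python
def reinforcers_earned(response_times, schedule):
    """Toy model of two reinforcement schedules (illustrative only).

    response_times: sorted times (seconds) at which the organism responds.
    schedule: ("fixed_ratio", n)    -> every n-th response is reinforced
              ("fixed_interval", t) -> the first response at least t seconds
                                       after the previous reinforcer is
                                       reinforced (the very first response
                                       counts, there being no prior reward)
    Returns the number of reinforcers delivered.
    """
    kind, param = schedule
    if kind == "fixed_ratio":
        return len(response_times) // param
    if kind == "fixed_interval":
        count, last_reward = 0, float("-inf")
        for t in response_times:
            if t - last_reward >= param:
                count += 1
                last_reward = t
        return count
    raise ValueError(f"unknown schedule: {kind}")

# Responding twice per second for 50 seconds:
times = [i * 0.5 for i in range(100)]
print(reinforcers_earned(times, ("fixed_ratio", 10)))     # 10
print(reinforcers_earned(times, ("fixed_interval", 10)))  # 5
```

On a ratio schedule, faster responding earns reinforcers faster; on an interval schedule it does not, which is one reason the two schedules produce different patterns of responding.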
One of the distinctive aspects of Skinner's theory is that it attempted to provide behavioral explanations for a broad range of cognitive phenomena. For example, Skinner explained drive (motivation) in terms of deprivation and reinforcement schedules. Skinner (1957) tried to account for verbal learning and language within the operant conditioning paradigm, although this effort was strongly rejected by linguists and psycholinguists. Skinner (1971) deals with the issue of free will and social control.
Scope/Application:
Operant conditioning has been widely applied in clinical settings (i.e., behavior modification) as well as teaching (i.e., classroom management) and instructional development (e.g., programmed instruction). Parenthetically, it should be noted that Skinner rejected the idea of theories of learning (see Skinner, 1950).
Example:
By way of example, consider the implications of reinforcement theory as applied to the development of programmed instruction (Markle, 1969; Skinner, 1968):
1. Practice should take the form of question (stimulus) - answer (response) frames which expose the student to the subject in gradual steps.
2. Require that the learner make a response for every frame and receive immediate feedback.
3. Try to arrange the difficulty of the questions so the response is always correct and hence a positive reinforcement.
4. Ensure that good performance in the lesson is paired with secondary reinforcers such as verbal praise, prizes, and good grades.
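As a rough illustration (not drawn from Markle or Skinner; the function and the frame wording are invented for this sketch), the four points above can be expressed as a minimal drill loop:

```python
def run_frames(frames, answer_fn):
    """Minimal programmed-instruction loop illustrating the four points above:
    small stimulus-response frames, a response required for every frame,
    immediate feedback, and praise (a secondary reinforcer) for success.

    frames: list of (question, expected_answer) pairs.
    answer_fn: stands in for the learner, e.g. `input` in an interactive run.
    Returns the number of correct responses.
    """
    correct = 0
    for question, expected in frames:           # gradual steps
        response = answer_fn(question)          # a response for every frame
        if response.strip().lower() == expected.lower():
            print(f"{question} -> Correct!")    # immediate positive feedback
            correct += 1
        else:
            print(f"{question} -> Not quite; the answer is {expected}.")
    if correct == len(frames):                  # secondary reinforcer
        print("Perfect score - well done!")
    return correct

frames = [
    ("2 + 2 = ?", "4"),
    ("A reinforcer ___ the desired response. (strengthens/weakens)", "strengthens"),
]
run_frames(frames, lambda q: "4" if "2 + 2" in q else "strengthens")
```

A real programmed-instruction system would sequence frames adaptively; the point here is only the frame-response-feedback cycle.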
Principles:
1. Behavior that is positively reinforced will reoccur; intermittent reinforcement is particularly effective.
2. Information should be presented in small amounts so that responses can be reinforced ("shaping").
3. Reinforcements will generalize across similar stimuli ("stimulus generalization"), producing secondary conditioning.
References:
Markle, S. (1969). Good Frames and Bad (2nd ed.). New York: Wiley.
Skinner, B.F. (1950). Are theories of learning necessary? Psychological Review, 57(4), 193-216.
Skinner, B.F. (1953). Science and Human Behavior. New York: Macmillan.
Skinner, B.F. (1954). The science of learning and the art of teaching. Harvard Educational Review, 24(2), 86-97.
Skinner, B.F. (1957). Verbal Behavior. New York: Appleton-Century-Crofts.
Skinner, B.F. (1968). The Technology of Teaching. New York: Appleton-Century-Crofts.
Skinner, B.F. (1971). Beyond Freedom and Dignity. New York: Knopf.
Related Web Sites:
There are two journals that contain current behaviorist research: the Journal of the Experimental Analysis of Behavior (JEAB) and the Journal of Applied Behavior Analysis. While the work reported in these journals is not necessarily Skinnerian, much of it does continue the legacy of Skinner's ideas. A bibliography and access to Skinner's works is provided by the B.F. Skinner Foundation. More background on operant conditioning can be found at
http://www.simplypsychology.org/operant-conditioning.html

B.F. Skinner was a controversial and interesting psychologist who championed behaviorism and made important contributions to learning theory and the principles of behavior modification.
Burrhus Frederic Skinner was a well-known and controversial 20th century researcher and teacher
who is associated with a school of psychology known as behaviorism.
Fred, as his family called him, was born on March 20, 1904 in Susquehanna, Pennsylvania, a small
railroad town just south of the New York state border.
His parents were Grace and William Skinner, a couple quite concerned with outward appearances
and social respectability. William was an attorney for the Erie Railroad. Grace was actively involved
with numerous civic organizations, primarily to promote the family image. According to her son, she
derived little pleasure from them.
Of the couple's two sons, Fred's younger brother Edward was the more obedient, charming and
socially adept. Edward's misdeeds were often overlooked, while Fred's were always punished.
Despite this apparent favoritism, Fred enjoyed great freedom to wander about doing whatever he
liked. He was resourceful, creating imaginative gizmos as playthings or as solutions to his youthful
problems. One such gadget helped him avoid his mother's displeasure, making a sign pop up when
he forgot to hang up his pajamas.
In later years, Fred would have opposed the use of words like "curiosity", "intelligence", or
"creativity", to characterize his childhood ingenuity. Fred believed that his resourcefulness was an
acquired behavior "shaped" gradually by the environment around him. Accidental successes and
discoveries "reinforced" his continued experimentation.
In 1922, Fred graduated as salutatorian from Susquehanna High School and was admitted to Hamilton College in Clinton, New York. The liberal arts college placed a great emphasis on writing skills, and Fred felt that he would like to become a writer.
He continued to be socially awkward, often appearing aloof and pretentious to his classmates. He
was uncomfortably aware of his inability to fit in with the other students, and later remarked that he
had turned his entire freshman class against him with a critical remark. Adding to his difficulties
adjusting to school, Fred's younger brother Edward suddenly died of a cerebral hemorrhage.
Fred graduated from Hamilton in 1926, again as salutatorian of his class. At about the same time,
his grandfather passed away. Fred wrote a dispassionate, clinical account of his grandfather's death.
He was unable to involve his emotions in his writing - a profound handicap for a would-be author.
He found himself questioning his life philosophy, and casting about for new answers to his questions
about life and death.
In August, he read about the founder of behaviorism, John B. Watson, for the first time. Behaviorism was the early 20th century's answer to the criticism that psychology was not a true science.
Watson eliminated the study of motivation, mental processes and emotions from behavioral
psychology, focusing instead on the study of observable, measurable behavior.
Fred Skinner's mind was primed for a change. Increasingly, the perspective put forth by behaviorists
made sense to him.
In the fall of 1928, Skinner returned to school, this time entering Harvard University for graduate studies in psychology. In the informal atmosphere at Harvard, Skinner at last began to come into his own.
There he built a device capable of precisely measuring and recording the number of times a rat
pressed a bar to receive a food pellet. This box, along with the attached recording equipment,
provided a way to collect more objective data about behavior than scientists had been able to gather
before. The device came to be known as the "Skinner box".
Skinner's innovations were viewed with both admiration and suspicion by Harvard faculty.
Introspective psychology was dominant at Harvard, and behaviorism appeared to belittle studies of
the inner workings of the mind. The head of the Harvard psychology department, Edwin Boring, was
uncomfortable with the direction in which Skinner's studies were going. To Boring's credit, he
consciously tried not to be an obstacle to Skinner's advancement.
In 1931, Skinner received his PhD from Harvard. He remained there for several more years, conducting research. In 1937, he was offered a teaching/research position at the University of Minnesota at Minneapolis. He had met Yvonne Blue, his future wife, the previous year. In November of 1937, shortly before moving to Minneapolis to begin his new career, the two were married.
Skinner's brand of behaviorism was becoming more radical with time. It was fortunate that the
University of Minnesota was not dominated by any particular school of psychology, and was
therefore somewhat open to his brand of behaviorism.
Basically, Skinner modified the tenets of behaviorism to fit his own discoveries, which involved what he called "operant conditioning." "Conditioning" is the scientific term for learning. "Operant" refers to Skinner's idea that any organism "operates" on its environment - that is, performs actions that change the environment around it for better or for worse. Operant psychology is based on the idea that an action taken by a person or an animal often has consequences that occur naturally in the environment. This principle is called "operant conditioning". Reinforcement is something that makes it more likely that a given behavior will be repeated. The consequences of a given action either reinforce the behavior or do not.

For example, if a child makes faces at the teacher in school, the laughter of the other children may serve to reinforce his behavior. If the teacher punishes him by making him write, "I will not make faces" one hundred times on the chalkboard, the child may avoid such antics in the future. Thus, the child initiates the behavior, and factors in the environment either reward or punish his behavior.
Skinner did not worry much about which consequence was the stronger one. He believed that if a
behavior was reinforced, it was apt to be repeated. Skinner believed that positive reinforcement was
more effective than punishment. He also believed that the reinforcement must come swiftly.
Experimenters using Skinner's techniques have taught birds and animals to perform any number of
unnatural actions. We have all seen chickens playing toy pianos or dogs climbing ladders, acting like
firemen. These peculiar behaviors are taught through a process called "shaping."
For example, a chicken is at first rewarded if it turns slightly in the direction of the piano. As it begins
to turn toward the piano more frequently, it begins to be rewarded only when it looks directly at the
piano or moves toward it. Eventually it is rewarded only when it touches the piano, and so forth.
This shaping of behavior, or "successive approximation", has proven to be a very successful teaching technique. It has been adapted to teach people to overcome phobias or other disruptive behaviors.
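The criterion-tightening logic of successive approximation described above can be sketched as a toy loop. The function name and the behavior labels are invented for illustration and correspond loosely to the chicken-and-piano example:

```python
def shape(behavior_stream, targets):
    """Toy model of shaping by successive approximation.

    targets: an ordered list of ever-closer approximations to the final
    behavior. Only the current criterion is reinforced; once it is met,
    the criterion tightens to the next approximation.
    Returns the list of reinforced (step, behavior) events.
    """
    criterion = 0
    reinforced = []
    for step, behavior in enumerate(behavior_stream):
        if criterion < len(targets) and behavior == targets[criterion]:
            reinforced.append((step, behavior))  # deliver the reward
            criterion += 1                       # demand a closer approximation
    return reinforced

targets = ["turns toward piano", "looks at piano", "touches piano"]
stream = ["pecks floor", "turns toward piano", "turns toward piano",
          "looks at piano", "pecks floor", "touches piano"]
print(shape(stream, targets))
# [(1, 'turns toward piano'), (3, 'looks at piano'), (5, 'touches piano')]
```

Note how the second "turns toward piano" at step 2 goes unrewarded: once the criterion has tightened, earlier approximations no longer earn reinforcement.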
Skinner's beliefs and techniques were not radical enough in themselves to cause the storm of
controversy that eventually began to swirl around him. One factor contributing to this storm was the
"baby tender".
The baby tender was a device Skinner invented to keep his second daughter, Deborah, in a safe, thermostatically controlled environment while he worked. It was the high-tech equivalent of a playpen, but was misunderstood and construed as a diabolical device that Skinner was using to experiment upon his hapless child. He was accused of keeping Deborah, who became known as "the baby in the box," inside the baby tender for three years, depriving her of fresh air and human companionship. Although this was far from the truth, magazine articles painted Skinner as an unfeeling, inhumane parent.
In 1971, Skinner published a book that would prove to be even more shocking to the American public. In "Beyond Freedom and Dignity", Skinner challenged the very foundation of the American
belief system. He dismissed the notion that individual freedom existed. Man's actions were nothing
more than a set of behaviors that were shaped by his environment, over which he had no control.
Such views, even if they had been completely understood in the context of Skinner's work, flew in
the face of what most Americans held dear. They removed admired attributes from man -- free will,
dignity, and conscious thought -- and replaced them with behaviors that were shaped by an
environment over which individual man had little or no control.
Skinner's penchant for substituting his own special vocabulary for words that he felt might be misunderstood probably contributed to the controversies that flared up around him. Since most
people had no idea what he was talking about, these words did not clarify his ideas, but rather
confused his listeners.
When he advocated the use of operant conditioning techniques to control and engineer human
behavior, the idea smacked of tyranny and abuse of power. Skinner responded that all behavior is
already controlled by factors in the environment, and that society needed to manage some of those
factors.
Therapists have taken Skinner's ideas and used them to help people overcome phobias and other
maladaptive behavior. They are helping people control their actions without using the emotionally
charged language that got Skinner into so much hot water.
Psychologists have disproven the idea that a cat can always be trained to perform the same tasks as
a pigeon. Instead, certain species seem to be pre-wired to perform certain types of tasks, while other
species may be unable to learn them, despite their physical ability to do so.
Immediate rewards are no longer considered to be the best reinforcers under all conditions, although
they play an important role in many types of learning. Today, scientists acknowledge that learning
involves more complicated combinations of factors. Sometimes a delayed reward is more effective
than an immediate one. A combination of reward and punishment can also speed learning.
Programmed teaching materials providing immediate feedback to students' responses are utilized in
today's classrooms to effectively teach certain types of material. Skinner's ideas have also been
adopted to teach mentally retarded and autistic children, are used in industry to reduce job
accidents, and are used in numerous applications in health-related fields.
B. F. Skinner died of leukemia on August 18, 1990, at the age of 86.
In spite of some flaws in B.F. Skinner's views, the principles of operant conditioning still play an
important role in the way we approach learning and behavior modification today.


A biography of the late B.F. Skinner, an American whose theory of Behaviorism had an enormous impact on the science of Psychology.
One of the most controversial and influential figures in the last century, Burrhus Frederic (B.F.) Skinner was born in Susquehanna, Pa., USA in 1904. He graduated with a major in literature from Hamilton College, New York in 1926 and initially had aspirations to be a professional writer.
He quickly realized that this was not the vocation for him and around the same time that he decided
not to pursue writing for a living he discovered the theories of the great Behaviorist, John B. Watson.
Watson's thesis was to have a major impact on Skinner and would direct the rest of his life.
Watson founded Behaviorism in 1913, when Skinner was just 9 years old, because he, like many
other researchers and critical thinkers of his day, had become disillusioned with the Structuralist
school and their method of studying behavior via 'introspection'. Watson believed that the theories of
Darwin (evolution) and Functionalism were the way forward. He was impressed by the work of
Russian researchers on reflexes in dogs, most notably the findings of Ivan Pavlov. Watson
concluded that basic reflexes could be developed into learned responses directly influenced by
stimuli in the environment.
The principal question Watson asked was 'what useful purpose does behavior serve?' - Behaviorism was born. In this field of study only observable behaviors were to be examined, and a wider range of subjects could now be studied (unlike Structuralism), including animals, children, the retarded and the insane.
Skinner returned to college, spurred on by his enthusiasm for Behaviorism, and he obtained his PhD
in Experimental Psychology in 1931 from Harvard University. By this stage, Watson had left
Psychology completely after rumors of an affair with his research assistant. Skinner took up the
mantle with enthusiasm and was to have much more influence in both psychology and critical
thought than Watson could have ever imagined.
B.F. Skinner held a strict behaviorist viewpoint, advocating that operant (instrumental) learning was more important than Pavlov's Classical Conditioning. In Classical Conditioning, naturally occurring behaviors or reflexes are paired with a neutral stimulus, e.g. dogs naturally salivate when they are presented with food. Pavlov discovered that if a bell was rung (the neutral stimulus) when the dogs were fed, they would eventually salivate every time a bell was sounded, regardless of whether food was presented or not.

In operant (instrumental) conditioning, learning occurs as a result of reinforcement, where specific rewards or punishments are implemented in order to achieve or dissuade the behavior to be changed. Skinner went further than Watson in that he firmly suggested that the study of learning should only be concerned with observable stimuli and responses - 'thought', 'feeling', 'motivational factors' etc. were deemed 'unobservable' and therefore not measurable - and that mental events were themselves behaviors and not causes. He had no time for such concepts, and it was this strict behaviorist viewpoint that brought him both admirers and ardent critics.
Skinner wrote that if humans were to be changed, even saved, then the environment itself must be
changed and not the 'inner self', via a specifically chosen pattern of rewards and punishments. He
believed that it was possible to have large-scale control over human behavior and that the belief that
people were 'free agents' was simply wrong. To Skinner, therefore, the environment was THE key,
because it was this that molded behavior.
Most of Skinner's work was carried out on animals, principally rats and pigeons, and it was from their
behaviors that he would infer how humans also behave. The controlled chamber he designed to
study learning in rats, where the animals press levers for food rewards, was named after him - the
Skinner Box.
He also designed an open baby crib - called the Air Crib - which was constructed using clear Plexiglas sides (rather than the usual wooden bars), because he felt it was essential that infants should see the world clearly and not in a restricted fashion. His own children were reared in these cribs, and later one of his daughters would also use the Air Crib with her own children.
Commercially though, the Air Crib was a huge flop - people believed that Skinner was treating humans in the same way as he treated his rats - in boxes. This reaction, in retrospect, was probably due to a lack of understanding about what Skinner intended.
Skinner published a great number of articles and books during his lifetime, the two most widely read
being 'Walden Two' (1948) and 'Beyond Freedom and Dignity' (1971). They were to provoke a
massive protest from many scientists who disputed the behaviorist concept that humans were simply
'reactors' to events in their environments.
Skinner's work influenced thinking in many different fields of psychology and his views in two
principal areas will now be briefly highlighted:
On language: Skinner assumed that children were born as 'blank slates' or 'tabulae rasae' and that they learn language via shaping the sounds they hear from their caregivers into words and eventually sentences through selective reinforcement. This viewpoint was most avidly criticized by Noam Chomsky (1968, 1980), who argued for an innate 'Language Acquisition Device' or 'LAD', whereby newborns are biologically programmed for language learning.
On Personality: Skinner said (1977, p. 10): "I see no evidence for an inner world of mental life
relative either to an analysis of behavior as a function of environmental forces or to the physiology of
the nervous system". So, once again it was the external environment, plus the past learning history
of the individual, which was said to 'shape' their personality. New research has shown this line of
thought to be flawed - children are born with certain temperamental characteristics and it is both
genetics and environment that shape personality.
Conclusion:
B.F. Skinner's work had a major effect not only on Psychology but also in how 20th Century thought
evolved. He had his avid supporters and fiercest critics. Many of his critics had never even read his
work, but simply rejected it outright, mainly because of Skinner's inferences of human behavior from
his research on animals. He spent his life fighting in the Behaviorist corner until his death in 1990.
Undisputedly, he will be remembered for his important findings, for advancing the study of learning, and for his legacy: showing how best to channel critical thinking and which pitfalls we would do well to avoid.
Operant conditioning
From Wikipedia, the free encyclopedia
Operant conditioning is the use of a behavior's antecedent and/or its consequence to influence the occurrence and form of behavior. Operant conditioning is distinguished from classical conditioning (also called respondent conditioning) in that operant conditioning deals with the modification of "voluntary behavior" or operant behavior. Operant behavior "operates" on the environment and is maintained by its consequences, while classical conditioning deals with the conditioning of reflexive (reflex) behaviors which are elicited by antecedent conditions. Behaviors conditioned via a classical conditioning procedure are not maintained by consequences.[1]

Reinforcement, punishment, and extinction
Reinforcement and punishment, the core tools of operant conditioning, are either positive (delivered following a response) or negative (withdrawn following a response). This creates a total of four basic consequences, with the addition of a fifth procedure known as extinction (i.e. no change in consequences following a response).
It is important to note that actors are not spoken of as being reinforced, punished, or extinguished; it is the actions that are reinforced, punished, or extinguished. Additionally, reinforcement, punishment, and extinction are not terms whose use is restricted to the laboratory. Naturally occurring consequences can also be said to reinforce, punish, or extinguish behavior, and are not always delivered by people.
- Reinforcement is a consequence that causes a behavior to occur with greater frequency.
- Punishment is a consequence that causes a behavior to occur with less frequency.
- Extinction is the lack of any consequence following a behavior. When a behavior is inconsequential (i.e., producing neither favorable nor unfavorable consequences) it will occur with less frequency. When a previously reinforced behavior is no longer reinforced with either positive or negative reinforcement, it leads to a decline in the response.
Four contexts of operant conditioning
Here the terms positive and negative are not used in their popular sense, but rather: positive refers to addition, and negative refers to subtraction.
What is added or subtracted may be either reinforcement or punishment. Hence positive punishment is sometimes a confusing term, as it denotes the "addition" of a stimulus or increase in the intensity of a stimulus that is aversive (such as spanking or an electric shock). The four procedures are:
1. Positive reinforcement (Reinforcement) occurs when a behavior (response) is followed by a stimulus that is appetitive or rewarding, increasing the frequency of that behavior. In the Skinner box experiment, a stimulus such as food or a sugar solution can be delivered when the rat engages in a target behavior, such as pressing a lever.
2. Negative reinforcement (Escape) occurs when a behavior (response) is followed by the removal of an aversive stimulus, thereby increasing that behavior's frequency. In the Skinner box experiment, negative reinforcement can be a loud noise continuously sounding inside the rat's cage until it engages in the target behavior, such as pressing a lever, upon which the loud noise is removed.
3. Positive punishment (Punishment) (also called punishment by contingent stimulation) occurs when a behavior (response) is followed by a stimulus, such as introducing a shock or loud noise, resulting in a decrease in that behavior.
4. Negative punishment (Penalty) (also called punishment by contingent withdrawal) occurs when a behavior (response) is followed by the removal of a stimulus, such as taking away a child's toy following an undesired behavior, resulting in a decrease in that behavior.
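Since positive/negative and reinforcement/punishment are two independent dimensions (stimulus added vs. removed, behavior strengthened vs. weakened), the four procedures reduce to a lookup over those dimensions. A quick Python sketch (the function and argument names are invented for illustration):

```python
def classify(stimulus_change, behavior_change):
    """Map the two dimensions of operant procedures onto their names.

    stimulus_change: "added" or "removed" (the consequence).
    behavior_change: "increases" or "decreases" (the effect on the
    behavior's future frequency).
    """
    table = {
        ("added",   "increases"): "positive reinforcement",
        ("removed", "increases"): "negative reinforcement",
        ("added",   "decreases"): "positive punishment",
        ("removed", "decreases"): "negative punishment",
    }
    return table[(stimulus_change, behavior_change)]

print(classify("added", "increases"))    # food pellet after a lever press
print(classify("removed", "increases"))  # loud noise stops after a press
print(classify("added", "decreases"))    # shock after a response
print(classify("removed", "decreases"))  # toy taken away after misbehavior
```

The table makes the terminology point concrete: "positive" and "negative" name the stimulus change, while "reinforcement" and "punishment" name the behavioral effect.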
Also:
- Avoidance learning is a type of learning in which a certain behavior results in the cessation of an aversive stimulus. For example, performing the behavior of shielding one's eyes when in the sunlight (or going indoors) will help avoid the aversive stimulation of having light in one's eyes.
- Extinction occurs when a behavior (response) that had previously been reinforced is no longer effective. In the Skinner box experiment, this is the rat pushing the lever and being rewarded with a food pellet several times, and then pushing the lever again and never receiving a food pellet again. Eventually the rat would cease pushing the lever.
- Noncontingent reinforcement refers to delivery of reinforcing stimuli regardless of the organism's (aberrant) behavior. The idea is that the target behavior decreases because it is no longer necessary to receive the reinforcement. This typically entails time-based delivery of stimuli identified as maintaining aberrant behavior, which serves to decrease the rate of the target behavior.[2] As no measured behavior is identified as being strengthened, there is controversy surrounding the use of the term noncontingent "reinforcement".[3]
- Shaping is a form of operant conditioning in which increasingly accurate approximations of a desired response are reinforced.[4]
- Chaining is an instructional procedure which involves reinforcing individual responses occurring in a sequence to form a complex behavior.[4]
Thorndike's law of effect
Main article: Law of effect
Operant conditioning, sometimes called instrumental conditioning or instrumental learning, was first extensively studied by Edward L. Thorndike (1874-1949), who observed the behavior of cats trying to escape from home-made puzzle boxes.[5] When first constrained in the boxes, the cats took a long time to escape. With experience, ineffective responses occurred less frequently and successful responses occurred more frequently, enabling the cats to escape in less time over successive trials. In his law of effect, Thorndike theorized that successful responses, those producing satisfying consequences, were "stamped in" by the experience and thus occurred more frequently. Unsuccessful responses, those producing annoying consequences, were stamped out and subsequently occurred less frequently. In short, some consequences strengthened behavior and some consequences weakened behavior. Thorndike produced the first known learning curves through this procedure. B.F. Skinner (1904-1990) formulated a more detailed analysis of operant conditioning based on reinforcement, punishment, and extinction. Following the ideas of Ernst Mach, Skinner rejected Thorndike's mediating structures required by "satisfaction" and constructed a new conceptualization of behavior without any such references. So, while experimenting with some homemade feeding mechanisms, Skinner invented the operant conditioning chamber, which allowed him to measure rate of response as a key dependent variable using a cumulative record of lever presses or key pecks.[6]
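A cumulative record simply accumulates total responses against time, so its slope is the rate of response Skinner measured. A toy reconstruction in Python (illustrative only; this is not the chamber's actual recording mechanism, and the function name is an assumption):

```python
def cumulative_record(press_times, resolution=1.0, duration=10.0):
    """Build a cumulative record: at each time step of width `resolution`,
    the total number of responses observed so far. A steep stretch of the
    record means a high rate of response; a flat stretch means pausing.
    """
    steps = int(duration / resolution)
    record = []
    total = 0
    times = sorted(press_times)
    i = 0
    for s in range(1, steps + 1):
        t = s * resolution
        while i < len(times) and times[i] <= t:
            total += 1
            i += 1
        record.append(total)
    return record

# Five lever presses over eight seconds, sampled once per second:
print(cumulative_record([0.5, 1.2, 1.3, 4.0, 7.5], duration=8.0))
# [1, 3, 3, 4, 4, 4, 4, 5]
```

Plotting such a record against time reproduces the stepped curves from the chamber's pen recorder, where each reinforcement schedule leaves a characteristic slope pattern.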

Biological correlates of operant conditioning
The first scientific studies identifying neurons that responded in ways that suggested they encode for conditioned stimuli came from work by Mahlon deLong[7][8] and by R.T. "Rusty" Richardson.[8] They showed that nucleus basalis neurons, which release acetylcholine broadly throughout the cerebral cortex, are activated shortly after a conditioned stimulus, or after a primary reward if no conditioned stimulus exists. These neurons are equally active for positive and negative reinforcers, and have been demonstrated to cause plasticity in many cortical regions.[9] Evidence also exists that dopamine is activated at similar times. There is considerable evidence that dopamine participates in both reinforcement and aversive learning.[10] Dopamine pathways project much more densely onto frontal cortex regions. Cholinergic projections, in contrast, are dense even in the posterior cortical regions like the primary visual cortex. A study of patients with Parkinson's disease, a condition attributed to the insufficient action of dopamine, further illustrates the role of dopamine in positive reinforcement.[11] It showed that while off their medication, patients learned more readily with aversive consequences than with positive reinforcement. Patients who were on their medication showed the opposite to be the case, positive reinforcement proving to be the more effective form of learning when the action of dopamine is high.
Factors that alter the effectiveness of consequences
When using consequences to modify a response, the effectiveness of a consequence can be increased or decreased by various factors. These factors can apply to either reinforcing or punishing consequences.
1. Satiation/Deprivation: The effectiveness of a consequence will be reduced if the individual's appetite for that source of stimulation has been satisfied. Inversely, the effectiveness of a consequence will increase as the individual becomes deprived of that stimulus. If someone is not hungry, food will not be an effective reinforcer for behavior. Satiation is generally only a potential problem with primary reinforcers, those that do not need to be learned, such as food and water.
2. Immediacy: After a response, how immediately a consequence is then felt determines the effectiveness of the consequence. More immediate feedback will be more effective than less immediate feedback. If someone's license plate is caught by a traffic camera for speeding and they receive a speeding ticket in the mail a week later, this consequence will not be very effective against speeding. But if someone is speeding and is caught in the act by an officer who pulls them over, then their speeding behavior is more likely to be affected.
3. Contingency: If a consequence does not contingently (reliably, or consistently) follow the target response, its effectiveness upon the response is reduced. But if a consequence follows the response consistently after successive instances, its ability to modify the response is increased. The schedule of reinforcement, when consistent, leads to faster learning. When the schedule is variable, the learning is slower. Extinction is more difficult when learning occurs during intermittent reinforcement, and easier when learning occurs during a highly consistent schedule.
4. Size: This is a cost-benefit determinant of whether a consequence will be effective. If the size or amount of the consequence is large enough to be worth the effort, the consequence will be more effective upon the behavior. An unusually large lottery jackpot, for example, might be enough to get someone to buy a one-dollar lottery ticket (or even to buy multiple tickets). But if a lottery jackpot is small, the same person might not feel it to be worth the effort of driving out and finding a place to buy a ticket. In this example, it is also useful to note that "effort" is a punishing consequence. How these opposing expected consequences (reinforcing and punishing) balance out will determine whether the behavior is performed or not.
Most of these factors exist for biological reasons. The biological purpose of the Principle of Satiation is to maintain the organism's homeostasis. When an organism has been deprived of sugar, for example, the effectiveness of the taste of sugar as a reinforcer is high. However, as the organism reaches or exceeds its optimum blood-sugar levels, the taste of sugar becomes less effective, perhaps even aversive.
The Principles of Immediacy and Contingency exist for neurochemical reasons. When an organism experiences a reinforcing stimulus, dopamine pathways in the brain are activated. This network of pathways "releases a short pulse of dopamine onto many dendrites, thus broadcasting a rather global reinforcement signal to postsynaptic neurons."[12] This results in the plasticity of these synapses, allowing recently activated synapses to increase their sensitivity to efferent signals, hence increasing the probability of occurrence for the recent responses preceding the reinforcement. These responses are, statistically, the most likely to have been the behavior responsible for successfully achieving reinforcement. But when the application of reinforcement is either less immediate or less contingent (less consistent), the ability of dopamine to act upon the appropriate synapses is reduced.
Operant variability
Operant variability is what allows a response to adapt to new situations. Operant behavior is distinguished from reflexes in that its response topography (the form of the response) is subject to slight variations from one performance to another. These slight variations can include small differences in the specific motions involved, differences in the amount of force applied, and small changes in the timing of the response. If a subject's history of reinforcement is consistent, such variations will remain stable because the same successful variations are more likely to be reinforced than less successful variations. However, behavioral variability can also be altered when subjected to certain controlling variables.[13]
Avoidance learning
Avoidance learning belongs to negative reinforcement schedules. The subject learns that a certain response will result in the termination or prevention of an aversive stimulus. There are two kinds of commonly used experimental settings: discriminated and free-operant avoidance learning.
Discriminated avoidance learning
In discriminated avoidance learning, a novel stimulus such as a light or a tone is followed by an aversive stimulus such as a shock (CS-US, similar to classical conditioning). During the first trials (called escape trials), the animal usually experiences both the CS (conditioned stimulus) and the US (unconditioned stimulus), showing the operant response to terminate the aversive US. During later trials, the animal will learn to perform the response already during the presentation of the CS, thus preventing the aversive US from occurring. Such trials are called "avoidance trials."
Free-operant avoidance learning
In this experimental setting, no discrete stimulus is used to signal the occurrence of the aversive stimulus. Rather, the aversive stimulus (mostly shocks) is presented without explicit warning stimuli. There are two crucial time intervals determining the rate of avoidance learning. The first one is called the S-S interval (shock-shock interval): the amount of time which passes between successive presentations of the shock (unless the operant response is performed). The other one is called the R-S interval (response-shock interval), which specifies the length of the time interval following an operant response during which no shocks will be delivered. Note that each time the organism performs the operant response, the R-S interval without shocks begins anew.
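The two intervals above define a simple scheduling rule, which can be sketched in code as follows. This is a minimal illustration; the function name and the interval values (S-S = 5 s, R-S = 20 s) are hypothetical choices, not values from any particular experiment.

```python
# Sketch of the free-operant avoidance scheduling rule described above.
# Shocks recur every S-S interval; each operant response postpones the
# next shock to R-S seconds after that response, restarting the clock.

def next_shock_time(last_shock, last_response, ss_interval, rs_interval):
    """Return the time at which the next shock is due."""
    if last_response is not None and last_response > last_shock:
        # A response occurred since the last shock: the R-S interval applies.
        return last_response + rs_interval
    # No response since the last shock: the S-S interval applies.
    return last_shock + ss_interval

# Hypothetical values: S-S = 5 s, R-S = 20 s.
print(next_shock_time(0.0, None, 5.0, 20.0))  # 5.0  (no response yet)
print(next_shock_time(0.0, 3.0, 5.0, 20.0))   # 23.0 (response at t = 3)
```

Note how any response made before the deadline pushes the next shock out by the full R-S interval, which is why steady responding can postpone shocks indefinitely.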
Two-process theory of avoidance
This theory was originally established to explain learning in discriminated avoidance learning. It assumes that two processes take place:
a) Classical conditioning of fear.
During the first trials of the training, the organism experiences both CS and aversive US (escape trials). The theory assumes that during those trials classical conditioning takes place by pairing the CS with the US. Because of the aversive nature of the US, the CS is supposed to elicit a conditioned emotional reaction (CER), fear. In classical conditioning, presenting a CS conditioned with an aversive US disrupts the organism's ongoing behavior.
b) Reinforcement of the operant response by fear-reduction.
Because during the first process the CS signaling the aversive US has itself become aversive by eliciting fear in the organism, reducing this unpleasant emotional reaction serves to motivate the operant response. The organism learns to make the response during the CS, thus terminating the aversive internal reaction elicited by the CS. An important aspect of this theory is that the term "avoidance" does not really describe what the organism is doing: it does not "avoid" the aversive US in the sense of anticipating it. Rather, the organism escapes an aversive internal state caused by the CS.
Verbal Behavior
Main article: Verbal Behavior (book)
In 1957, Skinner published Verbal Behavior, a theoretical extension of the work he had pioneered since 1938. This work extended the theory of operant conditioning to human behavior previously assigned to the areas of language, linguistics and other areas. Verbal Behavior is the logical extension of Skinner's ideas, in which he introduced new functional relationship categories such as intraverbals, autoclitics, mands, tacts and the controlling relationship of the audience. All of these relationships were based on operant conditioning and relied on no new mechanisms despite the introduction of new functional categories.
Four-term contingency
Applied behavior analysis, which is the name of the discipline directly descended from Skinner's work, holds that behavior is explained in four terms: a conditional stimulus (S^C), a discriminative stimulus (S^D), a response (R), and a reinforcing stimulus (S^rein or S^r for reinforcers, sometimes S^ave for aversive stimuli).[14]
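For readers who think in code, the four terms can be represented as a simple record. This is an illustrative sketch only: the field names and the example values are hypothetical, not standard ABA notation.

```python
# Hypothetical sketch: a four-term contingency as a plain record.
from dataclasses import dataclass

@dataclass
class FourTermContingency:
    conditional_stimulus: str     # S^C: context in which the contingency holds
    discriminative_stimulus: str  # S^D: signal that reinforcement is available
    response: str                 # R: the operant behavior
    consequence: str              # S^rein / S^ave: reinforcing or aversive stimulus

# Example: a lever-press contingency in an operant chamber.
lever_training = FourTermContingency(
    conditional_stimulus="inside operant chamber",
    discriminative_stimulus="light on",
    response="lever press",
    consequence="food pellet (S^rein)",
)
print(lever_training.response)  # lever press
```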
Operant hoarding
Operant hoarding refers to the choice made by a rat, on a compound schedule called a multiple schedule, that maximizes its rate of reinforcement in an operant conditioning context. More specifically, rats were shown to have allowed food pellets to accumulate in a food tray by continuing to press a lever on a continuous reinforcement schedule instead of retrieving those pellets. Retrieval of the pellets always instituted a one-minute period of extinction during which no additional food pellets were available, but those that had been accumulated earlier could be consumed. This finding appears to contradict the usual finding that rats behave impulsively in situations in which there is a choice between a smaller food object right away and a larger food object after some delay. See schedules of reinforcement.[15]
An alternative to the law of effect
However, an alternative perspective has been proposed by R. Allen and Beatrix Gardner.[16][17] Under this idea, which they called "feedforward," animals learn during operant conditioning by simple pairing of stimuli, rather than by the consequences of their actions. Skinner asserted that a rat or pigeon would only manipulate a lever if rewarded for the action, a process he called "shaping" (reward for approaching and then manipulating a lever).[18] However, in order to prove the necessity of reward (reinforcement) in lever pressing, a control condition where food is delivered without regard to behavior must also be conducted. Skinner never published this control group. Only much later was it found that rats and pigeons do indeed learn to manipulate a lever when food comes irrespective of behavior. This phenomenon is known as autoshaping.[19] Autoshaping demonstrates that the consequence of an action is not necessary in an operant conditioning chamber, and it contradicts the law of effect. Further experimentation has shown that rats naturally handle small objects, such as a lever, when food is present.[20] Rats seem to insist on handling the lever when free food is available (contra-freeloading)[21][22] and even when pressing the lever leads to less food (omission training).[23][24] Whenever food is presented, rats handle the lever, regardless of whether lever pressing leads to more food. Therefore, handling a lever is a natural behavior that rats perform as a preparatory feeding activity, and in turn, lever pressing cannot logically be used as evidence for reward or reinforcement occurring. In the absence of evidence for reinforcement during operant conditioning, the learning which occurs during operant experiments is actually only Pavlovian (classical) conditioning. The dichotomy between Pavlovian and operant conditioning is therefore an inappropriate separation.

Theories of Learning in Educational Psychology
B. F. Skinner and Operant Conditioning


B. F. Skinner (1904-1990)
Operant Conditioning
Biography

Burrhus Frederic Skinner was born March 20, 1904, in Susquehanna, Pennsylvania. Burrhus received his BA in English from Hamilton College in upstate New York. After some traveling, he decided to go back to school and earned his master's in psychology in 1930 and his doctorate in 1931, both from Harvard University, and stayed there to do research until 1936.

In 1936 he moved to Minneapolis to teach at the University of Minnesota. There he met and soon married Yvonne Blue. In 1945 another move took him to the psychology department at Indiana University, where he became department chair. In 1948 he was invited back to Harvard, where he remained for the rest of his life. He was a very active man, doing research and guiding hundreds of doctoral candidates as well as writing many books.

On August 18, 1990, B. F. Skinner died of leukemia after becoming perhaps the most celebrated psychologist since Sigmund Freud.

Skinner accepted the model of classical conditioning as originated by Pavlov and elaborated on by Watson and Guthrie, but he thought this type of conditioning only explained a small portion of human and animal behavior. He thought that the majority of responses by humans do not result from obvious stimuli. The notion of reinforcement had been introduced by Thorndike, and Skinner developed this idea much further.

Skinner's Theory: Operant Conditioning

B. F. Skinner's system is based on operant conditioning. The organism, while going about its everyday activities, is in the process of "operating" on the environment. In the course of its activities, the organism encounters a special kind of stimulus, called a reinforcing stimulus, or simply a reinforcer. This special stimulus has the effect of increasing the behavior occurring just before the reinforcer. This is operant conditioning: "the behavior is followed by a consequence, and the nature of the consequence modifies the organism's tendency to repeat the behavior in the future." A behavior followed by a reinforcing stimulus results in an increased probability of that behavior occurring in the future.

Skinner's observations can be divided into independent variables, which can be manipulated by the experimenter, and dependent variables, which cannot be manipulated by the experimenter and are thought to be affected by the independent variables.

Independent variables:
- Type of reinforcement
- Schedule of reinforcement

Dependent variables (measures of learning):

- Acquisition rate: how rapidly an animal can be trained to a new operant behavior as a function of reinforcement. Skinner typically deprived his lab animals of food for 24 or more hours before beginning a schedule of reinforcement. This tended to increase acquisition rate.

- Rate of response: this is a measure of learning that is very sensitive to different schedules of reinforcement. In most cases, animals were given intermittent schedules of reinforcement, so they were called upon to emit the desired response at other times as well. Rate of response is a measure of correct responses throughout a testing schedule, including the times when reinforcement is not provided after a correct response. It appears as if test animals build expectations when they are given rewards at predictable times. (Animals which are fed at the same time each day become active as that time approaches, and a dog whose master comes home at the same time each day becomes more attentive around that time of day.) Also, Skinner found that when fixed-interval reinforcement was used, the desired behavior would decrease or disappear just after a reinforcement, but when it was almost time for the next reinforcement, the animal would resume the desired responses.

- Extinction rate: the rate at which an operant response disappears following the withdrawal of reinforcement. Skinner found that continuous reinforcement schedules produced a faster rate of learning in the early stages of a training program, and also a more rapid extinction rate once the reinforcement was discontinued. A behavior no longer followed by the reinforcing stimulus results in a decreased probability of that behavior occurring in the future.

Types of reinforcement:

1. Primary reinforcement: instinctive behaviors lead to satisfaction of basic survival needs such as food, water, sex, and shelter. No learning takes place because the behaviors emerge spontaneously.
2. Secondary reinforcement: the reinforcer is not reinforcing by itself but becomes reinforcing when paired with a primary reinforcer, such as pairing a sound or a light with food.
3. Generalized reinforcement: stimuli become reinforcing through repeated pairing with primary or secondary reinforcers. Many are culturally reinforced. For example, in human behavior, wealth, power, fame, strength and intelligence are valued in many cultures. The external symbols of these attributes are generalized reinforcers. Money, rank, recognition, degrees, certificates, etc. are strongly reinforcing to many individuals in the cultures that value the attributes they symbolize.

Reinforcers always follow a behavior, can be pleasant or unpleasant (noxious), and can be added to or removed from a situation. The following table summarizes the various combinations:

Add to a situation after a response:

- Pleasant: Positive reinforcement (reward). Increases the probability of the same response occurring again. (Examples: praise, monetary reward, food.)

- Noxious: Punishment. Administering a painful or unpleasant reinforcer after an unwanted response decreases the probability of the same response occurring again. (Examples: corporal punishment, electrical shocks, yelling.)

Remove from a situation after a response:

- Pleasant: Punishment. Decreases the probability of the same response occurring again. (Example: punishing a teenager by taking away his cell phone or car keys.)

- Noxious: Negative reinforcement. Removing or decreasing an unpleasant or painful situation after a desirable response is produced increases the probability of the same response occurring again. (Example: time off for good behavior.)
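The four combinations above form a 2x2 classification (pleasant or noxious stimulus, added or removed), which can be sketched as a small lookup. The function and key names are an illustrative assumption, not standard terminology in code.

```python
# Sketch: the 2x2 scheme above, keyed by (stimulus quality, operation).
# Note that both "noxious added" and "pleasant removed" count as punishment.
CONSEQUENCE_TYPES = {
    ("pleasant", "added"):   "positive reinforcement",
    ("noxious",  "added"):   "punishment",
    ("pleasant", "removed"): "punishment",
    ("noxious",  "removed"): "negative reinforcement",
}

def classify(stimulus_quality: str, operation: str) -> str:
    """Name the procedure for a given stimulus quality and operation."""
    return CONSEQUENCE_TYPES[(stimulus_quality, operation)]

print(classify("pleasant", "added"))   # positive reinforcement
print(classify("noxious", "removed"))  # negative reinforcement
```

The lookup makes the symmetry explicit: reinforcement (of either sign) increases response probability, punishment decreases it.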

Schedules of reinforcement:
- Continuous reinforcement: reinforcement is given every time the animal gives the desired response.
- Intermittent reinforcement: reinforcement is given only part of the times the animal gives the desired response.
- Ratio reinforcement: a predetermined proportion of responses will be reinforced.
- Fixed-ratio reinforcement: reinforcement is given on a regular ratio, such as every fifth time the desired behavior is produced.
- Variable (random) ratio reinforcement: reinforcement is given for a predetermined proportion of responses, but randomly instead of on a fixed schedule.
- Interval reinforcement: reinforcement is given after a predetermined period of time.
- Fixed-interval reinforcement: reinforcement is given on a regular schedule, such as every five minutes.
- Variable-interval reinforcement: reinforcement is given after random amounts of time have passed.

In animal studies, Skinner found that continuous reinforcement in the early stages of training seems to increase the rate of learning. Later, intermittent reinforcement keeps the response going longer and slows extinction.
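As a rough illustration of how the two ratio schedules above differ, they can be simulated with a few lines of code. The schedule parameter n = 5 is an arbitrary choice, and the sketch treats a variable-ratio schedule as reinforcing each response with probability 1/n, which matches it only on average.

```python
# Sketch: decide whether a given response earns reinforcement under
# fixed-ratio (FR) and variable-ratio (VR) schedules.
import random

def fixed_ratio(n, response_count):
    """FR-n: reinforce exactly every n-th response."""
    return response_count % n == 0

def variable_ratio(n, rng=random):
    """VR-n approximation: reinforce each response with probability 1/n,
    so on average every n-th response is reinforced."""
    return rng.random() < 1.0 / n

# FR-5: of the first 10 responses, only the 5th and 10th are reinforced.
rewards = [fixed_ratio(5, i) for i in range(1, 11)]
print(rewards.count(True))  # 2

# VR-5: over many responses, roughly 1 in 5 is reinforced.
random.seed(0)
hits = sum(variable_ratio(5) for _ in range(10_000))
print(hits / 10_000)  # close to 0.2
```

The unpredictability of the variable-ratio rule is what, per the text above, keeps responding going longer and slows extinction.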



Skinner specifically addressed the applications of behaviorism and operant conditioning to educational practice. He believed that the goal of education was to train learners in survival skills for self and society. The role of the teacher was to reinforce behaviors that contributed to survival skills and to extinguish behaviors that did not. Behaviorist views have shaped much of contemporary education, in both child and adult learning.

Learning Theory Bibliography

Boeree, C. G. (1998). B. F. Skinner. Retrieved September 19, 2003, from http://www.ship.edu/~cgboeree/skinner.html
Lefrancois (1972)
Santrock (1988)
Merriam & Caffarella (1991)
