
CURRENT DIRECTIONS IN PSYCHOLOGICAL SCIENCE 53


Vocal Expression and Perception of Emotion

Jo-Anne Bachorowski1
Department of Psychology, Vanderbilt University, Nashville, Tennessee

Abstract
Speech is an acoustically rich signal that provides considerable personal information about talkers. The expression of emotions in speech sounds and corresponding abilities to perceive such emotions are both fundamental aspects of human communication. Findings from studies seeking to characterize the acoustic properties of emotional speech indicate that speech acoustics provide an external cue to the level of nonspecific arousal associated with emotional processes and, to a lesser extent, the relative pleasantness of experienced emotions. Outcomes from perceptual tests show that listeners are able to accurately judge emotions from speech at rates far greater than expected by chance. More detailed characterizations of these production and perception aspects of vocal communication will necessarily involve knowledge about differences among talkers, such as those components of speech that provide comparatively stable cues to individual talkers’ identities.

Keywords
emotion; speech acoustics; vocal communication

The speech stream is a highly complex and variable signal that is most directly studied by analyzing its acoustic properties, or sound patterns. We know from everyday experience that talkers provide information about their emotional states through the acoustic properties of their speech. For instance, many of us have experienced talking in an unwittingly loud voice when feeling gleeful, speaking in an uncharacteristically high-pitched voice when greeting a sexually desirable person, or talking with marked vocal tremor while giving a public speech. In turn, listeners are seemingly adept at making accurate evaluations of emotional states, even in the absence of visual cues, as routinely occurs during telephone conversations.

Production and perception phenomena are both facets of a broad research area concerned with understanding the ways in which speech acoustics provide personal information about talkers, such as gender and individual identity, independent of linguistic content. This article provides an overview of the links between speech acoustics and emotions (for more detailed reviews, see Pittam & Scherer, 1993, and Scherer, 1989). Some limitations of traditional approaches to this research area, and alternative ways of thinking about enduring problems, are also discussed.

Copyright © 1999 American Psychological Society


54 VOLUME 8, NUMBER 2, APRIL 1999

SPEECH ACOUSTICS

The source-filter theory of speech production is helpful for understanding the ways in which speech acoustics might provide information about emotional state (see Kent, 1997, for a thorough introduction to speech acoustics). In this framework, speech sounds result from the combination of source energy, produced by vibration of the vocal folds (formerly referred to as the vocal cords), and the subsequent filtering of that energy by the vocal tract above the larynx.

Source-related acoustic cues refer to those aspects of speech sounds that are primarily associated with vocal-fold vibration. In emotions research, measures associated with F0 (i.e., the fundamental frequency of speech, which corresponds to the rate of vocal-fold vibration and is perceived as vocal pitch) are the most commonly used. Other potentially important source measures include jitter and shimmer, which correspond to variability in the frequency and amplitude of vocal-fold vibration, respectively. Filter-related cues are examined less often by emotions researchers. However, these cues may be important for understanding emotional speech because facial expression (e.g., lip position) can influence filtering effects. Thus, a sentence spoken while smiling can sound different from the same sentence spoken while frowning. These kinds of acoustic differences are reflected in formants, which are vocal-tract resonances that correspond to the frequencies amplified through vocal-tract filtering. Another way of thinking about resonances is that they are the natural frequencies that are selectively reinforced because of the size and shape of the vocal tract (see Kent, 1997). Both source- and filter-related cues are sensitive to changes in vocal-production-related physiology, such as the fluctuations in respiration and muscle tension that can occur in conjunction with some emotions (Scherer, 1989).

VOCAL EMOTION FROM A PRODUCTION STANDPOINT

Most production-related investigations have been guided by the assumption that distinct patterns of acoustic cues will be found to be associated with discrete emotional states. Largely for practical reasons, these investigations have typically analyzed the emotional speech produced by small numbers of actors or naive subjects asked to portray various emotions while producing linguistically neutral utterances. For both theoretical and practical reasons, most analyses of emotional speech have focused on source-related acoustic cues. For these cues, a restricted yet fairly reliable pattern of findings has emerged. For example, Scherer, Banse, Wallbott, and Goldbeck (1991; also see Banse & Scherer, 1996; Leinonen, Hiltunen, Linnankoski, & Laakso, 1997) examined the acoustic features of neutral and emotional nonsense sentences spoken by four actors. In comparison with neutral speech, portrayals of fear, joy, and anger were each associated with a higher mean F0, whereas portrayals of sadness were associated with a lower mean F0. A corresponding pattern was observed for vocal intensity, or amplitude.

Across studies, portrayals of emotions associated with high levels of physiological arousal (e.g., anger, fear, anxiety, and joy) have been associated with increases in mean F0, F0 variability, and vocal intensity. Some acoustic differentiation among these emotions has been found by examining F0 contours, or the pattern of F0 changes over the course of an utterance. For example, F0 has been noted to decrease over time during portrayals of anger, but to increase over time during portrayals of joy. In contrast, emotions associated with low levels of physiological arousal (e.g., sadness) are consistently associated with lower mean F0, F0 variability, and vocal intensity, as well as decreases in F0 over time.

Rather than relying on acted portrayals, my colleague and I have studied the acoustic properties of natural speech produced by naive participants in the context of controlled emotion-induction procedures.2 We have focused our acoustic analysis on very short vowel segments both because detailed measurement of source- and filter-related cues is possible with these sounds and because these speech samples are less likely than sentence-length utterances to be influenced by demand characteristics, such as culturally prescribed rules about how particular emotions ought to be conveyed in speech. In one such study (Bachorowski & Owren, 1995), positive and negative emotions were induced by giving participants “Good Job” and “Try Harder” feedback as they performed a difficult computerized spelling task. In reality, this feedback was not contingent on participants’ performance. After each feedback presentation, subjects’ speech was recorded as they announced the number (n) of the upcoming block of trials using the phrase “test n test.” F0, jitter, and shimmer were measured from 30 instances of the “eh” sound that occurred in the first utterance of “test” in each stock phrase. The results indicated that emotion expressed through the vocal channel depended not only on the valence (i.e., the relative pleasantness or unpleasantness) of the elicited emotion, but also on differences in the self-reported intensity with

Published by Blackwell Publishers, Inc.



which emotions were typically experienced (Bachorowski & Braaten, 1994).

We observed similar outcomes in an unpublished study that used a more standard emotion-induction paradigm in which naive participants described the thoughts and feelings evoked by emotion-eliciting slides. Notably, efforts to link both source- and filter-related acoustic cues with discrete emotions were largely unsuccessful. Instead, the overall pattern of results indicated that values of acoustic parameters were associated with nonspecific arousal and, to a lesser extent, emotional valence. Again, differences in emotional intensity mediated the relationships between acoustic measures and emotional states.

Although the expression and perception of emotion are salient aspects of human vocal communication, researchers have yet to fully characterize the ways in which speech acoustics provide cues to emotional states. The most parsimonious interpretation of production-related data is that speech acoustics provide an external cue to the level of nonspecific arousal associated with emotional processes. Less reliable differentiations are found when researchers look for associations between acoustic measures and either emotional valence or discrete emotion categories. Moreover, potentially important individual differences, including the identity of the talker and emotional intensity, are routinely found to mediate vocal expression of emotion. As Scherer (1986) has pointed out, there is an apparent contradiction between the difficulty in finding acoustic differentiation of emotional states and the comparative ease with which listeners are able to judge emotions from speech. Resolving this contradiction will likely involve an explicit understanding of the role that individual difference variables play in the production of emotional speech.

PERCEPTION OF VOCAL EMOTION

Tests of listeners’ abilities to infer emotion from speech are critical for evaluating the perceptual importance of acoustic cues shown to be important from a production perspective, and help to inform research aimed at developing an acoustic typology of emotional speech. The standard perception paradigm is to have listeners choose which one of several emotion words best characterizes linguistically neutral utterances made by actors attempting to portray various emotions (e.g., Leinonen et al., 1997; Scherer, Banse, & Wallbott, 1998). Listeners are usually able to perceive the intended emotions at rates significantly better than those expected by chance. This general success in identifying emotions is typically interpreted to indicate that listeners associate particular patterns of acoustic cues with various discrete emotional states. Evidence for cross-cultural similarities in both perceptual accuracy and error patterns (Scherer et al., 1998) further suggests that the ability to infer emotion from speech is a fundamental component of human vocal communication.

In light of these findings, it is also important to note that error rates are also often quite high. A hint about the basis of detection failures comes from the fact that listeners are more accurate in inferring emotion from particular voices. Furthermore, for any given actor, listeners typically perceive some emotions more accurately than others. Although it is likely that some emotions may simply be more difficult to infer from voice than others, and that actors vary in the quality of their emotion portrayals, these effects also suggest that characteristic acoustic differences between voices play a role in perceptual evaluations of emotion from speech.

TOWARD A BROADER FRAMEWORK

A number of constraints have impeded the development of a detailed account of vocal-emotion-related phenomena. For instance, speech is complex, both in the number of potentially relevant acoustic cues related to emotional expression and in the multiplicity of other factors that influence the speech signal at any moment in time. More pragmatically, accurate and detailed acoustic analysis is time-consuming. From a methodological standpoint, the small number of participants typically studied and the reliance on acted portrayals have limited the generality of findings. Paradigms that involve collecting speech samples during the controlled induction of emotional states best balance the need for methodological rigor and real-world validity.

Although investigators have typically sought to identify invariant patterns of acoustic cues for various discrete emotional experiences, this strategy may be problematic for a number of reasons. For instance, this tactic generally fails to consider the talker–listener relationship and the “intended” impact of vocal signals on the listener’s affective states. Some cues, especially those associated with the rate of vocal-fold vibration, are readily modifiable. They can be used, for example, to signal communicative intent or be recruited for the purposes of affective persuasion. Thus, treating these cues as honest readouts of emotional states ignores their other potential functions in emotion-related communication.


Incorporating a more talker-centered (i.e., idiographic) perspective may also help advance our understanding of emotional speech. Evaluations of emotional state are necessarily made against an acoustic backdrop of individually distinctive voice characteristics, and yet differences among talkers are usually treated as uninteresting variability in vocal-emotion research. However, everyday experience suggests that more accurate and detailed perceptual judgments of emotional state can be made for familiar than for unfamiliar talkers. For example, discriminations between related emotions, such as amusement and joy, are probably more accurate for speech samples from a close friend than those from a more casual acquaintance. Suggestive empirical support for the importance of talker characteristics comes from studies indicating that acoustic differences among talkers exert a powerful influence on cognitive operations such as linguistic processing and memory (e.g., Palmeri, Goldinger, & Pisoni, 1993). Thus, more detailed characterizations of the acoustic features of emotional speech might be found by examining fluctuations in acoustic cues against comparatively more stable but individually distinctive talker characteristics (see Bachorowski & Owren, 1998).

Research in vocal-emotion phenomena might also benefit from a reinterpretation of findings based on Russell and Feldman Barrett’s (1998) distinction between affect and emotion. In their account, affect is always present and is best described by bipolar dimensions of arousal and valence. In contrast, prototypical emotion episodes happen more rarely and are associated with identifiable neurophysiological, behavioral, and cognitive processes. This distinction certainly sheds new light on vocal production and perception phenomena. In that most studies have arguably examined affect rather than emotion, it may have been unreasonable to expect that distinct acoustic patterns could be identified. Instead, there is remarkable consistency in support of the notion that the acoustic features of “emotional” speech are best described using dimensions of nonspecific arousal and affective valence, and that most vocal productions index affective rather than emotional experience.

The expression and perception of emotional states in speech acoustics are fundamental aspects of human communication. In fact, disturbances in either of these communication components can contribute to profound deficits in social relationships. By its very nature, research in vocal expression and perception of emotion is richly interdisciplinary, a circumstance that gives rise to both its inherent complexities and its considerable intellectual appeal. As a result of improved digital processing techniques as well as advances in the related disciplines of speech science, cognitive science, and acoustic primatology, findings obtained in the coming years should prove especially informative for our understanding of emotional expression through the vocal channel.

Recommended Reading

Bachorowski, J.-A., & Owren, M.J. (1995). (See References)
Kent, R.D. (1997). (See References)
Murray, I.R., & Arnott, J.L. (1993). Toward the simulation of emotion in synthetic speech: A review of the literature on human vocal emotion. Journal of the Acoustical Society of America, 93, 1097–1108.
Pittam, J., & Scherer, K.R. (1993). (See References)

Acknowledgments

Work on this article was completed while the author was generously hosted as a Visiting Scholar by the Department of Psychology at Cornell University. Funds in support of this work came from National Science Foundation (POWRE) and National Institute of Mental Health (B/START) awards, and from Vanderbilt University. Michael J. Owren provided valuable comments on an earlier version of this manuscript, and our collaborative work led to some of the ideas presented here.

Notes

1. Address correspondence to Jo-Anne Bachorowski, Department of Psychology, Wilson Hall, Vanderbilt University, Nashville, TN 37240; e-mail: j.a.bachorowski@vanderbilt.edu.

2. Preliminary results from work being conducted in other laboratories demonstrate that both standard emotion-induction paradigms and playful, gamelike paradigms are successful for eliciting speech samples that can be used to study vocal expression of emotion. Some investigators using these kinds of strategies include Arvid Kappas (arvid@psy.ulaval.ca), Gary Katz (gary.katz@csun.edu), and Tom Johnstone in Klaus Scherer’s lab (johnstone@fapse.unige.ch).

References

Bachorowski, J.-A., & Braaten, E.B. (1994). Emotional intensity: Measurement and theoretical implications. Personality and Individual Differences, 17, 191–199.
Bachorowski, J.-A., & Owren, M.J. (1995). Vocal expression of emotion: Acoustic properties of speech are associated with emotional intensity and context. Psychological Science, 6, 219–224.
Bachorowski, J.-A., & Owren, M.J. (1998). Acoustic cues to gender and talker identity are present in a short vowel segment produced in running speech. Manuscript submitted for publication.
Banse, R., & Scherer, K.R. (1996). Acoustic profiles in vocal emotion expression. Journal of Personality and Social Psychology, 70, 614–636.
Kent, R.D. (1997). The speech sciences. San Diego: Singular Publishing.
Leinonen, L., Hiltunen, T., Linnankoski, I., & Laakso, M.-L. (1997). Expression of emotional-motivational connotations with a one-word utterance. Journal of the Acoustical Society of America, 102, 1853–1863.
Palmeri, T.J., Goldinger, S.D., & Pisoni, D.B. (1993). Episodic encoding of voice attributes and recognition memory for spoken words. Journal of Experimental Psychology, 19, 309–328.
Pittam, J., & Scherer, K.R. (1993). Vocal expression and communication of emotion. In M. Lewis & J.M. Haviland (Eds.), Handbook of emotions (pp. 185–197). New York: Guilford Press.
Russell, J.A., & Feldman Barrett, L. (1998). Affect and prototypical emotional episodes. Manuscript submitted for publication.


Scherer, K.R. (1986). Vocal affect expression: A review and model for future research. Psychological Bulletin, 99, 143–165.
Scherer, K.R. (1989). Vocal measurement of emotion. In R. Plutchik & H. Kellerman (Eds.), Emotion: Theory, research, and experience: Vol. 4. The measurement of emotions (pp. 233–259). New York: Academic Press.
Scherer, K.R., Banse, R., & Wallbott, H.G. (1998). Emotion inferences from vocal expression correlate across languages and cultures. Manuscript submitted for publication.
Scherer, K.R., Banse, R., Wallbott, H.G., & Goldbeck, T. (1991). Vocal cues in emotion encoding and decoding. Motivation and Emotion, 15, 123–148.

Developing a Cultural Theory of Mind: The CIAO Approach

Angeline Lillard1
Department of Psychology, University of Virginia, Charlottesville, Virginia

Abstract
The study of children’s knowledge about minds is an extremely active area of developmental psychology. This article discusses the reach of this research and the theoretical views guiding it. It then presents some cultural variations (within the United States) in behavior explanation and explains the relevance of that variation to developmental theory. A theory of early mind reading that is presented incorporates culture, introspection, analogy, and ontogeny (CIAO).

Keywords
theory of mind; cultural psychology; social cognition

The ability to posit mental states in other people is among the most subtly remarkable of human feats. Rather than simply detecting behavioral regularities (“A person looking at candy usually proceeds to get candy”), most people readily assume that others have internal mental states (“She sees candy, she wants candy, and she intends to get candy”). And whereas almost all people do this (the really striking exceptions being those with autism), it may be the case that no animals do. Although chimpanzees appear to engage in purposeful deception in the wild, well-controlled studies suggest that they have simply detected behavioral regularities (Povinelli & Giambrone, in press).

Imputing intentions and other mental states is referred to as having a theory of mind (Premack & Woodruff, 1978). Understanding of minds is theorylike at least in the sense that mental states are not tangible, and therefore may exist only in theory, as some philosophers argue. A second reason for considering this knowledge a theory is that, like all theories, knowledge of mind appears to have a coherent, causal-explanatory structure, and concepts are defined in terms of other concepts specific to that body of knowledge (Wellman, 1990). For example, surprise is crucially defined in relation to belief or expectation.

Theory-of-mind research has taken developmental psychology by storm in the past decade. One might expect that such fury would run itself dry. Instead, theory of mind is surfacing across the field, because many developmental issues can be profitably viewed from this perspective. In infancy, for example, social referencing2 depends on knowing that emotions can be about objects and events. Learning new words depends in part on deciphering what adults refer to, given the infinite choices. Pretend play depends on being able to deal in imagined worlds. In childhood and beyond, successful peer interaction depends in part on correctly interpreting the peer’s intentions (“Did he bump into me on purpose?”). Providing reliable court testimony depends in part on knowing what it means to remember, and to lie. Because of its broad relevance, the theory-of-mind perspective could even stand a chance of unifying some disparate areas of research under a single conceptual framework, a unification lost when the developmental stages proposed by Jean Piaget, once very influential, became derailed.

MENTAL REPRESENTATION

Theory-of-mind research concerns many issues. A particularly active area of research focuses on children’s understanding of mental concepts and activities, like pretense, emotions, and, especially, belief. Understanding belief is a hallmark of understanding minds, because representing the world is quite possibly the most important feature of minds. People respond not to the world as it is, but to the world as they believe it to be. Many of Shakespeare’s plays, for example, hinge on this understanding: Lear’s belief that the faithful Cordelia is only his “sometime daughter,” Romeo’s fatal misconception that Juliet is dead, the comedies’ mistaken identities. These plays are paradigmatic of our fascination with how people view the world, and how belief drives action.
