The Tritone Paradox: A Link Between Music and Speech

Diana Deutsch¹
Department of Psychology, University of California, San Diego, La Jolla, California

Recommended Reading
Deutsch, D. (1991). (See References)
Deutsch, D. (1996). The perception of auditory patterns. In W. Prinz & B. Bridgeman (Eds.), Handbook of perception and action (Vol. 1, pp. 253-296). New York: Wiley.
Deutsch, D. (Ed.). (in press). The psychology of music (2nd ed.). San Diego: Academic Press.

When William Harvey, in his classic treatise on the circulation of the blood, declared that pulse sounds could be heard in the chest, this assertion was hotly disputed. During the controversy that followed, a Venetian physician jokingly declared that perhaps such sounds were "only to be heard in London" (Hunt, 1978, p. 138). The idea that certain sounds might be perceived quite differently in different geographic regions appears so unlikely as to form the basis of a joke. Yet this article concerns a sound pattern that, it strongly appears, is indeed heard differently by people in different geographic regions. I explore the properties of this pattern and the basis for this curious geographic association.

Pitch, the attribute of "highness" or "lowness" of sound, is central to our perception of music. Although many pitches that we encounter outside music are rather blurred (consider, e.g., footsteps, squeaks, and rustling sounds), those produced by most musical instruments and the singing voice are clearly defined. It is from such pitches that melodies and harmonies are formed.

Questions concerning pitch and pitch relationships have intrigued scientists since ancient times. Pythagoras is credited with showing experimentally that the pitch of a vibrating string varies inversely with its length, and that particular musical intervals are produced when string lengths stand in certain ratios. During the 17th century, Galileo and Mersenne showed that these associations are based on the relationship between string length and vibration frequency: The shorter the string, the higher the rate of vibration. Mersenne also discovered that when a body vibrates, it does so not only at the frequency that corresponds to its perceived pitch (the fundamental frequency) but also at frequencies that are integer multiples of the fundamental. Such a set of frequencies is called a harmonic series. Later Seebeck (1843) and Schouten (1940) showed that when presented with a harmonic series, the listener hears a pitch that corresponds to the fundamental, even when the fundamental itself is weak or absent.

How do we abstract relationships between pitches so as to perceive musical patterns? When two tones are presented in combination, we perceive a musical interval, and we perceive intervals as being the same in size when their component frequencies stand in the same ratio. The traditional musical scale is based in part on this principle. Two adjacent notes on a keyboard form a pitch relationship called a semitone; this corresponds to a frequency ratio of approximately 18:17. Intervals that comprise the same number of semitones are given the same name. For example, an interval comprising 12 semitones (a ratio of 2:1) is called an octave, an interval comprising 7 semitones (a ratio of 3:2) is called a perfect fifth, and an interval comprising 5 semitones (a ratio of 4:3) is called a perfect fourth. Because of the perceptual equivalence of tone pairs that form the same interval, a musical passage can be played by means of different pitches (i.e., it can be transposed to different keys), and provided that the intervals remain the same, its perceptual identity is preserved. Musical patterns are in this respect like visual shapes, which preserve their perceptual identities when they are translated to different positions in the visual field. Indeed, the notion that such a pattern might be perceived as radically different under transposition is as paradoxical as the notion that a visual shape might undergo a metamorphosis through being moved to a different position in the visual field.
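The interval ratios quoted above can be checked numerically. The sketch below assumes equal temperament, in which every semitone has the ratio 2^(1/12); the article does not name a tuning system, so this is an illustrative assumption, and the function name `equal_tempered_ratio` is hypothetical. It compares the tempered ratio for each interval with the simple ratio given in the text.

```python
def equal_tempered_ratio(semitones):
    """Frequency ratio of an interval spanning `semitones` equal-tempered semitones.

    Assumption (not stated in the article): 12-tone equal temperament.
    """
    return 2 ** (semitones / 12)


# Interval name -> (size in semitones, simple ratio quoted in the text)
INTERVALS = {
    "semitone": (1, 18 / 17),
    "perfect fourth": (5, 4 / 3),
    "perfect fifth": (7, 3 / 2),
    "octave": (12, 2 / 1),
}

for name, (semitones, quoted_ratio) in INTERVALS.items():
    tempered = equal_tempered_ratio(semitones)
    difference_percent = 100 * (tempered - quoted_ratio) / quoted_ratio
    print(f"{name:15s} {semitones:2d} semitones  "
          f"tempered {tempered:.4f}  quoted {quoted_ratio:.4f}  "
          f"difference {difference_percent:+.2f}%")
```

Under this assumption the octave matches exactly and the other intervals agree with the quoted ratios to within a fraction of a percent, which is why the text describes the semitone ratio as approximate.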




Other pattern analyses are based on the octave. It has long been recognized that tones that are related by octaves have a certain perceptual equivalence. This is acknowledged in the system of notation for the traditional musical scale. The core of this scale consists of 12 tones, corresponding to the division of the octave into semitones, and each tone is given a name (C, C#, D, and so on). The entire scale, as it ascends in height (as in going from left to right up a keyboard), consists of the repeating occurrence of this sequence of note names across successive octaves. Octave placement is designated by subscripts, and tones that stand in octave relation are held to be in the same pitch class. For example, tones C3, C4, and C5 are all in pitch class C, but in successively higher octaves; tones D#3, D#4, and D#5 are all in pitch class D#, but in successively higher octaves; and so on.

We can thus regard the pitch of a tone as varying along two dimensions: The monotonic dimension of height defines its position on a continuum from low to high, and the circular dimension of pitch class defines its position within the octave. We may then ask whether these two dimensions are independent, or whether they interact in some fashion. Common sense would say that surely they must be independent. If a musician were asked, "Which tone is higher, C or F#?" a likely reply is that the question is meaningless: One would need to know which C and which F# before a reasonable answer could be given.

Fig. 1. The pitch class circle, formed by the 12 pitch classes within the octave. In a tritone pair, two tones that are in opposite positions along the circle are played sequentially; for example, D followed by G# might be played, or F followed by B.

Shepard (1964) performed an experiment that he hypothesized would demonstrate the independence of the dimensions of pitch class and pitch height. He generated a series of tones, each of which consisted of a set of pure-tone components that were separated by octaves.² For example, one tone consisted of components . . ., C3, C4, . . ., and so on, and another tone consisted of components . . ., C#2, C#3, C#4, . . ., and so on. In this way, the tones were clearly defined in terms of pitch class, but their octave placement was ambiguous. For each tone, the amplitudes of the components were determined by a bell-shaped spectral envelope, so that those in the middle of the musical range were highest, and those at the extremes were lowest. Shepard varied the pitch classes of the tones while keeping the position and shape of the envelope constant.

Shepard found that when two such tones were played in succession, listeners heard either an ascending or a descending pattern, depending on which was the shorter distance along the pitch class circle (Fig. 1). So, for example, the pair D#-E was always heard as ascending, because the shorter distance was clockwise. Similarly, the pair A#-A was always heard as descending, because the shorter distance was counterclockwise.

Shepard produced a fascinating demonstration: A series of tones that repeatedly moved all around the pitch class circle in clockwise steps appeared to be endlessly ascending. When, alternatively, the tones moved around the circle in counterclockwise steps, the series appeared to descend endlessly. This musical paradox has a visual counterpart in the endless staircase devised by Penrose and Penrose (1958), and later popularized by the artist M.C. Escher.

The composer Jean-Claude Risset produced a number of compelling variants of this pitch paradox (Risset, 1971). In one case, a single tone glided around the pitch class circle in clockwise direction, so that it seemed to ascend endlessly. Counterclockwise movement produced the impression of an endlessly descending glide. In another demonstration, a tone glided clockwise around the pitch class circle while its spectral envelope moved downward, so that the tone appeared both to ascend and to descend at the same time.

At first sight, such demonstrations of pitch circularity appear to confirm the hypothesis that pitch class and pitch height are independent. However, another influence is also operating in this situation. In grouping together elements of a perceptual array, we tend to form linkages between those that are close together, in preference to those that are further apart. There are several well-known illustrations of this principle in vision, such as the tendency to group together dots that are positioned in spatial proximity, or to perceive motion between adjacent lights that come on and off in succession (Rock, 1986). In the case of music, we tend to create perceptual linkages between tones that form small melodic intervals rather than large ones (Bregman & Campbell, 1971; Deutsch, 1975; Van Noorden, 1975). This raises the possibility that in pitch circularity demonstrations, judgments of relative height might be so heavily influenced by proximity that a potential association between pitch class and perceived height would be overwhelmed by this factor.
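The proximity principle can be stated as a small rule on the pitch class circle. The following sketch is only an illustration of that rule, not a model taken from the studies cited: the list `PITCH_CLASSES` and the function `proximity_prediction` are hypothetical names, and clockwise movement is taken to mean ascending through the note names. The rule predicts the direction heard from the shorter path around the circle and reports when proximity gives no answer.

```python
# The 12 pitch classes in clockwise order around the pitch class circle.
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def proximity_prediction(first, second):
    """Predict the heard direction of a tone pair from proximity alone."""
    start = PITCH_CLASSES.index(first)
    end = PITCH_CLASSES.index(second)
    # Number of semitone steps from first to second, moving clockwise.
    clockwise_steps = (end - start) % 12
    if clockwise_steps == 0:
        return "same pitch class"
    if clockwise_steps < 6:
        return "ascending"   # shorter path is clockwise
    if clockwise_steps > 6:
        return "descending"  # shorter path is counterclockwise
    return "ambiguous"       # exactly half an octave: both paths are equal


for pair in [("D#", "E"), ("A#", "A"), ("D", "G#"), ("B", "F")]:
    print(pair, proximity_prediction(*pair))
```

The first two pairs reproduce the examples in the text. The last two are half-octave pairs, for which the two paths around the circle are equally long, so proximity alone cannot decide the direction.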





Given such reasoning, I wondered how listeners would judge a pair of such tones that were related by exactly a half-octave (this interval is called a tritone), so that they were separated by the same distance along the pitch class circle in either direction. What would happen, for example, if D were played, followed by G#, or if B were followed by F? Because proximity could not then be invoked by the perceptual system, would judgments of relative height be ambiguous, as predicted from the assumption of independence, or would an association between pitch class and perceived height emerge?

It occurred to me that in making such judgments, the listener could refer to the absolute positions of the tones along the pitch class circle. If tones in one region of the circle were tagged as higher, and those in the opposite region were tagged as lower, the ambiguity of height would be resolved. Imagine, for example, that the listener represented the pitch class circle as shown in Figure 1, with the vertical axis representing perceived height. If the listener mentally placed pitch class C in the highest position (as in this figure), then he or she would hear C followed by F# as descending. If, however, the listener mentally placed F# in the highest position instead, then he or she would hear the identical pair of tones as ascending.

This reasoning led me to embark on a series of experiments that employed the following procedure. Subjects were presented with tritone pairs such as I have described, and they judged whether each pair formed an ascending or a descending pattern.³ The percentage of times that the subject heard a pattern as descending was then plotted as a function of the pitch class of the first tone of the pair (Deutsch, 1986, 1987, 1991, 1994, 1996; Deutsch, Kuyper, & Fisher, 1987; Deutsch, North, & Ray, 1990; Ragozzine & Deutsch, 1994). Each tone consisted of six pure-tone components that were separated by octaves, and whose amplitudes were determined by a bell-shaped spectral envelope. (A spectral representation of such a tone pair is shown in Fig. 2.) The tone pairs were generated under envelopes that were placed at different positions along the spectrum, in most experiments spaced at half-octave intervals. Varying the envelope positions in this fashion controlled for interpretations based on the relative amplitudes or loudnesses of the harmonic components of the tones.

Fig. 2. Spectral representation of a tone pair that produces the tritone paradox. The upper graph represents a tone of pitch class D, and the lower a tone of pitch class G#.
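The tone construction described above can be sketched in a few lines. The article specifies six octave-spaced pure-tone components under a bell-shaped spectral envelope but does not give the envelope formula, so the Gaussian on a log-frequency axis used here is an assumption, as are the lowest-component frequency, the envelope center and width, and the function name `tone_components`.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def tone_components(pitch_class, envelope_center_hz, lowest_c_hz=65.41,
                    n_components=6, envelope_width_octaves=1.0):
    """Return (frequency_hz, relative_amplitude) pairs for one tone.

    Assumptions: the lowest component lies in the octave starting at
    `lowest_c_hz`, and the bell-shaped envelope is a Gaussian in
    log-frequency. Neither value comes from the article.
    """
    step = PITCH_CLASSES.index(pitch_class)
    lowest_hz = lowest_c_hz * 2 ** (step / 12)
    components = []
    for k in range(n_components):
        freq = lowest_hz * 2 ** k  # successive components are one octave apart
        distance_octaves = math.log2(freq / envelope_center_hz)
        amplitude = math.exp(-0.5 * (distance_octaves / envelope_width_octaves) ** 2)
        components.append((freq, amplitude))
    return components


center_hz = 523.25  # hypothetical envelope center
d_tone = tone_components("D", center_hz)
g_sharp_tone = tone_components("G#", center_hz)
# The same pitch class under an envelope placed half an octave higher.
d_tone_shifted = tone_components("D", center_hz * math.sqrt(2))

for freq, amp in d_tone:
    print(f"{freq:8.1f} Hz  relative amplitude {amp:.3f}")
```

Moving the envelope changes the relative amplitudes of the components but leaves their frequencies, and therefore the pitch class, unchanged. That is the property the envelope manipulation relies on when it is used as a control.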


Figure 3 shows the judgments of 4 different subjects using this paradigm. In each case, the data were averaged over four spectral envelopes that were spaced at half-octave intervals. It can be seen that the judgments of each subject depended systematically on the positions of the tones along the pitch class circle. However, the direction of this dependence varied strikingly from one subject to another. So we can think of the pitch class circle as having a particular orientation with respect to height, and this orientation differs from one person to another. For example, the subject whose data are shown on the upper right of Figure 3 heard tones E, F, F#, G, G#, and A as higher and the others as lower. So for this subject, F# and G defined the highest position along the pitch class circle. In contrast, the subject whose data are shown on the lower right heard tones A#, B, C, C#, D, and D# as higher and the others as lower. So for this subject, C and C# defined the highest position along the circle instead. Figure 4 shows the orientations of the pitch class circle for these 2 subjects (derived from the data shown in Fig. 3). I refer to tones that stand at the top of a subject's pitch class circle as his or her peak pitch classes.

What can be the basis for this curious association between pitch class and perceived height, and for the individual differences in its manifestation? In one experiment (Deutsch et al., 1987), the effect occurred in a substantial majority of a group of undergraduates at the University of California, San Diego. The subjects in this study all had normal hearing and could judge correctly whether pairs of pure tones that were separated by half-octaves formed ascending or descending patterns. Within this population, the way the tritone patterns were perceived did not correlate with whether or not the subject had musical training, so it appears that the tritone paradox is not musical in origin. Results from other studies argued against explanations in terms of low-level characteristics of the hearing mechanism. For example, the profiles relating pitch class to perceived height in the study of the tritone paradox did not correspond to patterns of relative loudness for the components of the same tones when these were compared with each other individually.
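The notion of peak pitch classes introduced above can be made concrete with a small calculation. The procedure below is one plausible reading, not the analysis documented in the cited studies: it assumes that a tritone pair heard as descending means the first tone was heard as higher, finds the six adjacent pitch classes with the largest total of "descending" judgments, and takes the two middle members of that half-circle as the peak pitch classes. The function name and the idealized 0/100 profiles are hypothetical and are not data from Figure 3.

```python
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def peak_pitch_classes(percent_descending):
    """Estimate the upper half of the circle and the two peak pitch classes.

    `percent_descending` maps each first-tone pitch class to the percentage
    of trials on which the tritone pair was heard as descending.
    """
    best_start, best_score = 0, float("-inf")
    for start in range(12):
        # Total "descending" percentage over six adjacent pitch classes.
        score = sum(percent_descending[PITCH_CLASSES[(start + i) % 12]]
                    for i in range(6))
        if score > best_score:
            best_start, best_score = start, score
    upper_half = [PITCH_CLASSES[(best_start + i) % 12] for i in range(6)]
    # The two middle members of the upper half sit at the top of the circle.
    return upper_half, upper_half[2:4]


# Idealized profiles: 100% "descending" for the tones heard as higher.
heard_higher_a = {"E", "F", "F#", "G", "G#", "A"}
heard_higher_b = {"A#", "B", "C", "C#", "D", "D#"}
profile_a = {pc: (100 if pc in heard_higher_a else 0) for pc in PITCH_CLASSES}
profile_b = {pc: (100 if pc in heard_higher_b else 0) for pc in PITCH_CLASSES}

print(peak_pitch_classes(profile_a))
print(peak_pitch_classes(profile_b))
```

With these idealized profiles the sketch returns F# and G for the first listener and C and C# for the second, in line with the two subjects described in the text. Real judgment profiles are graded rather than all-or-none, so a tie-breaking rule would be needed in practice.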

Fig. 3. Perception of the tritone paradox by 4 different listeners. Each graph shows the percentage of judgments that a tone pair formed a descending pattern, plotted as a function of the pitch class of the first tone of the pair.

Fig. 4. Orientations of the pitch class circle with respect to height, for 2 different subjects. The circle on the left is derived from the data shown on the upper right of Figure 3, with peak pitch classes of F# and G. The circle on the right is derived from the data shown on the lower right of Figure 3, with peak pitch classes of C and C#.

A number of informal observations led me to conjecture that the tritone paradox might be related to the processing of speech sounds. Specifically, I hypothesized that the listener develops a long-term representation of the pitch range of his or her speaking voice, and that this representation includes a definition of the octave band in which the largest proportion of pitch values occurs. I further hypothesized that the pitch classes delimiting this octave band for speech are taken by the listener as defining the highest position along the pitch class circle, thereby determining the way he or she orients the pitch class circle with respect to height.

Together with Tom North and Lee Ray, I undertook a study to examine this hypothesis. First, we had subjects participate in a full experiment on the tritone paradox, and we selected a group whose judgments showed clear and consistent relationships between pitch class and perceived height. Then we had each subject speak freely into a microphone for roughly 15 min, and we took pitch estimates of the subject's speech samples at 4-ms intervals. We then determined the octave in which the largest number of pitch estimates occurred and identified the pitch classes that defined the boundaries of this octave band. We then compared, for each subject, the pitch classes delimiting the octave band for speech and those defining the highest position along the pitch class circle, as determined by that subject's judgments of the tritone paradox. For 8 of the 9 subjects, the two sets of pitch classes corresponded closely (Deutsch et al., 1990).

The findings from this study are in accordance with the conjecture that perception of the tritone paradox is based on a representation of the pitch class circle by the listener, whose orientation is related to the pitch range of his or her speaking voice. Two versions of this conjecture can then be considered. The first, and more restricted, version does not assume that the listener's pitch range for speech is itself determined by a learned template, but rather assumes it is determined by some other factor. The second version assumes that a template is acquired through exposure to speech produced by other people, and that the template is used both to evaluate perceived speech and to constrain the listener's own speech output. The characteristics of such a template would then be expected to vary among people who speak different languages or dialects, much as other speech characteristics, such as vowel quality, vary across languages and dialects.
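The octave-band analysis of speech described above can be illustrated as follows. This is a minimal sketch under stated assumptions, not the procedure of Deutsch et al. (1990): it assumes the pitch estimates are already available as a list of frequencies in Hz (one per 4-ms frame), rounds each to the nearest equal-tempered semitone using the MIDI convention (A4 = 440 Hz = note 69), and slides a 12-semitone window in semitone steps. The function name and the sample values are hypothetical.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def busiest_octave_band(pitch_estimates_hz):
    """Find the 12-semitone band that contains the most pitch estimates.

    Returns (lowest pitch class in the band, highest pitch class in the band,
    number of estimates inside the band). Non-positive values, which would
    mark frames with no pitch estimate, are ignored.
    """
    notes = [round(69 + 12 * math.log2(f / 440.0))
             for f in pitch_estimates_hz if f > 0]
    best_low, best_count = None, -1
    for low in range(min(notes), max(notes) + 1):
        # Count the estimates that fall inside the band [low, low + 11].
        count = sum(1 for n in notes if low <= n < low + 12)
        if count > best_count:
            best_low, best_count = low, count
    low_class = PITCH_CLASSES[best_low % 12]
    high_class = PITCH_CLASSES[(best_low + 11) % 12]
    return low_class, high_class, best_count


# Hypothetical pitch estimates in Hz; a real recording would yield thousands.
sample_estimates = [100.0] * 3 + [150.0] * 5 + [190.0] * 4 + [400.0]
print(busiest_octave_band(sample_estimates))
```

The pitch classes returned for the band can then be compared with the listener's peak pitch classes from the tritone judgments, which is the comparison the study reports. How the boundaries of the band are defined (for example, whether the upper limit is inclusive) is a design choice in this sketch.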



Fig. 5. Distributions of peak pitch classes in two groups of listeners, one from California and the other from the south of England. For each pitch class, the listeners for whom it is a peak pitch class are graphed. From Deutsch (1991).

A further study supported the second conjecture. I had noticed informally that Californians and people from the south of England tended to hear the tritone paradox in different ways. In a formal experiment (Deutsch, 1991, 1994), I found that there was indeed a significant difference in the distribution of peak pitch classes between a group of subjects who had grown up in California and a group who had grown up in the south of England. As shown in Figure 5, the Californians tended to have peak pitch classes in the range B, C, C#, D, and D#, whereas the English group showed a different pattern, with the most frequent peak pitch classes being F#, G, and G#. This experiment supports the view that through a learning process, an individual acquires a representation of the pitch class circle that has a particular orientation with respect to height. This orientation is derived from the speech to which the person is exposed, and varies from one language or dialect to another. Further, we can assume that this template is involved both in the person's own speech output and in his or her evaluation of speech produced by others.

Dolson (1994) has reviewed a number of findings from the speech literature that support the present conjecture. First, it appears that most people confine the pitch range of their speech to roughly an octave. Second, within a given linguistic community, the speech of females is close to an octave above that of males. Third, the pitch ranges of speech differ remarkably little within a given linguistic community (except, of course, for the gender difference). In contrast, there are considerable differences in the pitch ranges of speech across different linguistic communities. Finally, there is a surprising lack of correlation between the pitch range of an individual's speech and physiological parameters such as the person's height, weight, chest size, and length of vocal tract.

What could be the evolutionary value of such a template? The pitch of a speaker's voice varies depending on his or her emotional state, and so serves to convey such states to the listener (Fernald, 1992; Scherer, 1985). A template such as this could provide a framework, common to a linguistic community, allowing the pitch of a speaker's voice to be evaluated so that it can signal his or her emotional state. The template could also be used in conveying syntactic aspects of speech (Cutler, Dahan, & Donselaar, 1997).


A number of other laboratories have recently produced additional evidence supporting the idea that the orientation of the pitch class circle varies statistically depending on geographic community. Giangrande (1998), in a study of students at Florida Atlantic University in Boca Raton, Florida, obtained a histogram of peak pitch classes that was quite similar to the one I obtained with Californians (Deutsch, 1991). R. Treptoe (in unpublished work) also produced a similar histogram from subjects at the University of Wisconsin, Stevens Point. In contrast, Dawe, Platt, and Welsh (1998), in a study of students at McMaster University in Hamilton, Ontario, obtained a histogram that was very similar to the one I obtained from subjects from the south of England (Deutsch, 1991).

Work from my laboratory has indicated that perception of the tritone paradox could well be influenced by a template that is formed early in life. Ragozzine and I found statistical differences in perception of the tritone paradox between two groups of subjects who had all grown up in the area of Youngstown, Ohio (Ragozzine & Deutsch, 1994). We designated those subjects whose parents had also grown up in Youngstown as "locals," and those whose parents had grown up elsewhere as "aliens." The alien group produced a histogram of peak pitch classes that was very similar to the one produced by my Californian subjects (Deutsch, 1991), but the local group produced a very different histogram. Because parents have a particularly strong influence on speech development, this study indicates that an individual's pitch class template might be formed in childhood.

Most recently (Deutsch, 1996), I obtained more direct evidence for such a developmentally acquired template. I studied the perceptions of 15 subjects, together with those of their mothers. Ten of the subjects were children, and 5 were adults. The subjects were all Californian, but their mothers had grown up in many different geographic regions, including England, the European continent, and various parts of the United States. As expected, the mothers perceived the tritone paradox in strikingly different ways. And remarkably, although the subjects were all Californian, their perceptions corresponded closely to those of their mothers, and so also differed considerably from each other. However, because 10 of the subjects were children, it is possible that the correlation between the perceptions of children and their mothers reflects a particularly strong maternal influence in the case of young listeners, an issue that is the subject of ongoing research.

Philosophers have argued for centuries that strong linkages must exist between music and speech. This view has been shared by many composers who, in their search for optimal expressivity, have incorporated into their music features that are characteristic of spoken language. From a different perspective, considerable advances have been made in documenting the role of features based on pitch in the comprehension of spoken language (Cutler et al., 1997) and in the recognition of emotional state (Fernald, 1992; Scherer, 1985). The present findings on the tritone paradox indicate that, in addition, experience with speech can influence how music is perceived, and open the door to uncovering other such influences.

Notes

1. Address correspondence to Diana Deutsch, Department of Psychology, University of California, San Diego, La Jolla, CA 92093.

2. A sine wave, or pure tone, is a tone of a single frequency, where frequency is the number of times in a second that the waveform repeats itself. This is specified in Hertz (Hz): 1 Hz = 1 cycle per second. The amplitude of a tone is defined as the maximum amount of pressure variation about the mean value. For convenience, amplitudes are converted into decibels (dB). The spectral envelope determines the amplitude of each frequency component.

3. Signals comprising a full experiment on the tritone paradox, together with a description of the procedures for testing subjects and analyzing the data, are available in the following compact disk and accompanying booklet: Deutsch, D. (1995). Musical illusions and paradoxes (Available from Philomel Records, P.O. Box 12189, La Jolla, CA 92039-2189).



1998 American




References

Bregman, A.S., & Campbell, J. (1971). Primary auditory stream segregation and perception of order in rapid sequences of tones. Journal of Experimental Psychology, 89, 244-249.
Cutler, A., Dahan, D., & Donselaar, W. van. (1997). Prosody in the comprehension of spoken language: A literature review. Language and Speech, 40, 141-210.
Dawe, L.A., Platt, J.R., & Welsh, E. (1998). Spectral motion after-effects and the tritone paradox among Canadian subjects. Perception & Psychophysics, 60, 209-220.
Deutsch, D. (1975). Two-channel listening to musical scales. Journal of the Acoustical Society of America, 57, 1156-1160.
Deutsch, D. (1986). A musical paradox. Music Perception, 3, 275-280.
Deutsch, D. (1987). The tritone paradox: Effects of spectral variables. Perception & Psychophysics, 41, 563-575.
Deutsch, D. (1991). The tritone paradox: An influence of language on music perception. Music Perception, 8, 335-347.
Deutsch, D. (1994). The tritone paradox: Some further geographical correlates. Music Perception, 12, 125-136.
Deutsch, D. (1996). Mothers and their children hear a musical illusion in strikingly similar ways. Journal of the Acoustical Society of America, 99, 2482.
Deutsch, D., Kuyper, W.L., & Fisher, Y. (1987). The tritone paradox: Its presence and form of distribution in a general population. Music Perception, 5, 79-92.
Deutsch, D., North, T., & Ray, L. (1990). The tritone paradox: Correlate with the listener's vocal range for speech. Music Perception, 7, 371-384.
Dolson, M. (1994). The pitch of speech as a function of linguistic community. Music Perception, 11, 321-331.
Fernald, A. (1992). Human maternal vocalizations to infants as biologically relevant signals: An evolutionary perspective. In J.H. Barkow, L. Cosmides, & J. Tooby (Eds.), The adapted mind: Evolutionary psychology and the generation of culture (pp. 391-427). New York: Oxford University Press.
Giangrande, J. (1998). The tritone paradox: Effects of pitch class and position of the spectral envelope. Music Perception, 15, 253-264.
Hunt, F.V. (1978). Origins in acoustics. New Haven, CT: Yale University Press.
Penrose, L.S., & Penrose, R. (1958). Impossible objects: A special type of illusion. British Journal of Psychology, 49, 31-33.
Ragozzine, F., & Deutsch, D. (1994). A regional difference in perception of the tritone paradox within the United States. Music Perception, 12, 213-225.
Risset, J.C. (1971). Paradoxes de hauteur. Proceedings of the 7th International Congress of Acoustics, 3, 613-616.
Rock, I. (1986). The description and analysis of object and event perception. In K.R. Boff, L. Kaufman, & J.P. Thomas (Eds.), Handbook of perception and human performance (chap. 33). New York: Wiley.
Scherer, K.R. (1985). Vocal affect signaling: A comparative approach. Advances in the Study of Behavior, 15, 189-244.
Schouten, J.F. (1940). The perception of pitch. Philips Technical Review, 5, 286-294.
Seebeck, A. (1843). Ueber die Sirene. Annalen für Physik und Chemie, 60, 449-481.
Shepard, R.N. (1964). Circularity in judgments of relative pitch. Journal of the Acoustical Society of America, 36, 2346-2353.
Van Noorden, L.P.A.S. (1975). Temporal coherence in the perception of tone sequences. Unpublished doctoral dissertation, Technische Hogeschool, Eindhoven, The Netherlands.

