2 Auditory scene analysis: hearing in complex environments

Albert S. Bregman

2.0 INTRODUCTION

I would like to introduce an approach to auditory perception that is concerned with quite different questions from those asked by traditional psychophysics. The latter is occupied with such questions as these: What is the minimum amount of energy that can be sensed by the auditory system? How far apart do the frequencies of two pure tones have to be in order to be distinguished when they are played either sequentially or at the same time? How does the experienced loudness of a tone grow as its intensity is increased? How do differences between the acoustic signals registered at each ear tell us where the source of sound is?

The development of artificial intelligence in recent years has made us aware that even if the above questions were answered with certainty we would still understand very little about how the auditory system worked. A thought exercise that can help us to see the problems is first to imagine the auditory system as being possessed by a robot, and then to ask what the robot could do if it had every capacity that has been found to be true of human audition. The robot would still have difficulty in actually using the information that it received. Its most difficult problem would be dealing with mixtures of sounds. Its record of any incoming signal would represent the sum of all sound-producing sources that had been simultaneously active at the time of the recording. Suppose that our robot had a definition in its memory of the sound of a voice saying a particular word. It still might not be able to recognize the word because the presence of other sounds had created a situation in which no part of the recording matched the definition closely enough. Even worse, it might mistakenly hear some accidental product of the mixture of two voices as a word.

Here is another example of the difficulty that it might have. Differences between the intensities of a sound at the two ears are a cue for the spatial position of a sound source. But how does the robot know, when it is comparing intensities at the two ears, that it is comparing energy that has come from the same source? If there are two sources, each in a different place, the simple strategy of comparing the intensities at the two ears will no longer work. The robot must make a separate comparison of the intensities derived from each source. But how is it to know how much energy came from each source at each ear?

To recognize the component sounds that have been added together to form the mixture that reaches our ears, the auditory system must somehow create individual descriptions that are based on only those components of the sound that have arisen from the same environmental event. The process by which it does this has been called 'auditory scene analysis' (Bregman 1990).

The term 'scene analysis' was first used by researchers in computer vision to refer to how a computer might solve the following problem. In a photograph of a scene of normal complexity, the visible parts of a single object are often discontinuous, since they are interrupted by the presence of another object that lies in front of it. Scene analysis is the name given to the strategy by which the computer attempts to put together all the properties that belong to the same object. Only then can the correct global shape and properties of that object be determined. By analogy, auditory scene analysis is the process whereby all the auditory evidence that comes over time from a single environmental source is put together as a perceptual unit. This chapter will describe the methods that the auditory system employs and some of the research that has uncovered them.

2.1 SCENE ANALYSIS IN AUDITION

When you describe the problem of mixtures to most people, they are inclined to say that they solve it simply by paying attention to one of the sounds at a time. In saying this, they imply that the parts of one sound are somehow a coherent bundle that can be selected by the process of attention. However, the mixture that reaches our ears is a pattern formed by pressure changes over time. If we look at a plot of the waveform of a mixture of sounds, there is nothing obvious in it that labels the sound as a mixture or tells you how to take it apart.

We know that the first stage in the analysis of sounds by the human auditory system takes place in the cochlea. There the sound is decomposed
into separate neural patterns that approximately represent the different frequencies in the signal (see Moore and Patterson (1986) for discussions on the limits of the ear's ability to separate very closely spaced frequencies). Decomposition into component frequencies is a technique also used by scientists in trying to understand hearing. The results are displayed in a spectrogram, a picture that shows time on the x axis and frequency on the y axis. The darkness at any point on the picture shows the intensity of the sound at a particular time and frequency. We can appreciate the limitations in the usefulness of the ear's frequency-based decomposition of the signal by considering the information shown in a spectrogram.

Fig. 2.1 Spectrogram of (a) a mixture of sounds and (b) one of the components of the mixture, the word 'shoe'.

The one shown in Fig. 2.1(a) shows a mixture of sounds. The reader might think that the problem of finding the individual sources from mixtures might be solved immediately by using such a picture. This could indeed be done if the sources were steady, pure tones that were well separated in frequency. Then each horizontal streak on the picture would represent a separate environmental sound, persisting for some period of time. Figure 2.1(a), however, represents a more natural case in which each environmental sound has many frequency components and these are not constant over time. (It represents a mixture of a man saying 'shoe', another singing to himself, and an unrelated piece played by instruments in the background.) A listener can easily hear the word 'shoe' in the recording from which this spectrogram was made. Figure 2.1(b) shows the word spoken in isolation. This is what must be extracted from the mixture. We can do it visually to some extent after seeing the isolated pattern, but listeners can do it even without being told which word is embedded in the mixture.
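The spectrogram described above can be illustrated with a small sketch. It is only a toy under stated assumptions: the signals are synthetic sine waves rather than speech, the frame length of 32 samples and the helper names (`magnitude_spectrum`, `spectrogram`, `tone`) are invented for the illustration, and a naive discrete Fourier transform stands in for the cochlea's frequency analysis. What it shows is that when two sources share a frequency, the mixture's spectrogram holds only their sum in that cell, with nothing to say how much came from each source.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT: magnitude of each frequency bin from 0 up to N/2."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(signal, frame_len):
    """One magnitude spectrum per non-overlapping frame (rows are time)."""
    return [
        magnitude_spectrum(signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def tone(freq_bin, amplitude, n_samples, frame_len):
    """Sine wave that completes `freq_bin` cycles per frame."""
    return [amplitude * math.sin(2 * math.pi * freq_bin * t / frame_len)
            for t in range(n_samples)]


FRAME = 32           # hypothetical frame length in samples
N = 2 * FRAME        # two frames of signal

# Source A has one component; source B has two, one of which
# falls on the same frequency bin as source A's component.
sound_a = tone(4, 1.0, N, FRAME)
sound_b = [x + y for x, y in zip(tone(4, 0.5, N, FRAME),
                                 tone(9, 0.8, N, FRAME))]
mixture = [x + y for x, y in zip(sound_a, sound_b)]

picture = spectrogram(mixture, FRAME)
# picture[0][4] holds the summed energy of both sources at bin 4;
# the picture carries no label saying which source contributed what.
```

Recovering `sound_a` from `picture` alone would require extra assumptions about the sources, which is the job the text assigns to scene analysis.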
The problem they face can be shown with reference to the spectrogram. We can see that in the mixture the streaks that represent the components of the word are interlaced with and cross over those representing the components of the other sounds. Even a single streak can represent the sum of two or more components of the same or almost the same frequency that have been derived from different sounds. So separating the frequencies in the nervous system and laying them out over time, as illustrated in Fig. 2.1(a), does not, in itself, provide coherent bundles that can be selected by attention. The separate sound sources that listeners describe are not given in any simple way in the spectral decomposition of the signal. The problem of finding them is the job of auditory scene analysis.

It appears to me that there are three processes occurring in the human listener that serve to decompose auditory mixtures. One is the activation of learned schemas in a purely automatic way. It is a common observation that, occasionally, people imagine they hear their names spoken in a noisy environment, for example at a street corner. Apparently, a chance co-occurrence of sounds can activate the mental schema that represents the sound of one's name. This hypersensitivity and automatic activation presumably occurs because people so frequently hear their names spoken that the schema is in a highly potentiated state. Whenever the incoming sound matches the schema's acoustic definition in even an approximate way, it becomes active. This method of analysis might penetrate some mixtures as long as the sound pattern recognized by the schema was not totally distorted.

A second process that can decompose mixtures is the use of schemas in a voluntary way. An example occurs when we are intent on trying to hear whether our name is being called out by a person announcing the next appointment in a busy office. The experience of 'trying' is an indication that voluntary attention is involved. Notice, however, that the schema for our name is also involved, though its operation is not automatic in this case.
genera, whenever we ae listening fora speci sound oF tas of sounds, some criterion for recognizing the targets must be em- ployed, call this erterion a schema because it tsa mental representation Ota particular st of characteristics, Both ofthese methods, automatic and voluntary secgnition, require that schemas (aowledge ofthe structure of particular sounds or sound ‘Gases that ae important fo us) have already Bon formed by prior lis fering, I these achemacbaced methods were the only oes for decom posing matures It would be hard for us to frm schemas for important Funds inthe fit place unless we fequenty encountered these sounds {nolan I would be valuable, therefore, for us tohave general methods {or patiioning an incoming rlsture of sound into separate acoustic Sues that ned be used porto any spetic knowledge ofthe import St sounds of eur environment. 4 Aor see ans: hing ix complex environments 18 appears that we do have such methods have cefered to them collectively by the term ‘primtve auditory scene analysis (Bregman 195, 38) By calling them primitive I mean that instead of depending on owls of sel yp of Sound, Sch voces, tl ta ‘eniso machines, they depend on genersl acoustic properties that can be sed fr decomposing all ype of mixture 211. Using general acoustic regularities ‘ve ae ean ans ewe aout propeis of» und tant ual ning the wane says ple ren fe ste nt falar ith he ga Tene ire te et of fpr ropes of tng won nd nt Slat fe ina sun ony Sed das ef sone ‘Asecampe of pn propery th la tet any sod eae cvronen ahaa Ta the gir cone al ‘ltl posit oan! Bueno tet a ten os pind caer ey pons xan man. 
Th Spare ss ob won SAS pre Sning bv irks gure an 0) of he wl yes rcs we frequen we pectin tie tgs at Sines so) the equny of Se sen he el ay "clr of sucht sete oes ht rc sl vrais Geching the omen vr ca) ens ms sue Serta mace fr ate cn employ asaya xa hi rely ercampl wt shld olde ht saphena fm nay of sequen components cera pen fe Sata he arene conponers ne mips ot» cron fa ‘Reval Arete many sun prot yee What ee os IStvowof hypothe aes nme sans om oh wih dfn of honky i chrono Sane oe ‘multiples of « common fundazental? This would be a highly improbable Sen A ber thew ony oe scoot as Somalng amon Hers nother Samp Whe sae eat, Spey br i ems wo sacs of cmpaents te mene cathe uing mele ofa ter fanantal Ooouy, oe bet th te Sa 0 mba roe {Sind Tew fhamonity main eh whee erly Otte cane sd ei on in ans a find howeer it no ge ey ean be ral ‘Aral ic ot imposible Wt the harmonise sounds might Sey rena. oppeety rer reton it oe wat Bare ‘ound na lle tary were conng an tet ps Aller. Breen 15 in space or that hey stated and stopped at diferent ime. Since the fequncycomponeris from a sngle seus event tend o come from {Sesame place and tort and sop 8 roughly the sme tie, dle ‘ce in se properties would protect us agit accepting acdetly itmonicconporentsw ports of singe sound, We tetire ned 0 plot many oper athe sae tine i we at come tothe ight “Srpard (961) has aged that Because anima ave evolved in a word tat onan egret ily ht peeps ystems {ve evolved modes of operation that ake advantoge af hem. Ta 2S imal evolved, the properties of tee percept ystems became ‘Sout the roles of fw py word The esting math ‘al pyc apn Sepa e e eae or fing he awe auditory ogazaton wou {Syto ducorer elton ong the components he incoming snd Ear royal present wh pac the sound have been crested by Alferentevromental evens. 
Then we can do experimen oe eat thee the human autor system explo thee Telos 0 decom ove mre 2.2 EFFECTS OF AUDITORY SCENE ANALYSIS rin any eee wh cil way Scone culls fee many spect of audioy pereption tat we 60 ‘tually Bebo tobe related to perceptual orgarzaion Organization Bote viewed as grouping of auditory properties fer they have teen coated by pocpion However, even sich Sppaenty a" Prop (iss loudner can be afeced by Organization. Hee is an example desribed by Warren (1982) a¢“homeponie continuity’ (es ssa Warren eta 1972; Waren 198), Suppose te ae frvented with a tendy sound which ft hols a fed nen for Tie seconds then sddenly becomes more intense, beds is new intensity biel, ad then rete tote orginal evel and contin for Some tie. If he petiod of geste nena is hort we her ot = Range nthe onal sgn but os a second sound, ent in prop {het fo the enigial one ining fe and then diappening. The Sia is heard a conduing unchanged in ludnsy behind headed tre The intensity, when i tthe higher levels nerpeted not ‘Sming om a sing loud sound, but aa motu of to ser sounds oer words te itraty infomation is shared st beeen 0 red even One ether hand im experimenting wh a stimu, Five osced tht the rican fll in intent ae not udder then 6 Aucitory scene aalesharng in comple encronments the worsound interpretation does nt oecur (Bregman 1981) Instead the ‘Sigel sound is ened ae changing in lunes, and the ul intensity at Seek tele used to desve the perceived loudness of single ound {Pe eh Tovln, organizational processes wil decide whether we hear the loud sound oF two softer ones seve a variation of this example to show hove perceptual rw ean affect even a sound's perceived spatial position regman iy ets presen the signal T have just decribed, bt this tine to our Tp cor only atthe same ime we presen the ie ear sound that en ical ur cxginal signal in all respects except that it always i ec dae lower inert. 
During the ntl phase, when both the S308 oes intensity, stays beefy at this level and then returns ascsrelive te ogial sound coninsing inthe middle, accompanied WEB bys ovune a our extreme ight With this procedure two sounds sre heard, each with its own poston, saeetenstcad, we sane nd omer the tnt a he ight ea ee eee ia ms ease, we hese the sound move over othe right and deny 1 the centre our spaial perception following the changing tae es ptelies a the two ere We perceive only single sound tv single location. ‘aad of wheter the change i sudden or gradual it leads othe aaaeaty balance between the two ears during the middle phase Terese and more intenge atone en) If pereived location were te ea by te story ofthe event, we would hear the same thing in arrests But we do not. Ths tls us that when we perceive & {he Se, tis ot the eation of sound” in the abstract but of pariclar rector pending on How many sounds are crested By perceptual or rere of porcived locations wl be different. The simple Farr coat perception tht daslalpeychophysic as discovered cae cctne steers In simple, quiet environmen cannot be applied Nog sor a uber of tones with purer timbres (Bregman and aac ory can afc the poresived Eden ofa speech sound by ke ang occ componets hat might hers be pat of and SEAN a to ether perevived event (Darwin 1984; Cloca and eC g5) can deterine wheter wo melodies, presented 382 aera ee Peard individually or whether a new emergent melody, ‘BREE fll ents is ard ited (Dowling 1973) Tt can determine torr the naruments of on ensemble are heard with thee individ Ween or blend to form a global timbre Bregman 1990, Ch 5 In Free ffews every audlory experience in natural environments Albert. Bregae 17 [Although 1 have surveyed uch ofthe research on auditory scene vale length ina separate vole (regan 1950), 1 wil ry hee to omvey base understanding of the approach. 
The presentation willbe ‘rparized around the different environmental regularities exploited by he auditory system ‘Regularity 1. Unrelated sounds seldom start rtp ot erly the sme "Acoust components derived fom independent nvronmenal events pdm a an op ae in ede ale Give Sheagy athe momen tht another begins. The sedory yee Regist A chen uss wh ve cae the ep. oP matrg when eps ser breomes more comple buts teres he eeu sytem cal sl eon the same fequnoy ‘Shaponet os before interpreted 0a coniation of on dial Stata new one Jing i. The od sound conics to be heat fr TRE dine ong wa ner one Ifthe petra becomes simple agar srfareonmasony the components ofthe ld sound his sengthens ee eNespuon ofthe ald ones having been presenta the tine, The reise quale the added sound re derived rom the components erg infer spectrum tha ae et oer afer the components ofthe flv simpler spectrum are subtracted ou he eos ome a ebes Kw ole pert Set etn oe ert tes ‘Solis tot ince te one tinted by th oe. stn wl ese {he toe a conning hugh the noe seems fo be present al the fern te one ns ine ote Say aytr eerie at he neural cy ecu ie mune ih connate tne nt ilps pereopt ofthe tone conning behind the nie, Probably {berdenton of he slemtvenerpetaon stone tuning into aie, ane he audony esters exloaton of anor evionmental Sin tht we wil sess ner when propre re desved rom ‘Tile conning sound they tend not to change sede. wc pecedg evrple conta itis hard to determine whether thepn Stthe netrl acy tat a ier she confining tone ‘tePafowed to conta the preptin ofthe nl. A broad band ‘res Souns ery much the sane wheter oro! he ao band of {Kajcnces ated tothe tne has Been subraced from KA beer | SiS oc erng hw he tol infrmatn tom there shared Paina im ohh a band of nok, say Oo 1 Ke (HE means 18. 
'thousands of cycles per second') alternates with a wider band (0 to 2 kHz), with the narrower band of noise longer in duration than the wider band. The stimulus is generated so that the components shared by the two sounds, those in the 0-1 kHz range, are equal in intensity in the two sounds. In this demonstration listeners will hear the narrow band of noise continuing through the wideband noise (Warren 1982), with the impression that the short intermittent bursts lack the components that have been used by the auditory system to create the percept of the continuing narrow-band noise. The contribution of the 0-1 kHz components has been removed and the residual noise is experienced as having the same quality as a 1-2 kHz noise. The spectrum of the wider band of noise (0-2 kHz) has been divided up, providing separate components for the continuing and the added sounds.

An example of the old-plus-new strategy in the field of speech perception is found in an experiment by Darwin (1984). Each vowel in a language can be distinguished from all the others by the positions of a number of formants (intensity peaks) in its spectrum. Each formant consists of an augmented intensity of a number of harmonics. Darwin arranged for one of the harmonics in the lowest formant of a vowel to start before the vowel and to continue throughout its duration. In this arrangement the early harmonic played the role of the old sound and the vowel that of the new one. The effect was to cause the vowel to sound more like a different vowel. This occurred because the harmonic, when it started on its own, was identified as a separate sound. Then, when it was joined by the rest of the vowel, a part of the energy at its frequency was interpreted as the continuing tone. Only the remainder, with this part removed, was interpreted as the vowel. The remaining sensory evidence pointed to a peak in the spectrum at a different place from that shown by the whole body of evidence and, in doing so, changed the identity of the perceived vowel.

Regularity 2. Gradualness of change. (a) A single sound tends to change its properties smoothly and slowly. (b) A sequence of sounds from the same source tends to change its properties slowly.
The environmental regularity concerning the sequential resemblance of sounds from the same source takes two forms. The first concerns a single sound-producing event that extends over time, such as the continuous noise of a motor. The sound coming from such an event tends to change continuously rather than abruptly in its properties as the event unfolds.

A second form of this sequential regularity is concerned with successions of sounds from the same source. Many examples can be cited: a succession of footsteps, a series of chirps in a bird's or a frog's call, a series of pecks of a woodpecker, and so on. Sounds derived in succession from the same acoustic source tend to resemble one another, with only gradual changes taking place between members of the series.

Sometimes it is not the event that is occurring repeatedly, but our auditory access to it. In a mixture of sounds, each of which is waxing and waning in intensity, the auditory system may obtain a succession of exposures to the properties of one of the sounds relatively unmixed with properties of the others. The spectral samples caught in these successive 'glimpses' of the sound are likely to resemble one another.

An important fact about sequential resemblance is that sounds do, in fact, change. However, since the changes tend to be gradual, samples from the same acoustic event that are taken closer together in time tend to resemble each other more closely.

These facts about sequential change in the sounds of the world are encapsulated in two rules that seem to be followed by the auditory system. The first rule that is based on these regularities is the 'sudden change rule'. The auditory system will treat a sudden change of properties as the onset of a new event. Examples of this were given earlier in this chapter. We had the example of homophonic continuity as well as the example in which one ear received an abrupt rise in intensity while the other ear's sound remained steady.
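Once a sudden change has signalled a new event, the old-plus-new strategy described earlier in the chapter amounts to a subtraction. The following is a minimal sketch of that bookkeeping, not a model of the auditory system. It assumes that a spectrum can be written as a table of intensities per frequency band, and that those intensities add linearly when sounds mix. The function name `old_plus_new` and the 500 Hz band width are invented for the illustration. The example values follow the noise-band demonstration: a 0-1 kHz band alternating with a 0-2 kHz band of equal intensity in the shared range.

```python
def old_plus_new(old_spectrum, new_spectrum):
    """Split a newly arrived spectrum into a continuing part and an added part.

    Both spectra map a (low_hz, high_hz) band to an intensity that is
    assumed to be additive across sounds. Whatever the old sound can
    account for is credited to it; the residual goes to the added sound.
    """
    continuing = {}
    added = {}
    for band, total in new_spectrum.items():
        carried = min(old_spectrum.get(band, 0.0), total)
        if carried > 0:
            continuing[band] = carried
        if total - carried > 0:
            added[band] = total - carried
    return continuing, added


# Narrow band (0-1 kHz) followed by a wide band (0-2 kHz).
narrow_noise = {(0, 500): 1.0, (500, 1000): 1.0}
wide_noise = {(0, 500): 1.0, (500, 1000): 1.0,
              (1000, 1500): 1.0, (1500, 2000): 1.0}

continuing, added = old_plus_new(narrow_noise, wide_noise)
# `continuing` covers 0-1 kHz; `added` covers only 1-2 kHz.
```

In this toy case the residual is exactly the 1-2 kHz portion, which matches the reported impression that the added burst has the quality of a 1-2 kHz noise.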
In both cases, the suddenness of the change triggered the interpretation of an added sound. This set the stage for the old-plus-new strategy to decompose the spectrum into the components belonging to the old sound and those of the added one.

The use of the strategy requires a definition of suddenness. However, there is no clear boundary between gradual and sudden changes. This is illustrated in a recent unpublished study by Jean Kim and myself. We played a rapid sequence of four one-second tones to subjects and asked them to judge their order of onset. The tones were highly overlapped in time; for example, each successive tone could begin as little as 0.15 s after the previous one. The suddenness of the rise in intensity of each tone as it was turned on helped the listener to decompose it from the mixture. When the onsets took 0.04 s, the tones sounded distinct and the listener could judge their order fairly well, but when they took 0.64 s, the tones were blended together in an impenetrable mush of sound and it was almost impossible to judge their order. However, the change from sudden to gradual was not all-or-nothing in nature. An intermediate onset time gave an intermediate result. The suddenness of spectral change works like every other cue for scene analysis: the stronger it is, the more it affects the grouping.

The range of onset durations (0.04 to 0.64 s) that I have described as spanning the range of suddenness may be valid only for changes in intensity. The values that count as sudden and gradual for other features of sound, such as spatial location or timbre, have not been pinned down quantitatively.

Sometimes the auditory system will not be able to find a close resemblance between the spectra before and after the sudden change. In such a case, instead of interpreting the change as a new sound being added to the mixture or as a change in the qualities of a single continuing sound, it may conclude that a second sound has replaced
the first. An experiment by Darwin and Bethell-Fox (1977) provides an example of this in speech perception. First, it is important to note that speech has both a fundamental frequency and the frequencies of the formants. The formants are those parts of the spectrum in which the harmonics have been strengthened in intensity. The researchers synthesized a sample of continuous speech consisting of smooth formant transitions (gradual changes in the positions of spectral peaks) back and forth between two vowels. In the middle of each transition they suddenly changed the speaker's fundamental frequency, so that one vowel was always on a lower frequency, 101 Hz, and the other on a higher one, 178 Hz (Hz means cycles per second). This led to the segregation of two apparently different talkers, one speaking at a lower pitch and the other at a higher. Furthermore, even though there was a continuous change in the formant frequencies, the parts of the speech before and after the abrupt change in pitch tended to be treated as coming from separate sources. Each voice seemed to be silent during the period in which the other was talking. Furthermore, the syllables that were constructed in perception tended not to be based on information that bridged the change in the fundamental.

The second rule, which exploits the sequential resemblance of sounds from the same source, is the 'grouping by similarity' rule. We do not know all the ways in which sounds can be similar. However, the frequency of pure tones, spatial location, and the fundamental frequency of complex tones have been shown to affect grouping (Bregman 1990, Ch. 2). We are also unsure of how to measure the degree of similarity. For the frequencies of pure tones, the difference that affects grouping appears to be the difference between the logarithms of the frequencies, but for other dimensions no one has yet tried to discover appropriate quantitative measures of the degree of similarity. The effort has been directed, instead, towards demonstrating that such similarities actually have an effect on grouping, relying on common-sense assessments of similarity.

In one sense, we have already discussed similarity, since gradual change can be regarded as a special case of it.
However, in mentioning the similarity rule separately, I am thinking of cases in which the sounds occur in discrete samples, such as a succession of footsteps or a succession of glimpses of one sound in a mixture whenever the others drop in intensity. These successive samples of sound are modelled in the laboratory by successions of tones, bursts of noise, vocal sounds, and so on.

The effect of the rule is to take sounds that have similar properties and link them together into perceptual groups, and to segregate the groups from one another. Each linked group is considered by the auditory system to have come from a distinct event. Later processes tend to treat the grouped sounds as the framework within which to look for familiar patterns.

The phenomenon of auditory streaming exemplifies the working of this rule (Bregman and Campbell 1971; van Noorden 1975). Suppose we start with two sets of tones, one of high tones and the other of low tones, in which there is a considerable frequency separation between the high and low sets but where the frequency differences between the tones within each set are small. Suppose we then interleave the high (H) and low (L) tones (e.g. HLHL...) to make a long sequence, and play it to listeners. Recall that samples taken closer together in time from the same source tend to resemble each other more closely. When the sequence is slow (i.e. the samples are far apart in time), the auditory system tends to accept the whole set of tones as coming from a single source. However, when the sequence is speeded up, two perceptual changes occur. First, two auditory streams are formed: listeners hear the sequence as two distinct sources of sound active at roughly the same time. Second, patterns formed by the high and low tones taken in combination tend to disappear in favour of patterns formed by the high tones alone and the low tones alone. For example, two melodies will be heard, one involving the high notes and the other the low ones. Also, if the original mixed sequence is irregular, different temporal patterns will be heard in the two streams. For example, the sequence HHLHLLHLHHLLHL will break into the streams HH-H--H-HH--H- and --L-LL-L--LL-L.
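The way an interleaved sequence breaks into two streams can be written out mechanically. The sketch below is only a notational aid under a strong assumption: it treats segregation as complete, so that every tone is heard in exactly one stream, whereas the text makes clear that the actual outcome depends on speed and frequency separation. The function name `split_streams` is invented for the illustration. A dash marks a position where a stream is silent because the tone at that moment belongs to the other stream.

```python
def split_streams(sequence, silence="-"):
    """Return one string per tone class, with silences where other tones occur."""
    labels = sorted(set(sequence))
    return {
        label: "".join(ch if ch == label else silence for ch in sequence)
        for label in labels
    }


streams = split_streams("HHLHLLHLHHLLHL")
print(streams["H"])  # HH-H--H-HH--H-
print(streams["L"])  # --L-LL-L--LL-L
```

Each stream has its own rhythm, which is why the temporal pattern heard in the high stream differs from the one heard in the low stream.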
tn te laters the dase eps ances that open in each cam se of he neste tone the bets lcd ihe le steam bythe percept grouping So eporlpaers adn yan nope sue) cn beers o are DY te septation io netory soca heey ave the dee of diflrenc ete he quench nigh ant ow tes ee yen cong ange ES fesepaton i euecytncmes gen. Tisfequery ier her ean'ue ff gna the spec the nqunce A quence a ‘tk te eqn fence mer moa Be ped up move bee Il be seg wan Norden 197. The sexily of segregation tthe speed of the sequence can be lowes servi he af hnge betwen igh and ow tones ‘When changes or tore ply the ngunc sity tobe eed ‘nang th meee sue, per fe perce proces suid bythe enone egy at he ‘ee fends out he sn smear, he prope ted lo Sates ony set “A mpan fc abou the spepton ofthe ones of dss tat at hen he nes te dtl by uray rage) ht 8 lise Een aa igh ped the fs fe igh an ow tne ae ‘Soul megrte ine Sglestars Aste sary tem contins {Shea we polation fy segregation cars tppens tha Seyrptve endony bus up orate four second J he ed Sere tga ese ie tobuld upset oar Sands dlagte Even the suc pe ors couple of scons Scene eon wl ean, sdf the succes in ‘Peat bfsreptd mere cy the ond tine regan 17s tual pence scan toorepond othe evra at hat he Spree sound cang the act of Hing at Read ts "Nomen eae: Apptey the auto stan eee "open ot ral forthe etna the sud it apps that even simple percepts dgments ae afd by she agjegton of conporests im sierent frequency renges. Tht ‘Sac infor! spution beter tons one higher in egueney dels bones nee io age a begun separ Aes inst te Bogan 199 pp 1980 for an cu fs peony Alert 8. Bregmon 23 2.3 MULTIPLE BASES FOR SEGREGATION “The fequencies ofthe ones are not the ony dienes that affect hee ‘eyurnal rpaizaton. Tones cn alo be grouped scoring 1 sm {ici cher pail patos (esrb by Bregman 195, p. 73-0) Inada if the sounds are complex, other feces ean play le. 
In tones formed of many harmonics, as voices and musical tones are, the grouping can depend on similarities in their fundamental frequencies or in their timbres, if we define timbre by the relative strengths of the harmonics in their spectra (Singh 1987; Bregman 1990). Tones (sounds whose waveforms repeat regularly) will often segregate from noises (Dannenbring and Bregman 1976). Tones that glide from one frequency to another will group best with others that have the same slope and are in the same frequency range (Steiger and Bregman 1980).

These various types of difference compete and cooperate with one another in determining grouping. If different factors promote different groupings of the sounds, the winner will be the grouping with the most factors favouring it or the grouping that is favoured by the factors that the auditory system prefers to use (Bregman 1990).

This brings up the point that not all acoustic differences are equally important in determining grouping. Hartmann and Johnson (1991) asked listeners to recognize pairs of melodies whose notes had been interleaved to form a new sequence of tones. In each such sequence, the notes of the two melodies were made to differ from one another on a single acoustic characteristic. The most effective ways to help people to segregate the two melodies were to send them to different ears, to shift the notes of one melody up by an octave, or to introduce a timbre difference between them (pure sine-wave tones versus rich complex tones). Virtually no improvement of segregation was obtained when the difference involved the attack and decay times of the notes of the two sets, or when the rhythm of the overall sequence was altered by shifting one set of notes so that they did not fall exactly halfway between the notes of the other, or when noise was added to the notes of one melody.

When we observe, in an experiment, that segregation is more strongly affected by some acoustic factors than others, we still do not know whether the observed differences arose because of their effects on the primitive grouping mechanism or because of the activity of more sophisticated mechanisms that use learning or attention.
Evidence that there 5 ethan one mechanism comes fom the different effects that speed “ ‘onthe intentions of the subject. Van Noorden (1975) ie lseners a sequence in which tones in wo diferent frequency iterated, and seked them about the pereptal segregation of toes into high and low steams. He varied the equency separation 24 Auditory scene analy hearing in complex eirowments | ~=—r—st——s——s— a a eet atten to Wen Samet caeseiaegh andi ane mesenger ered i el eee teeny eect Se ee heer ad Stee ape Se er nee ata” howers tease re a Sal ers egrger be gh Miche ach Spcite hegre, aaa te here Le ea Toe ar dace nace Sa cat welt 2 eo ene ca ese ng ae re tne pete eee eae wea te a nak tng eogene etos Mean ae rome eee ber or St te ee ean The cee cae at atten ee hance er ee cicaetea roca a he eas SN ah ne saber For eng sce Shomer yt eg he ae ele oh Hanes tone he Se SS Serre a ef any ne poe ofan ee be anh pene ys neh Se re er aed wc ee FI em can ele, xine th sty SL ier clam er afin crane oes Pe tn cons Welnebed norsk Se ee rece ean ee caer yer Ce Wk ee ta eds in ibaad Se pire ilo CE an nto opine mead (a tae aoa eee a ae rr thet ars mete Ae cgonel hee prs Se Hang oe bc sean a eee et a er eee ited ena by pane Alter. Bren 25, r0uping. However, a purer tet of primitive grouping would be to cone listeners atenional strategies by ashing them to try to se the sume acoustic difference (eg. attack time) on coy tet but to oppose the intentions by introducing other diflerenee that could be employed by Primitive scene analysis to form groupings tht oppose the intended ‘nes. The effectiveness ofthis apposition would reveal the effectiveness With which primitive grouping weed that propery, ‘The readiness ofthe auditory system to group similar sounds when {hey occur ina sequences the bass fora light ferent version ofthe ‘cldcplusnew’ strategy fom the one I described ear. 
In the examples I gave before, one of the components of a mixture started ahead of the mixture as a whole and continued, without a break, into it. The glimpse that the auditory system obtained of the earlier-starting sound allowed it to be factored out of the mixture. A similar, but weaker, effect occurs when two sounds are used, a simpler one and a more complex one that contains the simpler one as part of it, but there is a break between them. They are rapidly alternated, with brief silences between them, in a repeating cycle. Under favourable circumstances, the listener will experience the pure tone twice on each cycle, once when it is presented in isolation and once when the complex tone occurs. This means that the isolated pure tone has captured the corresponding frequency out of the complex tone into a pure-tone stream. As a result the listener will hear the pure tone twice per cycle (the second occurrence is the component captured out of the complex tone). The remainder of the complex tone will form a second sound that occurs only once per cycle.

This extraction of an earlier-heard sound out of a complex one was described earlier as the old-plus-new strategy. The only difference in the present case is that there is a silence separating the isolated tone from its counterpart in the later mixture of components that makes up the complex tone. However, because the isolated tone (the captor) and its target inside the complex tone are not continuous, the capturing is weaker. If the target and captor are pure tones, then as the frequency difference between them increases, the capturing weakens until the second occurrence of the pure tone on each cycle can no longer be heard (Bregman and Pinker 1978).

This influence of frequency proximity in the capturing of a tone from a complex spectrum is a direct counterpart of its influence in the sequential streaming of tones that occur in different frequency ranges. The two cases can be described in the same terms: an earlier tone (or stream of similar tones) tends to link itself to newly arriving components in proportion to the proximity of the new components to those already in the stream. In both cases there is a sequential grouping by proximity.
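Frequency proximity can be given a rough numerical form. The sketch below relies on the observation made earlier in the chapter that, for pure tones, the difference that affects grouping appears to be the difference between the logarithms of the frequencies. Everything else is an assumption made for illustration: the function names, the use of octaves (base-2 logarithms) as the unit, and the idea that the captor simply links to the nearest component. It is not a model of how strongly capture occurs.

```python
import math


def log_frequency_distance(f1_hz, f2_hz):
    """Separation of two frequencies in octaves (difference of log2 values)."""
    return abs(math.log2(f1_hz / f2_hz))


def nearest_component(captor_hz, components_hz):
    """Component of a complex tone closest to the captor in log frequency."""
    return min(components_hz,
               key=lambda f: log_frequency_distance(captor_hz, f))


# On a log scale, 100 vs 200 Hz is as far apart as 1000 vs 2000 Hz,
# although the differences in Hz are 100 and 1000 respectively.
low_pair = log_frequency_distance(100, 200)
high_pair = log_frequency_distance(1000, 2000)

# A 1000 Hz captor is closest to the 1050 Hz component of this complex tone.
target = nearest_component(1000, [500, 1050, 2000])
```

A fuller account would turn the distance into a graded strength of capture rather than a single winner, but the text gives no quantitative rule for that, so none is assumed here.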
2.4 DIFFERENCES IN SPATIAL LOCATION

Regularity 2 (gradualness of change) applies to spatial location as well as to the other factors already mentioned. Sounds that are created by the same event typically come from the same position in space or from a location that is changing slowly. Correspondingly, there is a scene-analysis principle that groups sounds that come from the same spatial location. Bregman (1990) describes cases in which this grouping occurs when the sounds appear sequentially (pp. 75-85), or when they are presented at the same time (pp. 293-32). An example of the latter can be created by sending the upper- and lower-frequency components of a speech sound to separate [...] a separate sound will be heard on each side of the head (e.g. Cutting 1976). This segregation of speech components, while audible, often does not prevent their use together to form a speech sound. This freedom of speech perception from constraint by primitive scene analysis resembles what happens in the case of segregation by harmonic relations. I discuss the latter further on, together with the reasons why it may occur (see also Bregman 1990, Ch. [...]).

While the use of spatial location is central to any attempt by engineers to programme computers to segregate the sound of a person speaking from other co-occurring sounds, humans don't seem to depend so heavily on this cue. They do use spatial location, but when it competes against a sequential grouping based on frequency differences [...] (e.g. Smith et al. 1982).

When two steady sounds are played at the same time in different spatial locations, the listener's ability to derive separate estimates for their locations is not very precise (Divenyi and Oliver 1989). Yet I believe that there is a role for spatial differences in auditory scene analysis. I would guess that, in general, they play a facilitating role, enhancing segregation based on other factors such as asynchrony, or differences in frequency or timbre.
They would probably greatly facilitate the following of sounds into mixtures when the earlier sound and the added ones were at different locations. Why should the human not give absolute priority to the spatial cue? I can imagine both a reason based on the environmental information available to scene analysis and one based on the physiology of audition.

First, consider the environmental argument. The gradualness of change of the location of an event in our environment should be as strong a regularity as gradualness of change of frequency. While this may well be true, the physics of sound makes the evidence about location less reliable. We use differences in the sound received at our two ears, or the way it is filtered by our outer ears, to derive the direction from which it was coming. However, sounds can bounce around corners in the environment or reflect off walls near one of our ears. An obstructing [...] can move close to one of our ears, attenuating the sound. These phenomena [...]. Such events cannot, however, change the fundamental frequency of a sound or its harmonic relations [...]. Therefore we should not be surprised that fundamental frequency, harmonic relations, and frequency composition are used much more strongly than spatial location.

Another possible reason for the conservative use of the location cue may be found in the physiology of hearing. To segregate sounds based on location, the auditory system must determine which frequency components have come from the same spatial location; that is, it must assign separate location estimates to each one. This may be difficult to do when the components are densely packed in the spectrum.

Regularity 3. When a body vibrates with a repetitive period, [...] the frequency components are multiples of a common fundamental.

I described this regularity earlier and outlined the corresponding strategy that could be employed to group the partials (frequency components) in a spectrum. Let me now describe a few observations and experiments which show that people do indeed use such a strategy. The most obvious observation is that people can hear the individual pitches of two complex tones played at the same time, even if they start and stop together.
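The grouping strategy implied by Regularity 3 can be sketched as a simple "harmonic sieve": a partial is assigned to a fundamental if it lies close to an integer multiple of that fundamental. This is a minimal sketch under stated assumptions, not a model of the auditory system. The function names and the 2% tolerance are hypothetical, and the candidate fundamentals are supplied by hand here, whereas a listener would have to discover them from the mixture.

```python
def harmonic_number(partial_hz, f0_hz, tolerance=0.02):
    """Return n if partial_hz lies within `tolerance` (as a fraction of
    n * f0_hz) of the n-th multiple of f0_hz; otherwise return None."""
    n = round(partial_hz / f0_hz)
    if n < 1:
        return None
    deviation = abs(partial_hz - n * f0_hz) / (n * f0_hz)
    return n if deviation <= tolerance else None


def group_by_fundamental(partials_hz, candidate_f0s_hz, tolerance=0.02):
    """Assign each partial to every candidate fundamental it fits.

    Returns (groups, ungrouped): `groups` maps each candidate f0 to the
    partials accepted by its sieve; `ungrouped` lists partials that fit
    no candidate.
    """
    groups = {f0: [] for f0 in candidate_f0s_hz}
    ungrouped = []
    for partial in partials_hz:
        matches = [
            f0 for f0 in candidate_f0s_hz
            if harmonic_number(partial, f0, tolerance) is not None
        ]
        if not matches:
            ungrouped.append(partial)
        for f0 in matches:
            groups[f0].append(partial)
    return groups, ungrouped


if __name__ == "__main__":
    # Mixture of two complex tones that start and stop together:
    # harmonics of 200 Hz and harmonics of 310 Hz.
    mixture = [200, 310, 400, 600, 620, 800, 930]
    groups, ungrouped = group_by_fundamental(mixture, [200, 310])
    print(groups)
    print(ungrouped)
```

With these illustrative values each partial fits exactly one of the two sieves, so the mixture separates into two sets of partials, one per pitch, without any use of onset or location cues.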
We have reason to believe that in a complex tone, many of the partials contribute to its pitch (see Moore 1989, Ch. 5). But to derive each pitch correctly, the auditory system must be including only the partials of one of the tones and excluding the partials of the other one. In the present example, this could only be done by using the harmonic relations between the partials of the same tone. This example shows only that the auditory system can use harmonic relations for deriving two pitches at the same time. It does not prove that harmonic relations can be used to segregate groups of partials for any other purpose. This proof, however, has come from laboratory studies.

Darwin and Gardner (1986) showed that a segregation based on harmonic relations could affect the perception of a vowel. They based their procedure on a finding by Moore et al. (1986), so let me diverge a little to describe this finding. These researchers started with a complex tone in which all partials were harmonically related to the same fundamental. Such a tone is perceived as a unified whole, with only a single pitch. They found that when a low harmonic was mistuned from its harmonic value by a sufficient amount, it was heard as a separate tone with a pitch
