
Vocal fold vibrations at high soprano fundamental frequencies

Matthias Echternach, Michael Döllinger, Johan Sundberg, Louisa Traser, and Bernhard Richter

Citation: The Journal of the Acoustical Society of America 133, EL82 (2013); doi: 10.1121/1.4773200
View online: http://dx.doi.org/10.1121/1.4773200
View Table of Contents: http://asa.scitation.org/toc/jas/133/2
Published by the Acoustical Society of America

Echternach et al.: JASA Express Letters [http://dx.doi.org/10.1121/1.4773200] Published Online 4 January 2013

Vocal fold vibrations at high soprano fundamental frequencies
Matthias Echternach
Institute of Musicians' Medicine, Freiburg University Medical Center, Breisacher Str. 60,
79106 Freiburg, Germany
matthias.echternach@uniklinik-freiburg.de

Michael Döllinger
Department of Phoniatrics and Pedaudiology, Erlangen University Medical Center,
Bohlenplatz 21, 91054 Erlangen, Germany
Michael.Doellinger@uk-erlangen.de

Johan Sundberg
KTH Voice Research Centre, Speech Music and Hearing, Lindstedtsvägen 24,
SE-10044 Stockholm, Sweden
jsu@csc.kth.se

Louisa Traser and Bernhard Richter
Institute of Musicians' Medicine, Freiburg University Medical Center, Breisacher Str. 60,
79106 Freiburg, Germany
louisa.traser@uniklinik-freiburg.de, bernhard.richter@uniklinik-freiburg.de

Abstract: Human voice production at very high fundamental frequencies
is not yet understood in detail. It has been hypothesized that these frequencies
are produced by turbulence, vocal tract/vocal fold interactions, or vocal
fold oscillations without closure. Hitherto it has been impossible to analyze
the vocal mechanism visually owing to technical limitations. Here, the latest
high-speed technology, capturing 20 000 frames/s, was applied using transnasal
endoscopy. Up to 1568 Hz, human vocal folds do exhibit oscillations with
complete closure. The present results therefore suggest that human voice
production at very high F0s up to 1568 Hz is not caused by turbulence, but
rather by airflow modulation from vocal fold oscillations.
© 2013 Acoustical Society of America
PACS numbers: 43.70.Gr, 43.75.Rs, 43.75.Yy [AL]
Date Received: September 19, 2012 Date Accepted: December 10, 2012

1. Introduction
How do singers produce musical tones at the very top of the human vocal range? This
question has always been a source of fascination, as evidenced by the notoriety of the
high notes in the Queen of the Night aria in Mozart's opera Die Zauberflöte, which
calls for pitches up to F6 (1397 Hz), nearly two and a half octaves above middle
C. This infamous aria strikes fear into the heart of many classically trained sopranos,
as the high notes demand that the singer has not only excellent vocal technique but
also a great deal of courage. In fact, singing at very high pitches is often avoided by
singers, as it is frequently associated with vocal fatigue and the risk of vocal overuse.
Further, the mechanism of voice production needed to achieve these high fundamental
frequencies (F0s) is not yet fully understood.
It has often been assumed that high F0s above 1000 Hz are produced by a
separate vocal mechanical principle,1–5 leading to the use of special vocal register
terminology such as "whistle" or "flageolet" register.2,4–7 One hypothesis states that, in
contrast to lower F0s, voice production at very high pitches does not result from
modulation of the airflow by the vocal folds but is rather a result of air vortices.8,9

EL82 J. Acoust. Soc. Am. 133 (2), February 2013 © 2013 Acoustical Society of America

According to this theory, the vocal folds are stiff and a turbulent vortex develops in a
manner similar to sound production by a whistle. In contrast, it has been argued that
the sound signal is not a pure sinusoidal tone, like a whistle, but includes overtones,
even at these F0s.2 One possible explanation for this could be that the vocal folds
interact with the vortex, causing small vocal fold vibrations.2,10 Indeed, previous studies
of these F0s have sometimes observed small vocal fold oscillations without complete
vocal fold closure,3,5 which form a tight coupling of the sub- and supraglottic resonances,
resulting in the production of a strong air stream. However, these small vocal fold
vibrations might also provide an explanation of voice production at very high F0s if
the myoelastic-aerodynamic theory of voice production11,12 is applicable at these
pitches as well as for sound production at lower F0s. In this respect, it has been
assumed that very high F0s are produced with very small vocal fold vibrations by
shortening of the oscillating vocal fold length, as is typically the case in the flageolet
of string instruments.13 Competing hypotheses have been posited because it has hitherto
been almost impossible to analyze vocal fold oscillations successfully at very high F0s,
owing to the technical limitations of previous visual analysis techniques.
One drawback is the need for high sampling rates. Both stroboscopy and high-speed
videokymography have been used in previous investigations of voice production at these
F0s. Stroboscopic examinations,3,14,15 on the one hand, are limited by the construction of
an artificial glottal cycle out of many adjacent real glottal cycles; high-speed
videokymography,5 on the other hand, is limited by the relatively low sampling rate of
8000 frames/s. Also, the latter technique, as used by Švec et al., presents only a single
line of sight perpendicular to the glottis.5 High-speed imaging was also used, as one
partial aspect, in a recent study by Garnier et al.16 analyzing vocal fold oscillations up
to A#6 (1865 Hz) in a single female subject. However, the frame rate used was only
2000 frames/s, so there were at most 2 frames per glottal cycle, which limited the
accuracy of the analysis of vibratory patterns. Further, past studies have involved
transoral rigid laryngoscopy,3,5 which itself most likely has a great effect on the
physiological singing conditions at these high frequencies. Singing at very high F0s
requires modification of the vocal tract shape and its associated resonances,6,17–20 and as
rigid laryngoscopy restricts the oral articulators such as the tongue, singers are not able
to produce the voice in its usual physiological manner. Singing with a protruded tongue
might also influence the tension of the vocal folds. Finally, for physiological reasons it
is nearly impossible to sustain an /i/-like vowel at very high F0s.11,12 However, such a
vowel would be desirable because its upright epiglottis position provides good visibility
of the vocal folds using rigid laryngoscopy. As a consequence of these limitations, most of
the professional singers who participated in previous studies could not perform these high
F0s when the rigid laryngoscope was used and had to be excluded from the analysis.3,5
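The sampling-rate limitation described above can be made concrete with a short calculation. The following Python sketch is an illustration only and not part of the original analysis: it divides a camera frame rate by the fundamental frequency to estimate how many frames fall within one glottal cycle. The frame rates and F0 values are those quoted in the text; the function name `frames_per_cycle` is a hypothetical helper.

```python
# Frames captured per glottal cycle = camera frame rate / fundamental frequency.
# Frame rates and F0 values are taken from the text; the helper name is illustrative.

def frames_per_cycle(frame_rate_hz: float, f0_hz: float) -> float:
    """Return how many video frames fall within one glottal cycle."""
    if frame_rate_hz <= 0 or f0_hz <= 0:
        raise ValueError("frame rate and F0 must be positive")
    return frame_rate_hz / f0_hz

# frames/s: Garnier et al. (2000), videokymography (8000), present study (20 000)
FRAME_RATES = (2000, 8000, 20000)
# Hz: C6, G6, A#6
F0_VALUES = (1047, 1568, 1865)

if __name__ == "__main__":
    for rate in FRAME_RATES:
        row = ", ".join(
            f"{f0} Hz: {frames_per_cycle(rate, f0):5.2f}" for f0 in F0_VALUES
        )
        print(f"{rate:>6} frames/s -> {row}")
```

For G6 (1568 Hz) this gives roughly 12.8 frames per cycle at 20 000 frames/s and about 5.1 at 8000 frames/s, whereas 2000 frames/s yields fewer than 2 frames per cycle for any F0 above 1000 Hz.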
Transnasal fiber optic high-speed digital laryngoscopy could be used as an
alternative technique to meet the criteria needed for a study of this
kind: good visibility of the vocal folds and minimal influence on vocal tract shaping.
Until now, however, it has not been possible to use the transnasal flexible high-speed
technique, owing to problems in illuminating dark cavities in the human body
sufficiently for frame rates of 20 000 frames/s or more.
2. Materials and methods
In this investigation we applied the latest high-speed imaging technique (Camera 1610
and Camera M310, Vision Research, Wayne, NJ) and coupled the camera
(C-mount adapter 25 mm, Karl Storz, Tuttlingen, Germany) to a transnasal fiber
optic endoscope (ENF GP, Olympus, Hamburg, Germany), see Fig. 1. The illumination
was also improved by using a 300 W xenon light source (Visera CLV S40,
Olympus, Hamburg, Germany). With this equipment, it was possible to analyze
entire vocal fold oscillations at a frame rate of 20 000 frames/s. In order to verify the
F0 values and to aid the analysis of oscillatory patterns, an audio signal and an
electroglottographic signal (EGG, Laryngograph, London) were recorded simultaneously
(audio sampling rate 24 kHz). We analyzed a professional, classically trained soprano,
asking her to sustain the pitches of C6 (1047 Hz), D6 (1175 Hz), E6 (1319 Hz), F6
(1397 Hz), and G6 (1568 Hz) whilst maintaining, as far as possible, an /i/-like vowel.

Fig. 1. (Color online) Schematic illustration of the experimental setup: The red path shows the transnasal
high-speed recording and post-processing by the phonovibrogram, see also Mm. 1. The electroglottographic signal
(EGG, blue path) and the audio signal (yellow path) are recorded synchronously. The green frame refers to the
pitches with associated fundamental frequencies.
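The pitch and frequency pairs listed for the singing task are consistent with twelve-tone equal temperament. The Python sketch below reproduces them under the assumption, which is not stated in the text, that A4 is tuned to 440 Hz; the names `NOTE_OFFSETS`, `pitch_to_hz`, and `TASK_PITCHES` are illustrative.

```python
# Equal-tempered pitch-to-frequency conversion.
# Assumption (not stated in the text): twelve-tone equal temperament, A4 = 440 Hz.

NOTE_OFFSETS = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
                "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}

def pitch_to_hz(name: str, octave: int, a4_hz: float = 440.0) -> float:
    """Frequency of a pitch such as ("G", 6) in scientific pitch notation."""
    # Signed semitone distance from A4.
    semitones = NOTE_OFFSETS[name] - NOTE_OFFSETS["A"] + 12 * (octave - 4)
    return a4_hz * 2.0 ** (semitones / 12.0)

# Pitches sustained by the singer in this study.
TASK_PITCHES = [("C", 6), ("D", 6), ("E", 6), ("F", 6), ("G", 6)]

if __name__ == "__main__":
    for name, octave in TASK_PITCHES:
        print(f"{name}{octave}: {pitch_to_hz(name, octave):.0f} Hz")
```

Rounded to the nearest hertz, the five task pitches come out as 1047, 1175, 1319, 1397, and 1568 Hz, matching the values given in the text.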

3. Results
The singer experienced a register shift between D6 and E6, which was associated with
a narrowing of the pharyngeal walls, but not with any major laryngeal events. Full
oscillations were observed along the whole length of the vocal folds with complete
vocal fold closure, as shown in Fig. 2 and Mm. 1. From this material a phonovibrogram
was constructed, which allows the oscillatory characteristics of each vocal fold to
be established by segmenting the free edges of both vocal folds and scanning these lines
over time,21,22 as shown in Fig. 3. As demonstrated by the black parts, which
indicate zero distance to the anterior-posterior midline and therefore glottal closure,
each glottal cycle included a phase of total closure. However, with rising F0 the
closed phase decreased in relation to the glottal cycle. Further, as indicated by
the color changes from intense red (light gray), which refers to the greatest distance
from the glottal midline, to black, vocal fold oscillations were mostly periodic and there
was no major difference in oscillation pattern between the left and right vocal folds.
Also, the frequency of vocal fold oscillations agreed with the F0 calculations from the
audio and EGG signals.
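The idea behind the phonovibrogram described above, mapping the distance of each vocal fold edge from the midline over time, can be sketched as follows. This Python example is a deliberately simplified illustration and not the published phonovibrography algorithm of Refs. 21 and 22: it assumes that edge segmentation has already been done and that each frame supplies left and right edge positions (in pixels) at a few points along the anterior-posterior axis. All function names and the toy data are hypothetical.

```python
# Simplified phonovibrogram-style map (illustrative; not the published algorithm).
# Assumed input: per frame, the segmented left and right vocal fold edge positions
# (pixels) at each point along the anterior-posterior axis, plus the midline position.

def edge_distances(left_edge, right_edge, midline):
    """Distances of both fold edges from the midline for one frame; 0 means contact."""
    left = [max(0.0, midline - x) for x in left_edge]
    right = [max(0.0, x - midline) for x in right_edge]
    return left, right

def phonovibrogram(frames, midline):
    """Build a 2-D map: rows are positions along the folds, columns are frames."""
    columns = []
    for left_edge, right_edge in frames:
        left, right = edge_distances(left_edge, right_edge, midline)
        columns.append(left + right)  # left fold rows first, then right fold rows
    return [list(row) for row in zip(*columns)]

def closed_frames(pvg):
    """Indices of frames in which every edge point lies on the midline."""
    n_frames = len(pvg[0]) if pvg else 0
    return [t for t in range(n_frames) if all(row[t] == 0 for row in pvg)]

if __name__ == "__main__":
    # Toy data: closed, open, closed (midline at pixel column 10).
    toy_frames = [
        ([10, 10, 10], [10, 10, 10]),
        ([9, 8, 9], [11, 12, 11]),
        ([10, 10, 10], [10, 10, 10]),
    ]
    pvg = phonovibrogram(toy_frames, midline=10)
    print(closed_frames(pvg))
```

In such a map, entries equal to zero correspond to the black parts of the phonovibrogram, and a column consisting only of zeros corresponds to a frame with total glottal closure.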
Mm. 1. Laryngoscopic view of the vocal folds using high-speed digital imaging at 20 000
frames/s. The singer is producing the pitch of G6 (1568 Hz). The construction of the
phonovibrogram is also shown to demonstrate the total closure of the vocal folds at this
fundamental frequency. This is a file of type mov (9.6 MB).


Fig. 2. (a) Anatomical landmarks in the laryngoscopic view. (b) Representative images from the laryngoscopic
high-speed material representing one glottal cycle in relation to a glottal area waveform. Subdivisions of the
glottal cycle are also indicated. The pictures refer to voice production at a fundamental frequency of G6
(1568 Hz); one glottal cycle therefore lasts 0.64 ms.

4. Discussion
The results clearly show that this professional soprano produced the vocal sound
through modulation and interruption of the airflow by the vocal folds up to 1568 Hz,
which is in strong disagreement with the theory of a whistle-like mechanism. Further, we
found that the whole length of the membranous part of the vocal folds took part in
the oscillatory process, which is in disagreement with the theory of a flageolet-like
mechanism of voice production. Finally, it is also evident that there was total closure of
the vocal folds, even at these highest pitches. Thus, voice production at F0s up to
1568 Hz showed no apparent major difference from the mechanism described for lower
frequencies by the myoelastic-aerodynamic theory of human voice production.11,12 It
should be mentioned, however, that human voice production might be possible up to
an F0 of 5000 Hz. It cannot be excluded that another voice production mechanism occurs
at higher F0s than those measured in the present study.
Due to the very high frame rate, the number of pixels representing the glottis
varied between 20 × 40 and 30 × 70. As a consequence, the phonovibrograms were
associated with some artifacts, such as horizontal lines. However, as shown in Fig. 3,
there were clearly oscillations along the whole membranous part of the glottis and
completely closed phases during the glottal cycle.
The finding of total closure during the glottal cycle was unexpected, but it
nevertheless seems reasonable that total closure of the vocal folds occurs, even at these
high F0s. Our study analyzed a professionally trained voice, which needs its own
carrying power without the amplification of a microphone. A persistent gap between the
vocal folds, however, could be associated with only weak overtones in the sound
spectrum. Complete vocal fold closure would therefore seem to be a rather effective
alternative in order to fulfill the requirements associated with singing Mozart's Queen of
the Night on stage.

Fig. 3. (Color online) Pitches with associated phonovibrograms, glottal area waveforms, electroglottographic
(EGG) signals, and acoustic spectra. In the phonovibrogram graphs, black represents the state when that specific
part of the vocal fold is at the glottal midline; the more intense the red (the lighter the gray), the further that
part is from the glottal midline. The glottal area graph (sum of pixels in the glottis) relates to maximum and
minimum opening of the glottal area, where the zero-line represents complete glottal closure. In the EGG signal,
maximum amplitude relates to maximum contact. Total closure of the vocal folds should therefore ideally be
accompanied by maximum EGG amplitude. The audio spectrum is shown in the final row, where peaks usually
relate to the overtones in the spectrum. In some cases (e.g., for F6) subharmonics (F0/n) also occur.
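The glottal area waveform of Fig. 3 is described as the sum of pixels in the glottis, with zero representing complete closure. A minimal Python sketch of that bookkeeping is given below. It assumes, hypothetically, that each high-speed frame has already been segmented into a binary mask in which 1 marks a glottis pixel; the function names and toy masks are illustrative and do not reproduce the authors' processing pipeline.

```python
# Glottal area waveform and closed-phase fraction from segmented frames (illustrative).
# Assumed input: one binary mask per frame, 1 = pixel inside the glottal opening.

def glottal_area(mask):
    """Glottal area of one frame, counted as the number of glottis pixels."""
    return sum(sum(row) for row in mask)

def area_waveform(masks):
    """Glottal area for every frame in a sequence of masks."""
    return [glottal_area(mask) for mask in masks]

def closed_fraction(waveform):
    """Fraction of frames with zero glottal area, i.e., complete closure."""
    if not waveform:
        raise ValueError("waveform is empty")
    return sum(1 for area in waveform if area == 0) / len(waveform)

if __name__ == "__main__":
    # Toy sequence of 2 x 3 masks: closed, slightly open, fully open, closed.
    toy_masks = [
        [[0, 0, 0], [0, 0, 0]],
        [[0, 1, 0], [0, 0, 0]],
        [[0, 1, 0], [1, 1, 0]],
        [[0, 0, 0], [0, 0, 0]],
    ]
    waveform = area_waveform(toy_masks)
    print(waveform, closed_fraction(waveform))
```

Applied over whole glottal cycles, the closed fraction is one simple way to express the observation that the closed phase becomes shorter relative to the cycle as F0 rises.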

ACKNOWLEDGMENTS
All devices used for the high-speed recording were borrowed from the manufacturers. The
authors would therefore like to thank Olympus, Hamburg, Germany, Karl Storz, Tuttlingen,
Germany, and Vision Research, Baden-Baden, Germany, for their help in realizing this study.
The authors would also like to thank Dieter L. Heene and Fabian Burk for editorial help,
Denis Dubrovskiy for help in construction of the supplementary video, and Jude Brereton
(www.in2voice.co.uk) for native-speaker correction. Also, the authors would like to thank the
subject for her willingness to take part in this study. The work of M.E. and B.R. is supported
by the Deutsche Forschungsgemeinschaft (DFG), Grant No. RI 1050/4-1. The contribution of M.D.
was enabled by the Deutsche Forschungsgemeinschaft (DFG), Grant No. FOR894/1-2.

References and links


1 N. Henrich, "Mirroring the voice from Garcia to the present day: Some insights into singing voice
registers," Logoped. Phoniatr. Vocol. 31, 3–14 (2006).
2 H. Herzel and R. Reuter, "Whistle register and biphonation in a child's voice," Folia Phoniatr. Logop.
49, 216–224 (1997).
3 A. Keilmann and F. Michek, "Physiologie und akustische Analysen der Pfeifstimme der Frau"
(Physiology and acoustic analysis of whistle voice of the woman), Folia Phoniatr. (Basel) 45, 247–255
(1993).
4 B. Roubeau, N. Henrich, and M. Castellengo, "Laryngeal vibratory mechanisms: the notion of vocal
register revisited," J. Voice 23, 425–438 (2009).
5 J. G. Švec, J. Sundberg, and S. Hertegård, "Three registers in an untrained female singer analyzed by
videokymography, strobolaryngoscopy and sound spectrography," J. Acoust. Soc. Am. 123, 347–353
(2008).
6 D. G. Miller and H. K. Schutte, "Physical definition of the flageolet register," J. Voice 7, 206–212
(1993).
7 R. Miller, The Structure of Singing (Schirmer Books, New York, 1986).
8 P. Schultz, "Über einen Fall von willkürlichem laryngealen Pfeifen beim Menschen" (About a case of
voluntary laryngeal human whistling), Arch. Physiol., Suppl. 523 (1902).
9 J. W. Van den Berg, "Vocal ligaments versus registers," NATS Bull. 20, 16–21 (1963).
10 D. A. Berry, H. Herzel, I. R. Titze, and B. H. Story, "Bifurcations in excised larynx experiments,"
J. Voice 10, 129–138 (1996).
11 I. R. Titze, Principles of Voice Production (Prentice Hall, Englewood Cliffs, NJ, 1994).
12 J. Sundberg, The Science of the Singing Voice (Northern Illinois University Press, DeKalb, IL, 1987).
13 F. Martienssen-Lohmann, Der wissende Sänger (The Knowing Singer) (Atlantisverlag, Zurich, 1963).
14 R. Luchsinger and C. Dubois, "Phonetische und stroboskopische Untersuchungen an einem
Stimmphänomen" (Phonetic and stroboscopic examination on a voice phenomenon), Folia Phoniatr.
8, 201–210 (1956).
15 E. J. Garde, "Observation stroboscopique de la vibration des cordes vocales dans le petit registre
(ou registre de sifflet) des soprani suraigus" (Stroboscopic observation of the vibration of vocal chords
in the small or whistle register of high sopranos), Folia Phoniatr. 3, 248–253 (1951).
16 M. Garnier, N. Henrich, L. Crevier-Buchmann, C. Vincent, J. Smith, and J. Wolfe, "Glottal behavior in
the high soprano range and the transition to the whistle register," J. Acoust. Soc. Am. 131, 951–962
(2012).
17 M. Echternach, J. Sundberg, S. Arndt, M. Markl, M. Schumacher, and B. Richter, "Vocal tract in
female registers – a dynamic real-time MRI study," J. Voice 24, 133–139 (2010).
18 M. Garnier, N. Henrich, J. Smith, and J. Wolfe, "Vocal tract adjustments in the high soprano range,"
J. Acoust. Soc. Am. 127, 3771–3780 (2010).
19 E. Joliveau, J. Smith, and J. Wolfe, "Acoustics: tuning of vocal tract resonance by sopranos," Nature
(London) 427, 116 (2004).
20 J. Sundberg, "Articulatory configuration and pitch in a classically trained soprano singer," J. Voice 23,
546–551 (2009).
21 M. Döllinger, "The next step in voice assessment: high-speed digital endoscopy and objective
evaluation," Curr. Bioinform. 4, 101–111 (2009).
22 J. Lohscheller, U. Eysholdt, H. Toy, and M. Döllinger, "Phonovibrography: mapping high-speed movies
of vocal fold vibrations into 2-D diagrams for visualizing and analyzing the underlying laryngeal dynamics,"
IEEE Trans. Med. Imaging 27, 300–309 (2008).
