Educational Gerontology, 27:159–168, 2001

Copyright © 2001 Brunner-Routledge


0360-1277/01 $12.00 + .00

EFFECTS OF AGING ON SELECTED ACOUSTIC VOICE
PARAMETERS: PRELIMINARY NORMATIVE DATA AND
EDUCATIONAL IMPLICATIONS

Steve An Xue
School of Hearing and Speech Sciences,
Ohio University, Athens, Ohio, USA

Dimitar Deliyski
Kay Elemetrics Corp., Lincoln Park, New Jersey, USA

The study reported in this article attempted to obtain normative acoustic data
of voice for elderly male and female speakers and to explore the educational
implications of the effects of aging on those selected acoustic parameters. Voice
samples from 21 male and 23 female elderly speakers aged 70 to 80 years were
obtained on measures of 15 selected Multi-Dimensional Voice Program acoustic
parameters. These data then were compared with the published norms for young
and middle-aged adults. The results showed that, compared with young and
middle-aged adults, elderly speakers had significantly different (usually poorer)
vocal output on all of the selected acoustic parameters of voice. These findings
illustrate the importance of establishing acoustic norms and thresholds for elderly
men and women and stress the necessity of using discretion in making diagnostic
measurements of elderly speakers’ acoustic parameters of voice. This article also
highlights the educational implications of such aging voice changes.

Three factors contributed to the study discussed in this article. First,
there has been a substantial, continuous increase in the proportion of
elderly individuals in the United States (Brock, 1990; Maurer, 1984). As
a result, speech and hearing clinicians have been experiencing a
corresponding increase in the number of elderly clients with complaints of

Address correspondence to Steve An Xue, Lindley Hall 201, School of Hearing &
Speech Sciences, Ohio University, Athens, Ohio 45701-2979, USA. E-mail: xue@ohio.edu


vocal difficulties (Mueller, 1985). Second, normative data on the vocal
performance of individuals over age 70 are quite limited (Mueller,
1985, 1991; Xue & Mueller, 1996). Third, with advances in computer
technology, the sophisticated acoustic analysis of voice has become
common practice for many speech and hearing clinicians. For example,
the Multi-Dimensional Voice Program (MDVP; Kay Elemetrics Corp.,
Lincoln Park, NJ) is one of the leading voice analysis programs. Since
its introduction in 1992, MDVP has been used extensively for voice
diagnosis and treatment in many settings, including rehabilitation
centers, voice clinics, hospitals, schools, and university research labo-
ratories. However, acoustic norms and thresholds of voice for the
elderly, especially those over age 70, have been unavailable. The devel-
opers of MDVP, as well as the manufacturers of many other acoustic
voice analysis programs, provide only acoustic norms and thresholds
computed from the voice samples of young and middle-aged adults
(Deliyski & Gress, 1998). Without normative acoustic data of voice for
elderly speakers or findings from cross-sectional studies of young and
middle-aged and elderly speakers, voice clinicians’ diagnostic measure-
ments of elderly speakers’ voice—and the clinical judgments based on
those measurements—may be invalid and unreliable. Therefore, the
purposes of this study were to obtain preliminary normative acoustic
data of voice for elderly male and female speakers for 15 selected
MDVP parameters and compare the acoustic norms and thresholds of
these elderly speakers with those of young and middle-aged adults to
explore the effects of aging on those acoustic parameters of voice and
identify the clinical and educational implications of such aging effects
for speech and hearing clinicians.

METHOD

Participants
Participants were 21 white elderly men (mean age = 75.43 years,
SD = 2.96) and 23 white elderly women (mean age = 74.83 years,
SD = 4.06). All participants were free of perceived speech or voice
disorders and passed a hearing screening of 35 dB HL in at least
one ear at 0.5, 1, 2, and 4 kHz. All participants used English as
their primary language and were from the same dialectal regions. All
participants were in good health, were free of neurological disease, and
were nonsmokers in the previous 5 years.

Speech Recording and Analysis Procedures


All participants were seated in a quiet room 20 cm from a high-
quality dynamic microphone (AT825 Audio-technica, Japan) connected
to a digital tape recorder (DA-P1 Tascam, TEAC, Japan). All recordings
were made onto digital audiotapes (RT-R64MP Panasonic, Matsushita,
Japan).
Voice samples were elicited by asking each participant to produce
sustained phonations of the /a/ sound at his or her habitual levels of
pitch and loudness. Each participant practiced sustained phonations
for about 5 min as a warm-up exercise before the recording. The inves-
tigator ensured that each participant was comfortable and competent
in producing sustained phonations at his or her habitual levels of
pitch and loudness. Three sustained phonations (with each phonation
lasting longer than 3 s) were then recorded. The second production
was used for data analysis. To rule out onset and offset effects of
voicing, the segment analyzed was the 1-s portion in the middle
of the vowel production. The selected segments were later digitized
(50-kHz sampling rate) and analyzed using MDVP Model 4305 of the
Computerized Speech Lab (Model 4300B; Kay Elemetrics). Fifteen of
the MDVP acoustic parameters of voice were chosen for this study. The
other MDVP parameters were excluded as irrelevant for the purposes
of the study (e.g., length of analyzed data sample and highest Fo for
all extracted pitch periods) or lacking sufficient proof of validity in
the literature (e.g., total number of segments computed during the
autocorrelation analysis and amplitude tremor frequency).
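
The segment selection described above can be illustrated with a short sketch. It assumes the whole phonation is already available as a list of digitized samples, which differs from the procedure reported here (segments were selected first and digitized afterward). The function name middle_segment and the recording length are hypothetical; only the 1-s duration and the 50-kHz rate come from the text.

```python
def middle_segment(samples, sample_rate_hz, duration_s=1.0):
    """Return the centered slice of `samples` that lasts `duration_s` seconds."""
    n_keep = int(round(duration_s * sample_rate_hz))
    if n_keep > len(samples):
        raise ValueError("recording is shorter than the requested segment")
    # Discard equal amounts from the start and the end of the phonation
    # so that voicing onset and offset fall outside the analyzed slice.
    start = (len(samples) - n_keep) // 2
    return samples[start:start + n_keep]


# Hypothetical phonation of 160,000 samples (3.2 s at 50 kHz).
sample_rate_hz = 50_000
recording = [0.0] * 160_000
segment = middle_segment(recording, sample_rate_hz)
print(len(segment))  # 50000 samples, i.e., 1 s at 50 kHz
```

Centering the slice is one simple way to keep onset and offset out of the analysis; a different rule (for example, skipping a fixed time after onset) would serve the same purpose.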
The 15 selected acoustic parameters were defined according to the
Multi-Dimensional Voice Program Model 4305 Manual (Kay Elemetrics
Corp., 1992).

1. Average fundamental frequency (Fo Hz) represents the average
fundamental frequency for all extracted pitch periods.
2. Absolute jitter (Jita µs) gives an evaluation of the period-to-period
variability of the pitch period within the analyzed voice sample.
3. Jitter percent (Jitt %) gives an evaluation of the variability of the
pitch period within the analyzed voice sample. It represents the
relative period-to-period (very short term) variability.
4. Phonatory fundamental frequency range (PFR semitones) is the
range between the highest (Fhi) and the lowest (Flo) fundamental
frequencies of the current sample expressed in semitones.
5. Pitch period perturbation quotient (PPQ %) gives an evaluation of
the variability of the pitch period within the analyzed voice sample
at smoothing factor 5 periods.
6. Relative average perturbation (RAP %) gives an evaluation of the
variability of the pitch period within the analyzed voice sample at
smoothing factor 3 periods.
7. Smoothed pitch period perturbation quotient (sPPQ %) gives an
evaluation of the short- or long-term variability of the pitch period
within the analyzed voice sample. At high smoothing factors, sPPQ
correlates with the magnitude of long-term frequency modulations
of Fo. The sPPQ smoothing factor used in this study was 55 periods.
8. Standard deviation of the fundamental frequency (STD Hz) gives
the standard deviation of the fundamental frequency within the
analyzed voice sample.
9. Fundamental frequency variation (vFo %) represents the relative
standard deviation of the period-to-period calculated fundamental
frequency. It reects the very long-term variations of Fo for all of
the analyzed voice sample.
10. Amplitude perturbation quotient (APQ %) gives an evaluation of
the variability of the peak-to-peak amplitude within the analyzed
voice sample at smoothing factor 11 periods.
11. Shimmer in decibels (ShdB dB) gives an evaluation of the
period-to-period variability of the peak-to-peak amplitude within
the analyzed voice sample.
12. Shimmer percent (Shim %) gives an evaluation of the variability
of the peak-to-peak amplitude within the analyzed voice sample.
It represents the relative period-to-period (very short term) vari-
ability of the peak-to-peak amplitude.
13. Noise-to-harmonic ratio (NHR) is an average ratio of energy of the
inharmonic components in the range of 1500 Hz to 4500 Hz to the
harmonic components energy in the range of 70 Hz to 4500 Hz.
It is a general evaluation of the noise presence in the analyzed
signal (e.g., amplitude and frequency variations, turbulence noise,
subharmonic components, and voice breaks).
14. Soft phonation index (SPI) is an average ratio of the lower
frequency harmonic energy (70 Hz–1600 Hz) to the higher
frequency (1600 Hz–4500 Hz) harmonic energy. Increased SPI
may be an indication of incomplete or loosely adducted vocal folds
during phonation. SPI is very sensitive to the vowel formant
structure because vowels with lower high-frequency energy result
in higher SPI.
15. Voice turbulence index (VTI) is the average ratio of the spec-
tral inharmonic high-frequency energy in the range of 2800 Hz
to 5800 Hz to the spectral harmonic energy in the range of
70 Hz to 4500 Hz in areas of the signal in which the influence
of the frequency and amplitude variations, voice breaks, and
subharmonic components is minimal. It measures the relative
energy level of high-frequency noise and mostly correlates with the
turbulence caused by incomplete or loose adduction of the vocal
folds.
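
Several of the perturbation measures defined above can be sketched in a few lines. The formulas below are common textbook formulations and are assumptions; they are not taken from the MDVP manual, and MDVP's own pitch extraction and computations may differ in detail. The function names, dictionary keys, and sample values are hypothetical. Inputs are a list of pitch periods in seconds and a matching list of peak-to-peak amplitudes.

```python
import math
from statistics import mean, pstdev


def mean_abs_step(values):
    """Mean absolute difference between consecutive values."""
    return mean(abs(b - a) for a, b in zip(values, values[1:]))


def perturbation_quotient(values, window):
    """Mean absolute deviation of each value from its centered moving
    average over `window` points, as a percentage of the overall mean.
    Requires at least `window` values."""
    half = window // 2
    deviations = [
        abs(mean(values[i - half:i + half + 1]) - values[i])
        for i in range(half, len(values) - half)
    ]
    return 100.0 * mean(deviations) / mean(values)


def voice_measures(periods_s, amplitudes):
    """Assumed formulations of selected frequency and amplitude measures."""
    f0_hz = [1.0 / t for t in periods_s]
    jita_s = mean_abs_step(periods_s)
    return {
        "Fo_Hz": mean(f0_hz),
        "Jita_us": 1e6 * jita_s,  # absolute jitter, converted to microseconds
        "Jitt_pct": 100.0 * jita_s / mean(periods_s),
        "RAP_pct": perturbation_quotient(periods_s, 3),
        "PPQ_pct": perturbation_quotient(periods_s, 5),
        "PFR_st": 12.0 * math.log2(max(f0_hz) / min(f0_hz)),
        "STD_Hz": pstdev(f0_hz),
        "vFo_pct": 100.0 * pstdev(f0_hz) / mean(f0_hz),
        "ShdB_dB": mean(
            abs(20.0 * math.log10(b / a))
            for a, b in zip(amplitudes, amplitudes[1:])
        ),
        "Shim_pct": 100.0 * mean_abs_step(amplitudes) / mean(amplitudes),
        "APQ_pct": perturbation_quotient(amplitudes, 11),
    }


# Hypothetical data: 20 cycles alternating between two periods and amplitudes.
periods_s = [0.0050, 0.0051] * 10
amplitudes = [1.0, 1.1] * 10
for name, value in voice_measures(periods_s, amplitudes).items():
    print(f"{name}: {value:.3f}")
```

Under the same assumptions, sPPQ would correspond to perturbation_quotient(periods_s, 55), which needs at least 55 extracted periods. The spectral measures (NHR, SPI, VTI) require harmonic and inharmonic energy estimates and are outside this sketch.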

As a measure of reliability, tape-recorded speech samples from 6
participants (3 men, 3 women) were selected at random for reanalysis.
Pearson product-moment correlation coefficients (r = .98) showed
significant correlation for all tested and retested variables. Thus, the
reliability of the results was judged to be adequate for the purposes of
this study.
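
For readers unfamiliar with the test-retest check, a minimal sketch of a Pearson product-moment correlation follows. The six pairs of values are hypothetical and are not data from this study; the function name pearson_r is likewise an assumption.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical jitter percent values for 6 participants:
# first analysis versus reanalysis of the same recordings.
first_pass = [0.8, 1.5, 2.1, 3.4, 1.1, 2.7]
second_pass = [0.9, 1.4, 2.2, 3.3, 1.2, 2.6]
print(round(pearson_r(first_pass, second_pass), 3))
```

A coefficient near 1 indicates that the reanalysis ranks and spaces the participants almost exactly as the first analysis did.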

RESULTS
The norms of the 15 selected MDVP acoustic parameters of voice
computed from the voice samples of young and middle-aged adults
(Deliyski & Gress, 1998) and the norms of the elderly speakers obtained
from this study are presented in Table 1. SPSS (1994) multivariate
analysis of variance was performed with the two age groups as the
independent variable and the 15 selected MDVP acoustic parameters
as the dependent variables. The results showed that, as a group, the
elderly male and female speakers had significantly lower (p < .01) Fo
(159 Hz) than the young and middle-aged speakers (204.85 Hz). For
the other 14 acoustic parameters of voice (APQ %, Jita µs, Jitt %, PFR
semitones, PPQ %, RAP %, sPPQ %, STD Hz, vFo %, ShdB dB,
Shim %, SPI, NHR, and VTI), the elderly speakers had significantly higher
measurements (p < .01) than the young and middle-aged speakers. In
other words, the elderly speakers over age 70 had significantly higher
frequency variations of voice (Jita µs, Jitt %, PFR semitones, PPQ %,
RAP %, sPPQ %, STD Hz, and vFo %), higher intensity variations of
voice (APQ %, ShdB dB, and Shim %), and greater noise levels in the
harmonic structures of vowels (NHR, SPI, and VTI) than the young
and middle-aged speakers (see Table 2).

DISCUSSION
In this study, elderly speakers demonstrated different (or poorer)
measurements on all of the selected acoustic parameters of voice
compared with young and middle-aged adults. The natural process
of aging has a significant impact on the acoustic measurements of
speakers’ vocal output. Thus, it is very important that voice clinicians
be aware of such effects and use discretion when making acoustic
diagnoses and clinical judgments of elderly clients’ voices.
TABLE 1 Means and Standard Deviations of Norms of 15 Selected Acoustic Parameters From the Sustained /a/
Productions of Elderly and Young and Middle-Aged Speakers

               All elderly       All young     Elderly men       Young men   Elderly women     Young women
                  (n = 44)        (n = 53)        (n = 21)        (n = 20)        (n = 23)        (n = 33)
Voice
parameter        M      SD       M      SD       M      SD       M      SD       M      SD       M      SD

Fo Hz       159.02   47.09  204.85   54.82  127.62   29.18  145.22   23.41  187.70   42.15  243.97   27.46
APQ %         4.04    2.78    1.63    0.71    4.20    2.28    1.99    0.81    3.89    3.22    1.40    0.53
Jita µs     143.71  130.24   32.77   27.33  170.11  127.78   41.66   36.48  119.61  130.53   26.93   16.65
Jitt %        2.06    1.81    0.62    0.43    2.10    1.55    0.59    0.54    2.02    2.03    0.63    0.33
NHR           0.19    0.10    0.12    0.01    0.18    0.08    0.12    0.01    0.20    0.11    0.11    0.01
PFR st        4.11    3.19    2.19    1.07    3.52    1.63    2.10    1.07    4.65    4.11    2.25    1.06
PPQ %         1.25    1.17    0.36    0.24    1.24    0.98    0.34    0.29    1.26    1.35    0.37    0.21
RAP %         1.22    1.07    0.37    0.27    1.24    0.92    0.35    0.33    1.20    1.21    0.38    0.21
ShdB dB       0.48    0.37    0.19    0.08    0.49    0.31    0.22    0.09    0.48    0.42    0.18    0.07
Shim %        5.43    4.02    2.21    0.92    5.54    3.51    2.52    1.00    5.34    4.51    2.00    0.79
SPI          14.50   10.50    7.23    4.02   19.24   12.47    6.77    3.78   10.18    5.76    7.53    4.13
sPPQ %        2.11    2.71    0.54    0.26    1.74    0.98    0.56    0.30    2.45    3.64    0.53    0.22
STD Hz        5.44    8.27    2.18    1.83    3.12    1.56    1.35    0.68    7.57   11.02    2.72    2.12
vFo %         3.53    5.59    1.07    0.83    2.45    1.20    0.94    0.43    4.51    7.59    1.15    1.01
VTI           0.08    0.07    0.05    0.01    0.08    0.07    0.05    0.02    0.08    0.07    0.05    0.01

Note. Norms for young and middle-aged participants taken from Deliyski & Gress (1998). Age ranges: elderly = 70–80 years; young and
middle-aged = 20–55 years. Young = young and middle-aged; Fo Hz = average fundamental frequency; APQ % = amplitude perturbation
quotient; Jita µs = absolute jitter; Jitt % = jitter percent; NHR = noise-to-harmonic ratio; PFR st = phonatory fundamental frequency
range in semitones; PPQ % = pitch period perturbation quotient; RAP % = relative average perturbation; ShdB dB = shimmer in decibels;
Shim % = shimmer percent; SPI = soft phonation index; sPPQ % = smoothed pitch period perturbation quotient; STD Hz = standard deviation
of the fundamental frequency; vFo % = fundamental frequency variation; VTI = voice turbulence index.
TABLE 2 F Values and Levels of Confidence for 15 Selected Acoustic
Parameters

               All elderly vs.     Elderly men vs.   Elderly women vs.
Voice                all young           young men         young women
parameter             F(1, 93)            F(1, 39)            F(1, 54)

Fo Hz                    26.10                5.09               25.88
APQ %                    33.49               16.21               19.24
Jita µs                  35.65               18.28               16.50
Jitt %                   30.21               16.48               14.93
NHR                      27.13                8.35               21.60
PFR st                   16.29                9.95               10.66
PPQ %                    27.92               15.15               14.34
RAP %                    29.83               16.01               14.88
ShdB dB                  27.91               13.39               16.13
Shim %                   29.53               13.21               17.62
SPI                      26.30               17.85                4.29
sPPQ %                   16.52               25.78                9.29
STD Hz                    8.32               20.68                6.26
vFo %                     9.67               27.38                6.42
VTI                       8.58                3.22                5.70

Note. Fo Hz = average fundamental frequency; APQ % = amplitude
perturbation quotient; Jita µs = absolute jitter; Jitt % = jitter percent;
NHR = noise-to-harmonic ratio; PFR st = phonatory fundamental frequency
range in semitones; PPQ % = pitch period perturbation quotient; RAP % =
relative average perturbation; ShdB dB = shimmer in decibels; Shim % =
shimmer percent; SPI = soft phonation index; sPPQ % = smoothed pitch
period perturbation quotient; STD Hz = standard deviation of the fundamental
frequency; vFo % = fundamental frequency variation; VTI = voice turbulence
index.
P < .05. P < .01.

There are a few reports in the literature regarding the effects of
aging on the acoustic parameters of voice. Most of these reports have
indicated that speakers experience certain changes, mostly deterioration,
of the acoustic outputs of voice as they age (Hirano, Kurita,
& Nakashima, 1983; Honjo & Isshiki, 1980; Linville & Fisher, 1985;
Mueller, 1985). Increased jitter and shimmer with age have been
observed in both men and women (Linville & Fisher, 1985; Mysak,
1959; Wilcox & Horii, 1980). Linville (1988) and Linville and Korabic
(1987) measured Fo stability in younger and older women on sustained
productions of the vowels /i/, /a/, and /u/. Older speakers showed
higher intraspeaker variability than younger ones on the percentage
of jitter present. Increases in jitter and shimmer as a function of
aging were found to be even greater in individuals with significant
atherosclerotic disease or in individuals in poor general physical
condition (Ramig & Ringel, 1983; Ringel & Chodzko-Zajko, 1987). In this
study, 8 of the 15 selected parameters of voice are related to frequency
variations (Jita µs, Jitt %, PFR semitones, PPQ %, RAP %, sPPQ %,
STD Hz, vFo %), and 3 are related to amplitude variations (APQ %,
ShdB dB, Shim %). Across all 11 parameters, elderly speakers in this
study demonstrated significantly different (or poorer) measurements
than the norms of MDVP calculated from the vocal performance of
young and middle-aged adults (Deliyski & Gress, 1998).
Among all the vocal parameters affected by aging, Fo has been
the most extensively studied. Although contradictory results exist,
“there appears to be at least a trend for pitch to increase slightly
in males and to decrease somewhat more prominently in females as
a function of normal aging” (Mueller, 1991, p. 22). This trend of a
significant decrease in Fo in older women has been demonstrated in a
number of other studies (Awan & Mueller, 1996; Higgins & Saxman,
1991; Hollien & Shipp, 1972; Pegoraro Krook, 1988; Russell, Penny, &
Pemberton, 1995). The findings for Fo of the elderly female speakers
in this study seem to be in accordance with the previous research.
The elderly female speakers had an average Fo of 187.70 Hz, which
was significantly lower (p < .01) than the average norm (243.97 Hz) of
the young and middle-aged speakers reported by Deliyski and Gress
(1998). However, contrary to most of the previous research, the elderly
male speakers in this study had a significantly lower (p < .05) Fo
norm (127.62 Hz) than that of the young and middle-aged speakers
(145.22 Hz).
Levels of spectral noise can be analyzed through various calculations
of harmonic-to-noise ratio (Yumoto, Gould, & Baer, 1982; Yumoto,
Sasaki, & Okamura, 1984). However, few data exist concerning the
spectral characteristics of the aging voice (Colton & Casper, 1996).
Ramig (1983) analyzed the spectral noise levels of younger and older
participants in good and poor physical condition. In general, older
speakers in poor physical condition had greater spectral noise than
older speakers in good physical condition and younger speakers regard-
less of physical condition. As a group, the elderly participants in this
study had significantly higher (p < .01) VTI (0.08), SPI (14.50), and
NHR (0.19) than the norms (Deliyski & Gress, 1998) of young and
middle-aged adults (0.05, 7.23, and 0.12, respectively). The only
exception was that although elderly men had higher VTI (0.081) than
young and middle-aged men (0.05), this difference was not statistically
significant (p > .05).
This study provided preliminary normative acoustic data of voice for
the elderly, which have rarely been reported in the literature. This
study also identified several important implications for
manufacturers of speech and voice analysis devices and programs, for
educators who train speech and hearing clinicians, and for practicing
speech and hearing clinicians. First, acoustic voice analysis
programs (like MDVP) need norms for the elderly as well as for
the young and middle-aged because elderly speakers may have quite
different (usually significantly poorer) acoustic outputs than young and
middle-aged speakers as a result of the natural aging process. Second,
educators who train speech and hearing clinicians should be aware
of the effects of aging on the vocal outputs of the elderly and try to
integrate such knowledge into their teaching so that their students
will have greater awareness of these significant aging effects. Third,
if the acoustic analysis programs they use do not provide acoustic
voice norms for the elderly, practicing speech and hearing clinicians
must use caution and discretion when making diagnostic evaluations
and clinical judgments of the voices of elderly clients. As the elderly
population in the United States continues to increase rapidly, more
studies of this kind are needed for researchers, educators, and prac-
titioners to better understand the effects of aging on all aspects of
human speech-language communication.
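
The practical consequence of choosing one set of norms over another can be illustrated with a small worked example. The sketch below expresses a single hypothetical measurement as a z-score against the Jitt % means and standard deviations for men in Table 1. The z-score approach is an assumption made for illustration; it is not how MDVP applies its thresholds, and jitter values are unlikely to be normally distributed. The client value and the names used are hypothetical.

```python
# Jitt % norms for men, copied from Table 1: (mean, SD).
NORMS_JITT_PCT_MEN = {
    "young and middle-aged": (0.59, 0.54),
    "elderly": (2.10, 1.55),
}


def z_score(value, norm_mean, norm_sd):
    """Distance of `value` from the norm mean, in norm standard deviations."""
    return (value - norm_mean) / norm_sd


client_jitt_pct = 2.0  # hypothetical 75-year-old male client
for group, (norm_mean, norm_sd) in NORMS_JITT_PCT_MEN.items():
    z = z_score(client_jitt_pct, norm_mean, norm_sd)
    print(f"{group}: z = {z:.2f}")
# young and middle-aged: z = 2.61
# elderly: z = -0.06
```

Against the young and middle-aged norms the same value lies well above the mean, whereas against the elderly norms it is close to average, which is the kind of discrepancy the caution about diagnostic judgments refers to.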

REFERENCES
Awan, S. N., & Mueller, P. B. (1996). Speaking fundamental frequency characteristics
of White, African American, and Hispanic kindergartners. Journal of Speech and
Hearing Research, 39, 573–577.
Brock, D. (1990). Characteristics of the elderly population in the United States. In
E. Cherow (Ed.), Proceedings of the Research Symposium of Communication Sciences
and Disorders and Aging (ASHA Rep. No. 19). Rockville, MD: American Speech-
Language-Hearing Association.
Colton, R. H., & Casper, J. K. (1996). Understanding voice problems (2nd ed.). Baltimore,
MD: Williams & Wilkins.
Deliyski, D., & Gress, C. (1998, November). Intersystem reliability of MDVP for Windows
95/98 and DOS. Paper presented at the 1998 Annual Convention of the American
Speech-Language-Hearing Association, San Antonio, TX.
Higgins, M. B., & Saxman, J. M. (1991). A comparison of selected phonatory behaviors
of healthy aged and young adults. Journal of Speech and Hearing Research, 34,
1000–1010.
Hirano, M., Kurita, S., & Nakashima, T. (1983). Growth, development, and aging of
human vocal folds. In D. Bless & J. Abbs (Eds.), Vocal fold physiology: Contemporary
research and clinical issues. San Diego, CA: College Hill Press.
Hollien, H., & Shipp, T. (1972). Speaking fundamental frequency and chronologic age in
males. Journal of Speech and Hearing Research, 15, 155–159.
Honjo, I., & Isshiki, N. (1980). Laryngoscopic and voice characteristics of aged persons.
Archives of Otolaryngology, 106, 149–150.
Kay Elemetrics Corp. (1992). Multi-Dimensional Voice Program Model 4305 manual.
Pine Brook, NJ: Author.
Linville, S. E. (1988). Intraspeaker variability in fundamental frequency stability: An
age-related phenomenon? Journal of the Acoustical Society of America, 83, 741–745.
Linville, S. E., & Fisher, H. B. (1985). Acoustic characteristics of women’s voices with
advancing age. Journal of Gerontology, 40, 324–330.
Linville, S. E., & Korabic, E. (1987). Fundamental frequency stability characteristics of
elderly women’s voices. Journal of the Acoustical Society of America, 81, 1196–1199.
Maurer, J. (1984). Introduction. In L. Jacobs-Condit (Ed.), Gerontology and communica-
tion disorders. Rockville, MD: American Speech-Language-Hearing Association.
Mueller, P. B. (1985). What is normal aging? Part XII: The senescent voice. Geriatric
Medicine Today, 41, 48–57.
Mueller, P. B. (1991). Vocal aging. Texas Journal of Audiology and Speech Pathology,
17, 21–24.
Mysak, E. D. (1959). Pitch and duration characteristics of older males. Journal of Speech
and Hearing Research, 2, 46–54.
Pegoraro Krook, M. I. (1988). Speaking fundamental frequency characteristics of normal
Swedish subjects obtained by glottal frequency analysis. Folia Phoniatrica, 40, 82–90.
Ramig, L. A. (1983). Effects of physiological aging on vowel spectral noise. Journal of
Gerontology, 38, 223–225.
Ramig, L. A., & Ringel, R. L. (1983). Effects of physiological aging on selected acoustic
characteristics of voice. Journal of Speech and Hearing Research, 26, 22–30.
Ringel, R. L., & Chodzko-Zajko, W. J. (1987). Vocal indices of biological age. Journal of
Voice, 1, 31–37.
Russell, A., Penny, L., & Pemberton, C. (1995). Speaking fundamental frequency changes
over time in women: A longitudinal study. Journal of Speech and Hearing Research,
38, 101–109.
SPSS (1994) for Windows. Chicago: SPSS Inc.
Wilcox, K. A., & Horii, Y. (1980). Age and changes in vocal jitter. Journal of Gerontology,
35, 194–198.
Xue, A., & Mueller, P. B. (1996). Speaking fundamental frequency of elderly African-
American nursing home residents: Preliminary data. Journal of Clinical Phonetics
and Linguistics, 10(1), 65–70.
Yumoto, E., Gould, W. J., & Baer, T. (1982). Harmonics-to-noise ratio as an index of the
degree of hoarseness. Journal of the Acoustical Society of America, 71, 1544–1550.
Yumoto, E., Sasaki, Y., & Okamura, H. (1984). Harmonics-to-noise ratio and psycho-
logical measurement of the degree of hoarseness. Journal of Speech and Hearing
Research, 27, 2–6.
