
The effect of bone conduction microphone placement on intensity and spectrum of transmitted speech items


Phuong K. Tran a) and Tomasz R. Letowski
U.S. Army Research Laboratory, Human Research and Engineering Directorate, Aberdeen Proving Ground,
Maryland 21005

Maranda E. McBride
North Carolina Agricultural and Technical State University, Department of Management,
School of Business and Economics, Greensboro, North Carolina 27411

(Received 22 October 2012; revised 11 April 2013; accepted 15 April 2013)


Speech signals can be converted into electrical audio signals using either a conventional air conduction
(AC) microphone or a contact bone conduction (BC) microphone. The goal of this study was to
investigate the effects of the location of a BC microphone on the intensity and frequency spectrum
of the recorded speech. Twelve locations, 11 on the talker's head and 1 on the collar bone, were
investigated. The speech sounds were three vowels (/u/, /a/, /i/) and two consonants (/m/, /ʃ/).
The sounds were produced by 12 talkers. Each sound was recorded simultaneously with two BC
microphones and an AC microphone. Analyzed spectral data showed that the BC recordings made
at the forehead of the talker were the most similar to the AC recordings, whereas the collar bone
recordings were most different. Comparison of the spectral data with speech intelligibility data
collected in another study revealed a strong negative relationship between BC speech intelligibility
and the degree of deviation of the BC speech spectrum from the AC spectrum. In addition, the head
locations that resulted in the highest speech intelligibility were associated with the lowest output
signals among all tested locations. Implications of these findings for BC communication are discussed.
[http://dx.doi.org/10.1121/1.4803870]
PACS number(s): 43.38.Si, 43.60.Dh, 43.66.Wv [SAF] Pages: 3900–3908

a) Author to whom correspondence should be addressed. Electronic mail: phuong.k.tran.civ@mail.mil

I. INTRODUCTION

Bone conduction technology has advanced significantly in recent years and has become mature enough to be used in radio communication. Bone conduction (BC) vibrators and BC contact microphones have become available as communication interfaces in both civilian and military communication systems. BC has several advantages over air conduction (AC) speech communication interfaces including minimal signal interference caused by environmental noise, uncompromised auditory awareness of the environment, ability to communicate over a network with ears open, use with or without hearing protectors, and inconspicuous wearing of the BC system under a cap, helmet, or hair.

Some insight about the potential of BC for radio communication can be gleaned from reports about medical applications of BC for speech audiometry. Several authors reported a high correlation between AC and BC speech reception thresholds (Carhart and Hayes, 1949; Goetzinger and Proud, 1955; Srinivasan, 1974; Edgerton et al., 1977; Leghayi and Karimi, 1998). A similar correlation between speech discrimination for phonetically balanced words and spondees delivered by AC and BC pathways was also reported (Watson, 1937; Robinson and Kasden, 1977; Dolan and Morris, 1990). Unfortunately, all reported data on BC speech audiometry were obtained at a single vibrator location (mastoid) except for Watson's study (forehead). Watson (1937) also reported that BC speech is heard best with occluded ears and when the listener's mouth is closed with upper and lower teeth lightly touching each other.

Studies in BC speech communication conducted to date have focused primarily on BC vibrators and reception of BC speech while only very few recent studies addressed BC microphones and talker's head vibrations during speech production. Some analogies between both modes of BC activity can be drawn; however, differences in location of the vibration source and the main axis of the acting force between reception and emission of BC speech make both transmission processes quite different.

Extensive studies of optimum BC vibrator placement on the head have been conducted by McBride and colleagues (McBride et al., 2005; McBride et al., 2008a; McBride et al., 2008b). In their studies, they selected 11 points on the head and mapped the head in terms of sensitivity to vibratory vowel sounds and pure tones in the range of 125 Hz to 8000 Hz. The authors concluded that after averaging responses across all test signals, the most sensitive locations were the mandibular condyle (on the cheek in front of the ear, hereafter referred to in the text as condyle) and the mastoid process (behind the ear, hereafter referred to in the text as mastoid). Other lateral locations such as the jaw-angle bone and the bone just above the temple (hereafter referred to in the text as temple) were also good locations. The worst locations were the chin and all other points located along the head midline, including the forehead. The location-related hearing thresholds for vowels and pure tones were in close agreement. The differences between the most and least sensitive locations were in the

3900 J. Acoust. Soc. Am. 133 (6), June 2013 0001-4966/2013/133(6)/3900/9/$30.00


15–20 dB range depending on the type of experimental conditions and type of signal. Only in two cases of the vowel /a/ and a 2000 Hz tone were the location-dependent differences on the order of 10 dB. In another study, Hodges and McBride (2012) reported that the same gender-related differences in auditory sensitivity exist for AC and BC signals. Prior to these studies, Studebaker (1962) compared BC sensitivity at the mastoid, forehead, and vertex and reported that the forehead and vertex sensitivities were about equal (the vertex was a little bit more sensitive at low frequencies while the forehead was more sensitive at higher frequencies), but both locations were less sensitive than the mastoid bone by about 10 dB.

Speech intelligibility studies aimed to determine the optimum BC vibrator location confirmed the condyle and mastoid to be the best locations also from a speech discrimination point of view (McBride et al., 2008a; Osafo-Yeboah et al., 2006; Osafo-Yeboah et al., 2009; Stanley and Walker, 2009). Locations investigated in these studies included the chin, condyle, forehead, mastoid, temple, and vertex. The difference between AC and BC speech intelligibility scores was usually not large. Stanley and Walker (2009) reported 92% and 90% for AC and BC (condyle) speech intelligibility scores on the Diagnostic Rhyme Test (DRT), respectively. Even the worst BC location, the vertex, resulted in an 82% score. All the differences were statistically significant. Osafo-Yeboah et al. (2006) used the Callsign Acquisition Test (CAT) and reported 92.5% and 84.9% scores for the AC and BC condition, respectively, but this difference was not statistically significant. In another study by a similar group of authors who used the CAT as well, the BC scores for the condyle and mastoid locations were in the 99%–100% range (Osafo-Yeboah et al., 2009). In both studies, the same vibrator (Radioear B-71, Sensory Devices, New Eagle, PA) was used. Similar BC scores in the 98%–100% range for the condyle and mastoid locations were also reported by McBride et al. (2008b) using the CAT. All speech intelligibility data reported above were obtained in quiet and for signals adjusted to the same intensity. Similar in character but obviously lower CAT intelligibility scores for AC and BC conditions were reported for speech communication in noise (Letowski et al., 2004; Gripper et al., 2007; Tran and Letowski, 2010).

In the case of bone conduction microphones (BCM), previous efforts in searching for the optimum location were focused on speech discrimination and sound quality. McBride et al. (2008c, 2011) compared speech intelligibility ratings for CAT item recordings made with a BCM placed at 12 different locations and reported the forehead (75.8%) and temple (72.4%) were the best locations and the collar bone was the worst (49.8%). These subjective intelligibility ratings were based on a scale from 0% (impossible to understand) to 100% (perfectly clear and very easy to understand). A similar method was used to rate each recording based on sound quality. These subjective ratings were from 1 (very unpleasant or annoying) to 5 (very pleasant). The same ratings were obtained in a repetition of the study using a larger number of participants and longer list of CAT items but only at eight locations (Tran et al., 2008). However, the best and worst speech discrimination ratings were more differentiated, e.g., forehead (88.2%) and temple (82.2%) versus collar bone (40.0%). As in the case of the BC vibrator studies, the speech sounds recorded at different locations were presented to the listeners after being adjusted to the same intensity.

A direct comparison of CAT speech recorded with AC and BC microphones and transmitted to the listener over a pair of Thales (Clarksburg, MD) MBITR radios in both quiet and 100 dB A-weighted noise was conducted by Tran and Letowski (2010). The listener received the speech signals through two BC vibrators located at their condyles. The AC microphone was a noise-canceling boom microphone and the BCM was a prototype of a new microphone developed by Sensory Devices (New Eagle, PA) mounted on the left mastoid of the talker. The authors reported similar AC (99.3%) and BC (98.9%) speech discrimination scores in quiet, and slightly more differentiated scores in noise: AC (98.5%) and BC (95.2%). Zimpfer and Buck (2008) and Buck and Zimpfer (2011) made comparative AC and BC microphone recordings for vowel sounds. The authors reported 64% correct vowel identifications for the BCM and alluded to better scores obtained with the AC microphone; however, they only discussed the phoneme recognition errors made for BC recordings in their papers. In a report by Acker-Mills et al. (2005), the authors measured the speech intelligibility of an AC and BC microphone with the DRT. The results also showed that BCM recordings were less intelligible compared to AC microphone recordings in both quiet and 106 dB A-weighted noise environments.

The purpose of this study was to measure and analyze the differences in BCM output for several speech sounds registered at different locations on the head. The motivation for the study was the expectation that the signal intensity and signal spectrum differences between various locations of BCM on the head may lead to a better understanding of the overall differences in speech discrimination scores and specific phonemic errors made by listeners for specific BCM locations.

II. METHOD

A. Participants

Twelve talkers, five males and seven females, with ages ranging from 20 to 60 yr participated in this study. All participants were native American English speakers and had no signs of speech pathology. The participants were recruited from a list of volunteers from previous studies.

B. Head locations

Twelve locations were selected for inclusion in the study: 11 locations were on the head and 1 location was on the collar bone. Head locations were the right and left mastoid, chin angle, chin, forehead, Fz, vertex, Pz, inion, a location on the frontal bone just above the temple (henceforth referred to as the temple), and condyle. The labels Fz and Pz are borrowed from electroencephalography location maps, where the letters F and P are abbreviations for frontal and parietal lobes, respectively, and z stands for zero,

J. Acoust. Soc. Am., Vol. 133, No. 6, June 2013 Tran et al.: Bone microphone location for speech 3901
which indicates the location is on the midline. All lateral locations except for the left mastoid were located on the right side. The left mastoid was added to verify the assumption that the effect of the BCM placement is symmetric along the head's midline. The above locations were chosen due to their potential for easy access by communication equipment or were predicted as providing good speech signals. One additional location on the frontal bone was chosen as a reference point for this study. All locations are shown in Fig. 1.

FIG. 1. (Color online) BC microphone placements.

C. Test sounds

Five speech sounds, three vowels /u/, /a/, and /i/ (as in "drew," "math," and "week," respectively) and two consonants /m/ and /ʃ/ (as in "man" and "shoes," respectively), were used as speech signals in this study. These sounds represent a wide range of frequencies and contained critical phonetic characteristics needed for speech intelligibility of the English language.

D. Instrumentation

The recordings were made in a sound isolated room at the U.S. Army Research Laboratory, Aberdeen Proving Ground, MD. The noise floor level of the room was 20 dB A-weighted (Scharine et al., 2004).

The recording equipment included a Bruel and Kjaer (B&K, Naerum, Denmark) 4133 AC microphone, two Temco (Tokyo, Japan) HG-17 BC microphones, and a sound recording system consisting of a Sony (Tokyo, Japan) laptop computer, 01 dB-Metravib (Limonest, France) Symphonie signal acquisition system, and dBFA32 (Metravib, Limonest, France) sound and vibration acquisition and analysis software. Of the two BCMs, one served as the main microphone used at test locations and the other served as the reference microphone to account for produced speech level variations between recording sessions. A custom-made adjustable headband was used to hold the reference BCM at the fixed reference location (close to the center of the forehead; see Fig. 1) and at a static force of 425 g-force during all recording sessions. A custom-made constant force device (courtesy of Brigham Young University) was used to maintain the same force by pressing the other BCM against the head of the talker during recordings. The 425 g-force is less than the 550 g-force standard for the testing of bone vibrators, but there is no static force standard for the testing of bone conduction microphones, and because a 550 g-force is uncomfortable for participants, a lower force was used. In addition, Toll et al. (2011) reported that changes in static force in the range of 300–550 g-force have minimal impact on BC signal intensity data. A B&K 2610 amplifier and a B&K 4133 microphone were used to monitor the level of the sounds produced by the participants.

The BCM HG-17 is a piezoelectric accelerometer with a relatively flat frequency response across a broad range (within 6 dB from 300 Hz to 6 kHz) and the output level at 1 kHz is 25 dB ± 5 dB (0 dB = 1 V/0.5 G, company source).

E. Procedure

Each participant was seated facing the sound level meter. The microphone of the sound level meter was placed approximately 30 cm in front of the talker's lips and two BCMs were then placed on the participant's head. The participant was instructed to pronounce all speech sounds at a level of approximately 70 dB sound pressure level (SPL). The talker monitored his or her speech level on the B&K 2610 display. Prior to the formal recording session, several practice recordings were made to make sure the talker was comfortable with the recording process.

The speech sounds produced by each talker were simultaneously recorded through two BC microphones and one

AC microphone, and subsequently analyzed and used in the perceptual study. The speech sounds were recorded at one pre-determined test location at a time. For each location, one BCM (the movable BCM) was placed on the talker at the test location and the entire set of five speech sounds was recorded. The second (reference) BCM was held at one fixed location on the frontal bone, 1 in. to the left on the midline of the participant's forehead. This microphone served as a reference microphone to determine and to compensate for the extent of natural variations in speech production levels during consecutive BC recordings. The movable BCM was attached to each test location with thin double sided tape and pressed down with a force of 425 g-force using the constant force device. The reference BCM was pressed against the forehead with the same force provided by the adjustable headband.

During the recording session, the talker was instructed to pronounce each sound several times so the best recording could be selected for data analysis. A total of 1560 sound files were recorded for the study ([12 BCM locations + 1 AC microphone] × 5 sounds × 2 channels × 12 participants = 1560).

F. Data comparison and rationale

Data collection and analysis involved the overall sound level measurements and third-octave band frequency spectrum analysis of all AC and BC recorded sounds. For ease of comparison, the levels of all sounds recorded with the movable BCM and the AC microphone were adjusted to align with the reference BC recordings. The sensitivities of BCM locations were determined based on the differences of sound levels between the recordings at the test locations and the reference location. To compare AC and BC spectra, the differential spectra were calculated for each sound, BCM location, and talker.

Due to the perfect speech intelligibility scores of sounds recorded with the AC microphone based on an in-house pilot study in a quiet environment, the AC spectra were used as benchmarks for spectral comparisons. To compare spectral properties of BC and AC recordings, the third octave spectra of all recorded sounds were calculated and analyzed. The spectra of AC recordings were calculated in dB SPL (reference 20 μPa). The spectra of BC recordings were calculated in dB relative to 1 μVolt reference. In order to easily visualize the relative difference between these two data sets at each frequency, the BCM data were shifted upward by a constant number of 15 dB, roughly corresponding to voltages recorded for the BCM.

III. RESULTS

A. Overall levels

1. Head location levels

Figure 2 shows the average differences from the reference location of mean speech levels recorded at each test location and the associated standard error. The data indicate that the chin, chin angle, and collar bone were the most sensitive locations among the 12 BCM locations tested and that the forehead, inion, and temple were the least sensitive locations.

FIG. 2. (Color online) Relative speech levels averaged across talkers and sounds recorded at 12 BCM locations. Error bars indicate the size of standard error (95% confidence interval). The 0 dB level corresponds to the level recorded at the reference BCM location.

A repeated measure analysis of variance (ANOVA) was used to evaluate statistical differences in the levels of the BCM signals. The main factors were location (12 levels) and sound (5 levels). The results of the analysis showed that there were significant effects of both main factors: BCM placement (F = 42.486, p < 0.001) and sound (F = 51.526, p < 0.001). The interaction between the type of sounds and locations was also significant (F = 8.75, p < 0.001). Pairwise comparisons with Bonferroni adjustment between sound levels recorded at the left and right mastoid indicated close similarity of both overall levels (p = 0.999). Spectral analysis (Sec. III B) of the left and right mastoid recordings also showed close similarity between the spectra (±2 dB). This supports the notion of general symmetry of the BC vibrations observed on both sides of the talker's head.

Differences between speech levels recorded at the individual locations on the head (as seen in Fig. 2) have a complex pattern and do not allow for unequivocal grouping of head locations into clusters that are statistically different from each other. However, given that the forehead is the best location and the collar bone is the worst location for speech discrimination (McBride et al., 2011), location clusters that produce statistically similar outputs (p > 0.05) mainly involving these two locations are created and shown in Table I.

2. Speech sound levels

The overall speech levels for five recorded sounds (averaged across all BCM locations) are shown in Fig. 3. The data show surprisingly similar levels for all speech sounds except the /ʃ/ sound. The /ʃ/ sound had a significantly lower overall level than all other sounds (p < 0.01). The differences between the levels of the four remaining sounds were not statistically significant (p > 0.99).

The data presented in Figs. 2 and 3 are the average data per location and per type of sound, respectively. The ANOVA indicates a strong interaction between the BCM location and the type of sound used to generate the BCM output. Such interaction indicates that the BCM output at a given location depends on the type of sound. The corresponding relationship between the BCM speech level and microphone location for

all five sounds used in this study are shown in Fig. 4. Clearly there is very little effect of sound type in four of the five sounds used in this study. The only exception is the /ʃ/ sound for which the function has a very different shape.

FIG. 3. (Color online) Relative mean speech sound levels averaged across all talkers and BCM locations. Error bars indicate the size of the standard error (95% confidence interval).

FIG. 4. Average speech sound levels as a function of the BCM location for each of the five speech sounds used in the study.

For each location or sound combination, the recorded sound level differences between individuals were quite large. The greatest and the smallest range of inter-subject variability with respect to BCM location were observed at chin angle (20 dB) and forehead (4 dB), respectively. Similar ranges with respect to the speech sounds were also found [i.e., /ʃ/ (14 dB) and /a/ (7 dB) sounds, respectively]. The average range of inter-individual differences for location was 12.1 dB ± 3.6 dB and for sound was 10.1 dB ± 3.4 dB. The large variation observed at the chin angle as well as the chin location could be due to the large variation between participants in terms of the soft-tissue thickness at those locations and to occasional changes in participants' jaw position.

TABLE I. Clusters of BCM locations (separated by horizontal lines) generating similar BC speech level outputs. The clusters are formed surrounding the worst and the best locations for speech discrimination (collar bone and forehead, respectively), and as a sequential string of locations generating the highest to the lowest BCM signal output. Each cluster combines locations that generate levels that are not statistically different from each other. Conversely, the outputs from locations belonging to different clusters differ significantly from each other except for the repeated locations.

--------------------
Chin
Chin angle
--------------------
Chin angle
Collar bone
--------------------
Collar bone
Fz
Vertex
Condyle
Pz
L mastoid
R mastoid
--------------------
Condyle
Pz
L mastoid
R mastoid
Forehead
Inion
--------------------
Pz
L mastoid
R mastoid
Forehead
Inion
Temple
--------------------

B. BCM spectra

1. Effect of the head location on BCM sound spectrum

The relative differences in BC and AC spectra of all 5 speech sounds recorded through the BCM at 6 of the 12 original head locations are shown in Fig. 5. The locations selected include the two locations that resulted in the largest and smallest BCM signal level in this study (chin and temple, respectively), the two locations that resulted in the most and least intelligible speech recognition scores reported in the study by McBride et al. (2011) (forehead and collar bone, respectively), and the two most common locations used for BC vibrator placement in medical and commercial BC applications (mastoid and condyle). Spectral analysis of speech sounds recorded at the full set of 12 head locations indicated that the differential spectra at these six locations were representative of the largest, typical, and smallest differences across all 12 locations.

2. Effect of sound type on BCM sound spectrum

Figure 6 shows the mean BC-AC differential spectra (averaged across talkers) at six BCM locations for the five recorded sounds (/a/, /i/, /u/, /ʃ/, and /m/). The six locations are the same as those shown in Fig. 5.

IV. DISCUSSION

A. Overall levels

The overall levels of stimulation of BCM at various head locations (Fig. 1) differ by 15 dB and more. This is a large range of variation across the talker's head. Due to the position of the vocal system relative to the head, it was expected that the talker's head vibrations, and subsequently the levels of BCM recordings, would be larger along the midline than at the sides of the head; however, this was not the case. Even though some level differences in the BCM

recordings existed between midline and lateral locations, there was no clear statistical difference between the two groups of locations (forehead, inion, and Pz versus temple, mastoid, and condyle). In contrast, the locations close to the mouth and vocal folds (such as chin, chin angle, and collar bone) resulted in significantly higher signal levels as compared to the levels recorded at other locations (Table I).

McBride et al. (2011) used the same skull locations for assessment of intelligibility and quality of BC speech. A comparison of their data with signal levels recorded in this study revealed that skull locations that produce large input signals for BCMs also produce signals with limited bandwidth that are not very intelligible. A Spearman rank order correlation coefficient (ρ) between the ranks of 12 head locations on signal level (this study) and signal intelligibility scales (McBride et al., 2011) of the BC signals is ρ = 0.675 and is highly significant [t(10) = 2.892; p = 0.0087]. This comparison clearly indicates that the locations of BCM on the talker's head that produce large BCM signals are not necessarily good for speech communication purposes. A much better solution seems to be to capture BC speech signals from the location providing a lower level but more intelligible signals and amplifying these signals to required levels.

FIG. 5. Relative spectral differences between BC and AC recordings for five speech sounds used in this study. Each panel represents data collected at a different BCM location on the talker's head.

An important finding of the overall level comparison is the very similar overall levels of all sounds at the forehead location (Fig. 4), a similarity that is in striking contrast with all other locations. All other locations show similar behavior on the overall levels with a notable exception of the /ʃ/ sound. Since at the forehead location both the reference microphone and the test microphone were very close to each other, the reported finding that the /ʃ/ sound level was exceptionally well balanced with the levels of other sounds at this location could be an artifact caused by physical or electrical interaction between the microphones. However, no trace of such interaction could be found in a follow-up investigation. Therefore, we concluded that the reported finding is a real phenomenon indicating that the forehead is a superior location in terms of overall level balance of all speech sounds used in this study. Assuming further that the speech sounds used in this study were a representative sample of all speech sounds, the forehead may be the optimal location for proper balance of energy between all speech sounds when recorded with the BCM microphone for communication purposes. This is not to say, however, that all the sounds would be equally intelligible. In order to address the speech intelligibility issue, the spectra of all other speech sounds also need to be compared.
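The t statistic quoted with the rank correlation in Sec. IV A can be cross-checked from the coefficient itself. The following is a minimal sketch, not the authors' analysis code. It assumes the standard relation t = ρ·sqrt((n − 2)/(1 − ρ²)) with n = 12 locations and uses only the magnitude of the coefficient (0.675). The function names `spearman_rho` and `t_from_rho` are hypothetical, and the short rank lists in the usage lines are made-up illustration data, not values from the study.

```python
import math


def _average_ranks(values):
    """Return 1-based ranks of `values`, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    rx, ry = _average_ranks(x), _average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def t_from_rho(rho, n):
    """t statistic with n - 2 degrees of freedom for a correlation coefficient."""
    return rho * math.sqrt((n - 2) / (1 - rho ** 2))


# Hypothetical example: perfectly opposed rankings give rho close to -1.
print(spearman_rho([1, 2, 3, 4], [4, 3, 2, 1]))

# Magnitude reported in the text: 0.675 over 12 locations, so 10 degrees of freedom.
print(round(t_from_rho(0.675, 12), 2))  # about 2.89, close to the reported 2.892
```

Under these assumptions the computed value agrees with the reported t(10) to within rounding, which suggests the reported statistic follows this standard conversion.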

FIG. 6. Relative spectral differences between BC and AC recordings for six locations on the talker's head. The data in each panel were collected for a different speech sound used in the study.
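The differential spectra plotted in Figs. 5 and 6 can be illustrated with a short sketch. It assumes that third-octave band levels are already available as lists (AC in dB SPL, BC in dB re 1 μV), applies the 15 dB upward shift described in Sec. II F, and summarizes each curve by a least-squares slope in dB per octave. The band levels, the function names, and the slope summary are illustrative assumptions, not the authors' procedure or data.

```python
import math

SHIFT_DB = 15.0  # constant upward shift applied to the BCM data (Sec. II F)


def differential_spectrum(bc_db, ac_db, shift_db=SHIFT_DB):
    """Band-by-band BC minus AC level difference after shifting the BC data."""
    return [bc + shift_db - ac for bc, ac in zip(bc_db, ac_db)]


def slope_db_per_octave(freqs_hz, diff_db):
    """Least-squares slope of the differential spectrum against log2(frequency)."""
    x = [math.log2(f) for f in freqs_hz]
    n = len(x)
    mx, my = sum(x) / n, sum(diff_db) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, diff_db))
    return sxy / sxx


# Hypothetical band levels at octave-spaced center frequencies (not study data).
freqs = [250, 500, 1000, 2000, 4000]
ac_levels = [60, 62, 58, 55, 50]   # dB SPL
bc_levels = [50, 49, 42, 36, 28]   # dB re 1 uV

diff = differential_spectrum(bc_levels, ac_levels)
print(diff)                               # [5.0, 2.0, -1.0, -4.0, -7.0]
print(slope_db_per_octave(freqs, diff))   # close to -3 dB per octave
```

In this made-up example the curve falls by about 3 dB per octave, which would correspond to the negatively sloped pattern the text associates with the collar bone and chin, whereas a slope near zero would correspond to a differential spectrum that runs parallel to the frequency axis.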

B. Effect of the head location on BCM sound spectrum

The spectrum of a specific speech sound recorded with a BCM depends on the location of the microphone on the talker's head. Relative spectra of all five speech sounds used in this study at six selected BCM locations are shown in Fig. 5. The curves in Fig. 5 show the normalized spectral difference between the BCM recording made at a specific head location and the reference air microphone recording of the same sound. If the spectrum of a given sound recorded with the BCM were identical to the spectrum of the same sound recorded with the air microphone, the resulting curve would be a straight line parallel to the abscissa axis (frequency axis). Assuming that the spectrum of a given sound recorded with the air microphone is the gold standard for recordings made with the BCM, the differential spectrum in Fig. 5 with the most nearly uniform distribution would indicate the best BCM recording.

Inspection of Fig. 5 shows that the spectra of most sounds recorded at the forehead, temple, mastoid, and condyle locations are relatively flat. In contrast, spectra recorded at the collar bone and chin locations are negatively sloped, indicating that the BCM recordings were relatively high in low frequency energy and lacked high frequency energy in comparison to the air recordings. Such sounds should be expected to have a darker timbre and be less intelligible than the corresponding sounds recorded with the air microphone. They also should be less intelligible than the same sounds recorded at the other four locations shown in Fig. 5. This agrees with the results of the study by McBride et al. (2011) in which recordings made at the collar bone and chin were judged to be far less intelligible than the recordings made at all other locations shown in Fig. 5.

The speech recordings made at the forehead location were judged as the most intelligible in the McBride et al. (2011) study. The spectra of all speech sounds captured at this location in this study show a relative boost of sound energy in the 2–6 kHz region in comparison to the air recordings. This boost may be responsible for the relatively high speech intelligibility rating of bone-conducted sounds recorded at the forehead. A smaller boost in the 2–6 kHz region was also found at the temple and condyle locations. These locations also provided good speech intelligibility ratings reported in the McBride et al. (2011) study. Therefore, it can be assumed that if the differential curve of the BCM and AC microphone spectra is relatively flat at low

frequencies and there is a boost in the 2–4 kHz region, there is a good chance that the speech sounds captured with the BCM will be high in intelligibility.

A common finding in all the BCM recordings is the relatively low level of the /a/ sound spectrum at middle frequencies as compared to the level of the same sound spectrum recorded with the air microphone. Note that the /a/ sound is produced with wide open lips unlike all other sounds investigated in the study. Wide opening of the lips boosts aired sound energy at mid-frequencies, a boost not captured by the BCM. This observation agrees with the report by Letowski and Carvella (1994), who measured sound levels produced at and in the occluded ear of the talker. The authors reported that the relative level of the /a/ sound produced in the occluded ear and at the ear through open lips was much lower than the relative levels of other speech sounds. Since the sounds produced in the occluded ear can be considered as generated by bone-conducted speech, this observation agrees with the result of the present study. Since the /a/ sound is one of the strongest sounds in human speech, its lower level at mid-frequencies (frequencies not critical to speech intelligibility) is not necessarily a reason for special concern for the BCM.

C. Effect of type of sound on BCM sound spectrum

The curves shown in Fig. 6 present the same information as the curves in Fig. 5, but are organized by the type of sound rather than by the BCM location to illustrate more clearly the effect of bone conduction on the type of sound.

generated the lowest levels. At the same time, these levels are negatively correlated with expected speech intelligibilities as shown by McBride et al. (2011). Therefore, it can be hypothesized that strong speech-related vibrations at some locations on the talker are due to head resonances and result in speech coloration and compromised speech intelligibility. Indeed, comparative analysis of speech spectra recorded with the BCM at various head locations and with the air microphone indicated that the low intensity spectra of the temple, forehead, condyle, and mastoid placements were much more similar to the air microphone speech spectrum than the spectra recorded at stronger vibrating locations.

Among all the recordings made at 11 BCM head placements, the forehead and temple have shown the greatest similarity of the recorded spectra to the respective spectrum of the air conduction recording. McBride et al. (2011) also showed these two locations as producing the speech that was rated as the most intelligible across all 11 placements. These locations, together with the condyle and mastoid, seem to be the placements of choice for the BCM for BC communication.

The reported data, together with the data reported by McBride et al. (2011), show that when considering the tradeoff between speech intelligibility and high level input signal to the BCM, speech intelligibility should be the ultimate criterion. Such an approach may result in reception of bone-conducted speech at places that generate 10–20 dB lower signals than other locations. However, the differences in the speech intelligibility ratings between various BCM
The main observation that can be made by inspection of
placements are quite large and must be considered. The loss
Figs. 5 and 6 is that bone conducted speech contains high
of sensitivity can be easily compensated for by a miniature
frequency energy in the region exceeding 4 kHz. However,
auxiliary amplifier attached directly to the BCM while the
in comparison to air microphone recordings, bone conduc-
loss of speech intelligibility is not easily undone.
tion boosts low frequency energy up to about 20 dB. This
Notice that a BCM in-the-ear location was not investi-
effect is especially evident for /m/ and /u/ sounds at some
gated in this study. Such a location compromises the greatest
locations. The data for all locations on the head shown in
advantage of bone conduction communicationcommuni-
Fig. 6 also show a relatively large boost of high frequency
cation with unoccluded earswhich allows for simultaneous
energy for the / / sound in the 26 kHz region. A smaller
auditory awareness of the surrounding environment and
boost can be seen in the case of the /u/ sound. Assuming
direct face-to-face communication.
again that the air conducted speech spectrum is the gold
In summary, this study shows that the signal intensity
standard for bone-conducted speech, the sound /i/ shows the
and frequency spectrum of BC recordings significantly
most unfavorable BCM speech spectrum, regardless of the
depend on the BCM location and somewhat less on the type
BCM location. However, even in this case, the forehead
of speech sound. The high-level resonance-driven locations
location seems to be relatively good. In general, all five
on the talkers head are poor choices for the BCM location
sounds show relatively small variation of spectra among the
due to colored and poor intelligibility speech. From all loca-
forehead, temple, condyle, and mastoid locations. The collar
tions investigated in this study, the forehead seems to be the
bone and chin locations are equally detrimental for all
best location for the reception of bone-conducted signals.
sounds, but especially for the /a/ sound.
Some other low-output locations, such as the temple, mas-
toid, and condyle are also acceptable if preferred from the
V. CONCLUSIONS
practical (operational) point of view.
The results of this study shine some light on the underly-
ing causes of some previously documented poor experiences
Acker-Mills, B., Houtsma, A., and Ahroon, W. (2005). Speech intelligibil-
with using a BCM as an interface for communication pur-
ity with acoustic and contact microphones, in Proceedings of New
poses. The previous attempts to locate the BCM at the top of Directions for Improving Audio Effectiveness, RTO-MP-HFM-123, paper
the head or at other locations leading to high BCM output 7, Neuilly-sur-Seine, France, pp. 114.
led apparently to compromised speech intelligibility. Buck, K. K., and Zimpfer, V. V. (2011). Ambiguity in phoneme recogni-
tion when using a bone conduction microphone (A), J. Acoust. Soc. Am.
The BCM recordings made in this study at the talkers 130(4), 2469.
chin and chin angle had the highest overall intensity levels, Carhart, R., and Hayes, C. (1949). Clinical reliability of bone conduction
while the locations at the temple, inion, and forehead audiometry, Laryngoscope 59(10), 10841101.

Dolan, T. G., and Morris, S. (1990). "Administering audiometric speech tests via bone conduction: A comparison of transducers," Ear Hear. 11(6), 446–449.
Edgerton, B. J., Danhauer, J. L., and Beattie, R. C. (1977). "Bone conduction speech audiometry in normal subjects," J. Am. Audiol. Soc. 3(2), 84–87.
Goetzinger, C. P., and Proud, G. O. (1955). "Speech audiometry by bone conduction," A. M. A. Archives of Otolaryngol. 62(6), 632–635.
Gripper, M., McBride, M., Osafo-Yeboah, B., and Jiang, X. (2007). "Using the Callsign Acquisition Test (CAT) to compare the speech intelligibility of air versus bone conduction," Int. J. Ind. Ergonom. 37(7), 631–641.
Hodges, M. L., and McBride, M. E. (2012). "Gender differences in bone conduction auditory processing: Communication equipment design implications," Int. J. Ind. Eng. 42, 49–55.
Leghayi, M. S., and Karimi, A. (1998). "Evaluation of the speech test via bone conduction in normal subjects aged between 18–25 years old," Iran. Audiol. 6(12), 26–29 (abstract in English).
Letowski, T., and Caravella, J. M. (1994). "Sound levels produced at and in the occluded ear of the talker," Arch. Acoust. 19(2), 139–146.
Letowski, T., Mermagen, T., Vause, N., and Henry, P. (2004). "Bone conduction communication in noise: A preliminary study," in Proceedings of Eleventh International Congress on Sound and Vibration, July 5–8, St. Petersburg, Russia.
McBride, M., Hodges, M., and French, J. (2008a). "Speech intelligibility differences of male and female vocal signals transmitted through bone conduction in background noise: Implications for voice communication headset design," Int. J. Ind. Ergonom. 38(11–12), 1038–1044.
McBride, M., Letowski, T., and Tran, P. (2005). "Bone conduction head sensitivity mapping: Bone vibrator," ARL Technical Report ARL-TR-3556, U.S. Army Research Laboratory.
McBride, M., Letowski, T., and Tran, P. (2008b). "Bone conduction reception: Head sensitivity mapping," Ergonomics 51(5), 702–718.
McBride, M., Tran, P., and Letowski, T. (2008c). "Head mapping: Search for an optimum bone microphone placement," in Proceedings of the 52nd Annual Meeting of the Human Factors and Ergonomics Society, September 22–26, New York, NY, pp. 503–507.
McBride, M., Tran, P., Letowski, T., and Patrick, R. (2011). "The effect of bone conduction microphone location on speech intelligibility and sound quality," Appl. Ergon. 42(3), 495–502.
Osafo-Yeboah, B., Gripper, M., McBride, M., and Jiang, X. (2006). "Effects of bone conduction vibrator placement on speech intelligibility using the Callsign Acquisition Test (CAT) – A pilot study," Paper presented at the 2006 Annual Industrial Engineering Management Systems Conference, March 13–15, Cocoa Beach, FL.
Osafo-Yeboah, B., Jiang, X., McBride, M., Mountjoy, D., and Park, E. (2009). "Using the Callsign Acquisition Test (CAT) to investigate the impact of background noise, gender, and bone vibrator location on the intelligibility of bone-conducted speech," Int. J. Ind. Ergonom. 39(1), 246–254.
Robinson, M., and Kasden, S. (1977). "Bone conduction speech discrimination," Arch. Otolaryngol. 103(4), 238–240.
Scharine, A. A., Tran, P., and Binseel, M. (2004). "ARL acoustic measurements in buildings 518 and 520 at APG," ARL Technical Report ARL-TR-0580, U.S. Army Research Laboratory.
Srinivasan, K. P. (1974). "Bone-conducted speech reception threshold," Scand. Audiol. 3(4), 145–152.
Stanley, R. M., and Walker, B. N. (2009). "Intelligibility of bone-conducted speech at different locations compared to air-conducted speech," in Proceedings of the 53rd Annual Meeting of the Human Factors and Ergonomics Society, October 19–23, San Antonio, TX, pp. 1086–1090.
Studebaker, G. A. (1962). "Placement of vibrator in bone-conducting testing," J. Speech Hear. Res. 5(4), 321–331.
Toll, L., Emanuel, D., and Letowski, T. (2011). "Effect of static force on bone conduction hearing thresholds and comfort," Int. J. Audiol. 50(9), 632–635.
Tran, P., and Letowski, T. (2010). "Speech intelligibility of air and bone conducted speech over radio transmission," J. Acoust. Soc. Am. 127, 1896.
Tran, P., Letowski, T., and McBride, M. (2008). "Bone conduction microphone: Head sensitivity mapping for speech intelligibility and sound quality," in Proceedings of IEEE, 2008 International Conference on Audio, Language and Image Processing, July 7–9, Shanghai, China, pp. 107–111.
Watson, N. A. (1937). "Hearing of speech by bone conduction," J. Acoust. Soc. Am. 9(2), 99–106.
Zimpfer, V., and Buck, K. (2008). "Ambiguity in the recognition of phonetic vowels when using a bone conduction microphone," in Proceedings of the European Acoustic Association Meeting (Acoustics'08), June 30–July 4, Paris, France, pp. 489–494.

3908 J. Acoust. Soc. Am., Vol. 133, No. 6, June 2013 Tran et al.: Bone microphone location for speech
