You are on page 1of 8

Veterans

Administration

Journal of Rehabilitation Research


and Development Vol . 24 No . 4
Pages 65--74

Application of adaptive digital signal processing


speech enhancement for the hearing impaired
DOUGLAS M . CHABRIES, RICHARD W . CHRISTIANSEN, ROBERT H . BREY, MARTIN S.
ROBINETTE, and RICHARD W . HARRIS*
Brigham Young University, Provo, Utah and Mayo Clinic Audiology Department, Rochester, Minn( sotc

Abstract--A major complaint of individuals with normal problem are described, including signal processing tech-
hearing and hearing impairments is a reduced ability to niques that offer promise in the noise suppression appli-
understand speech in a noisy environment . This paper cation.
describes the concept of adaptive noise cancelling for
removing noise from corrupted speech signals . Applica-
tion of adaptive digital signal processing has long been INTRO?DUCTION
known and is described from a historical as well as
technical perspective . The Widrow-Hoff LMS (least mean Hearing impairment is not only the most prevalent
square) algorithm developed in 1959 forms the introduc- communicative disorder, it is also the number one
tion to modern adaptive signal processing . This method chronic disability affecting people in the United
uses a "primary" input which consists of the desired States . A major complaint of those with hearing
speech signal corrupted with noise and a second "ref-
impairments is a reduced ability to understand speech
erence" signal which is used to estimate the primary
noise signal . By subtracting the adaptively filtered esti- in everyday communication in a noisy environment.
mate of the noise, the desired speech signal is obtained. Even with the absence of hearing impairment, the
Recent developments in the field as they relate to noise addition of background noise can signifiantly reduce
cancellation are described . These developments include the intelligibility of speech . In 1956, Widrow pro-
more computationally efficient algorithms as well as posed an adaptive filter as shown in Figure 1 which
algorithms that exhibit improved learning performance. can be used to reduce interference when a second
A second method for removing noise from speech, for sample of the noise is available . This technique was
use when no independent reference for the noise exists, developed at Stanford University in 1959 and applied
is referred to as single channel noise suppression . Both to a pattern-recognition scheme known as Adaline.
adaptive and spectral subtraction techniques have been In 1965 the first adaptive noise cancelling system
applied to this problem—often with the result of decreased
was built by two students at Stanford University.
speech intelligibility . Current techniques applied to this
In 1972, the first all-digital adaptive filter was built
by McCool and Widrow at the Naval Undersea
Center in Pasadena, California . In 1975, several
* Douglas M . Chabries and Richard W. Christiansen are with the
Electrical Engineering Department, Brigham Young University, Provo, applications of the LMS algorithm were presented
Utah 84602. which included adaptive noise cancelling and noise
Robert H . Brey and Richard W . Harris are with the Department of suppression (32) . The LMS algorithm was simple—
Educational Psychology (Audiology), Brigham Young University, Provo, both in the number of calculations required for it's
Utah 84602.
Martin S . Robinette is with the Mayo Clinic Audiology Department, update and in it's derivation—and robust in a number
Rochester, Minnesota 55905 . of applications . An adaptive feedback constant, p ,
65
67
Section II. Noise Reduction : Chabries et al.

SIGNAL + INTERFERENCE where o represents the variance of the reference


noise process, n,.ef(k), and -r is the desired adap-
tation time in samples . In no case should the
upper limit for p, exceed oV2 or instability will
result.
3. A time delay must be inserted into either the
primary or reference channel as necessary to
insure that the desired filter is causal . That is,
Figure 2.
An adaptive LMS filter.
the reference noise signal should enter the ref-
erence input to the adaptive filter during the same
processing period in which the correlated primary
tion from the desired optimal filter results . By interference arrives at the summing junction in
reducing the size of the feedback coefficient in Figure 1.
Equation [3], the misadjustment can be made arbi- 4. In cases where there is leakage, care must be
trarily small . However, since the adaptation time is taken to minimize leakage of the desired speech
inversely proportional to µ, a large µ for fast learning signal into the reference input . In such cases, the
is required in applications where the statistics of the interference cancellation is limited to the ratio of
input signals vary with time . This selection, how- the interference signal to speech signal in the
ever, results in increased misadjustment or residual reference.
error . One may optimize the choice of µ in cases Following these four procedures in the use of an
of these nonstationary inputs by selecting a feedback adaptive filter has resulted in interference reductions
constant such that the error due to tracking the corresponding to speech enhancement of :gip to 60
nonstationarity in the signal just equals the misad- dB ., but more typically 30 dB.
justment or error residual that occurs because of A variant of the LMS algorithm that provides the
the constant updates that occur after the filter ability to adapt the feedback constant µ was devel-
coefficients, wp (k), have converged to their desired oped by Harris (17) . In this approach known as the
value (31). VS adaptive algorithm, a separate p p (k) is calculated
Since the filter parameters adapt in such a way for each stage of the filter . Results of the VS
as to provide an estimate of n p,,, (k) from the reference algorithm applied to noise cancellation show a speed-
signal n,.ef (k), the task in applying the algorithm to up in adaptation time of up to a factor of 50 without
noise cancellation becomes one of providing suffi- increasing the residual error while maintaining both
cient degrees of freedom that an acceptable solution speech quality and intelligibility.
may be obtained . The following design or selection
criteria must be followed: Frequency Domain Adaptive Filters
1. The number of digital filter stages (N in Equation One of the drawbacks of the LMS adaptive filter
[1] should be selected so that N times T, the in processing speech signals is the error criterion,
sample period of the digital system, is larger than which is selected for the minimization of mean
the impulse response or reverbertion time of the square error and which results in an adaptation
acoustic environment . For small rooms, typical process that treats frequency regions of higher
filter lengths are on the order of 1,000 to 2,000 energy content before adapting to regions of low
stages for a sample rate of 10 kHz . It is not energy. As a result, the lower frequency regions of
uncommon to require 8,000 to 16,000 stages of the speech spectrum receive an inordinate amount
adaptive filtering for moderate size rooms at a of attention at the cost of the high-frequency speech
sample rate of 10 kHz. regions—which contain much of the intelligibility.
2. The selection of the feedback constant µ is made As a second consideration, users of adaptive filters
according to the desired adaptation rate . The are always anxious to find more efficient computa-
choice of µ for a given adaption rate with common tional techniques to perform the adaptive filtering
broadband noise interference is given as tasks . As a result, several papers describing fre-
quency-domain implementations for adaptive filter-
[4] ing have been presented (7,9,10,12,26,30). One of

69
Section H . Noise Reduction : Chabries et al.

corrupted 1,y rence when no independent or


second referel i- is available is referred to as noise
suppression.
In 1978 Sambur (27) proposed to apply an LMS
version of the digital adaptive filter described in the
foregoing but using reference delays equal to one
or two voice pitch periods . Sambur reasoned that
in the speech component of the corrupted signal
there would be strong correlation between the pri-
mary and delayed reference inputs . He reported Figure 3.
improved SNR and speech quality but did not claim Time domain adaptive LMS noise suppressor.
improved intelligibility . In fact, the time domain
LMS adaptive filter concentrates its computational
3 . Reverberation is introduced because the LMS
power first on those frequencies in the signal with
algorithm in Eqs . 1-3 responds to minimize mean
the highest energy to minimize mean square error,
square error and will leave large values for w,,(k)
(i . e ., pitch and pitch harmonies where little intel-
when n,, t(k-p) becomes small or is zero—as is
ligibility is carried) and finally on frequencies with
the case during the silent portions of speech . The
the least energy (i .e ., high frequency sounds where
sound introduces is reminiscent of listening to a
most of the information in speech is carried) . This
sea-shell and li narfrig that reverberant back-
results in an output that sounds like muffled speech
ground.
deplete of high-frequency information.
The general behavior of the adaptive noise sup-
The muffling effect appears to be present in most
pressor with the inherent problems just described
speech enhancement systems, prompting the state-
may be seen in Figure 4 which shows the spectrum
ments by Lim (21) and Schafer (29) that the various
of noise-free speech before and after processing with
speech enhancement systems appear to improve the
the LMS algorithm as proposed by Sambur . Here
subjective speech quality but not speech intelligi-
it is clear that the high-frequency information above
bility, and that successful approaches must exploit
about 1 .2 kHz, which is evident in Figure 4a, is
more knowledge about the information-bearing ele-
gone in Figure 46.
ments of speech .
dB
Time Domain Filters for Noise Suppression
Figure 3 describes the application of the LMS
Widrow-Hoff time domain adaptive algorithm to the
problem of noise suppression . This filter is attractive
because of its relative simplicity . Sambur (27) pro-
posed the time domain implementation of the adap- 1.25 2. 50 KHz
tive filter shown in Figure 3, which employs the (a) Before Processing
LMS Widrow-Hoff algorithm in Eqs . 1-3 for the dB
filter coefficient updates . Implementation of this
LMS adaptive filter results in three deficiencies:
1. The speech spectrum is distorted, with the low-
frequency region enhanced due to the high energy
content at the low frequencies . This is a result
of the mean square error criterion.
2. Unvoiced speech sounds are eliminated in the 1 .25 2. 50 KHz
signal processing by large delays which are typ- (b) After Processing
ically a pitch period . This results in confusion
Figure 4.
between such words as net, nets, next, etc. Noiseless speech before and after processing by an adaptive
leading to reduced intelligibility . LMS filter .

70
Journal of Rehabilitation Research and Development Vol . 24 No . 4 Fall 1987

There are several approaches to reducing the dB


deleterious effects introduced by time domain adap-
tive processing . By forcing increased processor
attention to high frequencies, the spectral distortion
may be made acceptable. A second problem is a r
observed during the change from speech to silence 1 .25 2.50 KHz
between words . The time domain adaptive processor (a) Noiseless Speech
stops updating the filter weights when the input dB
signal power decreases significantly (during speech
silence) so the filter "remembers" the weights from
the prior speech . As the next word begins, an
annoying echo or synthetic reverberation effect is
l 1 s
produced.
1 .25 2.50 KHz
One solution for overcoming these deleterious (b) Pre-whitened Noiseless Speech
effects has been presented by Andersen (6) . His
treatment included the following: dB "
1 . Pre-whitening . The general behavior of the adap-
tive LMS filter in processing both high frequency-
low energy and low frequency-high energy por-
tions of the speech spectrum is illustrated in KHz
1 .25 2.50
Figure 4. In Figure 4b it is clear that the high
(c) Pre-whitened Noiseless Speech after Processing
frequency information bearing elements of the
Figure 5.
speech sample have been removed . By applying Pre-whitened noiseless speech before and after processing by
a pre-whitening filter with the transfer function an adaptive LMS filter.
given in Eq . 5
leak rate is proportional to p and may be adjusted
H(z) = 1 + az-' [5]
for listener preference.
where the value for a places the corner frequency The effect of these techniques in preserving the
at 100 Hz . high frequency information is shown in Figure 5c.
2. Preservation of unvoiced speech . If the speech is Other techniques to accomplish similar effects have
delayed more than .5 to 1 ms ., the high-frequency been proposed by Graupe (16) . Application of an-
portions become decorrelated and hence are other class of adaptive algorithms known as the
removed along with the decorrelated noise. least-squares algorithms (previously discussed) also
Therefore the amount of delay, A, allowed for promises to solve the problems, introduced by the
decorrelation in speech is reduced . Typical values LMS adaptive filter, that arise primarily due to the
reported for the delay in Figure 4 were on the nonuniformity of the signal spectrum and the slow
order of 3 to 5 samples with a sample rate of 14 adaptation time.
kHz . This will work well as long as the noise
interference is not still correlated for this short Spectral Subtraction
Spectral subtraction is a technique that exploits
delay.
the idea that the human hearing system is insensitive
3. Lossy LMS Algorithm . A modified weight update to phase information in monaural hearing applica-
equation as proposed by Gitlin (15) and given in tions. Boll (3) and Lim et al . (23) proposed that the
Equation [6] was employed. short-time spectral magnitude be used to estimate
wp(k + 1) = wp (k)(1– p) + 2 p,e(k)n, et (k – p) [6] the speech spectrum . While a number of methods
exist to extract this estimate of the noise spectrum
The purpose of this modification is to introduce and subtract it from the contaminated speech signal,
a "leak" factor so that the filter coefficient values the basic technique uses an estimate derived from
are not "remembered" during silent portions of Equation [7] .
speech . Instead the weights are "forgotten" with
a time constant set by the leak factor (1 – p) . The S(w) D(w) h - pri(t0 ) [7]
71
Section II . Noise Reduction : Chabries et al.

where the capital letters refer to the Fourier trans- domain techniques appear to come to the forefront
forms of the s(k), d(k), and npr;(k), respectively . In in noise suppression . The frequency domain algo-
Equation 17], the value of S(w) is set to zero if the rithms discussed earlier may be adapted to a mod-
estimate of the noise, Np,-i(w), is larger than D(w). ified linear error criterion by the selection of an
appropriate vector µ ; however, care must be taken
to avoid circular convolution effects . Implementa-
Error Criterion tion of recruitment in such algorithms is difficult.
Common to all of the above techniques is the The frequency domain technique that seems to offer
underlying notion that an increase in signal-to-noise the greatest flexibility is one proposed by Ferrara
ratio will result in improved intelligibility . Indeed, in 1985 (13) . This implementation utilizes an FFT-
the minimization of mean square error focuses upon based implementation with a bank of band-pass
portions of the speech spectrum that have the most filters and a basebanded output to accomplish adap-
energy, and tends to lightly address the higher tive LMS filtering . It would appear that such algo-
frequency portions of the speech spectrum that carry rithms offer some promise of noise suppression with
significant portions of the information required for intelligibility gains—but then only at positive signal-
speech intelligiblity . With this question is the com- to-noise ratios.
panion inquiry, "Does an increase in signal-to-noise
ratio, which does little to aid normal-hearing lis-
teners, offer relief for hearing-impaired listeners?" CONCLUSION
Can these algorithms, which provide their best
performance at higher signal-to-noise ratios (i .e.,
such as + 6 dB), provide hope for hearing-impaired Several implementations of techniques for accom-
listeners to function with processing in such an plishing adaptive noise cancelling and noise suppres-
environment where they might typically require sion have been discussed . Application of two-chan-
perhaps an additional 12 dB of signal to function? nel noise cancellation has yielded significant gains
To answer this question, considerable effort is in speech intelligibility, in some cases by 40 percent,
planned with normal and hearing-impaired popula- while simultaneously improving signal-to-noise ratio
tions . However, in the midst of these efforts one is for speech corrupted with a variety of types of
tempted to revisit the basic notions of speech intel- corrupting noise—including speech babble and
ligiblity and the human hearing system . Current broadband noise . Reports in the literature indicate
efforts in noise suppression are focusing upon models increases in signal-to-noise ratio of greater than 18
of the hearing system . Examination of the Fletcher- dB and of up 60 dB in specific applications . The
Munson curves, which describe the contours of effective application of single-channel adaptive noise
equal loudness, suggest the use of homomorphic suppression has shown progress with uniform re-
digital signal processing techniques where one min- ports of an increase in "quality ." No data indicating
imizes the error, not in the incoming acoustic pres- intelligibility improvement for normal-hearing pop-
sure field, but rather in a transformed space that ulations is presently available in the literature . Tests
represents the response of the ear . Further, espe- with hearing-impaired populations to determine the
cially in the context of the hearing-impaired listener, efficacy of noise suppression should be available to
that nonlinear response of the ear to a linear increase the community shortly ; likewise, new algorithms
in acoustic intensity known as recruitment must be which modify the error criteria in the context of the
accommodated in such a model. human hearing system would seem to provide the
While several algorithms may be proposed to greatest hope for improving the processing of acous-
facilitate this modified error criterion, the frequency- tic signals in the future .
72
Journal of Rehabilitation Research and Development Vol . 24 No . 4 Fall 1987

REFERENCES

1. Bode H and Shannon C : A simplified derivation of linear 16. Graupe D : Applications of estimation and identification
least squares smoothing and prediction theory . Proc . IRE, theory to hearing aids . In Auditory and Hearing Prosthetic
38 : 417–425, 1950. Research, ed . Stile SW, 169-181 . New York : Grune and
2. Boll SF : Adaptive noise cancelling speech using the short Stratton, 1979.
time transform . Proceedings ICASSP, 3 : 692-695, 1980. 17. Harris RW, Chabries DM, and Bishop FA : A variable step
3. Boll SF: A spectral subtraction algorithm for suppression (VS) adaptive filter algorithm . IEEE Trans . ASSP, ASSP-
of acoustic noise in speech . Proceedings of IEEE ICASSP, 34(2) :309–316, 1986.
200–203, 1979. 18. Howells P : Intermediate Frequency Side-Lobe Canceller,
4. Brey R, Chabries DM, Christiansen RW, Robinette M, and Aug 24, 1965 . US Patent 3,202,990,
Jolley : Improved intelligibility in noise using adaptive 19. Kalman R : On the general theory of control . Proc . Est
filtering I : Normal subjects . Proceedings of ASHA Con- IFAC Congress . London : Butterworth, 1960,
vention, San Francisco, 1984. 20. Lee DTL, Morf M, and Friedlander B : Recursive least
5. Chabries DM, Christiansen RW, Brey R, and Robinette M: squares ladder estimation algorithms . IEEE Trans . ASSP,
Application of the LMS adaptive filter to improve speech ASSP-29(3) :627-641, 1981.
communication in the presence of noise, Proceedings o/ 21. Rim J : Signal processing for speech enhancement . In The
IEEE ICASSP, 148-151, 1982. Vanderbilt Hearing-Aid Report, ed . Fred H . Bess, 124—
6. Christiansen RW, Chabries DM, and Anderson DB : Noise 129 . Upper Darby, PA : Monographs in Contemporary
reduction in speech using a modified LMS adaptive pre- Audiology, 1981.
dictive filter . Proceedings IEEE/EMIRS, 1100—1102, Chi- 22. Lim JS : In Speech Enhancement, Prentice-Hall, 1983.
cago, IL, 1985. "3 . Lim JS, and Oppenheim AV : Enhancement and bandwidth
7. Christiansen RW, Chabries DM, and Lynn D : Noise re- compression of noisy speech . Proc' . IEEE 67(23), 1977.
duction in speech using adaptive filtering I : Signal-proc- 24. Lucky R : Automatic equalization for digital communica-
essing algorithms 103rd ASA Conference, Chicago IL, tion . Bell System Triclinia/ Journal 44 :547–588, 1965.
1982. 25. Lucky R, Salz J, and Weldon EJ : In Principles of Data
8. Cioffi JM and Kailath T : Fast, recursive-least-squares Communication, New York : McGraw-Hill, 1968.
transversal filters for adaptive filtering . IEEE Trans . ASSP, 26. Mansour D and Gray AH Jr . : Unconstrained frequency-
ASSP-32 (2) :304–337, 1984. domain adaptive filter. IEEE' Trans . ASSP, ASSP-30(5) :726-
9. Clark GA, Parker SR, and Mitra SK : A unified approach 734, 1982.
to time- and frequency-domain realization of FIR adaptive 27. Sambur MR : Adaptive noise cancelling for speech signals.
digital filters . IEEE Trans . ASSP, ASSP-31 (5) :1073-1083, IEEE Trans . ASSP, ASSP-26(5) :419–423, 1978.
1983. 28. Satorius EH, and Pack JD : A least square adaptive lattice
10. Dentino M, McCool J, and Widrow B : Adaptive filtering equalizer algorithm . TR 575, Naval Ocean System Center,
in the frequency domain . Proceedings IEEE, 66 : 1658- 1980.
1659, 1978. 29. Schafer R : Speech processing for the hearing impaired . In
11. Falconer D, and Ljung L : Application of fast kalman The Vanderbilt Hearing-Aid Report, ed . Fred H . Bess,
estimation to adaptive equalization : IEEE Trans . on Comm ., 130–132 . Upper Darby, PA : Monographs in Contemporary
COM-26 (10) :1439–1446, 1978. Audiology, 1981.
12 Ferrara ER Jr . : Fast implementation of LMS adaptive 30. Narayan SS, Peterson AM, and Narasimha MJ : Transform
liters . IEEE Trans . ASSP, ASSP-28 (4) :474–5, 1980. domain LMS algorithm : IEEE Trans . ASSP, ASSP-
13 . Ferrara ER Jr . : Frequency-domain implementations of 31(3) :609—615,1983.
periodically time-varying filters . IEEE Trans . ASSP, ASSP- 31. Widrow B and McCool JM : Stationary and non-stationary
33 (4) :883–892, 1985. learning characteristics of the LMS filter . Proceedings of
14 . Ling F, Manolakis D, and Proakis JG : Numerically robust the IEEE, 64(8) :1151—1162, 1976.
least-squares lattice-ladder algorithms with direct updating 32. Widrow B, Glover JR, McCool JIM Kaunitz J, Williams
of the reflection coefficients . IEEE Trans . ASSP, ASSP- CS, Hearn RH, Zeidler JR, Dong E, and Goodlin RC:
34 (4) :837–845, 1986. Adaptive noise cancelling : Principles and applications,
15 . Gitlin RD, Mazo JE, and Taylor MG : On the design of Proceedings of the IEEE, 63(12) : 1692—1716, 1975.
gradient algorithms for digitally implemented adaptive 33. Wiener N : Extrapolation, Interpolation and Smoothing of
filters . IEEE Transactions on Circuit Theory, CT-20 (2) : 1973 . Stationary Time Series, with Engineering Applications.
New York : Wiley, 1949 .

73
Section IL Noise Reduction : Chabries et al.

APPENDIX

The Frequency Domain Algorithm


F [A3]
The classical adaptive noise-cancelling problem
is formulated in Figure Al . Defining the primary
input as d(n) and the reference inputs as As,,(n), with
n the sample index, the desired and reference inputs
and e Further let x (k) represent the
.2L” .m

EFT of the (k — 1)st and kth consecutive blocks of


the mth reference given as
X i„,
.,, ,( [A4]
,,,(k)

and the output of the rnth filter


yLm,m (k) = last L,,, terms of m,(k) X.21”,,m(k)]
[A5]
where the notation A B denotes the element by
element multiplication of the two vectors A and B
which results in a vector . The sum of the outputs
from all filters of various lengths, L , blocked to L m

output samples is
Figure Al.
lime-domain representation of a digital adaptive filter with M
references of length l e.,
yL,,H)i(k)
+
may be divided into blocks with index k and rep-
resented by the vectors dL(k) and xJ„=, ,,,(k) as Yr,rr( k) = [A6]
follows
L

dj :(k) = [d(kL) d(kL + 1) re d(kL+ L ] [Al]

,m (k) = [x (kL )
m m .y m (kL m and
x,,,(kL m [A2]
yL(k) y L,m (k) . [A7]
where
Similarly, the error blocked to L samples becomes
m = 0,1,2,°” ,M = reference channel number
E L (k) = dL ( k) —
l yl (k) [A8]

Padding with zeroes and transforming,


L,=2"
2L (k) = FFT 2L [A9]
and

ot,ot m = integers specifying the block lengths. where the definition


Of, = [000° . r0] L

Transforms may be obtained using the matrix FFT L

as will be used . The weight update equation using the


74
Journal of Rehabilitation Research and Development Vol . 24 No . 4 Fall 1987

method of steepest descents becomes converged to the Wiener solution ; however in the
general case, Equation [Al2] is required.
wzL, n .n,(k + 1)
The weight vector corresponding to the tnth ref-
= (1 — p)w,>„n,(k) + 2pE ,Ln,(k)XL n„n,(k) [A10] erence, w2L,,,,,,,(k), is updated once each L,,, samples
where the symbol * denotes conjugation, p specifies and the output vector y L"',n,(k) is obtained from
the rate of leakage, and the quantity 'R is Equation [A5].

1o0 • 0 0 0 •0
p µI . 0
00• •0
11Ln, I 0
1L= • [All]
µ'L ,n

0 0 • 0 0

The fact that the weights have been obtained by


circular convolution is denoted by - . To force the
resultant output yLyn, ,n (k) to correspond to a linear
convolution, the frequency-domain weight vector is
obtained as

w2L , ,n,(k) = FFT 2L, ,,, (k)]


[Al2]
where IL , is the L,,, x L,,, identity matrix . The trun- Figure A2.
Frequency-domain algorithm for the inth reference adaptive
cation of the weight vector in [Al2] ensures that the filter .
last half of a time-domain representation of the
weights is identically zero. Mansour and Gray showed Figure A2 is the block diagram for the frequency-
that this truncation was unnecessary with proper domain algorithm that is embodied in Equations
constraints on the input and that the weight vector [Al] through [Al2].

You might also like