I. INTRODUCTION
Noise that affects signals may be classified into the following types based on its time and frequency characteristics:
narrow-band noise, band-limited white noise, colored noise, impulse noise and transient noise pulses [1]. More than one type
of noise may be present at any given instant. Noise is commonly
modeled as additive white Gaussian noise, and the filters that
remove it rely on these characteristics.
However, when speech is corrupted by impulsive noise, these
filters do not perform optimally. Impulsive noise is noise of
very short duration, typically lasting up to a few milliseconds,
and it is usually loud while it lasts. As a result, a hundred or
more consecutive samples can be corrupted, depending on the sampling
frequency. Typical examples of impulsive noise, which decrease
speech intelligibility and are unpleasant, include background
keyboard strokes during video conferencing, indicator clicks in
cars, rain drops hitting a hard surface, certain factory sounds
and machine gun fire [2], [3]. Various techniques have been
designed to tackle this problem, including the
classical Signal Dependent Rank Order Mean algorithm [4] and
methods based on wavelets [2], [5], soft decision and recursion
[6], diffusion filtering [7] and Bayesian frameworks [8]. Many
of these techniques, however, are not robust to speech signals
affected by impulse noise of varying durations.
The technique used in [5] assumes that information
about the energy distribution of the
impulse noise signal is available. The technique used in [9] is optimized
to remove impulsive disturbances from old gramophone music
recordings, as it uses the characteristics of the music signal to
perform processing. The Signal Dependent Rank Order Mean
978-1-4673-6190-3/13/$31.00 © 2013 IEEE
(1)
where s and l are integers that scale and dilate the mother function
to generate different wavelet families such as Daubechies, Symlet,
Coiflet, Dmey and so on.
$$ w(x) = \sum_{k=-1}^{N-2} \cdots \tag{2} $$
where the $c_k$ are the wavelet coefficients and w is the scaling function for the corresponding mother function. The
coefficients $\{c_0, c_1, c_2, \ldots, c_n\}$ are placed in a transformation
matrix. These coefficients are ordered using two dominant
patterns: smoothing coefficients (approximation) and detail
coefficients [10].
In the proposed method we use only the level 1 approximation
and detail coefficients. Level 1 coefficients are obtained when
the discrete wavelet transform is applied just once to the original
signal. Most of the signal energy is concentrated in the
approximation coefficients [11]. The detail coefficients contain
the high-frequency part of the speech.
$$ \{cA_1, cD_1\} = \mathrm{DWT}(x) \tag{3} $$
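As a rough illustration of a single-level decomposition, the sketch below computes level 1 approximation and detail coefficients with the orthonormal Haar wavelet. Haar is used here only because it fits in a few lines; it is not claimed to be the wavelet the method relies on. The function name `haar_dwt_level1` and the sample values are hypothetical.

```python
import math


def haar_dwt_level1(x):
    """Return (cA1, cD1) for one level of the orthonormal Haar DWT.

    Assumption: len(x) is even. A fuller implementation would pad
    odd-length input instead of rejecting it.
    """
    if len(x) % 2 != 0:
        raise ValueError("input length must be even for this sketch")
    scale = 1.0 / math.sqrt(2.0)
    # Approximation: scaled pairwise sums (low-frequency content).
    cA1 = [(x[i] + x[i + 1]) * scale for i in range(0, len(x), 2)]
    # Detail: scaled pairwise differences (high-frequency content).
    cD1 = [(x[i] - x[i + 1]) * scale for i in range(0, len(x), 2)]
    return cA1, cD1


# Illustrative data: a slowly varying signal with one sharp click at index 5.
x = [0.10, 0.12, 0.11, 0.13, 0.12, 0.90, 0.13, 0.12]
cA1, cD1 = haar_dwt_level1(x)

# The click shows up as the largest-magnitude detail coefficient.
click_pair = max(range(len(cD1)), key=lambda i: abs(cD1[i]))
```

In this toy input the largest detail coefficient belongs to the sample pair that contains the click, which is the property a detail-coefficient-based detector would exploit.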
(4)
C. Interpolation
For longer-duration impulsive sounds, some of the time
instants at which the noise occurs may go undetected. We use
the following method to mitigate this problem:
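$$ \hat{Z}(k) = \begin{cases} 1, & \text{if } (Z(j) = 1 \text{ and } Z(m) = 1), \\ Z(k), & \text{elsewhere,} \end{cases} \qquad \forall k : (j < k < m) \text{ and } (m - j < w) \tag{8} $$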
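The gap-filling rule can be sketched as follows, assuming the detector output is a list of 0/1 flags and the window w is given in samples. The function name `fill_detection_gaps`, the sampling frequency and the window duration are illustrative assumptions, not values from the paper.

```python
def fill_detection_gaps(Z, w):
    """Set flags to 1 between two detections that lie less than w samples apart.

    Z is a list of 0/1 detector flags. For every pair of detections at
    positions j < m with m - j < w, all flags strictly between them become 1.
    Checking neighbouring detections is enough: if two detections are closer
    than w, so is every pair of detections between them.
    """
    filled = list(Z)
    ones = [i for i, flag in enumerate(Z) if flag == 1]
    for j, m in zip(ones, ones[1:]):
        if m - j < w:
            for k in range(j + 1, m):
                filled[k] = 1
    return filled


# Hypothetical numbers: 8 kHz sampling and a 1 ms window give w = 8 samples.
fs = 8000
tw = 0.001
w = round(fs * tw)

Z = [0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0]
Z_filled = fill_detection_gaps(Z, w)
# Detections at 1 and 4 are 3 samples apart (< 8), so positions 2 and 3 are set.
# Detections at 4 and 14 are 10 samples apart, so that gap is left untouched.
```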
$$ w = f_s\, t_w \tag{9} $$
[Figure: plot with a "Threshold" legend entry]
III.
$$ \prod_{k=j}^{j+m} \hat{Z}(k) = 1, \tag{10} $$
$$ l = j - m, \quad h = j + 2m; $$
$$ t(1 : h - l + 1) = |x(l : h)| $$
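The extraction of the temporary sequence might look like the sketch below. It assumes that the run condition refers to maximal runs of consecutive ones in the detector output, uses 0-based indexing, and clamps l and h to the signal boundaries. None of these details are stated in the text, and the helper names `find_runs` and `temporary_sequence` are hypothetical.

```python
def find_runs(Z):
    """Return (j, m) for each maximal run of ones Z[j], ..., Z[j + m]."""
    runs = []
    k = 0
    n = len(Z)
    while k < n:
        if Z[k] == 1:
            j = k
            while k + 1 < n and Z[k + 1] == 1:
                k += 1
            runs.append((j, k - j))
        k += 1
    return runs


def temporary_sequence(x, j, m):
    """Return (l, h, t) with l = j - m, h = j + 2m and t = |x[l..h]|.

    Assumption: l and h are clamped to the valid index range of x.
    """
    l = max(0, j - m)
    h = min(len(x) - 1, j + 2 * m)
    t = [abs(v) for v in x[l:h + 1]]
    return l, h, t


# Illustrative data: a three-sample pulse flagged at indices 3..5.
x = [0.1, -0.2, 0.1, 0.9, -0.8, 0.7, 0.1, -0.1, 0.2, 0.1]
Z = [0, 0, 0, 1, 1, 1, 0, 0, 0, 0]

runs = find_runs(Z)                      # [(3, 2)]
j, m = runs[0]
l, h, t = temporary_sequence(x, j, m)    # l = 1, h = 7, len(t) = 3m + 1 = 7
```

Away from the signal boundaries the temporary sequence has 3m + 1 samples, so the flagged run is surrounded by m samples of context on each side.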
B. Median Filtering of the temporary sequence
The temporary sequence obtained from the previous step
is passed through a median filter.
$$ w = 2m + 1, \qquad temp = \mathrm{median}_w(t) \tag{11} $$
Here w represents the width of the window used. As w is just
more than twice the width of the local noise pulse at the output of
the detector $\hat{Z}$, the filter is able to remove the impulsive noise.
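A minimal sliding-median sketch of this step is given below. The edge handling (truncating the window at the ends of the sequence) is an assumption, since the text does not say how boundaries are treated, and the function name `median_filter` and the sample values are illustrative.

```python
from statistics import median


def median_filter(t, w):
    """Sliding median of odd width w over the sequence t.

    Assumption: near the ends of t the window is truncated rather than padded.
    """
    if w % 2 == 0:
        raise ValueError("window width w must be odd")
    half = w // 2
    return [median(t[max(0, i - half): i + half + 1]) for i in range(len(t))]


# Illustrative data: a two-sample pulse (0.9, 0.8) on a low-level background.
t = [0.1, 0.2, 0.1, 0.9, 0.8, 0.1, 0.2, 0.1, 0.1]
m = 2              # pulse width in samples
w = 2 * m + 1      # window just over twice the pulse width
temp = median_filter(t, w)
```

With a pulse of m samples and a window of 2m + 1, the pulse never fills more than half of any window, so the median stays at the background level. A pulse wider than m samples would survive at its centre.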
$$ \hat{s}(k) = \begin{cases} \cdots, & \forall l : j < l + 1 < j + m, \\ \cdots, & \text{elsewhere.} \end{cases} \tag{12} $$
The third plot in Figure 4 shows the result of the adaptive
median filtering process, $\hat{s}$, which is the estimate of the uncorrupted
speech signal s.
IV. IMPLEMENTATION RESULTS
[Figure: plots with legend entries "Detector Output" and "Noise"]
Wavelet             Efficiency
Haar                46.41%
Daubechies (db6)    93.90%
Symlet (sym2)       84.20%
Coiflet             83.01%
Dmey                94.77%
B. Detector efficiency
The efficiency of the detector presented in
Section II is evaluated here. We define the efficiency metric
as follows:
$$ \text{efficiency} = \frac{100\,(N - \text{errors})}{N} \tag{15} $$
Here, N represents the number of samples being processed and
errors gives the total number of errors made in the detection
process. An error may occur in two ways:
1)
2)
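The efficiency metric can be computed as in the sketch below. It assumes that a ground-truth mask of corrupted samples is available and that the two kinds of error are a clean sample flagged as noisy and a noisy sample left unflagged. That reading, the function name `detector_efficiency` and the sample masks are assumptions made for illustration.

```python
def detector_efficiency(detected, truth):
    """Return 100 * (N - errors) / N for two equal-length 0/1 sequences.

    Assumption: an error is any sample where the detector flag differs from
    the ground-truth flag (a false alarm or a missed detection).
    """
    if len(detected) != len(truth):
        raise ValueError("sequences must have the same length")
    N = len(truth)
    errors = sum(1 for d, g in zip(detected, truth) if d != g)
    return 100.0 * (N - errors) / N


# Illustrative masks: one false alarm (index 1) and one miss (index 4).
truth    = [0, 0, 1, 1, 1, 0, 0, 0, 0, 0]
detected = [0, 1, 1, 1, 0, 0, 0, 0, 0, 0]
efficiency = detector_efficiency(detected, truth)   # 80.0
```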
(16)
Corrupted speech signal    SNR         Correlation coefficient (s, s+n)
x1                         3.80 dB     0.841
x2                         -3.56 dB    0.5591
REFERENCES
[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
[14]
[15]
          Improvement in SNR                Improvement in correlation coefficient
Signal    Proposed Technique    SDROM       Proposed Technique    SDROM
x1        8.93 dB               4.63 dB     15.80%                10.70%
x2        18.7 dB               11.98 dB    76.10%                66.50%

TABLE III: The improvement in SNR and the percentage increase in correlation coefficient are indicated for the proposed
technique and the SDROM algorithm used in [4].
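The two figures of merit in the table could be computed along the following lines. The definitions used here are the conventional ones and are assumptions: SNR as the ratio of clean-signal energy to error energy in dB, the Pearson correlation coefficient, and the percentage increase taken relative to the correlation before processing. The exact definitions in the paper are not recoverable from the surviving text. All function names and data below are illustrative and do not reproduce the paper's results.

```python
import math


def snr_db(clean, observed):
    """SNR in dB: clean-signal energy over the energy of (observed - clean)."""
    signal_energy = sum(s * s for s in clean)
    error_energy = sum((o - s) ** 2 for s, o in zip(clean, observed))
    return 10.0 * math.log10(signal_energy / error_energy)


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


# Toy data: a clean waveform, a copy with one impulse, and a restored copy.
clean    = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
noisy    = [0.0, 0.5, 2.0, 0.5, 0.0, -0.5, -1.0, -0.5]
restored = [0.0, 0.5, 0.9, 0.5, 0.0, -0.5, -1.0, -0.5]

snr_gain = snr_db(clean, restored) - snr_db(clean, noisy)   # in dB
rho_before = correlation(clean, noisy)
rho_after = correlation(clean, restored)
rho_gain_percent = 100.0 * (rho_after - rho_before) / rho_before
```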
V. CONCLUSION