
Proceedings of National Conference on Networking, Embedded and Wireless Systems, NEWS-2010, BMSCE

Overlap-Save and Overlap-Add Methods to Remove the DC Component from a Speech Sample


Vasanth Krishna K. M., Electronics and Communication Engineering,
BNM Institute of Technology, Bangalore. Email: vasanth.krishna21@gmail.com

Methods for DC removal are of interest to many DSP practitioners. When an FFT is performed on a signal with a DC bias, a large-amplitude DC spectral component at zero Hz overshadows its spectral neighbors. This paper proposes a method for removing the DC component from a speech signal using the overlap-save and overlap-add methods. The continuous speech signal is sampled at a rate of 8000 samples per second, and overlap-add and overlap-save processing is then used to remove the DC component.

1. INTRODUCTION

When we digitize analog signals using an analog-to-digital (A/D) converter, the converter's output typically contains some small DC bias; i.e., the average of the digitized time samples is not zero. That DC bias may have come from the original analog signal or from imperfections within the A/D converter. Another source of DC bias contamination in DSP is the truncation of a discrete sequence from a B-bit representation to word widths of less than B bits. Whatever the source, unwanted DC bias on a signal can cause problems [2]. When we perform spectrum analysis, any DC bias on the signal shows up in the frequency domain as energy at zero Hz, the X(0) spectral sample. For an N-point FFT, the X(0) spectral value is proportional to N and becomes inconveniently large for large FFT sizes. When we plot the spectral magnitudes, the plotting software accommodates the large X(0) value and squashes down the remainder of the spectrum, in which we are more interested. A non-zero DC bias level in audio signals is particularly troublesome because concatenating two audio signals, or switching between two audio signals, results in unpleasant audible clicks. In modern digital communications systems, a DC bias on quadrature signals degrades system performance and increases bit error rates [3]. With that said, it is clear that methods for DC removal are of interest to many DSP practitioners: when the FFT is performed on such a signal, a large-amplitude DC spectral component at 0 Hz will overshadow its spectral neighbors.

Let us consider a particular speech signal, the English word "majesty". Speech recorded in real conditions from a microphone usually has a constant component, called the DC component in electronics. As explained below, it is essential that this component is removed.
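As a quick illustration of the X(0) behaviour described above, the following MATLAB sketch (not from the paper; the bias and noise values are assumed purely for illustration) adds a small DC bias to a noise sequence and shows that |X(0)| grows in proportion to the FFT length N:

for N = [256 1024 4096]
    x = 0.1 + 0.05*randn(N, 1);   % small DC bias plus low-level noise
    X = fft(x);
    fprintf('N = %5d   |X(0)| = %9.2f   N*mean(x) = %9.2f\n', ...
            N, abs(X(1)), N*mean(x));
end

Since X(0) is simply the sum of the time samples, it equals N times the mean of the signal, which is why the zero-Hz bin dominates the spectrum for large FFT sizes.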

Fig 1: Spectrum of the word "majesty" with and without the DC component.

There are two methods of removing the DC component. In method 1 the forward fast Fourier transform (FFT) is taken, the first frequency component is then cleared, and finally the inverse FFT (IFFT) is taken. The code follows:

xf = fft(x);    % forward FFT
xf(1) = 0;      % clear the first (DC) component
xc = ifft(xf);  % inverse FFT
y = real(xc);   % take only the real part

Method 2 involves subtracting the DC offset from the signal. The DC component is the offset relative to zero and is calculated as the mean of the signal. For example, if the mean of the signal is -0.0256, the signal is situated below the zero axis; in other words, the signal has a negative offset. By subtracting the offset from the signal we return the signal to zero. The code follows:

ofs = mean(x);  % calculate the mean (DC offset)
y = x - ofs;    % correct the offset

In this paper the first method explained above is used to remove the DC component from the speech signal.

In practical applications involving the linear filtering of signals, the input sequence is often very long. This is especially true in real-time signal processing applications concerned with signal monitoring and analysis. Since linear filtering performed via the FFT involves operating on a block of data of limited size, a long input sequence must be segmented into fixed-size blocks prior to processing. Because the filtering is linear, successive blocks can be processed one at a time via the FFT, and the output blocks are fitted together to form the overall output sequence. We now describe two methods for linear FIR filtering of a long sequence on a block-by-block basis using the FFT: the overlap-add method and the overlap-save method. For both methods we assume that the FIR filter has duration M and that the input data sequence is segmented into blocks of L points, where L >> M.

2. OVERLAP-ADD METHOD

First, segment the input signal into sections of length L and convolve each section with the FIR filter of length P. The linear convolution of one section of the input with the FIR filter results in a sequence y[n] of length (L + P - 1) [1]. Therefore, we can use a DFT of length (L + P - 1) to compute the convolution without time aliasing. As shown in Fig 2, the nonzero points of adjacent filtered sections overlap by (P - 1) points, and these overlapping points are added together to construct the output. This procedure is called the overlap-add method.

2.1 Algorithm
1. Break the input signal x(n) into non-overlapping blocks xm(n) of length L.
2. Zero-pad h(n) to length N = L + M - 1.
3. Take the N-point DFT of h(n) to give H(k), k = 0, 1, ..., N - 1.
4. For each block m:
   4.1 Zero-pad xm(n) to length N = L + M - 1.
   4.2 Take the N-point DFT of xm(n) to give Xm(k), k = 0, 1, ..., N - 1.
   4.3 Multiply: Ym(k) = Xm(k) H(k), k = 0, 1, ..., N - 1.
   4.4 Take the N-point IDFT of Ym(k) to give ym(n), n = 0, 1, ..., N - 1.
5. Form y(n) by overlapping the last M - 1 samples of ym(n) with the first M - 1 samples of ym+1(n) and adding the result. A MATLAB sketch of this procedure is given below.
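A minimal MATLAB sketch of the overlap-add procedure above; it is not the paper's exact code, and the function name and the convention that x, h and the block length L are supplied as arguments are assumed here for illustration.

function y = overlap_add(x, h, L)
% Overlap-add FIR filtering of a long sequence x with filter h,
% processing L new input samples per block (L >> length(h)).
x = x(:);  h = h(:);
M = length(h);
N = L + M - 1;                            % DFT length (steps 2-3)
H = fft(h, N);
nBlocks = ceil(length(x) / L);
y = zeros(nBlocks*L + M - 1, 1);
for m = 1:nBlocks
    s   = (m - 1)*L + 1;
    blk = x(s : min(s + L - 1, end));     % step 1: next block of L samples
    ym  = real(ifft(fft(blk, N) .* H));   % steps 4.1-4.4
    y(s : s + N - 1) = y(s : s + N - 1) + ym;   % step 5: overlap and add
end
y = y(1 : length(x) + M - 1);             % linear-convolution length
end

For any block length L the output agrees with the direct linear convolution conv(x, h).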

Fig 2: Overlap-add method.

3. OVERLAP-SAVE METHOD

In the overlap-add method, after computing each section we need to store (P - 1) values of y[n] and wait for the next data segment in order to add the overlapping points. When this is not desirable, we can use an alternative method, the overlap-save method. In circular convolution not all points are corrupted by time aliasing: the first (P - 1) points of each segment are time-aliased, but the remaining L - (P - 1) = (L - P + 1) points are equal to the linear convolution. Therefore, as shown in Fig 3, the portion of each output section in the region 0 ≤ n ≤ P - 2 is discarded, and the remaining samples are saved to construct the final filtered output.

3.1 Algorithm
1. Insert M - 1 zeros at the beginning of the input sequence x(n).
2. Break the padded input signal into overlapping blocks xm(n) of length N = L + M - 1.
3. Zero-pad h(n) to length N = L + M - 1.
4. Take the N-point DFT of h(n) to give H(k), k = 0, 1, ..., N - 1.
5. For each block m:
   5.1 Take the N-point DFT of xm(n) to give Xm(k), k = 0, 1, ..., N - 1.
   5.2 Multiply: Ym(k) = Xm(k) H(k), k = 0, 1, ..., N - 1.
   5.3 Take the N-point IDFT of Ym(k) to give ym(n), n = 0, 1, ..., N - 1.
   5.4 Discard the first M - 1 points of each output block ym(n).
6. Form y(n) by appending the remaining (i.e., last) L samples of each block ym(n). A MATLAB sketch of this procedure is given below.
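A minimal MATLAB sketch of the overlap-save procedure above, under the same assumed calling convention as the overlap-add sketch (it is not the paper's exact code):

function y = overlap_save(x, h, L)
% Overlap-save FIR filtering: each N-point block re-uses the last M-1
% samples of the previous block, and the first M-1 output points of
% every block (corrupted by circular wrap-around) are discarded.
x = x(:);  h = h(:);
M = length(h);
N = L + M - 1;                            % block/DFT length (steps 2-4)
H = fft(h, N);
xp = [zeros(M - 1, 1); x];                % step 1: prepend M-1 zeros
nBlocks = ceil(length(x) / L);
xp = [xp; zeros(nBlocks*L + M - 1 - length(xp), 1)];   % pad the tail
y = zeros(nBlocks*L, 1);
for m = 1:nBlocks
    s   = (m - 1)*L + 1;
    blk = xp(s : s + N - 1);              % step 2: overlapping N-point block
    ym  = real(ifft(fft(blk) .* H));      % steps 5.1-5.3
    y((m - 1)*L + 1 : m*L) = ym(M : N);   % steps 5.4 and 6: keep the last L samples
end
y = y(1 : length(x));                     % same length as the input
end

The first length(x) samples of the result agree with the linear convolution conv(x, h).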

Fig 3: Overlap-save method.

4. EXPERIMENTAL RESULTS

A particular speech signal of 8 s duration, shown in Fig 4 and containing a DC component, was taken, and the DC component was removed using the overlap-add and overlap-save methods. The resultant speech signal is shown in Fig 5. The continuous speech signal is sampled at 8000 samples per second and the samples are arranged in blocks of 128. h(0) was taken as 0 and the remaining h(n), 1 ≤ n ≤ 128, were taken as 1.

In the overlap-save method, x1(n) was constructed by padding the first 48 values with zeros, i.e., x1(n) = 0 for 1 ≤ n ≤ 48, and the remaining values of x1(n), 48 < n ≤ 128, were filled with input samples. For the remaining blocks xm(n), 2 ≤ m ≤ N, the first 48 values were taken from the last 48 values of xm-1(n), and the remaining values of xm(n), 48 < n ≤ 128, were filled with input samples. Here N is the number of input samples divided by 80; the divisor is 80 because, out of each block of 128 values, 48 are taken from the previous block and only 80 values are taken from the input samples. Then the 128-point DFT of h(n) was taken to get H(k) and, for each block m, the 128-point DFT of xm(n) was taken to get Xm(k). The products Ym(k) = Xm(k) H(k) were formed and the 128-point IDFT of Ym(k) was taken. Finally, the first 48 points of each output block ym(n) were discarded to obtain y(n), as shown in Fig 3.

In the overlap-add method, xm(n) was constructed by filling xm(n) with input samples for 1 ≤ n ≤ 80; the remaining values, 81 ≤ n ≤ 128, were padded with zeros [4]. Then the 128-point DFT of h(n) was taken to get H(k) and, for each block m, the 128-point DFT of xm(n) was taken to get Xm(k). The products Ym(k) = Xm(k) H(k) were formed and the 128-point IDFT of Ym(k) was taken. Finally, y(n) was formed by overlapping the last 48 samples of ym(n) with the first 48 samples of ym+1(n) and adding the result, as shown in Fig 2.

Fig 4: Spectrum of the speech sample of 8 s length with DC component.
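As a concrete illustration of the block layout described above for the overlap-save case, the following MATLAB sketch (variable names are assumed; x is taken to hold the sampled speech vector) builds the 128-sample blocks with 80 new samples and a 48-sample carry-over per block:

fs      = 8000;                  % samples per second
blockN  = 128;                   % block / DFT length
newL    = 80;                    % new input samples taken per block
overlap = blockN - newL;         % 48 samples carried over from the previous block
nBlocks = ceil(length(x) / newL);
xPad    = [zeros(overlap, 1); x(:); zeros(nBlocks*newL - length(x), 1)];
blocks  = zeros(blockN, nBlocks);
for m = 1:nBlocks
    s = (m - 1)*newL + 1;
    blocks(:, m) = xPad(s : s + blockN - 1);   % first 48 samples repeat block m-1
end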


Fig 5: Spectrum of the speech sample depicted earlier without the DC component.

The DC content present in the original signal was 3%, and after the overlap-add and overlap-save processing it was 0%. Frequency analysis of the original signal showed a DC component, as shown in Fig 6; the frequency analysis of the signal after removal of the DC component is shown in Fig 7.

Fig 6: Frequency response of the original speech sample.

Fig 7: Frequency response of the speech signal after the DC component is removed.

5. CONCLUSION

In this paper the overlap-add and overlap-save methods are used to remove the DC component from a speech signal. Spectrum analysis shows that any DC bias on the signal appears in the frequency domain as energy at zero Hz, the X(0) spectral sample. For an N-point FFT the X(0) spectral value is proportional to N and becomes inconveniently large for large FFT sizes; when the spectral magnitudes are plotted, the plotting software accommodates the large X(0) value and squashes down the remainder of the spectrum, in which we are more interested. Hence the DC component in a speech signal should be removed.

References
[1] J. G. Proakis and D. G. Manolakis, Digital Signal Processing: Principles, Algorithms and Applications, Third Edition, Eastern Economy Edition, Prentice-Hall International, Inc.
[2] R. G. Lyons, Understanding Digital Signal Processing, Second Edition.
[3] M. Al-Akaidi, Fractal Speech Processing, Cambridge University Press.
[4] L. Rabiner and B.-H. Juang, Fundamentals of Speech Recognition, Prentice-Hall International, Inc.
[5] Lecture 16, Discrete-Time Signal Processing, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology.
[6] J. Minkoff, Signal Processing Fundamentals and Applications for Communications and Sensing Systems, Artech House, Boston/London.

