TechnoFocus – 2012, DJSCOE

ADAPTIVE FILTER IMPLEMENTATION USING FPGA
Dhruvil Shah1, Ruchik Vora1, Jay Khandhar1, Mr. Ninad Mehendale2
1. Undergraduate Student, EXTC Department, DJSCOE, Vile Parle (W), Mumbai.
2. Assistant Professor, EXTC Department, DJSCOE, Vile Parle (W), Mumbai.
Email: dhruvil_shah@hotmail.com, ruchikdv@gmail.com, jaykhandhar@gmail.com, ndm243@gmail.com

Abstract: Filtering data in real time requires dedicated hardware to meet demanding timing requirements. If the statistics of the signal are not known, adaptive filtering algorithms can be used to estimate the signal statistics iteratively. Modern Field Programmable Gate Arrays (FPGAs) include the resources needed to design efficient filtering structures. Furthermore, some manufacturers now include complete microprocessors within the FPGA fabric. This mix of hardware and embedded software on a single chip is ideal for fast filter structures with arithmetic-intensive adaptive algorithms. This paper aims to combine efficient filter structures with optimized code to create a solution for various adaptive filtering problems. Several different adaptive algorithms have been coded in VHDL. The designs are evaluated in terms of design time, filter throughput, hardware resources and power consumption. Adaptive filters learn the statistics of their operating environment and continually adjust their parameters accordingly. In practice, signals of interest often become contaminated by noise or other signals occupying the same band of frequencies. When the signal of interest and the noise reside in separate frequency bands, conventional linear filters are able to extract the desired signal. However, when there is spectral overlap between the signal and noise, or when the statistics of the signal or the interfering signal change with time, fixed-coefficient filters are inappropriate. In these situations, adaptive algorithms are needed to continuously update the filter coefficients. Because of their ability to perform well in unknown environments and to track statistical time variations, adaptive filters have been employed in a wide range of fields. This paper focuses on one particular application, namely noise cancellation, as it is the application most likely to be required in an embedded VLSI implementation.

Keywords: FPGA, Adaptive Algorithm, Filter, Noise.

I. INTRODUCTION

Digital Signal Processing, which spans a wide variety of application areas including speech and image processing, communications, neural networks and so on, is becoming increasingly important in our daily life. Digital signal processing applications impose considerable constraints on area, power dissipation, speed and cost, and thus the design tools should be chosen carefully. The most commonly used tools are Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs) and Field Programmable Gate Arrays (FPGAs). A DSP is well suited to extremely complex, math-intensive tasks, but it cannot handle high-sampling-rate applications because of its serial architecture. An ASIC can meet all the constraints of a DSP, but it lacks flexibility and requires a long design cycle. An FPGA can make up for the disadvantages of both ASIC and DSP. With its flexibility, short time-to-market, risk mitigation and lower system cost, the FPGA has become the first choice for many digital circuit designers.

In practice, signals of interest often become contaminated by noise or other signals occupying the same band of frequencies. When the signal of interest and the noise reside in separate frequency bands, conventional linear filters are able to extract the desired signal. However, when there is spectral overlap between the signal and noise, or when the statistics of the signal or the interfering signal change with time, fixed-coefficient filters are inappropriate.

Figure 1: A narrowband interference N(f) in a wideband signal S(f).

This situation occurs frequently when various modulation technologies operate in the same range of frequencies. In fact, in mobile radio systems, co-channel interference is often the limiting factor rather than thermal or other noise sources. If the statistics of the noise are not known beforehand, or if they change over time, the coefficients of the filter cannot be specified in advance. In these situations, adaptive algorithms are needed to continuously update the filter coefficients.

II. FPGA ARCHITECTURE

An FPGA is an alternative type of Programmable Logic Device (PLD). FPGAs have a flexible, gate-array-like structure with a hierarchical interconnect arrangement. The fundamental part of the FPGA is the look-up table (LUT), which acts as a function generator or can alternatively be configured as ROM or RAM. FPGAs also include fast carry logic to adjacent cells, making them suitable for arithmetic functions and DSP applications. The majority of FPGAs are SRAM-based and can therefore be programmed as easily as standard SRAM. Thus, FPGAs are volatile and need to be programmed each time power is applied. This is normally accomplished with another part of the circuit that reloads the configuration bit stream, such as a PROM. The configuration bit stream stored in the SRAM controls the connections made and also the data stored in the LUTs. The LUTs are essentially small memories that can compute arbitrary logic functions. The configurable logic blocks (CLBs) are the basic blocks of an FPGA and are generally placed in an island-style arrangement. Each logic block in the array is connected to routing resources controlled by an interconnect switch matrix. With this layout, a very large range of connections can be made between resources. A disadvantage of this flexible routing structure is that, unlike in a CPLD, signal paths are not fixed beforehand, which can lead to unpredictable timing. However, FPGAs offer increased logic complexity and flexibility.

Figure 2: Basic Architecture of an FPGA
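To make the LUT description above concrete, the short C model below (our own illustration, not tied to any particular Xilinx primitive) treats a 4-input LUT as a 16-bit truth table addressed by the four input bits; loading a different 16-bit constant is the software analogue of loading different configuration bits.

#include <stdint.h>
#include <stdio.h>

/* A 4-input LUT is simply a 16-bit memory indexed by the four inputs.
   The 16-bit "truth table" plays the role of the configuration bits
   loaded from the bit stream. (Illustrative model only.) */
static int lut4(uint16_t truth_table, int a, int b, int c, int d)
{
    int index = (d << 3) | (c << 2) | (b << 1) | a;   /* 0..15 */
    return (truth_table >> index) & 1;
}

int main(void)
{
    /* Truth table for f = a XOR b (inputs c and d ignored):
       bit i of the table is the output for input pattern i. */
    uint16_t xor_ab = 0x6666;
    printf("%d %d %d %d\n",
           lut4(xor_ab, 0, 0, 0, 0),    /* expected 0 */
           lut4(xor_ab, 1, 0, 0, 0),    /* expected 1 */
           lut4(xor_ab, 0, 1, 0, 0),    /* expected 1 */
           lut4(xor_ab, 1, 1, 0, 0));   /* expected 0 */
    return 0;
}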

III. ADAPTIVE FILTER

1. Overview
Adaptive filters learn the statistics of their operating environment and continually adjust their parameters accordingly. The goal of any filter is to extract useful information from noisy data. Whereas a normal fixed filter is designed in advance with knowledge of the statistics of both the signal and the unwanted noise, the adaptive filter continuously adjusts to a changing environment through the use of recursive algorithms. This is useful when the statistics of the signals are not known beforehand or when they change with time.

Figure 3: Block diagram for the adaptive filter problem

The discrete adaptive filter shown in Figure 3 accepts an input u(n) and produces an output y(n) by convolution with the filter's weights w(k). The desired reference signal d(n) is compared to the output to obtain an estimation error e(n). This error signal is used to incrementally adjust the filter's weights for the next time instant, as illustrated in the sketch below. Several algorithms exist for the weight adjustment, such as the Least-Mean-Square (LMS) and Recursive Least-Squares (RLS) algorithms. The choice of algorithm depends upon the convergence time needed and the computational complexity available, as well as the statistics of the operating environment.
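As a minimal illustration of this structure (our own C sketch, not the authors' VHDL), one iteration of the filter in Figure 3 is a fixed FIR convolution plus an error computation, with the weight-update rule supplied by whichever adaptive algorithm is chosen.

#include <stddef.h>

#define M 8   /* number of taps (illustrative) */

/* One iteration of the generic adaptive filter of Figure 3: filter the
   input, compare against the desired reference, then let the chosen
   algorithm (LMS, RLS, ...) adjust the weights from the error.
   All names here are ours, not the authors' VHDL entities. */
typedef void (*update_rule)(double w[M], const double u[M], double e);

double adaptive_step(double w[M], const double u[M], double d, update_rule adapt)
{
    double y = 0.0;
    for (size_t k = 0; k < M; ++k)       /* y(n) = sum_k w(k) * u(n-k) */
        y += w[k] * u[k];
    double e = d - y;                    /* e(n) = d(n) - y(n) */
    adapt(w, u, e);                      /* weight update supplied by the algorithm */
    return e;
}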

2. Applications
Because of their ability to perform well in unknown environments and to track statistical time variations, adaptive filters have been employed in a wide range of fields. However, there are essentially four basic classes of applications for adaptive filters: identification, inverse modeling, prediction, and interference cancellation, with the main difference between them being the manner in which the desired response is extracted. These are shown in Figure 4 (a), (b), (c) and (d), respectively. This paper focuses on one particular application, namely noise cancellation, as it is the application most likely to be required in an embedded VLSI implementation. Handheld radios and satellite systems that are contained on a single silicon chip require real-time processing.

Figure 4: Four Basic Classes of Adaptive Filtering Applications

3. Adaptive Algorithms
There are numerous methods for performing the weight update of an adaptive filter. There is the Wiener filter, which is the optimum linear filter in terms of mean squared error, and several algorithms that attempt to approximate it, such as the method of steepest descent. There is also the Least-Mean-Square (LMS) algorithm, developed by Widrow and Hoff and originally used for artificial neural networks. Finally, there are other techniques such as the Recursive-Least-Squares (RLS) algorithm and the Kalman filter. The choice of algorithm is highly dependent on the signals of interest and the operating environment, as well as the convergence time required and the computation power available.

IV. LMS ALGORITHM

The Least Mean Square (LMS) algorithm is similar to the method of steepest descent in that it adapts the weights by iteratively approaching the MSE minimum. Widrow and Hoff invented this technique in 1960 for use in training neural networks. The key difference is that instead of calculating the gradient at every step, the LMS algorithm uses a rough approximation to the gradient.
The LMS algorithm is a widely used algorithm for adaptive filtering. It is described by the following equations:

y(n) = ∑ wi(n) x(n−i), summed over i = 0, 1, …, M−1        (1)
e(n) = d(n) − y(n)                                          (2)
wi(n+1) = wi(n) + 2 u e(n) x(n−i)                           (3)

In these equations, the tap inputs x(n), x(n−1), …, x(n−M+1) form the elements of the reference signal, where M−1 is the number of delay elements. d(n) denotes the primary input signal, and e(n) denotes the error signal, which constitutes the overall system output. wi(n) denotes the i-th tap weight at the n-th iteration. In equation (3), the tap weights are updated in accordance with the estimation error, and the scaling factor u is the step-size parameter, which controls the stability and convergence speed of the LMS algorithm. The LMS algorithm is convergent in the mean square if and only if u satisfies the condition:

0 < u < 2 / (tap-input power)

where the tap-input power is ∑ E[|x(n−k)|²], with the sum taken over k = 0, 1, …, M−1.

The flow chart of the LMS algorithm is shown in Figure 5.

Figure 5: Flowchart of the LMS Algorithm
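A direct C transcription of the update equation (our sketch; the paper's implementation is in VHDL) is given below. With matching types it can also serve as the update rule plugged into the generic filter sketch of Section III.

#define M  8         /* filter length (illustrative) */
#define MU 0.125     /* step size u; must satisfy 0 < u < 2 / tap-input power */

/* LMS weight update of equation (3): each tap weight moves along the rough
   gradient estimate 2*u*e(n)*x(n-i).  The error e(n) = d(n) - y(n) of
   equation (2) is computed by the caller. */
void lms_update(double w[M], const double x[M], double e)
{
    for (int i = 0; i < M; ++i)
        w[i] += 2.0 * MU * e * x[i];
}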

V. HARDWARE IMPLEMENTATION

In this work, the adaptive noise canceller (ANC) described in the previous sections is implemented on the Xilinx Spartan-3E Starter Kit board, which provides a convenient development platform for embedded processing applications. The board contains a Xilinx XC3S500E Spartan-3E FPGA with up to 232 user-I/O pins in a 320-pin FPGA package and over 10,000 logic cells. The Spartan-3E Starter Kit also supports the MicroBlaze 32-bit embedded RISC processor and the Xilinx Embedded Development Kit (EDK) [10]. With these features, the Xilinx Spartan-3E Starter Kit is well suited for a hardware implementation of our ANC system. The hardware implementation process is described by the flowchart in Figure 6.

Figure 6: Flowchart of the FPGA Implementation Process

1. Hardware Architecture
The whole embedded system consists of a MicroBlaze core [10], two Fast Simplex Link (FSL) bus systems, an On-chip Peripheral Bus (OPB), a Local Memory Bus (LMB), an OPB peripheral (an RS232 controller), the on-chip block RAM and the user core. A block diagram of the architecture is shown in Figure 7.

Figure 7: Hardware Architecture
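On the processor side, samples would typically move to and from the user core over the FSL links. The fragment below is only a hedged sketch: putfsl and getfsl are the standard EDK FSL macros (from fsl.h), but the link IDs and the framing of x_in, d_in and e_out for this particular core are our assumptions, not taken from the paper.

/* Feed one reference/primary sample pair to the LMS user core over the
   two FSL links and read back the error output -- illustrative only. */
#include "fsl.h"

int lms_fsl_iteration(int x_sample, int d_sample)
{
    int e_out;
    putfsl(x_sample, 0);   /* reference input  x_in  -> FSL link 0 (assumed) */
    putfsl(d_sample, 1);   /* primary input    d_in  -> FSL link 1 (assumed) */
    getfsl(e_out, 0);      /* error output     e_out <- FSL link 0 (assumed) */
    return e_out;
}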

2. LMS Core Implementation
The LMS core is the central part of our hardware architecture. We programmed the LMS core in VHDL under the Xilinx ISE 9.1i platform and simulated it with ModelSim 6.1b. We then tested the validity of the LMS core by using System Generator 9.1.
The LMS core is divided into the five blocks listed below; a behavioural C model of their interaction follows Figure 8.
1. The Control Block arranges the timing of the whole system. It produces four enable signals, en_x, en_d, en_coee and en_err, which enable the Delay Block, the Weight Update Block and the Error Counting Block separately.
2. The Delay Block receives the reference signal x_in and the primary input signal d_in under control of the enable signals en_x and en_d, and produces the M-tap delayed signal x_out.
3. The Multiply-Accumulate (MAC) Block multiplies the M-tap reference signal x_out with the M-tap weight vector w element by element and adds the products together to obtain yn.
4. The Error Counting Block subtracts yn from dn to obtain the error signal e_out, which is also the output of the whole system. It also produces the feedback signal xemu by multiplying e_out, x_out and the scaling factor u.
5. The Weight Update Block updates the weight vector w(n) to w(n+1), which will be used in the next iteration.

Figure 8: Block Diagram of the LMS Core
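The listing below is a behavioural C model of one pass through these five blocks (our sketch, not the VHDL). The Control Block's enable sequencing is collapsed into the order of the function calls, and the factor 2 of equation (3) is assumed to be folded into the scaling factor u, since block 4 forms xemu = u · e_out · x_out.

#include <stddef.h>

#define M  8            /* number of taps (illustrative) */
#define MU 0.125        /* scaling factor u              */

static void delay_block(double x_out[M], double x_in)        /* Delay Block */
{
    for (size_t i = M - 1; i > 0; --i)
        x_out[i] = x_out[i - 1];                              /* shift the tap line */
    x_out[0] = x_in;
}

static double mac_block(const double x_out[M], const double w[M])  /* MAC Block */
{
    double yn = 0.0;
    for (size_t i = 0; i < M; ++i)
        yn += x_out[i] * w[i];
    return yn;
}

static double error_block(double dn, double yn)      /* Error Counting Block */
{
    return dn - yn;                                   /* e_out, the system output */
}

static void weight_update_block(double w[M], const double x_out[M], double e_out)
{
    for (size_t i = 0; i < M; ++i)
        w[i] += MU * e_out * x_out[i];                /* xemu = u * e_out * x_out */
}

/* One iteration; the Control Block would assert the enables in this order. */
double lms_core_step(double w[M], double x_out[M], double x_in, double d_in)
{
    delay_block(x_out, x_in);
    double yn    = mac_block(x_out, w);
    double e_out = error_block(d_in, yn);
    weight_update_block(w, x_out, e_out);
    return e_out;
}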

VI. EXPERIMENTAL RESULTS

1. Convergence Performance
The convergence performance of the LMS algorithm in the hardware architecture is analyzed using an 8-tap LMS adaptive filter with a constant desired signal of 0.5 and a constant input signal of 0.25. Since the convergence speed of the LMS algorithm is closely related to the step size u, three cases with different step sizes are compared. The three curves plotted in Figure 9 show the traces of the error between the desired signal and the output signal; the step sizes of the curves, from left to right, are 1/2, 1/4 and 1/8. From these cases we can see that the convergence speed rises as the step size increases, which is in accord with the theory of the LMS algorithm. Convergence performance can also be affected by the bit-truncation effect.

Figure 9: Convergence Behaviour of the LMS Algorithm for Three Step Sizes
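The same experiment is easy to rerun in floating-point software to see the qualitative behaviour. The harness below is our sketch, not the authors' design: the hardware uses fixed-point arithmetic, so its curves additionally show truncation effects. It prints |e(n)| for the three step sizes.

#include <math.h>
#include <stdio.h>

#define M 8

int main(void)
{
    const double mu_set[3] = { 0.5, 0.25, 0.125 };   /* step sizes 1/2, 1/4, 1/8 */
    for (int s = 0; s < 3; ++s) {
        double w[M] = { 0 }, x[M] = { 0 };
        double mu = mu_set[s];
        printf("# mu = %g\n", mu);
        for (int n = 0; n < 100; ++n) {
            for (int i = M - 1; i > 0; --i) x[i] = x[i - 1];
            x[0] = 0.25;                               /* constant input   */
            double y = 0.0;
            for (int i = 0; i < M; ++i) y += w[i] * x[i];
            double e = 0.5 - y;                        /* constant desired */
            for (int i = 0; i < M; ++i) w[i] += 2.0 * mu * e * x[i];
            printf("%d %g\n", n, fabs(e));             /* error trace      */
        }
    }
    return 0;
}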

2. Tracking Ability
Tracking ability is another important property of the LMS algorithm. We use an LMS filter (8 taps, u = 1/4) with the input signal shown in Figure 10, which is a sinusoid at 1/32 of the sampling frequency corrupted by sinusoidal noise at 1/3.2 of the sampling frequency. The SNR of the input signal is 6 dB. Figure 11 shows the tracking ability of the LMS filter; the desired signal is plotted as a blue dashed line, and the red solid line represents the resulting signal. We can see that after about 200 iterations the resulting signal has converged to the desired one.

Figure 10: Input Signal

Figure 11: Tracking Ability of the LMS Filter
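A test input of this kind can be regenerated as below. The 2:1 amplitude ratio between the wanted tone and the interferer is our assumption, chosen to give the stated 6 dB SNR, since the paper does not list the exact amplitudes.

#include <math.h>
#include <stdio.h>

/* Generate the desired tone at fs/32 and the corrupted input:
   desired + interferer at fs/3.2.  Amplitude ratio 2:1 -> power ratio 4,
   i.e. an SNR of about 6 dB. */
int main(void)
{
    const double pi = 3.14159265358979323846;
    for (int n = 0; n < 256; ++n) {
        double desired = 1.0 * sin(2.0 * pi * n / 32.0);    /* wanted tone */
        double noise   = 0.5 * sin(2.0 * pi * n / 3.2);     /* interferer  */
        printf("%d %g %g\n", n, desired, desired + noise);  /* d(n), input */
    }
    return 0;
}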

3. Software Implementation
In order to show the speed advantage of the hardware implementation, we compare it against a pure software implementation. In this architecture, the MicroBlaze embedded processor runs the whole system, and the LMS algorithm is written in C code. The pseudo C code of the software implementation is shown in Figure 12, and a sketch of one iteration is given below it. There are four major steps in the software implementation:
1) delay the input signal;
2) multiply-accumulate to obtain the intermediate signal y;
3) error calculation;
4) weight update.
We compare the software and hardware implementations in terms of clock cycles per iteration; the profiling results of the four steps are shown in Table I for three different values of N.

Figure 12: Pseudo C Code of the Pure Software Implementation
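Figure 12 itself is not reproduced here, but one iteration organised exactly as the four steps above would look roughly like the following C sketch (our reconstruction, not the authors' listing).

#define N  8           /* number of taps (the paper profiles several values of N) */
#define MU 0.125       /* step size u */

double lms_sw_iteration(double w[N], double x[N], double x_new, double d)
{
    int i;
    double y = 0.0, e;

    for (i = N - 1; i > 0; i--)            /* 1) delay the input signal   */
        x[i] = x[i - 1];
    x[0] = x_new;

    for (i = 0; i < N; i++)                /* 2) multiply-accumulate -> y */
        y += w[i] * x[i];

    e = d - y;                             /* 3) error calculation        */

    for (i = 0; i < N; i++)                /* 4) weight update            */
        w[i] += 2.0 * MU * e * x[i];

    return e;
}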


CONCLUSION

In this paper a hardware design is presented to implement the ANC system. The performance of the LMS algorithm implemented in hardware is comprehensively analyzed in terms of convergence performance, truncation effect and tracking ability. The experimental results confirm the feasibility of the high-speed FPGA architecture of the LMS algorithm. A software implementation is also presented. A comparison between the two architectures shows that the hardware implementation accelerates the LMS filtering process, and that the speed-up over the pure software implementation increases with the number of filter taps. The ANC system was chosen to validate the performance of the FPGA in a digital signal processing application; FPGA implementation will be even more effective for more complex digital signal processing systems.

REFERENCES
[1] B. Widrow, J. R. Glover, J. M. McCool, J. Kuntz, C. S. Williams, R. H. Hearn, J. R. Ziegler, E. Dong and R. C. Goodling, “Adaptive noise cancelling: Principles and applications,” Proc. IEEE, vol. 63, Dec. 1975, pp. 1692-1716.
[2] S. Haykin, Adaptive Filter Theory, Prentice-Hall, third edition, 2002.
[3] M. D. Meyer and D. P. Agrawal, “A high sampling rate delayed LMS filter architecture,” IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 40, Nov. 1993, pp. 727-729.
[4] www.xilinx.com

BIO DATA OF AUTHOR(S)

DHRUVIL SHAH
Student, Third Year, Electronics and
Telecommunication Dept. of Dwarkadas J.
Sanghvi College of Engineering, Mumbai.
Area of interest: Embedded Systems, Digital
Signal Processing and Data Compression.

JAY KHANDHAR
Student, Third Year, Electronics and
Telecommunication Dept. of Dwarkadas J.
Sanghvi College of Engineering, Mumbai.
Area of interest: Embedded Systems and Data
Compression.

RUCHIK VORA
Student, Third Year, Electronics and
Telecommunication Dept. of Dwarkadas J.
Sanghvi College of Engineering, Mumbai.
Area of interest: Embedded systems and
Signal Processing.

NINAD MEHENDALE
Asst. Professor, Electronics and Telecommunication Dept. of Dwarkadas J. Sanghvi College of Engineering, Mumbai. He completed his Diploma in Industrial Electronics from V.P.M.'s Polytechnic, his Engineering in Electronics from Somaiya College of Engineering, Vidyavihar, and his M.Tech in Embedded Systems from NMIMS University.
Area of interest: Embedded Systems, Neural Networks.
