
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 20, NO. 12, DECEMBER 2012 2241

Synthesis and Array Processor Realization of a 2-D IIR Beam Filter for Wireless Applications
Rimesh M. Joshi, Student Member, IEEE, Arjuna Madanayake, Member, IEEE, Jithra Adikari, Member, IEEE,
and Len T. Bruton, Fellow, IEEE

Abstract—A broadband digital beamforming algorithm is proposed for directional filtering of temporally-broadband bandpass space-time plane-waves at radio frequencies (RFs). The enhancement of desired waves, as well as rejection of undesired interfering plane-waves, is simulated. A systolic- and wavefront-array architecture is proposed for the real-time implementation of second-order spatially-bandpass (SBP) 2-D infinite impulse response (IIR) beam filters having potential applications in broadband beamforming of temporally down-converted RF signals. The higher speed of operation and potentially reduced power consumption of the asynchronous architecture of wavefront-array processors (WAPs) in comparison to the conventional synchronous hardware have emerging applications in radio astronomy, radar, navigation, space science, cognitive radio, and wireless communications. Further, the bit error rate (BER) performance improvement along with the reduced computational complexity of the 2-D IIR SBP frequency-planar digital filter over the digital phased array feed (PAF) beamformer is provided. A nominal BER versus signal-to-interference ratio (SIR) gain of 10–16 dB compared to the case where beamforming is not applied, and a gain of 2–3 dB at approximately half the number of parallel multipliers of the digital PAF, are observed. The results of application-specific integrated circuit (ASIC) synthesis of the digital filter designs are also presented.

Index Terms—Array processors, bit error rate (BER), digital phased array feed (PAF), field-programmable gate array (FPGA), multidimensional digital filters, spatial modulation, systolic, wavefront, wireless.

Manuscript received May 25, 2011; revised September 05, 2011; accepted October 13, 2011. Date of publication January 13, 2012; date of current version August 02, 2012.
R. M. Joshi and A. Madanayake are with the Department of Electrical and Computer Engineering, University of Akron, Akron, OH 44325-3904 USA (e-mail: rmj17@uakron.edu; arjuna@uakron.edu).
J. Adikari is with the Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada (e-mail: jithra.adikari@uwaterloo.ca).
L. T. Bruton is with the Department of Electrical and Computer Engineering, University of Calgary, Calgary, AB T2N 1N4, Canada (e-mail: bruton@ucalgary.ca).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TVLSI.2011.2174167
1063-8210/$26.00 © 2012 IEEE

I. INTRODUCTION

ULTRA-WIDEBAND (UWB) wireless communications [1]–[4], cognitive radio [5]–[8], and cooperative wireless sensor networks [9], [10] require highly directional and electronically steerable smart antenna arrays capable of broadband plane-wave (PW) filtering at RFs to improve the bit-error rate (BER), which is degraded by interference from multiple users and by multipath fading. These antenna arrays typically employ beamforming using analog delay-and-sum networks, fractional-delay-based delay-and-sum digital networks [1], digital phased array feeds (PAFs) [11]–[13], and multi-dimensional finite impulse response/infinite impulse response (FIR/IIR) digital filters [5], [14]. Digital signal processing (DSP)-based broadband smart antenna arrays have potential applications in UWB wireless communications [1], [2], [15], cognitive radio [5]–[8], software-defined radio [16], microwave imaging [17], space science and radio astronomy [18]–[21], and remote sensing and navigation [22], [23].

The systolic-array and scanned-array implementations of 2-D and 3-D IIR broadband frequency-planar filters for digital beamforming have been proposed in [25]–[27]. These filters are highly suitable for high-speed filtering of broadband space-time (ST) PWs based on their directions of arrival (DOAs). For example, a 2-D IIR beam filter has recently been practically verified for balanced antipodal Vivaldi antennas (BAVAs) [3], [28] using non-real-time software algorithms.

We propose a second-order 2-D IIR digital filter for the directional enhancement of temporally-broadband bandpass ST PWs [5], [29] (see Fig. 1). It is shown that the filter operates at an intermediate frequency (IF), leading to lower-speed VLSI circuits. We show that the proposed filters have lower computational complexity compared to the conventional delay-and-sum beamformers (approximately 70% fewer multipliers for similar performance [30], [31]) and are also of lower circuit complexity compared to 2-D FIR beamformers such as fan and trapezoidal filters [29], [32]–[34]. The lower computational complexity, closed-form design approach, broadband performance, electronic steerability, and availability of rapidly reconfigurable programmable logic realizations make these emerging 2-D ST digital filters attractive and promising for cognitive radio applications [5]–[8].

Massively-parallel systolic-array and wavefront-array architectures are proposed for the real-time VLSI implementation of the proposed digital filter. Systolic-array processors are well known for the implementation of real-time high-throughput beamforming algorithms [35]–[37]. These processors have arrays of identical processors which are highly modular, regular and highly interconnected, making them suitable for VLSI realizations for high-speed, especially RF, applications [38].

This paper presents the BER performance of a second-order 2-D IIR spatially-bandpass (SBP) beam filter [39] and compares it with that of a digital PAF beamformer for a 16-element uniform linear array (ULA). The asynchronous implementation, which seeks to address the design complexities, power consumption and timing issues affecting modern digital circuits [40], is also explored here by extending the direct-form-I realization of a second-order 2-D IIR SBP beam filter using novel clock-free asynchronous quasi delay insensitive (a-QDI) logic devices [40]. The removal of the global clock in the design leads to a reduction in design complexity, lower chip area, lower power consumption, and increased speed of operation compared to the synchronous implementations [40]. The wavefront-array [41]–[43] a-QDI architecture of the second-order IIR filter is implemented on a Speedster SPD60 asynchronous field-programmable gate array (FPGA) from Achronix Semiconductor [44], which uses so-called picoPIPE acceleration technology [44], [45] to deliver an improved speed performance.
Consider a time varying signal propagating in the far field
in the 3-D space . This signal can be approximated
by a 4-D ST PW signal [46] which is of the form
given by

(1)

where is the unit vector specifying the DOA in


the 3-D space ,
is the 1-D temporal intensity function in the DOA and
ms is the speed of propagation. When a 4-D
continuous ST PW signal is received by a ULA of sensors,
spaced apart, it reduces to the 2-D ST PW signal
with spatial DOA defined by the angle between the normal
to the -axis and the normal to the 2-D wavefront as shown in
Fig. 1, where and . The time-synchronously sampled 2-D ST PW, sampled every seconds using an array of analog-to-digital converters (ADCs) clocked at Hz, represented by [47] is given by

(2)

where and in the continuous domain, and represent desired ST PW and undesired interferences from multi-users with additive noise.

Let us consider Gaussian modulated cosine (GMC) signals given by

(3)

where is the carrier frequency and is a constant which is chosen such that the signal has the required bandwidth (double-sided bandwidth is ). Therefore, the desired 2-D ST PW signal is given by

(4)

where is the DOA of the desired PW.

Similarly, the undesired signals at the communication receivers often contain noise and interference from other signals. Let us denote be the number of interfering signals having DOAs , . Therefore, the undesired signal can be expressed as

(5)

Fig. 1. Second-order 2-D IIR spatially-bandpass digital beam filter for asynchronous FPGA implementation [24] showing ULA of BAVAs [3] and SC block at each antenna (consisting of LNA, local oscillator for down-conversion, LPF for image rejection and ADC). The outputs from the SC blocks are fed to an asynchronous FPGA implementing an array of PPCMs (described in detail in Figs. 8 and 10) for filtering a plane-wave having desired DOA , . Here, the uncertainty in the DOA is indicated by angle .

II. 2-D IIR SBP BEAM FILTER

The second-order 2-D IIR SBP digital beam filter [39] has been proposed for implementation in RF smart antenna applications for filtering temporally-broadband bandpass signals obtained from a down-converted (or bandpass sampled [48]) array of antennas. These 2-D IIR filters have possible new applications in wireless communication base-stations [49] due to their high directional selectivity and temporal broadband nature, as well as being fully steerable and free from fractional delays.

The digital implementation of the second-order 2-D IIR SBP frequency-planar filter is shown in Fig. 1. The broadband PWs received by the linear array of sensors are low-noise amplified
(LNAd), bandpass filtered (BPFd), synchronously down-converted to baseband and low-pass filtered (LPFd), uniformly synchronously time-sampled and amplitude-quantized, and then finally digitally processed using the 2-D IIR SBP beam filter. Alternatively, we could employ bandpass sampling at the LNA output using a BPF, thereby removing the need for down-conversion [48].

A. ROS of the Space-Time Plane-Wave

The region of support (ROS) of the 2-D frequency response of the ST PW lies along a line oriented at an angle , passing through the center of the 2-D space-time frequency , where and are spatial and temporal frequencies, respectively [50]. The angle is referred to as the spatio-temporal (ST) DOA of the PW. If there is some uncertainty in the spatial DOA of the PW, say , then the 2-D ROS of the ST PW occupies a trapezoidal region as shown in Fig. 2, where is the variation in spatio-temporal DOA due to the uncertainty in the spatial DOA . The temporal down-sampling by a factor of causes the ST DOA to change to an angle given by [29]

(6)

The trapezoidal ROS [29] of the broadband ST PW following down-conversion and down-sampling is shown in Fig. 2, where [39] is the spatial shift in the frequency of the PW (4).

Fig. 2. (a) 2-D frequency-domain ROS of the broadband bandpass plane-wave and (b) the ROS of the plane-wave following down-conversion and down-sampling, where spatio-temporal DOA (normalized to ).

B. Shape of the 2-D Filter Passband

A double-trapezoidal 2-D FIR filter has been proposed in [5], [29] for filtering temporally-broadband bandpass PWs (shown in Fig. 2). These FIR filters achieve high-performance directional enhancement of band-limited signals but are of very high order (typically 32 [5]), leading to a higher circuit complexity compared to the proposed IIR counterparts.

The 2-D IIR ST digital beam filter encompasses the desired trapezoidal passband of the PW centered on as shown in Fig. 2(b), while achieving a much lower computational complexity compared to the FIR filters [5].

C. Second-Order 2-D IIR SBP Beam Filter

The 2-D IIR SBP beam filter is an extension of a 2-D IIR broadband frequency-planar filter [25] for carrier-modulated signals, capable of selectively enhancing broadband PWs depending on their DOAs. The transfer function (TF) of a 2-D IIR frequency-planar beam filter [25] based on a first-order resistively terminated passive prototype network, which is practical bounded-input-bounded-output (P-BIBO) stable [50], [51], is given by

(7)

where

(8)

and for ; , and sets the selectivity of the filter.

The 2-D frequency response of the IIR beam filter (7) has a ROS along a line centered on the origin of the 2-D frequency axis. IF broadband beamforming applications, on the other hand, require beam-shaped 2-D passbands centered on a particular spatial frequency other than the 2-D frequency origin (0,0) (as is the case here for the GMC PW signal). The TF of the second-order 2-D IIR SBP filter [39] in partially separable form, described later in (10), can therefore be obtained by applying spatial modulation to the impulse response of (7), that is, by multiplying the impulse response by , where is the desired spatial shift from the center of the 2-D frequency spectrum [14], [39].

Let be the impulse response of the filter (7). Applying spatial modulation on the impulse response, we get the desired impulse response of the spatially modulated 2-D IIR filter

(9)

Let be the TF of the desired filter. Using the linear system of modulated z-transforms, for [48], where and , after simplification, we obtain the TF of the second-order 2-D IIR SBP frequency-planar beam filter [39] as

(10)

for , and where

(11)

Here, 2-D discrete input and 2-D discrete output , where , .
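The effect of spatial modulation can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes a modulation factor of the form exp(j·ω0·n1) with ω0 restricted to a DFT bin frequency, and it uses a random placeholder array `h` in place of the true impulse response of (7). The names `spatially_modulate`, `h`, and `k0` are illustrative only. Under these assumptions, modulating along the spatial index shifts the 2-D spectrum along the spatial-frequency axis and leaves the temporal-frequency axis untouched.

```python
import numpy as np

def spatially_modulate(h, k0):
    """Multiply h(n1, n2) by exp(j*2*pi*k0*n1/N1), a spatial shift of w0 = 2*pi*k0/N1.

    The sign and form of the modulating exponential are assumptions of this sketch.
    """
    n1 = np.arange(h.shape[0])[:, None]          # spatial sample index
    return h * np.exp(2j * np.pi * k0 * n1 / h.shape[0])

rng = np.random.default_rng(0)
h = rng.standard_normal((16, 32))                # placeholder impulse response (space x time)
k0 = 3                                           # hypothetical shift, in spatial DFT bins

H = np.fft.fft2(h)                               # spectrum before modulation
H_mod = np.fft.fft2(spatially_modulate(h, k0))   # spectrum after modulation

# The modulated spectrum equals the original spectrum circularly shifted
# by k0 bins along the spatial-frequency axis (axis 0).
shift_ok = np.allclose(H_mod, np.roll(H, k0, axis=0))
print("spectrum shifted along spatial axis only:", shift_ok)
```

The same property is what moves a passband centered on the 2-D frequency origin to a passband centered on a nonzero spatial frequency.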
Therefore, the complete TF of the second-order 2-D IIR SBP filter is given by

(12)

The filter coefficients and as computed in [14], [39] are given by

(13)

where we express the filter coefficients in terms of required spatio-temporal DOA (after down-sampling) and beam-width (selectivity). Here, is the desired spatial frequency shift from the origin of the 2-D frequency spectrum.

The magnitude frequency response of the second-order 2-D IIR SBP beam filter [39] as given by the TF (12) is shown in Fig. 3, which encompasses the trapezoidal ROS of the desired down-converted down-sampled (DCDS) PW shown in Fig. 2.

Fig. 3. Magnitude frequency response of second-order 2-D IIR SBP beam filter for , , and . Here, is given by (6).

III. VERIFICATION: AN EXAMPLE OF BROADBAND INTERFERENCE REJECTION

Let us consider a element ULA and a partially-broadband GMC PW signal (3) with amplitude at a carrier frequency of having spatial DOA , the single-sided bandwidth 250 MHz (double-sided bandwidth 500 MHz) and sampling frequency 2.5 GHz. The chosen PW signal has a fractional bandwidth (FBW) of (50%). Two other interfering PWs identical to the desired PW, except for their spatial DOAs and , are considered for the verification of the rejection of the undesired PWs.

Therefore, the signal received (2) by the ULA (considering a noise-free case for simplicity) is given by

(14)

where

and is the sample delay. The received signal (14) is down-converted using oscillators at 1 GHz, low-pass filtered for image rejection and down-sampled before feeding the signal to the filter. Here, a down-sampling factor of 5 is used, which reduces the required clock frequency to 500 MHz, expanding the baseband spectrum of the input signal by 5 times as shown in Fig. 4.

Fig. 4. 2-D magnitude frequency response of input signal containing desired signal having spatial DOA 30° and two other interfering waves having DOAs 10° and 60°.

The 2-D magnitude frequency response of the resulting output signal from the second-order 2-D IIR SBP filter is shown in Fig. 5, which shows the directional enhancement of the PW having DOA , while suppressing PWs with other DOAs.

Fig. 5. 2-D magnitude frequency response of the filtered output signal showing the directional enhancement of the desired PW.

The temporal cross-correlation of the input and output signals with a reference Gaussian wave representing the ideal desired signal is shown in Fig. 6, which demonstrates the directional enhancement capability of the 2-D IIR digital beamformer. Observe that PWs with DOA 10° and 60° have been attenuated by 40 and 38.4 dB, respectively, enhancing the PW with the desired DOA. This verifies the ideal performance of the 2-D IIR SBP beam filter.

Fig. 6. Temporal cross-correlation of the input signal (dashed line) and the filtered output signal (solid line) with a reference Gaussian pulse showing the attenuation of undesired signals.
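The numerical relationships in this example can be reproduced with a few lines of arithmetic. The sketch below is illustrative only and rests on stated assumptions: the carrier is taken to be 1 GHz (the stated oscillator frequency, which is also what a 50% FBW with a 500 MHz double-sided bandwidth implies), the GMC pulse is assumed to have an exp(-k·t²) envelope, and the envelope constant `k` is a hypothetical value, not one taken from the paper.

```python
import numpy as np

f_c = 1.0e9          # carrier frequency in Hz (assumed equal to the 1 GHz oscillator)
bw_double = 500e6    # double-sided bandwidth in Hz (from the text)
f_s = 2.5e9          # ADC sampling frequency in Hz (from the text)
f_clk = 500e6        # clock frequency after down-sampling in Hz (from the text)

fbw = bw_double / f_c        # fractional bandwidth
decimation = f_s / f_clk     # down-sampling factor

def gmc_pulse(t, f_c, k):
    """Gaussian-modulated cosine; the exp(-k t^2) envelope form is an assumption."""
    return np.exp(-k * t**2) * np.cos(2 * np.pi * f_c * t)

t = np.arange(-64, 65) / f_s     # 129 sample instants centered on t = 0
k = 2.0e17                       # hypothetical envelope constant (1/s^2)
w = gmc_pulse(t, f_c, k)

print(f"FBW = {fbw:.0%}, down-sampling factor = {decimation:g}")
```

With these values the fractional bandwidth evaluates to 50% and the down-sampling factor to 5, consistent with the figures quoted in the example.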
IV. REVIEW OF DIGITAL PHASED ARRAY FEED BEAMFORMER

The delay-and-sum beamformer is based on the concept that if we have a ULA with broadband antenna elements, then the output of sensor at differs only by a time delay . Therefore, if the output of each antenna is delayed appropriately (with proper weight vector [31] applied) and summed together, the effective radiation pattern of the array is reinforced in the desired direction while suppressing the waves coming from other directions [31], [52].

The continuous-time output of the delay-and-sum beamformer is given by [30], [47], [52], [53]

(15)

where

(16)

and is the spatial DOA of the desired PW as shown in Fig. 7.

Fig. 7. Phased-array delay-sum beamformer implementation illustrating transformation of the data at each sensor into frequency domain along with the weighted combinations.

The delay-and-sum beamformer can be implemented in both the time and frequency domains [54]. In the time domain, the beamformer works by performing time-based delay-and-sum operations, delaying the incoming signal from each array element by a certain fractional amount of time and then finally adding them together. The time-domain beamformer requires fractional delays, the digital implementation of which requires accurate approximation of the fractional delays, leading to high computational complexity of the digital fractional-delay-based delay-and-sum beamformer [14]. The temporal-frequency domain delay-and-sum beamformer, on the other hand, applies different complex phasor multipliers to each frequency bin of the 1-D frequency response of the signal from each sensor. The beamformer is steered to a specific direction by selecting appropriate phases for each sensor. The resulting array and beamformer is termed a PAF beamformer [11] since the output of each sensor is phase shifted prior to summation [31].

The time-domain output of the digital delay-and-sum beamformer is given by

(17)

where is the sampling period of the ADCs at each sensor and is the spacing between the sensors satisfying the Nyquist criterion [25].

The 1-D discrete Fourier transform (DFT) of (17) is given by

(18)

where , are the bin frequencies for the -point fast Fourier transform (FFT).

The digital PAF implementation for filtering a partially-broadband GMC PW is shown in Fig. 7. The phase compensation required at RF (before down-conversion) for each FFT bin of the sensors for the beamformer is given by

(19)

The baseband frequency after down-conversion and down-sampling becomes , where is the carrier frequency and is the down-sampling factor, while the phase remains the same as in RF

(20)

where ; .
The 1-D frequency response of the output of the digital PAF is therefore given by

(21)

Expressing the output frequency response in terms of frequency bins and complex multiplier coefficients, we get

(22)

where

(23)

The time-domain output of the digital PAF beamformer can be obtained by taking the inverse fast Fourier transform (IFFT) of (22). The frequency of operation of the -point digital PAF beamformer circuit has been reduced from sampling frequency to .

The BER performance of the digital PAF beamformer for a GMC partially broadband PW and its computational complexity are compared with those of the proposed second-order 2-D IIR SBP beam filter in Section VII.

Fig. 8. Block diagram of a PPCM of the second-order 2-D IIR SBP beam filter with the CP shown. The block marked is reused in multiple realizations of PPCMs with different CPs.
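The frequency-domain delay-and-sum principle reviewed in Section IV can be illustrated with a short simulation. This is a simplified sketch and not the PAF circuit of the paper: it transforms the whole record with one FFT rather than short 4-point or 8-point FFTs, operates at the sampling rate without down-conversion, and uses uniform weights. The sensor spacing `dx`, the pulse envelope constant, and the function names are assumptions made for illustration.

```python
import numpy as np

C = 3.0e8  # propagation speed in m/s

def sensor_delays(n_sensors, dx, doa_deg):
    """Inter-sensor time delays of a plane wave arriving at doa_deg on a ULA."""
    return np.arange(n_sensors) * dx * np.sin(np.radians(doa_deg)) / C

def plane_wave(w, fs, n_sensors, dx, doa_deg):
    """Synthesize ULA outputs: sensor n sees w delayed by tau_n (circular delay via FFT)."""
    f = np.fft.fftfreq(len(w), d=1.0 / fs)
    tau = sensor_delays(n_sensors, dx, doa_deg)[:, None]
    spectrum = np.fft.fft(w)[None, :] * np.exp(-2j * np.pi * f * tau)
    return np.real(np.fft.ifft(spectrum, axis=1))

def paf_beamform(x, fs, dx, steer_deg):
    """Per-bin phase compensation, averaging over sensors, then an inverse FFT."""
    f = np.fft.fftfreq(x.shape[1], d=1.0 / fs)
    tau = sensor_delays(x.shape[0], dx, steer_deg)[:, None]
    X = np.fft.fft(x, axis=1)
    Y = np.mean(X * np.exp(2j * np.pi * f * tau), axis=0)
    return np.real(np.fft.ifft(Y))

fs = 2.5e9                        # sampling frequency in Hz
dx = C / fs                       # hypothetical sensor spacing in m
t = (np.arange(255) - 127) / fs   # odd length avoids the ambiguous Nyquist bin
w = np.exp(-2.0e17 * t**2) * np.cos(2 * np.pi * 1.0e9 * t)  # assumed GMC pulse

x = plane_wave(w, fs, 16, dx, 30.0)      # 16-element ULA, wave arriving from 30 degrees
y_on = paf_beamform(x, fs, dx, 30.0)     # steered at the true DOA
y_off = paf_beamform(x, fs, dx, 60.0)    # steered away from the true DOA
print("on-target energy :", np.sum(y_on**2))
print("off-target energy:", np.sum(y_off**2))
```

Steering at the true DOA realigns all sensor outputs and recovers the pulse, while steering elsewhere leaves the copies misaligned so that they partially cancel in the sum.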
V. HARDWARE ARCHITECTURE OF THE SECOND-ORDER 2-D IIR SBP BEAM FILTER

The proposed filter having closed-form coefficients (13) can be implemented in digital VLSI hardware employing the difference equation, which is obtained by inverse 2-D z-transform of (12) under zero initial conditions (ZICs), and given by

Normalized to (24)

The architectural block diagram for implementing (24) in direct-form-I implementation using a systolic interconnection (see Fig. 1) of parallel processing core modules (PPCMs) is shown in Fig. 8. A cascaded interconnection of the PPCM blocks leads to the desired massively-parallel array processors for real-time RF throughputs. It should be noted that the number of PPCMs for the filter circuit should equal the number of sensors used in the implementation. Systolic-array implementations allow a throughput of one frame per clock cycle (OFPCC), unlike the throughput of one pixel per clock cycle (OPPCC) in scanned-array implementations [25], [27]. Concurrent architectures for 2-D digital IIR filters proposed in [38] utilize 1-D block processing for raster-scanned image processing, and are suitable for video processing applications.

The modified direct-form-I implementation of the filter TF (10) in partially-separable form achieves lower computational complexity in terms of the number of adders in the design (13 adders compared to 19 adders in direct-form-I implementation) as shown in Fig. 9. This leads to the systolic-array processor implementation [35], [36] in which the 2-D non-separable filter is implemented using an array of PPCMs and the separable 1-D component is implemented trivially using a delay buffer and adder at the output of . The normalized 2-D space-time difference equation for the implementation of is given by

(25)

The final filtered 1-D output signal using the modified direct-form-I implementation of the filter is obtained by feeding the output of the last PPCM in the array through a 1-D FIR filter having TF and is given by

(26)

Fig. 9. Block diagram of a PPCM in modified direct-form-I implementation, where is the temporal feedback loop as shown in Fig. 8.

VI. SPEED OPTIMIZATION OF THE ARRAY PROCESSOR

The feed-forward path of (24) can be pipelined by adding first-in first-out (FIFO) blocks in between combinational logic blocks in order to increase the speed of operation of the filter. Likewise, look-ahead (LA) optimization can be applied for pipelining the feedback paths [55].

A. Intra- and Inter-PPCM Pipelining

The PPCMs can be pipelined in order to obtain critical paths (CPs) equal to that of a single multiply operation. The intra-PPCM pipelining consists of FIFO buffers in between the combinational logic blocks in the feed-forward signal paths , , , and . The pipelining latency for all the feed-forward paths is as shown in Fig. 8. The depth of pipelining is increased until the CP is determined by the temporal feedback loop in the PPCM. The CP for the temporal feedback loop can also be reduced to a single multiply operation using deep intra-PPCM pipelining using LA, discussed in Section VI-B.

The intra-PPCM pipelining of latency is complemented by inter-PPCM pipelines, obtained by inserting FIFO buffer of length at the inputs and . Similarly, FIFO buffer of length at the inputs and length at the inputs , . The pipeline compensation FIFO buffers are also placed at the outputs of the PPCMs for aligning the output of the 2-D filter.

B. Look-Ahead to 2-D Circuits

LA is an optimization technique for clocked 1-D IIR digital filters in VLSI circuits [55]. In this paper, LA optimization techniques for higher order IIR filters such as clustered LA (CLA) and scattered LA (SLA) [25], [55], usually used for synchronous logic, are used for the novel application in asynchronous feedback loops. This helps reduce the critical path delay (CPD) by introducing extra pipelining stages in the feedback path, thereby increasing the speed of operation. The original systolic-array implementation of the 2-D IIR SBP beam filter is now converted to a fully asynchronous massively-parallel processor architecture, as shown in Fig. 1, where number of PPCMs are interconnected via FIFO pipelines, in order to realize the recursive computation of the difference equation. Such asynchronous parallel processors are known as WAPs [41]–[43].

The 1-D LA optimization technique for higher-order TF [55] has been extended here to the 2-D case despite the non-separability of the input-output TF given in (12). The wavefront-array implementation of the 2-D IIR filter, using interconnected PPCMs, however, allows 1-D LA optimization of non-separable, practical-BIBO stable multi-dimensional filters [25], [39].

Let us define 1-D z-transforms

Therefore, (12) can be expressed in the form

(27)

where is the temporal feedback loop shown in Fig. 8

The second-order TF in (27) has double zeros at and double poles at (since and ). Here, as given by (8) ensuring the poles are within the temporal frequency unit circle , satisfying the 1-D digital stability criteria [48]. The PPCMs are identical to each other and have the same second-order TF, which can be optimized for speed using 1-D LA for higher order IIR filters.

1) Clustered LA Optimization: CLA pipelining [55] is based on the addition of cancelling poles and zeros to the TF (27) such that the coefficients of in the denominator of TF are zero for a -stage CLA pipelining. The output of (24) can then be written in terms of two past outputs and for a second-order filter, leading to a loop consisting of delay elements and a single multiplication operation. The CLA pipelining of a certain order/delay could produce an unstable filter even if the original filter (without LA) was stable in the first place. However, it has been shown that CLA produces a stable filter at some critical delay such that the stability is assured for [55]. So, if the desired CLA pipeline order does not produce a stable filter, it should be increased until a stable filter is obtained [55].

a) CLA of Order 2 [24]: Multiplying both numerator and denominator of (27) by , we get

(28)

where ; .

The single-delay feedback path of the feedback loop implied by (12), which has a CP of multiply-then-add operation as shown in Fig. 2, has now been reduced to a single multiplier because of the additional delay added to the feedback path, thereby reducing the CP from to [55]. In practice, .

b) CLA of Order 3 and Higher [24]: The 3-stage CLA optimized TF can be obtained by multiplying both numerator
and denominator of (27) by and is given by

(29)

where and . This leads to an ideal CP of [55]. Similarly, multiplying both numerator and denominator of (27) by , we get the LA optimization of order 4 given by

(30)

where and , which allows four levels of pipelining within the loop, leading to an ideal CP of [55].

The resulting signal flow graphs (SFGs) of three 2-D IIR SBP filter hardware circuits after implementing CLA optimization of stage 2, 3, and 4, respectively, are shown in Fig. 10 (stage 5 CLA not shown).

Fig. 10. CLA optimized PPCM hardware architectures of 2-D IIR spatially-bandpass beam filter in direct-form-I realization of stages 2, 3, and 4 corresponding to (28)–(30), respectively [24]. For brevity, block is described in Fig. 8.

2) SLA Optimization: SLA pipelining [55] requires the denominator of the TF (27) to be transformed in a way that it contains two terms and . The output of (24) can then be written in terms of two past outputs and for a second-order filter. SLA optimization always leads to stable realizations, provided that the original TF is BIBO stable.

We can express from (27) as

(31)

where , are poles of . Here, .

The -stage SLA optimized TF can be described by [55]

(32)

It can be shown from (32) that 2-stage SLA leads to

(33)

Similarly, the 3-stage SLA optimized is given by

(34)

VII. FILTER PERFORMANCE IN A WIRELESS COMMUNICATION SYSTEM

The multipath and interference suppression capabilities of the 2-D IIR frequency-planar beam filter [25] in a multi-user environment have been described in [4] with potential applications in UWB communication systems. Here, the performance of the second-order 2-D IIR SBP beam filter [39] for varying levels of interference was evaluated by conducting BER simulations involving the filter and the 16-element ULA. The ability of the filter to reject the undesired multi-user interferences and improve the signal-to-interference ratio (SIR) of the desired signal was assessed and the result was compared with the delay-and-sum beamformer (implemented as a digital PAF in the frequency domain) and the non-beamformer case. The result showed a reduction in BER relative to both the digital PAF beamformer (of similar computational complexity) and the non-beamformer case, implying potential applications in wireless communication base-stations.

A series of Monte Carlo simulations of the 2-D IIR SBP frequency-planar filter and the digital PAF beamformer were carried out for the 16-element sensor array. The test signal considered here was a partially-broadband GMC pulse (3). The composite signal received by the sensors constituted one desired PW with spatial DOA and four other interfering identical partially-broadband PWs with spatial DOAs to which are bi-phase modulated (BPSK modulation) by random streams of data bits [49].

The composite partially-broadband PW signal received by the sensor array was first down-converted to intermediate baseband, then low-pass filtered for image rejection, down-sampled and then finally applied to the 2-D IIR SBP beam filter to get the desired signal at the output. The resulting DCDS signal from each sensor is given by [15], [49]

(35)

where is the total number of users, is the total number of symbols, is the number of samples per symbol and is the random data streams for modulating the input signals, . represents the desired signal, while and represents the undesired interference from other users. is the additive white Gaussian noise (AWGN), at a level 18 dB relative to the received signal, modeled for the effect of quantization noise at the ADCs.

Fig. 11. BER curve for varying levels of SIR for different beamforming cases for and .

Fig. 12. BER curve for varying levels of SIR for different beamforming cases for and .
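Returning briefly to the look-ahead transformations of Section VI-B, their effect on a second-order denominator can be sketched numerically. The code below is a generic illustration of clustered and scattered look-ahead for a 1-D second-order IIR section, following the textbook construction rather than the specific TFs (28)–(34); the pole values 0.8 and 0.4 are hypothetical, and the function names are illustrative. Coefficient arrays are ordered in ascending powers of z^-1.

```python
import numpy as np

def is_stable(den):
    """True if all poles of 1/den(z) lie strictly inside the unit circle."""
    return bool(np.max(np.abs(np.roots(den))) < 1.0)

def cla_denominator(d, m):
    """m-stage clustered look-ahead.

    Multiplies d by the first m terms of the power series of 1/D(z), which
    zeroes the coefficients of z^-1 ... z^-(m-1) in the new denominator.
    Returns the new denominator and the augmenting polynomial.
    """
    r = np.zeros(m)
    r[0] = 1.0
    for i in range(1, m):
        r[i] = -sum(d[j] * r[i - j] for j in range(1, min(i, len(d) - 1) + 1))
    return np.convolve(d, r), r

def sla_denominator(poles, m):
    """m-stage scattered look-ahead.

    For each pole p, adds cancelling pole-zero pairs at p*exp(j*2*pi*k/m),
    k = 1 ... m-1, so the new denominator contains only powers of z^-m.
    """
    aug = np.array([1.0 + 0.0j])
    for p in poles:
        for k in range(1, m):
            aug = np.convolve(aug, [1.0, -p * np.exp(2j * np.pi * k / m)])
    return np.convolve(np.poly(poles), aug).real   # imaginary parts cancel

d_08 = np.poly([0.8, 0.8])          # double pole at 0.8: [1, -1.6, 0.64]
d_04 = np.poly([0.4, 0.4])          # double pole at 0.4: [1, -0.8, 0.16]

cla_08, _ = cla_denominator(d_08, 2)
cla_04, _ = cla_denominator(d_04, 2)
sla_08 = sla_denominator([0.8, 0.8], 3)

print("2-stage CLA, poles at 0.8:", cla_08, "stable:", is_stable(cla_08))
print("2-stage CLA, poles at 0.4:", cla_04, "stable:", is_stable(cla_04))
print("3-stage SLA, poles at 0.8:", sla_08, "stable:", is_stable(sla_08))
```

In this sketch, 2-stage CLA applied to the double pole at 0.8 introduces an extra pole at -1.6 and is therefore unstable, while the same transformation is stable for the double pole at 0.4. This matches the observation that CLA of a given order may destabilize an originally stable filter. SLA keeps every added pole on the same radius as the original poles, so stability is preserved.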

A. BER Simulation Example

Let us consider a 16-element ULA and a partially-broadband GMC PW signal (3) at a carrier frequency of 1 GHz having desired spatial DOA with the single-sided bandwidth 500 MHz (double-sided bandwidth 1 GHz, such that ). Let us consider four other interfering identical partially-broadband PWs with spatial DOAs to as 20°, 50°, 65°, and 80°, respectively. Let us choose a sampling frequency of 3 GHz satisfying the Nyquist sampling frequency , a down-sampling factor of 3 reducing the clock frequency to 1 GHz (which is equal to , allowing the implementation at a lower clock frequency), and samples per symbol having a bit rate of 50 Mbps. For the detection, we used a cross-correlation detector at the output of the beam filter which is at the spatial location , while the cross-correlation detector for the non-beamformer case was used at the first sensor location .

To compare the performance of the second-order 2-D IIR SBP filter against the conventional phased array beamformer, the simulations for a digital PAF beamformer with the same number of elements (i.e., 16 elements), described in Section IV, were carried out for 4-point and 8-point FFTs. The simulated BER versus SIR plot for a PW having FBW of 100% with desired spatial DOA is shown in Fig. 11. It is observed from the figure that the gain due to the second-order 2-D IIR SBP digital filter is approximately 17 dB compared to the non-beamforming case, while a gain of 4.5 and 4 dB is observed compared to the 4-point and 8-point FFT digital PAFs, respectively, for a BER of .

Fig. 13. BER curve for varying levels of SIR for different beamforming cases for and .

The BER performance of the IIR filter for a PW of 100% FBW but with different desired spatial DOAs 50° and 65° is shown in Figs. 12 and 13, which show similar performance of the 2-D IIR filter compared to the 8-point FFT digital PAF, but a gain of 2 dB compared to the 4-point FFT digital PAF for a BER of . For the case of desired PW spatial DOA of 50°, the interferers were chosen to have the spatial DOAs of 20°, 35°, 65°, and 80°. Likewise, the interfering PWs were chosen with spatial DOAs of 20°, 35°, 50°, and 80° for a desired spatial DOA of 65°.

Similar sets of simulations for desired DOAs of 35°, 50°, and 65° were carried out for PWs having FBW of 50% ( 250 MHz, 1 GHz at ). The corresponding BER

versus SIR plots for various cases of beamforming (for PWs with ) are shown in Figs. 14–16. From Fig. 14, it is observed that the gain of the 2-D IIR filter is about 17 and 1 dB compared to the non-beamformer and the 4-point/8-point FFT digital PAF, respectively, for a BER of . The BER performance for a PW with desired DOA of 50° and as shown in Fig. 15 is similar to the case where and . In contrast, the performance of the 2-D IIR filter for a PW with desired DOA of 65° at FBW of 50%, as shown in Fig. 16, is observed to be similar to that of both the 4-point and 8-point digital PAFs.

The gain of the beamformers for a bit error rate of , for various cases of fractional bandwidth and desired DOA angle described above is shown in Table I. The table clearly shows an improved BER performance due to the 2-D IIR SBP beam filter.

Fig. 14. BER curve for varying levels of SIR for different beamforming cases for and .

Fig. 15. BER curve for varying levels of SIR for different beamforming cases for and .

Fig. 16. BER curve for varying levels of SIR for different beamforming cases for and .

It is observed from Table I that as the desired DOA angle increases, the gain of the beamformer reduces because of the high steering angle of the beamformer, which increases the warping effect of the 2-D IIR beam filter [25]. On the other hand, the BER performance of the beamformer is better for PWs having smaller fractional bandwidth.

The BER for various practical combinations of finite internal word lengths and ADC precisions for 2-D IIR beam filters with TF (7) has been investigated in [56]. Here, however, for the proposed second-order 2-D IIR SBP beam filter, the internal precision level of the processors has been chosen large enough that the effects on BER due to internal quantization noise are negligibly small. The effects of lower precision on the BER of the proposed 2-D IIR SBP filter due to quantization are therefore a topic for future research work.

B. Computational Complexity
The computational complexity of a circuit directly depends on the number of adder and multiplier blocks it requires. The direct-form implementation of the second-order 2-D IIR SBP beam filter [39] requires 13 multipliers per PPCM as described in Section V, totaling 208 multipliers for a 16-element ULA. The digital PAF beamformer as described in Section IV requires additional complex multipliers along with multipliers for a -point FFT [48], per each element. The conventional method of complex number multiplication requires four real multipliers and two real additions. Note that the Gauss complex multiplication algorithm [57] requires only three real multiplications and five real additions. Therefore, a 4-point FFT digital PAF beamformer requires multipliers per element.

The computational complexity in terms of the number of multipliers for the above-mentioned beamformers is shown in Table II, which shows that the second-order 2-D IIR SBP filter has a better BER performance compared to the 4-point FFT digital PAF with similar computational complexity.
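The two complex-multiplication schemes mentioned in the computational complexity discussion can be sketched as follows. This is the commonly cited three-multiplication factorization attributed to Gauss; the exact arrangement used in [57] may differ, and the function names are illustrative only.

```python
def complex_mul_conventional(a, b, c, d):
    """(a + jb)(c + jd) using four real multiplications and two real additions."""
    real = a * c - b * d
    imag = a * d + b * c
    return real, imag


def complex_mul_gauss(a, b, c, d):
    """(a + jb)(c + jd) using three real multiplications and five real additions."""
    k1 = c * (a + b)   # addition 1, multiplication 1
    k2 = a * (d - c)   # addition 2, multiplication 2
    k3 = b * (c + d)   # addition 3, multiplication 3
    real = k1 - k3     # addition 4: ac - bd
    imag = k1 + k2     # addition 5: ad + bc
    return real, imag


if __name__ == "__main__":
    print(complex_mul_conventional(3, 4, 5, -2))
    print(complex_mul_gauss(3, 4, 5, -2))
    # Multiplier total quoted in the text for the IIR filter:
    # 13 multipliers per PPCM times 16 elements.
    print(13 * 16)
```

Both functions return the same product; the Gauss form trades one multiplier for three extra adders, which is usually favorable in hardware because multipliers dominate area.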

The digital PAF beamformer implementing an 8-point FFT has a much higher computational complexity than the 2-D IIR SBP beam filter, but with slightly poorer or similar BER performance. 2-D FIR trapezoidal filters [5] for filtering temporally-broadband bandpass signals have very high interference rejection ability but at the cost of a larger number of multipliers in the design. The 2-D IIR SBP beam filter, on the other hand, still provides a significant reduction in the interference signal with fewer multipliers and provides a better BER performance compared to digital PAFs of similar computational complexity.

TABLE I
BEAMFORMER GAIN FOR A BER OF FOR DIFFERENT CASES OF DESIRED DOA AND FRACTIONAL BANDWIDTH

TABLE II
NUMBER OF MULTIPLIERS PER EACH SENSOR FOR DIFFERENT BEAMFORMERS

VIII. ASYNCHRONOUS LOGIC REALIZATION WITH ACHRONIX FPGA
FPGAs from Achronix Semiconductor [44], which employ a-QDI logic [45], facilitate asynchronous implementation of conventional circuits. The removal of the global clock in a-QDI circuits results in reduced design complexity, lower power consumption and higher speed of operation [40]. The clock in an a-QDI implementation using an Achronix FPGA refers to the clock present at the synchronous input/output (I/O) frame (as shown in Fig. 1), which consists of synchronous to asynchronous converters at the input and asynchronous to synchronous converters at the output. Therefore, the design register transfer level (RTL) does not need to be targeted to picoPIPE technology and is the same as for conventional synchronous implementations. It is the core fabric of the Achronix FPGA that performs the a-QDI implementation. The core contains a large number of fine-grain pico-pipeline stages called “picoPIPE” used for both logic and routing [44], [45], which lead to high-throughput architectures. Unlike in synchronous circuits where the global clock is used to sequence the computation and for synchronization, the sequencing and synchronization in asynchronous circuits are achieved using a local handshake protocol between adjacent pipeline stages. Data are passed through the pipeline stages as messages called “data tokens”. A three wire channel (two data wires and one enable wire) is present in between the pipeline stages, which consists of “wire 0”, “wire 1”, and “enable” [40]. The data tokens are encoded in the wires such that setting “wire 0” represents “logic 0” and setting “wire 1” represents “logic 1”, while resetting both wires represents the “no-data” state. The third wire is an acknowledge signal used for the asynchronous handshake protocol. The handshake protocol employed by the a-QDI logic consists of the following four phases [40]:
1) the sender sends the data by setting one of the data wires;
2) the receiver latches the data and lowers the “enable” wire;
3) the sender lowers all data wires;
4) the receiver raises the “enable” wire when it is ready to accept new data.

The time required to complete one four-phase handshake is referred to as the cycle time of a pipeline stage, and the inverse of the cycle time represents the throughput which gives the rate at which tokens travel through the pipeline, provided the dataflow netlist is free from loops and reconvergent paths [58]. Since the pipeline stages can contain a “no-data” state along with conventional “logic 0” and “logic 1”, pipelining the dataflow path does not affect the functionality of the design, causing these a-QDI pipelines to be “slack elastic” [40], which is in contrast to conventional synchronous designs. When pipeline stages are inserted they are initialized to the “no-data” state at global reset, whereas actual registers defined in the RTL design are initialized to either “logic 1” or “logic 0” based on the description.

The two constraints that determine the final operation speed of a-QDI circuits are loops and reconvergent paths [58]. A loop is a feedback path in the dataflow. IIR filter circuits such as the proposed 2-D IIR SBP filter contain loops in the design. If a loop is the critical path, as in our case, the frequency of operation is , where is the number of initialized tokens within the loop and is the entire loop delay. Increasing causes an increase in the speed of operation. A reconvergent path occurs in a fan-in node where one of the paths has fewer pipeline stages (shorter path) than the other (longer path). In this situation, the data token that arrived earlier in the shorter path has to wait for the coherent data token in the longer path, making it a critical path. This can be eliminated by balancing the pipeline stages in the two paths, adding extra delay to the shorter path. The Achronix CAD environment (ACE) tool [58] allows elimination of reconvergent paths by adding a constraint to the place and route tool. The RTL design of the second-order 2-D IIR SBP filter is sent through the ACE tool flow for final place and route on the Achronix SPD60 FPGA, which is described in Section IX-A.

IX. VHDL IMPLEMENTATION, SIMULATION, AND VERIFICATION
Five prototype designs (with CLA up to stage 5) of the second-order 2-D IIR SBP filter in direct-form-I realization, consisting of PPCMs have been implemented using VHDL. First, the difference equation of the filter for the different CLA optimized designs (see Section VI) were implemented using behavioral VHDL. The circuit employed 2’s complement fixed-point binary arithmetic, with a precision of and (where is the word length and is the

position of binary point), chosen large enough to make quantization noise effects small [56]. The VHDL was then imported into a Simulink model using a blackbox implementation provided by the Xilinx system generator (XSG) design tool. The complete design with five of these blackboxes (for ) with the necessary ZICs was created in Simulink. Finally, VHDL-mappable bit-true cycle-accurate Simulink models of the designs were created using the XSG design tool, with all Xilinx-specific optimizations turned off. The resulting VHDL code would therefore be generated using behavioral VHDL. These behavioral VHDL designs were passed through synchronous and asynchronous FPGA flows as well as application-specific integrated circuit (ASIC) synthesis to obtain the timing and power analysis.

TABLE III
SPEEDS OF SECOND-ORDER 2-D IIR SBP FILTER IN DIRECT-FORM-I [24]

A. FPGA Prototyping
The five prototype designs for CLA optimized a-QDI filter circuits, employing PPCMs have been tested on an SPC60 development board with a 65-nm CMOS Achronix Speedster SPD60 asynchronous FPGA. The VHDL design is passed through Mentor Graphics’ Precision Synthesis tool for RTL synthesis and then the resulting post-synthesis netlist is fed to the ACE tool [58] for place and route on the asynchronous fabric. A wrapper RTL for communication with a PC through a USB port is used to read the filtered output from the Achronix FPGA.

To obtain a speed comparison of the asynchronous implementation on the Achronix FPGA with conventional designs, the five designs are also synthesized and placed and routed using a synchronous FPGA design tool for a high-capacity device of the same 65-nm CMOS technology, the Xilinx Virtex-5 LX330FF1760-2, using the timing-driven place and route algorithms (ensuring the CP to be the tightest loop via pipelining). The results for the synchronous and asynchronous implementations of the second-order 2-D IIR SBP beam filter employing 5 PPCMs are shown in Table III. We can observe a significant improvement in speed with the asynchronous implementation using Achronix FPGAs for the CLA optimized designs, which is as high as 31% for a CLA of order .

The correct operation of the 2-D IIR SBP filter was verified for both synchronous and asynchronous implementations by exciting the inputs of the filter by a 2-D unit impulse function and measuring the impulse response from the on-chip realizations. A 2-D magnitude frequency response of the measured filter output within the 2-D Nyquist square , , 2, is shown in Fig. 17. For the verification of the synchronous implementation of the filter, the design was realized in hardware using a 90-nm CMOS Xilinx Virtex-4 SX35FF668-10 synchronous FPGA, employing the Xilinx ML402 board which facilitates on-chip hardware co-simulation (HCS) with MATLAB/Simulink, while the same 65-nm CMOS Achronix Speedster SPD60 asynchronous FPGA on a SPC60 development board was used for the verification of the asynchronous implementation.

Fig. 17. Magnitude frequency response of 2-D IIR SBP beam filter, for , , and , obtained from (a) closed form expressions with infinite precision and (b) prototype FPGA implementation on the Xilinx ML402 board for word size of 16 bits, for PPCMs. The response approaches the ideal case as the value of and word size is increased.

B. ASIC Realizations at 65- and 90-nm CMOS
The behavioral VHDL designs (conventional synchronous design) of the 2-D IIR SBP filter circuits for PPCMs with word size of 16 bits ( , ) were also synthesized for ASIC in 65- and 90-nm CMOS technology. The ASIC synthesis was performed with the Synopsys Design Compiler Version D-2010.03 using DesignWare building block libraries and TSMC TCBN65G standard-cell library version 140b and TSMC TCBN90GHP standard-cell library version 210a for 65- and 90-nm technology, respectively. The optimization goal was to maximize the speed of operation of the filter. The global operating voltage for both the technologies was 900 mV. The results of the ASIC synthesis of the CLA optimized filter designs for 65- and 90-nm technologies are shown in Table IV. It can be seen that the speed of operation of the filter increases as the CLA pipelining is increased (at the cost of added hardware complexity, thereby increasing cell area and power). But once the CLA pipelining reaches a certain limit, there is no further speed improvement in the filter circuit. For 65-nm CMOS technology, CLA with was the limit beyond which the speed of the filter remained the same (1.064 GHz). Similarly, for 90-nm CMOS technology, the filter with CLA of had the highest speed of 694 MHz.

We consider the equation ; for the area-time complexity [59]. Since our optimization goal was to improve the speed of operation of the circuit, we chose a higher value of leading to performance. Alternatively, for area efficient designs, the performance can be compared among various designs. The area-time complexity, total gate count and power details of the prototype designs obtained from the ASIC synthesis are also indicated in Table IV. It can be observed that the circuit with CLA of is the optimum design in terms of performance based on the synthesis results. For the power analysis, a PW input (partially-broadband GMC pulse) with one desired signal and four interfering signals including noise (as used for the BER simulation in Section VII) was modeled as test patterns. A switching activity information format

(SAIF) file was generated using 10 000 test vectors for the gate-level simulation in Cadence NCSim version 06.11. Then the SAIF file was back annotated with the gate level netlist. Finally, Power Compiler was used to calculate power consumption in the circuit. It can be seen in Table IV that as the number of CLA stages in the design is increased, the total power consumption of the design also increases. Therefore, there exists a trade-off between power-efficient and timing-efficient design, and the choice of design is based on the specific requirements on speed or power.

TABLE IV
RESULTS OF ASIC SYNTHESIS OF THE SECOND-ORDER 2-D IIR SBP FILTER CIRCUITS FOR 65-nm AND 90-nm CMOS TECHNOLOGIES

X. CONCLUSION
A second-order 2-D IIR SBP beam filter is proposed for the directional enhancement of a PW based on its DOA. The filter is highly steerable, algebraically-defined in terms of the desired DOA, computable, and is based on the impulse response modulation of a practical-BIBO stable 2-D IIR frequency-planar beam PW filter. The performance of the 2-D IIR digital filter for interference rejection is verified for PWs in the presence of interfering PWs at different DOAs. Further, the BER versus SIR performance of a phased array beamformer and the second-order 2-D IIR SBP beam filter were studied and simulated for a partially broadband PW. The performance improvement of the second-order 2-D IIR SBP beam filter (which is based on a systolic/wavefront array implementation) over a digital PAF beamformer is demonstrated here. Also, a massively-parallel array architecture of the filter is proposed for real-time implementations using synchronous and asynchronous FPGAs, which enable spatial filtering of broadband PWs at a very high throughput, having potential applications in wireless communications, cognitive radio, radio-astronomy aperture-arrays, and radar. The ASIC synthesis of the filter designs was also carried out for 65- and 90-nm CMOS technologies. The results show that the speed of operation of the filter is as high as 1.064 GHz for a stage CLA pipelined design for 65-nm CMOS technology. Higher-order 2-D IIR filters and their performance improvement are a topic for future research.

REFERENCES
[1] S. Ries and T. Kaiser, “Towards beamforming for UWB signals,” in Proc. EUSIPCO, 2004, pp. 829–832.
[2] UWB Communication Systems: A Comprehensive Overview, M.-G. D. Benedetto, T. Kaiser, A. F. Molisch, I. Oppermann, C. Politano, and D. P. , Eds. New York: Hindawi, 2006.
[3] L. Liang and S. V. Hum, “Experimental characterization of UWB beamformers based on multidimensional beam filters,” IEEE Trans. Ant. Propag., vol. 59, no. 1, pp. 304–309, Jan. 2011.
[4] S. V. Hum, A. Madanayake, and L. T. Bruton, “UWB beamforming using 2D beam digital filters,” IEEE Trans. Ant. Propag. (TAP), vol. 57, no. 3, pp. 804–807, Mar. 2009.
[5] T. Gunaratne and L. Bruton, “Adaptive complex-coefficient 2D FIR trapezoidal filters for broadband beamforming in cognitive radio systems,” Circuits, Syst., Signal Process., vol. 30, pp. 587–608, 2011.
[6] K. Hamdi, W. Zhang, and K. Ben Letaief, “Joint beamforming and scheduling in cognitive radio networks,” in Proc. IEEE Global Telecommun. Conf. (GLOBECOM), 2007, pp. 2977–2981.
[7] G. Zheng, S. Ma, K. kit Wong, and T.-S. Ng, “Robust beamforming in cognitive radio,” IEEE Trans. Wirel. Commun., vol. 9, no. 2, pp. 570–576, Feb. 2010.
[8] K. Cumanan, L. Musavian, S. Lambotharan, and A. Gershman, “SINR balancing technique for downlink beamforming in cognitive radio networks,” IEEE Signal Process. Lett., vol. 17, no. 2, pp. 133–136, Feb. 2010.
[9] Y. Zhao, R. Adve, and T. Lim, “Beamforming with limited feedback in amplify-and-forward cooperative networks,” IEEE Trans. Wirel. Commun., vol. 7, no. 12, pp. 5145–5149, Dec. 2008.
[10] Y. Zhang, X. Li, and M. Amin, “Distributed beamforming in multi-user cooperative wireless networks,” in Proc. 4th Int. Conf. Commun. Network. China (ChinaCOM), 2009, pp. 1–5.
[11] B. Jeffs, K. Warnick, J. Landon, J. Waldron, D. Jones, J. Fisher, and R. Norrod, “Signal processing for phased array feeds in radio astronomical telescopes,” IEEE J. Sel. Topics in Signal Process., vol. 2, no. 5, pp. 635–646, Oct. 2008.
[12] M. Elmer and B. D. Jeffs, “Beamformer design for radio astronomical phased array feeds,” in Proc. IEEE Int. Acoust. Speech Signal Process. (ICASSP) Conf., 2010, pp. 2790–2793.
[13] K. F. Warnick, B. D. Jeffs, J. Landon, J. Waldron, D. Jones, J. R. Fisher, and R. Norrod, “Beamforming and imaging with the BYU/NRAO L-band 19-element phased array feed,” in Proc. 13th Int. Symp. Ant. Technol. Appl. Electromagn. Canadian Radio Sci. Meet. (ANTEM/URSI), 2009, pp. 1–4.
[14] A. Madanayake, “Real-time FPGA architectures for frequency-planar MDSP,” Ph.D. dissertation, Dept. Elect. Comput. Eng., Univ. Calgary, Calgary, AB, Canada, 2008.
[15] Z. N. C. Huseyin Arslan and M.-G. D. Benedetto, Ultra Wideband Wireless Communication. Hoboken, NJ: Wiley-Interscience, 2006.
[16] T. H. Khine, K. Fakuwa, and H. Suzuki, “Systolic OMF-RAKE: Linear interference canceller-utilizing systolic array for mobile communications,” IEICE Trans. Commun., vol. E88-B, no. 5, pp. 2128–2135, May 2005.
[17] E. M. Staderini, “UWB radars in medicine,” IEEE Aerosp. Electron. Syst. Mag., vol. 17, no. 1, pp. 13–18, 2002.
[18] A. V. Ardenne, “Concepts of the square kilometre array; Toward the new generation radio telescopes,” in Proc. IEEE Int. Symp. Ant. Propag., 2000, pp. 158–161.
[19] S. W. Ellingson, “A DSP engine for a 64-element array,” in Proc. Perspectives for Radio Astronomy: Technol. for Large Ant. Arrays, 1999, pp. 235–242.
[20] M. C. VanBeurden, A. B. Smolders, and M. E. J. Jeuken, “Design of wideband phased antenna arrays,” in Proc. Perspectives for Radio Astronomy: Technol. for Large Ant. Arrays, 1999, pp. 347–352.

[21] A. Faulkner, P. Alexander, A. Van-Ardenne, R. Bolton, J. Bregman, A. V. Es, M. Jones, D. Kant, S. Montebugnoli, P. Picard, S. Rawlings, S. Torchinsky, J. G. B. D. Vaate, and P. Winlinson, “The aperture arrays for the SKA: The SKADS white paper,” SKA Memo 122, 2010. [Online]. Available: http://www.skatelescope.org
[22] K. Gold, R. Silva, R. Worrel, and A. Brown, “Space navigation with digital beam steering GPS receiver technology,” presented at the 59th Annu. Meet. ION, Albuquerque, NM, 2003.
[23] R. Silva, R. Worrel, and A. Brown, “Reprogrammable, digital beam steering GPS receiver technology for enhanced space vehicle operations,” presented at the Core Technologies for Space Syst. Conf., Colorado Springs, CO, 2002.
[24] R. M. Joshi, A. Madanayake, and L. T. Bruton, “A 2D IIR spatially-bandpass antenna beamformer on a 65 nm Achronix SPD60 asynchronous FPGA,” presented at the 54th IEEE Int. Midw. Symp. Circuits Syst. (MWSCAS), Seoul, Korea, 2011.
[25] A. Madanayake and L. T. Bruton, “A speed-optimized systolic array processor architecture for spatio-temporal 2-D IIR broadband beam filters,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 55, no. 7, pp. 1953–1966, Aug. 2008.
[26] A. Madanayake and L. T. Bruton, “A systolic-array architecture for first-order 3D IIR frequency-planar filters,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 55, no. 6, pp. 1546–1559, Jul. 2008.
[27] A. Madanayake and L. Bruton, “A review of 2D/3D IIR plane-wave real-time digital filter circuits,” in Proc. IEEE Canadian Conf. Elect. Comput. Eng. (CCECE), 2005, pp. 1935–1941.
[28] L. Liang and S. Hum, “Experimental verification of an adaptive UWB beamformer based on multidimensional filtering in a real radio channel,” in Proc. IEEE Ant. Propag. Soc. Int. Symp. (APSURSI), 2010, pp. 1–4.
[29] T. K. Gunaratne and L. T. Bruton, “Beamforming of broad-band bandpass plane waves using polyphase 2-D FIR trapezoidal filters,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 55, no. 3, pp. 838–850, Mar. 2008.
[30] D. H. Johnson and D. E. Dudgeon, Array Signal Processing: Concepts and Techniques. Englewood Cliffs, NJ: Prentice-Hall, 1992.
[31] B. D. V. Veen and K. M. Buckley, “Beamforming: A versatile approach to spatial filtering,” IEEE ASSP Mag., vol. 5, no. 2, pp. 4–24, Apr. 1988.
[32] T. K. Gunaratne and L. T. Bruton, “Tracking broadband plane waves using 2D adaptive FIR fan filters,” in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), 2006, pp. 4923–4926.
[33] Q. Gu and M. N. S. Swamy, “On the design of a broad class of 2-D recursive digital filters with fan, diamond and elliptically-symmetric responses,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 41, no. 9, pp. 603–614, Sep. 1994.
[34] L. Khademi and L. T. Bruton, “Reducing the computational complexity of narrowband 2D fan filters using shaped 2D window functions,” in Proc. Int. Symp. Circuits Syst. (ISCAS), 2003, pp. 702–705.
[35] S. Y. Kung, VLSI Array Processors. Englewood Cliffs, NJ: Prentice-Hall, 1988.
[36] E. S. E. K. Bromley and S. Y. Kung, “Systolic Arrays,” presented at the 2nd Int. Conf., Los Alamitos, CA, 1988.
[37] N. Rama Murthy and M. N. S. Swamy, “On the real-time computation of DFT and DCT through systolic architectures,” IEEE Trans. Signal Process., vol. 42, no. 4, pp. 988–991, 1994.
[38] K. K. Parhi and D. G. Messerschmitt, “Concurrent architectures for two-dimensional recursive digital filtering,” IEEE Trans. Circuits Syst., vol. 36, no. 6, pp. 813–829, Jun. 1989.
[39] A. Madanayake and L. T. Bruton, “A real-time systolic array processor implementation of two-dimensional IIR filters for radio-frequency smart antenna applications,” in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), 2008, pp. 1252–1255.
[40] J. Teifel and R. Manohar, “An asynchronous dataflow FPGA architecture,” IEEE Trans. Comput., vol. 53, no. 11, pp. 1376–1392, Nov. 2004.
[41] S. Y. Kung, S. C. Lo, S. N. Jean, and J. N. Hwang, “Wavefront array processors-concept to implementation,” Computer, vol. 20, pp. 18–33, May 1987.
[42] S.-Y. Kung, K. Arun, R. Gal-Ezer, and D. Bhaskar Rao, “Wavefront array processor: Language, architecture, and applications,” IEEE Trans. Comput., vol. C-31, no. 11, pp. 1054–1066, Nov. 1982.
[43] S. Y. Kung, “VLSI array processors: Designs and applications,” in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), 1989, pp. 313–320.
[44] Achronix Semiconductor Corporation, Santa Clara, CA, “Achronix Semiconductor Corporation website,” 2011. [Online]. Available: http://www.achronix.com
[45] S. Ramaswamy, L. Rockett, D. Patel, S. Danziger, R. Manohar, C. W. Kelly, J. L. Holt, V. Ekanayake, and D. Elftmann, “A radiation hardened reconfigurable FPGA,” in Proc. IEEE Aerosp. Conf., 2009, pp. 1–10.
[46] T. K. Gunaratne, “Beamforming of temporally broadband bandpass plane waves using 2D FIR trapezoidal filters,” M.Sc. thesis, Dept. Elect. Comput. Eng., Univ. Calgary, Calgary, AB, Canada, 2006.
[47] D. E. Dudgeon and R. M. Mersereau, Multidimensional Digital Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1984.
[48] J. G. Proakis and D. G. Manolakis, Digital Signal Processing: Principles, Algorithms, and Applications, 3rd ed. Englewood Cliffs, NJ: Prentice-Hall, 1995.
[49] A. Madanayake, S. V. Hum, and L. T. Bruton, “A systolic array 2D IIR broadband RF beamformer,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 55, no. 12, pp. 1244–1248, Dec. 2008.
[50] L. T. Bruton and N. R. Bartley, “Three-dimensional image processing using the concept of network resonance,” IEEE Trans. Circuits Syst., vol. 32, pp. 664–672, Jul. 1985.
[51] P. Agathoklis and L. T. Bruton, “Practical-BIBO stability of N-dimensional discrete systems,” Proc. IEE, vol. 130, no. 6, pt. G, pp. 236–242, Dec. 1983.
[52] D. Dudgeon, “Fundamentals of digital array processing,” Proc. IEEE, vol. 65, no. 6, pp. 898–904, Jun. 1977.
[53] M. Ghavami, L. B. Michael, and R. Kohno, Ultra Wideband Signals and Systems in Communication Engineering. West Sussex, U.K.: Wiley, 2004.
[54] R. Armstrong, J. Hickish, K. Adami, and M. E. Jones, “A digital broadband beamforming architecture for 2-PAD,” in Proc. Widefield Sci. Technol. for the SKA, SKADS Conf., 2009, pp. 284–288.
[55] K. K. Parhi, VLSI Digital Signal Processing Systems: Design and Implementation. New York: Wiley, 1999.
[56] A. Madanayake, S. V. Hum, and L. T. Bruton, “Effects of quantization in systolic 2D IIR beam filters on UWB wireless communications,” Circuits, Syst., Signal Process., pp. 1–16, Jun. 2011.
[57] M. Tull, G. Wang, and M. Ozaydin, “High-speed complex number multiplier and inner-product processor,” in Proc. 45th Midw. Symp. Circuits Syst. (MWSCAS), 2002, pp. 640–643.
[58] “Achronix CAD Environment User Guide,” ver. 2.3.0, Oct. 2009.
[59] C. D. Thompson, “A complexity theory for VLSI,” Ph.D. dissertation, Dept. Comput. Sci., Carnegie-Mellon Univ., Pittsburgh, PA, 1980.

Rimesh M. Joshi (S’10) received the B.E. degree in electronics and communication engineering from Tribhuvan University, Kathmandu, Nepal, in 2008, and the M.S. degree in electrical engineering from the University of Akron, Akron, OH, in 2011.

Arjuna Madanayake (M’03) received the B.Sc. degree in electronic and telecommunication engineering from the University of Moratuwa, Moratuwa, Sri Lanka, in 2002, and the M.Sc. and Ph.D. degrees in electrical engineering from the University of Calgary, Calgary, Canada, in 2004 and 2008, respectively.
He is a Tenure-track Assistant Professor with the Department of Electrical and Computer Engineering, University of Akron, Akron, OH.

Jithra Adikari (M’07) received the B.Sc. degree in electronic and telecommunication engineering from the University of Moratuwa, Moratuwa, Sri Lanka, in 2002, the M.Sc. degree in information technology from the Royal Institute of Technology (KTH), Stockholm, Sweden, in 2005, and the Ph.D. degree in electrical and computer engineering from the University of Calgary, Calgary, AB, Canada, in 2010.
He is with Elliptic Technologies, Canada. He was with the University of Waterloo, Waterloo, ON, Canada.

Len T. Bruton (F’81) is Professor Emeritus with the Department of Electrical and Computer Engineering, University of Calgary, Calgary, AB, Canada.
Prof. Bruton was a recipient of many awards including the 2002 IEEE Circuits and Systems Education Award, the 1994 IEEE Outstanding Engineering Award, and the 1991 Manning Principal Award. In 1994, he was elected a fellow of the Royal Society of Canada. He has been featured in the 1997 Great Canadian Scientists by Barry Shell.
