
Adaptive Blind Equalization Based on the Minimum Entropy Principle
Asoke K. Nandi (1) and Shafayat Abrar (2)
Department of Electrical Engineering & Electronics
(1) The University of Liverpool, Liverpool L69 3GJ, UK
(2) COMSATS Institute of Information Technology, Islamabad 44000, Pakistan

Abstract: In this article, we introduce the principle of minimum entropy in the context of blindly equalizing a digital communication channel. We discuss how to use this principle to design an inter-symbol-interference sensitive cost for an exemplary amplitude-phase shift-keying signaling. We also discuss the admissibility of the proposed cost and the stability of the derived adaptive algorithm.

Index Terms: Adaptive equalization, blind equalization, minimum entropy deconvolution

I. INTRODUCTION
Adaptive trained equalization was developed by Lucky for telephone channels [1]. Lucky proposed the so-called zero-forcing method to be applied in FIR equalization. In blind equalization, on the other hand, the desired signal is unknown to the receiver, except for its probabilistic or statistical properties over some known alphabets. As both the channel and its input are unknown, the objective of blind equalization is to recover the unknown input sequence based solely on its statistical properties [2]. Historically, the possibility of blind equalization was first discussed by Allen and Mazo in 1974 [3]. They proved analytically that an adjusting equalizer, optimizing the mean-squared sample values at its output while keeping a particular tap anchored at unit value, is capable of inverting the channel without needing a training sequence.
The first comprehensive analytical study of the blind equalization problem was presented by Benveniste, Goursat, and Ruget in 1980 [4]. They established that if the transmitted signal is composed of non-Gaussian, independent and identically distributed samples, both channel and equalizer are linear time-invariant filters, noise is negligible, and the probability density functions of the transmitted and equalized signals are equal, then the channel has been perfectly equalized. This mathematical result is very important since it establishes the possibility of obtaining an equalizer with the sole aid of the signal's statistical properties and without requiring any knowledge of the channel impulse response or a training data sequence. The second analytical landmark occurred in 1990 when Shalvi and Weinstein significantly simplified the conditions for blind equalization [5]. They showed that zero-forcing equalization can be achieved if the fourth order cumulant (kurtosis) is maximized and the second order cumulant (energy) remains the same. Before this work, it was usually believed that one needs to exploit infinite statistics to ensure zero-forcing equalization.
Interestingly, designing a cost function for blind equalization has been more of an art than a science; the majority of the cost functions tend to be proposed on intuitive grounds and then validated. For this reason, a plethora of cost functions for blind equalization is available in the literature. On the contrary, there exist established methods that facilitate the design of blind cost functions from the statistical properties of the transmitted and received signals. One such method originated in the late 1970s in the geophysics community, who sought to determine the inverse of the channel in seismic data analysis [6], [7], and it was named minimum entropy deconvolution (MED). Later, in the early 1990s, Satorius and Mulligan employed the MED principle and came up with proposals to blindly equalize communication channels [8]; however, those marvelous signal-specific proposals regrettably failed to receive serious attention.

II. MINIMUM ENTROPY DECONVOLUTION PRINCIPLE
The MED principle was introduced by Wiggins in seismic data analysis in the year 1977, who sought to determine the inverse channel $w^{\dagger}$ that maximizes the kurtosis of the deconvolved data $y_n$ [6]. For seismic data, which are super-Gaussian in nature, he suggested to maximize the cost
\[
J = \frac{\frac{1}{K}\sum_{k=1}^{K} |y_{n-k+1}|^{4}}{\left(\frac{1}{K}\sum_{k=1}^{K} |y_{n-k+1}|^{2}\right)^{2}} .
\]
This scheme seeks the smallest number of large spikes consistent with the data, thus maximizing the order, or equivalently, minimizing the entropy or disorder in the data [9]. Note that the cost has the statistical form of sample kurtosis and the expression is scale-invariant. Later, Gray generalized it with two degrees of freedom as follows [7]:
\[
J_{\mathrm{med}}^{(p,q)} \equiv \frac{\frac{1}{K}\sum_{k=1}^{K} |y_{n-k+1}|^{p}}{\left(\frac{1}{K}\sum_{k=1}^{K} |y_{n-k+1}|^{q}\right)^{p/q}} \tag{1}
\]
In the context of digital communication, where the underlying distribution of the transmitted (possibly pulse amplitude modulated) data symbols is closer to a uniform density (sub-Gaussian), we can obtain a blind equalizer by solving:
\[
w^{\dagger} =
\begin{cases}
\arg\min_{w} J_{\mathrm{med}}^{(p,q)}, & \text{if } p > q, \\
\arg\max_{w} J_{\mathrm{med}}^{(p,q)}, & \text{if } p < q.
\end{cases} \tag{2}
\]
Note that, in the derivation of (1), it is assumed that the original signal $a_n$ can be modeled as a realization of an independent non-Gaussian process with distribution $p_A(a;\alpha) = \frac{\alpha}{2\beta\,\Gamma(1/\alpha)} \exp\left(-|a|^{\alpha}/\beta^{\alpha}\right)$, where the signal $a_n$ is real-valued, $\alpha$ is the shape parameter, $\beta$ is the scale parameter, and $\Gamma(\cdot)$ is the Gamma function.
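The behaviour of the cost (1) can be illustrated numerically. The following is a minimal sketch, assuming NumPy is available; the function name `j_med`, the block length `K`, and the three test distributions are illustrative choices and are not taken from the original work. For $p=4$, $q=2$ the cost reduces to the sample kurtosis, which should come out smallest for the uniform (sub-Gaussian) samples and largest for the Laplacian (super-Gaussian) samples.

```python
import numpy as np

def j_med(y, p=4, q=2):
    """Sample estimate of the variable-norm cost (1) over a block of outputs."""
    mag = np.abs(np.asarray(y, dtype=complex))
    # Numerator: p-th absolute moment; denominator: q-th moment raised to p/q.
    return np.mean(mag ** p) / np.mean(mag ** q) ** (p / q)

rng = np.random.default_rng(0)   # fixed seed so the sketch is repeatable
K = 50_000                       # illustrative block length (assumption)
samples = {
    "uniform (sub-Gaussian)": rng.uniform(-1.0, 1.0, K),
    "Gaussian": rng.standard_normal(K),
    "Laplacian (super-Gaussian)": rng.laplace(size=K),
}
costs = {name: j_med(y) for name, y in samples.items()}
for name, value in costs.items():
    print(f"{name:28s} J_med(4,2) = {value:.3f}")
```

Because the numerator and denominator scale identically, multiplying the block by any nonzero gain should leave the value unchanged, which is the scale-invariance noted above.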
III. MINIMUM ENTROPY (BLIND) EQUALIZATION OF APSK
We employ the MED principle and use the PDFs of the transmitted amplitude-phase shift-keying (APSK) signal and the ISI-affected received signal to design a cost function for blind equalization. Consider a continuous APSK signal, where the signal alphabets $\{a_R + j a_I\} \in \mathcal{A}$ are assumed to be uniformly distributed over a circular region of radius $R_a$ and center at the origin. The joint PDF of $a_R$ and $a_I$ is given by
\[
p_A(a_R + j a_I) =
\begin{cases}
(\pi R_a^2)^{-1}, & \sqrt{a_R^2 + a_I^2} \le R_a, \\
0, & \text{otherwise.}
\end{cases} \tag{3}
\]
Now consider the transformation $Y = \sqrt{a_R^2 + a_I^2}$ and $\Theta = \angle(a_R, a_I)$, where $Y$ is the modulus and $\angle(i, j)$ denotes the angle in the range $(0, 2\pi)$ that is defined by the point $(i, j)$. The joint distribution of the modulus $Y$ and $\Theta$ can be obtained as $p_{Y,\Theta}(\tilde{y}, \tilde{\theta}) = \tilde{y}/(\pi R_a^2)$, $0 \le \tilde{y} \le R_a$, $0 \le \tilde{\theta} < 2\pi$. Since $Y$ and $\Theta$ are independent, we obtain a triangular distribution for $Y$ given by $p_Y(\tilde{y} : H_0) = 2\tilde{y}/R_a^2$, $0 \le \tilde{y} \le R_a$, where $H_0$ denotes the hypothesis that the signal is distortion-free.
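The triangular modulus density can be checked by simulation. The sketch below is illustrative only: the helper `disc_moduli`, the sample size, and the radius value (set to 1.932, the outer modulus quoted later for 8-APSK) are assumptions made for the example. It draws points uniformly over the disc and compares two empirical quantities against what $p_Y(\tilde{y} : H_0) = 2\tilde{y}/R_a^2$ predicts, namely a mean of $2R_a/3$ and a probability of $1/4$ that the modulus falls below $R_a/2$.

```python
import numpy as np

def disc_moduli(radius, size, rng):
    """Moduli of `size` points drawn uniformly from a disc of the given radius."""
    # Rejection sampling from the enclosing square keeps the 2-D density flat.
    pts = rng.uniform(-radius, radius, size=(2 * size, 2))
    mod = np.hypot(pts[:, 0], pts[:, 1])
    return mod[mod <= radius][:size]

R_a = 1.932                       # illustrative radius (assumption)
rng = np.random.default_rng(1)
y = disc_moduli(R_a, 100_000, rng)

emp_mean = y.mean()               # theory: integral of y * 2y/R_a^2 = 2 R_a / 3
emp_cdf_half = np.mean(y <= R_a / 2)   # theory: (R_a/2)^2 / R_a^2 = 1/4
print(f"mean modulus  : {emp_mean:.4f} (theory {2 * R_a / 3:.4f})")
print(f"P(Y <= R_a/2) : {emp_cdf_half:.4f} (theory 0.2500)")
```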
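Let $Y_n, Y_{n-1}, \cdots, Y_{n-K+1}$ be a sequence of size $K$, obtained by taking the modulus of randomly generated distortion-free signal alphabets $\mathcal{A}$, where the subscript $n$ indicates the discrete time index. Let $Z_1, Z_2, \cdots, Z_K$ be the order statistic of the sequence $\{Y\}$. Let $p_Y(\tilde{y}_n, \ldots, \tilde{y}_{n-K+1} : H_0)$ be a $K$-variate density under the hypothesis $H_0$. Incorporating scale-invariance, we obtain
\[
p_Y^{*}(\tilde{y}_n, \ldots, \tilde{y}_{n-K+1} : H_0)
= \int_{0}^{\infty} p_Y(\lambda\tilde{y}_n, \ldots, \lambda\tilde{y}_{n-K+1} : H_0)\, \lambda^{K-1}\, d\lambda
= \frac{2^{K-1}}{K\,(\tilde{z}_K)^{2K}} \prod_{k=1}^{K} \tilde{y}_{n-k+1}, \tag{4}
\]
where $\tilde{z}_1, \tilde{z}_2, \ldots, \tilde{z}_K$ are the order statistic of the elements $\tilde{y}_n, \ldots, \tilde{y}_{n-K+1}$, so that $\tilde{z}_1 = \min\{\tilde{y}\}$ and $\tilde{z}_K = \max\{\tilde{y}\}$.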
Now consider the alternative ($H_1$) that the signal suffers from multi-path interference as well as additive Gaussian noise. The in-phase and quadrature components of the received signal are modeled as normally distributed (owing to the central limit theorem). It means that the modulus of the received signal follows a Rayleigh distribution,
\[
p_Y(\tilde{y} : H_1) = \frac{\tilde{y}}{\sigma_y^2} \exp\left(-\frac{\tilde{y}^2}{2\sigma_y^2}\right), \quad \tilde{y} \ge 0,\ \sigma_y > 0. \tag{5}
\]
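The $K$-variate density $p_Y^{*}(\tilde{y}_n, \cdots, \tilde{y}_{n-K+1} : H_1)$ is obtained as
\[
\begin{aligned}
p_Y^{*}(\tilde{y}_n, \tilde{y}_{n-1}, \cdots, \tilde{y}_{n-K+1} : H_1)
&= \int_{0}^{\infty} \frac{\lambda^{K} \prod_{k=1}^{K} \tilde{y}_{n-k+1}}{\sigma_y^{2K}}
\exp\left(-\frac{\lambda^{2} \sum_{k'=1}^{K} \tilde{y}_{n-k'+1}^{2}}{2\sigma_y^{2}}\right) \lambda^{K-1}\, d\lambda \\
&= \frac{2^{K-1}\,\Gamma(K) \prod_{k=1}^{K} \tilde{y}_{n-k+1}}{\left(\sum_{k=1}^{K} \tilde{y}_{n-k+1}^{2}\right)^{K}}
\end{aligned} \tag{6}
\]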
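The scale-invariant rank-discrimination uniformly most powerful test of $p_Y^{*}(\tilde{y}_n, \ldots, \tilde{y}_{n-K+1} : H_0)$ against $p_Y^{*}(\tilde{y}_n, \ldots, \tilde{y}_{n-K+1} : H_1)$ is [10]
\[
O(\tilde{y}_n) = \frac{p_Y^{*}(\tilde{y}_n, \ldots, \tilde{y}_{n-K+1} : H_0)}{p_Y^{*}(\tilde{y}_n, \ldots, \tilde{y}_{n-K+1} : H_1)}
= \frac{1}{K!} \left[ \frac{\sum_{k=1}^{K} \tilde{y}_{n-k+1}^{2}}{\tilde{z}_K^{2}} \right]^{K}
\underset{H_1}{\overset{H_0}{\gtrless}} C \tag{7}
\]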
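where $C$ is some threshold. Assuming large $K$, we can approximate $\frac{1}{K}\sum_{k=1}^{K} \tilde{y}_{n-k+1}^{2} \approx E\left[|y_n|^2\right]$. This helps in obtaining a statistical cost for the blind equalization of APSK signals as follows:
\[
w^{\dagger} = \arg\max_{w} \frac{E\left[|y_n|^2\right]}{\left(\max\{|y_n|\}\right)^{2}} \tag{8}
\]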
Maximizing (8) can be interpreted as determining the equalizer coefficients, $w$, that drive the distribution of the equalizer output, $y_n$, away from a Gaussian distribution toward a uniform one, thus successfully removing the interference from the received APSK signal.
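A sample version of the cost (8) is easy to evaluate on a block of data. The sketch below is only an illustration under stated assumptions: the two-ring constellation layout in `two_ring_apsk` (equal points per ring, inner ring rotated by half a step) and the three-tap channel `h` are hypothetical and are not the constellation or channel used in the paper. It shows that adding ISI raises the peak modulus faster than the energy, so the ratio in (8) drops.

```python
import numpy as np

def med_apsk_cost(y):
    """Sample estimate of cost (8): mean energy over squared peak modulus."""
    mod = np.abs(np.asarray(y, dtype=complex))
    return np.mean(mod ** 2) / np.max(mod) ** 2

def two_ring_apsk(r_in, r_out, per_ring):
    """Hypothetical two-ring APSK layout (assumption, for illustration only)."""
    ang = 2 * np.pi * np.arange(per_ring) / per_ring
    inner = r_in * np.exp(1j * (ang + np.pi / per_ring))
    outer = r_out * np.exp(1j * ang)
    return np.concatenate([inner, outer])

rng = np.random.default_rng(2)
alphabet = two_ring_apsk(1.000, 1.932, 4)
a = rng.choice(alphabet, size=5000)     # distortion-free symbols
h = np.array([1.0, 0.4, -0.2])          # hypothetical ISI channel (assumption)
x = np.convolve(a, h, mode="valid")     # ISI-affected sequence

cost_clean = med_apsk_cost(a)
cost_isi = med_apsk_cost(x)
print(f"cost (8) without ISI: {cost_clean:.3f}")
print(f"cost (8) with ISI   : {cost_isi:.3f}")
```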
A. Admissibility of the proposed cost
The cost (8) demands maximizing the equalizer output energy while minimizing the largest modulus. Since the largest modulus of the transmitted signal $a_n$ is $R_a$, incorporating this a priori knowledge, the unconstrained cost (8) can be written in a constrained form as follows:
\[
w^{\dagger} = \arg\max_{w} E\left[|y_n|^2\right] \quad \text{s.t.} \quad \max\{|y_n|\} \le R_a . \tag{9}
\]
By incorporating $R_a$, it would be possible to recover the true energy of the signal $a_n$ upon successful convergence. We note that the cost (9) is quadratic, and the feasible region (constraint) is a convex set. The problem, however, is non-convex and may have multiple local maxima. Nevertheless, we have the following theorem:

Theorem: Assume $w^{\dagger}$ is a local optimum in (9), $t^{\dagger}$ is the corresponding total channel-equalizer impulse response, and additive noise is negligible. Then it holds that $|t_l| = \delta_{l-l^{\dagger}}$.
Proof: Without loss of generality, we assume that the channel and equalizer are real-valued. We obtain $\max\{|y_n|\} = R_a \sum_l |t_l|$ and $E\left[|y_n|^2\right] = \sigma_a^2 \sum_l |t_l|^2$, where $\sigma_a^2 = E\left[|a|^2\right]$. We re-write (9) as $w^{\dagger} = \arg\max_{w} \sum_l t_l^2$ s.t. $\sum_l |t_l| \le 1$. Now consider the following quadratic problem in the $t$ domain:
\[
t^{\dagger} = \arg\max_{t} \sum_l t_l^2 \quad \text{s.t.} \quad \sum_l |t_l| \le 1 . \tag{10}
\]
Assume $t^{(f)}$ is a feasible solution to (10). We have $\sum_l t_l^2 \le \left(\sum_l |t_l|\right)^2 \le 1$, where the equality is achieved if and only if all cross terms in $\left(\sum_l |t_l|\right)^2 = \sum_l t_l^2 + \sum_{l_1} \sum_{l_2,\, l_2 \ne l_1} |t_{l_1} t_{l_2}|$ are zero. Now assume that $t^{(k)}$ is a local optimum of (10), i.e., the following proposition holds: $\exists\, \epsilon > 0$, $\forall\, t^{(f)}$, $\|t^{(f)} - t^{(k)}\|_2 \le \epsilon \Rightarrow \sum_l (t_l^{(k)})^2 \ge \sum_l (t_l^{(f)})^2$. Suppose $t^{(k)}$ does not satisfy the Theorem. Consider $t^{(c)}$ defined by $t_{l_1}^{(c)} = t_{l_1}^{(k)} + \epsilon/\sqrt{2}$, $t_{l_2}^{(c)} = t_{l_2}^{(k)} - \epsilon/\sqrt{2}$, and $t_l^{(c)} = t_l^{(k)}$, $l \ne l_1, l_2$. We also assume that $t_{l_2}^{(k)} < t_{l_1}^{(k)}$. Next, we have $\|t^{(c)} - t^{(k)}\|_2 = \epsilon$, and $\sum_l |t_l^{(c)}| = \sum_l |t_l^{(k)}| \le 1$. However, one can observe that $\sum_l (t_l^{(k)})^2 - \sum_l (t_l^{(c)})^2 = \sqrt{2}\,\epsilon \left(t_{l_2}^{(k)} - t_{l_1}^{(k)}\right) - \epsilon^2 < 0$, which means $t^{(k)}$ is not a local optimum of (10). This counterexample shows that all local maxima of (10) satisfy the Theorem.
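The perturbation used in the proof can be reproduced with a small numerical example. The sketch below is illustrative: the vector `t_k`, the indices, and the value of `eps` are arbitrary choices made here, not values from the paper. It starts from a feasible response with two nonzero taps, applies the $\pm\epsilon/\sqrt{2}$ shift, and confirms that the $\ell_1$ constraint is preserved while the energy increases by the amount predicted in the proof.

```python
import numpy as np

def perturb(t_k, l1, l2, eps):
    """Shift eps/sqrt(2) of weight from tap l2 to tap l1, as in the proof."""
    t_c = np.array(t_k, dtype=float)
    t_c[l1] += eps / np.sqrt(2)
    t_c[l2] -= eps / np.sqrt(2)
    return t_c

t_k = np.array([0.6, 0.4, 0.0])   # feasible, but not a single spike (assumption)
eps = 0.1
t_c = perturb(t_k, l1=0, l2=1, eps=eps)

distance = np.linalg.norm(t_c - t_k)             # should equal eps
energy_gain = np.sum(t_c ** 2) - np.sum(t_k ** 2)
# Closed form from the proof, with the sign flipped to express a gain.
predicted_gain = eps ** 2 - np.sqrt(2) * eps * (t_k[1] - t_k[0])

print(f"||t_c - t_k||_2 = {distance:.4f}")
print(f"sum|t_c| = {np.sum(np.abs(t_c)):.4f}, sum|t_k| = {np.sum(np.abs(t_k)):.4f}")
print(f"energy gain = {energy_gain:.5f} (predicted {predicted_gain:.5f})")
```

The shift keeps $\sum_l |t_l|$ unchanged only while both affected taps stay positive, which is the case here because `eps` is small compared with `t_k[1]`.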
B. Adaptive optimization of the proposed cost

For a stochastic gradient-based adaptive implementation of (9), we need to modify it to involve a differentiable constraint; one of the possibilities is
\[
w^{\dagger} = \arg\max_{w} E\left[|y_n|^2\right] \quad \text{s.t.} \quad \mathrm{fmax}(R_a, |y_n|) = R_a , \tag{11}
\]

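where we have used the following identity (below $a, b \in \mathbb{C}$):
\[
\mathrm{fmax}(|a|, |b|) \equiv \frac{|a| + |b| + \big|\,|a| - |b|\,\big|}{2} =
\begin{cases}
|a|, & \text{if } |a| \ge |b| \\
|b|, & \text{otherwise.}
\end{cases} \tag{12}
\]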
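The identity (12) can be verified directly. The short sketch below is illustrative; the Python function name `fmax` simply mirrors the notation of (12), and the test values are arbitrary.

```python
def fmax(a, b):
    """Larger of |a| and |b|, written without a branch as in identity (12)."""
    return (abs(a) + abs(b) + abs(abs(a) - abs(b))) / 2

# Works for complex arguments because only the moduli enter the expression.
print(fmax(3 + 4j, 2))   # |3+4j| = 5 is the larger modulus
print(fmax(1, -7))       # |-7| = 7 is the larger modulus

# Constraint of (11): fmax(R_a, |y|) equals R_a exactly when |y| <= R_a.
R_a = 1.932
print(fmax(R_a, 1.0) == R_a, fmax(R_a, 2.5) == R_a)
```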
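The function fmax is differentiable, viz.
\[
\frac{\partial\, \mathrm{fmax}(|a|, |b|)}{\partial a^{*}} =
\begin{cases}
a/(2|a|), & \text{if } |a| > |b| \\
0, & \text{if } |a| < |b|
\end{cases} \tag{13}
\]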

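In [11], we have solved the problem (11) and obtained the following gradient-based adaptive algorithm:
\[
w_{n+1} = w_n + \mu\, f(y_n)\, y_n^{*}\, x_n, \quad \text{where } f(y_n) =
\begin{cases}
1, & \text{if } |y_n| \le R_a \\
-\beta, & \text{if } |y_n| > R_a
\end{cases} \tag{14}
\]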

In (14), $\beta = 2M\sigma_a^2/(M_L R_a^2) - 1$, where $M$ is the total number of distortion-free signal alphabets (complex symbols) and $M_L$ is the number of complex symbols on the modulus $R_a$. We denote the algorithm (14) as the Beta Constant Modulus Algorithm ($\beta$CMA).
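To make the update (14) and the parameter $\beta$ concrete, the following sketch computes $\beta$ from a constellation and performs one $\beta$CMA iteration. It is not the authors' implementation. The ring layout in `two_ring_apsk` (equal numbers of points on each ring) is an assumption, the equalizer output is taken as $y_n = w_n^H x_n$, and all function names are hypothetical. With the moduli quoted in the caption of Fig. 2, the computed values should land close to the $\beta$ values reported in the simulation section.

```python
import numpy as np

def two_ring_apsk(r_in, r_out, per_ring):
    """Hypothetical two-ring APSK layout (assumption, for illustration only)."""
    ang = 2 * np.pi * np.arange(per_ring) / per_ring
    return np.concatenate([r_in * np.exp(1j * (ang + np.pi / per_ring)),
                           r_out * np.exp(1j * ang)])

def beta_from_constellation(symbols):
    """beta = 2*M*sigma_a^2 / (M_L * R_a^2) - 1, plus the largest modulus R_a."""
    mod = np.abs(np.asarray(symbols, dtype=complex))
    r_a = mod.max()
    m_total = mod.size                                   # M
    m_outer = np.count_nonzero(np.isclose(mod, r_a))     # M_L
    sigma2_a = np.mean(mod ** 2)                         # symbol energy
    return 2 * m_total * sigma2_a / (m_outer * r_a ** 2) - 1, r_a

def beta_cma_step(w, x, mu, beta, r_a):
    """One iteration of (14), assuming the output is y = w^H x."""
    y = np.vdot(w, x)                    # vdot conjugates its first argument
    f = 1.0 if abs(y) <= r_a else -beta
    return w + mu * f * np.conj(y) * x, y

beta8, r8 = beta_from_constellation(two_ring_apsk(1.000, 1.932, 4))
beta16, r16 = beta_from_constellation(two_ring_apsk(1.586, 3.000, 8))
print(f"8-APSK : beta = {beta8:.3f}")
print(f"16-APSK: beta = {beta16:.3f}")

w = np.array([1.0 + 0j, 0.0])
w_out, y_out = beta_cma_step(w, np.array([3.0 + 0j, 1.0]), 1e-3, beta8, r8)
w_in, y_in = beta_cma_step(w, np.array([1.0 + 0j, 0.0]), 1e-3, beta8, r8)
```

In this example the first call has $|y| > R_a$, so the tap is pulled back with weight $\beta$, while the second call has $|y| \le R_a$ and the tap grows, reflecting the energy-maximizing part of (11).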
C. Stability of the derived adaptive algorithm
In this Section, we carry out a deterministic stability analysis of $\beta$CMA for a bounded magnitude received signal. The analysis relies on the analytical framework of [12]. We shall assume that we update the equalizer only when its output amplitude is higher than a certain threshold; we stop the update otherwise. In our analysis, we assume that the threshold is $R_a$. So we only consider those updates when $|y_n| > R_a$; we denote the active update steps with index $k$.
We study the following form: $w_{k+1} = w_k + \mu_k \Phi_k^{*} x_k$, $\Phi_k \ne 0$, $k = 0, 1, 2, \cdots$, where $\Phi_k \equiv \Phi(y_k) = f(y_k)\, y_k$. Let $w^{\dagger}$ denote the vector of the optimal equalizer and let $z_k = w^{\dagger H} x_k = a_k$ be the optimal output so that $|z_k| = R_a$. Define the a priori and a posteriori estimation errors $e_k^{a} := z_k - y_k = \tilde{w}_k^{H} x_k$ and $e_k^{p} := z_k - s_k = \tilde{w}_{k+1}^{H} x_k$, where $\tilde{w}_k = w^{\dagger} - w_k$. We assume that $|e_k^{a}|$ is small and the equalizer is operating in the vicinity of $w^{\dagger}$. We introduce $\xi(x, y)$:
\[
\xi(x, y) := \frac{\Phi(y) - \Phi(x)}{x - y} = \frac{f(y)\, y - f(x)\, x}{x - y}, \quad (x \ne y) \tag{15}
\]

Using $\xi(x, y)$ and simple algebra, we obtain $\Phi_k = f(y_k)\, y_k = \xi(z_k, y_k)\, e_k^{a}$ and $e_k^{p} = \left(1 - \mu_k\, \xi(z_k, y_k)/\bar{\mu}_k\right) e_k^{a}$, where $\bar{\mu}_k = \|x_k\|^{-2}$. For the stability of adaptation, we require $|e_k^{p}| < |e_k^{a}|$; to ensure it, we need to guarantee $|1 - \mu_k\, \xi(z_k, y_k)/\bar{\mu}_k| < 1$, $\forall\, k$, $z_k$ and $y_k$. Now we need to prove that the real part of the function $\xi(z_k, y_k)$ defined by (15) is positive and bounded from below. Recall that $|z_k| = R_a$ and $|y_k| > R_a$. We start by writing $z_k/y_k = r e^{j\theta}$ for some $r < 1$ and for some $\theta \in [0, 2\pi)$. Then expression (15) leads to
\[
\xi(z_k, y_k) = \frac{\overbrace{f(y_k)}^{(=-\beta)}\, y_k - \overbrace{f(z_k)}^{(=0)}\, z_k}{z_k - y_k}
= \frac{\beta\, y_k}{y_k - z_k} = \frac{\beta}{1 - r e^{j\theta}} . \tag{16}
\]

It is important for our purpose to verify whether the real part of $\beta/(1 - r e^{j\theta})$ is positive. For any fixed value of $r$, we allow the angle $\theta$ to vary from zero to $2\pi$; then the term $\beta/(1 - r e^{j\theta})$ describes a circle in the complex plane whose least positive value is $\beta/(1 + r)$, obtained for $\theta = \pi$, and whose most positive value is $\beta/(1 - r)$, obtained for $\theta = 0$. This shows that for $r \in (0, 1)$, the real part of the function $\xi(z_k, y_k)$ lies in the interval
\[
\frac{\beta}{1 + r} \le \xi_R(z_k, y_k) \le \frac{\beta}{1 - r} \tag{17}
\]
Referring to Fig. 1, note that the function $\xi(z_k, y_k)$ assumes values that lie inside a circle in the right-half plane. From this figure, we can obtain the following bound for the imaginary part of $\xi(z_k, y_k)$ (that is $\xi_I(z_k, y_k)$):
\[
-\frac{\beta r}{1 - r^2} \le \xi_I(z_k, y_k) \le \frac{\beta r}{1 - r^2} . \tag{18}
\]

Fig. 1. Plot of $\xi(z_k, y_k)$ for arbitrary $\beta$, $r$ and $\theta \in [0, 2\pi)$. [Figure not reproduced: a circle in the $(\xi_R, \xi_I)$ plane crossing the real axis at $(\beta/(1+r), 0)$ and $(\beta/(1-r), 0)$, centered at $(\beta/(1-r^2), 0)$, with topmost point $(\beta/(1-r^2), \beta r/(1-r^2))$.]

Let $A$ and $B$ be any two positive numbers satisfying
\[
A^2 + B^2 < 1 . \tag{19}
\]
We need to find a $\mu_k$ that satisfies $|\mu_k\, \xi_I(z_k, y_k)/\bar{\mu}_k| < A \Rightarrow \mu_k < A\, \bar{\mu}_k/|\xi_I(z_k, y_k)|$ and $|1 - \mu_k\, \xi_R(z_k, y_k)/\bar{\mu}_k| < B \Rightarrow \mu_k > (1 - B)\, \bar{\mu}_k/\xi_R(z_k, y_k)$. Combining these results, we obtain
\[
0 < \frac{(1 - B)\, \bar{\mu}_k}{\xi_R(z_k, y_k)} < \mu_k < \frac{A\, \bar{\mu}_k}{|\xi_I(z_k, y_k)|} < 1 \tag{20}
\]
Using the extremum values of $\xi_R(z_k, y_k)$ and $\xi_I(z_k, y_k)$, we obtain
\[
\frac{(1 + r)(1 - B)}{\beta\, \|x_k\|^2} < \mu_k < \frac{(1 - r^2)\, A}{\beta\, r\, \|x_k\|^2} \tag{21}
\]
We need to guarantee that the upper bound in the above expression is larger than the lower bound. This can be achieved by choosing $\{A, B\}$ properly such that
\[
0 < (1 - B) < \frac{1 - r}{r}\, A < 1 \tag{22}
\]
From our initial assumptions that the equalizer is in the vicinity of the open-eye solution and $|y_k| > R_a$, we know that $r < 1$. Note that for $0.5 \le r < 1$ we have $(1 - r)/r \le 1$.
Let $\{A_o, B_o\}$ be such that $(1 - B_o) = \gamma A_o$, where $0 < \gamma < 1$. To satisfy (22), we need $0.5 \le r < 1$ and $\gamma < (1 - r)/r$. From (19), $B_o$ must be such that $(1 - B_o)^2 \gamma^{-2} + B_o^2 < 1$, which reduces to the following quadratic inequality in $B_o$: $\left(1 + \gamma^{-2}\right) B_o^2 - 2\gamma^{-2} B_o + \left(\gamma^{-2} - 1\right) < 0$. If we find a $B_o$ that satisfies this inequality, then a pair $\{A_o, B_o\}$ satisfying (19) and (22) exists. So consider the quadratic function $f_B := \left(1 + \gamma^{-2}\right) B^2 - 2\gamma^{-2} B + \left(\gamma^{-2} - 1\right)$. It has a negative minimum and it crosses the real axis at the positive roots $B^{(1)} = \left(1 - \gamma^2\right)/\left(1 + \gamma^2\right)$ and $B^{(2)} = 1$. This means that there exist many values of $B$, between the roots, at which the quadratic function in $B$ evaluates to negative values.
Hence, $B_o$ falls in the interval $(1 - \gamma^2)/(1 + \gamma^2) < B_o < 1$; it further gives $A_o = 2\gamma/\left(1 + \gamma^2\right)$. Using $\{A_o, B_o\}$, we obtain
\[
\frac{3\gamma^2}{(1 + \gamma^2)\, \beta\, \|x_k\|^2} < \mu_k < \frac{3\gamma}{(1 + \gamma^2)\, \beta\, \|x_k\|^2} \tag{23}
\]
Note that $\arg\min_{\gamma} \gamma^2/(1 + \gamma^2) = 0$ and $\arg\max_{\gamma} \gamma/(1 + \gamma^2) = 1$. So making $\gamma = 0$ and $\gamma = 1$ in the lower and upper bounds, respectively, and replacing $\|x_k\|^2$ with $E[\|x_k\|^2]$, we find the widest stochastic stability bound on $\mu_k$ as $0 < \mu < 3/(2\beta\, E[\|x_k\|^2])$. The significance of this result is that it can easily be measured from the equalizer input samples. In adaptive filter theory, it is convenient to replace $E[\|x_k\|^2]$ with $\mathrm{trace}(R)$, where $R = E[x_k x_k^{H}]$. Note that, when noise is negligible and channel coefficients are normalized, we can write $\mathrm{trace}(R) = N\sigma_a^2$; $N$ is the equalizer length. Finally, the stochastic bound for the range of step-size for $\beta$CMA is given by
\[
0 < \mu < \frac{3}{2 N \beta\, \sigma_a^2} \tag{24}
\]
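The bounds (17) and (18) and the step-size limit (24) can be checked numerically. The sketch below is an illustration under assumptions: the value $r = 0.7$ is arbitrary, the function names are hypothetical, and the bound is evaluated as $3/(2N\beta\sigma_a^2)$ using the 8-APSK figures quoted in the simulation section ($\beta = 1.535$, $\sigma_a^2 = 2.366$) with a seven-tap equalizer.

```python
import numpy as np

def xi(beta, r, theta):
    """Value of beta / (1 - r e^{j theta}) from (16)."""
    return beta / (1 - r * np.exp(1j * theta))

def step_size_bound(beta, n_taps, sigma2_a):
    """Upper limit of (24), taken here as 3 / (2 N beta sigma_a^2)."""
    return 3.0 / (2.0 * n_taps * beta * sigma2_a)

beta, r = 1.535, 0.7                          # r is an arbitrary choice in (0, 1)
theta = np.linspace(0.0, 2 * np.pi, 100_001)  # grid containing 0 and pi
vals = xi(beta, r, theta)

print(f"min Re = {vals.real.min():.4f}  vs beta/(1+r)     = {beta / (1 + r):.4f}")
print(f"max Re = {vals.real.max():.4f}  vs beta/(1-r)     = {beta / (1 - r):.4f}")
print(f"max Im = {vals.imag.max():.4f}  vs beta*r/(1-r^2) = {beta * r / (1 - r**2):.4f}")

mu_bound = step_size_bound(beta=1.535, n_taps=7, sigma2_a=2.366)
print(f"step-size bound for 8-APSK, N = 7: {mu_bound:.4f}")
```

A longer equalizer or a more energetic constellation shrinks the admissible step-size range in proportion, since $N$ and $\sigma_a^2$ both sit in the denominator.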

IV. SIMULATION RESULTS
In this Section, we compare $\beta$CMA with the traditional constant modulus algorithm (CMA) [13] and the Shtrom-Fan algorithm (SFA) [14], which are expressed as follows:
\[
\text{CMA}: \quad w_{n+1} = w_n + \mu \left(R_{cm} - |y_n|^2\right) y_n^{*}\, x_n, \tag{25}
\]
\[
\text{SFA}: \quad w_{n+1} = w_n + \mu \left(\frac{R_{cm}}{\sigma_a^2}\, \hat{E}[|y_n|^2] - |y_n|^2\right) y_n^{*}\, x_n, \tag{26}
\]
where $R_{cm} = E[|a|^4]/\sigma_a^2$ is a constant and $\hat{E}[|y_n|^2]$ is an iterative estimate of the energy of the deconvolved sequence, obtained as
\[
\hat{E}[|y_{n+1}|^2] = \hat{E}[|y_n|^2] + \frac{1}{n}\left(|y_n|^2 - \hat{E}[|y_n|^2]\right).
\]
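The two reference updates (25) and (26) can be sketched as single-step functions. This is a minimal illustration, not the code used for the reported simulations: the equalizer output is assumed to be $y_n = w_n^H x_n$, the function names are hypothetical, and the running energy estimate is passed in and returned explicitly so that the recursion stays visible.

```python
import numpy as np

def cma_step(w, x, mu, r_cm):
    """One CMA iteration, as in (25), assuming y = w^H x."""
    y = np.vdot(w, x)                      # vdot conjugates its first argument
    return w + mu * (r_cm - abs(y) ** 2) * np.conj(y) * x, y

def sfa_step(w, x, mu, r_cm, sigma2_a, energy_est, n):
    """One SFA iteration, as in (26), with the running energy update (n >= 1)."""
    y = np.vdot(w, x)
    err = (r_cm / sigma2_a) * energy_est - abs(y) ** 2
    w_next = w + mu * err * np.conj(y) * x
    energy_next = energy_est + (abs(y) ** 2 - energy_est) / n
    return w_next, energy_next, y

# Toy check on a unit-modulus alphabet, where R_cm = E|a|^4 / E|a|^2 = 1.
symbols = np.exp(1j * np.pi / 2 * np.arange(4))
sigma2_a = np.mean(np.abs(symbols) ** 2)
r_cm = np.mean(np.abs(symbols) ** 4) / sigma2_a

w = np.array([1.0 + 0j, 0.0])
w_same, _ = cma_step(w, np.array([1j, 0.3]), mu=0.01, r_cm=r_cm)        # |y|^2 = R_cm
w_moved, _ = cma_step(w, np.array([2.0 + 0j, 0.0]), mu=0.01, r_cm=r_cm)  # |y|^2 = 4
w_sfa, e_next, _ = sfa_step(w, np.array([1j, 0.3]), 0.01, r_cm, sigma2_a, 1.0, n=1)
print(w_same, w_moved, w_sfa, e_next)
```

When the output modulus already matches the dispersion constant, both error terms vanish and the taps do not move; SFA differs from CMA only in rescaling the target by the running output-energy estimate.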

We consider the transmission of amplitude-phase shift-keying (APSK) signals over a complex voice-band channel (channel-1), taken from [15], and evaluate the average ISI traces at SNR = 30 dB. We employ a seven-tap equalizer with central spike initialization and use 8- and 16-APSK signalling. Step-sizes have been selected such that all algorithms reached steady-state requiring the same number of iterations. The parameter $\beta$ is 1.535 for 8-APSK and it is 1.559 for 16-APSK. Simulation results are depicted in Fig. 2. We can note that $\beta$CMA performed better than CMA and SFA and yielded a much lower steady-state ISI floor. Also, SFA performed slightly better than CMA.

Fig. 2. Residual ISI: a) 8-APSK and b) 16-APSK. The inner and outer moduli of 8-APSK are 1.000 and 1.932, respectively. And the inner and outer moduli of 16-APSK are 1.586 and 3.000, respectively. The energies of 8-APSK and 16-APSK are 2.366 and 5.757, respectively. [Plots not reproduced. Axes: residual ISI (dB) traces versus iterations (0 to 4000), SNR = 30 dB. Legend of panel a): CMA, $\mu$ = 8E-4; SFA, $\mu$ = 7E-4; $\beta$CMA, $\mu$ = 6E-4, $\beta$ = 1.53. Legend of panel b): CMA, $\mu$ = 1.5E-4; SFA, $\mu$ = 1.3E-4; $\beta$CMA, $\mu$ = 3E-4, $\beta$ = 1.55.]

Next we validate the stability bound (24). Here we consider a second complex channel (as channel-2) taken from [16]. In all cases, the simulations were performed with $N_{it} = 10^4$ iterations, $N_{run} = 100$ runs, and no noise. In Fig. 3, we plot the probability of divergence ($P_{div}$) for three different equalizer lengths, against the normalized step-size, $\mu_{norm} = \mu/\mu_{bound}$. The $P_{div}$ is estimated as $P_{div} = N_{div}/N_{run}$, where $N_{div}$ indicates the number of times the equalizer diverged. Equalizers were initialized close to the zero-forcing solution. It can be seen that the bound does guarantee a stable performance when $\mu < \mu_{bound}$.

Fig. 3. Probability of divergence on channel-1 and channel-2 with three equalizer lengths, no noise, $N_{it} = 10^4$ iterations and $N_{run} = 100$ runs for a) 8-APSK and b) 16-APSK. [Plots not reproduced. Axes: $P_{div}$ (0 to 1) versus $\mu_{norm}$ (1 to 1.5); curves for N = 7, N = 17 and N = 27 on Channel 1 [PP] and Channel 2 [KD].]

V. CONCLUSIONS
The key challenge of adaptive blind equalizers lies in the design of special cost functions whose minimization or maximization results in the removal of inter-symbol interference. Based on the minimum entropy principle, we described the idea of designing a specific cost function for the blind equalization of a given transmitted signal. We presented a case study of amplitude-phase shift-keying for which a cost is derived and the corresponding adaptive algorithm is obtained. We addressed the admissibility of the proposed cost and the stability of the derived algorithm. The blind equalizer implementing the derived algorithm is shown to possess better convergence behavior compared to two existing (constant modulus) equalizers.

REFERENCES
[1] R.W. Lucky. Automatic equalization for digital communication. The Bell System Technical Journal, XLIV(4):547-588, 1965.
[2] C.R. Johnson, Jr., P. Schniter, T.J. Endres, J.D. Behm, D.R. Brown, and R.A. Casas. Blind equalization using the constant modulus criterion: A review. Proc. IEEE, 86(10):1927-1950, 1998.
[3] J.B. Allen and J.E. Mazo. A decision-free equalization scheme for minimum-phase channels. IEEE Trans. Commun., 22(10):1732-1733, 1974.
[4] A. Benveniste, M. Goursat, and G. Ruget. Robust identification of a nonminimum phase system: Blind adjustment of a linear equalizer in data communication. IEEE Trans. Automat. Contr., 25(3):385-399, 1980.
[5] O. Shalvi and E. Weinstein. New criteria for blind equalization of nonminimum phase systems. IEEE Trans. Inf. Theory, 36(2):312-321, 1990.
[6] R.A. Wiggins. Minimum entropy deconvolution. Proc. Int. Symp. Computer Aided Seismic Analysis and Discrimination, 1977.
[7] W. Gray. Variable Norm Deconvolution. PhD thesis, Stanford Univ., 1979.
[8] E.H. Satorius and J.J. Mulligan. An alternative methodology for blind equalization. Dig. Sig. Process.: A Review Jnl., 3(3):199-209, 1993.
[9] A.T. Walden. Non-Gaussian reflectivity, entropy, and deconvolution. Geophysics, 50(12):2862-2888, Dec. 1985.
[10] Z. Sidak, P.K. Sen, and J. Hajek. Theory of Rank Tests. Academic Press; 2/e, 1999.
[11] S. Abrar and A.K. Nandi. Adaptive minimum entropy equalization algorithm. IEEE Commun. Lett., 14(10):966-968, 2010.
[12] M. Rupp and A.H. Sayed. On the convergence analysis of blind adaptive equalizers for constant modulus signals. IEEE Trans. Commun., 48(5):795-803, May 2000.
[13] D.N. Godard. Self-recovering equalization and carrier tracking in two-dimensional data communications systems. IEEE Trans. Commun., 28(11):1867-1875, 1980.
[14] V. Shtrom and H.H. Fan. New class of zero-forcing cost functions in blind equalization. IEEE Trans. Sig. Process., 46(10):2674, 1998.
[15] G. Picchi and G. Prati. Blind equalization and carrier recovery using a stop-and-go decision-directed algorithm. IEEE Trans. Commun., 35(9):877-887, 1987.
[16] R.A. Kennedy and Z. Ding. Blind adaptive equalizers for quadrature amplitude modulated communication systems based on convex cost functions. Opt. Eng., 31(6):1189-1199, 1992.
