
Welcome to Adaptive Signal Processing! 1

From Merriam-Webster's Collegiate Dictionary:
Main Entry: adaptation
Pronunciation: a-dap-tA-sh&n, -d&p-
Function: noun
Date: 1610
1 : the act or process of adapting : the state of being adapted
2 : adjustment to environmental conditions: as
a : adjustment of a sense organ to the intensity or quality
of stimulation
b : modification of an organism or its parts that makes it more
fit for existence under the conditions of its environment
3 : something that is adapted; specifically : a composition rewritten
into a new form
- adaptational /-shn&l, -sh&-n&l/ adjective
- adaptationally adverb
Lectures and exercises 2
Lectures: Tuesdays 08.15-10.00 in room E:1406
Exercises: Wednesdays 08.15-10.00 in room E:1145
Computer exercises: Wednesdays 13.15-15.00 in room E:4115, or
Thursdays 10.15-12.00 in room E:4115
Lab sessions: Lab I: Adaptive channel equalizer in room E:4115
Lab II: Adaptive filter on a DSP in room E:4115
Sign up on the lists on the course webpage
from Monday Nov 1.
Course literature 3
Book: Simon Haykin, Adaptive Filter Theory, 4th edition,
Prentice-Hall, 2001.
ISBN: 0-13-090126-1 (Hardcover)
Chapters: Backgr., (2), 4, 5, 6, 7, 8, 9, 13.2, 14.1
(3rd edition: Intr., 1, (5), 8, 9, 10, 13, 16.1, 17.2)
Exercise material: Exercise compendium (course home page)
Computer exercises (course home page)
Lab instructions (course home page)
Other material: Lecture notes (course home page)
Matlab code (course home page)
Contents - References in the 4th edition 4
Week 1: Recap of OSB (Hayes, or chap. 2),
The method of Steepest Descent (chap. 4)
Week 2: The LMS algorithm (chap. 5)
Week 3: Modified LMS algorithms (chap. 6)
Week 4: Frequency-domain adaptive filters (chap. 7)
Week 5: The RLS algorithm (chap. 8-9)
Week 6: Tracking and implementation aspects (chap. 13.2, 14.1)
Week 7: Summary
Contents - References in the 3rd edition 5
Week 1: Recap of OSB (Hayes, or chap. 5),
The method of Steepest Descent (chap. 8)
Week 2: The LMS algorithm (chap. 9)
Week 3: Modified LMS algorithms (chap. 9)
Week 4: Frequency-domain adaptive filters (chap. 10, 1)
Week 5: The RLS algorithm (chap. 11)
Week 6: Tracking and implementation aspects (chap. 16.1, 17.2)
Week 7: Summary
Lecture 1 6
This lecture deals with:
Recap of the course Optimal Signal Processing (OSB)
The method of Steepest Descent
Recap of Optimal Signal Processing (OSB) 7
The following problems were treated in OSB:
Signal modeling: a model with both poles and zeros, or a model with only
poles (vocal tract) or only zeros (lips).
Inverse filtering of FIR type: deconvolution or equalization of a channel.
Wiener filtering: filtering, equalization, prediction and deconvolution.
Optimal Linear Filtering 8
[Block diagram: the input signal u(n) is fed through the filter w,
producing the output signal y(n); this is compared with the desired signal
d(n), giving the estimation error e(n) = d(n) - y(n).]
We seek the filter w = [w_0  w_1  w_2  ...]^T that minimizes the
estimation error e(n), so that the output signal y(n) resembles the
desired signal d(n) as closely as possible.
Optimal Linear Filtering 9
In order to determine the optimal filter, a cost function J that penalizes
the deviation e(n) is introduced. The larger e(n), the higher the cost.
From OSB you know several different strategies, e.g.,
The total squared error (LS): deterministic description of the signal,
J = Σ_{n=n1}^{n2} e²(n)
The mean squared error (MS): stochastic description of the signal,
J = E{|e(n)|²}
The mean squared error with an extra constraint,
J = E{|e(n)|²} + |u(n)|²
Optimal Linear Filtering 10
The cost function J(n) = E{|e(n)|^p} can be used for any p ≥ 1, but most
often p = 2 is chosen. This choice gives a convex cost function, which is
referred to as the Mean Squared Error:
J = E{e(n) e*(n)} = E{|e(n)|²}          (MSE)
Optimal Linear Filtering 11
In order to find the optimal filter coefficients, J is minimized with
respect to the coefficients themselves. This is done by differentiating J
with respect to w_0, w_1, ..., and then setting the derivatives to zero.
Here, it is important that the cost function is convex, i.e., that there
is a global minimum.
The minimization is expressed in terms of the gradient operator ∇:
∇J = 0
where ∇J is called the gradient vector.
In particular, the choice of the squared cost function (Mean Squared
Error) leads to the Wiener-Hopf system of equations.
Optimal Linear Filtering 12
In matrix form, the cost function J = E{|e(n)|²} can be written
J(w) = E{ [d(n) - w^H u(n)] [d(n) - w^H u(n)]* }
     = σ_d² - w^H p - p^H w + w^H R w
where
w = [w_0  w_1  ...  w_{M-1}]^T                        (M x 1)
u(n) = [u(n)  u(n-1)  ...  u(n-M+1)]^T                (M x 1)
R = E{u(n) u^H(n)} =
    [ r(0)       r(1)       ...  r(M-1) ]
    [ r*(1)      r(0)       ...  r(M-2) ]
    [  ...        ...       ...   ...   ]
    [ r*(M-1)    r*(M-2)    ...  r(0)   ]
p = E{u(n) d*(n)} = [p(0)  p(-1)  ...  p(-(M-1))]^T   (M x 1)
σ_d² = E{d(n) d*(n)}
Optimal Linear Filtering 13
Applying the gradient operator yields
∇J(w) = 2 ∂J(w)/∂w* = 2 ∂/∂w* ( σ_d² - w^H p - p^H w + w^H R w )
      = -2p + 2Rw
Setting the gradient vector to zero gives the Wiener-Hopf system of
equations
R wo = p                      (Wiener-Hopf)
whose solution is the Wiener filter
wo = R⁻¹ p                    (Wiener filter)
In other words, the Wiener filter is optimal when the cost is measured
by the MSE.
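As an illustration, a minimal sketch (assuming NumPy; the signal model, filter length and values below are made up for the example, not taken from the slides) that estimates R and p from data and solves the Wiener-Hopf equations:

import numpy as np

rng = np.random.default_rng(0)
M = 4                       # filter length (illustrative choice)
N = 10000                   # number of samples

# Illustrative signal model: d(n) is u(n) filtered through an assumed FIR system plus noise.
u = rng.standard_normal(N)
h = np.array([0.8, -0.4, 0.2, 0.1])                 # assumed "unknown" system
d = np.convolve(u, h, mode="full")[:N] + 0.05 * rng.standard_normal(N)

# Regressor vectors u(n) = [u(n), u(n-1), ..., u(n-M+1)]^T
U = np.array([u[n - M + 1:n + 1][::-1] for n in range(M - 1, N)])
dv = d[M - 1:N]

R = U.T @ U / len(dv)       # sample estimate of R = E{u(n) u^H(n)}
p = U.T @ dv / len(dv)      # sample estimate of p = E{u(n) d*(n)}

wo = np.linalg.solve(R, p)  # Wiener-Hopf: R wo = p
print("Wiener filter wo =", wo)   # close to h in this example

Here np.linalg.solve is used rather than forming R⁻¹ explicitly, which is cheaper and numerically better behaved.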
Optimal Linear Filtering 14
The cost function's dependence on the filter coefficients w becomes clear
when it is written in canonical form:
J(w) = σ_d² - w^H p - p^H w + w^H R w
     = σ_d² - p^H R⁻¹ p + (w - wo)^H R (w - wo)
Here, the Wiener-Hopf equations and the expression for the Wiener filter
have been used, together with the decomposition
w^H R w = (w - wo)^H R (w - wo) - wo^H R wo + wo^H R w + w^H R wo
With the optimal filter w = wo, the minimal error J_min is achieved:
J_min ≡ J(wo) = σ_d² - p^H R⁻¹ p          (MMSE)
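As a quick numerical check, a sketch (assuming NumPy; the 2x2 values of R, p and σ_d² are illustrative assumptions, not the slide's example) verifying that the direct and canonical forms of J(w) agree and that J(wo) = J_min:

import numpy as np

# Illustrative second-order statistics (assumed values, not the slide's example)
R = np.array([[1.2, 0.4],
              [0.4, 1.2]])
p = np.array([0.6, 0.3])
sigma_d2 = 1.0

wo = np.linalg.solve(R, p)                # Wiener filter
J_min = sigma_d2 - p @ wo                 # MMSE: sigma_d^2 - p^H R^-1 p

def J_direct(w):                          # J(w) = sigma_d^2 - 2 w^T p + w^T R w (real case)
    return sigma_d2 - 2 * w @ p + w @ R @ w

def J_canonical(w):                       # J(w) = J_min + (w - wo)^T R (w - wo)
    return J_min + (w - wo) @ R @ (w - wo)

w = np.array([0.1, -0.7])                 # arbitrary test point
print(J_direct(w), J_canonical(w))        # the two values coincide
print("J_min =", J_min, " J(wo) =", J_direct(wo))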
Optimal Linear Filtering 15
Error-performance surface for an FIR filter with two coefficients,
w = [w_0, w_1]^T.
[Figure: J(w) shown as a surface plot and as contour curves over the
(w_0, w_1)-plane, with the minimum at w = wo.]
p = [0.5272  0.4458]^T
R = [1.1  0.5; 0.5  1.1]
σ_d² = 0.9486
wo = [0.8360  0.7853]^T
J_min = 0.1579
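A short sketch (assuming NumPy and matplotlib; it reuses the illustrative R, p and σ_d² from the previous sketch rather than the values above) of how such an error-performance surface can be drawn:

import numpy as np
import matplotlib.pyplot as plt

R = np.array([[1.2, 0.4], [0.4, 1.2]])    # illustrative values (assumed)
p = np.array([0.6, 0.3])
sigma_d2 = 1.0
wo = np.linalg.solve(R, p)

w0, w1 = np.meshgrid(np.linspace(-2, 2, 200), np.linspace(-2, 2, 200))
# J(w) = sigma_d^2 - 2 w^T p + w^T R w evaluated on the grid (real-valued case)
J = (sigma_d2 - 2 * (p[0] * w0 + p[1] * w1)
     + R[0, 0] * w0**2 + 2 * R[0, 1] * w0 * w1 + R[1, 1] * w1**2)

plt.contour(w0, w1, J, levels=20)
plt.plot(wo[0], wo[1], 'rx')              # the minimum lies at w = wo
plt.xlabel('w_0'); plt.ylabel('w_1'); plt.title('Error-performance surface')
plt.show()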
Steepest Descent 16
The method of Steepest Descent is a recursive method for finding the
Wiener filter when the statistics of the signals are known.
The method of Steepest Descent is not an adaptive filter, but it serves
as a basis for the LMS algorithm, which is presented in Lecture 2.
Steepest Descent 17
The method of Steepest Descent is a recursive method that solves the
Wiener-Hopf equations. The statistics (R, p) are known. The purpose is to
avoid inverting R (which saves computations).
1. Set starting values for the filter coefficients, w(0) (n = 0).
2. Determine the gradient ∇J(n), which points in the direction in which
   the cost function increases the most: ∇J(n) = -2p + 2Rw(n).
3. Adjust w(n+1) in the direction opposite to the gradient, scaling the
   adjustment with the stepsize parameter μ:
   w(n+1) = w(n) + (1/2) μ [-∇J(n)]
4. Repeat steps 2 and 3.
A sketch of this recursion in code is given below.
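A minimal sketch of the recursion (assuming NumPy; R, p and the stepsize are illustrative assumptions, not the slide's example):

import numpy as np

def steepest_descent(R, p, mu, n_iter=100, w_init=None):
    """w(n+1) = w(n) + (1/2) mu [-grad J(n)], with grad J(n) = -2p + 2 R w(n)."""
    w = np.zeros(len(p)) if w_init is None else np.asarray(w_init, dtype=float)
    trajectory = [w.copy()]
    for _ in range(n_iter):
        grad = -2 * p + 2 * R @ w
        w = w + 0.5 * mu * (-grad)        # equivalently: w = w + mu * (p - R @ w)
        trajectory.append(w.copy())
    return np.array(trajectory)

# Illustrative example (assumed values, not the slide's)
R = np.array([[1.2, 0.4], [0.4, 1.2]])
p = np.array([0.6, 0.3])
W = steepest_descent(R, p, mu=0.5, n_iter=50)
print("w(50) =", W[-1], "  Wiener filter:", np.linalg.solve(R, p))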
Convergence, filter coefficients 18
Since the method of Steepest Descent contains feedback, there is a risk
that the algorithm diverges. This limits the choice of the stepsize
parameter μ. An example of the critical choice of μ is given below.
The statistics are the same as in the previous example.
[Figure: the filter coefficients w(n) versus the iteration n for
μ = 0.1, 1.0, 1.25 and 1.5, compared with the optimal values wo_0 and wo_1.]
p = [0.5272  0.4458]^T
R = [1.1  0.5; 0.5  1.1]
wo = [0.8360  0.7853]^T
w(0) = [0  0]^T
Convergence, error surface 19
The influence of the stepsize parameter μ on the convergence can be seen
by analyzing J(w). The example below illustrates the convergence towards
J_min for different choices of μ.
[Figure: contour plot of J(w) over the (w_0, w_1)-plane with the
trajectories of w(n) from the starting point w(0) towards wo for
μ = 0.1, 0.5 and 1.0.]
p = [0.5272  0.4458]^T
R = [1.1  0.5; 0.5  1.1]
wo = [0.8360  0.7853]^T
w(0) = [1  1.7]^T
Convergence analysis 20
How should μ be chosen? A small value gives slow convergence, while a
large value brings a risk of divergence.
Perform an eigenvalue decomposition of R in the expression for J(w(n)):
J(n) = J_min + (w(n) - wo)^H R (w(n) - wo)
     = J_min + (w(n) - wo)^H Q Λ Q^H (w(n) - wo)
     = J_min + ν^H(n) Λ ν(n) = J_min + Σ_k λ_k |ν_k(n)|²
The convergence of the cost function thus depends on ν(n), i.e., on the
convergence of w(n), through the relationship ν(n) = Q^H (w(n) - wo).
Convergence analysis 21
With the observation that w(n) = Q ν(n) + wo, the update of the cost
function can be derived:
w(n+1) = w(n) + μ [p - R w(n)]
Q ν(n+1) + wo = Q ν(n) + wo + μ [p - R Q ν(n) - R wo]
ν(n+1) = ν(n) - μ Q^H R Q ν(n) = (I - μΛ) ν(n)
ν_k(n+1) = (1 - μλ_k) ν_k(n)          (element k of ν(n))
The last line is a first-order difference equation, with the solution
ν_k(n) = (1 - μλ_k)^n ν_k(0)
For this to converge it is required that |1 - μλ_k| < 1, which leads to
the stability criterion of the method of Steepest Descent:
0 < μ < 2/λ_max          (Stability, S.D.)
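A small sketch (assuming NumPy; R is the illustrative matrix used in the earlier sketches) that computes the stability bound 2/λ_max and checks the per-mode factors 1 - μλ_k:

import numpy as np

R = np.array([[1.2, 0.4], [0.4, 1.2]])    # illustrative correlation matrix (assumed)
lam = np.linalg.eigvalsh(R)               # eigenvalues of R (here 0.8 and 1.6)
mu_max = 2 / lam.max()                    # stability bound: 0 < mu < 2/lambda_max
print("eigenvalues:", lam, " stable range: 0 < mu <", mu_max)

for mu in (0.1, 1.0, mu_max, 1.5 * mu_max):
    factors = 1 - mu * lam                # per-mode factors (1 - mu*lambda_k)
    stable = np.all(np.abs(factors) < 1)
    print(f"mu = {mu:5.3f}: factors {factors}, converges: {stable}")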
Convergence, time constants 22
The time constants indicate how many iterations it takes until the
respective error has decreased by a factor e⁻¹, where e denotes the base
of the natural logarithm. The smaller the time constant, the better.
The time constant τ_k for eigenmode k (eigenvalue λ_k) is
τ_k = -1 / ln(1 - μλ_k) ≈ 1/(μλ_k)  for μλ_k << 1          (Time constant τ_k)
If the convergence of the entire coefficient vector w(n) is considered,
the speed of convergence is limited by the largest and smallest eigenvalues
of R, λ_max and λ_min. This time constant is denoted τ_a:
-1 / ln(1 - μλ_max) ≤ τ_a ≤ -1 / ln(1 - μλ_min)          (Time constant τ_a)
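A sketch (assuming NumPy, continuing with the illustrative R used above) computing the per-mode time constants and the bounds on τ_a:

import numpy as np

R = np.array([[1.2, 0.4], [0.4, 1.2]])    # illustrative values (assumed)
lam = np.linalg.eigvalsh(R)
mu = 0.1

tau_k = -1 / np.log(1 - mu * lam)         # exact time constant per eigenmode
tau_approx = 1 / (mu * lam)               # approximation, valid for mu*lambda_k << 1
print("tau_k        =", tau_k)
print("1/(mu*lam_k) =", tau_approx)

tau_a_low = -1 / np.log(1 - mu * lam.max())   # bounds on the overall time constant
tau_a_high = -1 / np.log(1 - mu * lam.min())
print(f"{tau_a_low:.1f} <= tau_a <= {tau_a_high:.1f}")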
Learning Curve 23
[Figure: learning curves, i.e., the cost function J(n) versus the iteration
n, for μ = 0.1, 0.5 and 1.0, all approaching J_min.]
p = [0.5272  0.4458]^T
R = [1.1  0.5; 0.5  1.1]
wo = [0.8360  0.7853]^T
w(0) = [0  0]^T
How fast an adaptive filter converges is usually shown in a learning curve,
which is a plot of J(n) as a function of the iteration n.
For SD, J(n) approaches J_min. Since SD is deterministic, it is misleading
to talk about learning curves in this case.
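A sketch of such a curve for Steepest Descent (assuming NumPy and matplotlib; R, p and σ_d² are the illustrative values used in the earlier sketches, not the slide's):

import numpy as np
import matplotlib.pyplot as plt

R = np.array([[1.2, 0.4], [0.4, 1.2]])    # illustrative values (assumed)
p = np.array([0.6, 0.3])
sigma_d2 = 1.0
J_min = sigma_d2 - p @ np.linalg.solve(R, p)

for mu in (0.1, 0.5, 1.0):
    w = np.zeros(2)
    J = []
    for n in range(60):
        J.append(sigma_d2 - 2 * w @ p + w @ R @ w)
        w = w + mu * (p - R @ w)          # steepest descent update
    plt.plot(J, label=f"mu = {mu}")

plt.axhline(J_min, linestyle="--", label="J_min")
plt.xlabel("Iteration n"); plt.ylabel("Cost function J(n)")
plt.legend(); plt.show()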
Summary Lecture 1 24
Recap of OSB:
Quadratic cost function, J = E{|e(n)|²}
Definition of u(n), w, R and p
The correlations R and p are assumed to be known in advance.
The gradient vector ∇J
The optimal filter coefficients are given by wo = R⁻¹ p
Optimal (minimal) cost: J_min = σ_d² - p^H R⁻¹ p ≥ 0
Summary Lecture 1 25
Summary of the method of Steepest Descent:
Recursive solution for the Wiener filter
The statistics (R and p) are assumed to be known
The gradient vector ∇J(n) is time-dependent but deterministic,
and points in the direction in which the cost function increases the most.
Recursion of the filter weights: w(n+1) = w(n) + (1/2) μ [-∇J(n)]
The cost function J(n) → J_min as n → ∞, i.e., w(n) → wo as n → ∞
For convergence it is required that the stepsize satisfies 0 < μ < 2/λ_max
The speed of convergence is determined by μ and the eigenvalues of R:
-1 / ln(1 - μλ_max) ≤ τ_a ≤ -1 / ln(1 - μλ_min)
To read 26
Recap of OSB: Hayes, or Haykin chapter 2.
Background on adaptive filters: Haykin, Background and Preview.
Steepest Descent: Haykin chapter 4.
Exercises: 2.1, 2.2, 2.5, 3.1, 3.3, 3.5, 3.7, (3.2, 3.4, 3.6)
Computer exercise, theme: implementation of the method of Steepest Descent.
Examples of adaptive systems 27
Here follows a number of examples of applications that will be discussed
during the course. The material here is far from exhaustive; further
examples can be found in Haykin, Background and Preview. The idea behind
these examples is that you should already now start thinking about how
adaptive filters can be used in different contexts.
Formulations of the type "use a filter of the same order as the system"
deserve a clarification. Normally, the order of the system is not known;
instead, one has to examine several alternatives. If the order is increased
successively, one will in such cases notice that beyond a certain length,
a further increase of the filter length gives no further improvement.
Example: Inverse modeling 28
[Block diagram: an exciting signal drives the system under investigation,
whose output u(n) is fed to an adaptive FIR filter; a delayed version of
the exciting signal serves as the desired signal d(n), and the adaptive
algorithm updates the filter from the error e(n) = d(n) - d̂(n).]
In inverse modeling, the adaptive filter is connected in cascade with the
system under investigation. If the system has only poles, an adaptive
filter of the corresponding order can be used.
Example: Modeling/Identification 29
[Block diagram: the input u(n) is fed both to the system under
investigation, whose output is the desired signal d(n), and to an adaptive
FIR filter connected in parallel; the adaptive algorithm updates the filter
from the error e(n) = d(n) - d̂(n).]
In modeling/identification, the adaptive filter is connected in parallel
with the system under investigation. If that system has only zeros, it is
suitable to use a filter of the corresponding length. If the system has
both poles and zeros, a long adaptive FIR filter is usually required.
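As a sketch of this identification setup, and of the remark above about choosing the filter order, the following (assuming NumPy; the "unknown" FIR system, noise level and lengths are made-up illustration values) fits Wiener filters of increasing length and shows that the residual error stops improving once the length matches the system order:

import numpy as np

rng = np.random.default_rng(1)
N = 20000
u = rng.standard_normal(N)
h = np.array([1.0, -0.6, 0.3, 0.1])       # assumed "unknown" FIR system (length 4)
d = np.convolve(u, h, mode="full")[:N] + 0.01 * rng.standard_normal(N)

for M in range(1, 8):                     # try increasing filter lengths
    U = np.array([u[n - M + 1:n + 1][::-1] for n in range(M - 1, N)])
    dv = d[M - 1:N]
    R = U.T @ U / len(dv)                 # estimated correlation matrix
    p = U.T @ dv / len(dv)                # estimated cross-correlation vector
    w = np.linalg.solve(R, p)             # Wiener filter of length M
    J = np.mean((dv - U @ w) ** 2)        # residual mean squared error
    print(f"M = {M}: J = {J:.5f}")
# J decreases until M reaches the length of h (here 4) and then levels off.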
Example: Echo canceller I 30
[Block diagram: the speech u(n) from Speaker 1 enters the hybrid and also
drives an adaptive FIR filter; the filter output d̂(n) is subtracted from
the signal d(n) returning from the hybrid (which also carries Speaker 2's
speech), and the adaptive algorithm is driven by the error
e(n) = d(n) - d̂(n).]
In telephony, the speech from Speaker 1 leaks through at the hybrid.
Speaker 1 will then hear his or her own voice as an echo, which one wants
to remove. Normally, the adaptation is stopped while Speaker 2 is talking,
but the filtering is performed the whole time.
Example: Echo canceller II 31
[Block diagram: the far-end signal u(n) from Speaker 1 is played through a
loudspeaker; the echo path (room impulse response) carries it to the
microphone, which picks it up together with Speaker 2's speech, giving
d(n); an adaptive FIR filter driven by u(n) produces the echo estimate
d̂(n), and the adaptive algorithm is driven by the error e(n) = d(n) - d̂(n).]
This structure is applicable to speakerphones, video-conferencing systems
and the like. Just as in the telephony case, Speaker 1 hears himself or
herself as an echo. However, the effect is more pronounced here, since the
microphone picks up what is emitted by the loudspeaker. The adaptation is
normally stopped when Speaker 2 is talking, but the filtering is performed
the whole time.
Example: Adaptive Line Enhancer 32
[Block diagram: a periodic signal s(n) is disturbed by colored noise v(n),
giving d(n) = s(n) + v(n); a delayed version of d(n) is used as the input
u(n) to an adaptive FIR filter that predicts d(n), and the adaptive
algorithm is driven by the error e(n) = d(n) - d̂(n).]
Information in the form of a periodic signal is disturbed by colored noise
that is correlated with itself within a certain time span. By delaying the
signal enough that the noise (in u(n) and d(n), respectively) becomes
uncorrelated, the noise can be suppressed through linear prediction.
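A minimal sketch of the line-enhancer idea (assuming NumPy; the sinusoid frequency, MA(1) noise, delay Delta and length M are made-up illustration values, and the Wiener solution is used here in place of an adaptive algorithm):

import numpy as np

rng = np.random.default_rng(2)
N = 20000
s = np.sin(2 * np.pi * 0.05 * np.arange(N))                    # periodic signal of interest
v = np.convolve(rng.standard_normal(N), [1.0, 0.8], mode="full")[:N]   # colored (MA(1)) noise
d = s + v

Delta, M = 5, 32                          # delay and filter length (assumed values)
u = np.concatenate([np.zeros(Delta), d[:-Delta]])              # u(n) = d(n - Delta)

U = np.array([u[n - M + 1:n + 1][::-1] for n in range(M - 1, N)])
dv = d[M - 1:N]
R = U.T @ U / len(dv)
p = U.T @ dv / len(dv)
w = np.linalg.solve(R, p)                 # Wiener predictor of d(n) from delayed samples
s_hat = U @ w                             # enhanced line (the predictable, periodic part)
print("disturbance power in d:    ", np.var(d - s))
print("disturbance power in s_hat:", np.var(s_hat - s[M - 1:N]))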
Example: Channel equalizer 33
[Block diagram: a known pseudo-noise sequence p(n) passes through the
channel C(z) and is disturbed by noise v(n), giving u(n), which is fed to
an adaptive FIR filter; during training, a delayed version of p(n) is used
as the desired signal d(n) (the training switch is closed), and the
adaptive algorithm is driven by the error e(n) = d(n) - d̂(n).]
A known pseudo-noise sequence is used to estimate an inverse model of the
channel (training). The adaptation is then stopped, but the filter
continues to operate on the transmitted signal. The purpose is to remove
the channel's influence on the transmitted signal.