
1

Adaptive Signal Processing


Adaptiivne signaalitöötlus
Leon H. Sibul
Spring semester, 2007
2
Course Outline
I Introduction: Overview of applications and basic concepts of adaptive signal processing.
1. Brief overview of applications
a. Linear prediction.
b. Speech coding
c. Noise cancellation
d. Echo cancellation
e. Adaptive filtering
f. System identification
g. Equalization and deconvolution
h. Adaptive beamforming and array processing
i. Signal separation.
3
2. Introduction to basic concepts of optimization and adaptive signal
processing.
a. Optimization criteria.
Mean square error
Minimum variance.
Maximum signal to noise ratio
Maximum likelihood.
Bit error rate.
b. Introduction to basic adaptive algorithms.
Gradient search.
The least mean-square (LMS) algorithm.
Stochastic approximation.
Nonlinear algorithms.
Linear algebra and orthogonal decomposition algorithms.
3. Matrix notation and basic linear algebra.
4
II Theory of optimum and adaptive systems.
1. Review of discrete-time stochastic processes.
2. Mean-square error
3. Finite impulse response Wiener filters.
4. Gradient descent algorithm.
5. Stability, convergence and properties of error surfaces.
6. Examples of applications.
III Basic adaptive algorithms and their properties.
1. The least mean-square (LMS) algorithm.
a. Derivation of basic LMS algorithm.
b. Learning curve, time constants, misadjustment, and stability.
c. Step size control.
d. Variations of LMS algorithm.
5
2. Recursive least-squares algorithm.
3. Lattice algorithms.
4. Linear algebra and orthogonal decomposition algorithms.
5. Frequency domain algorithms.
IV Applications.
1. Linear prediction and speech coding.
2. Noise cancellation.
3. Echo cancellation.
4. Adaptive beamforming and array processing.
a. Linear adaptive arrays.
b. Constrained adaptive arrays.
Minimum variance with desired-look constraint.
Frost beamformer
c. Generalized sidelobe canceller.
d. Robust adaptive arrays.
6
Bibliography
1. Vary, P. and Martin, R., Digital Speech Transmission: Enhancement,
Coding and Error Concealment, John Wiley & Sons, LTD., Chichester,
England, 2006.
2. Schobben, D. W. E., Real Time Concepts in Acoustics, Kluwer
Academic Publishers, Dordrecht, The Netherlands, 2001.
3. Poularikas, A. D. and Ramadan, Z. M., Adaptive Filter Primer with
MATLAB, CRC, Taylor & Francis, Boca Raton, FL., USA, 2006.
4. Haykin, S., Adaptive Filter Theory, Third Ed., Prentice Hall, Upper
Saddle River, NJ, USA, 1996.
5. Alexander, S.T., Adaptive Signal Processing, Theory and Applications,
Springer-Verlag, New York, USA, 1986.
6. Widrow, B. and Stearns, S.D., Adaptive Signal Processing, Prentice-Hall,
Englewood Cliffs, NJ, USA, 1985.
7
7. Adaptive Signal Processing, Edited by L. H. Sibul, IEEE Press, New
York, USA, 1987.
8. Monzingo, R.A. and Miller, T.W., Introduction to Adaptive Arrays, John
Wiley-Interscience, New York, USA, 1980.
9. Swanson C. D., Signal Processing for Intelligent Sensors, Marcel
Dekker, New York, USA, 2000.
10. Golub, G.H., and Van Loan, C.F., Matrix Computations, The Johns Hopkins
University Press, Baltimore, MD, USA, 1983.
11. Tammeraid, I., Lineaaralgebra rakendused, TTÜ Kirjastus, Tallinn,
Estonia, 1999. Ch. 2, Lineaaralgebra arvutusmeetodid; Sec. 2.3, Singulaarlahutus (singular value decomposition).
12. Van Trees, H.L., Optimum Array Processing, Part IV of Detection,
Estimation and Modulation Theory, Wiley-Interscience, New York,
USA, 2002. Chapter 6, Optimum Waveform Estimation; Chapter 7,
Adaptive Beamformers; Appendix A, Matrix Operations.
8
13. Allen, B., and Ghavami, M., Adaptive Array Systems:
Fundamentals and Applications, Wiley, Chichester, England, 2005.
14. Cichocki, A., and Amari, S.-I., Adaptive Blind Signal and Image
Processing, Wiley, West Sussex, England, 2002.
9
Course requirements and grading:
1. Semester project and report: solution of an applied problem using adaptive signal processing and MATLAB. The choice of topic depends on the student's interests and skills. 60% of the grade.
2. Homework and exercises: 20% of the grade. The homework and exercises must be completed in order to be admitted to the final exam.
3. Oral final exam: covers mainly the semester project and the basic theory. The student may use up to 20 pages of his or her own notes. 20% of the grade.
10
Semester project and report requirements.
1. Introduction: definition of the problem, its application, its importance, and a brief overview of the report.
2. Theory and derivation of the algorithm.
3. The algorithm used and how it solves the given applied problem.
4. MATLAB program.
5. Plots and their explanation.
6. Analysis of the results, explanations, and conclusions.
7. Summary.
8. References.
Notes: Word, PowerPoint, or PDF; about 10 to 15 pages; in Estonian or English.
11
Basic Concepts of Adaptive Signal Processing.
Applications.
Optimization criterion or performance measures.
Adaptive or learning algorithms.
Improved system performance: noise reduction, echo cancellation, ...
[Block diagram: the signal and noise environment feeds the adaptive system; performance measures drive the adaptive or learning algorithm.]
12
Linear Prediction Filter of Order n.
[Figure: tapped-delay-line predictor with delay elements T and coefficients a_1, ..., a_n; input x(k), prediction x-hat(k), prediction error d(k).]

Predicted value: $\hat{x}(k) = \sum_{i=1}^{n} a_i\,x(k-i)$
Prediction error: $d(k) = x(k) - \hat{x}(k)$
Adaptive algorithms minimize the mean-square prediction error:
$E\{d^2(k)\} = E\{(x(k) - \hat{x}(k))^2\}$

Vary and Martin, 2006, Ch. 6; Haykin, 1996, Ch. 6.
13
Optimum Linear Prediction.
Minimize the mean-square error: $E\{d^2(k)\} = E\{(x(k) - \hat{x}(k))^2\}$.

$\frac{\partial E\{d^2(k)\}}{\partial a_i} = 2E\Big\{d(k)\,\frac{\partial d(k)}{\partial a_i}\Big\} = -2E\{d(k)\,x(k-i)\} = 0, \quad i = 1, 2, \dots, n.$

$\frac{\partial^2 E\{d^2(k)\}}{\partial a_i^2} = 2E\{x^2(k-i)\} \ge 0$, so the stationary point is a minimum.

$E\{d(k)\,x(k-i)\} = E\Big\{\Big(x(k) - \sum_{l=1}^{n} a_l\,x(k-l)\Big)\,x(k-i)\Big\} = 0$

$E\{x(k)\,x(k-i)\} - \sum_{l=1}^{n} a_l\,E\{x(k-l)\,x(k-i)\} = 0$

$R_{xx}(i) = \sum_{l=1}^{n} a_l\,R_{xx}(i-l), \quad i = 1, \dots, n.$
14
Optimum Linear Prediction.
Written out, the normal equations are

$\begin{bmatrix} R_{xx}(1)\\ R_{xx}(2)\\ \vdots\\ R_{xx}(n)\end{bmatrix} = \begin{bmatrix} R_{xx}(0) & R_{xx}(-1) & \cdots & R_{xx}(1-n)\\ R_{xx}(1) & R_{xx}(0) & \cdots & R_{xx}(2-n)\\ \vdots & & & \vdots\\ R_{xx}(n-1) & R_{xx}(n-2) & \cdots & R_{xx}(0)\end{bmatrix} \begin{bmatrix} a_1\\ a_2\\ \vdots\\ a_n\end{bmatrix}$

In vector-matrix notation: $\mathbf{r}_{xx} = \mathbf{R}_{xx}\mathbf{a}$, so $\mathbf{a}_{opt} = \mathbf{R}_{xx}^{-1}\mathbf{r}_{xx}$.
$\mathbf{R}_{xx}$ is a positive definite, Toeplitz matrix.

With $\hat{x}(k) = \mathbf{a}^T\mathbf{x}(k-1)$, $\mathbf{x}(k-1) = [x(k-1), x(k-2), \dots, x(k-n)]^T$ and $\mathbf{a} = (a_1, a_2, \dots, a_n)^T$, the prediction-error power is
$\sigma_d^2 = E\{d^2(k)\} = \sigma_x^2 - 2\,\mathbf{a}^T\mathbf{r}_{xx} + \mathbf{a}^T\mathbf{R}_{xx}\mathbf{a}$,
and for $\mathbf{a} = \mathbf{a}_{opt} = \mathbf{R}_{xx}^{-1}\mathbf{r}_{xx}$:
$\sigma_d^2 = \sigma_x^2 - \mathbf{r}_{xx}^T\mathbf{R}_{xx}^{-1}\mathbf{r}_{xx}$.
Prediction gain: $G_p = \sigma_x^2 / \sigma_d^2$.
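As a quick numerical illustration (not from the slides), the sketch below estimates the autocorrelation of an assumed AR(2) test signal with MATLAB's xcorr, solves the normal equations r_xx = R_xx*a, and computes the prediction gain; the signal model and orders are illustrative choices.

N = 5000; n = 2;                           % number of samples, predictor order
x = filter(1,[1 -0.75 0.5],randn(1,N));    % hypothetical AR(2) test signal
r = xcorr(x,n,'biased');                   % autocorrelation estimates, lags -n..n
r = r(n+1:end);                            % keep lags 0..n
Rxx = toeplitz(r(1:n));                    % n-by-n autocorrelation matrix
rxx = r(2:n+1)';                           % right-hand side [R_xx(1); ...; R_xx(n)]
a = Rxx\rxx;                               % optimum predictor coefficients
sigma_d2 = r(1) - rxx'*a;                  % minimum prediction-error power
Gp = r(1)/sigma_d2                         % prediction gain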
15
Linear Prediction and Speech Coding.
[Figure: discrete-time speech production model. An impulse generator (pitch period N_0) and a noise generator feed the voiced/unvoiced switch S; after the gain g, the excitation v(k) drives the variable filter h(k), H(z), producing the speech signal x(k).]

N_0: pitch period
S: voiced/unvoiced switch
g: gain
h(k): impulse response
v(k): excitation signal
x(k): speech signal

Autoregressive (AR) model for speech:
$H(z) = \frac{1}{1 - C(z)}, \qquad C(z) = \sum_{i=1}^{m} c_i\,z^{-i}.$
16
Example of application of Linear Predictor to Speech
Coding.
[Figure: transmitter, channel, receiver. At the transmitter an adaptive analysis filter forms the prediction from x(k) and transmits the prediction error d(k) together with the linear prediction filter coefficients a(k); at the receiver the speech synthesis filter reconstructs y(k).]

Vary and Martin, 2006, Ch. 8.
17
Model-Based Speech Coding.
Speech production model, LP encoder, channel, LP decoder.

Analysis (encoder) output:
$D(z) = \frac{1 - A(z)}{1 - C(z)}\,V(z)$; if $A(z) = C(z)$ then $D(z) = V(z)$ (excitation).

Synthesis filter: $H_s(z) = \frac{1}{1 - A(z)}$, so
$Y(z) = H_s(z)\,D(z) = \frac{1}{1 - C(z)}\,V(z) = X(z).$

Bit rate of encoded speech: B = 2 bits/sample; sampling frequency f_s = 8 kHz; transmission rate = B f_s bits/s.

Vary and Martin, 2006, Ch. 8.
18
Adaptive Noise Canceller.
[Figure: adaptive noise canceller. The primary input is s(k)+n(k) from the signal source and the noise source; an auxiliary sensor supplies the noise reference to an adaptive filter (LMS algorithm), which forms the noise estimate n-hat(k); the output is s(k)+n(k) minus the noise estimate.]

The auxiliary noise sensor obtains a signal-free noise sample.

Widrow and Stearns, 1985.
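A minimal MATLAB sketch of the canceller structure above, under an assumed signal model (sinusoidal s(k), white reference noise, and a short FIR path from the reference sensor to the primary noise); the filter length and step size are illustrative, not values from the slides.

N = 4000; k = 1:N;
s  = sin(2*pi*0.01*k);                 % desired signal
nr = randn(1,N);                       % reference noise (auxiliary sensor)
n0 = filter([1 0.5 -0.3],1,nr);        % noise reaching the primary sensor
d  = s + n0;                           % primary input s(k)+n(k)
M = 8; mu = 0.005; w = zeros(1,M);
shat = zeros(1,N);
for i = M:N
    x1 = nr(i:-1:i-M+1);               % reference data vector
    nhat = w*x1';                      % noise estimate
    e = d(i) - nhat;                   % canceller output = signal estimate
    w = w + 2*mu*e*x1;                 % LMS update
    shat(i) = e;
end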
19
Echo Cancellation for Hands-free Telephone Systems.
[Figure: hands-free telephone. The distant speaker's signal x(k) is played through the loudspeaker (LS); the microphone (M) picks up the local speech s(t), noise n(t), and the distant-speaker echo, giving y(k) = s(k) + n(k) + x~(k); an adaptive filter driven by x(k) produces the echo estimate x-hat(k), which is subtracted, so the output is s-hat(k) = s(k) + n(k) + x~(k) - x-hat(k).]

Vary and Martin, 2006, Ch. 13.
20
System Identification and Modeling.
[Figure: the excitation signal x(k) drives both the plant (unknown system), with output d(k), and the adaptive processor, with output y(k); the error is e(k) = d(k) - y(k).]

x(k) must be a persistently exciting signal.
21
System Identification.
Clark, G., JASA 2007
22
Blind Equalization
[Figure: an unobserved data sequence x(n) passes through the channel h(n); noise v(n) is added; the received signal u(n) enters the blind equalizer, which produces the estimate x-hat(n).]

Minimize intersymbol interference in unknown multipath channels.
23
Bussgang Algorithm for Blind Equalization
[Figure: Bussgang blind equalizer. The received signal u(n) passes through a transversal filter {w-hat(n)} to give y(n); a zero-memory nonlinear estimator g(.) forms x-hat(n) = g(y(n)); the error e(n) = x-hat(n) - y(n) drives the LMS update of the transversal filter weights.]

(Haykin, 1996)
24
Finite Impulse Response (FIR) Wiener Filters.
Observation: $x(k) = s(k) + n(k)$; FIR filter $h(k) \leftrightarrow H(z)$; filter output $\hat{s}(k)$; desired signal $d(k) = s(k)$; error $\varepsilon(k) = s(k) - \hat{s}(k)$.

$\hat{s}(k) = \sum_{l=0}^{N} h(l)\,x(k-l)$

$E\{\varepsilon^2(k)\} = E\{(s(k) - \hat{s}(k))^2\}$
25
Finite Impulse Response (FIR) Wiener Filters.
$\frac{\partial E\{\varepsilon^2(k)\}}{\partial h(i)} = 2E\Big\{(s(k) - \hat{s}(k))\,\frac{\partial (s(k) - \hat{s}(k))}{\partial h(i)}\Big\} = 0, \quad i = 0, 1, \dots, N$

$E\Big\{\sum_{l=0}^{N} h(l)\,x(k-l)\,x(k-i)\Big\} = E\{s(k)\,x(k-i)\}$

By the WSS assumption:
$\sum_{l=0}^{N} h(l)\,R_{xx}(i-l) = R_{sx}(i), \quad i = 0, \dots, N$

In matrix notation: $\mathbf{R}_{xx}\mathbf{h} = \mathbf{r}_{sx}$, where $(\mathbf{R}_{xx})_{il} = R_{xx}(i-l)$, so
$\mathbf{h} = \mathbf{R}_{xx}^{-1}\mathbf{r}_{sx}$.
26
Example of Identification of FIR Filter Coefficients.
[Figure: white-noise input x(n) (from randn) drives the unknown FIR filter under test, filter([1, 0.38], 1, x); measurement noise v(n) (also from randn) is added to give the desired signal d(n); the Wiener filter filter([w0, w1], 1, x) produces d-hat(n), and the error is eps(n) = d(n) - d-hat(n).]
27
MATLAB Example of System Identification.
varx=100;
x=sqrt(varx)*randn(1,20);
>> v=randn(1,20);
>> r=xcorr(x,1,'biased')
r =
10.5420 80.7933 10.5420
>> rx=[80.7933 10.5420];
>> Rx=toeplitz(rx)
Rx =
80.7933 10.5420
10.5420 80.7933
>> y=filter([1 0.38],1,x);
>> dn=y+v;
>> pdx=xcorr(x,dn,'biased');
>> p=pdx(1,19:20)
p =
39.0133 83.4911
>> w=inv(Rx)*p'
w =
0.3541
0.9872
28
Frequency Domain Wiener Filter.
$\hat{s}(k) = h(k) * x(k) \;\leftrightarrow\; \hat{S}(\Omega) = H(\Omega)\,X(\Omega)$

$E\{|\varepsilon(\Omega)|^2\} = E\{(S(\Omega) - \hat{S}(\Omega))(S(\Omega) - \hat{S}(\Omega))^*\}$
$= |H(\Omega)|^2 E\{X(\Omega)X^*(\Omega)\} - H(\Omega)E\{X(\Omega)S^*(\Omega)\} - H^*(\Omega)E\{S(\Omega)X^*(\Omega)\} + E\{S(\Omega)S^*(\Omega)\}$
$= |H(\Omega)|^2 P_{XX}(\Omega) - H(\Omega)P_{XS}(\Omega) - H^*(\Omega)P_{SX}(\Omega) + P_{SS}(\Omega)$
$= P_{XX}(\Omega)\Big|H(\Omega) - \frac{P_{SX}(\Omega)}{P_{XX}(\Omega)}\Big|^2 + P_{SS}(\Omega) - \frac{|P_{SX}(\Omega)|^2}{P_{XX}(\Omega)}$
29
Frequency Domain Wiener Filter.
Optimum Wiener filter:
$H_{OPT}(\Omega) = \frac{P_{SX}(\Omega)}{P_{XX}(\Omega)}$

Minimum error spectrum:
$E\{|\varepsilon(\Omega)|^2\}_{OPT} = P_{SS}(\Omega) - \frac{|P_{SX}(\Omega)|^2}{P_{XX}(\Omega)}$
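A minimal MATLAB sketch (under the assumed signal-plus-noise model below, with simple periodogram averaging over independent blocks) of how the optimum frequency response P_SX/P_XX could be estimated from data; the common 1/L averaging factor cancels in the ratio.

N = 256; L = 200;
Pxx = zeros(1,N); Psx = zeros(1,N);
for m = 1:L
    s = filter(1,[1 -0.9],randn(1,N));  % hypothetical correlated signal
    x = s + 0.5*randn(1,N);             % observation = signal + white noise
    S = fft(s); X = fft(x);
    Pxx = Pxx + abs(X).^2/N;            % periodogram of x
    Psx = Psx + (S.*conj(X))/N;         % cross-periodogram of s and x
end
Hopt = Psx./Pxx;                        % estimated Wiener filter frequency response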
30
Adaptive Array Structure with Known Desired Signal.
[Figure: adaptive array. The sensor outputs x_1(t), ..., x_i(t), ..., x_N(t) are weighted by w_1, ..., w_i, ..., w_N and summed to give the output y(t); the error signal eps(t) = d(t) - y(t) is formed from the reference signal d(t).]
31
The Mean Square Error (MSE) Performance Criterion.
Error signal: $\varepsilon(t) = d(t) - \mathbf{w}^T\mathbf{x}(t)$

Squared error: $\varepsilon^2(t) = d^2(t) - 2\,d(t)\,\mathbf{w}^T\mathbf{x}(t) + \mathbf{w}^T\mathbf{x}(t)\mathbf{x}^T(t)\mathbf{w}$

MSE: $\xi = E\{\varepsilon^2(t)\} = E\{d^2(t)\} - 2\,\mathbf{w}^T\mathbf{r}_{xd} + \mathbf{w}^T\mathbf{R}_{xx}\mathbf{w}$,
where $\mathbf{r}_{xd} = E\{\mathbf{x}(t)\,d(t)\}$ and $\mathbf{R}_{xx} = E\{\mathbf{x}(t)\mathbf{x}^T(t)\}$.

$\nabla_{\mathbf{w}}\,\xi = -2\,\mathbf{r}_{xd} + 2\,\mathbf{R}_{xx}\mathbf{w} = 0$ (Wiener-Hopf equation): $\mathbf{R}_{xx}\mathbf{w}_{opt} = \mathbf{r}_{xd}$

"Wiener solution": $\mathbf{w}_{opt} = \mathbf{R}_{xx}^{-1}\mathbf{r}_{xd}$
32
Minimum Variance (MV) Optimization.
[Figure: adaptive array steered to the signal direction. The sensor outputs x_1(t), ..., x_N(t) pass through steering phase shifts, giving z_i(t); the adaptive weights w_1, w_2, ..., w_N form the output.]

$y(t) = \mathbf{w}^T\mathbf{z}(t), \qquad z_i(t) = \exp(j\phi_i)\,x_i(t)$
33
Minimum Variance (MV) Optimization
Array input: $\mathbf{x}(t) = s(t)\,\mathbf{d} + \mathbf{n}(t)$

Signal direction vector:
$\mathbf{d} = \big[1,\ \exp(j\phi),\ \exp(j2\phi),\ \dots,\ \exp(j(N-1)\phi)\big]^T$
$\phi = \frac{2\pi d}{\lambda}\sin\theta$, d = sensor distance between linear array elements, $\lambda$ = wavelength.

Beam steering matrix: a diagonal matrix whose k-th diagonal entry is $\exp(-jk\phi)$, k = 0, 1, ..., N-1:
$\Phi = \mathrm{diag}\big(1,\ e^{-j\phi},\ e^{-j2\phi},\ \dots,\ e^{-j(N-1)\phi}\big).$
34
Minimum Variance (MV) Optimization
$\mathbf{n}'(t) = \Phi\,\mathbf{n}(t)$ is a unitary transformation that leaves the noise variance of the array output unchanged.

Minimize: $\mathrm{var}\{y(t)\} = \mathbf{w}^T\mathbf{R}_{n'n'}\mathbf{w} = \mathbf{w}^T\mathbf{R}_{nn}\mathbf{w}$
Subject to the constraint: $\mathbf{w}^T\mathbf{1} = 1$, $\mathbf{1} = [1, 1, \dots, 1]^T$.

Constrained optimization problem:
$J = \tfrac{1}{2}\,\mathbf{w}^T\mathbf{R}_{nn}\mathbf{w} + \lambda\,(1 - \mathbf{w}^T\mathbf{1})$
$\nabla_{\mathbf{w}}J = \mathbf{R}_{nn}\mathbf{w} - \lambda\,\mathbf{1} = 0 \;\Rightarrow\; \mathbf{w}_{MV} = \lambda\,\mathbf{R}_{nn}^{-1}\mathbf{1}$

Using the constraint $\mathbf{w}^T\mathbf{1} = 1$: $\lambda = \frac{1}{\mathbf{1}^T\mathbf{R}_{nn}^{-1}\mathbf{1}}$

$\mathbf{w}_{MV} = \frac{\mathbf{R}_{nn}^{-1}\mathbf{1}}{\mathbf{1}^T\mathbf{R}_{nn}^{-1}\mathbf{1}}, \qquad \mathrm{var}\{y(t)\}_{min} = \frac{1}{\mathbf{1}^T\mathbf{R}_{nn}^{-1}\mathbf{1}}$
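A minimal MATLAB sketch of the MV weight computation, assuming an 8-element presteered array, spatially white noise, and a single interferer; the covariance model is an illustrative assumption, not from the slides.

N = 8;
one = ones(N,1);
phi = 2*pi*0.5*sin(30*pi/180)*(0:N-1)';      % interferer phases (half-wavelength spacing)
di  = exp(1j*phi);                           % interferer direction vector after presteering
Rnn = eye(N) + 10*(di*di');                  % noise-plus-interference covariance (assumed)
s   = real(one'*(Rnn\one));                  % 1' Rnn^{-1} 1
wMV = (Rnn\one)/s;                           % w_MV = Rnn^{-1} 1 / (1' Rnn^{-1} 1)
var_min = 1/s                                % minimum output variance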
35
Sidelobe Cancellation (SLC) System.
[Figure: sidelobe canceller. A main channel forms the main beam; N auxiliary sensors are weighted by w_1, ..., w_N, summed, and subtracted from the main channel output; the adaptive weight adjustment* operates on the auxiliary channels.]

*Minimize the cross-correlation between the main and auxiliary channels.
36
Generalized Sidelobe Canceller.
[Figure: generalized sidelobe canceller. The sensor data feed a fixed beamformer w_c and, through the blocking matrix B, an adaptive weight vector w_a; the adaptive branch output is subtracted from the fixed beamformer output.]

B = blocking matrix.
37
Example of Blocking Matrix for Sidelobe Canceller.
[Figure: beam patterns of the main beam and of an auxiliary beam formed through the blocking matrix.]

For blocking of the signal from the desired look direction, the blocking matrix must be orthogonal to the "desired look" steering vector $\mathbf{d}$:
$\mathbf{B}^H\mathbf{d} = \mathbf{0}$, i.e. every row $\mathbf{b}_i^H$ of $\mathbf{B}^H$ satisfies $\mathbf{b}_i^H\mathbf{d} = 0$.

Example: for the broadside look direction $\mathbf{d} = [1, 1, \dots, 1]^T$, rows that difference adjacent sensors, e.g. $\mathbf{b}_1^H = [1, -1, 0, \dots, 0]$, satisfy $\mathbf{b}_1^H\mathbf{d} = 0$.
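A minimal MATLAB check of the orthogonality condition, assuming a 5-element broadside example of the kind sketched above (rows of B difference adjacent sensors); the array size is an illustrative assumption.

d = ones(5,1);                 % "desired look" steering vector (broadside)
B = [1 -1  0  0  0;
     0  1 -1  0  0;
     0  0  1 -1  0;
     0  0  0  1 -1];           % blocking matrix: differences of adjacent sensors
disp(B*d)                      % = zeros(4,1), so the desired-look signal is blocked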


38
Blind Source Separation
[Figure: blind source separation. Independent sources s_1(t), s_2(t), ..., s_r(t) pass through the mixing matrix A to an array of M sensors with noise v_1(t), ..., v_M(t), producing the observations x_1(t), ..., x_M(t); a linear preprocessor T (PCA) produces uncorrelated, normalized data u_1, ..., u_r; the source-separation matrix W (ICA), adjusted by a nonlinear adaptive algorithm, yields the separated sources s-hat_1(t), ..., s-hat_r(t).]
39
Beamforming & Source Separation
[Figure: beamforming and source separation. The data matrix X enters a subspace filter based on the SVD or ULVD; subspace estimation together with eigenstructure and parameter estimation produces TA estimates; the source-separation stage W, applied to the subspace data U_r^H, yields the estimated source signals s-hat_1, s-hat_2, ..., s-hat_M.]
40
Optimization Criteria and Basic Algorithm
Minimize or maximize a scalar performance measure $J(\mathbf{w})$.

Basic adaptive algorithm:
$\mathbf{w}(k+1) = \mathbf{w}(k) + \mu(k)\,\mathbf{d}(k)$
$\mathbf{d}(k)$ = search direction
$\mu(k)$ = step size

Examples:
Steepest descent: $\mathbf{d}(k) = -\nabla J(\mathbf{w}(k))$
LMS: estimated gradient
Stochastic approximation
Newton's and quasi-Newton
41
Common Adaptive Algorithms.
Steepest descent:
$\mathbf{w}(k+1) = \mathbf{w}(k) - \mu\,\nabla_{\mathbf{w}}J[\mathbf{w}(k)]$

Least-mean-squares (LMS) algorithm:
$\mathbf{w}(k+1) = \mathbf{w}(k) + 2\mu\,\varepsilon(k)\,\mathbf{x}(k)$

Estimation and Direct Matrix Inversion (DMI).
Recursive Least-Squares (RLS).
Affine Projection.
42
Error Performance Surface.
[Figure: mesh plot of the error performance surface as a function of two weights.]
43
The Least Mean-Square Algorithm.
Widely used in many adaptive signal processing
applications.
Solves Wiener-Hopf equation without matrix inversion.
Simple to implement.
Convergence, learning curve and stability are well
understood.
Robust.
Basic algorithm has several variations and
improvements.
Widrow and Stearns, 1985; Alexander, 1986; Sibul, 1987; Haykin, 1996; Van Trees,
2002; Poularikas and Ramadan, 2006.
44
Derivation of the LMS Algorithm.
$\varepsilon(k) = d(k) - \mathbf{x}^T(k)\,\mathbf{w}(k)$

The LMS algorithm assumes that the performance measure is $J(k) = \varepsilon^2(k)$.

$\nabla_{\mathbf{w}}J(k) = \frac{\partial\varepsilon^2(k)}{\partial\mathbf{w}} = \Big[\frac{\partial\varepsilon^2(k)}{\partial w_0}, \dots, \frac{\partial\varepsilon^2(k)}{\partial w_L}\Big]^T = 2\,\varepsilon(k)\,\frac{\partial\varepsilon(k)}{\partial\mathbf{w}} = -2\,\varepsilon(k)\,\mathbf{x}(k).$

The LMS weight adjustment algorithm is:
$\mathbf{w}(k+1) = \mathbf{w}(k) - \mu\,\nabla_{\mathbf{w}}J = \mathbf{w}(k) + 2\mu\,\varepsilon(k)\,\mathbf{x}(k)$, $\mu$ = step size.
$\mathbf{w}(k) = [w_0(k), \dots, w_L(k)]^T$ = filter weights at time k.
$\mathbf{x}(k) = [x(k), x(k-1), \dots, x(k-L)]^T$ = input data.
45
The LMS Algorithm for M-th Order Adaptive
Filter.
Inputs:
M = filter length
mu = step-size factor
x(n) = input data to the adaptive filter
w(0) = initialize the weight vector to 0

Outputs:
y(n) = adaptive filter output = $\mathbf{w}^T(n)\,\mathbf{x}(n) \approx d(n)$
$\varepsilon(n) = d(n) - y(n)$ = error

Algorithm: $\mathbf{w}(n+1) = \mathbf{w}(n) + 2\mu\,\varepsilon(n)\,\mathbf{x}(n)$
46
LMS Function.
function [w,y,e,J] = lms(x,dn,mu,M)
% LMS adaptive filter: x = input data, dn = desired signal,
% mu = step size, M = filter length.
N = length(x);
y = zeros(1,N);
e = zeros(1,N);
w = zeros(1,M);
for n = M:N
    x1 = x(n:-1:n-M+1);   % for each n, the data vector x1 of length M
                          % is formed from x with elements in reverse order
    y(n) = w*x1';         % filter output
    e(n) = dn(n) - y(n);  % error
    w = w + 2*mu*e(n)*x1; % LMS weight update
    w1(n-M+1,:) = w(1,:); % save the weight history
end;
J = e.^2;                 % squared-error learning curve
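A minimal usage sketch for the lms function above: it identifies the assumed FIR system [1 0.38] used in the earlier Wiener-filter example; the step size and data length are illustrative choices.

x  = randn(1,2000);
dn = filter([1 0.38],1,x) + 0.1*randn(1,2000);   % desired signal = plant output + noise
[w,y,e,J] = lms(x,dn,0.01,2);
w                                                % converges near [1 0.38]
plot(J); xlabel('n'); ylabel('squared error');   % learning curve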
47
Convergence of the Mean Weight Vector of LMS
$E\{\mathbf{w}(k+1)\} = E\{\mathbf{w}(k)\} + 2\mu\,E\{\varepsilon(k)\,\mathbf{x}(k)\}$

$E\{\varepsilon(k)\,\mathbf{x}(k)\} = E\{(d(k) - \mathbf{x}^T(k)\mathbf{w}(k))\,\mathbf{x}(k)\}$
$= E\{d(k)\,\mathbf{x}(k)\} - E\{\mathbf{x}(k)\mathbf{x}^T(k)\}\,E\{\mathbf{w}(k)\}$   (! assumed that $\mathbf{w}(k)$ and $\mathbf{x}(k)$ are independent)
$= \mathbf{r}_{xd} - \mathbf{R}_{xx}\,E\{\mathbf{w}(k)\}$

$E\{\mathbf{w}(k+1)\} = (\mathbf{I} - 2\mu\mathbf{R}_{xx})\,E\{\mathbf{w}(k)\} + 2\mu\,\mathbf{R}_{xx}\mathbf{w}_{OPT}$, where $\mathbf{w}_{OPT} = \mathbf{R}_{xx}^{-1}\mathbf{r}_{xd}$.

Define $\mathbf{v}(k) = E\{\mathbf{w}(k)\} - \mathbf{w}_{OPT}$:
$\mathbf{v}(k+1) = (\mathbf{I} - 2\mu\mathbf{R}_{xx})\,\mathbf{v}(k)$

With the eigendecomposition $\mathbf{R}_{xx} = \mathbf{U}\Lambda\mathbf{U}^T$ ($\mathbf{U}\mathbf{U}^T = \mathbf{I}$) and $\mathbf{h}(k) = \mathbf{U}^T\mathbf{v}(k)$:
$\mathbf{h}(k+1) = (\mathbf{I} - 2\mu\Lambda)\,\mathbf{h}(k)$
48
Convergence of the Mean Weight Vector of LMS
$\mathbf{h}(k+1) = (\mathbf{I} - 2\mu\Lambda)\,\mathbf{h}(k) \;\Rightarrow\; \mathbf{h}(k) = (\mathbf{I} - 2\mu\Lambda)^k\,\mathbf{h}(0)$

$(\mathbf{I} - 2\mu\Lambda)^k = \mathrm{diag}\big((1-2\mu\lambda_1)^k,\ (1-2\mu\lambda_2)^k,\ \dots,\ (1-2\mu\lambda_L)^k\big)$

$\lim_{k\to\infty}\mathbf{h}(k) = \mathbf{0}$ if $|1 - 2\mu\lambda_m| < 1$ for every mode, i.e. $0 < \mu < \frac{1}{\lambda_{MAX}}$.

Since $\lambda_{MAX} \le \mathrm{tr}(\mathbf{R}_{xx})$, the condition $0 < \mu < \frac{1}{\mathrm{tr}(\mathbf{R}_{xx})}$ is sufficient, and then
$\lim_{k\to\infty} E\{\mathbf{w}(k)\} = \mathbf{w}_{OPT}$.
49
Convergence Rate of the LMS Algorithm.
LMS weight convergence is geometric; for the p-th coordinate the geometric ratio is
$r_p = 1 - 2\mu\lambda_p$.
Writing $r_p = \exp(-1/\tau_p) = 1 - \frac{1}{\tau_p} + \frac{1}{2!}\frac{1}{\tau_p^2} - \dots \approx 1 - \frac{1}{\tau_p}$ for large $\tau_p$, the time constant of the p-th mode is
$\tau_p \approx \frac{1}{2\mu\lambda_p}$.

The fastest and slowest modes are set by $\lambda_{max}$ and $\lambda_{min}$, so convergence is governed by the condition number (eigenvalue spread) of $\mathbf{R}_{xx}$:
$\chi(\mathbf{R}_{xx}) = \frac{\lambda_{max}}{\lambda_{min}}.$
50
Learning Curve and Misadjustment.
MSE: $\xi = E\{\varepsilon^2\} = E\{d^2\} - 2\,\mathbf{w}^T\mathbf{r}_{xd} + \mathbf{w}^T\mathbf{R}_{xx}\mathbf{w}$

Minimum MSE (at $\mathbf{w}_{opt} = \mathbf{R}_{xx}^{-1}\mathbf{r}_{xd}$):
$\xi_{min} = E\{d^2\} - 2\,\mathbf{w}_{opt}^T\mathbf{r}_{xd} + \mathbf{w}_{opt}^T\mathbf{R}_{xx}\mathbf{w}_{opt} = E\{d^2\} - \mathbf{r}_{xd}^T\mathbf{R}_{xx}^{-1}\mathbf{r}_{xd} = E\{d^2\} - \mathbf{r}_{xd}^T\mathbf{w}_{opt}$

Excess MSE:
$\mathrm{EMSE}(k) = \xi(k) - \xi_{min} = (\mathbf{w}(k) - \mathbf{w}_{opt})^T\mathbf{R}_{xx}(\mathbf{w}(k) - \mathbf{w}_{opt}) = \mathbf{v}^T(k)\,\mathbf{R}_{xx}\,\mathbf{v}(k) = \mathbf{v}^T(k)\,\mathbf{U}\Lambda\mathbf{U}^T\,\mathbf{v}(k) = \mathbf{h}^T(k)\,\Lambda\,\mathbf{h}(k)$

(note: $\mathbf{v}(k) = \mathbf{w}(k) - \mathbf{w}_{opt}$, $\mathbf{R}_{xx} = \mathbf{U}\Lambda\mathbf{U}^T$)

Learning curve: $\xi(k) = \xi_{min} + \mathbf{h}^T(k)\,\Lambda\,\mathbf{h}(k)$
51
Misadjustment Due to Gradient Noise

Estimated gradient: $\hat{\nabla}(k) = -2\,\varepsilon(k)\,\mathbf{x}(k) = \nabla(k) + \mathbf{n}(k)$
$\nabla(k)$ = true gradient
$\mathbf{n}(k)$ = zero-mean gradient estimation noise

At the minimum mean-square error (MSE) point $\nabla(k) = 0$ and
$\hat{\nabla}(k) = -2\,\varepsilon(k)\,\mathbf{x}(k) = \mathbf{n}(k)$

Gradient noise covariance:
$E\{\mathbf{n}(k)\mathbf{n}^H(k)\} = 4\,E\{\varepsilon^2(k)\,\mathbf{x}(k)\mathbf{x}^H(k)\} \approx 4\,E\{\varepsilon^2(k)\}\,E\{\mathbf{x}(k)\mathbf{x}^H(k)\} = 4\,\xi_{min}\,\mathbf{R}_{xx}$

($\varepsilon(k)$ and $\mathbf{x}(k)$ are uncorrelated near the optimum.)
52
Misadjustment Due to Gradient Noise
LMS algorithm with noisy gradient:
$\mathbf{w}(k+1) = \mathbf{w}(k) - \mu\,\hat{\nabla}(k) = \mathbf{w}(k) - \mu\,(\nabla(k) + \mathbf{n}(k))$
$\mathbf{v}(k+1) = (\mathbf{I} - 2\mu\mathbf{R}_{xx})\,\mathbf{v}(k) - \mu\,\mathbf{n}(k)$

Transforming by $\mathbf{U}$:
$\mathbf{h}(k+1) = (\mathbf{I} - 2\mu\Lambda)\,\mathbf{h}(k) - \mu\,\mathbf{n}'(k)$, $\mathbf{n}'(k) = \mathbf{U}^T\mathbf{n}(k)$

Close to the optimum $E\{\mathbf{h}(k)\} \approx \mathbf{0}$ (the learning transients have died out).
Using the fact that $E\{\mathbf{h}(k)\,\mathbf{n}'^T(k)\} = \mathbf{0}$, the covariance of $\mathbf{h}(k)$ is:
$E\{\mathbf{h}(k+1)\mathbf{h}^T(k+1)\} = (\mathbf{I} - 2\mu\Lambda)\,E\{\mathbf{h}(k)\mathbf{h}^T(k)\}\,(\mathbf{I} - 2\mu\Lambda) + \mu^2\,E\{\mathbf{n}'(k)\mathbf{n}'^T(k)\}$

Close to the optimum value the $\mathbf{h}(k)$ are wide-sense stationary:
$E\{\mathbf{h}(k+1)\mathbf{h}^T(k+1)\} = E\{\mathbf{h}(k)\mathbf{h}^T(k)\} = \mathbf{R}_{hh}$

$\mathbf{R}_{hh} = 4\mu^2\,\xi_{min}\,\Lambda\,\big[\mathbf{I} - (\mathbf{I} - 2\mu\Lambda)^2\big]^{-1} = \mu\,\xi_{min}\,(\mathbf{I} - \mu\Lambda)^{-1} \approx \mu\,\xi_{min}\,\mathbf{I}$   (for $\mu\lambda_p \ll 1$).
53
Misadjustment Due to Gradient Noise
Excess MSE: $\mathrm{EMSE}(k) = \mathbf{h}^T(k)\,\Lambda\,\mathbf{h}(k)$

Average excess MSE:
$E\{\mathbf{h}^T(k)\Lambda\mathbf{h}(k)\} = \sum_{p=1}^{N}\lambda_p\,E\{h_p^2(k)\} = \sum_{p=1}^{N}\lambda_p\,\frac{\mu\,\xi_{min}}{1 - \mu\lambda_p}$

Misadjustment:
$M = \frac{\text{average excess MSE}}{\xi_{min}} = \sum_{p=1}^{N}\frac{\mu\lambda_p}{1 - \mu\lambda_p}$

If $\mu\lambda_p \ll 1$ for all p (the usual case), then
$M \approx \mu\sum_{p=1}^{N}\lambda_p = \mu\,\mathrm{tr}(\mathbf{R}_{xx})$.

Since the MSE learning-curve time constant of the p-th mode is $\tau_{p,mse} = \frac{1}{4\mu\lambda_p}$, this can also be written
$M = \sum_{p=1}^{N}\frac{1}{4\,\tau_{p,mse}} = \frac{N}{4}\Big(\frac{1}{\tau_{mse}}\Big)_{ave}$.

Long filter: large misadjustment. Fast convergence: large misadjustment.
54
Sensitivity of Square Linear Systems.
Results from numerical analysis.
Let $\mathbf{A}$ be a nonsingular matrix with perturbations $\delta\mathbf{A}$, $\delta\mathbf{b}$ such that $\|\mathbf{A}^{-1}\|\,\|\delta\mathbf{A}\| < 1$. The solution $\mathbf{x} + \delta\mathbf{x}$ of
$(\mathbf{A} + \delta\mathbf{A})(\mathbf{x} + \delta\mathbf{x}) = \mathbf{b} + \delta\mathbf{b}$
approximates the solution of $\mathbf{A}\mathbf{x} = \mathbf{b}$ with the error estimate
$\frac{\|\delta\mathbf{x}\|}{\|\mathbf{x}\|} \le \frac{\kappa(\mathbf{A})}{1 - \kappa(\mathbf{A})\,\|\delta\mathbf{A}\|/\|\mathbf{A}\|}\Big(\frac{\|\delta\mathbf{b}\|}{\|\mathbf{b}\|} + \frac{\|\delta\mathbf{A}\|}{\|\mathbf{A}\|}\Big)$,
where $\|\cdot\|$ denotes norms and $\kappa(\mathbf{A})$ is the condition number of the matrix $\mathbf{A}$:
$\kappa(\mathbf{A}) = \|\mathbf{A}\|\,\|\mathbf{A}^{-1}\|$.
For the 2-norm: $\kappa_2(\mathbf{A}) = \frac{\sigma_{max}}{\sigma_{min}}$ (ratio of singular values).
If $\mathbf{A}$ is Hermitian: $\kappa_2(\mathbf{A}) = \frac{\lambda_{max}}{\lambda_{min}}$.
Large eigenvalue spread causes slow convergence and large errors!
55
Variations of the LMS Algorithm.
1. Basic LMS algorithm.
2. Error sign LMS algorithm.
3. Normalized LMS.
4. Variable step-size LMS.
5. Leaky LMS.
6. Constrained LMS.
Algorithms for constrained beamforming.
7. Block LMS.
8. Transform domain LMS.
9. Complex LMS algorithms.
56
Variations of the LMS Algorithm.
The error-sign LMS algorithm:
$\mathbf{w}(k+1) = \mathbf{w}(k) + 2\mu\,\mathrm{sign}[\varepsilon(k)]\,\mathbf{x}(k)$, where
$\mathrm{sign}(a) = 1$ for $a > 0$, $0$ for $a = 0$, $-1$ for $a < 0$.

Normalized LMS:
$\mathbf{w}(k+1) = \mathbf{w}(k) + \frac{\tilde{\mu}}{\mathbf{x}^T(k)\mathbf{x}(k)}\,\varepsilon(k)\,\mathbf{x}(k)$

Variable step-size LMS:
$w_p(k+1) = w_p(k) + 2\mu_p(k)\,\varepsilon(k)\,x(k-p), \qquad p = 0, 1, \dots, N-1.$
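A minimal MATLAB sketch of the normalized-LMS update above; the small regularization constant delta that guards against division by zero is an added assumption, not part of the slide.

M = 8; mu = 0.5; delta = 1e-6;
x = randn(1,2000); dn = filter(ones(1,M)/M,1,x);   % illustrative plant
w = zeros(1,M); e = zeros(size(x));
for n = M:length(x)
    x1 = x(n:-1:n-M+1);                    % data vector
    e(n) = dn(n) - w*x1';                  % error
    w = w + (mu/(delta + x1*x1'))*e(n)*x1; % step normalized by input energy
end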
57
Time Varying Step Size
$\mu(k) > 0, \qquad \sum_{k=1}^{\infty}\mu(k) = \infty, \qquad \sum_{k=1}^{\infty}\mu^2(k) < \infty$

These conditions are satisfied by sequences
$\mu(k) = \frac{c}{k^{b}}, \qquad 0.5 < b \le 1.0.$

Example:
$\{\mu(k)\} = c\,\Big\{1, \frac{1}{2}, \frac{1}{3}, \dots, \frac{1}{k}, \dots\Big\}$
58
Leaky LMS Algorithm.
The Wiener optimum weight calculation $\mathbf{w}_{opt} = \mathbf{R}_{xx}^{-1}\mathbf{r}_{xd}$ requires inverting a possibly ill-conditioned matrix: this causes numerical errors and slow convergence.
If $\lambda_p = 0$, the mode $(1 - 2\mu\lambda_p)^k$ does not converge; small $\lambda_p$ gives slow convergence.

Leaky LMS algorithm:
$\mathbf{w}(k+1) = (1 - 2\mu\gamma)\,\mathbf{w}(k) + 2\mu\,\varepsilon(k)\,\mathbf{x}(k), \qquad \gamma > 0.$
Using $\varepsilon(k) = d(k) - \mathbf{x}^T(k)\mathbf{w}(k)$ we have:
$\mathbf{w}(k+1) = \big[\mathbf{I} - 2\mu(\mathbf{x}(k)\mathbf{x}^T(k) + \gamma\mathbf{I})\big]\,\mathbf{w}(k) + 2\mu\,d(k)\,\mathbf{x}(k).$
59
Leaky LMS Algorithm.
Mean weights:
$E\{\mathbf{w}(k+1)\} = \big[\mathbf{I} - 2\mu(\mathbf{R}_{xx} + \gamma\mathbf{I})\big]\,E\{\mathbf{w}(k)\} + 2\mu\,\mathbf{r}_{xd}$

If $0 < \mu < \frac{1}{\lambda_{max} + \gamma}$ the leaky LMS algorithm converges:
$\lim_{k\to\infty} E\{\mathbf{w}(k)\} = (\mathbf{R}_{xx} + \gamma\mathbf{I})^{-1}\,\mathbf{r}_{xd}$, a biased Wiener solution.

The algorithm is also used for robust array processing.
What is the excess MSE?
60
Block LMS Algorithm
Input signal vector: $\mathbf{x}(n) = [x(n), x(n-1), \dots, x(n-M+1)]^T$

The input is processed in blocks of B samples; block boundaries are at the time samples 0, B, 2B, 3B, ...
k = block index, n = sample time, B = number of samples in a block, M = FIR filter length.
Sample time: $n = kB + i$, $i = 0, 1, \dots, B-1$, $k = 0, 1, \dots$

Filter weights (held constant over block k): $\mathbf{w}(k) = [w_0(k), w_1(k), \dots, w_{M-1}(k)]^T$
Output: $y(n) = y(kB+i) = \mathbf{w}^T(k)\,\mathbf{x}(kB+i)$
Error: $\varepsilon(n) = d(n) - y(n)$, i.e. $\varepsilon(kB+i) = d(kB+i) - y(kB+i)$
61
Block LMS Algorithm
In the block LMS algorithm the error signal is averaged over blocks of B samples:

$\mathbf{w}(k+1) = \mathbf{w}(k) + \mu_B\sum_{i=0}^{B-1}\varepsilon(kB+i)\,\mathbf{x}(kB+i)$

or, equivalently,
$\mathbf{w}(k+1) = \mathbf{w}(k) + \tfrac{1}{2}\,\mu_B\,B\,\hat{\nabla}_{AVE}(k)$,
where $\hat{\nabla}_{AVE}(k) = \frac{2}{B}\sum_{i=0}^{B-1}\varepsilon(kB+i)\,\mathbf{x}(kB+i)$ is the gradient estimate averaged over block k.
62
Fast FFT Based LMS Algorithm.
Fast linear convolution and fast linear correlation are computed with the FFT.

The M tap weights of the FIR filter are padded with M zeros; the FFT length is N = 2M.
Frequency-domain weight vector: $\mathbf{W}(k) = \mathrm{FFT}\begin{bmatrix}\mathbf{w}(k)\\ \mathbf{0}\end{bmatrix}$, $\mathbf{0}$ = M-by-1 null vector.

Input blocks (the (k-1)-th and k-th blocks together):
$\mathbf{X}(k) = \mathrm{FFT}\big[x(kM-M), \dots, x(kM-1), x(kM), \dots, x(kM+M-1)\big]$

Output block:
$\mathbf{y}(k) = [y(kM), y(kM+1), \dots, y(kM+M-1)]^T$ = last M elements of $\mathrm{IFFT}\big[\mathbf{X}(k)\odot\mathbf{W}(k)\big]$,
where $\odot$ denotes element-by-element multiplication.
63
Fast FFT Based LMS Algorithm. II
Define the diagonal matrix formed from the elements of $\mathbf{X}(k)$:
$\mathbf{U}(k) = \mathrm{diag}\big[X_k(0),\ X_k(1),\ \dots,\ X_k(N-1)\big]$

$\mathbf{y}(k) = [y(kM), y(kM+1), \dots, y(kM+M-1)]^T$ = last M elements of $\mathrm{IFFT}\big[\mathbf{U}(k)\,\mathbf{W}(k)\big]$

$\mathbf{e}(k) = [\varepsilon(kM), \varepsilon(kM+1), \dots, \varepsilon(kM+M-1)]^T$, $\varepsilon(n) = d(n) - y(n)$

$\mathbf{E}(k) = \mathrm{FFT}\begin{bmatrix}\mathbf{0}\\ \mathbf{e}(k)\end{bmatrix}$

$\boldsymbol{\varphi}(k)$ = first M elements of $\mathrm{IFFT}\big[\mathbf{U}^H(k)\,\mathbf{E}(k)\big]$

$\mathbf{W}(k+1) = \mathbf{W}(k) + \mu\,\mathrm{FFT}\begin{bmatrix}\boldsymbol{\varphi}(k)\\ \mathbf{0}\end{bmatrix}$
64
Recursive Least-Squares Algorithm.
N samples, M+1 filter coefficients. $\mathbf{X}_n$ is the data matrix whose rows are the lagged data vectors $\mathbf{z}_i^H = [x_i, x_{i-1}, \dots, x_{i-M}]$ available up to time n.

When the new data vector $\mathbf{z}_{n+1}$ arrives:
$\mathbf{X}_{n+1} = \begin{bmatrix}\mathbf{X}_n\\ \mathbf{z}_{n+1}^H\end{bmatrix}, \qquad \mathbf{X}_{n+1}^H\mathbf{X}_{n+1} = \mathbf{X}_n^H\mathbf{X}_n + \mathbf{z}_{n+1}\mathbf{z}_{n+1}^H$

Using the matrix inversion lemma
$[\mathbf{A} + \mathbf{B}\mathbf{C}\mathbf{D}]^{-1} = \mathbf{A}^{-1} - \mathbf{A}^{-1}\mathbf{B}\,[\mathbf{C}^{-1} + \mathbf{D}\mathbf{A}^{-1}\mathbf{B}]^{-1}\,\mathbf{D}\mathbf{A}^{-1}$:

$[\mathbf{X}_{n+1}^H\mathbf{X}_{n+1}]^{-1} = [\mathbf{X}_n^H\mathbf{X}_n]^{-1} - \frac{[\mathbf{X}_n^H\mathbf{X}_n]^{-1}\mathbf{z}_{n+1}\mathbf{z}_{n+1}^H[\mathbf{X}_n^H\mathbf{X}_n]^{-1}}{1 + \mathbf{z}_{n+1}^H[\mathbf{X}_n^H\mathbf{X}_n]^{-1}\mathbf{z}_{n+1}} = \big[\mathbf{I} - \mathbf{K}_{n+1}\mathbf{z}_{n+1}^H\big]\,[\mathbf{X}_n^H\mathbf{X}_n]^{-1}$

with the gain vector
$\mathbf{K}_{n+1} = \frac{[\mathbf{X}_n^H\mathbf{X}_n]^{-1}\mathbf{z}_{n+1}}{1 + \mathbf{z}_{n+1}^H[\mathbf{X}_n^H\mathbf{X}_n]^{-1}\mathbf{z}_{n+1}}$
65
Recursive Least-Squares Algorithm
With $\mathbf{y}_{n+1} = \begin{bmatrix}\mathbf{y}_n\\ y_{n+1}\end{bmatrix}$ and $\mathbf{X}_{n+1} = \begin{bmatrix}\mathbf{X}_n\\ \mathbf{z}_{n+1}^H\end{bmatrix}$, the least-squares weight vector is

$\mathbf{W}_{n+1} = [\mathbf{X}_{n+1}^H\mathbf{X}_{n+1}]^{-1}\,\mathbf{X}_{n+1}^H\,\mathbf{y}_{n+1}$
$= \big[\mathbf{I} - \mathbf{K}_{n+1}\mathbf{z}_{n+1}^H\big]\,[\mathbf{X}_n^H\mathbf{X}_n]^{-1}\big(\mathbf{X}_n^H\mathbf{y}_n + \mathbf{z}_{n+1}\,y_{n+1}\big)$
$= \mathbf{W}_n + \mathbf{K}_{n+1}\big(y_{n+1} - \mathbf{z}_{n+1}^H\,\mathbf{W}_n\big)$

with
$\mathbf{K}_{n+1} = \frac{[\mathbf{X}_n^H\mathbf{X}_n]^{-1}\mathbf{z}_{n+1}}{1 + \mathbf{z}_{n+1}^H[\mathbf{X}_n^H\mathbf{X}_n]^{-1}\mathbf{z}_{n+1}}$.
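A minimal MATLAB sketch of the RLS recursion above, where P stands for [X_n^H X_n]^{-1}; initializing P as a large diagonal matrix (rather than forming an exact inverse from an initial data block) is an assumption, not from the slides.

M = 4; N = 1000;
x = randn(1,N); d = filter([1 0.5 -0.3 0.1],1,x);   % illustrative plant output
w = zeros(M,1); P = 1e4*eye(M);
for n = M:N
    z = x(n:-1:n-M+1)';              % new data vector z_{n+1}
    K = (P*z)/(1 + z'*P*z);          % gain vector K_{n+1}
    e = d(n) - z'*w;                 % a priori error
    w = w + K*e;                     % W_{n+1} = W_n + K_{n+1}(y_{n+1} - z^H W_n)
    P = (eye(M) - K*z')*P;           % update of [X^H X]^{-1}
end
w                                    % approaches [1; 0.5; -0.3; 0.1]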
66
Approximations to RLS
Projection algorithm:
$\mathbf{K}_{n+1}^{PA} = \frac{\alpha\,\mathbf{z}_{n+1}}{\gamma + \mathbf{z}_{n+1}^H\mathbf{z}_{n+1}}, \qquad 0 < \alpha < 2,\; 0 < \gamma < 1$

Stochastic approximation:
$\mathbf{K}_{n+1}^{SA} = \frac{\alpha(n)\,\mathbf{z}_{n+1}}{\mathbf{z}_{n+1}^H\mathbf{z}_{n+1}}, \qquad 0 < \alpha(n) < 2$

LMS algorithm:
$\mathbf{K}_{n+1}^{LMS} = 2\mu\,\mathbf{z}_{n+1}$
67
Constraints to Maintain Look-Direction
Frequency Response.
[Figure: Frost constrained beamformer. Each of the K sensors feeds a J-tap delay line; the JK weights w_1, w_2, ..., w_{JK} are summed to form the array output y(k). For a signal s arriving from the look direction, the array is equivalent to a single look-direction filter with tap weights f_1, f_2, ..., f_J.]

Frost, 1972.
68
Constraints to Maintain Look-Direction
Frequency Response.
Weight constraint to maintain the desired look-direction frequency response:
$\mathbf{c}_j^T\mathbf{w} = f_j, \qquad j = 1, 2, \dots, J =$ number of filter taps,
where $\mathbf{c}_j = [0, \dots, 0,\ 1, \dots, 1,\ 0, \dots, 0]^T$ has ones in the j-th group of K elements.

$\mathbf{C} = [\mathbf{c}_1, \mathbf{c}_2, \dots, \mathbf{c}_j, \dots, \mathbf{c}_J]$

Define the constraint:
$\mathbf{C}^T\mathbf{w} = \mathbf{f}, \qquad \mathbf{f} = [f_1, f_2, \dots, f_J]^T.$
69
Constrained Optimization
Minimize $E\{y^2\} = E\{\mathbf{w}^T\mathbf{x}\mathbf{x}^T\mathbf{w}\} = \mathbf{w}^T\mathbf{R}_{xx}\mathbf{w}$
Subject to the constraint $\mathbf{C}^T\mathbf{w} = \mathbf{f}$.

$J(\mathbf{w}) = \tfrac{1}{2}\,\mathbf{w}^T\mathbf{R}_{xx}\mathbf{w} + \boldsymbol{\lambda}^T(\mathbf{C}^T\mathbf{w} - \mathbf{f})$
$\nabla_{\mathbf{w}}J = \mathbf{R}_{xx}\mathbf{w} + \mathbf{C}\boldsymbol{\lambda} = 0 \;\Rightarrow\; \mathbf{w} = -\mathbf{R}_{xx}^{-1}\mathbf{C}\boldsymbol{\lambda}$

Using the constraint: $\mathbf{C}^T\mathbf{w} = -\mathbf{C}^T\mathbf{R}_{xx}^{-1}\mathbf{C}\boldsymbol{\lambda} = \mathbf{f} \;\Rightarrow\; \boldsymbol{\lambda} = -[\mathbf{C}^T\mathbf{R}_{xx}^{-1}\mathbf{C}]^{-1}\mathbf{f}$

$\mathbf{w}_{opt} = \mathbf{R}_{xx}^{-1}\mathbf{C}\,[\mathbf{C}^T\mathbf{R}_{xx}^{-1}\mathbf{C}]^{-1}\,\mathbf{f}$

Frost, 1972
70
Derivation of the Constrained Adaptive Algorithm
$\mathbf{w}(k+1) = \mathbf{w}(k) - \mu\,\nabla_{\mathbf{w}}J[\mathbf{w}(k)] = \mathbf{w}(k) - \mu\big[\mathbf{R}_{xx}\mathbf{w}(k) + \mathbf{C}\boldsymbol{\lambda}(k)\big]$

$\mathbf{w}(k+1)$ must satisfy the constraint: $\mathbf{C}^T\mathbf{w}(k+1) = \mathbf{f}$.
Solving for the Lagrange multiplier and substituting it into the weight iteration equation:
$\mathbf{w}(k+1) = \mathbf{w}(k) - \mu\big[\mathbf{I} - \mathbf{C}(\mathbf{C}^T\mathbf{C})^{-1}\mathbf{C}^T\big]\mathbf{R}_{xx}\mathbf{w}(k) + \mathbf{C}(\mathbf{C}^T\mathbf{C})^{-1}\big[\mathbf{f} - \mathbf{C}^T\mathbf{w}(k)\big]$

Define the KJ-dimensional vector $\mathbf{f}_c = \mathbf{C}(\mathbf{C}^T\mathbf{C})^{-1}\mathbf{f}$ and the KJ-by-KJ-dimensional projection matrix $\mathbf{P} = \mathbf{I} - \mathbf{C}(\mathbf{C}^T\mathbf{C})^{-1}\mathbf{C}^T$.

The deterministic constrained gradient descent algorithm is:
$\mathbf{w}(k+1) = \mathbf{P}\big[\mathbf{w}(k) - \mu\,\mathbf{R}_{xx}\mathbf{w}(k)\big] + \mathbf{f}_c, \qquad \mathbf{w}(0) = \mathbf{f}_c$

Stochastic constrained LMS algorithm:
$\mathbf{w}(k+1) = \mathbf{P}\big[\mathbf{w}(k) - \mu\,y(k)\,\mathbf{x}(k)\big] + \mathbf{f}_c$
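A minimal MATLAB sketch of the stochastic constrained (Frost) LMS update above for K = 4 sensors and J = 3 taps; the presteered data model and the look-direction filter f are illustrative assumptions, not values from the slides.

K = 4; J = 3; mu = 0.001;
C = kron(eye(J),ones(K,1));      % column c_j has ones in the j-th group of K taps
f = [1; 0; 0];                   % assumed look-direction filter response
F = C*((C'*C)\f);                % f_c = C (C'C)^{-1} f
P = eye(K*J) - C*((C'*C)\C');    % P = I - C (C'C)^{-1} C'
w = F;                           % w(0) = f_c
for k = 1:5000
    X  = randn(K,J);             % presteered tap-delay-line data (illustrative)
    xv = X(:);                   % stacked KJ-dimensional data vector x(k)
    y  = w'*xv;                  % array output y(k)
    w  = P*(w - mu*y*xv) + F;    % w(k+1) = P[w(k) - mu y(k) x(k)] + f_c
end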
71
>> %Levinson-Durbin Recursion.
>> r=[1,0.5,0.25,0.0625]; %auto-correlation sequence
>> a=levinson(r,3)
a =
1.0000 -0.5000 -0.0417 0.0833
>> h=filter(1,a,[1 zeros(1,25)]);
>> stem(h)
>> [bb,aa]=prony(h,3,3)
bb =
1.0000 0.0000 -0.0000 0.0000
aa =
1.0000 -0.5000 -0.0417 0.0833
>>
72
[Figure: stem plot of the impulse response h(n), n = 0, ..., 30, of the all-pole filter computed by the Levinson recursion.]
