Optimization
by Rainer Storn
Siemens AG, ZFE T SN2, Otto-Hahn Ring 6, D-81739 Muenchen, Germany, currently on leave at ICSI,
1947 Center Street, Berkeley, CA 94704, storn@icsi.berkeley.edu
$x_{r_1,G}$ which is perturbed to yield $v_{i,G+1}$ has no relation to $x_{i,G}$ but is a randomly chosen population member. Fig. 1 shows a two-dimensional example that illustrates the different vectors which play a part in the vector-generation scheme. The notation DE/rand/1 specifies that the vector to be perturbed is randomly chosen, and that the perturbation consists of one weighted difference vector.

[Figure 1: contour lines of an objective function over the axes X0 and X1, with the minimum marked. Crosses mark the NP parameter vectors from generation G, among them $x_{i,G}$, $x_{r_1,G}$, $x_{r_2,G}$ and $x_{r_3,G}$. The newly generated parameter vector is $v_{i,G+1} = x_{r_1,G} + F(x_{r_2,G} - x_{r_3,G})$.]
Fig. 1: An example of a two-dimensional objective function showing its contour lines and the process for generating $v_{i,G+1}$ in scheme DE/rand/1.

In order to increase the potential diversity of the perturbed parameter vectors, crossover is introduced. To this end, the vector

$$ u_{i,G+1} = (u_{0i,G+1}, u_{1i,G+1}, \ldots, u_{(D-1)i,G+1}) \tag{3} $$

with

$$ u_{ji,G+1} = \begin{cases} v_{ji,G+1} & \text{for } j = \langle n \rangle_D, \langle n+1 \rangle_D, \ldots, \langle n+L-1 \rangle_D \\ x_{ji,G} & \text{for all other } j \in [0, D-1] \end{cases} \tag{4} $$

is formed. The acute brackets $\langle \cdot \rangle_D$ denote the modulo function with modulus D. The starting index, n, in (4) is a randomly chosen integer from the interval [0, D-1]. The integer L, which denotes the number of parameters that are going to be exchanged, is drawn from the interval [1, D]. The algorithm which determines L works according to the following lines of pseudo code, where rand() is supposed to generate a random number ∈ [0,1):

    L = 0;
    do {
        L = L + 1;
    } while ((rand() < CR) AND (L < D));

Hence the probability $\Pr(L \ge \nu) = (CR)^{\nu-1}$, $\nu > 0$. CR is taken from the interval [0, 1] and constitutes a control variable in the design process. The random decisions for both n and L are made anew for each newly generated vector $u_{i,G+1}$.

To decide whether or not it should become a member of generation G+1, the new vector $u_{i,G+1}$ is compared to $x_{i,G}$. If vector $u_{i,G+1}$ yields a smaller objective function value than $x_{i,G}$, then $x_{i,G+1}$ is set to $u_{i,G+1}$; otherwise, the old value $x_{i,G}$ is retained.

3 Scheme DE/best/1

Basically, scheme DE/best/1 works the same way as DE/rand/1 except that it generates the vector $v_{i,G+1}$ according to:

$$ v_{i,G+1} = x_{best,G} + F \cdot (x_{r_1,G} - x_{r_2,G}) \tag{5} $$

This time, the vector to be perturbed is the best performing vector of the current generation. Again, the computation of $u_{i,G+1}$ is defined by eq. (4). This will also be the case for the remaining variants.

4 Scheme DE/best/2

Scheme DE/best/2 uses two difference vectors as a perturbation:

$$ v_{i,G+1} = x_{best,G} + F \cdot (x_{r_1,G} + x_{r_2,G} - x_{r_3,G} - x_{r_4,G}) \tag{6} $$

Due to the central limit theorem, the random variation is shifted slightly towards a Gaussian distribution, which seems to be beneficial for many functions.

5 Scheme DE/rand-to-best/1

Scheme DE/rand-to-best/1 places the perturbation at a location between a randomly chosen population member and the best population member:
$$ v_{i,G+1} = x_{i,G} + \lambda \cdot (x_{best,G} - x_{i,G}) + F \cdot (x_{r_2,G} - x_{r_3,G}) \tag{7} $$

λ controls the greediness of the scheme. To reduce the number of control variables we usually set λ = F.

6 Rules for the usage of DE

Since its invention [1], DE has been tested extensively against artificial and real-world minimization problems. So far, the following set of linguistic rules has proven useful when it comes to choosing the control variables F, CR and NP:

• At initialization the population should be spread as much as possible over the objective function surface.
• Most often the crossover probability CR ∈ [0, 1] must be considerably lower than one (e.g. 0.3). If no convergence can be achieved, however, CR ∈ [0.8, 1] often helps.
• For many applications NP=10*D is a good choice. F is usually chosen ∈ [0.5, 1].
• The higher the population size NP is chosen, the lower one should choose the weighting factor F.
• Watching the parameters: it is a good convergence sign if the parameters of the best population member change a lot from generation to generation, especially at the beginning of the minimization and even if the objective function value of the best population member decreases slowly.
• Watching the objective function: it is not [...] an indication that the minimization might take a long time or that the increase of the population size NP might be beneficial for convergence.
• The objective function value of the best population member shouldn't drop too fast, otherwise the optimization might get stuck in a local minimum.
• The proper choice of the objective function is crucial. The more knowledge one includes, the more likely the minimization is going to converge. The sum of error squares is not always a good choice as it has the potential to hide the path to the global minimum. To minimize the maximum error is often a better objective but seems to yield more local minima.

7 Design of a howling remover

In order to demonstrate DE's applicability to real-world problems, a howling removal unit has been designed with DE. In modern audio communication applications, hands-free environments, in which headsets are replaced by loudspeakers and microphones, are the current trend. The preferred way of audio communication is full duplex, i.e. all loudspeakers and microphones are active, as opposed to half duplex or "walkie talkie" mode, where only one party is allowed to talk at a time. Howling is one of the problems in full duplex communication and builds up due to the acoustic feedback path. One way to reduce howling is to frequency-shift the signal that is picked up by a microphone by 10Hz to 20Hz before it is sent to the other communicating parties. This shift is usually not perceived as unnatural by the human ear. The shifted signal appears at the destination loudspeakers and travels back to the originator, shifted by another 10Hz to 20Hz. The signal travels many times through this acoustic path and is quickly shifted out of band, thus reducing the feedback problems. Fig. 2 shows the block diagram of the howling removal unit.

[Figure 2: block diagram; the legible block labels are "Bandpass" and "Downsampler".]
Fig. 2: Howling removal unit.

The upsampler fills in three zero samples between the adjacent signal samples $x_k$ and $x_{k+1}$. The bandpass, which operates at four times the sampling frequency $\omega_A$, retains the components of the spectrum which are supposed to be frequency-shifted by Δω. The actual shift is performed via multiplication with an
appropriate cosine signal. The lowpass removes some artifacts which appear due to the shifting operation, and the downsampler takes every fourth sample of the lowpass result to yield the output signal $y_k$ at the original sampling frequency.

One of the most important features of the howling remover is low computational complexity as the unit has to operate in a real-time environment. Therefore it was crucial to design a bandpass as well as a lowpass with minimum degree. To this end the bandpass was chosen to [...]

a) For all parameters par[i]:
[...]
p4 = 2 + par[i]*100 if par[i] >= 1
e) For all p[i] denoting a pole angle:
p5 = 2 + par[i]*100 if par[i] ∈ "stopband of filter"

To determine the deviation from the tolerance scheme the magnitude response of the corresponding filter was sampled in the frequency domain. The number of samples that were used are indicated in figs. 3 and 4 which also show the final results of the design procedure.

[Figs. 3 and 4: magnitude responses plotted against the tolerance scheme, vertical axis "Magnitude". Legible annotations: passband (40 samples), two transition bands (10 samples each), and two stopbands, next to which sample counts of 10 and 20 appear.]

The bandpass filter result of fig. 3 was obtained using strategy DE/best/2 with NP=300, F=0.5 and CR=1. The entire design took 2,210,400 evaluations of the bandpass transfer function. A total of 24 parameters was used, 6 zero radii, 6 zero angles, 6 pole radii and 6 pole angles in the complex z-plane [3]. The overall gain constant a0
was set to 0.005. The magnitude response of the filter still violates some parts of the tolerance scheme slightly, yet the design was satisfactory for the howling remover.

The lowpass filter result of fig. 4 was obtained using strategy DE/best/2 with NP=200, F=0.5 and CR=1. The entire design took 83,800 evaluations of the lowpass transfer function. A total of 16 parameters was used, 8 zero radii and 8 zero angles in the complex z-plane [3]. The overall gain constant a0 was set to 0.005.

The third optimization for the howling remover was concerned with the cosine function, the evaluation of which takes up a non-negligible amount of computing time. Hence an approximation of cos(x) in the interval $[0, \pi/2]$ was performed using a polynomial opti(x) of third degree. Fig. 5 shows that opti(x) yields an improved approximation compared to the Taylor polynomial taylor(x) of third degree.

[...] interval $x \in [0, \pi/2]$. The strategy used was DE/best/1 with NP=20, F=0.9 and CR=1. It took 30,020 function evaluations to get the result of fig. 5. The final speed increase of the cosine function computed by opti(x) was 17% compared to the library function cos(x).

Conclusion

Several variants of Differential Evolution (DE) have been introduced and general hints about their usage have been provided. Three real-world design tasks appearing in the development of a howling remover for audio communications have been solved successfully by applying DE. All three design tasks could have been performed with specialized design tools; the advantage of using DE, however, was that neither specialized and most probably expensive tools nor expert knowledge concerning the design tasks themselves was necessary.
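As a rough check of the cosine approximation discussed above, the following Python sketch evaluates the polynomial with the coefficients printed in Fig. 5 and compares its worst-case error on [0, π/2] with that of the third-degree Taylor polynomial of cos(x), which is 1 - x^2/2. The function names, the evenly spaced grid and its resolution are assumptions made for illustration. They are not taken from the paper, and the sketch says nothing about the reported speed increase.

```python
import math

# Coefficients of opti(x) as printed in Fig. 5, constant term first.
OPTI_COEFFS = (0.9975575805, 0.03400468081, -0.6044035554, 0.1129638031)


def opti(x):
    """Third-degree polynomial from Fig. 5, evaluated with Horner's rule."""
    c0, c1, c2, c3 = OPTI_COEFFS
    return c0 + x * (c1 + x * (c2 + x * c3))


def taylor(x):
    """Taylor polynomial of cos(x) about 0 up to third degree (the x^3 term is zero)."""
    return 1.0 - 0.5 * x * x


def max_abs_error(approx, samples=1001):
    """Largest |approx(x) - cos(x)| on an evenly spaced grid over [0, pi/2]."""
    step = (math.pi / 2) / (samples - 1)
    return max(abs(approx(k * step) - math.cos(k * step)) for k in range(samples))


if __name__ == "__main__":
    print("max error opti  :", max_abs_error(opti))
    print("max error taylor:", max_abs_error(taylor))
```

The Taylor polynomial is exact at x = 0 and degrades towards π/2, whereas the optimized polynomial spreads a much smaller error over the whole interval.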
[Figure 5: plot of cos(x), taylor(x) and opti(x); vertical axis "cos(x), taylor(x), p(x)" from -1 to 1, horizontal axis "x" from 0 to π. The plotted polynomials are
opti(x) = 0.9975575805 + 0.03400468081*x - 0.6044035554*x^2 + 0.1129638031*x^3
taylor(x) = 1 - 0.5*x^2]
Fig. 5: Approximation of cos(x) by means of a polynomial of third order.

References

[1] Storn, R. and Price, K., Differential Evolution - a simple and efficient adaptive scheme for global optimization over continuous spaces, Technical Report TR-95-012, ICSI, http://http.icsi.berkeley.edu/~storn/litera.html

[2] Storn, R. and Price, K., Minimizing the real functions of the ICEC'96 contest by Differential Evolution, Int. Conf. on Evolutionary Computation, Nagoya, Japan.

[3] Mitra, S.K. and Kaiser, J.F., Handbook for digital signal processing, John Wiley, 1993.
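As a closing illustration, the schemes of sections 2 to 5 (DE/rand/1, DE/best/1, DE/best/2, DE/rand-to-best/1), the crossover of eq. (4) and the selection rule can be put together in a short Python sketch. This is a minimal sketch under stated assumptions, not the implementation used for the designs reported here. The function names, the uniform initialization inside user-supplied bounds, the default values F=0.8, CR=0.3 and NP=10*D, the choice of mutually distinct random indices different from i, and the default λ = F are all assumptions made for illustration.

```python
import random


def crossover_span(dim, cr, rng):
    """Return (n, L): start index n in [0, D-1] and segment length L in [1, D]."""
    n = rng.randrange(dim)
    length = 0
    while True:  # emulates: do { L = L + 1; } while (rand() < CR and L < D)
        length += 1
        if not (rng.random() < cr and length < dim):
            return n, length


def mutate(pop, best, i, f, scheme, rng, lam=None):
    """Build the perturbed vector v for target index i under the named scheme."""
    lam = f if lam is None else lam  # assumption: lambda defaults to F
    # Assumption: random indices are mutually distinct and differ from i.
    r1, r2, r3, r4 = rng.sample([k for k in range(len(pop)) if k != i], 4)
    x, dims = pop, range(len(pop[i]))
    if scheme == "rand/1":
        return [x[r1][j] + f * (x[r2][j] - x[r3][j]) for j in dims]
    if scheme == "best/1":
        return [best[j] + f * (x[r1][j] - x[r2][j]) for j in dims]
    if scheme == "best/2":
        return [best[j] + f * (x[r1][j] + x[r2][j] - x[r3][j] - x[r4][j]) for j in dims]
    if scheme == "rand-to-best/1":
        return [x[i][j] + lam * (best[j] - x[i][j]) + f * (x[r2][j] - x[r3][j]) for j in dims]
    raise ValueError(f"unknown scheme: {scheme}")


def de_minimize(func, bounds, pop_size=None, f=0.8, cr=0.3,
                scheme="rand/1", generations=200, seed=0):
    """Minimize func over the box given by bounds; returns (best vector, best cost)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop_size = pop_size or 10 * dim
    if pop_size < 5:
        raise ValueError("pop_size must be at least 5 to draw four distinct indices")
    # Spread the initial population uniformly over the given box.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    costs = [func(x) for x in pop]
    for _ in range(generations):
        best = pop[min(range(pop_size), key=costs.__getitem__)]
        new_pop, new_costs = [], []
        for i in range(pop_size):
            v = mutate(pop, best, i, f, scheme, rng)
            n, length = crossover_span(dim, cr, rng)
            u = list(pop[i])
            for k in range(length):  # copy L parameters from v, indices taken modulo D
                j = (n + k) % dim
                u[j] = v[j]
            cost_u = func(u)
            # Selection: the trial vector replaces x_i only if it is strictly better.
            if cost_u < costs[i]:
                new_pop.append(u)
                new_costs.append(cost_u)
            else:
                new_pop.append(pop[i])
                new_costs.append(costs[i])
        pop, costs = new_pop, new_costs
    b = min(range(pop_size), key=costs.__getitem__)
    return pop[b], costs[b]


if __name__ == "__main__":
    # Hypothetical test problem: a three-dimensional sphere function.
    best, cost = de_minimize(lambda x: sum(t * t for t in x), [(-5.0, 5.0)] * 3)
    print(best, cost)
```

The next generation is built from the current one as a whole, so the best vector used in the perturbation is always the best member of generation G.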