
32 IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, VOL. 15, NO. 1, FEBRUARY 2011
Compact Differential Evolution
Ernesto Mininno, Member, IEEE, Ferrante Neri, Member, IEEE, Francesco Cupertino, Member, IEEE, and
David Naso, Member, IEEE
Abstract—This paper proposes the compact differential evolution (cDE) algorithm. cDE, like other compact evolutionary algorithms, does not process a population of solutions but rather a statistical description of it, which evolves similarly to the population of an evolutionary algorithm. In addition, cDE employs the mutation and crossover typical of differential evolution (DE), thus reproducing its search logic. Unlike other compact evolutionary algorithms, in cDE the survivor selection scheme of DE can be straightforwardly encoded. One important feature of the proposed cDE algorithm is its capability of efficiently performing an optimization process despite a limited memory requirement. This makes the cDE algorithm suitable for hardware contexts characterized by small computational power, such as micro-controllers and commercial robots. In addition, due to its nature, cDE uses an implicit randomization of the offspring generation which corrects and improves the DE search logic. An extensive numerical setup has been implemented in order to prove the viability of cDE and test its performance with respect to other modern compact evolutionary algorithms and state-of-the-art population-based DE algorithms. Test results show that cDE regularly outperforms its corresponding population-based DE variant. Experiments have been repeated for four different mutation schemes. In addition, cDE outperforms other modern compact algorithms and displays a competitive performance with respect to state-of-the-art population-based algorithms employing a DE logic. Finally, cDE is applied to a challenging experimental case study regarding the on-line training of a nonlinear neural-network-based controller for a precise positioning system subject to changes of payload. The main peculiarity of this control application is that the control software is implemented not on a computer connected to the control system but directly on the micro-controller. Both the numerical results on the test functions and the experimental results on the real-world problem are very promising and suggest that cDE and its future developments can be an efficient option for optimization in hardware environments characterized by limited memory.

Index Terms—Adaptive systems, compact genetic algorithms, differential evolution (DE), estimation of distribution algorithms.
Manuscript received November 27, 2009; revised March 8, 2010, May 18, 2010, June 6, 2010, and June 16, 2010. Date of publication December 23, 2010; date of current version February 25, 2011. This work was supported by the Academy of Finland, Akatemiatutkija 130600, Algorithmic Design Issues in Memetic Computing, and by Tekes (the Finnish Funding Agency for Technology and Innovation), under Grant 40214/08 (Dynergia).
E. Mininno is with the Department of Mathematical Information Technology, University of Jyväskylä, Jyväskylä 40700, Finland (e-mail: ernesto.mininno@jyu.fi).
F. Neri is with the Department of Mathematical Information Technology, University of Jyväskylä, Jyväskylä 40700, Finland, and also with the Academy of Finland, Helsinki FI-00501, Finland (e-mail: ferrante.neri@jyu.fi).
F. Cupertino is with the Department of Electrical and Electronic Engineering, Technical University of Bari, Bari 70100, Italy (e-mail: cupertino@deemail.poliba.it).
D. Naso is with the Department of Electrical and Electronic Engineering, Polytechnic Institute of Bari, Bari 70126, Italy (e-mail: naso@poliba.it).
Digital Object Identifier 10.1109/TEVC.2010.2058120
I. Introduction
In many real-world applications, an optimization problem must be solved despite the fact that a full-power computing device may be unavailable due to cost and/or space limitations. This situation is typical of robotics and control problems. For example, a commercial vacuum cleaner robot is supposed, over time, to undergo a learning process in order to locate where obstacles are placed in a room (e.g., a sofa, a table, and so on) and then perform an efficient cleaning of the accessible areas. Regardless of the specific learning process, e.g., a neural network training, the robot must contain a computational core but clearly cannot contain all the full-power components of a modern computer, since they would increase the volume, complexity, and cost of the entire device. Thus, a traditional optimization meta-heuristic can be inadequate under these circumstances. In order to overcome this class of problems, compact evolutionary algorithms (cEAs) have been designed.
A cEA is an evolutionary algorithm (EA) belonging to the class of estimation of distribution algorithms (EDAs) (see [1]). The algorithms belonging to this class do not store and process an entire population with all its individuals but, on the contrary, make use of a statistical representation of the population in order to perform the optimization process. In this way, a much smaller number of parameters must be stored in memory. Thus, a run of these algorithms requires much less capacious memory devices compared to their corresponding standard EAs.
The first cEA was the compact genetic algorithm (cGA) introduced in [2]. The cGA simulates the behavior of a standard binary-encoded genetic algorithm (GA). In [2] it is shown that cGA has a performance almost as good as that of GA. As expected, the main advantage of a cGA with respect to a standard GA is the memory saving. An analysis of the convergence properties of cGA by means of Markov chains is given in [3]. In [4] (see also [5]) the extended compact genetic algorithm (ecGA) has been proposed. The ecGA is based on the idea that the choice of a good probability distribution is equivalent to linkage learning. The measure of a good distribution is based on minimum description length models: simpler distributions are better than complex ones. The probability distribution used in ecGA is a class of probability models known as marginal product models. A theoretical analysis of the ecGA behavior is presented in [6]. A hybrid version of ecGA integrating the Nelder-Mead algorithm is proposed in [7]. A study on the scalability of ecGA is given in [8]. The cGA and its variants have been intensively used
to perform hardware implementations (see [9]–[11]). A cGA application to neural network training is given in [12].
In [13], a memetic variant of cGA is proposed in order to enhance the convergence performance of the algorithm in the presence of a high number of dimensions. Paper [14] analyzes analogies and differences between cGAs and (1 + 1)-ES and extends a mathematical model of ES [15] to cGA, obtaining useful information on the performance. Moreover, [14] introduces the concept of elitism and proposes two new variants, with strong and weak elitism respectively, that significantly outperform both the original cGA and (1 + 1)-ES. A real-encoded cGA (rcGA) has been introduced in [16]. Some examples of rcGA applications to control engineering are given in [17] and [18]. A simple real-encoded version of ecGA has been proposed in [19] and [20].
This paper proposes a compact differential evolution (cDE) algorithm. Although the general motivation behind the algorithmic design is similar to that of cGA and its variants, there are two important issues specifically related to differential evolution (DE) which are addressed in this paper. The first one is the survivor selection scheme, which employs the so-called one-to-one spawning logic, i.e., in DE the survivor selection is performed through a pair-wise comparison between the performance of a parent solution and that of its corresponding offspring. In our opinion, this logic can be naturally encoded into a compact algorithm, unlike the case of a selection mechanism typical of genetic algorithms (GAs), e.g., tournament selection. In other words, we believe that a DE can be straightforwardly encoded into a compact algorithm without losing its basic working principles (in terms of survivor selection). The second issue is related to the DE search logic. A DE algorithm contains a limited amount of search moves, which might jeopardize the generation of high-quality solutions improving upon the current best performance (see [21]–[23]). In order to overcome these algorithmic limitations, a popular modification of the basic DE scheme is the introduction of some randomness into the search logic, for example in jitter and dither and in jDE (see [24]). A cDE algorithm, due to its nature, does not hold a full population of individuals but contains its information in distribution functions and samples the individuals from them when necessary. Thus, unavoidably, some extra randomness with respect to the original DE is introduced. This fact is, in our opinion, beneficial to the algorithmic functioning and performance. These two issues will be explained in greater detail in this paper.
The suitability of cDE to solve challenging problems in
environments with limited computational resources is as-
sessed by an experimental application on a challenging on-
line optimization problem. The considered case study regards
a complex control scheme for a specic class of direct-
drive linear motors with high positioning accuracy. These
motors are often directly coupled with their load, and the
absence of reduction/transmission gears makes the positioning
performance strongly influenced by the various uncertainties
related to electro-mechanical phenomena (stiction, cogging
forces), which therefore must be compensated with appropriate
strategies. In this paper, the control scheme is based on the
widely-adopted sliding mode design approach, which includes
a nonlinear module used to estimate the equivalent effect of
disturbances acting on the motor. The module is obtained with
a recurrent neural network whose parameters are trained on-
line by means of a cDE. The training algorithm is implemented
on the same micro-controller running the control algorithms.
The experimental results show that with a negligible increase
of computational costs caused by the cDE algorithm, the con-
trol system is able to reach much better tracking performances
under perturbed operating conditions.
The remainder of this paper is organized as follows. General descriptions of cGA and rcGA are given in Section II, as well as a short description of DE; Section II also introduces the notation used throughout this paper. Section III describes the cDE algorithm proposed in this paper and discusses its working principles and algorithmic details. Section IV displays the numerical results and subdivides them into three parts: a comparative analysis of population-based DE and its corresponding compact versions highlights the role of elitism and proves that cDE outperforms, in most cases, its corresponding population-based variant; a comparison of cDE against other modern compact algorithms and EDAs shows that the cDE algorithm outperforms the other algorithms and can thus be considered a very promising compact algorithm; a comparison with state-of-the-art DE-based algorithms shows that, notwithstanding the low memory requirements, cDE has a comparable performance for several problems and therefore can be an efficient solution when hardware limitations forbid the use of a modern population-based algorithm. Section V shows the applicability of cDE in a real-world case and summarizes the results obtained on the experimental case study; finally, Section VI gives the concluding remarks of this paper.
II. Background
In order to clarify the notation used throughout this paper, we refer to the minimization problem of an objective function f(x), where x is a vector of n design variables in a decision space D. Without loss of generality, let us assume that the parameters are normalized so that each search interval is [−1, 1]. In the following sections, we indicate vectors and matrices with bold font and scalar values in italic.
A. Compact Genetic Algorithm
With the term cGA we will refer to the original compact algorithm proposed in [2]. The cGA consists of the following. A binary vector of length n is randomly generated by assigning a 0.5 probability to each gene to take either the value 0 or the value 1. This vector describing the probabilities, initialized with n values all equal to 0.5, is named probability vector (PV). By means of the PV, two individuals are sampled and their fitness values are calculated. The winner solution, i.e., the solution characterized by the better performance, biases the PV on the basis of a parameter Np called virtual population size. More specifically, if the winner solution displays a 1 in its ith gene while the loser solution displays a 0, the ith probability value of the PV is augmented
Fig. 1. cGA pseudo-code.
by a quantity 1/Np. On the contrary, if the winner solution displays a 0 in its ith gene while the loser solution displays a 1, the ith probability value of the PV is reduced by a quantity 1/Np. If the genes in position i display the same value for both the winner and loser solutions, the ith probability of the PV is not modified. This scheme is equivalent to (steady-state) pair-wise tournament selection, as shown in [2]. For the sake of clarity, the pseudo-code describing the working principles of cGA is displayed in Fig. 1. With the function compete we simply mean the fitness-based comparison.
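As a sketch of this compete-and-bias step, the update can be written in a few lines of Python (the OneMax fitness and the parameter values below are illustrative assumptions, not part of the original algorithm):

```python
import random

def fitness(x):
    # Illustrative OneMax fitness (to be maximized); any problem-specific
    # fitness can be substituted here.
    return sum(x)

def cga_step(pv, n_p):
    # Sample two individuals from the probability vector PV.
    a = [1 if random.random() < p else 0 for p in pv]
    b = [1 if random.random() < p else 0 for p in pv]
    # The fitter solution is the winner of the pair-wise tournament.
    winner, loser = (a, b) if fitness(a) >= fitness(b) else (b, a)
    # Bias each PV entry toward the winner by 1/Np where the genes differ.
    for i in range(len(pv)):
        if winner[i] == 1 and loser[i] == 0:
            pv[i] = min(1.0, pv[i] + 1.0 / n_p)
        elif winner[i] == 0 and loser[i] == 1:
            pv[i] = max(0.0, pv[i] - 1.0 / n_p)
    return pv
```

Iterating this step drives each pv[i] toward 0 or 1, at which point the PV itself encodes the solution found.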
B. Elitism in Compact Genetic Algorithms
Two novel versions of cGA have been proposed in [14]. Both of these algorithms still share the ideas proposed in [2] but proved to have a significantly better performance compared to their corresponding earlier versions. These two algorithms, namely the persistent elitist compact genetic algorithm (pe-cGA) and the nonpersistent elitist compact genetic algorithm (ne-cGA), modify the original cGA in the following way. During the initialization, one candidate solution, namely the elite, is randomly generated besides the PV. Subsequently, only one (and not two as in cGA) new candidate solution is generated. This solution is compared with the elite. If the elite is the winner solution, the elite biases the PV as shown for the cGA and is confirmed for the following solution generation and consequent comparison. On the contrary, if the newly generated candidate solution outperforms the elite, the PV is updated as shown for the cGA, where the new solution is the winner and the elite is the loser. Under these conditions, the elite is replaced by the new solution, which becomes the new elite. In the pe-cGA scheme this replacement occurs only under the condition that the elite is outperformed. In the ne-cGA scheme, if an elite is still not replaced after a given number of comparisons, the elite is replaced by a newly generated solution regardless of its fitness value. It must be remarked that whether the persistent or nonpersistent scheme is preferable seems to be a problem-dependent issue (see [14]). The pseudo-codes highlighting the working principles of pe-cGA and ne-cGA are given in Figs. 2 and 3, respectively.
Fig. 2. pe-cGA pseudo-code.
C. Real-Valued Compact Genetic Algorithm
The real-valued compact genetic algorithm (rcGA) has been introduced in [16]. The rcGA is a compact algorithm inspired by the cGA which exports the compact logic to a real-valued domain, thus obtaining an optimization algorithm with a high performance despite the limited amount of employed memory resources.
In rcGA the PV is not a vector but an n × 2 matrix

PV^t = [μ^t, σ^t]    (1)

where μ and σ are, respectively, vectors containing, for each design variable, the mean and standard deviation values of a Gaussian probability distribution function (PDF) truncated within the interval [−1, 1]. The height of the PDF has been normalized in order to keep its area equal to 1. The superscript t indicates the generation (number of performed comparisons).
Fig. 3. ne-cGA pseudo-code.
At the beginning of the optimization process, for each design variable i, μ^1[i] = 0 and σ^1[i] = λ, where λ is a large positive constant (λ = 10). This initialization of the σ[i] values is done in order to simulate a uniform distribution. Subsequently, one individual is sampled as the elite, exactly as in the case of pe-cGA or ne-cGA. A new individual is generated and compared with the elite. As for the cGA, in rcGA the winner solution biases the PV. The update rule for each element of μ is given by

μ^{t+1}[i] = μ^t[i] + (1/Np) (winner[i] − loser[i])    (2)

where Np is the virtual population size. The update rule for the σ values is given by

(σ^{t+1}[i])^2 = (σ^t[i])^2 + (μ^t[i])^2 − (μ^{t+1}[i])^2 + (1/Np) (winner[i]^2 − loser[i]^2).    (3)
Details for constructing (2) and (3) are given in [16]. It must be remarked that, in [16], both persistent and nonpersistent structures of rcGA have been tested, and it is shown that also in this case the best choice of elitism seems to be problem dependent.
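The two update rules transcribe directly into code; the following Python sketch applies (2) and (3) to one winner/loser pair (the small floor on the variance is an added numerical guard, not part of the original formulation):

```python
import math

def update_pv(mu, sigma, winner, loser, n_p):
    # Apply update rules (2) and (3) element-wise to the PV.
    for i in range(len(mu)):
        mu_new = mu[i] + (winner[i] - loser[i]) / n_p          # rule (2)
        var_new = (sigma[i] ** 2 + mu[i] ** 2 - mu_new ** 2
                   + (winner[i] ** 2 - loser[i] ** 2) / n_p)   # rule (3)
        mu[i] = mu_new
        sigma[i] = math.sqrt(max(var_new, 1e-12))  # guard against round-off
    return mu, sigma
```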
D. Differential Evolution
According to its original definition (see [21], [25]), the DE algorithm consists of the following steps. An initial sampling of Np individuals is performed pseudo-randomly with a uniform distribution function within the decision space D. At each generation, for each individual x_k of the Np, three individuals x_r, x_s, and x_t are pseudo-randomly extracted from the population. According to the DE logic, a provisional offspring x'_o is generated by mutation as

x'_o = x_t + F (x_r − x_s)    (4)

where F ∈ [0, 2] is a scale factor which controls the length of the exploration vector (x_r − x_s) and thus determines how far from point x_k the offspring should be generated. The mutation scheme shown in (4) is also known as DE/rand/1. Other variants of the mutation rule have subsequently been proposed in the literature (see [25]).
1) DE/best/1: x'_o = x_best + F (x_r − x_s).
2) DE/cur-to-best/1: x'_o = x_k + F (x_best − x_k) + F (x_r − x_s).
3) DE/best/2: x'_o = x_best + F (x_r − x_s) + F (x_u − x_v).
4) DE/rand/2: x'_o = x_t + F (x_r − x_s) + F (x_u − x_v).
5) DE/rand-to-best/1: x'_o = x_t + F (x_best − x_t) + F (x_r − x_s).
6) DE/rand-to-best/2: x'_o = x_t + F (x_best − x_t) + F (x_r − x_s) + F (x_u − x_v),
where x_best is the solution with the best performance among the individuals of the population, and x_u and x_v are two additional pseudo-randomly selected individuals. It is worthwhile to mention the rotation-invariant mutation (see [21], [25]).
7) DE/current-to-rand/1: x_o = x_k + K (x_t − x_k) + F' (x_r − x_s), where K is the combination coefficient, which should be chosen with a uniform random distribution from [0, 1], and F' = K F. Since this mutation scheme already contains the crossover, the mutated solution does not undergo the crossover operation described below.
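A few of the listed schemes, transcribed in Python over a population stored as a list of real-coded vectors (the function signature and helper structure are our own illustrative choices, not the paper's):

```python
import random

def mutate(scheme, pop, k, best, F):
    # Pseudo-randomly extract five distinct individuals different from x_k;
    # each scheme below uses only the ones it needs.
    r, s, t, u, v = random.sample([j for j in range(len(pop)) if j != k], 5)
    xr, xs, xt, xu, xv = pop[r], pop[s], pop[t], pop[u], pop[v]
    xk, xb = pop[k], pop[best]
    n = len(xk)
    if scheme == "rand/1":
        return [xt[i] + F * (xr[i] - xs[i]) for i in range(n)]
    if scheme == "best/1":
        return [xb[i] + F * (xr[i] - xs[i]) for i in range(n)]
    if scheme == "cur-to-best/1":
        return [xk[i] + F * (xb[i] - xk[i]) + F * (xr[i] - xs[i])
                for i in range(n)]
    if scheme == "rand/2":
        return [xt[i] + F * (xr[i] - xs[i]) + F * (xu[i] - xv[i])
                for i in range(n)]
    raise ValueError("unknown scheme: " + scheme)
```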
Recently, in [26], a new mutation strategy has been defined. This strategy, namely DE/rand/1/either-or, consists of the following:

x'_o = x_t + F (x_r − x_s)           if rand(0, 1) < p_F
x'_o = x_t + K (x_r + x_s − 2 x_t)   otherwise    (5)

where, for a given value of F, the parameter K is set equal to 0.5 (F + 1).
When the provisional offspring has been generated by mutation, each gene of the individual x'_o is exchanged with the corresponding gene of x_k with a uniform probability, and the final offspring x_o is generated as

x_o[i] = x'_o[i]   if rand(0, 1) ≤ Cr
x_o[i] = x_k[i]    otherwise    (6)

where rand(0, 1) is a random number between 0 and 1, i is the index of the gene under examination, and Cr is a constant value called the crossover rate. This crossover strategy is well known as binomial crossover and is indicated as DE/rand/1/bin.
Fig. 4. DE/rand/1/bin pseudo-code.
For the sake of completeness, we mention that a few other
crossover strategies also exist, for example the exponential
strategy (see [26]). However, in this paper we focus on the
binomial strategy since it is the most commonly used and
often the most promising.
The resulting offspring x_o is evaluated and, according to the one-to-one spawning strategy, it replaces x_k if and only if f(x_o) ≤ f(x_k); otherwise no replacement occurs. It must be remarked that although the replacement indexes are saved, one by one, during the generation, the actual replacements occur all at once at the end of the generation. For the sake of clarity, the pseudo-code highlighting the working principles of DE/rand/1/bin is shown in Fig. 4.
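The whole DE/rand/1/bin loop, including the deferred replacement, can be sketched as follows (function and parameter names are ours, and bound handling is omitted for brevity):

```python
import random

def de_rand_1_bin(f, n, n_p=20, F=0.7, Cr=0.9, budget=5000):
    # Uniform pseudo-random initial sampling within [-1, 1]^n.
    pop = [[random.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n_p)]
    fit = [f(x) for x in pop]
    evals = n_p
    while evals < budget:
        new_pop, new_fit = list(pop), list(fit)
        for k in range(n_p):
            # DE/rand/1 mutation with three distinct extracted individuals.
            r, s, t = random.sample([j for j in range(n_p) if j != k], 3)
            prov = [pop[t][i] + F * (pop[r][i] - pop[s][i]) for i in range(n)]
            # Binomial crossover between provisional offspring and parent.
            off = [prov[i] if random.random() <= Cr else pop[k][i]
                   for i in range(n)]
            f_off = f(off)
            evals += 1
            # One-to-one spawning: record the replacement, apply it later.
            if f_off <= fit[k]:
                new_pop[k], new_fit[k] = off, f_off
        # Replacements occur all at once at the end of the generation.
        pop, fit = new_pop, new_fit
    k_best = min(range(n_p), key=fit.__getitem__)
    return pop[k_best], fit[k_best]
```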
III. Compact Differential Evolution
The cDE algorithm herein proposed takes the update logic of rcGA and integrates it within a DE framework. cDE is a simple algorithm which, despite its simplicity, can be a very efficient option for optimization in limited-memory environments.
The proposed algorithm consists of the following. An n × 2 PV is generated as shown for the rcGA: the μ values are set equal to 0 while the σ values are set equal to a large number λ = 10. As explained in Section II-C, the value of λ is empirically set in order to simulate a uniform distribution at the beginning of the optimization process. A solution is sampled from the PV and plays the role of the elite. Subsequently, at each step t, some solutions are sampled on the basis of the selected mutation scheme. For example, if a DE/rand/1 mutation is selected, three individuals x_r, x_s, and x_t are sampled from the PV.
Fig. 5. Sampling mechanism.
More specifically, the sampling mechanism of a design variable x_r[i], associated to a generic candidate solution x_r sampled from the PV, consists of the following steps. As mentioned above, each design variable indexed by i is associated with a truncated Gaussian PDF characterized by a mean value μ[i] and a standard deviation σ[i]. The formula of the PDF is

PDF(μ[i], σ[i]) = ( e^(−(x − μ[i])^2 / (2 σ[i]^2)) √(2/π) ) / ( σ[i] ( erf((μ[i] + 1)/(√2 σ[i])) − erf((μ[i] − 1)/(√2 σ[i])) ) )    (7)

where erf is the error function (see [27]).
From the PDF, the corresponding cumulative distribution function (CDF) is constructed by means of Chebyshev polynomials according to the procedure described in [28]. It must be observed that the codomain of the CDF is [0, 1]. In order to sample the design variable x_r[i] from the PV, a random number rand(0, 1) is sampled from a uniform distribution. The inverse function of the CDF, in correspondence of rand(0, 1), is then calculated. This latter value is x_r[i]. A graphical representation of the sampling mechanism is given in Fig. 5.
As mentioned in Section II-C, the sampling is performed on normalized values within [−1, 1]. In order to obtain the value in the original interval [a, b], the following operation must be performed: x_r[i] (b − a)/2 + (a + b)/2.
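A minimal Python sketch of this sampling step: the paper builds the CDF via Chebyshev polynomials [28], while here the erf-based CDF corresponding to (7) is inverted by plain bisection, which is slower but needs no extra machinery:

```python
import math
import random

def sample_design_variable(mu_i, sigma_i, a=-1.0, b=1.0):
    # CDF of the untruncated Gaussian at the truncation bounds, expressed
    # through erf (up to a common affine factor that cancels below).
    root2 = math.sqrt(2.0)
    lo = math.erf((a - mu_i) / (root2 * sigma_i))
    hi = math.erf((b - mu_i) / (root2 * sigma_i))
    # Draw rand(0, 1) and map it to the corresponding CDF level.
    target = lo + random.random() * (hi - lo)
    # Invert the monotone CDF by bisection on [a, b].
    left, right = a, b
    for _ in range(60):
        mid = 0.5 * (left + right)
        if math.erf((mid - mu_i) / (root2 * sigma_i)) < target:
            left = mid
        else:
            right = mid
    return 0.5 * (left + right)
```

With a large σ[i] (e.g., the initial λ = 10) the sampled values are nearly uniform over [−1, 1], which is exactly the initialization effect described above.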
The mutation is then performed, e.g., according to (4), and the provisional offspring is generated. A crossover, according to (6), between the elite and the provisional offspring is then performed in order to generate the offspring. The fitness value of the offspring is then computed and compared with that of the elite individual. The comparison allows the definition of the winner and loser solutions. Formulas (2) and (3) are then applied to update the PV for the subsequent solution generations. If the offspring outperforms the elite individual, the offspring replaces the elite.
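Putting the pieces together, a compact Python sketch of pe-cDE/rand/1/bin follows; truncated-Gaussian sampling is done here by simple rejection rather than CDF inversion, and all names and parameter values are illustrative assumptions:

```python
import random

def pe_cde_rand_1_bin(f, n, n_p=100, F=0.5, Cr=0.9, budget=3000, lam=10.0):
    mu = [0.0] * n
    sigma = [lam] * n

    def sample():
        # Sample one solution from the PV; truncation to [-1, 1] is
        # obtained by rejection instead of inverting the CDF.
        x = []
        for i in range(n):
            v = random.gauss(mu[i], sigma[i])
            while not -1.0 <= v <= 1.0:
                v = random.gauss(mu[i], sigma[i])
            x.append(v)
        return x

    elite = sample()
    f_elite = f(elite)
    for _ in range(budget):
        # DE/rand/1 mutation on three solutions sampled from the PV.
        xr, xs, xt = sample(), sample(), sample()
        prov = [xt[i] + F * (xr[i] - xs[i]) for i in range(n)]
        # Binomial crossover between the elite and provisional offspring.
        off = [prov[i] if random.random() <= Cr else elite[i]
               for i in range(n)]
        f_off = f(off)
        if f_off <= f_elite:                   # persistent elitism
            winner, loser = off, elite
            elite, f_elite = off, f_off
        else:
            winner, loser = elite, off
        for i in range(n):                     # update rules (2) and (3)
            mu_new = mu[i] + (winner[i] - loser[i]) / n_p
            var = (sigma[i] ** 2 + mu[i] ** 2 - mu_new ** 2
                   + (winner[i] ** 2 - loser[i] ** 2) / n_p)
            mu[i] = mu_new
            sigma[i] = max(var, 1e-10) ** 0.5
    return elite, f_elite
```

The only persistent state is the elite and the 2n PV entries, which is the memory saving the compact scheme is built around.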
Fig. 6. pe-cDE/rand/1/bin pseudo-code.
Clearly, within a cDE framework all the above-mentioned mutation schemes (as well as others) can be easily implemented. Similarly, the exponential crossover, instead of the binomial one, can be integrated within the algorithm, if desired. In addition, both persistent and nonpersistent elitism can be adopted as elite strategies. Fig. 6 shows the pseudo-code of a cDE employing rand/1 mutation, binomial crossover, and persistent elitism. This algorithm is indicated as pe-cDE/rand/1/bin.
A. Compact Differential Evolution: Algorithmic Philosophy
As shown above, cDE is an algorithm which combines the search logic of DE with the evolution structure of an EDA. More specifically, cDE, as well as rcGA, shares some important aspects with continuous population-based incremental learning (PBILc) and the continuous univariate marginal distribution algorithm (UMDAc). The PBILc algorithms (see [29], [30]) are extensions of a population-based algorithm originally devised for binary search spaces. One specific version of PBILc uses a set of Gaussian distributions to model the vector of problem variables (see [31]). During the iterations, this PBIL algorithm only changes the mean values, while the standard deviations are constant parameters that must be fixed a priori. In this sense, cDE can be seen as a member of the PBILc family which integrates an adaptation of the standard deviation values. It must be remarked that some examples of simple standard deviation adaptation for PBILc algorithms have been proposed in the literature (see [32]). Regarding UMDAc, as shown in [33], the update of the PDFs occurs after a certain amount of pairwise comparisons has been performed. Due to its algorithmic structure, UMDAc requires many comparisons before a single PDF update. In this sense, cDE, as well as rcGA, can be seen as a UMDAc restricted to a single comparison. Finally, the similarity between cDE and (1 + 1)-ES must be remarked. A proof related to this topic is given in [14] and some considerations are reported in [16]. Similarly to (1 + 1)-ES, cDE processes pairs of candidate solutions and subsequently modifies a PDF. The main difference between
the two algorithms is that while (1 + 1)-ES encodes the search strategy for generating new candidate solutions, cDE directly encodes the new solution to be generated.
Regarding the structure of cDE with respect to the other cEAs, the following considerations can be made. With respect to the cGAs, cDE generates the offspring solutions by means of the DE logic instead of simply generating the offspring directly from the PV. This guarantees higher exploration properties, since the combination of the solutions (by means of DE mutation and crossover) allows the generation of candidate solutions outside the PDF characterizing the PV. Thus, cDE enhances the probability of detecting unexplored promising areas of the decision space and consequently reduces the risk of premature convergence. In addition, cDE, unlike the cGAs, allows a rather straightforward encoding of the survivor selection scheme of the population-based DE. More specifically, DE employs a one-to-one spawning already based on simple pairwise comparisons and replacements. On the contrary, in efficient GAs, the proper selection of a new population is based on the performance of an entire population, for example by means of a ranking procedure. The encoding of GAs within a PV clearly limits and simplifies the survivor selection scheme, often reducing the performance of the compact algorithms with respect to their original population-based variants. In other words, the encoding of a GA into a compact algorithm imposes the employment of a pair-wise tournament selection (see [2]), which may not be the optimal choice for a GA structure. Within a DE scheme, the situation appears to be different. The encoding into a compact algorithm does not in this case jeopardize the replacement logic, except for the fact that in DE the population replacement is usually performed at the end of all the comparisons [we are referring to the most common discrete survivor selection scheme (see [26])]. Apart from this fact, since the one-to-one spawning structure characterizing the DE survivor selection scheme can be seen as a pair-wise tournament selection, cDE maintains the same features of the survivor selection logic as its population-based variant.
In addition, in order to understand the properties and working principles of cDE, an analysis of the DE functioning must be carried out. From an algorithmic viewpoint, the reasons for the success of DE have been highlighted in [34]: the success of DE is due to an implicit self-adaptation contained within the algorithmic structure. More specifically, since, for each candidate solution, the search rule depends on other solutions belonging to the population (e.g., x_t, x_r, and x_s), the capability of detecting new promising offspring solutions depends on the current distribution of solutions within the decision space. During the early stages of the optimization process, the solutions tend to be spread out within the decision space. For a given scale factor value, this implies that the mutation generates new solutions by exploring the space by means of a large step size (if x_r and x_s are distant solutions, F(x_r − x_s) is a vector characterized by a large modulus). During the optimization process, the solutions of the population tend to concentrate in specific parts of the decision space. Therefore, the step size in the mutation is progressively reduced and the search is performed in the neighborhood of the solutions. In other words, due to its structure, a DE scheme is highly explorative at the beginning of the evolution and subsequently becomes more exploitative during the optimization.
Although this mechanism appears, at first glance, very efficient, it hides a limitation. If for some reason the algorithm does not succeed at generating offspring solutions which outperform the corresponding parent, the search is repeated again with similar step size values and will likely fail by falling into an undesired stagnation condition (see [35]). Stagnation is the undesired effect which occurs when a population-based algorithm does not converge to a solution (even a suboptimal one) while the population diversity is still high. In the case of DE, stagnation occurs when the algorithm does not manage to improve upon any solution of its population for a prolonged number of generations. In other words, the main drawback of DE is that the scheme has, at each stage of the optimization process, a limited amount of exploratory moves, and if these moves are not enough to generate new promising solutions, the search can be heavily compromised.
In order to overcome these limitations, computer scientists have intensively proposed modifications of the original DE structure. For example, it is worthwhile mentioning the employment of extra or multiple mutation schemes, as for example the trigonometric mutation, the self-adaptation with multiple mutation schemes, and the generation of candidate solutions by means of an alternative (opposition-based) rule (see [36]). The offspring generation by means of the composition of two contributions, the first one resulting from the entire population and the second from a subset of it, is proposed in [37]. In other works, local search algorithms support the DE search (see [38]–[40]). In modern DE-based algorithms, randomization seems to play a very important role in the algorithmic performance. It is a well-known fact that scale factor randomization tends to improve the algorithmic performance, as in the case of jitter and dither (see [21], [25], and references therein). In a similar way, in [24] a controlled randomization of the scale factor and crossover rate is proposed. In [41], the concept of parameter randomization is encoded within a sophisticated adaptive rule which is based on truncated Gaussian distributions. In [42], a randomized adaptation of the parameters is combined with multiple mutation schemes.
The cDE clearly has something in common with these
approaches. The main difference is that, instead of imposing a
randomization on the control parameters, the cDE imposes
a randomization on the solutions which contribute to the
offspring generation. In other words, the cDE can be seen as a
DE which introduces a randomization within the solution
generation and therefore adds extra search moves which
assist the DE structure and attempt to improve upon its
performance. A numerical validation of this intuition is given
in Section IV, where the results are shown. Thus, with respect
to other cEAs, cDE has the advantage/novelty of being a
straightforward implementation of its population-based equivalent.
With respect to DE, cDE employs a novel structure for
selecting the individuals composing the provisional offspring
by means of sampling from a probability distribution.
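As an illustration of this selection structure, the sketch below (hypothetical function names; a plain clipped Gaussian stands in for the truncated Gaussians actually used by cDE) generates one provisional offspring by sampling the three individuals required by DE/rand/1 mutation from a probability vector PV = (mu, sigma) and then applying binomial crossover with the elite:

```python
import random

# Sketch, not the authors' exact implementation: in cDE the population is
# replaced by a probability vector PV = (mu, sigma), one Gaussian per design
# variable, with variables normalized to [-1, 1].
def sample_from_pv(mu, sigma, rng):
    # The paper uses truncated Gaussians; plain clipping is a simplification.
    return [max(-1.0, min(1.0, rng.gauss(m, s))) for m, s in zip(mu, sigma)]

def cde_offspring(mu, sigma, elite, F, Cr, rng):
    # The three parents are sampled from PV instead of being read from a
    # stored population.
    xr = sample_from_pv(mu, sigma, rng)
    xs = sample_from_pv(mu, sigma, rng)
    xt = sample_from_pv(mu, sigma, rng)
    mutant = [t + F * (r - s) for t, r, s in zip(xt, xr, xs)]  # DE/rand/1
    jrand = rng.randrange(len(elite))  # at least one gene from the mutant
    return [m if (rng.random() < Cr or j == jrand) else e
            for j, (m, e) in enumerate(zip(mutant, elite))]

rng = random.Random(0)
n = 10
offspring = cde_offspring([0.0] * n, [0.34] * n, [0.0] * n, 0.5, 0.7, rng)
```

Only mu, sigma, and the elite need to be stored, which is precisely what removes the need for an explicit population.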
As a final remark, it should be observed that cDE is not
an improved version of DE but, on the contrary, is a light
TABLE I
Average Final Fitness ± Standard Deviation for DE/rand/1/bin Schemes

Problem | pe-cDE/rand/1/bin | ne-cDE/rand/1/bin | DE/rand/1/bin
n = 10
f1  | 2.683e-11 ± 1.54e-11 | 5.389e-10 ± 5.48e-10 | 3.083e-08 ± 2.43e-08
f2  | 3.935e-03 ± 8.78e-03 | 1.898e+02 ± 3.53e+02 | 3.284e+02 ± 1.52e+02
f3  | 2.608e+02 ± 1.05e+03 | 7.711e+02 ± 2.11e+03 | 1.312e+03 ± 1.20e+03
f4  | 1.891e-06 ± 5.08e-07 | 7.758e-06 ± 3.28e-06 | 2.987e-04 ± 9.74e-05
f5  | 1.976e+00 ± 1.98e+00 | 3.516e-01 ± 3.52e-01 | 4.336e+00 ± 1.16e+00
f6  | 8.419e-03 ± 1.02e-02 | 1.131e-03 ± 3.85e-03 | 9.546e-03 ± 2.37e-03
f7  | 2.400e-01 ± 2.40e-01 | 6.309e-02 ± 6.31e-02 | 8.079e-01 ± 3.04e-01
f8  | 1.658e-01 ± 8.42e-03 | 6.039e+00 ± 1.13e-03 | 3.079e-14 ± 3.04e-14
f9  | 3.093e+01 ± 1.24e+01 | 3.712e+01 ± 5.73e+00 | 5.261e+01 ± 6.51e+00
f10 | 5.125e+00 ± 2.60e+00 | 1.725e+01 ± 1.33e+01 | 5.646e+00 ± 1.21e+00
f11 | 1.365e-02 ± 1.68e-02 | 1.648e-01 ± 3.02e-01 | 1.273e-01 ± 6.81e-01
f12 | 1.167e+02 ± 1.27e+02 | 4.295e+01 ± 5.26e+01 | 1.604e+02 ± 1.77e+01
f13 | 7.813e+02 ± 1.74e+02 | 5.896e+02 ± 1.32e+02 | 5.800e+02 ± 5.41e+01
f14 | 1.292e-06 ± 4.59e-07 | 4.215e-06 ± 1.90e-06 | 4.782e-05 ± 1.96e-06
f15 | 1.000e+02 ± 1.79e-08 | 1.000e+02 ± 4.67e-04 | 4.905e+01 ± 5.01e+00
f16 | 5.532e-13 ± 4.48e-13 | 7.603e-12 ± 6.54e-12 | 9.119e-06 ± 5.43e-06
f17 | 1.150e+00 ± 1.02e-11 | 1.150e+00 ± 8.20e-11 | 1.150e+00 ± 2.65e-10
f18 | 5.887e+01 ± 4.91e+02 | 2.523e+02 ± 1.51e+02 | 1.043e+03 ± 8.22e+02
f19 | 9.673e+01 ± 1.10e+00 | 9.924e+01 ± 7.05e-01 | 9.932e+01 ± 6.07e-01
f20 | 5.192e+01 ± 5.91e+02 | 9.240e+02 ± 1.03e+03 | 1.746e+03 ± 1.46e+03
n = 30
f1  | 1.996e+02 ± 7.36e+02 | 4.828e-28 ± 9.03e-28 | 6.187e-07 ± 2.33e-08
f2  | 1.282e+04 ± 3.13e+03 | 2.892e+04 ± 4.83e+03 | 3.640e+04 ± 2.84e+03
f3  | 6.535e+06 ± 3.20e+07 | 5.294e+02 ± 1.19e+03 | 5.242e+04 ± 3.29e+04
f4  | 1.140e+01 ± 1.09e+00 | 1.642e+01 ± 3.68e-01 | 1.863e+01 ± 2.07e-01
f5  | 1.264e+01 ± 1.01e+00 | 1.650e+01 ± 4.27e-01 | 1.863e+01 ± 2.54e-01
f6  | 1.634e-02 ± 2.99e-02 | 6.867e-02 ± 6.75e-02 | 2.319e+02 ± 2.07e+01
f7  | 2.278e-01 ± 2.74e-01 | 1.867e-01 ± 2.20e-01 | 2.319e+02 ± 2.81e+01
f8  | 7.384e+01 ± 1.23e+01 | 1.531e+02 ± 2.09e+01 | 3.211e-14 ± 6.25e-14
f9  | 1.317e+02 ± 2.66e+01 | 2.655e+02 ± 2.28e+01 | 2.866e+02 ± 1.84e+01
f10 | 8.067e+03 ± 4.90e+03 | 3.665e+04 ± 5.28e+03 | 1.632e+05 ± 1.77e+04
f11 | 1.490e+03 ± 4.14e+02 | 1.939e+03 ± 6.69e+02 | 6.480e+03 ± 6.37e+02
f12 | 8.962e+01 ± 6.44e+01 | 1.299e+02 ± 2.03e+01 | 3.423e+02 ± 3.60e+01
f13 | 9.443e+02 ± 2.85e+01 | 9.707e+02 ± 2.96e+01 | 1.087e+03 ± 1.47e+01
f14 | 5.220e+00 ± 2.62e+00 | 9.917e-01 ± 2.69e+00 | 1.065e+01 ± 1.02e+00
f15 | 9.954e+01 ± 1.10e+00 | 9.930e+01 ± 6.93e-01 | 2.181e+01 ± 3.14e+00
f16 | 1.216e+00 ± 2.00e+00 | 4.803e-01 ± 6.72e-01 | 9.483e+01 ± 1.30e+01
f17 | 1.511e-01 ± 1.35e+00 | 3.378e-01 ± 1.25e+00 | 1.574e-01 ± 3.00e-01
f18 | 8.939e+03 ± 1.57e+03 | 1.003e+04 ± 1.22e+03 | 1.021e+04 ± 1.42e+03
f19 | 1.248e+02 ± 2.65e+00 | 1.279e+02 ± 1.59e+00 | 1.300e+02 ± 1.54e+00
f20 | 1.013e+05 ± 4.49e+04 | 1.204e+05 ± 4.09e+04 | 3.804e+05 ± 3.01e+04
n various
f21 | 5.296e-02 ± 7.79e-18 | 5.296e-02 ± 2.75e-12 | 5.296e-02 ± 4.31e-09
f22 | 1.067e+00 ± 4.22e-16 | 1.067e+00 ± 1.71e-13 | 1.067e+00 ± 1.36e-04
f23 | 3.980e-01 ± 3.56e-04 | 3.983e-01 ± 5.19e-04 | 3.984e-01 ± 6.57e-04
f24 | 3.863e+00 ± 4.29e-11 | 3.863e+00 ± 1.10e-08 | 3.863e+00 ± 8.37e-05
f25 | 3.288e+00 ± 5.53e-02 | 3.322e+00 ± 8.74e-04 | 3.248e+00 ± 2.91e-02
f26 | 5.040e+00 ± 3.16e+00 | 9.267e+00 ± 1.37e+00 | 5.740e+00 ± 1.69e+00
f27 | 4.822e+00 ± 3.07e+00 | 9.764e+00 ± 1.16e+00 | 6.150e+00 ± 1.29e+00
f28 | 6.048e+00 ± 3.64e+00 | 1.003e+01 ± 5.77e-01 | 6.216e+00 ± 1.86e+00
version of DE. By light version we mean that cDE mimics
the DE behavior and performance but imposes much lower
memory requirements. More specifically, while, according to
a typical setting, DE requires the storage of Np = 2·n
or even more individuals, cDE requires the storage of only
four individuals independently of the dimensionality of the
problem (see details in Section IV). This fact allows the cDE
implementation in hardware environments characterized by
memory limitations and, dually, a high performance notwithstanding
modest hardware investments.
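The memory count can be made concrete with a small back-of-envelope sketch (one slot = one n-dimensional vector of floats; the slot counts follow the description given later in Section IV, and the function names are ours, not the paper's):

```python
def de_memory_slots(n, n_p=None):
    # population (Np = 2n by default) plus two volatile slots:
    # one for offspring generation, one for replacement bookkeeping
    n_p = 2 * n if n_p is None else n_p
    return n_p + 2

def cde_memory_slots(n):
    # mu and sigma of the PV, the elite, and one offspring slot,
    # independently of the problem dimensionality n
    return 4

for n in (10, 30):
    print(n, de_memory_slots(n), cde_memory_slots(n))
```

The gap grows linearly with n, which is what makes the compact scheme attractive on micro-controllers.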
IV. Numerical Results
The following test problems have been considered in this
paper.
f1 - Shifted sphere function: F1 from [43].
f2 - Shifted Schwefel's problem 1.2: F2 from [43].
f3 - Rosenbrock's function: f3 from [42].
f4 - Shifted Ackley's function: f5 from [42].
f5 - Shifted rotated Ackley's function: f6 from [42].
f6 - Shifted Griewank's function: f7 from [42].
f7 - Shifted rotated Griewank's function: f8 from [42].
f8 - Shifted Rastrigin's function: F9 from [43].
f9 - Shifted rotated Rastrigin's function: F10 from [43].
f10 - Shifted noncontinuous Rastrigin's function: f11 from [42].
f11 - Schwefel's function: f12 from [42].
f12 - Composition function 1: CF1 from [44]. The function f12 (CF1) is composed using ten sphere functions.
f13 - Composition function 6: CF6 from [44]. The function f13 (CF6) is composed by using ten different benchmark functions, i.e., two rotated Rastrigin's functions, two rotated Weierstrass functions, two rotated Griewank's functions, two rotated Ackley's functions, and two rotated sphere functions.
f14 - Schwefel's problem 2.22: f2 from [45].
f15 - Schwefel's problem 2.21: f4 from [45].
f16 - Generalized penalized function 1: f12 from [45].
f17 - Generalized penalized function 2: f13 from [45].
f18 - Schwefel's problem 2.6 with global optimum on bounds: F5 from [43].
f19 - Shifted rotated Weierstrass function: F11 from [43].
f20 - Schwefel's problem 2.13: F12 from [43].
f21 - Kowalik's function: f15 from [46].
f22 - Six-hump camel-back function: f20 from [42].
f23 - Branin function: f17 from [45].
f24 - Hartman's function 1: f19 from [46].
f25 - Hartman's function 2: f20 from [46].
f26-f28 - Shekel's family: f21-f24 from [46].
The test problems have been selected by employing entirely
the benchmark used in [42] (f1-f17 and f21-f28). In addition,
our benchmark has been expanded by adding a few extra
problems from [43] (f18-f20). Some of the problems appearing
in [42] were chosen from [43] and [46]. In the list above, we
indicated the original papers where the problems have been
defined and proposed for the first time.
All the algorithms in this paper have been run for test
problems f1-f20 with n = 10 and n = 30. Test problems
f21-f28 are characterized by a unique dimensionality value.
These problems have been run with the original dimensionality
as shown in the papers mentioned above. Thus, in total, 48 test
problems are contained in this paper. For each algorithm, 30
independent runs have been performed. The budget of each
single run has been fixed at 5000·n fitness evaluations.
Actual and virtual population sizes have been set equal to
Np = 2·n.
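A hypothetical harness reflecting this setup (30 independent runs and a budget of 5000·n fitness evaluations per run) might look as follows; `random_search` is only a stand-in for the optimizers actually compared, which would also receive the virtual/actual population size Np = 2n:

```python
import random
import statistics

def random_search(problem, n, budget, seed):
    # stand-in optimizer: evaluates `budget` random points in [-5, 5]^n
    rng = random.Random(seed)
    return min(problem([rng.uniform(-5.0, 5.0) for _ in range(n)])
               for _ in range(budget))

def experiment(optimizer, problem, n, runs=30):
    # one (mean, standard deviation) pair per algorithm/problem entry,
    # as reported in the result tables
    finals = [optimizer(problem, n, budget=5000 * n, seed=r)
              for r in range(runs)]
    return statistics.mean(finals), statistics.stdev(finals)

sphere = lambda x: sum(v * v for v in x)   # toy fitness function
mean, std = experiment(random_search, sphere, n=2)
```

The small n = 2 demo keeps the run cheap; the experiments in the paper use n = 10, n = 30, or the problems' native dimensionality.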
A. Validation of Compact Differential Evolution
This section presents the results of cDE with respect to its
population-based variants. The aim of this section is to prove
that the proposed compact encoding does not deteriorate the
performance of the corresponding population-based algorithm.
In other words, this section shows that the proposed light
version of DE, despite its minimal memory requirement, does
not perform worse than a heavy, standard population-based
DE. In order to pursue this aim, four DE schemes have been
considered.
1) DE/rand/1/bin: F = 0.9 and Cr = 0.9.
2) DE/rand-to-best/1/bin: F = 0.5 and Cr = 0.7.
3) DE/rand-to-best/2/bin: F = 0.5 and Cr = 0.7.
4) The dithered version of DE (DE-dither) presented in [47],
with F = 0.5·(1 + rand(0, 1)) updated at each generation
and Cr = 0.9.
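For scheme 4), the dither rule is a one-liner; the illustrative sketch below shows that the resulting scale factor is uniformly distributed in [0.5, 1]:

```python
import random

def dithered_F(rng):
    # F = 0.5 * (1 + rand(0, 1)), redrawn once per generation
    return 0.5 * (1.0 + rng.random())

rng = random.Random(1)
samples = [dithered_F(rng) for _ in range(1000)]
```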
For each scheme, the corresponding cDE algorithms, with
persistent and nonpersistent elitist strategies, have been tested.
Each cDE algorithm employs the same parameter setting
as the corresponding population-based algorithm. Regarding
the nonpersistent elitist schemes, the parameter controlling
the elite persistence has been set equal to 0.5·n throughout
all the experiments presented in this paper (including ne-cGA
and ne-rcGA). Table I shows the average final results detected
by each DE/rand/1/bin-like algorithm and the corresponding
standard deviation values. The best results are highlighted
in bold face.
In order to strengthen the statistical significance of the
results, the Wilcoxon Rank-Sum test has also been applied
according to the description given in [48], where the
confidence level has been fixed to 0.95. Table II summarizes
the results of the Wilcoxon test for each version of cDE
against its corresponding population-based algorithm. A "+"
indicates the case in which cDE statistically outperforms, for
the corresponding test problem, its corresponding population-based
algorithm; a "=" indicates that the Wilcoxon Rank-Sum
test detects no significant difference, i.e., the two algorithms
have the same performance; a "-" indicates that cDE is
outperformed.
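The decision rule behind these symbols can be sketched as follows: a pure-Python rank-sum comparison using the normal approximation without tie correction, shown for illustration only (in practice a library routine such as scipy.stats.ranksums would be used; final fitness values are compared, lower meaning better):

```python
import math

def ranksum_decision(cde, de, alpha=0.05):
    # pool and rank all final fitness values from the two samples
    n1, n2 = len(cde), len(de)
    pooled = sorted([(v, 0) for v in cde] + [(v, 1) for v in de])
    r1 = sum(rank for rank, (_, grp) in enumerate(pooled, start=1)
             if grp == 0)
    # normal approximation of the rank-sum statistic
    mu = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (r1 - mu) / sd
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
    if p >= alpha:
        return "="          # no significant difference
    return "+" if r1 < mu else "-"   # lower ranks = lower (better) fitness
```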
Results of the Wilcoxon test show that both the persistent
and nonpersistent elitist cDE versions significantly outperform
their corresponding population-based DE. Regarding the
schemes DE/rand/1/bin, DE/rand-to-best/1/bin, and DE/rand-to-best/2/bin,
for almost all the problems analyzed we can
conclude that cDE has a performance at least as good as
DE. The only exception over the 144 cases analyzed (48 test
problems × 3 mutation schemes considered) is test problem
f13. Only in this case does population-based DE seem to be more
promising than the compact variants (regardless of the mutation
scheme). The reason for the overall success of cDE schemes
TABLE II
Wilcoxon Test for the Validation Study
rand/1/bin rand-to-Best/1/bin rand-to-Best/2/bin Dither
pe-cDE vs. DE ne-cDE vs. DE pe-cDE vs. DE ne-cDE vs. DE pe-cDE vs. DE ne-cDE vs. DE pe-cDE vs. DE ne-cDE vs. DE
n = 10
f1 + + + + + + + +
f2 + + + + + + + =
f3 + + + + + + =
f4 + + + = + +
f5 + + + + + +
f6 + + + + + +
f7 + + + + + +
f8 + + + +
f9 + + + + + + =
f10 + + + + + +
f11 + = + + = + + +
f12 = + = + + +
f13 = = =
f14 + + + + + + = =
f15 + + + + + +
f16 + + = = + + + +
f17 = = + + + + = +
f18 + + + + + + =
f19 + = + = + = + =
f20 + + + + + + + =
n = 30
f1 + + + + + +
f2 + + + + +
f3 + + + + + =
f4 + + + + + + =
f5 + + + + + +
f6 + + + + + + =
f7 + + + + + +
f8 + + + +
f9 + + + + + + +
f10 + + + + + +
f11 + + + + + + =
f12 + + + + +
f13 + + + + + +
f14 + + + + + + + +
f15 + + + + + +
f16 + + + + + +
f17 + + + + + +
f18 + + + + + +
f19 + + + + + + + +
f20 + + + + + + = =
n various
f21 = = = = = = = =
f22 = = + + = = + +
f23 + = + + + + = =
f24 = = + = + + = =
f25 + + + + + + = +
f26 = + = + = +
f27 = + = + = +
f28 = + = + = +
"+" means that cDE outperforms DE, "-" means that cDE is outperformed, and "=" means that the algorithms have the same performance.
Fig. 7. Performance trends cDE/rand/1/bin vs. DE/rand/1/bin. (a) f2 with n = 10. (b) f15 with n = 10. (c) f18 with n = 10.
with respect to population-based DE is, in our opinion, due
to the implicit randomization of the search logic within the
DE structure. As explained above, a certain degree of
randomization seems to be beneficial for DE performance, and
the reported results seem to confirm this intuition. Finally, the
memory saving of cDE with respect to DE should be remarked.
A DE scheme requires Np permanent memory slots in
order to store the population and two volatile memory slots, the
first being reserved for offspring generation and the second
for keeping track of the replacement to be performed at the
end of each generation. A cDE scheme (regardless of the
elitist strategy) requires three permanent memory slots, two
reserved for the PV and one for the elite, and one slot of
volatile memory for the offspring generation (four memory
slots in total). Since Np must be set depending on the
number of variables n of the problem, it is clear that if
the problem is multivariate there is a very relevant memory
saving. For example, if n = 10 [which can be the number of
variables in a simple micro-controller (see [22], [39])], DE
would require, e.g., Np = 2·n = 20 permanent memory slots
while cDE requires only four memory slots. Thus, with these small
memory requirements, the optimization would be allowed on
modest and relatively cheap hardware (a micro-controller), thus
leading to a competitive product for the industrial market.
Since the employment of compact algorithms does not cause a
worsening in the DE performance (unlike what often happens
with GAs), the proposed cDE algorithms are, in our opinion,
an appealing alternative for optimization in many industrial
contexts.
Regarding the dithered schemes, the population-based DE
has a better performance than the cDE variants, but the
compact algorithms still obtain a respectable performance. This
fact can be interpreted by considering that DE-dither already
employs a certain degree of randomization, which leads to a good
performance. Consequently, the randomization introduced by
the compact algorithms within the sampling of the individuals
does not lead to further advantages. However, taking into
account that the memory requirement of the cDE-dither
algorithms is far smaller than that of DE-dither, we can
conclude that the results are overall satisfactory.
Fig. 7 shows average performance trends of cDE/rand/1/bin
and its population-based DE algorithm over a selection of the
test problems listed above.
B. Comparison with State-of-the-Art Compact Evolutionary
Algorithms
In order to analyze the performance of the proposed
cDE algorithms with respect to other cEAs, pe-cDE/rand-to-
best/1/bin and ne-cDE/rand-to-best/1/bin have been compared
with pe-cGA and ne-cGA proposed in [14] and pe-rcGA and
ne-rcGA proposed in [16]. The choice of the rand-to-best
mutation scheme is due to the fact that it displayed the best
TABLE III
Average Final Fitness ± Standard Deviation for cDE Algorithms Against the State-of-the-Art cEAs

Problem | pe-cGA [14] | ne-cGA [14] | pe-rcGA [16] | ne-rcGA [16] | pe-cDE/rand-to-best/1/bin | ne-cDE/rand-to-best/1/bin
n = 10
f1  | 6.963e+02 ± 5.93e+02 | 6.996e+02 ± 5.56e+02 | 9.783e-26 ± 4.40e-25 | 1.898e-28 ± 2.42e-28 | 3.096e-03 ± 2.40e-03 | 2.021e-02 ± 1.22e-02
f2  | 8.935e+03 ± 6.07e+03 | 1.049e+04 ± 7.25e+03 | 1.188e+03 ± 1.15e+03 | 3.350e+01 ± 5.46e+01 | 1.460e+00 ± 1.29e+00 | 7.179e+01 ± 9.77e+01
f3  | 3.609e+07 ± 4.94e+07 | 3.143e+07 ± 3.00e+07 | 2.467e+02 ± 3.95e+02 | 8.098e+01 ± 1.67e+02 | 3.559e+01 ± 3.19e+01 | 6.370e+02 ± 1.27e+03
f4  | 9.726e+00 ± 1.83e+00 | 9.667e+00 ± 2.65e+00 | 1.013e+01 ± 5.59e+00 | 1.925e-01 ± 4.40e-01 | 2.219e-02 ± 9.17e-03 | 7.846e-02 ± 2.87e-02
f5  | 1.038e+01 ± 2.07e+00 | 1.112e+01 ± 2.64e+00 | 7.388e+00 ± 5.23e+00 | 3.453e-01 ± 7.20e-01 | 1.750e+00 ± 1.10e+00 | 2.218e-01 ± 3.11e-01
f6  | 2.496e+02 ± 8.17e+00 | 2.496e+02 ± 1.00e+01 | 7.413e-02 ± 6.94e-02 | 1.119e-02 ± 1.58e-02 | 5.150e-02 ± 3.26e-02 | 2.922e-02 ± 1.39e-02
f7  | 2.509e+02 ± 1.12e+01 | 2.514e+02 ± 8.67e+00 | 1.681e-01 ± 1.01e-01 | 7.789e-02 ± 7.81e-02 | 2.679e-01 ± 1.61e-01 | 3.852e-02 ± 3.38e-02
f8  | 4.106e+01 ± 1.23e+01 | 4.419e+01 ± 1.43e+01 | 3.366e+01 ± 1.20e+01 | 8.291e+00 ± 3.47e+00 | 5.503e-03 ± 6.92e-03 | 2.916e+00 ± 1.31e+00
f9  | 5.815e+01 ± 1.36e+01 | 5.365e+01 ± 1.30e+01 | 3.640e+01 ± 1.11e+01 | 1.244e+01 ± 3.76e+00 | 2.380e+01 ± 1.11e+01 | 3.104e+01 ± 5.21e+00
f10 | 8.543e+03 ± 6.89e+03 | 8.078e+03 ± 3.73e+03 | 2.351e+01 ± 1.01e+01 | 8.996e+00 ± 3.79e+00 | 7.154e+00 ± 2.62e+00 | 5.726e+01 ± 1.61e+01
f11 | 8.306e+02 ± 2.90e+02 | 8.164e+02 ± 3.07e+02 | 5.528e+02 ± 2.85e+02 | 2.702e+01 ± 5.39e+01 | 2.502e-02 ± 2.19e-02 | 2.344e+00 ± 4.07e+00
f12 | 3.225e+02 ± 1.27e+01 | 3.238e+02 ± 1.47e+01 | 1.070e+01 ± 2.91e+01 | 6.016e-04 ± 2.94e-03 | 1.261e+02 ± 1.25e+02 | 2.744e+01 ± 5.50e+01
f13 | 9.481e+02 ± 3.29e+01 | 9.449e+02 ± 3.15e+01 | 5.294e+02 ± 8.23e+01 | 5.637e+02 ± 9.79e+01 | 7.842e+02 ± 1.71e+02 | 7.239e+02 ± 1.78e+02
f14 | 5.614e+00 ± 2.91e+00 | 5.239e+00 ± 2.21e+00 | 4.127e+00 ± 4.90e+00 | 1.091e-03 ± 5.34e-03 | 1.065e-02 ± 4.11e-03 | 3.094e-02 ± 8.61e-03
f15 | 3.309e+01 ± 1.13e+01 | 3.428e+01 ± 1.35e+01 | 1.000e+02 ± 5.06e-09 | 1.000e+02 ± 5.25e-08 | 1.000e+02 ± 7.17e-04 | 9.999e+01 ± 6.29e-03
f16 | 4.472e+04 ± 1.37e+05 | 1.053e+05 ± 3.37e+05 | 1.401e+00 ± 1.91e+00 | 2.724e-01 ± 9.73e-01 | 2.316e-04 ± 2.54e-04 | 1.002e-03 ± 1.05e-03
f17 | 1.121e+06 ± 3.66e+06 | 3.765e+05 ± 4.11e+05 | 7.869e-01 ± 8.94e-01 | 1.095e+00 ± 2.55e-01 | 1.148e+00 ± 2.18e-03 | 1.146e+00 ± 3.59e-03
f18 | 2.176e+03 ± 1.22e+03 | 1.919e+03 ± 1.05e+03 | 1.438e+02 ± 5.36e+02 | 2.614e+02 ± 7.23e+01 | 2.826e+02 ± 1.95e+01 | 2.842e+02 ± 1.78e+01
f19 | 9.961e+01 ± 1.06e+00 | 1.001e+02 ± 1.24e+00 | 9.650e+01 ± 1.64e+00 | 9.827e+01 ± 1.29e+00 | 9.650e+01 ± 1.08e+00 | 9.940e+01 ± 6.26e-01
f20 | 5.786e+03 ± 4.96e+03 | 4.905e+03 ± 3.76e+03 | 3.256e+03 ± 5.16e+03 | 7.859e+03 ± 6.16e+03 | 6.142e+02 ± 1.24e+03 | 2.171e+03 ± 9.27e+02
n = 30
f1  | 1.446e+04 ± 4.63e+03 | 1.253e+04 ± 3.52e+03 | 1.906e+04 ± 9.62e+03 | 8.075e+02 ± 1.00e+03 | 5.334e-17 ± 4.99e-17 | 1.816e-15 ± 9.57e-16
f2  | 1.628e+06 ± 7.00e+05 | 1.582e+06 ± 5.49e+05 | 2.677e+04 ± 4.78e+03 | 3.082e+04 ± 6.51e+03 | 4.628e+03 ± 2.54e+03 | 1.183e+04 ± 4.15e+03
f3  | 2.432e+09 ± 1.62e+09 | 2.784e+09 ± 1.43e+09 | 1.803e+09 ± 2.02e+09 | 2.861e+07 ± 5.83e+07 | 4.141e+02 ± 1.72e+03 | 5.218e+01 ± 4.75e+01
f4  | 1.681e+01 ± 9.45e-01 | 1.609e+01 ± 1.61e+00 | 1.859e+01 ± 4.15e-01 | 1.196e+01 ± 2.24e+00 | 1.930e+00 ± 2.17e+00 | 1.351e-01 ± 3.67e-01
f5  | 1.721e+01 ± 1.41e+00 | 1.708e+01 ± 1.25e+00 | 1.880e+01 ± 4.54e-01 | 1.214e+01 ± 2.18e+00 | 3.935e+00 ± 1.60e+00 | 6.886e-01 ± 7.34e-01
f6  | 8.840e+02 ± 3.08e+01 | 8.951e+02 ± 4.65e+01 | 2.259e-03 ± 4.11e-03 | 1.026e-03 ± 3.77e-03 | 1.240e-02 ± 1.08e-02 | 8.663e-02 ± 3.95e-02
f7  | 8.778e+02 ± 3.34e+01 | 8.791e+02 ± 4.26e+01 | 3.403e-02 ± 9.71e-02 | 4.380e-02 ± 9.03e-02 | 1.955e-01 ± 2.16e-01 | 5.561e-02 ± 6.78e-02
f8  | 2.265e+02 ± 4.06e+01 | 2.107e+02 ± 2.91e+01 | 2.037e+02 ± 2.74e+01 | 9.556e+01 ± 2.06e+01 | 5.605e+01 ± 1.24e+01 | 1.203e+02 ± 2.53e+01
f9  | 3.013e+02 ± 4.72e+01 | 2.992e+02 ± 4.29e+01 | 1.985e+02 ± 3.06e+01 | 1.642e+02 ± 3.85e+01 | 1.046e+02 ± 2.70e+01 | 2.202e+02 ± 3.02e+01
f10 | 1.307e+05 ± 4.96e+04 | 1.295e+05 ± 3.77e+04 | 2.900e+03 ± 3.07e+03 | 3.108e+02 ± 1.70e+02 | 3.664e+02 ± 1.55e+02 | 3.799e+03 ± 1.43e+03
f11 | 4.947e+03 ± 7.03e+02 | 4.844e+03 ± 7.40e+02 | 3.156e+03 ± 7.54e+02 | 1.231e+03 ± 4.78e+02 | 6.308e+02 ± 2.56e+02 | 5.684e+02 ± 2.89e+02
f12 | 8.947e+02 ± 6.00e+01 | 9.024e+02 ± 7.53e+01 | 9.373e+01 ± 3.59e+01 | 6.181e+01 ± 1.82e+01 | 8.347e+01 ± 1.27e+02 | 4.179e+01 ± 7.78e+01
f13 | 1.012e+03 ± 2.04e+01 | 1.020e+03 ± 2.33e+01 | 1.095e+03 ± 6.62e+01 | 9.379e+02 ± 1.61e+01 | 9.000e+02 ± 2.16e-01 | 8.958e+02 ± 2.04e+01
f14 | 5.830e+01 ± 1.50e+01 | 5.168e+01 ± 1.05e+01 | 9.348e+01 ± 1.59e+01 | 2.259e+01 ± 1.18e+01 | 6.754e-02 ± 2.22e-01 | 6.180e-08 ± 1.21e-07
f15 | 6.992e+01 ± 6.25e+00 | 7.024e+01 ± 6.14e+00 | 6.334e+01 ± 3.10e+01 | 9.773e+01 ± 3.58e+00 | 1.000e+02 ± 2.40e-07 | 9.996e+01 ± 1.88e-02
f16 | 3.350e+07 ± 1.84e+07 | 2.734e+07 ± 1.93e+07 | 8.438e+05 ± 2.24e+06 | 1.716e+04 ± 8.39e+04 | 7.346e-02 ± 1.63e-01 | 4.323e-02 ± 1.10e-01
f17 | 8.933e+07 ± 5.24e+07 | 9.072e+07 ± 6.46e+07 | 2.080e+07 ± 2.89e+07 | 3.755e+03 ± 1.19e+04 | 1.052e+00 ± 3.08e-01 | 1.148e+00 ± 4.56e-03
f18 | 1.172e+04 ± 2.57e+03 | 1.184e+04 ± 2.15e+03 | 8.975e+03 ± 2.38e+03 | 6.679e+03 ± 1.32e+03 | 5.662e+03 ± 1.58e+03 | 4.817e+03 ± 1.98e+03
f19 | 1.307e+02 ± 2.84e+00 | 1.321e+02 ± 2.17e+00 | 1.242e+02 ± 2.76e+00 | 1.282e+02 ± 2.89e+00 | 1.214e+02 ± 2.41e+00 | 1.303e+02 ± 1.14e+00
f20 | 2.958e+05 ± 1.05e+05 | 2.241e+05 ± 6.04e+04 | 3.089e+05 ± 1.38e+05 | 3.359e+05 ± 1.08e+05 | 3.535e+04 ± 2.37e+04 | 1.045e+05 ± 3.35e+04
n various
f21 | 5.296e-02 ± 9.80e-09 | 5.296e-02 ± 3.52e-08 | 5.296e-02 ± 2.32e-17 | 5.296e-02 ± 4.80e-18 | 5.296e-02 ± 9.65e-13 | 5.296e-02 ± 1.39e-11
f22 | 9.632e-01 ± 4.22e-02 | 9.733e-01 ± 2.60e-02 | 1.067e+00 ± 4.09e-16 | 1.067e+00 ± 4.54e-16 | 1.067e+00 ± 1.17e-07 | 1.067e+00 ± 1.36e-06
f23 | 2.337e+01 ± 1.14e+00 | 2.307e+01 ± 9.49e-01 | 3.979e-01 ± 9.70e-13 | 3.980e-01 ± 3.61e-04 | 3.979e-01 ± 1.03e-07 | 3.979e-01 ± 2.67e-07
f24 | 3.760e+00 ± 1.95e-02 | 3.760e+00 ± 2.07e-02 | 3.863e+00 ± 1.94e-15 | 3.863e+00 ± 2.25e-15 | 3.863e+00 ± 5.40e-09 | 3.863e+00 ± 1.59e-09
f25 | 4.819e-01 ± 4.27e-02 | 4.795e-01 ± 5.60e-02 | 3.238e+00 ± 5.53e-02 | 3.268e+00 ± 6.07e-02 | 3.286e+00 ± 5.61e-02 | 3.322e+00 ± 1.63e-03
f26 | 2.773e+00 ± 1.13e+00 | 2.458e+00 ± 1.00e+00 | 6.458e+00 ± 2.47e+00 | 5.281e+00 ± 1.21e+00 | 5.366e+00 ± 3.36e+00 | 9.948e+00 ± 3.41e-01
f27 | 2.628e+00 ± 9.63e-01 | 3.001e+00 ± 1.09e+00 | 7.258e+00 ± 3.01e+00 | 6.598e+00 ± 3.14e+00 | 6.639e+00 ± 3.18e+00 | 1.012e+01 ± 8.63e-01
f28 | 2.764e+00 ± 9.00e-01 | 2.485e+00 ± 8.24e-01 | 6.940e+00 ± 3.19e+00 | 6.262e+00 ± 2.61e+00 | 5.991e+00 ± 3.53e+00 | 1.040e+01 ± 1.35e-01
performance over the other mutation schemes for the test
problems under consideration. Results of the original cGA
are not reported here because it is well known that the elitist
variants proposed in [14] outperform the original cGA. The
48 test problems described above have been considered for the
comparison. The same setup shown previously, in terms of
computational budget and virtual population size, has been
employed for this comparison.
Numerical results are reported in Table III, while results of
the related Wilcoxon tests are listed in Tables IV and V. Fig.
8 shows some performance trends for this set of numerical
experiments.
Numerical results show that the cDE schemes significantly
outperform cGAs for all the considered test problems. The
comparison between cDE and rcGA shows that cDE algorithms
perform in most cases better than rcGAs and in some cases
have comparable performance. Two important considerations
must be made. The first is that rcGAs tend
to be competitive with cDE algorithms for low-dimensional
problems (in this case n = 10), while cDE schemes appear to be
more promising when n = 30. This fact can, in our opinion, be
explained as a consequence of the DE search logic. More
specifically, the rcGAs generate candidate solutions directly
by means of the PV and compare them one by one with
the current elite. In the late stage of the optimization, this
mechanism can lead to the generation of offspring solutions
very similar to the elite and thus can result in convergence.
In relatively high-dimensional cases (already for n = 30)
the convergence is likely to happen prematurely on solutions
which are not necessarily characterized by a high performance.
On the contrary, cDE algorithms generate the offspring after
having manipulated several solutions sampled from the PV.
This fact allows a more explorative behavior and mitigates
the risk of premature convergence. In [16] it was shown
Fig. 8. Performance trends cDE/rand-to-best/1/bin vs. the state-of-the-art cEAs. (a) f1 with n = 30. (b) f16 with n = 30. (c) f3 with n = 30.
that rcGAs behaved promisingly with respect to cGAs
in the presence of (relatively) highly multivariate fitness
problems. On the basis of the results with n = 30, cDE
schemes seem to be definitely more promising for relatively
large scale problems (i.e., n = 30) compared to all the other
compact algorithms present in the literature. The second important
consideration is that rcGAs display their most promising
behavior when high exploitation is required, as for example
for sphere problems. In these cases, the algorithm needs to
exploit the basin of attraction in order to quickly reach the
global optimum. For these problems, cDE schemes (as well
as population-based DE) can be too explorative and thus too
slow at reaching high quality solutions. In order to correct
this drawback it will be important, in the future, to integrate
local search algorithms within cDE structures by
coordinating the various algorithmic components in the fashion
of Meta-Lamarckian learning [49].
C. Comparison with the State-of-the-Art Estimation of
Distribution Algorithms
The persistent and nonpersistent versions of cDE/rand/1/bin
have been tested against two modern EDAs.
1) Estimation of distribution algorithm with multivariate
Gaussian model (EDA_mvg) proposed in [50]. EDA_mvg
has been run with learning rate 0.2, population
size Np = 4, selection ratio 1, and maximum
amplification value Q = 1.5.
2) Histogram population-based incremental learning for
continuous problems (HPBIL_c) proposed in [51].
HPBIL_c has been run with a learning rate of 0.2, 10
divisions of the domain, number of promising individuals
S = 5, and population size Np = 4.
It must be remarked that in this set of experiments the
considered EDAs have been run with memory requirements
comparable to those of cDE. Numerical results are given in
Table VI and in Fig. 9. With pe-E and pe-H we denote the
Wilcoxon tests between pe-cDE/rand/1/bin and EDA_mvg and
HPBIL_c, respectively. Analogously, with ne-E and ne-H we
denote the Wilcoxon tests between ne-cDE/rand/1/bin and
EDA_mvg and HPBIL_c, respectively.
Numerical results show that cDE algorithms display a good
performance with respect to EDAs and, in the case of both
elitist schemes, outperform on average the EDAs considered
in this paper. In conclusion, cDE algorithms appear to be very
promising compared with other algorithms having similar
memory requirements.
D. Comparison with the State-of-the-Art Population-Based
Algorithms
The cDE has been compared with some modern algo-
rithms. More specically, ne-cDE/rand-to-best/1/bin has been
Fig. 9. Performance trends cDE/rand/1/bin vs. the state-of-the-art EDAs. (a) f7 with n = 10. (b) f8 with n = 30. (c) f20 with n = 30.
compared with the following DE-based modern algorithms
(see [25], and references therein):
1) j-differential evolution (jDE) with F_l = 0.1, F_u = 0.9,
and τ_1 = τ_2 = 0.1 (see [24]);
2) adaptive differential evolution (JADE) with c = 0.1
(see [41]);
3) differential evolution with global and local neighborhoods
(DEGL) with α = β = F = 0.7 and CR = 0.3
(see [37]);
4) self-adaptive differential evolution (SADE) (see [42]).
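As an example of the kind of parameter adaptation these competitors employ, the jDE rule of algorithm 1) can be sketched as follows (an illustrative re-implementation of the update described in [24], not the authors' code): each individual carries its own F and Cr, which are re-sampled with probabilities τ_1 and τ_2 before offspring generation and otherwise inherited.

```python
import random

def jde_update(F, Cr, F_l=0.1, F_u=0.9, tau1=0.1, tau2=0.1, rng=random):
    # with probability tau1, draw a fresh scale factor in [F_l, F_l + F_u)
    if rng.random() < tau1:
        F = F_l + rng.random() * F_u
    # with probability tau2, draw a fresh crossover rate in [0, 1)
    if rng.random() < tau2:
        Cr = rng.random()
    return F, Cr

rng = random.Random(2)
pairs = [jde_update(0.5, 0.9, rng=rng) for _ in range(1000)]
```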
In addition, pe-cDE/rand-to-best/1/bin has been compared with
the covariance matrix adaptation evolution strategy (CMA-ES)
proposed in [52]. For all the considered algorithms, the
population size Np (actual or virtual) has been set equal to
2·n. Table VII displays the numerical results of cDE against
the modern algorithms mentioned above.
Numerical results displayed in Tables VII and VIII show
that ne-cDE/rand-to-best/1/bin can be competitive, at least
for some of the considered problems, with the state-of-the-art
population-based algorithms taken into account. Clearly, cDE
algorithms are not expected to outperform modern complex
algorithms, for the following two reasons: first, cDE
employs a memory structure which is many times smaller than
those of the algorithms taken into account in this section
(especially JADE and SADE, which make use of an archive);
second, the ne-cDE/rand-to-best/1/bin algorithm is supposed
to be a light version of DE/rand-to-best/1/bin and does not
employ any extra search component. Since it is well known
that modern DE-based algorithms outperform a standard
DE/rand-to-best/1/bin, it is obvious that the cheap and light
version of a standard DE cannot outperform complex and
modern algorithms which have been designed for performance
rather than lightness. In this sense, the numerical results in
this section should be read in the following way: notwithstanding
its significant disadvantage, ne-cDE/rand-to-best/1/bin
is not only capable of outperforming, on a regular basis, its
population-based version, but also displays a respectable
performance when compared with complex, modern, and
memory-consuming algorithms. Future studies will consider
the integration of more advanced search techniques into the
presented compact framework.
V. Case Study: Optimal Control of a Tubular
Linear Synchronous Motor
To illustrate the usefulness of cDE in environments with
limited computational resources, this section considers its
application to a challenging online optimization problem.
More specifically, the optimization algorithm is used to
automatically design a component of a closed-loop position
TABLE IV
Wilcoxon Test pe-cDE/rand-to-best/1/bin vs. the State-of-the-Art cEAs

Problem | pe-cGA [14] | ne-cGA [14] | pe-rcGA [16] | ne-rcGA [16]
n = 10
f1  + +
f2  + + + +
f3  + + + +
f4  + + + +
f5  + + +
f6  + + = =
f7  + + =
f8  + + + +
f9  + + +
f10 + + + +
f11 + + + +
f12 + +
f13 + + =
f14 + + +
f15 + + = =
f16 + + + +
f17 = + = +
f18 + + + +
f19 + + = +
f20 + + + +
n = 30
f1  + + + +
f2  + + + +
f3  + + + +
f4  + + + +
f5  + + + +
f6  + + =
f7  + +
f8  + + + +
f9  + + + +
f10 + + + =
f11 + + + +
f12 + + = =
f13 + + + +
f14 + + + +
f15 + + + +
f16 + + = +
f17 + + + +
f18 + + + +
f19 + + + +
f20 + + + +
n various
f21 + = = =
f22 + + = =
f23 + + = =
f24 + + = =
f25 + + + =
f26 + + = =
f27 + + =
f28 + + = =
"+" means that cDE outperforms the compact opponent, "-" means that cDE is outperformed, and "=" means that the algorithms have the same performance.
TABLE V
Wilcoxon Test ne-cDE/rand-to-best/1/bin vs. the State-of-the-Art cEAs

Problem | pe-cGA [14] | ne-cGA [14] | pe-rcGA [16] | ne-rcGA [16]
n = 10
f1  + +
f2  + + + =
f3  + + =
f4  + + + +
f5  + + + +
f6  + + + =
f7  + + + +
f8  + + + +
f9  + + + =
f10 + +
f11 + + + +
f12 + +
f13 + + = =
f14 + + +
f15 + + = =
f16 + + + +
f17 + + + +
f18 + + + +
f19 = + = =
f20 + + = +
n = 30
f1  + + + +
f2  + + + +
f3  + + + +
f4  + + + +
f5  + + + +
f6  + +
f7  + + = =
f8  + + + =
f9  + + = =
f10 + + =
f11 + + + +
f12 + + + =
f13 + + + +
f14 + + + +
f15 + + + +
f16 + + = =
f17 + + + =
f18 + + + +
f19 + + = =
f20 + + + +
n various
f21 + = = =
f22 + + = =
f23 + + =
f24 + + = =
f25 + + + +
f26 + + + +
f27 + + + +
f28 + + + +
"+" means that cDE outperforms the compact opponent, "-" means that cDE is outperformed, and "=" means that the algorithms have the same performance.
Fig. 10. Scheme of a TLSM.
control system for a permanent-magnet tubular linear
synchronous motor (TLSM). In short, a TLSM is a
three-phase linear motor which includes a mover containing
the three-phase windings and a tubular rod containing
the permanent magnets (see Fig. 10). The permanent magnets
are cylindrically shaped, axially magnetized, and uniformly
distributed so as to form an alternating sequence of magnets
and spacers. The three-phase windings are wrapped around
the rod, and the mover does not contain magnetic material.
This makes it possible to exploit the magnetic flux with good
efficiency and to avoid cogging forces. The motor is driven by
a current-controlled pulsewidth modulation voltage source
inverter. This type of motor is often adopted for high-precision
applications, as it can guarantee position resolutions of the
order of micrometers and below.
TLSMs are often directly coupled with their load (reduction
or rotary-to-linear conversion gears are unnecessary), but the
absence of reduction gears makes their performance strongly
influenced by the uncertainties regarding electro-mechanical
parameters [53]. Furthermore, TLSMs exhibit mechanical
resonances, especially at high acceleration and deceleration
regimes, which vary with different operating conditions and
from machine to machine [54]. Therefore, as confirmed
by recent literature on the subject [53]-[55], advanced
feedback control strategies capable of coping with the effect of
uncertainties and the complexity of tuning procedures are a
subject of particular interest.
The control scheme presented in this paper is based on
a combination of sliding mode control with recurrent neural
networks. The sliding mode design theory is used to obtain a
general scheme in which the stability of the closed loop can be
proven independently of the particular value of the payload
mass. The design of this controller is based on the criteria
developed in [56]. In the original scheme, a linear system is
used to estimate an equivalent disturbance when the controller
operates in a predefined region of the error plane. Since the
disturbance is caused by inherently nonlinear or time-varying
phenomena such as stiction or payload changes, this section
describes a potential enhancement of the scheme in which the
linear system is replaced by a recurrent neural network whose
weights are tuned with the cDE, according to the procedure
summarized in the following.
A. Problem Statement
In the typical d-q reference frame used to control synchronous
motors, the mover equation of a TLSM synchronously moving at
\dot{x} m/s is as follows:

v_{dq} = L \frac{di_{dq}}{dt} + R i_{dq} + j \dot{x} \frac{\pi}{\tau_p} L i_{dq} + j \dot{x} \frac{\pi}{\tau_p} \psi_f \qquad (8)

where v is the mover voltage vector, i is the mover current
vector, R is the mover resistance, L is the mover inductance,
\dot{x} is the speed of the mover, \psi_f is the mover magnet flux
linkage, and \tau_p is the pole pitch (corresponding to 180 electrical
degrees), while subscripts d and q indicate the two axes of
vector control.

Fig. 11. Block diagram of the position controller.
The electromagnetic force is proportional to the q-axis
current and does not depend on the d-axis current:

F_e = \frac{3}{2} \frac{\pi}{\tau_p} \psi_f i_q = K_f i_q. \qquad (9)
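Using the rated values quoted in the experimental section (K_f = 31.2 N/A, rated current 2.0 A, \tau_p = 25.6 mm), relation (9) can be checked numerically; the snippet below is only an illustrative sanity check of the force equation, not part of the control implementation.

```python
import math

# Rated values reported for the test bench of Section V-C.
K_f = 31.2        # force constant, N/A
i_q = 2.0         # rated q-axis current, A
tau_p = 25.6e-3   # pole pitch, m

# Force at rated current, from (9): F_e = K_f * i_q.
F_e = K_f * i_q
print(F_e)  # 62.4 N

# Magnet flux linkage implied by K_f = (3/2)(pi/tau_p) * psi_f, about 0.169 Wb.
psi_f = 2 * tau_p * K_f / (3 * math.pi)
print(psi_f)
```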
According to standard practice in TLSM control, the i_d and i_q
control loops are closed by two identical PI controllers
that make the current transients negligible with respect to
the mechanical dynamics. Therefore, it will be assumed
hereafter that the references are equal to the actual values during
speed and position transients (i.e., i_d^* = i_d = 0 and i_q^* = i_q).
Thus, the current i_q will be regarded as the actual control
signal. Since the d-axis current does not contribute to the
force production, it is controlled to zero. The q-axis current
reference is the output of the position controller whose block
diagram is reported in Fig. 11.
The mathematical model of the TLSM is completed by the
mechanical equation

M \ddot{x} = F_e - F \qquad (10)

where M is the mover mass, and F is the unknown force
caused by friction, load forces, and other uncertain phenomena.
According to standard sliding mode design arguments, the
controller must be designed so as to make the trajectory of
the system in error coordinates reach the line (the sliding
manifold) defined by the following equation:

s_x(x, t) = \dot{x}^* - \dot{x} + \lambda_x e_x = 0 \qquad (11)

where e_x = x^* - x is the tracking error, x^* is the position
reference, and \lambda_x > 0 is a design parameter. Once the sliding
manifold has been entered, the trajectory of the system can
TABLE VI
Average Final Fitness ± Standard Deviation and Wilcoxon Tests for cDE Algorithms Against the State-of-the-Art EDAs

Problem | pe-cDE/rand/1/bin | ne-cDE/rand/1/bin | EDA_mvg [50] | HPBIL_c [51] | pe-E [50], pe-H [51], ne-E [50], ne-H [51]

n = 10
f1 | 2.683e-11 ± 1.54e-11 | 5.389e-10 ± 5.48e-10 | 0.000e+00 ± 0.00e+00 | 6.705e+02 ± 1.69e+02 | + +
f2 | 3.935e-03 ± 8.78e-03 | 1.898e+02 ± 3.53e+02 | 1.284e-15 ± 6.27e-15 | 1.538e+03 ± 4.22e+02 | + +
f3 | 2.608e+02 ± 1.05e+03 | 7.711e+02 ± 2.11e+03 | 4.499e+05 ± 8.43e+05 | 2.846e+06 ± 1.88e+06 | + + + +
f4 | 1.891e-06 ± 5.08e-07 | 7.758e-06 ± 3.28e-06 | 8.882e-16 ± 0.00e+00 | 9.577e+00 ± 4.37e-01 | + +
f5 | 1.976e+00 ± 9.75e-01 | 3.516e-01 ± 6.81e-01 | 2.120e+00 ± 8.05e-01 | 9.333e+00 ± 6.26e-01 | = + + +
f6 | 8.419e-03 ± 1.02e-02 | 1.131e-03 ± 3.85e-03 | 6.807e-01 ± 1.90e-01 | 6.330e+00 ± 1.33e+00 | + + + +
f7 | 2.400e-01 ± 1.50e-01 | 6.309e-02 ± 8.05e-02 | 1.609e+01 ± 6.43e+00 | 5.889e+00 ± 1.63e+00 | + + + +
f8 | 1.658e-01 ± 3.79e-01 | 6.039e+00 ± 2.92e+00 | 3.711e+01 ± 6.56e+00 | 3.913e+01 ± 3.52e+00 | + + + +
f9 | 3.093e+01 ± 1.24e+01 | 3.712e+01 ± 5.73e+00 | 2.484e+01 ± 8.00e+00 | 4.325e+01 ± 5.44e+00 | + +
f10 | 5.125e+00 ± 2.60e+00 | 1.725e+01 ± 1.33e+01 | 3.616e+01 ± 7.61e+00 | 1.194e+04 ± 4.09e+03 | + + + +
f11 | 1.365e-02 ± 1.68e-02 | 1.648e-01 ± 3.02e-01 | 1.608e+03 ± 1.78e+02 | 1.393e+03 ± 1.62e+02 | + + + +
f12 | 1.167e+02 ± 1.27e+02 | 4.295e+01 ± 5.26e+01 | 1.794e+02 ± 1.72e+01 | 1.320e+02 ± 2.94e+01 | + = + +
f13 | 7.813e+02 ± 1.74e+02 | 5.896e+02 ± 1.32e+02 | 8.290e+02 ± 1.43e+02 | 7.443e+02 ± 1.35e+02 | = = + +
f14 | 1.292e-06 ± 4.59e-07 | 4.215e-06 ± 1.90e-06 | 0.000e+00 ± 0.00e+00 | 7.145e+00 ± 7.73e-01 | + +
f15 | 1.000e+02 ± 1.79e-08 | 1.000e+02 ± 4.67e-04 | 1.000e+02 ± 0.00e+00 | 8.228e+01 ± 2.14e+00 | + +
f16 | 5.532e-13 ± 4.48e-13 | 7.603e-12 ± 6.54e-12 | 4.712e-32 ± 5.59e-48 | 8.960e+00 ± 2.34e+00 | + +
f17 | 1.150e+00 ± 1.02e-11 | 1.150e+00 ± 8.20e-11 | 6.565e-01 ± 9.90e-02 | 7.947e+02 ± 2.31e+03 | + = + =
f18 | 5.887e+01 ± 4.91e+02 | 2.523e+02 ± 1.51e+02 | 5.033e+03 ± 1.70e+03 | 4.322e+03 ± 7.04e+02 | + + + +
f19 | 9.673e+01 ± 1.10e+00 | 9.924e+01 ± 7.05e-01 | 9.306e+00 ± 5.64e-01 | 9.593e+00 ± 7.89e-01 |
f20 | 5.192e+01 ± 5.91e+02 | 9.240e+02 ± 1.03e+03 | 3.668e+04 ± 4.14e+04 | 2.707e+04 ± 9.92e+03 | + + + +

n = 30
f1 | 1.996e+02 ± 7.36e+02 | 4.828e-28 ± 9.03e-28 | 3.549e+03 ± 7.10e+03 | 1.168e+04 ± 1.31e+03 | + + + +
f2 | 1.282e+04 ± 3.13e+03 | 2.892e+04 ± 4.83e+03 | 1.424e+04 ± 2.78e+04 | 3.283e+04 ± 4.99e+03 | = + +
f3 | 6.535e+06 ± 3.20e+07 | 5.294e+02 ± 1.19e+03 | 7.196e+08 ± 2.14e+09 | 8.977e+08 ± 2.06e+08 | = + = +
f4 | 1.140e+01 ± 1.09e+00 | 1.642e+01 ± 3.68e-01 | 7.858e+00 ± 5.80e+00 | 1.616e+01 ± 4.60e-01 | + =
f5 | 1.264e+01 ± 1.01e+00 | 1.650e+01 ± 4.27e-01 | 3.366e+00 ± 6.74e-01 | 1.604e+01 ± 4.45e-01 | +
f6 | 1.634e-02 ± 2.99e-02 | 6.867e-02 ± 6.75e-02 | 5.543e+02 ± 6.48e+01 | 1.016e+02 ± 1.34e+01 | + + + +
f7 | 2.278e-01 ± 2.74e-01 | 1.867e-01 ± 2.20e-01 | 2.630e+02 ± 4.48e+01 | 9.805e+01 ± 1.36e+01 | + + + +
f8 | 7.384e+01 ± 1.23e+01 | 1.531e+02 ± 2.09e+01 | 2.109e+02 ± 1.10e+01 | 2.506e+02 ± 1.49e+01 | + + + +
f9 | 1.317e+02 ± 2.66e+01 | 2.655e+02 ± 2.28e+01 | 1.809e+02 ± 1.19e+01 | 2.584e+02 ± 1.53e+01 | + + =
f10 | 8.067e+03 ± 4.90e+03 | 3.665e+04 ± 5.28e+03 | 1.148e+05 ± 8.44e+04 | 5.473e+05 ± 7.16e+04 | + + + +
f11 | 1.490e+03 ± 4.14e+02 | 1.939e+03 ± 6.69e+02 | 1.087e+04 ± 1.12e+03 | 7.595e+03 ± 3.37e+02 | + + + +
f12 | 8.962e+01 ± 6.44e+01 | 1.299e+02 ± 2.03e+01 | 1.328e+02 ± 1.02e+02 | 1.908e+02 ± 2.69e+01 | = + = +
f13 | 9.443e+02 ± 2.85e+01 | 9.707e+02 ± 2.96e+01 | 9.510e+02 ± 1.28e+01 | 9.986e+02 ± 7.57e+00 | = + +
f14 | 5.220e+00 ± 2.62e+00 | 9.917e-01 ± 2.69e+00 | 1.329e+01 ± 1.47e+00 | 6.052e+01 ± 7.36e+00 | + + + +
f15 | 9.954e+01 ± 1.10e+00 | 9.930e+01 ± 6.93e-01 | 9.774e+01 ± 1.36e-01 | 3.826e+01 ± 6.52e+00 | + + + +
f16 | 1.216e+00 ± 2.00e+00 | 4.803e-01 ± 6.72e-01 | 8.782e+06 ± 2.80e+07 | 1.209e+06 ± 3.52e+05 | = + = +
f17 | 1.511e-01 ± 1.35e+00 | 3.378e-01 ± 1.25e+00 | 4.388e+05 ± 2.14e+06 | 1.404e+07 ± 3.80e+06 | = + = +
f18 | 8.939e+03 ± 1.57e+03 | 1.003e+04 ± 1.22e+03 | 1.478e+04 ± 4.75e+03 | 2.077e+04 ± 2.92e+03 | + + + +
f19 | 1.248e+02 ± 2.65e+00 | 1.279e+02 ± 1.59e+00 | 4.024e+01 ± 8.99e-01 | 4.007e+01 ± 9.83e-01 |
f20 | 1.013e+05 ± 4.49e+04 | 1.204e+05 ± 4.09e+04 | 6.825e+05 ± 5.70e+05 | 1.227e+06 ± 2.01e+05 | + + + +

n various
f21 | 5.296e-02 ± 7.79e-18 | 5.296e-02 ± 2.75e-12 | 5.296e-02 ± 6.20e-09 | 5.296e-02 ± 5.77e-09 | + + + +
f22 | 1.067e+00 ± 4.22e-16 | 1.067e+00 ± 1.71e-13 | 1.067e+00 ± 3.26e-05 | 1.067e+00 ± 4.62e-05 | + + + +
f23 | 3.980e-01 ± 3.56e-04 | 3.983e-01 ± 5.19e-04 | 3.979e-01 ± 4.92e-05 | 3.980e-01 ± 7.27e-05 | = =
f24 | 3.863e+00 ± 4.29e-11 | 3.863e+00 ± 1.10e-08 | 3.861e+00 ± 1.08e-03 | 3.863e+00 ± 1.45e-04 | + + + +
f25 | 3.288e+00 ± 5.53e-02 | 3.322e+00 ± 8.74e-04 | 3.242e+00 ± 3.70e-02 | 3.234e+00 ± 4.85e-02 | + + + +
f26 | 5.040e+00 ± 3.16e+00 | 9.267e+00 ± 1.37e+00 | 5.786e+00 ± 3.54e+00 | 7.257e+00 ± 2.70e+00 | = + +
f27 | 4.822e+00 ± 3.07e+00 | 9.764e+00 ± 1.16e+00 | 6.653e+00 ± 3.62e+00 | 8.303e+00 ± 1.47e+00 | = + +
f28 | 6.048e+00 ± 3.64e+00 | 1.003e+01 ± 5.77e-01 | 7.758e+00 ± 3.69e+00 | 8.813e+00 ± 1.21e+00 | = + +

"+" means that cDE outperforms the EDA, "−" means that cDE is outperformed, and "=" means that the algorithms have the same performance.
TABLE VII
Average Final Fitness ± Standard Deviation for cDE Against the State-of-the-Art Population-Based Algorithms

Problem | jDE [24] | JADE [41] | DEGL [37] | SADE [42] | CMA-ES [52] | ne-cDE/rand-to-best/1/bin

n = 10
f1 | 3.154e-24 ± 4.99e-24 | 7.465e-49 ± 2.41e-48 | 6.498e-60 ± 3.12e-59 | 1.487e-27 ± 2.83e-27 | 4.346e-252 ± 0.00e+00 | 2.021e-02 ± 1.22e-02
f2 | 8.131e-05 ± 7.75e-05 | 1.631e-25 ± 7.82e-25 | 2.744e-20 ± 6.69e-20 | 6.390e-27 ± 3.08e-26 | 2.993e-73 ± 1.32e-72 | 7.179e+01 ± 9.77e+01
f3 | 1.864e+00 ± 1.09e+00 | 8.667e-01 ± 1.68e+00 | 4.983e-01 ± 1.35e+00 | 1.661e-01 ± 8.14e-01 | 6.012e-02 ± 8.14e-03 | 6.370e+02 ± 1.27e+03
f4 | 9.099e-13 ± 5.11e-13 | 4.595e-15 ± 7.41e-16 | 4.441e-15 ± 0.00e+00 | 1.584e-14 ± 2.11e-14 | 1.004e-01 ± 3.33e-01 | 7.846e-02 ± 2.87e-02
f5 | 3.136e-12 ± 4.11e-12 | 4.441e-15 ± 0.00e+00 | 4.441e-15 ± 0.00e+00 | 1.984e-14 ± 2.50e-14 | 4.813e-02 ± 2.36e-01 | 2.218e-01 ± 3.11e-01
f6 | 0.000e+00 ± 0.00e+00 | 0.000e+00 ± 0.00e+00 | 0.000e+00 ± 0.00e+00 | 0.000e+00 ± 0.00e+00 | 0.000e+00 ± 0.00e+00 | 2.922e-02 ± 1.39e-02
f7 | 0.000e+00 ± 0.00e+00 | 0.000e+00 ± 0.00e+00 | 0.000e+00 ± 0.00e+00 | 0.000e+00 ± 0.00e+00 | 0.000e+00 ± 0.00e+00 | 3.852e-02 ± 3.38e-02
f8 | 0.000e+00 ± 0.00e+00 | 8.927e-01 ± 8.85e-01 | 1.112e+01 ± 3.34e+00 | 1.741e+00 ± 2.16e+00 | 1.168e+01 ± 5.68e+00 | 2.916e+00 ± 1.31e+00
f9 | 1.417e+01 ± 4.44e+00 | 6.831e+00 ± 3.16e+00 | 2.568e+01 ± 4.42e+00 | 8.913e+00 ± 4.28e+00 | 1.401e+01 ± 1.01e+01 | 3.104e+01 ± 5.21e+00
f10 | 3.482e-12 ± 7.17e-12 | 1.091e+00 ± 1.09e+00 | 1.035e+01 ± 2.47e+00 | 1.368e+00 ± 1.57e+00 | 1.843e+01 ± 8.29e+00 | 5.726e+01 ± 1.61e+01
f11 | 1.273e-04 ± 1.90e-13 | 1.476e+02 ± 7.63e+01 | 3.138e+02 ± 1.76e+02 | 3.251e+45 ± 1.45e+46 | 4.150e+03 ± 1.20e-12 | 2.344e+00 ± 4.07e+00
f12 | 1.304e+01 ± 3.44e+01 | 3.913e+01 ± 6.56e+01 | 7.500e+01 ± 9.89e+01 | 1.250e+01 ± 3.38e+01 | 4.783e+01 ± 5.11e+01 | 2.744e+01 ± 5.50e+01
f13 | 6.304e+02 ± 1.74e+02 | 6.740e+02 ± 1.93e+02 | 7.458e+02 ± 1.91e+02 | 5.667e+02 ± 1.40e+02 | 9.000e+02 ± 0.00e+00 | 7.239e+02 ± 1.78e+02
f14 | 1.615e-14 ± 1.31e-14 | 9.552e-25 ± 1.29e-24 | 1.606e-32 ± 3.93e-32 | 1.784e-15 ± 2.98e-15 | 5.015e-117 ± 1.99e-116 | 3.094e-02 ± 8.61e-03
f15 | 1.000e+02 ± 0.00e+00 | 1.000e+02 ± 0.00e+00 | 1.000e+02 ± 0.00e+00 | 1.000e+02 ± 0.00e+00 | 1.000e+02 ± 0.00e+00 | 9.999e+01 ± 6.29e-03
f16 | 2.610e-25 ± 3.71e-25 | 4.712e-32 ± 5.60e-48 | 4.712e-32 ± 5.59e-48 | 1.849e-28 ± 5.56e-28 | 1.352e-02 ± 6.48e-02 | 1.002e-03 ± 1.05e-03
f17 | 1.150e+00 ± 6.81e-16 | 1.150e+00 ± 6.81e-16 | 1.150e+00 ± 6.73e-16 | 1.150e+00 ± 6.80e-16 | 1.052e+00 ± 1.50e-01 | 1.146e+00 ± 3.59e-03
f18 | 3.100e+02 ± 8.92e-12 | 3.100e+02 ± 1.05e-12 | 3.790e-14 ± 1.86e-13 | 3.100e+02 ± 2.83e-12 | 2.027e+02 ± 3.94e+02 | 2.842e+02 ± 1.78e+01
f19 | 9.953e+01 ± 5.16e-01 | 9.911e+01 ± 1.21e+00 | 9.141e+00 ± 9.20e-01 | 9.940e+01 ± 5.52e-01 | 9.454e+01 ± 5.55e+00 | 9.940e+01 ± 6.26e-01
f20 | 1.921e+03 ± 1.35e+03 | 4.735e+03 ± 1.65e+03 | 1.517e+04 ± 1.02e+04 | 2.121e+03 ± 3.97e+03 | 1.101e+02 ± 5.71e+02 | 2.171e+03 ± 9.27e+02

n = 30
f1 | 3.725e-31 ± 9.22e-31 | 1.544e-13 ± 5.16e-13 | 9.489e-75 ± 4.02e-74 | 2.313e-34 ± 1.13e-33 | 8.751e-302 ± 0.00e+00 | 1.816e-15 ± 9.57e-16
f2 | 1.189e+01 ± 8.96e+00 | 1.496e+02 ± 1.46e+02 | 3.027e-03 ± 4.39e-03 | 1.527e-01 ± 1.86e-01 | 1.719e-28 ± 2.91e-28 | 1.183e+04 ± 4.15e+03
f3 | 2.696e+01 ± 2.28e+01 | 3.877e+01 ± 3.05e+01 | 9.967e+00 ± 1.76e+00 | 2.334e+01 ± 2.36e+01 | 7.118e+00 ± 1.07e+00 | 5.218e+01 ± 4.75e+01
f4 | 7.253e-15 ± 1.47e-15 | 1.861e+00 ± 6.73e-01 | 3.151e-01 ± 5.11e-01 | 4.813e-02 ± 2.36e-01 | 8.290e-15 ± 1.45e-15 | 1.351e-01 ± 3.67e-01
f5 | 7.401e-15 ± 1.35e-15 | 2.475e+00 ± 7.25e-01 | 3.880e-02 ± 1.90e-01 | 1.021e-14 ± 4.30e-15 | 7.994e-15 ± 0.00e+00 | 6.886e-01 ± 7.34e-01
f6 | 0.000e+00 ± 0.00e+00 | 2.179e-02 ± 2.97e-02 | 8.502e-03 ± 1.40e-02 | 4.107e-04 ± 2.01e-03 | 0.000e+00 ± 0.00e+00 | 8.663e-02 ± 3.95e-02
f7 | 1.068e-02 ± 3.65e-02 | 1.686e-01 ± 1.79e-01 | 0.000e+00 ± 0.00e+00 | 4.897e-02 ± 1.08e-01 | 0.000e+00 ± 0.00e+00 | 5.561e-02 ± 6.78e-02
f8 | 7.462e-01 ± 7.90e-01 | 2.027e+01 ± 4.38e+00 | 6.892e+01 ± 5.10e+01 | 2.176e+01 ± 1.33e+01 | 3.462e+01 ± 1.42e+01 | 1.203e+02 ± 2.53e+01
f9 | 4.024e+01 ± 9.71e+00 | 2.654e+01 ± 7.28e+00 | 1.817e+02 ± 7.98e+00 | 3.254e+01 ± 8.62e+00 | 3.802e+01 ± 1.32e+01 | 2.202e+02 ± 3.02e+01
f10 | 7.877e-01 ± 8.79e-01 | 2.781e+01 ± 4.43e+00 | 1.105e+02 ± 4.54e+01 | 1.662e+01 ± 1.32e+01 | 5.688e+01 ± 1.44e+01 | 3.799e+03 ± 1.43e+03
f11 | 1.190e+02 ± 9.88e+01 | 2.830e+03 ± 2.25e+02 | 1.842e+03 ± 5.94e+02 | 2.142e+02 ± 7.98e+02 | 1.245e+04 ± 0.00e+00 | 5.684e+02 ± 2.89e+02
f12 | 0.000e+00 ± 0.00e+00 | 1.250e+01 ± 3.38e+01 | 5.833e+01 ± 1.02e+02 | 1.250e+01 ± 4.48e+01 | 3.594e-26 ± 2.45e-26 | 4.179e+01 ± 7.78e+01
f13 | 9.000e+02 ± 0.00e+00 | 9.000e+02 ± 2.84e-13 | 9.000e+02 ± 1.01e-13 | 9.000e+02 ± 1.42e-13 | 9.000e+02 ± 0.00e+00 | 8.958e+02 ± 2.04e+01
f14 | 1.891e-18 ± 1.24e-18 | 1.301e-08 ± 3.20e-08 | 5.422e-38 ± 1.54e-37 | 2.159e-20 ± 4.47e-20 | 7.905e-38 ± 3.87e-37 | 6.180e-08 ± 1.21e-07
f15 | 1.000e+02 ± 4.47e-08 | 9.086e+01 ± 1.92e+01 | 1.000e+02 ± 9.47e-14 | 1.000e+02 ± 1.25e-02 | 3.069e+01 ± 5.39e+01 | 9.996e+01 ± 1.88e-02
f16 | 2.700e-32 ± 1.60e-32 | 1.728e-02 ± 3.95e-02 | 8.255e-02 ± 2.65e-01 | 1.571e-32 ± 5.59e-48 | 8.639e-03 ± 2.93e-02 | 4.323e-02 ± 1.10e-01
f17 | 1.150e+00 ± 4.54e-16 | 1.084e+00 ± 2.98e-01 | 1.149e+00 ± 5.13e-03 | 1.150e+00 ± 4.77e-16 | 8.262e-01 ± 2.35e-01 | 1.148e+00 ± 4.56e-03
f18 | 1.055e+03 ± 5.83e+02 | 4.106e+03 ± 8.23e+02 | 3.259e+02 ± 5.69e+02 | 2.011e+03 ± 6.58e+02 | 1.893e+03 ± 8.73e+02 | 4.817e+03 ± 1.98e+03
f19 | 1.307e+02 ± 1.02e+00 | 1.303e+02 ± 1.11e+00 | 1.309e+02 ± 1.28e+00 | 1.289e+02 ± 4.34e+00 | 1.118e+02 ± 2.20e+01 | 1.303e+02 ± 1.14e+00
f20 | 1.052e+05 ± 3.28e+04 | 1.933e+05 ± 2.91e+04 | 7.909e+05 ± 1.36e+05 | 3.949e+04 ± 3.46e+04 | 1.097e+03 ± 2.10e+03 | 1.045e+05 ± 3.35e+04

n various
f21 | 5.296e-02 ± 1.38e-10 | 5.296e-02 ± 1.20e-10 | 5.296e-02 ± 7.23e-11 | 5.296e-02 ± 2.23e-14 | 5.296e-02 ± 5.54e-18 | 5.296e-02 ± 1.39e-11
f22 | 1.067e+00 ± 4.54e-16 | 1.067e+00 ± 4.54e-16 | 1.067e+00 ± 4.32e-16 | 1.067e+00 ± 4.54e-16 | 1.067e+00 ± 4.64e-16 | 1.067e+00 ± 1.36e-06
f23 | 3.979e-01 ± 0.00e+00 | 3.979e-01 ± 0.00e+00 | 3.979e-01 ± 0.00e+00 | 3.979e-01 ± 0.00e+00 | 3.979e-01 ± 0.00e+00 | 3.979e-01 ± 2.67e-07
f24 | 3.863e+00 ± 2.27e-15 | 3.863e+00 ± 2.27e-15 | 3.863e+00 ± 2.27e-15 | 3.863e+00 ± 2.27e-15 | 3.863e+00 ± 1.56e-15 | 3.863e+00 ± 1.59e-09
f25 | 3.296e+00 ± 5.03e-02 | 3.276e+00 ± 5.95e-02 | 3.277e+00 ± 5.88e-02 | 3.317e+00 ± 2.43e-02 | 3.281e+00 ± 5.81e-02 | 3.322e+00 ± 1.63e-03
f26 | 1.015e+01 ± 5.25e-15 | 9.934e+00 ± 1.80e+00 | 1.015e+01 ± 5.44e-15 | 5.055e+00 ± 3.28e-16 | 9.148e+00 ± 3.71e-01 | 9.948e+00 ± 3.41e-01
f27 | 1.040e+01 ± 4.97e-15 | 1.017e+01 ± 1.10e+00 | 1.040e+01 ± 4.61e-06 | 1.040e+01 ± 4.63e-15 | 5.088e+00 ± 2.50e-15 | 1.012e+01 ± 8.63e-01
f28 | 1.054e+01 ± 3.57e-15 | 1.054e+01 ± 3.63e-15 | 1.026e+01 ± 1.37e+00 | 1.054e+01 ± 3.72e-15 | 5.128e+00 ± 1.37e-15 | 1.040e+01 ± 1.35e-01
be driven to the origin of the error plane by applying a
discontinuous control action. In order to avoid the undesirable
chattering effects associated with such a discontinuous action,
the control law is modified with a variant that generates
smoother control actions at the expense of losing the guarantee
of ideal convergence to zero of the pure sliding mode
dynamics s_x = 0. This variant is referred to as the boundary
layer [57], and is described by the following differential
equation:

\dot{s}_x = -\frac{i_q^{max} K_f}{\hat{M}} \, sat(s_x) + d_x^{eq} \qquad (12)

where i_q^{max} is the maximum allowable motor current, \hat{M} is the
rated mass value, and d_x^{eq} = \left( \frac{K_f}{M} - \frac{K_f}{\hat{M}} \right) i_q + \frac{F}{M} is the equivalent
disturbance. The saturation function is defined as follows:
sat(s_x) = \begin{cases} sign(s_x), & |s_x| > \Phi \\ s_x / \Phi, & \Phi \ge |s_x| \end{cases} \qquad (13)
where \Phi is the width of the boundary layer. A steady-state
error occurs due to the equivalent disturbance, and therefore
an observer is used to estimate its amplitude for compensation
and to guarantee zero steady-state error. The variable s_x is
used as the input of the proposed disturbance observer. This
produces an additional control action i_q^{OBS} to ensure that the
state reaches the sliding surface. The disturbance observer
is obtained as the parallel between a standard discrete-time
PI and a recurrent neural network (RNN) whose structure is
summarized in Fig. 12.
Fig. 12. Neural network block diagram in Simulink.

More specifically, the RNN receives the current and past
samples of the variable s_x and produces an additional control
action that is summed to the output of the PI controller. The
parallel between the PI and the RNN defines a nonlinear
hybrid estimator in which the linear action is used to preserve
acceptable closed-loop performance, especially during the
initial training stages, in which the RNN has a virtually
random behavior and therefore acts as an unpredictable dis-
turbance. This circumstance is particularly critical for those
RNN schemes in which training is based on a stochastic search
algorithm (such as a GA or SPSA [58]). In these algorithms,
initialization is generally obtained with randomly generated
solutions that produce very poor (or even unstable) results.
Therefore, it is generally preferred to partition the controller
into two modules: a linear law designed to hold the control
loop in stable conditions, and a nonlinear compensator that is
trained online (see [59]) for a specific control goal.
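The combination of the boundary-layer switching term (12), (13) and the PI-plus-RNN observer can be sketched as a single controller update. This is a minimal illustration, not the bench implementation: the gains, the boundary-layer width, and the zero-output placeholder standing in for the RNN are all assumed values.

```python
import math

def sat(s, phi):
    """Saturation of (13): sign(s) outside the boundary layer, s/phi inside."""
    return math.copysign(1.0, s) if abs(s) > phi else s / phi

def control_step(s_x, state, kp=1.0, ki=0.1, i_q_max=2.0, phi=0.05,
                 rnn=lambda s: 0.0):
    """One update of the observer-augmented controller: switching term
    plus the additional action i_q_OBS (discrete-time PI in parallel with
    an RNN, here a placeholder that returns 0)."""
    state["integral"] += ki * s_x                 # PI integrator state
    i_q_obs = kp * s_x + state["integral"] + rnn(s_x)
    i_q_ref = i_q_max * sat(s_x, phi) + i_q_obs
    return max(-i_q_max, min(i_q_max, i_q_ref))   # enforce the current limit

state = {"integral": 0.0}
print(control_step(0.2, state))  # far outside the layer: clamped to 2.0
```

At run time the placeholder `rnn` would be replaced by the trained recurrent network, fed with the current and past samples of s_x.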
B. Neural Network Training with cDE
There are several ways to train an NN in a control loop.
A large part of the literature [60] focuses on NNs that are
linear in the unknown parameters, which can be easily trained
with a variety of algorithms derived from Lyapunov stability
theory. Several extensions to nonlinear-in-the-parameters NNs
have also been proposed, which include stochastic gradient-
free optimization algorithms, such as the SPSA [58] or genetic
algorithms [61], to address specific problems, such as training
with noisy measurements, training recurrent networks, or
avoiding local minima of the objective function. This paper
considers the case in which the RNN is trained using the
proposed cDE algorithm. More specifically, the NN training
has been performed by means of ne-cDE/rand-to-best/1/bin
with F = 0.9 and Cr = 0.9. The results obtained by ne-
cDE/rand/1/bin have been compared with those obtained by
ne-rcGA [16]. For both the algorithms, the nonpersistent
elitism parameter was set to 0.5 N_p and N_p = 2 n, where n = 24.
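In compact schemes such as those compared here, candidate weight vectors are not stored as a population but drawn from a probability vector (a mean and a standard deviation per variable). The sketch below illustrates that sampling step for the n = 24 network weights; the initial values, bounds, and resampling of out-of-range draws are simplifying assumptions standing in for the truncated Gaussian machinery of the actual algorithms.

```python
import random

n = 24               # number of RNN weights to tune
mu = [0.0] * n       # probability vector means, centered in the range
sigma = [10.0] * n   # probability vector standard deviations, wide at start

def sample_weights(mu, sigma, lo=-1.0, hi=1.0):
    """Draw one candidate weight vector from the probability vector.
    Resampling draws that fall outside [lo, hi] is a simple stand-in
    for the truncated Gaussian used by compact algorithms."""
    w = []
    for m, s in zip(mu, sigma):
        x = random.gauss(m, s)
        while not (lo <= x <= hi):
            x = random.gauss(m, s)
        w.append(x)
    return w

candidate = sample_weights(mu, sigma)
print(len(candidate))  # 24
```

After each fitness evaluation, the probability vector is updated toward the winner of the comparison, so only the 2n statistics (plus the elite) must be kept in memory.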
For NN training purposes, the TLSM is requested to track a
periodic position trajectory as shown in Fig. 14. The trajectory
is obtained by filtering a square wave of the same frequency
with a nonlinear filter that shapes its output so as to keep the
maximum speed and acceleration within the selected limits
[62]. Each period of the reference is viewed as a separate
fitness evaluation. At time t = 1.5 s, a load force is applied
using a second motor connected to the controlled plant. The
load force has a profile proportional to the (measured) plant
acceleration so as to emulate a payload mass variation in the
second half of each fitness evaluation. In particular, a 25 kg
mass increase was emulated in all the presented experiments.
The fitness function is the integral of the absolute value of the
error between the position reference and the actual position of
the TLSM over the observation interval (one period). Thus, at
the end of each period, new weights for the NN are generated
by the cDE and passed to the actual controller.

Fig. 13. Experimental test bench.
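In discrete time, the fitness described above reduces to a sum of absolute position errors over the samples of one reference period. A minimal sketch, assuming a rectangular approximation of the integral and the 200 μs sampling time reported later; array names are illustrative.

```python
def iae_fitness(x_ref, x_meas, dt=200e-6):
    """Integral of the absolute position error over one reference period,
    approximated as a rectangular sum over the samples."""
    return sum(abs(r - m) for r, m in zip(x_ref, x_meas)) * dt

# toy check: a constant 10 um error held for 1000 samples
print(iae_fitness([0.0] * 1000, [10e-6] * 1000))  # ~2e-06
```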
C. Summary of Experimental Results
The test bench utilizes two identical TLSMs (the first one
used as a motor, the second one as a load) having the following
rated specifications: rated i_sq current 2.0 A, coil resistance
R = 12.03 Ω, coil inductance L = 7.8 mH, τ_p = 25.6 mm,
K_f = 31.2 N/A, mass of the mover 2.75 kg (see Fig. 13). All
the experimental investigations presented in this section are
performed by using a dSPACE 1103 micro-controller board
based on a Motorola PowerPC microprocessor.
The performance of the proposed control scheme has been
compared with that obtained using the same position controller
represented in Fig. 11 but without the contribution of the
neural network. In order to obtain a fair comparison, the
parameters of the PI controller inside the disturbance observer
were accurately tuned via trial and error during a test in which
the trajectory shown in Fig. 14 was followed, but without
mass change during the experiment. In particular, the PI gains
were increased as much as possible so as to reduce the fitness
TABLE VIII
Wilcoxon Test for cDE Against the State-of-the-Art Population-Based Algorithms

Problem | jDE [24] | JADE [41] | DEGL [37] | SADE [42] | CMA-ES [52]

n = 10
f1 |
f2 |
f3 | =
f4 | =
f5 |
f6 |
f7 |
f8 | + = +
f9 | = = =
f10 | + =
f11 | + + = +
f12 | = = = = =
f13 | = = = +
f14 |
f15 | = = = =
f16 | +
f17 | = = = +
f18 | = = = = +
f19 | = = =
f20 | = + + =

n = 30
f1 | = =
f2 | = =
f3 | = = = =
f4 | = + = =
f5 | +
f6 | = =
f7 | + = =
f8 | =
f9 | +
f10 |
f11 | + + = +
f12 | = = =
f13 | = = = = =
f14 | =
f15 | + = +
f16 | = = = =
f17 | = = +
f18 | =
f19 | = = = = =
f20 | = + +

n various
f21 | = = = = =
f22 | = = = = =
f23 | = = = = =
f24 | = = = = =
f25 | + + + = +
f26 | = = = = +
f27 | = = = = +
f28 | = = = = +

"+" means that cDE outperforms the population-based opponent, "−" means
that cDE is outperformed, and "=" means that the algorithms have the same
performance.
Fig. 14. Trajectory used during training.
value. This is a standard procedure that could be realized by
a skilled operator in industrial practice. At the end of tuning,
the performance of the obtained control scheme was tested
including the emulation of the mass variation. The obtained
position error is reported in Fig. 16. The position error is
below 100 μm during the first movement but increases up to
450 μm when the mass is increased. At the end of training,
the neural network clearly improves the performance, reducing
the effects of the mass change. The obtained position error is
reported in Fig. 18, evidencing how the performance is only
slightly improved during the first movement but the system
becomes much more robust to the mass variations. The peak
error is below 80 μm and 200 μm during the two consecutive
movements. The similarity of the position error responses in
the first half of the experiment confirms that the position
controller was tuned so as to obtain optimal performance when
the payload is absent. The performance of both control
schemes was also evaluated using a position trajectory (shown
in Fig. 15) different from the one adopted during training. The
trajectory used for validation includes movements of different
amplitudes so as to stress the effect of static friction on motor
performance. Also in this case, the load emulated a 25 kg
additional mass during the second half of the experiment.
The parameters of the position controller, including the neural
network, were kept constant and equal to the values used in
the first test. Figs. 17 and 19 report the position errors obtained
without and with the neural network, respectively. Also in this
case the neural network reduces the effects of the mass change
during the transients and guarantees that zero error is reached
faster when the set-point is kept constant. It should be remarked
that the training requires between 20 and 30 min to reach
satisfactory results. Most of the time is devoted to fitness
function evaluation, while the increase in computational cost
due to the real-time implementation of the cDE is negligible.
Considering that the sampling time is 200 μs and that about 3/4
of this time is devoted to position, speed, and current control,
the handling of the optimization algorithm should not exceed
50 μs in order to avoid slowing down the training process.
While cDE (as well as other compact algorithms) requires
approximately 20 μs per sampling step regardless of the
dimensionality of the problem and the virtual population size,
a population-based algorithm attempting to optimize this 24-variable
case study would require at least 20 individuals, which would
result in a slowing down of the real-time process.

Fig. 15. Trajectory used during the validation.
Fig. 16. Position error during the training test without the neural network.
Fig. 17. Position error during the validation test without the neural network.
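The memory argument can be made concrete with a rough count. For this n = 24 problem, a compact algorithm carries only the probability vector (one mean and one standard deviation per variable) plus one elite solution, while a population-based DE with the 20 individuals mentioned above stores 20 full vectors before any trial solutions. The figures below assume one stored value per weight and ignore bookkeeping overhead.

```python
n = 24     # problem dimension (RNN weights)
N_p = 20   # smallest reasonable DE population for this case study

# Compact representation: mean vector + std vector + one elite solution.
compact_floats = 2 * n + n
# Population-based representation: the population alone.
population_floats = N_p * n

print(compact_floats, population_floats)  # 72 480
```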
Finally, Fig. 20 shows the average value of the fitness
function and the standard deviation calculated over ten runs
of the neural network training using the cDE algorithm and
the rcGA. It can be observed that the results obtained by cDE are
significantly more satisfactory than those obtained by the rcGA.
While the rcGA prematurely converges to a suboptimal solu-
tion, the proposed cDE algorithm continues the optimization
and detects high-quality solutions.
Fig. 18. Position error during the training test with the neural network.
Fig. 19. Position error during the validation test with the neural network.
Fig. 20. Mean and standard deviation of the fitness value of the best solution
over ten runs.
VI. Conclusion
This paper introduced the concept of compact differential
evolution and proposed two algorithmic variants based on
this novel idea. The first variant employs persistent elitism
while the second employs nonpersistent elitism. Neither of these
variants requires powerful hardware in order to display
a high performance. On the contrary, the proposed algorithms
make use of a limited amount of memory in order to perform
the optimization. As confirmed by our implementation on
a challenging on-line optimization problem in the field of
precision motion control, this feature makes the proposed
approach suitable for commercial devices and industrial appli-
cations which have cost and space limitations. Despite its small
hardware demand, the proposed approach seems to outperform
classical population-based differential evolution algorithms.
This finding appears to be due to the fact that the randomization
of compact schemes is beneficial for the differential evolution
search logic. The comparison with other compact evolutionary
algorithms and other estimation of distribution algorithms
recently presented in the literature shows that compact differential
evolution is a competitive approach which often leads to a
significantly better performance, in particular when the search
space dimension is large. Finally, the comparison with
state-of-the-art complex population-based algorithms shows
that the proposed approach, despite its simplicity and low
memory requirements, is competitive for several problems.
Future work will focus on memetic extensions of the work
carried out, on the definition of parallel compact differential
evolution systems, and on the implementation of adaptive
schemes aiming at reducing the number of parameters to set.
A simple memetic approach, employing cDE as an evolutionary
framework and a low-memory local search algorithm, has been
introduced in order to solve a specific control problem in
robotics (see [63]). Although this memetic extension of the
cDE algorithm has already been published, its design is subse-
quent to the cDE proposed in this paper. In addition, it must be
remarked that the algorithm and the optimization problems
in [63] are significantly different from those of the present
paper. More specifically, while the present paper proposes cDE
as a new general-purpose algorithm and tests its potential
against other compact and population-based algorithms, in [63]
cDE is used only as a component of a memetic algorithm
which is specifically tailored to the optimization of the control
system for an industrial Cartesian robot.
References
[1] P. Larrañaga and J. A. Lozano, Estimation of Distribution Algorithms: A New Tool for Evolutionary Computation. Boston, MA: Kluwer, 2001.
[2] G. R. Harik, F. G. Lobo, and D. E. Goldberg, The compact genetic algorithm, IEEE Trans. Evol. Comput., vol. 3, no. 4, pp. 287-297, Nov. 1999.
[3] R. Rastegar and A. Hariri, A step forward in studying the compact genetic algorithm, Evol. Comput., vol. 14, no. 3, pp. 277-289, 2006.
[4] G. Harik, Linkage learning via probabilistic modeling in the ECGA, Univ. Illinois at Urbana-Champaign, Urbana, Tech. Rep. 99010, 1999.
[5] G. R. Harik, F. G. Lobo, and K. Sastry, Linkage learning via probabilistic modeling in the extended compact genetic algorithm (ECGA), in Scalable Optimization via Probabilistic Modeling (Studies in Computational Intelligence, vol. 33), M. Pelikan, K. Sastry, and E. Cantú-Paz, Eds. Berlin, Germany: Springer, 2006, pp. 39-61.
[6] K. Sastry and D. E. Goldberg, On extended compact genetic algorithm, Univ. Illinois at Urbana-Champaign, Urbana, Tech. Rep. 2000026, 2000.
[7] K. Sastry and G. Xiao, Cluster optimization using extended compact genetic algorithm, Univ. Illinois at Urbana-Champaign, Urbana, Tech. Rep. 2001016, 2001.
[8] K. Sastry, D. E. Goldberg, and D. D. Johnson, Scalability of a hybrid extended compact genetic algorithm for ground state optimization of clusters, Mater. Manuf. Processes, vol. 22, no. 5, pp. 570-576, 2007.
[9] C. Aporntewan and P. Chongstitvatana, A hardware implementation of the compact genetic algorithm, in Proc. IEEE Congr. Evol. Comput., vol. 1. 2001, pp. 624-629.
[10] J. C. Gallagher, S. Vigraham, and G. Kramer, A family of compact genetic algorithms for intrinsic evolvable hardware, IEEE Trans. Evol. Comput., vol. 8, no. 2, pp. 111-126, Apr. 2004.
[11] Y. Jewajinda and P. Chongstitvatana, Cellular compact genetic algorithm for evolvable hardware, in Proc. Int. Conf. Electr. Eng./Electron. Comput. Telecommun. Inform. Technol., vol. 1. 2008, pp. 1-4.
[12] J. C. Gallagher and S. Vigraham, A modified compact genetic algorithm for the intrinsic evolution of continuous time recurrent neural networks, in Proc. Genet. Evol. Comput. Conf., 2002, pp. 163-170.
[13] R. Baraglia, J. I. Hidalgo, and R. Perego, A hybrid heuristic for the traveling salesman problem, IEEE Trans. Evol. Comput., vol. 5, no. 6, pp. 613-622, Dec. 2001.
[14] C. W. Ahn and R. S. Ramakrishna, Elitism-based compact genetic algorithms, IEEE Trans. Evol. Comput., vol. 7, no. 4, pp. 367-385, Aug. 2003.
[15] G. Rudolph, Self-adaptive mutations may lead to premature convergence, IEEE Trans. Evol. Comput., vol. 5, no. 4, pp. 410-414, Aug. 2001.
[16] E. Mininno, F. Cupertino, and D. Naso, Real-valued compact genetic algorithms for embedded microcontroller optimization, IEEE Trans. Evol. Comput., vol. 12, no. 2, pp. 203-219, Apr. 2008.
[17] F. Cupertino, E. Mininno, and D. Naso, Elitist compact genetic algorithms for induction motor self-tuning control, in Proc. IEEE Congr. Evol. Comput., 2006, pp. 3057-3063.
[18] F. Cupertino, E. Mininno, and D. Naso, Compact genetic algorithms for the optimization of induction motor cascaded control, in Proc. IEEE Int. Conf. Electr. Mach. Drives, vol. 1. 2007, pp. 82-87.
[19] L. Fossati, P. L. Lanzi, K. Sastry, and D. E. Goldberg, A simple real-coded extended compact genetic algorithm, in Proc. IEEE Congr. Evol. Comput., Sep. 2007, pp. 342-348.
[20] P. Lanzi, L. Nichetti, K. Sastry, and D. E. Goldberg, Real-coded extended compact genetic algorithm based on mixtures of models, in Linkage in Evolutionary Computation (Studies in Computational Intelligence, vol. 157). Berlin, Germany: Springer, 2008, pp. 335-358.
[21] F. Neri and V. Tirronen, Recent advances in differential evolution: A review and experimental analysis, Artif. Intell. Rev., vol. 33, nos. 1-2, pp. 61-106, 2010.
[22] A. Caponio, A. Kononova, and F. Neri, Differential evolution with scale factor local search for large scale problems, in Computational Intelligence in Expensive Optimization Problems (Studies in Evolutionary Learning and Optimization, vol. 2), Y. Tenne and C.-K. Goh, Eds. Berlin, Germany: Springer, 2010, ch. 12, pp. 297-323.
[23] M. Weber, F. Neri, and V. Tirronen, Distributed differential evolution with explorative-exploitative population families, Genet. Programming Evolvable Mach., vol. 10, no. 4, pp. 343-371, 2009.
[24] J. Brest, S. Greiner, B. Bošković, M. Mernik, and V. Žumer, Self-adapting control parameters in differential evolution: A comparative study on numerical benchmark problems, IEEE Trans. Evol. Comput., vol. 10, no. 6, pp. 646-657, Dec. 2006.
[25] S. Das and P. N. Suganthan, Differential evolution: A survey of the state-of-the-art, IEEE Trans. Evol. Comput., 2011, to be published.
[26] K. V. Price, R. Storn, and J. Lampinen, Differential Evolution: A Practical Approach to Global Optimization. Berlin, Germany: Springer, 2005.
[27] W. Gautschi, Error function and Fresnel integrals, in Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, M. Abramowitz and I. A. Stegun, Eds. New York: Dover, 1972, ch. 7, pp. 297-309.
[28] W. J. Cody, Rational Chebyshev approximations for the error function, Math. Comput., vol. 23, no. 107, pp. 631-637, Jul. 1969.
[29] M. Gallagher, An empirical investigation of the user-parameters and performance of continuous PBIL algorithms, in Proc. IEEE Signal Process. Soc. Workshop Neural Netw., Dec. 2000, pp. 702-710.
[30] B. Yuan and M. Gallagher, Playing in continuous spaces: Some analysis and extension of population-based incremental learning, in Proc. IEEE Congr. Evol. Comput., vol. 1. Dec. 2003, pp. 443-450.
[31] M. Sebag and A. Ducoulombier, Extending population-based incremental learning to continuous search spaces, in Proc. Parallel Problem Solving Nature, LNCS 1498, A. E. Eiben, T. Bäck, M. Schoenauer, and H.-P. Schwefel, Eds. Berlin, Germany: Springer, 1998, pp. 418-427.
[32] M. Schmidt, K. Kristensen, and T. Randers Jensen, Adding genetics to the standard PBIL algorithm, in Proc. IEEE Congr. Evol. Comput., vol. 2. Jul. 1999, pp. 1527-1534.
[33] C. González, J. A. Lozano, and P. Larrañaga, Mathematical modeling of UMDA_c algorithm with tournament selection: Behavior on linear and quadratic functions, Int. J. Approximate Reasoning, vol. 31, no. 3, pp. 313-340, 2002.
[34] V. Feoktistov, Differential Evolution in Search of Solutions. Berlin, Germany: Springer, 2006.
[35] J. Lampinen and I. Zelinka, On stagnation of the differential evolution algorithm, in Proc. 6th Int. Mendel Conf. Soft Computing, 2000, pp. 76-83.
[36] S. Rahnamayan, H. R. Tizhoosh, and M. M. Salama, Opposition-based differential evolution, IEEE Trans. Evol. Comput., vol. 12, no. 1, pp. 64-79, Feb. 2008.
[37] S. Das, A. Abraham, U. K. Chakraborty, and A. Konar, Differential evolution with a neighborhood-based mutation operator, IEEE Trans. Evol. Comput., vol. 13, no. 3, pp. 526-553, Jun. 2009.
[38] V. Tirronen, F. Neri, T. Kärkkäinen, K. Majava, and T. Rossi, An enhanced memetic differential evolution in filter design for defect detection in paper production, Evol. Comput., vol. 16, no. 4, pp. 529-555, 2008.
[39] A. Caponio, F. Neri, and V. Tirronen, Super-fit control adaptation in memetic differential evolution frameworks, Soft Comput.-A Fusion Found., Methodol. Applicat., vol. 13, no. 8, pp. 811-831, 2009.
[40] N. Noman and H. Iba, Accelerating differential evolution using an adaptive local search, IEEE Trans. Evol. Comput., vol. 12, no. 1, pp. 107-125, Feb. 2008.
[41] J. Zhang and A. C. Sanderson, JADE: Adaptive differential evolution with optional external archive, IEEE Trans. Evol. Comput., vol. 13, no. 5, pp. 945-958, Oct. 2009.
[42] A. K. Qin, V. L. Huang, and P. N. Suganthan, Differential evolution algorithm with strategy adaptation for global numerical optimization, IEEE Trans. Evol. Comput., vol. 13, no. 2, pp. 398-417, Apr. 2009.
[43] P. N. Suganthan, N. Hansen, J. J. Liang, K. Deb, Y.-P. Chen, A. Auger, and S. Tiwari, Problem definitions and evaluation criteria for the CEC 2005 special session on real-parameter optimization, Nanyang Technol. Univ., Singapore, and KanGAL, IIT Kanpur, Kanpur, India, Tech. Rep. 2005005, 2005.
[44] J. Liang, P. Suganthan, and K. Deb, Novel composition test functions for numerical global optimization, in Proc. IEEE Symp. Swarm Intell., 2005, pp. 68-75.
[45] J. Vesterstrøm and R. Thomsen, A comparative study of differential evolution, particle swarm optimization, and evolutionary algorithms on numerical benchmark problems, in Proc. IEEE Congr. Evol. Comput., vol. 3. Jun. 2004, pp. 1980-1987.
[46] X. Yao, Y. Liu, and G. Lin, Evolutionary programming made faster,
[46] X. Yao, Y. Liu, and G. Lin, Evolutionary programming made faster,
IEEE Trans. Evol. Comput., vol. 3, no. 2, pp. 82102, Jul. 1999.
[47] S. Das, A. Konar, and U. K. Chakraborty, Two improved differential
evolution schemes for faster global search, in Proc. Conf. Genet. Evol.
Comput., 2005, pp. 991998.
[48] F. Wilcoxon, Individual comparisons by ranking methods, Biometrics
Bull., vol. 1, no. 6, pp. 8083, 1945.
[49] Y. S. Ong and A. J. Keane, Meta-Lamarkian learning in memetic
algorithms, IEEE Trans. Evol. Comput., vol. 8, no. 2, pp. 99110, Apr.
2004.
[50] B. Yuan and M. Gallagher, Experimental results for the special session
on real-parameter optimization at CEC 2005: A simple, continuous
EDA, in Proc. IEEE Conf. Evol. Comput., Sep. 2005, pp. 17921799.
[51] J. Xiao, Y. Yan, and J. Zhang, HPBILc: A histogram-based EDA for
continuous optimization, Appl. Math. Comput., vol. 215, no. 3, pp.
973982, Oct. 2009.
[52] N. Hansen and A. Ostermeier, Completely derandomized self-
adaptation in evolution strategies, Evol. Comput., vol. 9, no. 2, pp.
159195, 2001.
[53] F. J. Lin, P. H. Shen, S. L. Yang, and P. H. Chou, Recurrent radial basis
function network-based fuzzy neural network control for permanent-
magnet linear synchronous motor servo drive, IEEE Trans. Mag.,
vol. 42, no. 11, pp. 36943705, Nov. 2009.
[54] Z. Z. Liu, F. L. Luo, and M. A. Rahman, Robust and precision motion
control system of linear-motor direct drive for high-speed x-y table
positioning mechanism, IEEE Trans. Ind. Electron., vol. 52, no. 5, pp.
13571363, Oct. 2005.
[55] K. Low and M. Keck, Advanced precision linear stage for industrial
automation applications, IEEE Trans. Instrum. Meas., vol. 52, no. 3,
pp. 785789, Jun. 2003.
[56] F. Cupertino, D. Naso, E. Mininno, and B. Turchiano, Sliding-mode
control with double boundary layer for robust compensation of payload
mass and friction in linear motors, IEEE Trans. Ind. Applicat., vol. 45,
no. 5, pp. 16881696, Sep.Oct. 2009.
[57] J. E. Slotine and W. Li, Applied Nonlinear Control. Englewood Cliffs,
NJ: Prentice-Hall, 1991.
[58] J. C. Spall, Introduction to Stochastic Search and Optimization. New
York: Wiley, 2003.
[59] X. D. Ji and B. O. Familoni, A diagonal recurrent neural network-
based hybrid direct adaptive SPSA control system, IEEE Trans. Autom.
Control, vol. 44, no. 9, pp. 14691473, Jul. 1999.
[60] F. L. Lewis, R. Selmic, and J. Campos, Neuro-Fuzzy Control of Indus-
trial Systems with Actuator Nonlinearities. Philadelphia, PA: Society for
Industrial and Applied Mathematics, 2002.
[61] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Ma-
chine Learning. Reading, MA: Addison-Wesley, 1989.
[62] R. Zanasi, A. Tonielli, and G. Lo Bianco, Nonlinear lters for the
generation of smooth trajectories, Automatica, vol. 36, no. 3, pp. 439
448, 2000.
[63] F. Neri and E. Mininno, Memetic compact differential evolution for
Cartesian robot control, IEEE Comput. Intell. Mag., vol. 5, no. 2,
pp. 5465, May 2010.
Ernesto Mininno (M'04) received the Master's and Ph.D. degrees in electrical engineering from the Technical University of Bari, Bari, Italy, in 2002 and 2007, respectively, and the MBA degree from the National Research Center, Milan, Italy, in 2003.
He was a Project Manager with the National Research Center from 2003 to 2009. Currently, he is a Post-Doctoral Researcher with the Department of Mathematical Information Technology, University of Jyväskylä, Jyväskylä, Finland. His current research interests include robotics, intelligent motion control, evolutionary optimization, compact algorithms, and optimization in noisy environments.
Ferrante Neri (S'04–M'08) received the Master's and Ph.D. degrees in electrical engineering from the Technical University of Bari, Bari, Italy, in 2002 and 2007, respectively, and the Ph.D. degree in computer science from the University of Jyväskylä, Jyväskylä, Finland, in 2007.
Currently, he is an Assistant Professor with the Department of Mathematical Information Technology, University of Jyväskylä, and is a Research Fellow with the Academy of Finland, Helsinki, Finland. His current research interests include computational intelligence optimization and, more specifically, memetic computing, differential evolution, noisy and large-scale optimization, and compact and parallel algorithms.
Francesco Cupertino (M'08) was born in
December 1972. He received the Laurea and Ph.D.
degrees in electrical engineering from the Technical
University of Bari, Bari, Italy, in 1997 and 2001,
respectively.
From 1999 to 2000, he was with the PEMC
Research Group, University of Nottingham,
Nottingham, U.K. Since July 2002, he has been
an Assistant Professor with the Department of
Electrical and Electronic Engineering, Technical
University of Bari. He teaches two courses in
electrical drives at the Technical University of Bari. His current research
interests include intelligent motion control of electrical machines, applications
of computational intelligence to control, sliding-mode control, sensorless
control of ac electric drives, signal processing techniques for three-phase
signal analysis, and fault diagnosis of ac motors. He is the author or
co-author of more than 70 scientific papers on these topics.
Dr. Cupertino is a Registered Professional Engineer in Italy.
David Naso (M'98) received the Laurea (Honors)
degree in electronic engineering and the Ph.D. de-
gree in electrical engineering from the Polytechnic
Institute of Bari, Bari, Italy, in 1994 and 1998,
respectively.
He was a Guest Researcher with the Operation
Research Institute, Technical University of Aachen,
Aachen, Germany, in 1997. Since 1999, he has
been an Assistant Professor of Automatic Control
and the Technical Head of the Robotics Laboratory,
Department of Electric and Electronic Engineering,
Polytechnic Institute of Bari. His current research interests include compu-
tational intelligence and its application to control and robotics, distributed
multiagent systems, modeling and control of smart materials, and unconven-
tional actuators for precise positioning and vibration damping.
Dr. Naso is currently an Area Editor of the journal Fuzzy Sets and Systems
for the topic of intelligent control, and a member of the International
Federation of Automatic Control Technical Committee on Computational
Intelligence in Control.