Alireza Rezvanian
I. INTRODUCTION
978-1-4244-7375-5/10/$26.00 ©2010 IEEE
Proceedings of the World Congress on Nature and Biologically Inspired Computing (NaBIC2010)

II. IMMUNE ALGORITHMS
III. LEARNING AUTOMATA

A learning automaton [7,12,16-18] is an adaptive decision-making unit that improves its performance by learning how to choose the optimal action from a finite set of allowed actions through repeated interactions with a random environment. The action is chosen at random based on a probability distribution kept over the action-set, and at each instant the chosen action serves as the input to the random environment. The environment in turn responds to the taken action with a reinforcement signal. The action probability vector is updated based on the reinforcement feedback from the environment. The objective of a learning automaton is to find the optimal action from the action-set so that the average penalty received from the environment is minimized.

The environment can be described by a triple E = {α, β, c}, where α = {α_1, α_2, ..., α_r} represents the finite set of inputs, β = {β_1, β_2, ..., β_m} denotes the set of values that can be taken by the reinforcement signal, and c = {c_1, c_2, ..., c_r} denotes the set of penalty probabilities, where the element c_i is associated with the given action α_i. If the penalty probabilities are constant, the random environment is said to be a stationary random environment; if they vary with time, the environment is called a non-stationary environment. Depending on the nature of the reinforcement signal, environments can be classified into P-model, Q-model, and S-model. Environments in which the reinforcement signal can take only the two binary values 0 and 1 are referred to as P-model environments. Another class of environment allows the reinforcement signal to take a finite number of values in the interval [0, 1]; such an environment is referred to as a Q-model environment. In S-model environments, the reinforcement signal lies in the interval [a, b].

Learning automata can be classified into two main families [18]: fixed-structure learning automata and variable-structure learning automata. Variable-structure learning automata are represented by a triple <β, α, T>, where β is the set of inputs, α is the set of actions, and T is the learning algorithm by which the action probability vector p(n) is updated. Let α_i be the action chosen at instant n. If the environment rewards the chosen action, the probability vector is updated by (1); if it penalizes the action, the vector is updated by (2):

p_j(n+1) = p_j(n) + a[1 - p_j(n)]    if i = j
p_j(n+1) = (1 - a) p_j(n)            if i ≠ j        (1)

p_j(n+1) = (1 - b) p_j(n)                    if i = j
p_j(n+1) = b/(r - 1) + (1 - b) p_j(n)        if i ≠ j    (2)

where r is the number of actions and a and b denote the reward and penalty parameters, respectively.
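The update scheme in (1) and (2) can be sketched as follows; the three-action P-model environment, its penalty probabilities, and all names are illustrative assumptions, not from the paper:

```python
import random

def lrp_update(p, i, beta, a=0.1, b=0.1):
    """One L_RP update of the action probability vector p.

    i    : index of the action taken at instant n
    beta : reinforcement signal (0 = reward, 1 = penalty, P-model)
    a, b : reward and penalty parameters
    """
    r = len(p)
    if beta == 0:  # rewarded: equation (1)
        return [pj + a * (1 - pj) if j == i else (1 - a) * pj
                for j, pj in enumerate(p)]
    # penalized: equation (2)
    return [(1 - b) * pj if j == i else b / (r - 1) + (1 - b) * pj
            for j, pj in enumerate(p)]

def choose_action(p):
    # Sample an action index according to the probability vector.
    return random.choices(range(len(p)), weights=p)[0]

# Toy stationary P-model environment: action k is penalized with probability c[k].
c = [0.7, 0.2, 0.5]
p = [1 / 3] * 3
random.seed(0)
for n in range(5000):
    i = choose_action(p)
    beta = 1 if random.random() < c[i] else 0
    p = lrp_update(p, i, beta)
# After many updates, p should concentrate on the least-penalized action (index 1).
```

Both branches preserve the simplex: the probabilities always sum to one after an update, which is what makes the scheme a valid learning algorithm T.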
α = (1/β) exp(-f*)    (3)
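Assuming the mutation step follows the common opt-aiNet form behind (3), where the mutation range shrinks as the normalized fitness f* of a cell approaches 1, a minimal sketch (the parameter beta, the Gaussian perturbation, and all names are assumptions, not from the paper):

```python
import math
import random

def mutate(cell, f_star, beta=100.0):
    """Gaussian mutation with range alpha = (1/beta) * exp(-f*), equation (3).

    High-fitness cells (f* near 1) get a small alpha and are perturbed only
    slightly; low-fitness cells are perturbed more strongly.
    """
    alpha = (1.0 / beta) * math.exp(-f_star)
    return [x + alpha * random.gauss(0.0, 1.0) for x in cell]

random.seed(1)
child = mutate([0.5, -0.2], f_star=0.9)
```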
[Figure: the learning automaton interacting with the environment, and the flowchart of the proposed algorithm: Initialize; Evaluate; Select Best; assign the mutation rate Pm by the learning automaton (actions: increasing or fixing Pm); Replace; Suppression; Update the probability vector; on an environment change, Re-initialize and Re-evaluate; repeat until the stopping criteria are met.]

D = Σ_{i=1}^{n} (A(x_i) - B(x_i))^2    (4)
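The suppression step in the flowchart can be sketched with the distance measure (4); the threshold value, the fitness callback, and the list representation of antibodies are illustrative assumptions, not from the paper:

```python
def distance(A, B):
    """Distance D between antibodies A and B over n dimensions, equation (4)."""
    return sum((a - b) ** 2 for a, b in zip(A, B))

def suppress(population, fitness, threshold=0.01):
    """Among antibodies closer than the threshold, keep only the fittest one."""
    survivors = []
    # Visit antibodies from best to worst fitness, so a suppressed antibody
    # is always dropped in favor of a fitter nearby one.
    for ab in sorted(population, key=fitness, reverse=True):
        if all(distance(ab, s) >= threshold for s in survivors):
            survivors.append(ab)
    return survivors

pop = [[0.0, 0.0], [0.001, 0.0], [1.0, 1.0]]
# Toy fitness: closeness to the point (1, 1).
best = suppress(pop, fitness=lambda ab: -distance(ab, [1.0, 1.0]))
```

Here the two near-duplicate antibodies around the origin collapse to one survivor, which is the usual purpose of suppression in an immune network: maintaining diversity by removing redundant cells.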
V. SIMULATION RESULTS

(x - x_i)^2 + (y - y_i)^2    (5)
TABLE I.
Algorithm    A = 1.5        A = 2.5        A = 3.5
ACO          0.076±0.008    0.057±0.013    0.089±0.008
CSA          0.078±0.050    0.067±0.006    0.062±0.007
HACO-IA      0.035±0.007    0.032±0.013    0.050±0.013
LA-AIN       0.055±0.014    0.025±0.013    0.048±0.010

TABLE II.
Algorithm    A = 1.5        A = 2.5        A = 3.5
ACO          0.015±0.003    0.020±0.003    0.045±0.012
CSA          0.020±0.003    0.025±0.006    0.039±0.003
HACO-IA      0.016±0.004    0.021±0.005    0.019±0.002
LA-AIN       0.013±0.011    0.020±0.014    0.017±0.012

TABLE III.
Algorithm    A = 1.5        A = 2.5        A = 3.5
ACO          0.005±0.001    0.011±0.001    0.005±0.001
CSA          0.015±0.002    0.011±0.004    0.011±0.002
HACO-IA      0.001±0.001    0.001±0.001    0.002±0.001
LA-AIN       0.014±0.008    0.010±0.003    0.003±0.005
e_gt = f_gt - g_gt    (6)

where f_gt is the best fitness at time t and g_gt is the actual global optimum. The best error e'_gt since the previous change in the environment is calculated as:

e'_gt = min{e_g, e_g+1, ..., e_gt}    (7)

The average of e'_gt over the T iterations of a run is then:

(1/T) Σ_{t=1}^{T} e'_gt    (8)
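The error measures (6)-(8) can be computed as in the following sketch; the change-point bookkeeping and all names are illustrative assumptions, and minimization is assumed so that the error in (6) is non-negative:

```python
def offline_error(best_fitness, global_optimum, change_points):
    """Average of the best error since the last change, per equations (6)-(8).

    best_fitness[t]   : best fitness found at iteration t (f_gt)
    global_optimum[t] : actual optimum at iteration t (g_gt)
    change_points     : set of iterations at which the environment changed
    """
    T = len(best_fitness)
    total = 0.0
    best_since_change = float("inf")
    for t in range(T):
        if t in change_points:           # environment changed: reset the minimum
            best_since_change = float("inf")
        e_gt = best_fitness[t] - global_optimum[t]        # equation (6)
        best_since_change = min(best_since_change, e_gt)  # equation (7)
        total += best_since_change                        # accumulate e'_gt
    return total / T                                      # equation (8)
```

Resetting the running minimum at each change point is what distinguishes this measure from a plain best-so-far error in a dynamic environment.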
VI. CONCLUSIONS

ACKNOWLEDGMENT

REFERENCES

[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
[14]
[15]
[16]
[17]
[18]
[19]
[20]
[21]
[22]
[23]
[24]