
© 2012 IEEE. Personal use of this material is permitted.

Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

This paper appears in: Optimization of Electrical and Electronic Equipment (OPTIM), 2012 13th International Conference on
Date of Conference: 24-26 May 2012
Author(s): Neukart, F. (Dept. of Electr. Eng. & Comput. Sci., Transilvania Univ. of Brasov, Brasov, Romania); Moraru, S.-A.; Grigorescu, C.-M.; Szakacs-Simon, P.
Page(s): 1120 - 1125
Product Type: Conference Publications

Transgenetic NeuroEvolution

Florian Neukart, Sorin-Aurel Moraru, Costin-Marius Grigorescu, Peter Szakacs-Simon
Department of Electrical Engineering and Computer Science
Transilvania University of Brasov
Brasov, Romania 500036
florian.neukart@campus02.at, smoraru@unitbv.ro, costin.grigorescu@unitbv.ro, peter.szakacs@unitbv.ro

Abstract- Transgenetic algorithms can be used for performing a stochastic search by simulating endosymbiotic interactions between a host and a population of endosymbionts, as well as the exchange of information between the host and the endosymbionts by agents. The already introduced, computationally intelligent Data Mining system "System applying High Order Computational Intelligence in Data Mining" (SHOCID) applies such algorithms to Artificial Neural Network (ANN) learning by combining one of its learning approaches with a host organism, which serves as genetic pool, and with transgenetic vectors. The application of an algorithm combining horizontal gene transfer between a host and a symbiont is a completely new ANN learning approach, which increases both learning performance and accuracy to a considerable degree. A further advantage is that the application of transgenetic vectors massively increases the chance of reaching the desired stopping criteria (like a minimum Root Mean Squared Error [RMSE]) instead of abort criteria (like the evolutionary stop after 5,000 generations without improvement although the desired criteria have not been fulfilled), as even learning algorithms like back propagation cannot oscillate or get stuck in local minima due to the inescapable transfer of host genetic material.

I. INTRODUCTION

The biological fundamentals of transgenetic algorithms for solving NP-hard combinatorial problems, which have been introduced by Gouvêa [1], have already been elucidated in detail by Goldbarg & Goldbarg [2], but have to be adapted for ANN evolution. According to Gouvêa, Computational Transgenetics (CT) [3] brings the following ideas to the evolutionary context:

• Use of exogenous and endogenous information to interfere in the processes of formation and modification of the individuals of a given population.
• Use of the intracellular flow as an operational way to carry out the required manipulations on the individuals.
• Exploration of new processes of population improvement using transgenetic agents and competition between agents and individuals.
• Guidance of the evolutionary process, which allows the occurrence of evolutionary jumps [4].

As in nature, the relationship between a host organism and a symbiont may be an advantage for both organisms, and the descendants share genetic material of both the host and the symbiont. The host organism in SHOCID [5] mostly serves as genetic database and contributes gene sequences to the final solution. The endosymbiont is the real solution, which is evolved until predefined stopping criteria have been met. However, the endosymbiont's genetic material is manipulated not only through horizontal gene (sequence) transfer of the host's genetic material to the endosymbiont, but also by some special types of mutation in its chromosomes or, in the case of population-based learning, genomes. Both kinds of change in the endosymbiotic DNA are carried out by agents, the so-called transgenetic vectors. SHOCID transgenetic NeuroEvolution makes use of the following types of vectors:

• Plasmids
  o Weight plasmid
  o Structure plasmid
• Transposons
  o Jump and swap transposon
  o Erase and jump transposon

Plasmids are used for the transportation of genetic information from the host to the endosymbiont, and transposons mutate the genetic material of the endosymbiont. Figure 1 provides a brief overview of how transgenetic NeuroEvolution works:

[Figure 1: network diagrams of the host genetic material and the endosymbiont, with the labels "Structure plasmid", "Weight plasmid", "Gene sequence transfer", "Re-integration" and "J&S Transposon (Jump & Swap)"]

Fig. 1 - Transgenetic NeuroEvolution

The picture shows on the left side the host, which serves as genetic database for the horizontal gene transfer carried out by plasmid vector agents. A weight plasmid at the lower left transports a sequence of weights to the endosymbiont, while a structure plasmid above the host transfers weights, biases and activation functions. On the right side, a jump and swap transposon vector mutates the chromosome of the endosymbiont.

A. Host genetic material

The host within SHOCID consists of "rough" solutions to the problem statement, meaning several populations, each consisting of thousands of individuals, each of which is subject to continuous evolution. Therefore, the host is not only one organism, but thousands of organisms in different populations.

B. Endosymbiont

The endosymbiont within SHOCID is a simulated annealing ANN solution, i.e. an ANN trained by an algorithm that simulates the cooling of metal. Simulated annealing belongs to the group of metaheuristic algorithms, which are suitable for solving optimization or search problems. In physics, the term 'annealing' refers to the very slow cooling of gas or metal into a crystalline solid of minimum energy configuration [6]. The atoms of such materials have very high energy values at very high temperatures, which gives the atoms a great deal of freedom in their ability to restructure themselves [7]. The energy values of such materials decrease during cooling. If the ideal cooling speed (continuous temperature reduction) has been found, the material will be more stable in its structure and more consistent than if it is cooled down too quickly. Simulated annealing (SA) [8] algorithms simulate this behaviour, which can also be applied within ANN training.
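As a rough orientation, the annealing loop that the remainder of this section describes can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not SHOCID's implementation: the names `cooling_ratio`, `anneal` and `error_fn` are hypothetical, the ratio formula is an assumed form that follows the verbal description of s, e and c given in this section (after [7]), the perturbation range is an arbitrary choice, and only strictly better candidates are accepted (a probabilistic acceptance rule is omitted).

```python
import math
import random


def cooling_ratio(start_temp, end_temp, cycles):
    """Factor by which T is multiplied after each cycle so that it falls
    from start_temp (s) to end_temp (e) within `cycles` (c) cycles.
    Assumed form: exp(ln(e / s) / (c - 1))."""
    return math.exp(math.log(end_temp / start_temp) / (cycles - 1))


def anneal(weights, error_fn, start_temp=10.0, end_temp=0.01,
           cycles=50, tries_per_temp=20, rng=None):
    """Minimise error_fn over a flat list of weights by simulated annealing."""
    rng = rng or random.Random()
    best = list(weights)
    best_error = error_fn(best)
    ratio = cooling_ratio(start_temp, end_temp, cycles)
    temp = start_temp
    for _ in range(cycles):
        for _ in range(tries_per_temp):
            # Assumption: each weight changes by T times a random number
            # drawn from [-0.5, 0.5), so high temperatures allow large changes.
            candidate = [w + temp * (rng.random() - 0.5) for w in best]
            candidate_error = error_fn(candidate)
            if candidate_error < best_error:
                # Keep the candidate only if it improves on the best solution.
                best, best_error = candidate, candidate_error
        temp *= ratio  # lower the temperature for the next cycle
    return best, best_error
```

In transgenetic NeuroEvolution, the plasmid and transposon vectors would additionally be applied to each candidate inside the inner loop.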
The simulated annealing algorithm always works with two solutions: the best one achieved up to the current point in time, and the one currently being created, which is compared with the first one:

(2)

If the second one performs better than the first one, i.e. the outcome of the above equation is positive, it replaces the first one as the current best solution. In some cases, simulated annealing also makes use of a probability for deciding when to replace a solution with a better one, meaning that in some implementations a better new solution might not always replace the current one:

(3)

where T represents the current temperature, meaning the value that influences the change of weights within a weight matrix of an ANN:

(4)

The ratio for changing the weights within an ANN is calculated by multiplying the temperature by a random number. The higher the temperature, the higher the probability of a large weight change. Changes at a specific temperature are carried out as long as a predefined number of iterations (cycles) has not been reached. After the last iteration at a temperature has been completed, the algorithm verifies whether the lowest, predefined temperature has been reached. If not, the temperature is lowered, either by a constant or logarithmically, by a ratio between a beginning and an ending temperature, as the following equation shows [4]:

(1)

The variable s represents the starting temperature, e the ending temperature, and c the cycle count. The above equation calculates a ratio that is multiplied by the current temperature T, which produces a change that causes the temperature to reach the ending temperature in the specified number of cycles. The transgenetic vectors are applied during the simulated cooling process. Each iteration creates a new possible solution, and each new solution must endure the attacks of both plasmid and transposon vector agents to see if a better solution can be found.

II. ALGORITHM

The algorithm for applying transgenetic NeuroEvolution is as follows:
TABLE I TRANSGENETIC NEUROEVOLUTION

Start
1. Creation of nhp initial host populations.
2. Repeat
   a) Randomization of weights and threshold values of each chromosome.
      i. Repeat
         1. Calculate the network output for the value
         2. Evaluate the fitness of each chromosome:
            a. Calculate the error for each output neuron
            b. Calculate the error for each hidden neuron
         3. Selection of chromosomes to recombine
         4. Repeat
            a. Crossover of chromosomes
            b. Mutation of offspring
         5. Until all selected chromosomes are recombined
      ii. Until criteria are reached
   b) Creation of new population
3. Until rough evolution (rrmse) for each host population has been finished.
4. Create initial ANN solution and randomize weights.
   a) Repeat
      i. Calculate the network output for the value
      ii. Evaluate the fitness of each neuron:
         1. Calculate the error for each output neuron
         2. Calculate the error for each hidden neuron
         3. Set
      iii. Repeat
         1. Create new ANN and randomize weights according to T
         2. Calculate the error for each output neuron
         3. Calculate the error for each hidden neuron
         4. Compare solutions according to
         5. If is better than , set =
         6. Apply plasmid vector
         7. If is better than , set =
         8. Apply transposon vector
         9. If is better than , set =
      iv. Until max tries for current temperature reached
      v. Decrease temperature by
   b) Until lower temperature bound reached
End

Breakdown:
nhp: the number of host populations to create
rrmse: rough root mean squared error
C(C): calculation current solution
C(S): calculation initial solution
C(P): calculation plasmid solution
C(T): calculation transposon solution

The algorithm shows that the evolution of the host genetic material is carried out only roughly, until the rough root mean squared error has been reached by the evolution of the population. Within SHOCID, the rough RMSE is the allowed RMSE passed to the system, multiplied by 10: if the allowed RMSE is 1 %, the rough RMSE is 10 %.

III. HORIZONTAL (ENDOSYMBIOTIC) GENE (SEQUENCE) TRANSFER

The horizontal transfer of genetic material is the transfer of genetic material from the host to the endosymbiont by plasmids. Every time the plasmid vector is applied to the currently evolving endosymbiont, the current best solution of the currently selected host population is determined and serves as the current host. The plasmid vector selects a gene sequence of the host chromosome, copies it and transfers it to the endosymbiont. The plasmid vector is applied once during each evolutionary iteration of the endosymbiont. As mentioned above, the current best solution (= the host) is chosen from one of the host populations, where the selection of the population also happens at random, depending on the number of host populations created. If three host populations have initially been created, the probability for each one to be selected is 1/3.

After the host chromosome has been selected, the relevant gene sequence has to be determined, which also happens at random. For SHOCID transgenetic algorithms it is important that the host and the endosymbiont consist of the same number of genes, as the chromosome length is taken into consideration when the transfer gene sequence is determined. After the length of the selected chromosome has been determined, a random starting point and a random end point are created, which enclose the transfer gene sequence. The same starting and end points are then used to delete the gene sequence in the endosymbiont. The gap is then filled with the host genetic transfer sequence. Afterwards, the old and new solutions are compared according to their quality: if the transfer has proved to be evolutionarily reasonable, the new solution forms the basis for further evolution; if not, the gene transfer is rolled back.

As mentioned in the fundamentals, there are two types of plasmids. The application of either the one or the other happens at random, with a probability of 1/2 for each.

A. Weight plasmid

The weight plasmid only takes the weights between the neurons into consideration. This means that a transfer sequence consists only of weights and does not contain activation functions or biases. The following example shows the application of a weight plasmid. The length of the host chromosome is determined and a random sequence selected:

(5)

The host chromosome has the length 15, and the gene transfer sequence starts with gene 6 and ends with gene 12. The same sequence is then selected in the endosymbiont:

(6)

Afterwards, the endosymbiotic gene sequence is deleted and replaced by the host gene sequence, which forms the descendant chromosome:

(7)

B. Structure plasmid

The structure plasmid, in contrast to the weight plasmid, contains all activation functions and bias values that the transfer sequence encloses. This requires the different host populations to make use of different activation functions, as a transfer of these would be useless otherwise. The bias values within SHOCID evolve independently in each chromosome of each host population as well as in the endosymbiont. SHOCID takes care of both the activation functions and the bias values, so there is no need to interfere with the system. The example for the structure plasmid is the same, with the exception that the gene sequence also contains bias values and activation functions. The length of the host chromosome is determined and a random sequence selected:

(8)

The host chromosome has the length 15, and the gene transfer sequence starts with gene 6 and ends with gene 12. The difference is that the above chromosome also contains neuron information, namely the mentioned activation functions and the bias values of the host. The same sequence is then selected in the endosymbiont:

(9)

Afterwards, the endosymbiotic gene sequence is deleted and replaced by the host gene sequence, which forms the descendant chromosome:

(10)

IV. TRANSPOSON MUTATION

As mentioned in the fundamentals, the transposon mutation does not transfer genetic sequences from the host, but changes the genetic information in the endosymbiont. As with plasmid vectors, the transposon vector is applied once during each evolutionary iteration of the endosymbiont. The transposon vector selects a gene sequence in the endosymbiont and mutates it according to the transposon type. Again, the old and new solutions are then compared according to their quality: if the mutation has proved to be evolutionarily reasonable, the new solution forms the basis for further evolution; if not, the gene sequence mutation is rolled back. As mentioned in the fundamentals, there are two types of transposons. Which type of transposon vector is applied depends on the length of the randomly selected gene sequence in the endosymbiotic chromosome.

A. Jump and swap transposon

After the length of the endosymbiotic chromosome has been determined, a random starting point and a random end point are created, which enclose a gene sequence.
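The same random selection of a start and end point is used by the plasmid vectors of Section III. A minimal Python sketch of that selection, together with a weight plasmid transfer and its rollback, might look as follows. It is an illustration rather than SHOCID's code: `random_sequence`, `pick_host`, `apply_plasmid` and `error_fn` are hypothetical names, chromosomes are assumed to be flat lists of weights, and indices are 0-based with an inclusive end point.

```python
import random


def random_sequence(length, rng):
    """Return random (start, end) indices with 0 <= start <= end < length."""
    start, end = sorted(rng.randrange(length) for _ in range(2))
    return start, end


def pick_host(populations, error_fn, rng):
    """Choose one host population uniformly at random and return its
    current best chromosome (the one with the lowest error)."""
    population = rng.choice(populations)
    return min(population, key=error_fn)


def apply_plasmid(host, symbiont, error_fn, rng):
    """Copy a random gene sequence from the host into the same positions
    of the endosymbiont; keep the result only if it lowers the error."""
    if len(host) != len(symbiont):
        raise ValueError("host and endosymbiont need the same number of genes")
    start, end = random_sequence(len(host), rng)
    candidate = symbiont[:start] + host[start:end + 1] + symbiont[end + 1:]
    if error_fn(candidate) < error_fn(symbiont):
        return candidate  # transfer accepted
    return symbiont       # transfer rolled back
```

A structure plasmid could reuse the same slicing if each gene also carried the bias value and activation function of its neuron; that representation is not shown here.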
If the enclosed gene sequence consists of two genes, the jump and swap transposon vector is applied and the two selected genes are swapped. The length of the symbiont chromosome is determined and a random sequence selected:

(11)

The symbiont chromosome has the length 15, and the selected gene sequence starts with gene 6 and ends with gene 8. The transposon can only be a jump and swap vector when the length of the selected gene sequence does not exceed 2. These two genes are then swapped, which forms the descendant chromosome:

(12)

B. Erase and jump transposon

If the length of the random gene sequence exceeds two, the erase and jump transposon is applied. In principle, it works similarly to the jump and swap transposon, except that one randomly selected gene of the sequence is deleted and replaced by another randomly chosen gene of the sequence. The length of the symbiont chromosome is determined and a random sequence selected:

(13)

The symbiont chromosome has the length 15, and the selected gene sequence starts with gene 6 and ends with gene 12. The transposon can only be an erase and jump vector when the length of the selected gene sequence exceeds 2. Within this gene sequence, one gene is selected at random and erased, in the example gene number 7:

(14)

Another gene is selected at random to replace the missing one, in the following case gene number 10, which forms the descendant chromosome:

(15)

V. USAGE
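When a chromosome is held as a flat list of genes, the two transposon mutations of Section IV reduce to a few list operations. The following Python sketch is only an illustration under assumptions: the function names and `error_fn` are hypothetical, indices are 0-based with an inclusive end point, and the start and end points are assumed to be distinct so that the selected sequence always holds at least two genes.

```python
import random


def jump_and_swap(chromosome, start, end):
    """Swap the two genes of a two-gene sequence."""
    child = list(chromosome)
    child[start], child[end] = child[end], child[start]
    return child


def erase_and_jump(chromosome, start, end, rng):
    """Erase one random gene of the sequence and fill the gap with a copy
    of another randomly chosen gene of the same sequence."""
    erased, donor = rng.sample(range(start, end + 1), 2)
    child = list(chromosome)
    child[erased] = child[donor]
    return child


def apply_transposon(chromosome, error_fn, rng):
    """Mutate a random gene sequence; roll back if the error does not drop."""
    # Assumption: two distinct positions, hence a sequence of >= 2 genes.
    start, end = sorted(rng.sample(range(len(chromosome)), 2))
    if end - start + 1 == 2:
        candidate = jump_and_swap(chromosome, start, end)
    else:
        candidate = erase_and_jump(chromosome, start, end, rng)
    if error_fn(candidate) < error_fn(chromosome):
        return candidate      # mutation accepted
    return list(chromosome)   # mutation rolled back
```

Both mutations return a new list, so the caller keeps the original chromosome untouched and the rollback is simply a matter of discarding the candidate.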

Transgenetic NeuroEvolution can be applied with each ANN solution type, as long as there exists a host and an endosymbiont. It is crucial to represent the chromosomes and gene sequences as arrays of values, so that the transfer and mutation can be carried out.

VI. VERIFICATION

The transgenetic NeuroEvolution within SHOCID has been tested with manifold Data Mining problem statements, but for verification in this paper two problem statements have been presented to the system:

• The XOR problem, containing 4 datasets, each consisting of two attributes and one output neuron. The RMSE served as fitness function.
• A financial problem whose data consists of Black-Scholes option prices for volatility levels running from 20 percent to 200 percent, for time remaining running from 5 to 15 days, and for strike prices running from 70 dollars to 130 dollars. The stock price was set to 100 dollars and the interest rate to 0 percent when generating the data [9]. The number of datasets to learn was 1,530. The data for the problem can also be found at [9]. Not only the RMSE served as fitness function, but also the absolute error of the price, which had to be below 0.5 dollars. However, for this second problem the paper at hand does not contain a transgenetic vector verification table, as space is limited.

To be able to draw conclusions, the problems have been taught to a one-hidden-layer MLP ANN learning with back propagation (MLP BP), one learning with resilient propagation (MLP RP), and one learning with simulated annealing and applying transgenetic vectors (MLP SATG). Furthermore, the focus of the verification lay on minimizing the number of hidden layers and neurons, so SHOCID was allowed to evolve the solutions horizontally (number of hidden neurons) and vertically (number of hidden layers), but not allowed to exceed 5 neurons. SHOCID determined 3 hidden neurons and one hidden layer as the optimum for all solutions. All four ANNs made use of the sigmoid activation function, and the allowed RMSE for the first three was 1%. However, as transgenetic NeuroEvolution performs well, the allowed error for this solution has been set to 0.01 % for the XOR statement, as otherwise training finishes too fast. For the second verification test, all solutions were allowed an RMSE of 1% (plus the mentioned absolute tolerance of 0.5 dollars). All tests have been conducted 5 times, and the following table shows the iterations needed to reach the allowed minimum error by each solution type:

TABLE II XOR VERIFICATION RESULTS

Test   MLP BP   MLP RP   MLP SATG
1      183      436      5
2      LM       1090     4
3      505      1025     1
4      269      743      9
5      213      OS       7

Table II shows that the back propagation solution got stuck in a local minimum (LM) in the second XOR test, and that the resilient propagation solution oscillated (OS) in the fifth. In every test, MLP SATG needed fewer iterations than the other solution types: between 1 and 9, compared with at least 183 for MLP BP and at least 436 for MLP RP.
TABLE III BLACK-SCHOLES OPTION VERIFICATION RESULTS

Test   MLP BP       MLP RP       MLP SATG
1      OS           > 100,000    3,496
2      > 100,000    > 100,000    3,225
3      > 100,000    > 100,000    4,169
4      > 100,000    > 100,000    3,219
5      > 100,000    > 100,000    3,936

As Table III shows, the Black-Scholes problem statement results can be interpreted in the same manner as the results from the XOR problem: MLP SATG reached the allowed error within 3,219 to 4,169 iterations, whereas the other solutions oscillated or needed more than 100,000 iterations. Table IV shows in which training iterations transgenetic vectors have been applied for the XOR problem:
TABLE IV XOR TRANSGENETIC VECTOR APPLICATION

Test   Plasmid   Transposon
1      2,3,4     1
2      2         1
3      -         1
4      5,6       2
5      5         4

As the Black-Scholes problem statement consists of a high number of datasets, the plasmid and transposon applications have been condensed to the total number of applications for each test with the MLP SATG:

TABLE V BLACK-SCHOLES TRANSGENETIC VECTOR APPLICATION

Test   Plasmid   Transposon
1      29        12
2      74        3
3      63        1
4      79        3
5      13        1

VII. CONCLUSION

The transgenetic NeuroEvolution solution outperformed any other solution type SHOCID provides, due to the strategy of combining a host organism with an endosymbiont. By applying transgenetic NeuroEvolution, we can overcome several well-known problems in ANN learning, like the local minima or oscillation problems of back propagation. Due to architectural constraints like the constant changing of the processed solution's structure (or gene sequence), horizontal gene transfer and transposon mutation of the endosymbiont simply do not allow such problems to arise.

VIII. SUMMARY

Transgenetic NeuroEvolution is not only a new approach for ANN learning; existing ANN learning strategies can also benefit from it. Within SHOCID, the introduced approach is able to increase the performance of any of the system's learning approaches for classification and time-series prediction problems, if applied. However, this has just been the beginning of symbiotic ANN learning strategies, and there is still a lot more to improve and discover. As already indicated in [5], the overall target of the project is to advance the system so that it will be of practical use in enterprises; science shall become a more important factor not only in theoretical, but also in practically applied Data Mining.

ACKNOWLEDGEMENTS

Thanks to Dean Professor Sorin-Aurel Moraru for having enabled the SHOCID research and development project.

REFERENCES

[1] E. F. Gouvêa (2001): Transgenética Computacional: Um Estudo Algorítmico. Ph.D. Thesis, Universidade Federal do Rio de Janeiro
[2] E. F. G. Goldbarg, M. C. Goldbarg et al. (2009): Foundations of Computational Intelligence Volume 3 Global Optimization; Berlin Heidelberg: Springer-Verlag, p. 425 ff.
[3] M. C. Goldbarg, E. F. Gouvêa (2000): Computational Transgenetics, X CLAIO, September, Mexico City, Mexico.
[4] E. F. Gouvêa, M. C. Goldbarg (2001): ProtoG: a Computational Transgenetic Algorithm, Proceedings of 4th Metaheuristics International Conference, 2001 Conference on, p. 625-630
[5] F. Neukart et al., "High Order Computational Intelligence in Data Mining - A generic approach to systemic intelligent Data Mining", Proceedings of Speech Technology and Human-Computer Dialogue (SpeD), 2011 6th Conference on, 2011, pp. 1-9.
[6] J. Fulcher et al. (2008): Computational Intelligence A Compendium; Springer; Berlin-Heidelberg, p. 909
[7] J. Heaton (2008): Introduction to Neural Networks for Java, 2nd ed.; Chesterfield: Heaton Research, Inc.
[8] S. Kirkpatrick et al. (1983): Optimization by simulated annealing; Science, 220(4598): 671-680
[9] Scientific Consultant Services (2003): Neural Network Test Data [04-04-2012]; URL: http://www.scientific-consultants.com/nnbd.html
