
March 18 - 2009

02
Semester 2 - 2008/2009
1

Genetic Algorithms
By

Anas Amjad Obeidat

Advanced Algorithms

Overview

• Introduction To Genetic Algorithms (GAs)
• GA Operators and Parameters
• Genetic Algorithms To Solve The Traveling Salesman Problem (TSP)
• 8-queens Problem
• Summary

Introduction To Genetic Algorithms (GAs)

History Of Genetic Algorithms

• “Evolutionary Computing” was introduced in the 1960s by I. Rechenberg.
• John Holland wrote the first book on Genetic Algorithms, ‘Adaptation in Natural and Artificial Systems’, in 1975.
• In 1992 John Koza used genetic algorithms to evolve programs to perform certain tasks. He called his method “Genetic Programming”.

What Are GAs?

Genetic Algorithms are search and optimization techniques based on Darwin’s Principle of Natural Selection.

Principle Of Natural Selection

“Select The Best, Discard The Rest”[1]



GAs Vs other search methods

“Search” for what?

• Data - Efficiently retrieve a piece of information (data mining) → Not AI
• Paths to solutions - Sequence of actions/steps from an initial state to a given goal (AI tree/graph search)
• Solutions - Find a good solution to a problem in a large space (search space) of candidate solutions
– Aggressive methods (e.g. Simulated Annealing, Hill Climbing)
– Non-aggressive methods (e.g. GAs)

Applications of GAs

• Numerical and Combinatorial Optimization
– Job-Shop Scheduling, Traveling salesman
• Automatic Programming
– Genetic Programming
• Machine Learning
– Classification, NNet training, Prediction
• Economics
– Bidding strategies, stock trends
• Ecology
– Host-parasite coevolution, resource flow, biological arms races
• Population Genetics
– Viability of gene propagation
• Social systems
– Evolution of social behavior in insect colonies

Genetic Algorithms Implementation

Computational Model

Main GA algorithm

Working Mechanism Of GAs

Begin → Initialize population (T = 0) → Evaluate solutions → Optimum solution?
Y → Stop
N → Selection → Crossover → Mutation → T = T + 1 → Evaluate solutions

Simple Genetic Algorithm

function GENETIC-ALGORITHM(population, FITNESS-FN, crossover-rate, mutation-rate)
        returns an individual
    inputs: population, a set of individuals
            FITNESS-FN (the fitness function)
    repeat
        new_population ← empty set
        calculate the fitness value of each individual
        loop for i from 1 to SIZE(population) do
            x ← RANDOM-SELECTION(population, FITNESS-FN)
            add x to new_population
        loop for i from 1 to SIZE(population) * crossover-rate do
            x ← RANDOM-SELECTION(new_population)
            y ← RANDOM-SELECTION(new_population)
            x, y ← REPRODUCE(x, y)
        loop for i from 1 to SIZE(population) * mutation-rate do
            x ← RANDOM-SELECTION(new_population)
            x ← MUTATE(x)
        population ← new_population
    until the average fitness values are stable, or enough time has elapsed
    return the best individual found in any population
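The pseudocode above can be sketched in Python roughly as follows. This is a minimal sketch, not the deck's implementation: it assumes higher fitness is better, uses a fixed number of generations in place of the "until stable" test, and takes the hypothetical callables reproduce and mutate as parameters. The OneMax demo (maximize the number of 1 bits) is also an assumption chosen only for illustration.

```python
import random

def genetic_algorithm(population, fitness_fn, reproduce, mutate,
                      crossover_rate=0.8, mutation_rate=0.1,
                      generations=50, rng=None):
    """Sketch of the slide's loop; assumes higher fitness is better."""
    rng = rng or random.Random()
    size = len(population)
    best = max(population, key=fitness_fn)
    for _ in range(generations):  # fixed budget stands in for "until stable"
        # Selection: fitness-proportionate sampling with replacement.
        weights = [fitness_fn(ind) for ind in population]
        new_population = rng.choices(population, weights=weights, k=size)
        # Crossover: replace randomly chosen pairs by their offspring.
        for _ in range(int(size * crossover_rate)):
            i, j = rng.sample(range(size), 2)
            new_population[i], new_population[j] = reproduce(
                new_population[i], new_population[j], rng)
        # Mutation: perturb randomly chosen individuals.
        for _ in range(int(size * mutation_rate)):
            i = rng.randrange(size)
            new_population[i] = mutate(new_population[i], rng)
        population = new_population
        best = max(population + [best], key=fitness_fn)
    return best

# Hypothetical demo problem (OneMax): maximize the number of 1 bits.
def one_point(x, y, rng):
    cut = rng.randrange(1, len(x))
    return x[:cut] + y[cut:], y[:cut] + x[cut:]

def flip_one(x, rng):
    i = rng.randrange(len(x))
    return x[:i] + ("1" if x[i] == "0" else "0") + x[i + 1:]

def ones(x):
    return x.count("1") + 1  # +1 keeps every selection weight positive

rng = random.Random(0)
start = ["".join(rng.choice("01") for _ in range(12)) for _ in range(20)]
best = genetic_algorithm(start, ones, one_point, flip_one, rng=rng)
print(best, ones(best) - 1)
```

Because the best individual seen so far is tracked across generations, the returned string is never worse than the best member of the initial population.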

Nature to Computer Mapping

Nature         Computer
Population     Set of solutions.
Individual     Solution to a problem.
Fitness        Quality of a solution.
Chromosome     Encoding for a solution.
Gene           Part of the encoding of a solution.
Reproduction   Crossover

GA Operators and Parameters

Encoding

The process of representing the solution in the form of a string that conveys the necessary information.

• Binary Encoding – Most common method of encoding. Chromosomes are strings of 1s and 0s and each position in the chromosome represents a particular characteristic of the problem.
• Permutation Encoding – Useful in ordering problems such as the Traveling Salesman Problem (TSP). Example: in TSP, every chromosome is a string of numbers, each of which represents a city to be visited.
• Value Encoding – Used in problems where complicated values, such as real numbers, are used and where binary encoding would not suffice.

Fitness Function

A fitness function quantifies the optimality of a solution (chromosome) so that that particular solution may be ranked against all the other solutions.

• A fitness value is assigned to each solution depending on how close it actually is to solving the problem.
• An ideal fitness function correlates closely with the goal and is quick to compute.
• Example: in TSP, f(x) is the sum of the distances between consecutive cities in the solution. The smaller the value, the fitter the solution.

Recombination

The process that determines which solutions are to be preserved and allowed to reproduce and which ones deserve to die out. (This operator is more commonly called selection; in much of the GA literature, “recombination” refers to crossover.)

• The primary objective of the recombination operator is to emphasize the good solutions and eliminate the bad solutions in a population, while keeping the population size constant.
• “Selects The Best, Discards The Rest”.
• “Recombination” is different from “Reproduction”.



Recombination (contd.)

• Identify the good solutions in a population.
• Make multiple copies of the good solutions.
• Eliminate bad solutions from the population so that multiple copies of good solutions can be placed in the population.

Roulette Wheel Selection

• Each current string in the population has a slot assigned to it which is in proportion to its fitness.
• We spin the weighted roulette wheel thus defined n times (where n is the total number of solutions).
• Each time the roulette wheel stops, the string corresponding to that slot is copied into the new population.

Strings that are fitter are assigned a larger slot and hence have a better chance of appearing in the new population.

Example Of Roulette Wheel Selection

No.    String  Fitness  % Of Total
1      01101   169      14.4
2      11000   576      49.2
3      01000   64       5.5
4      10011   361      30.9
Total          1170     100.0
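As an illustration of the table above, the sketch below recomputes the "% Of Total" column and spins a simulated wheel. The function name roulette_select and the seeded random generator are assumptions made for this example; the strings and fitness values are taken from the table.

```python
import random

strings = ["01101", "11000", "01000", "10011"]
fitness = [169, 576, 64, 361]          # values from the table above

total = sum(fitness)
shares = [round(100 * f / total, 1) for f in fitness]
print(total, shares)                   # 1170 [14.4, 49.2, 5.5, 30.9]

def roulette_select(population, fitness, n, rng):
    """Spin the wheel n times; each slot is as wide as its fitness."""
    total = sum(fitness)
    chosen = []
    for _ in range(n):
        spin = rng.uniform(0, total)
        running = 0.0
        for individual, f in zip(population, fitness):
            running += f
            if spin <= running:
                chosen.append(individual)
                break
        else:
            # Guard against floating-point round-off at the very end.
            chosen.append(population[-1])
    return chosen

new_population = roulette_select(strings, fitness, len(strings),
                                 random.Random(1))
print(new_population)
```

On average string 2 should fill about half of the new population, since its slot covers 49.2% of the wheel.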

Roulette Wheel For Example

Crossover

It is the process in which two chromosomes (strings) combine their genetic material (bits) to produce a new offspring which possesses both their characteristics.

• Two strings are picked from the mating pool at random to cross over.
• The method chosen depends on the Encoding Method.



Crossover Methods

• Single Point Crossover - A random point is chosen on the individual chromosomes (strings) and the genetic material is exchanged at this point.

Chromosome 1  11011 | 00100110110
Chromosome 2  11011 | 11000011110
Offspring 1   11011 | 11000011110
Offspring 2   11011 | 00100110110
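A minimal sketch of single-point crossover on strings, using the chromosomes above. The function name and the explicit cut argument are assumptions for illustration; in a GA the cut would be drawn at random.

```python
def single_point_crossover(a, b, cut):
    """Swap the tails of two equal-length chromosomes after index cut."""
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

c1 = "1101100100110110"
c2 = "1101111000011110"
o1, o2 = single_point_crossover(c1, c2, 5)
print(o1)  # 1101111000011110
print(o2)  # 1101100100110110
```

Because both parents in this example share the same first five bits, each offspring is identical to one of the parents. A cut at a position where the heads differ would produce genuinely new strings.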



Crossover Methods (contd.)

• Two-Point Crossover - Two random points are chosen on the individual chromosomes (strings) and the genetic material between these points is exchanged.

Chromosome 1  11011 | 00100 | 110110
Chromosome 2  10101 | 11000 | 011110
Offspring 1   10101 | 00100 | 011110
Offspring 2   11011 | 11000 | 110110

NOTE: These chromosomes are different from the last example.
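The two-point exchange above can be sketched as follows. The function name and the explicit cut positions i and j are assumptions for this example.

```python
def two_point_crossover(a, b, i, j):
    """Exchange the middle segment [i, j) between two chromosomes."""
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

c1 = "1101100100110110"
c2 = "1010111000011110"
first, second = two_point_crossover(c1, c2, 5, 10)
print(first)   # 1101111000110110  (Offspring 2 on the slide)
print(second)  # 1010100100011110  (Offspring 1 on the slide)
```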



Crossover Methods (contd.)

• Uniform Crossover - Each gene (bit) of the offspring is selected randomly from the corresponding genes of the two parent chromosomes.

Chromosome 1  11011 | 00100 | 110110
Chromosome 2  10101 | 11000 | 011110
Offspring     10111 | 00000 | 110110

NOTE: As described here, Uniform Crossover yields ONLY 1 offspring.
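A minimal sketch of uniform crossover as described above, producing a single offspring. The function name and the seeded generator are assumptions; because each gene is chosen at random, the output will generally differ from the offspring printed on the slide.

```python
import random

def uniform_crossover(a, b, rng):
    """Pick each gene at random from the matching position of a or b."""
    return "".join(rng.choice((x, y)) for x, y in zip(a, b))

c1 = "1101100100110110"
c2 = "1010111000011110"
child = uniform_crossover(c1, c2, random.Random(42))
print(child)
```

Wherever the parents agree on a bit, the child necessarily carries that bit too; only the positions where they differ are actually randomized.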



Crossover (contd.)

• Crossover between 2 good solutions MAY NOT ALWAYS yield a better or as good a solution.
• Since parents are good, probability of the child being good is high.
• If offspring is not good (poor solution), it will be removed in the next iteration during “Selection”.

Elitism

Elitism is a method which copies the best chromosome to the new offspring population before crossover and mutation.

• When creating a new population by crossover or mutation, the best chromosome might be lost.
• Elitism forces the GA to retain some number of the best individuals at each generation.
• It has been found that elitism can significantly improve performance.
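One common way to realize elitism is sketched below. It is only one variant, and an assumption of this example: instead of copying the elite in before crossover and mutation, it lets the best individuals of the old generation displace the worst members of the new one. The function name apply_elitism and the convention that higher fitness is better are likewise assumptions.

```python
def apply_elitism(old_population, new_population, fitness_fn, n_elite=1):
    """Carry the n_elite best old individuals over, dropping the worst new ones."""
    elite = sorted(old_population, key=fitness_fn, reverse=True)[:n_elite]
    survivors = sorted(new_population, key=fitness_fn, reverse=True)
    return elite + survivors[:len(new_population) - n_elite]

# Toy example: individuals are numbers and fitness is the number itself.
print(apply_elitism([5, 9, 1], [2, 3, 4], lambda x: x))  # [9, 4, 3]
```

The population size stays constant, and the best fitness can never decrease from one generation to the next.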

Mutation

It is the process by which a string is deliberately changed so as to maintain diversity in the population set.

We saw in the giraffes’ example that mutations could be beneficial.

Mutation Probability - determines how often the parts of a chromosome will be mutated.

Example Of Mutation

• For chromosomes using Binary Encoding, randomly selected bits are inverted.

Offspring          11011 00100 110110
Mutated Offspring  11010 00100 100110

NOTE: The number of bits to be inverted depends on the Mutation Probability.
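Bit-flip mutation can be sketched as below. The helper names flip_bits and mutate are assumptions; flip_bits reproduces the slide's example deterministically, while mutate draws one random number per bit and flips the bit when that number falls below the mutation probability.

```python
import random

def flip_bits(chromosome, positions):
    """Invert the bits at the given 0-based positions."""
    bits = list(chromosome)
    for p in positions:
        bits[p] = "1" if bits[p] == "0" else "0"
    return "".join(bits)

def mutate(chromosome, p_mut, rng):
    """Flip each bit independently with probability p_mut."""
    positions = [i for i in range(len(chromosome)) if rng.random() < p_mut]
    return flip_bits(chromosome, positions)

offspring = "1101100100110110"
print(flip_bits(offspring, [4, 11]))  # 1101000100100110, as on the slide
print(mutate(offspring, 0.05, random.Random(3)))
```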



Advantages Of GAs

• Global Search Methods: GAs search for the function optimum starting from a population of points of the function domain, not a single one. This characteristic suggests that GAs are global search methods. They can, in fact, climb many peaks in parallel, reducing the probability of getting trapped in a local optimum, which is one of the drawbacks of traditional optimization methods.

Advantages of GAs (contd.)

• Blind Search Methods: GAs only use the information about the objective function. They do not require knowledge of the first derivative or any other auxiliary information, allowing a number of problems to be solved without the need to formulate restrictive assumptions. For this reason, GAs are often called blind search methods.

Advantages of GAs (contd.)

• GAs use probabilistic transition rules during iterations, unlike the traditional methods that use fixed transition rules. This makes them more robust and applicable to a large range of problems.

Advantages of GAs (contd.)

• GAs can be easily used on parallel machines: in real-world design optimization problems, most computational time is spent in evaluating a solution, so with multiple processors all solutions in a population can be evaluated in a distributed manner. This reduces the overall computational time substantially.

Genetic Algorithms To Solve The Traveling Salesman Problem (TSP)

The Problem

The Traveling Salesman Problem is defined as:

‘We are given a set of cities and a symmetric distance matrix that indicates the cost of travel from each city to every other city. The goal is to find the shortest circular tour, visiting every city exactly once, so as to minimize the total travel cost, which includes the cost of traveling from the last city back to the first city’.

Encoding

• We represent every city with an integer.
• Consider 6 Jordanian cities – Amman, Irbid, Al-Mafraq, Al-Salt, Aqabah and Al-Karak – and assign a number to each.

Amman      1
Irbid      2
Al-Mafraq  3
Al-Salt    4
Aqabah     5
Al-Karak   6

Encoding (contd.)

• Thus a path would be represented as a sequence of integers from 1 to 6.
• The path [1 2 3 4 5 6] represents a path from Amman to Irbid, Irbid to Al-Mafraq, Al-Mafraq to Al-Salt, Al-Salt to Aqabah, Aqabah to Al-Karak, and finally Al-Karak back to Amman.
• This is an example of Permutation Encoding, as the position of the elements determines the fitness of the solution.

Fitness Function

• The fitness function will be the total cost of the tour represented by each chromosome.
• This can be calculated as the sum of the distances traversed in each travel segment.

The Lesser The Sum, The Fitter The Solution Represented By That Chromosome.

Distance/Cost Matrix For TSP

               Amman  Irbid  Al-Mafraq  Al-Salt  Al-Aqabah  Al-Karak
               [1]    [2]    [3]        [4]      [5]        [6]
Amman [1]      0      90     100        35       300        200
Irbid [2]      90     0      60         120      400        290
Al-Mafraq [3]  100    60     0          70       480        225
Al-Salt [4]    35     120    70         0        320        150
Aqabah [5]     300    400    480        320      0          290
Al-Karak [6]   200    290    225        150      290        0

Cost matrix for the six-city example. Distances in kilometers.

Fitness Function (contd.)

• So, for a chromosome [4 1 3 2 5 6], the total cost of travel or fitness will be calculated as shown below.
• Fitness = 35 + 100 + 60 + 400 + 290 + 150 = 1035 km.
• Since our objective is to minimize the distance, the lesser the total distance, the fitter the solution.
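The calculation above can be checked with a short sketch. The matrix values are copied from the deck's six-city cost matrix; the names DIST and tour_length are assumptions for this example.

```python
# Symmetric distance matrix from the deck (km); cities are numbered 1..6.
DIST = [
    [0, 90, 100, 35, 300, 200],    # Amman [1]
    [90, 0, 60, 120, 400, 290],    # Irbid [2]
    [100, 60, 0, 70, 480, 225],    # Al-Mafraq [3]
    [35, 120, 70, 0, 320, 150],    # Al-Salt [4]
    [300, 400, 480, 320, 0, 290],  # Aqabah [5]
    [200, 290, 225, 150, 290, 0],  # Al-Karak [6]
]

def tour_length(tour, dist=DIST):
    """Sum the legs of a closed tour, including the leg back to the start."""
    return sum(dist[a - 1][b - 1] for a, b in zip(tour, tour[1:] + tour[:1]))

print(tour_length([4, 1, 3, 2, 5, 6]))  # 1035
```

Pairing the tour with a copy of itself rotated by one position is what adds the closing leg from the last city back to the first.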

Selection Operator

Tournament Selection.

As the name suggests, tournaments are played between two solutions and the better solution is chosen and placed in the mating pool.

Two other solutions are picked again and another slot in the mating pool is filled up with the better solution.
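A minimal sketch of binary tournament selection for a minimization problem such as TSP. The function name, the cost callback, and the choice to draw each pair at random from the whole population are assumptions for this example; the three costs are tour lengths that appear in the deck's worked example.

```python
import random

def tournament_select(population, cost, pool_size, rng):
    """Fill the mating pool with winners of random two-way tournaments."""
    pool = []
    for _ in range(pool_size):
        a, b = rng.sample(population, 2)   # two distinct contestants
        pool.append(a if cost(a) <= cost(b) else b)
    return pool

# Tour lengths in km for three of the tours in the worked example.
costs = {"P2": 1300, "P3": 1045, "P4": 1105}
pool = tournament_select(list(costs), costs.get, 4, random.Random(7))
print(pool)
```

With two-way tournaments the worst individual can never win, so P2 never enters the mating pool here, while the best individual wins every tournament it takes part in.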

Why we can’t use single-point crossover

• Single point crossover method randomly selects a crossover point in the string and swaps the substrings.
• This may produce invalid offspring, as shown below: each offspring visits one city twice and omits another.

Parents            Offspring
4 1 3 | 2 5 6      4 1 3 | 1 5 6
4 3 2 | 1 5 6      4 3 2 | 2 5 6
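The problem can be reproduced with a short check. The helper is_valid_tour is an assumption for this example; the parents and the cut after the third city are taken from the slide.

```python
def is_valid_tour(tour, n_cities):
    """A valid tour visits every city 1..n_cities exactly once."""
    return sorted(tour) == list(range(1, n_cities + 1))

p1 = [4, 1, 3, 2, 5, 6]
p2 = [4, 3, 2, 1, 5, 6]
cut = 3
c1 = p1[:cut] + p2[cut:]
c2 = p2[:cut] + p1[cut:]
print(c1, is_valid_tour(c1, 6))  # [4, 1, 3, 1, 5, 6] False
print(c2, is_valid_tour(c2, 6))  # [4, 3, 2, 2, 5, 6] False
```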

Order 1 crossover

• Idea is to preserve the relative order in which elements occur
• Informal procedure:
1. Choose an arbitrary part from the first parent
2. Copy this part to the first child
3. Copy the numbers that are not in the first part, to the first child:
• starting right from cut point of the copied part,
• using the order of the second parent
• and wrapping around at the end
4. Analogous for the second child, with parent roles reversed
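The informal procedure above can be sketched as follows. The function name and the explicit cut indices i and j (0-based, end-exclusive) are assumptions for this example; the parents are the two tours crossed in this deck's TSP worked example.

```python
def order1_crossover(p1, p2, i, j):
    """Keep p1[i:j]; fill the rest in p2's order, starting after the cut."""
    n = len(p1)
    child = [None] * n
    child[i:j] = p1[i:j]
    kept = set(p1[i:j])
    # Walk p2 from the second cut point, wrapping around, skipping kept genes.
    fill = [p2[(j + k) % n] for k in range(n) if p2[(j + k) % n] not in kept]
    # Place them into the child starting at the same cut point, wrapping.
    for k, gene in enumerate(fill):
        child[(j + k) % n] = gene
    return child

a = [2, 1, 3, 4, 5, 6]
b = [1, 4, 3, 2, 6, 5]
print(order1_crossover(a, b, 2, 5))  # [2, 6, 3, 4, 5, 1]
print(order1_crossover(b, a, 2, 5))  # [4, 5, 3, 2, 6, 1]
```

Both children are valid permutations, which is the property single-point crossover fails to guarantee.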

Order 1 crossover example

• Copy randomly selected set from first parent

• Copy rest from second parent in order 1,9,3,8,2



Mutation Operator

• The mutation operator induces a change in the solution, so as to maintain diversity in the population and prevent Premature Convergence.
• In our project, we mutate the string by randomly selecting any two cities and interchanging their positions in the solution, thus giving rise to a new tour.

4 1 3 2 5 6  →  4 5 3 2 1 6
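Swap mutation as described above can be sketched like this. The function names are assumptions; swap_positions reproduces the slide's example with fixed indices, and swap_mutation picks the two positions at random.

```python
import random

def swap_positions(tour, i, j):
    """Return a copy of the tour with positions i and j exchanged."""
    mutated = list(tour)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated

def swap_mutation(tour, rng):
    """Exchange two distinct, randomly chosen positions."""
    i, j = rng.sample(range(len(tour)), 2)
    return swap_positions(tour, i, j)

tour = [4, 1, 3, 2, 5, 6]
print(swap_positions(tour, 1, 4))           # [4, 5, 3, 2, 1, 6]
print(swap_mutation(tour, random.Random(5)))
```

Since cities are only exchanged, never overwritten, the mutated chromosome is always a valid tour.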

TSP Example: details (1)

• Termination Condition: Generation 3
• Fitness Function: Minimum distance between cities
• Initial Population:
P1: {2,1,3,4,5,6}
P2: {1,2,3,5,4,6}
P3: {1,4,3,2,6,5}
P4: {5,3,2,1,4,6}
• Generation 1:
1- Fitness Function (P1)
(2,1) + (1,3) + (3,4) + (4,5) + (5,6) + (6,2) = 90 + 100 + 70 + 320 + 290 + 290 = 1160 km

2- Fitness Function (P2)
(1,2) + (2,3) + (3,5) + (5,4) + (4,6) + (6,1) = 90 + 60 + 480 + 320 + 150 + 200 = 1300 km

3- Fitness Function (P3)
(1,4) + (4,3) + (3,2) + (2,6) + (6,5) + (5,1) = 35 + 70 + 60 + 290 + 290 + 300 = 1045 km

4- Fitness Function (P4)
(5,3) + (3,2) + (2,1) + (1,4) + (4,6) + (6,5) = 480 + 60 + 90 + 35 + 150 + 290 = 1105 km

TSP Example: details (2)

• Tournament Selection
P1: 1160 km
P2: 1300 km
P3: 1045 km
P4: 1105 km
The Winners: P1 & P3 (assuming the tournaments pair P1 against P2 and P3 against P4)

• Crossover (Two Points): Order (1)

Table 1
Nodes  Solution          Notes
P1     2 1 | 3 4 5 | 6
P3     1 4 | 3 2 6 | 5
S1     2 6 | 3 4 5 | 1   5 1 4 3 2 6 (Order 1)
S2     4 5 | 3 2 6 | 1   6 2 1 3 4 5 (Order 1)

TSP Example: details (3)

• Generation 2
P1: {2,1,3,4,5,6} = 1160 km
P2: {1,4,3,2,6,5} = 1045 km
P3: {2,6,3,4,5,1} = 1295 km
P4: {4,5,3,2,6,1} = 1385 km
• Tournament Selection
P1: 1160 km
P2: 1045 km
P3: 1295 km
P4: 1385 km
The Winners: P1 & P2
• Crossover (Two Points): Order (1)

Table 2
Nodes  Solution          Notes
P1     2 1 | 3 4 5 | 6
P2     1 4 | 3 2 6 | 5
S1     2 6 | 3 4 5 | 1   1 2 6 (Order 1)
S2     4 5 | 3 2 6 | 1   1 4 5 (Order 1)

TSP Example: details (4)

• Generation 3
P1: {2,1,3,4,5,6} = 1160 km
P2: {1,4,3,2,6,5} = 1045 km
P3: {2,6,3,4,5,1} = 1295 km
P4: {4,5,3,2,6,1} = 1385 km
• Tournament Selection
P1: 1160 km
P2: 1045 km
P3: 1295 km
P4: 1385 km
The Winners: P1 & P2
• Crossover (Two Points): Order (1)
The crossover result is the same as in the previous table (Table 2).
• Mutation
^P1: 2 6 3 4 5 1, Fitness = 1295 km
Mutation is used to help the search escape local minima.

We find that the best solution at the termination condition (Generation 3) is P2, at 1045 km.

8-queens Problem

8-queens

• How to represent the 8-queens problem in GA?
• Remember an individual is a potential solution.
• In the 8-queens problem, it will be a state with 8 queens on the board.
• One way is to specify the position of the 8 queens, each in a column of 8 squares.
• For example, the setting on the right will be specified by this chromosome: (86427531)
• This can be represented by bits or digits.
• Note: this is not an optimization problem.

8-queens: (a) Initialization

• Assume we have the following initial population with 4 individuals:
v1 = (24748552)
v2 = (32752411)
v3 = (24415124)
v4 = (32543213)

8-queens: (b) Fitness Evaluation

• Fitness function: the fewer conflicts (attacking queens), the better.
• We can use the number of non-attacking pairs of queens. The highest possible value of the fitness function is C(8,2) = 28. Every solution will have a fitness value of 28.

8-queens: (b) Fitness Evaluation

• We calculate the fitness value of each chromosome.
• For example, fitness of the chromosome v1 (24748552) is 28 – 4 = 24
• That is because only 4 pairs of queens attack each other:
– The queens on 1st and 8th column
– The queens on 2nd and 4th column
– The queens on 6th and 7th column
– The queens on 3rd and 8th column

8-queens: (b) Fitness Evaluation

• The fitness values for the chromosomes are calculated as follows:
eval(v1) = 24
eval(v2) = 23
eval(v3) = 20
eval(v4) = 11
• None of the chromosomes is the solution to the problem. If a solution is found, the algorithm stops and returns the solution.
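The fitness values above can be reproduced with a short sketch. The function name is an assumption; a pair of queens is counted as attacking when the two share a row or a diagonal (each chromosome digit gives the row of the queen in that column, so columns never clash), and intervening queens are not treated as blocking.

```python
from itertools import combinations

def non_attacking_pairs(chromosome):
    """Fitness: number of queen pairs that do NOT attack each other."""
    rows = [int(d) for d in chromosome]
    attacking = sum(
        1
        for (c1, r1), (c2, r2) in combinations(enumerate(rows), 2)
        if r1 == r2 or abs(r1 - r2) == abs(c1 - c2)  # same row or diagonal
    )
    n = len(rows)
    return n * (n - 1) // 2 - attacking

for v in ["24748552", "32752411", "24415124", "32543213"]:
    print(v, non_attacking_pairs(v))  # 24, 23, 20, 11
```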

8-queens: (c) Selection

• The total sum of fitness values = 24 + 23 + 20 + 11 = 78
• So, the probability of each chromosome being selected into the next generation is as follows:
prob(v1) = 24/78 = 31%
prob(v2) = 23/78 = 29%
prob(v3) = 20/78 = 26%
prob(v4) = 11/78 = 14%

8-queens: (c) Selection

• Next, we arrange these probabilities into different ranges from 0 to 1 to facilitate the roulette wheel process:
v1 : 0.00 to 0.31
v2 : 0.31 to 0.60
v3 : 0.60 to 0.86
v4 : 0.86 to 1.00

8-queens: (c) Selection

• Four random numbers are then drawn for the next generation. Suppose we have the following random numbers:
0.4012
0.1486
0.5973
0.8129
• The following individuals will be chosen:
0.4012 → v2 (32752411) → v1'
0.1486 → v1 (24748552) → v2'
0.5973 → v2 (32752411) → v3'
0.8129 → v3 (24415124) → v4'
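The mapping from random numbers to chromosomes can be sketched with cumulative ranges and a binary search. The names upper and pick are assumptions for this example; the fitness values and the four random numbers are the ones used in the deck.

```python
from bisect import bisect_right
from itertools import accumulate

names = ["v1", "v2", "v3", "v4"]
fitness = [24, 23, 20, 11]
total = sum(fitness)

# Upper end of each chromosome's range on the [0, 1) wheel.
upper = [c / total for c in accumulate(fitness)]

def pick(r):
    """Return the chromosome whose range contains r, for 0 <= r < 1."""
    return names[bisect_right(upper, r)]

draws = [0.4012, 0.1486, 0.5973, 0.8129]
print([pick(r) for r in draws])  # ['v2', 'v1', 'v2', 'v3']
```

Using the exact fractions rather than rounded percentages places the boundaries at about 0.308, 0.603 and 0.859, which gives the same four picks.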

8-queens: (d) Crossover

• Next, some of these four chromosomes will perform crossover. Suppose the crossover probability is 0.80. All 4 chromosomes are selected for crossover (the number is rounded up to an even number).
• The selected chromosomes are paired up randomly.
• A crossover point is randomly chosen for each crossover.

8-queens: (d) Crossover

• Suppose the 3rd digit in the first pair is chosen as the crossover point.
v1' = (327 | 52411)
v2' = (247 | 48552)
• After crossover, we will have:
v1'' = (327 | 48552)
v2'' = (247 | 52411)

[Figure: board diagrams of the parents v1' and v2' and their offspring v1'' and v2'']

8-queens: (e) Mutation

• For each gene (digit), there is a small chance that it will be mutated.
• In the 8-queens problem, it means choosing a queen at random and moving it to a random square in its column.
• Suppose the mutation probability is 0.05
• 32 random numbers are generated in total.
• Suppose the 6th, 19th, and 32nd random numbers are smaller than 0.05.
• The three corresponding digits will be mutated:
– 6th digit in v1''
– 3rd digit in v3''
– 8th digit in v4''
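The correspondence between the 32 random numbers and the digits can be made explicit with a small sketch. It assumes the random numbers are consumed chromosome by chromosome, eight per chromosome; the function name locate is hypothetical.

```python
GENES = 8  # digits per chromosome

def locate(k, genes=GENES):
    """Map the k-th random number (1-based) to (chromosome no., digit no.)."""
    chrom, digit = divmod(k - 1, genes)
    return chrom + 1, digit + 1

for k in (6, 19, 32):
    print(k, locate(k))  # 6 (1, 6), 19 (3, 3), 32 (4, 8)
```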

8-queens: (e) Mutation

• For each digit to be mutated, another random number will be generated to determine where the queen should be moved to.
• For example,
v1'' = (32748552)
• If a random number determines that the 6th digit should be mutated to 1, the new chromosome will become:
v1''' = (32748152)

8-queens: (e) Mutation

• The same process is applied to every gene to be mutated.
• The final chromosomes for the new generation are thus as follows:
v1''' = (32748152)
v2''' = (24752411)
v3''' = (32252124)
v4''' = (24415417)
• The process is then repeated from step (b) until a solution is found.

8-queens: A Summary

Summary

• Genetic Algorithms (GAs) implement optimization strategies based on simulation of the natural law of evolution of a species by natural selection.
• The basic GA Operators are:
Encoding
Recombination
Crossover
Mutation
• GAs have been applied to a variety of function optimization problems, and have been shown to be highly effective in searching a large, poorly defined search space even in the presence of difficulties such as high-dimensionality, multi-modality, discontinuity and noise.

References

1. D. E. Goldberg, ‘Genetic Algorithms in Search, Optimization, and Machine Learning’, New York: Addison-Wesley (1989).
2. John H. Holland, ‘Genetic Algorithms’, Scientific American, July 1992.
3. Kalyanmoy Deb, ‘An Introduction To Genetic Algorithms’, Sadhana, Vol. 24, Parts 4 and 5.
4. T. Starkweather, et al., ‘A Comparison Of Genetic Sequencing Operators’, International Conference On GAs (1991).
5. D. Whitley, et al., ‘Traveling Salesman And Sequence Scheduling: Quality Solutions Using Genetic Edge Recombination’, Handbook Of Genetic Algorithms, New York.

WEBSITES
6. www.iitk.ac.in/kangal
7. www.math.princeton.edu
8. www.genetic-programming.com
9. www.garage.cse.msu.edu
10. www.aic.nre.navy.mie/galist

Questions?