Professional Documents
Culture Documents
Tsai-Yang Jea
Department of Computer Science and Engineering SUNY at Buffalo
Query tools Statistical techniques Visualization On-line analytical processing (OLAP) Clustering
Whats Clustering
Clustering is a kind of unsupervised learning. Clustering is a method of grouping data that share similar trend and patterns. Clustering of data is a method by which large sets of data is grouped into clusters of smaller sets of similar data.
Example:
After clustering: Thus, we see clustering means grouping of data or dividing a large data set into smaller data sets of some similarity.
A Clustering Example
Cluster 1
Income: Low Children:0 Car:Compact
Income: Medium Children:2 Car: Sedan and Car:Truck Children:3 Income: Medium Cluster 4
Cluster 3
Cluster 2
(b) d j h i
1 0.4 0.1 0.3 2 3 0.1 0.5 0.8 0.1 0.3 0.4
e c f b
d a
e j g
h
i
c f
(d)
a c i e d kb j f h
...
For each cluster,recomputeits center by finding the mean of the cluster : 1 Nk Mk X jk N k j 1 where M k is the new mean, N k is the number of training patterns in cluster k , and X jk is the j - th pattern belonging to cluster k .
Figure 1
Introduction of GAs
Inspired by biological evolution. Many operators mimic the process of the biological evolution including Natural selection Crossover Mutation
Genetic Representation
The most important starting point to develop a genetic algorithm Each gene has its special meaning Based on this representation, we can define fitness evaluation function, crossover operator, mutation operator.
Wind 10
PlayTennis 10
1 2 1 4 3 5 5 A B C D E F G
Crossover
Exchange features of two individuals to produce two offspring (children) Selected mates may have good properties to survive in next generations So, we can expect that exchanging features may produce other good individuals
Crossover (cont.)
Single-point Crossover
1 0 1 0 1 0 0 0 1 1 0 0 0 1 1 0 0 1 0 0 0 1 1 0 1 0 1 0 0 0 1 1 0 0 1 0 0 1 1 0 0 0 1 0
Two-point Crossover
1 0 1 0 1 0 0 0 1 1 0 0 0 1 1 0 0 1 0 0 0 1 1 0 1 0 0 1 0 0 1 1 0 0 1 0 1 0 0 1 0 0 0 1
Uniform Crossover
1
1 0
Crossover template
0
0 1
0
1 0
0
1 0
1
0 0
1
1 1
0
0 0
1
0 1
0
1 0
1
0 0
1
0 1 1 0 0 1 0 1 0 0 1 1 0 0 0 1 0 1 1 0 0 0 0 1
Mutation
Usually change a single bit in a bit string This operator should happen with very low probability.
0 1 1 0 1 Mutation point (random) 0 1 1 1 1
Typical Procedures
0 1 1 0 1 0 0 0 1 1 1 1 0 1 1 0 1 1 1 1 1 0 0 1 1 1 0 1 0 0 1 0 1 1 1 0 0 0 1 1 1 0 1 1 0 1 0 0 1 1
old generation Probabilistically select individuals Crossover mates are probabilistically selected based on their fitness value.
1
0 new generation 0 1
1
1 1 1 1
0
1 1 1 1
0
1 0 1 0
1
1 1 0 0
Fission: takes a single allele value and gives it a different random allele value, breaking a cluster apart.
1 3 3 3 5 1 3 4 4 5
Example: (Cont.)
Old generation
1 2 3 3 5 1 2 3 4 5 2 1 3 3 5 2 2 3 4 5 1 1 3 1 5
2 2 2 3 5 1 2 3 5 5 2 4 3 3 4 2 2 3 4 5 2 1 3 2 5
New generation
Finally