
International Journal of Advance Foundation and Research in Computer (IJAFRC)
Volume 1, Issue 5, May 2014. ISSN 2348 - 4853



Enhancing Performance of KNN Classifier by Means of Genetic Algorithm and Particle Swarm Optimization
Asha Gowda Karegowda*, Kishore B.
Department of MCA, Siddaganga Institute of Technology, Tumkur, India
ashagksit@gmail.com

ABSTRACT
KNN is susceptible to noise because it is based on the distance between the test and the training samples. Feature weighting and significant feature selection are ways to surmount this limitation of the KNN classifier. This paper proposes three methods, namely binary encoded Genetic Algorithm (GA) for identifying significant features, and real encoded GA and Particle Swarm Optimization (PSO) identified feature weights, for enhancing the performance of the KNN classifier. The outcomes of the proposed methods proved to be of better quality when compared to KNN performance with weights provided by the information gain, gain ratio and Relief methods. Further, the results of the proposed work also proved to be superior when compared to the results of prominent classifiers like radial basis function, support vector machine, decision tree, Bayesian and Naïve Bayes classifiers. The binary encoded GA identified significant features and the real encoded GA and PSO provided weights also proved to augment the performance of the fuzzy KNN classifier. Computational work has been carried out on seven different datasets availed from the UCI machine learning repository.
Index Terms: Crisp KNN, Fuzzy KNN, Genetic Algorithm, Particle Swarm Optimization, feature subset selection, feature weights

I. INTRODUCTION

Classification is a supervised model which maps, or classifies, a data item into one of several predefined classes. Data classification is a two-step process. In the first step, a model is built describing a predetermined set of data classes or concepts. Typically the learned model is represented in the form of classification rules, decision trees, or mathematical formulae. In the second step the model is used for classification.
Classifiers are of two types: instance-based or lazy learners, and eager learners. Eager learners (decision tree, Bayesian classifier, SVM, back propagation neural network), when given a training set, construct a classifier model and use the constructed model to classify test samples, i.e. previously unseen samples. In contrast, instance-based or lazy learners (k-nearest neighbor classifier and case-based reasoning classifier) store all of the training samples and do not build a classifier until a new sample with no class label needs to be classified. A lazy learner does less work when training samples are presented and more work when making a classification or prediction for a test sample [1].
Feature subset selection is of immense importance in the field of data mining. Mining on a reduced set of attributes not only reduces computation time but also helps to make the patterns easier to understand. The wrapper model approach uses the classification method itself to measure the significance of a feature set; hence, the features selected depend on the classifier model used, in contrast to the filter approach, which is independent of the learning induction algorithm [2]. In this paper, binary encoded
Genetic Algorithm (GA) has been used for feature subset selection and is wrapped with the KNN classifier. In addition to feature subset selection, the performance of the KNN classifier can be enhanced by finding weights for each feature, which measure the relevance of that feature for the classification task [3]. Feature subset selection is a special case of feature weighting in which weight one is assigned to significant features and weight zero to non-significant features. Binary encoded GA has previously been used to identify the significant features for k-means clustering [4]. In this paper, real encoded GA [5] and Particle Swarm Optimization (PSO) [6] have been used to find the weights of features for enhancing the accuracy of the KNN classifier. For the sake of completeness, crisp and fuzzy KNN classifiers are briefed in Section II, followed by the proposed binary encoded GA for feature selection and real encoded GA for feature weighting in Section III. The proposed real encoded PSO generated weights adopted for enhancing the performance of the KNN classifier are briefed in Section IV. The computational results are presented in Section V, followed by conclusions and future enhancements in Section VI.

II. CRISP AND FUZZY K-NEAREST NEIGHBOR ALGORITHM

Crisp KNN is a simple supervised classification technique which belongs to the instance-based or lazy learning family of methods [1, 7]. It delays the process of modeling the training data until that model is needed to classify a test sample. The training samples are described by n-dimensional numeric attributes and are stored in an n-dimensional space. When a test sample (with unknown class label) is given, the k-nearest neighbor classifier searches for the k training samples which are closest to the unknown sample, followed by applying majority voting for classification. Closeness is usually defined in terms of Euclidean distance. The Euclidean distance between two points P (p1, p2, ..., pn) and Q (q1, q2, ..., qn) is given by equation 1.
$d(P, Q) = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$   eq. (1)
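As a minimal sketch of the crisp KNN decision (assuming NumPy arrays, with X_train holding the training samples row-wise and y_train their class labels; both names are illustrative), equation 1 plus majority voting might be implemented as follows:

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_test, k=3):
    """Classify one test sample by majority vote among its k nearest
    training samples, with closeness measured by Euclidean distance (eq. 1)."""
    distances = np.sqrt(((X_train - x_test) ** 2).sum(axis=1))
    nearest = np.argsort(distances)[:k]           # k closest training samples
    votes = Counter(y_train[i] for i in nearest)  # majority voting
    return votes.most_common(1)[0][0]
```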
In spite of the simplicity of KNN, it suffers from quite a few drawbacks: it requires memory proportional to the size of the training set; it has a high computation cost, since it needs to compute the distance of each test instance to all training samples; it has a low accuracy rate on multidimensional data sets with irrelevant features; and there is no thumb rule to determine the value of the parameter k (the number of nearest neighbors). The accuracy of the KNN classifier can be improved by identifying the optimal value of k, in addition to identifying the significant inputs for KNN. The golden section search has been used in combination with Akaike's Information Criterion (AIC) to find the optimal number of nearest neighbors k [8]. In addition, prototype generation and prototype selection are used to enhance the nearest neighbor classifier through data reduction [9, 10]. Further, KNN performance can be improved by identifying the significant features and finding the feature weights. Weighted KNN is an extension of the KNN classifier which incorporates weights for individual attributes, in contrast to the plain KNN classifier, which assumes equal weights for all attributes. Authors have used six different methods, namely information gain, gain ratio, One Rule classifier, Significance Feature Evaluator, Relief and KNNFP with one attribute, for assigning weights to enhance the performance of the KNN classifier [11].
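Weighted KNN simply scales each attribute's contribution to equation 1 by its weight; a one-function sketch (assuming a NumPy weight vector w, e.g. one produced by the GA or PSO of Sections III-IV) is:

```python
import numpy as np

def weighted_distance(p, q, w):
    """Weighted Euclidean distance: each squared attribute difference
    is scaled by that attribute's weight before summing."""
    return np.sqrt((w * (p - q) ** 2).sum())
```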

In the case of the KNN classifier, once an input vector is assigned to a class, there is no indication of the "strength" of its membership in that class [12]. The fuzzy KNN algorithm assigns class memberships to the test record rather than assigning the test record to one particular class. It assigns the memberships based on the test record's distance from its k nearest neighbors and those neighbors' memberships in the possible classes. The memberships are collected in a matrix U of size c x n, where c and n are the number of class labels and training samples respectively, subject to the conditions given by equations 2 and 3, where u_ik is the membership of the k-th training record in the i-th class.
$\sum_{i=1}^{c} u_{ik} = 1$   eq. (2)

$u_{ik} \in [0, 1]$   eq. (3)

The working of the fuzzy k-nearest neighbor algorithm [12] is as follows:
For each test sample x, repeat steps a-d:
a) Compute the distance between test sample x and each of the training samples.
b) Find the k nearest neighbors of test sample x.
c) Compute the membership of test sample x in each class ci (i = 1 to the number of class labels), i.e. u_i(x), using equation (4):
$u_i(x) = \dfrac{\sum_{j=1}^{k} u_{ij}\left(1 / \|x - x_j\|^{2/(m-1)}\right)}{\sum_{j=1}^{k}\left(1 / \|x - x_j\|^{2/(m-1)}\right)}$   eq. (4)

where u_ij is the membership of the j-th neighbor of test sample x in class ci, and m is the fuzzifier value, usually set to 2.
d) The result of the fuzzy classification for test sample x is specified using a simple crisp partition, where the test sample is assigned to the class of maximum membership.
End for
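A minimal sketch of steps a-d, assuming a NumPy c x n training membership matrix U_train whose columns satisfy equations 2-3 (the function name and the small epsilon guarding exact-match distances are illustrative):

```python
import numpy as np

def fuzzy_knn_memberships(X_train, U_train, x_test, k=3, m=2):
    """Membership of a test sample in each class per equation (4);
    the crisp decision is the class of maximum membership."""
    d = np.sqrt(((X_train - x_test) ** 2).sum(axis=1))
    nearest = np.argsort(d)[:k]
    # inverse-distance weights 1 / ||x - x_j||^(2/(m-1))
    w = 1.0 / (d[nearest] ** (2.0 / (m - 1)) + 1e-12)
    u = (U_train[:, nearest] * w).sum(axis=1) / w.sum()
    return u  # crisp partition: u.argmax()
```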

There are basically three different techniques for assigning the memberships u_ij used in equation 4 to the training samples [12]. The first method uses a crisp labeling and assigns each training sample complete membership of one in its known class and zero membership in all other classes. The second technique works only on two-class data sets; it assigns a sample membership in its known class based on the sample's distance from the mean of that class. The third method assigns memberships to the training samples according to a k-nearest neighbor rule using equation 5. The k nearest neighbors of each sample x (say x belonging to class ci) are found, and then membership in each class is assigned according to the following equation:

$u_j(x) = \begin{cases} 0.51 + (n_j / k) \times 0.49 & \text{if } j = i \\ (n_j / k) \times 0.49 & \text{if } j \neq i \end{cases}$   (j = 1 to c)   eq. (5)
where n_j is the number of the neighbors belonging to class cj and k is the total number of neighbors of sample x.
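A sketch of this third labeling scheme (assuming integer class codes 0..c-1 and a precomputed list neighbor_labels giving the labels of each training sample's k nearest neighbors; both names are illustrative):

```python
import numpy as np

def init_memberships(y_train, neighbor_labels, c):
    """Build the c x n training membership matrix per equation (5):
    0.49 * n_j / k in each class, plus 0.51 for the known class."""
    n = len(y_train)
    U = np.zeros((c, n))
    for s in range(n):
        k = len(neighbor_labels[s])
        counts = np.bincount(neighbor_labels[s], minlength=c)
        U[:, s] = 0.49 * counts / k
        U[y_train[s], s] += 0.51  # extra mass for the known class
    return U
```

Each column then sums to 0.51 + 0.49 = 1, so equation 2 still holds.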

III. GA FOR IDENTIFYING SIGNIFICANT FEATURES AND FEATURE WEIGHTS FOR CRISP AND FUZZY KNN CLASSIFIER

GA is a stochastic general search method capable of effectively exploring large search spaces. The basic techniques of GAs follow the Charles Darwin principle of "survival of the fittest". The reproduction, crossover and mutation operations are applied to parent chromosomes to generate the next-generation offspring. Authors have used GA for optimizing the connection weights of a feed forward neural network, for finding the significant features for various classifiers using both filter and wrapper approaches, and for finding the optimal centroids for k-means and fuzzy k-means clustering [4, 13-16].

Authors have applied binary encoded GA for identifying the significant features. In binary encoded GA, each gene of the chromosome is either 0 or 1: a 1 indicates that the corresponding feature is significant and a 0 that it is not. In the proposed method with binary encoding, the length of the chromosome is equal to the total number of features, say F. For example, the diabetes data set has F = 8 features, so the chromosome length is 8, and with the binary encoded chromosome 10011001 the 1st, 4th, 5th and 8th features are significant while the 2nd, 3rd, 6th and 7th features are not. The working of binary encoded GA for finding the significant feature subset for k-nearest neighbor is as follows:

i. Initialize the chromosome population randomly using binary encoding (each chromosome's length is equal to the total number of features F for the given dataset).
ii. Repeat steps a-d till the terminating condition (maximum number of generations) is reached.

a. Apply k-nearest neighbor with each individual chromosome representing the significant features, and take the classification accuracy as the fitness of the chromosome.
b. Select the chromosome giving the highest classification accuracy of the KNN classifier as the fittest chromosome and replace the lowest-fit chromosome by the highest-fit chromosome (reproduction).
c. Select any two chromosomes randomly and apply the one-point crossover operation.
d. Apply the mutation operation by randomly selecting a chromosome and randomly flipping a bit 1 to 0 or a bit 0 to 1.

The positions of bit 1 in the best-fit chromosome are considered as significant attributes for both the fuzzy and crisp k-nearest neighbor classifiers.
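A condensed sketch of this wrapper loop (knn_accuracy is an assumed helper that trains and tests KNN on the features where the mask is 1 and returns the accuracy; population size and generation count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def binary_ga_features(knn_accuracy, F, pop_size=20, generations=50):
    """Binary encoded GA wrapper for feature subset selection."""
    pop = rng.integers(0, 2, size=(pop_size, F))
    for _ in range(generations):
        fitness = np.array([knn_accuracy(c) for c in pop])
        pop[fitness.argmin()] = pop[fitness.argmax()]   # reproduction
        i, j = rng.choice(pop_size, 2, replace=False)   # one-point crossover
        p = rng.integers(1, F)
        pop[i, p:], pop[j, p:] = pop[j, p:].copy(), pop[i, p:].copy()
        pop[rng.integers(pop_size), rng.integers(F)] ^= 1  # mutation: flip a bit
    fitness = np.array([knn_accuracy(c) for c in pop])
    return pop[fitness.argmax()]  # positions of 1s mark significant features
```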

In addition to significant feature selection using binary encoded GA, authors have applied real encoded GA for finding feature weights for the KNN classifier. With real encoded GA the chromosomes represent the weights of the features, in contrast to binary encoded GA, which is used to find the subset of significant features from the original feature set. Binary encoded GA can be considered a special method for finding the weights of features, in which weight zero is assigned to non-significant features and weight one to significant features. A repair algorithm is used with the real encoded GA to guarantee the feasibility of chromosomes (i.e. the sum of the weights of all features must be equal to one). This is done by finding the sum of all feature weights and dividing each feature weight by that total weight.
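The repair step is a one-line rescaling; a sketch:

```python
import numpy as np

def repair(weights):
    """Repair algorithm: rescale a chromosome so its feature weights
    sum to one, preserving their relative proportions."""
    return weights / weights.sum()
```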

The functioning of real encoded GA for finding the feature weights for k-nearest neighbor is as follows:

i. Initialize the chromosome population randomly using real encoding (each chromosome's length is equal to the total number of features F).
ii. Repeat steps a-d till the terminating condition (maximum number of generations) is reached.
a) Apply weighted k-nearest neighbor with each individual chromosome representing the feature weights, and take the classification accuracy as the fitness of the chromosome.
b) Select the chromosome giving the highest classification accuracy of weighted KNN as the fittest chromosome and replace the lowest-fit chromosome by the highest-fit chromosome.
c) Select any two chromosomes randomly and apply the one-point crossover operation (call the repair algorithm if the sum of the weights of all features exceeds one).
d) Apply the mutation operation by randomly selecting any chromosome and altering one of its weights by multiplying it with a random real number (call the repair algorithm if the sum of the weights of all features exceeds one).
iii. The weights in the fittest chromosome are considered as feature weights for both the crisp and fuzzy k-nearest neighbor classifiers.
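A sketch mirroring the binary GA above, with real encoding, the repair rescaling, and multiplicative mutation (wknn_accuracy is an assumed helper returning weighted-KNN test accuracy under a weight vector):

```python
import numpy as np

rng = np.random.default_rng(1)

def repair(w):
    return w / w.sum()  # repair algorithm sketched above

def real_ga_weights(wknn_accuracy, F, pop_size=20, generations=50):
    """Real encoded GA for KNN feature weights."""
    pop = np.array([repair(rng.random(F)) for _ in range(pop_size)])
    for _ in range(generations):
        fitness = np.array([wknn_accuracy(w) for w in pop])
        pop[fitness.argmin()] = pop[fitness.argmax()]   # reproduction
        i, j = rng.choice(pop_size, 2, replace=False)   # one-point crossover
        p = rng.integers(1, F)
        pop[i, p:], pop[j, p:] = pop[j, p:].copy(), pop[i, p:].copy()
        pop[i], pop[j] = repair(pop[i]), repair(pop[j])
        m, g = rng.integers(pop_size), rng.integers(F)  # multiplicative mutation
        pop[m, g] *= 2 * rng.random()
        pop[m] = repair(pop[m])
    fitness = np.array([wknn_accuracy(w) for w in pop])
    return pop[fitness.argmax()]
```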

IV. PSO FOR FINDING FEATURE WEIGHTS FOR CRISP AND FUZZY KNN CLASSIFIER

PSO is a population-based search method inspired by the social behavior of a flock of migrating birds. A particle is analogous to a chromosome (population member) in GA. In GA the next-generation chromosomes are generated from parent chromosomes using the crossover, mutation and reproduction processes. As opposed to GA, the evolutionary process in PSO does not create new birds from parent ones; instead, each particle flies through the search space with a velocity adjusted by its own knowledge, pbest (local search), and by the best among its companions' knowledge, gbest (global search) [6]. Authors have applied PSO to optimize the connection weights of a feed forward neural network [16] and to find the k-means centroids [17].
This paper proposes a real encoded PSO algorithm to find the weights of features for the KNN classifier. The dimension of each particle is equal to the total number of features of the given dataset. The working of real encoded PSO for finding the feature weights for k-nearest neighbor is as follows:

i. Initialize the particle population randomly using real encoding (each particle's length is equal to the total number of features F).
ii. Repeat steps a-e till the terminating condition (maximum number of iterations) is reached.
a) Apply weighted k-nearest neighbor with each individual particle representing the feature weights and find the classification accuracy.
b) Find the local best for each particle (best accuracy of the individual particle).
c) Find the global best in the population of particles (best accuracy among all particles).
d) Compute the new velocity of each particle using its local best and the global best, using equation (6):

$v_{ij}(t+1) = w\,v_{ij}(t) + c_1 R_1 (pbest_{ij} - x_{ij}(t)) + c_2 R_2 (gbest_{ij} - x_{ij}(t))$   eq. (6)

e) Update each particle's position using its old position and new velocity, using equation (7):

$x_{ij}(t+1) = x_{ij}(t) + v_{ij}(t+1)$   eq. (7)

(Call the repair algorithm (as mentioned in Section III) if the sum of the weights of all features of a particle exceeds 1.)
iii. The weights represented by the global best particle are considered as the final feature weights for both the crisp and fuzzy k-nearest neighbor classifiers.

In equation 6, v_ij(t + 1) is the new velocity, w is the inertia weight, v_ij(t) is the old velocity, c1 and c2 are constants usually set to 2, R1 and R2 are randomly generated numbers, pbest_ij is the particle's local best, gbest_ij is the population's global best, and x_ij is the old position of the particle.
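A sketch of equations 6-7 driving the weight search (wknn_accuracy is again an assumed helper; the inertia weight of 0.7 and the clipping of weights to stay positive before repair are illustrative choices not fixed by the paper):

```python
import numpy as np

rng = np.random.default_rng(2)

def repair(w):
    return w / w.sum()  # repair algorithm from Section III

def pso_weights(wknn_accuracy, F, n_particles=20, iterations=50,
                w=0.7, c1=2.0, c2=2.0):
    """Real encoded PSO for KNN feature weights (eqs. 6-7)."""
    x = np.array([repair(rng.random(F)) for _ in range(n_particles)])
    v = np.zeros_like(x)
    pbest = x.copy()
    pbest_fit = np.array([wknn_accuracy(p) for p in x])
    gbest = pbest[pbest_fit.argmax()].copy()
    for _ in range(iterations):
        r1, r2 = rng.random((2, n_particles, 1))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)  # eq. 6
        x = x + v                                                  # eq. 7
        x = np.array([repair(np.clip(p, 1e-6, None)) for p in x])
        fit = np.array([wknn_accuracy(p) for p in x])
        better = fit > pbest_fit
        pbest[better], pbest_fit[better] = x[better], fit[better]  # update pbest
        gbest = pbest[pbest_fit.argmax()].copy()                   # update gbest
    return gbest
```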

V. EXPERIMENTAL RESULTS

Experiments have been carried out using seven different datasets, namely heart statlog, diabetes, wine, Indian liver, vehicle, iris and ionosphere, availed from the UCI machine learning repository. The data has been partitioned by means of the holdout method in a 60-40 ratio into training set and test set. The data is normalized using the min-max normalization method. To enhance the performance of the KNN classifier, binary encoded GA has been experimented with to discover the significant features, as explained in Section III. In addition, real encoded GA and real encoded PSO have been experimented with for identifying the weights of features for the KNN classifier. The results of crisp KNN are compared with the proposed binary encoded GA identified significant features, and with weights identified by GA and PSO versus weights identified by the information gain, gain ratio and Relief methods, as shown in Figure 1. Experiments with GA and PSO have been carried out by varying the population size and the number of generations, in addition to changing the value of k (1-10) for the crisp KNN and fuzzy KNN classifiers.
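The preprocessing steps are straightforward; a sketch of min-max normalization and the 60-40 holdout partition (function names are illustrative):

```python
import numpy as np

def min_max_normalize(X):
    """Min-max normalization: rescale each attribute to [0, 1]."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / (hi - lo + 1e-12)

def holdout_split(X, y, train_frac=0.6, seed=0):
    """Holdout partition in a 60-40 ratio into training and test sets."""
    idx = np.random.default_rng(seed).permutation(len(X))
    cut = int(train_frac * len(X))
    return X[idx[:cut]], y[idx[:cut]], X[idx[cut:]], y[idx[cut:]]
```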

Figure 1 illustrates that the classification accuracy of crisp KNN is improved by GA identified features and by GA and PSO identified weights for all the seven datasets. For the heart statlog and vehicle datasets, crisp KNN performance was best with weights identified by real encoded GA. PSO identified weights proved to be best for the crisp KNN classifier for the diabetes and Indian liver datasets. For the iris dataset, the binary encoded GA features proved top for the crisp KNN classifier. For the wine and ionosphere datasets, binary encoded GA features and real encoded GA and PSO weights resulted in the same top accuracy of the crisp KNN classifier.

Further, GA identified features and GA and PSO identified weights have been used to enhance the performance of the fuzzy KNN classifier. For fuzzy KNN, the membership of training samples is computed using two methods: (a) the crisp assignment method (assign each training sample complete membership in its known class and zero membership in all other classes), named F1WKNN, and (b) using equation 5, named F2WKNN. The relative results of the F1WKNN classifier (with crisp assignment) and the F2WKNN classifier (with assignment using equation 5) using significant features identified by GA, and using weights identified by GA and PSO versus weights identified by the information gain, gain ratio and Relief methods, are shown in Figure 2 and Figure 3 respectively, with fuzzifier value m equal to 2. With the F1WKNN classifier, weights identified by real encoded GA proved to be best for the heart statlog, wine, Indian liver and iris datasets. For the diabetes dataset, F1WKNN showed an accuracy higher by on the order of 2-3% with weights identified by the gain ratio method when compared to the proposed method. For the vehicle and ionosphere datasets, weights given by information gain proved to be somewhat better for F1WKNN when compared to the proposed method.

The classification accuracy of F2WKNN was highest with real encoded GA identified weights for the heart statlog, Indian liver, vehicle and iris datasets when compared to the other methods. Both binary encoded GA features and real encoded GA weights resulted in the same classification accuracy of F2WKNN for the wine and ionosphere datasets. For the diabetes dataset, the binary encoded GA provided features resulted in better accuracy of F2WKNN when compared to the weights proposed by real encoded GA and PSO. However, the weights identified by information gain and gain ratio resulted in slightly improved performance for the diabetes dataset when compared to GA identified features. The results of F2WKNN are found to be enhanced when compared to F1WKNN.
In addition, the results of the proposed methods for improving the crisp and fuzzy KNN classifier accuracy by using GA identified features and PSO and GA identified weights are compared with five well known classifiers (available in the WEKA tool), namely radial basis function network (RBF), support vector machine (SVM), decision tree (C4.5), Bayesian and Naïve Bayes methods, as shown in Figure 4 - Figure 6.

Figure 4 illustrates that both binary encoded GA features and real encoded GA and PSO weights resulted in the same classification accuracy of the crisp KNN classifier for the wine and ionosphere datasets. For the iris dataset, GA identified features gave the crisp KNN classifier the top accuracy when compared to the other classifiers. PSO identified weights proved finest for both the diabetes and Indian liver datasets, and GA identified weights proved best for the heart statlog and vehicle datasets, for crisp KNN when compared to the other classifiers.

Figure 5 depicts that PSO identified weights proved to be the most excellent for F1WKNN for diabetes when compared to the other classifiers. SVM proved preeminent for the heart statlog dataset when compared to F1WKNN. Further, GA identified weights with F1WKNN showed better performance when compared to the other classifiers for the wine, Indian liver, vehicle and iris datasets. Figure 5 also illustrates that both binary encoded GA features and real encoded GA and PSO weights resulted in the same classification accuracy of F1WKNN for the ionosphere dataset. However, GA identified features showed a decline in performance for both the vehicle and iris datasets with the F1WKNN classifier.

Figure 6 depicts that GA identified weights for F2WKNN proved to be best for all the datasets excluding diabetes when compared with the other classifiers. For the diabetes dataset, the Bayesian classifier showed a negligible enhancement when compared to F2WKNN with GA identified features. On the other hand, GA identified features showed a decline in performance for the iris dataset with the F2WKNN classifier. GA and PSO identified weights proved vital with F2WKNN and resulted in the same accuracy for the ionosphere dataset, whereas GA identified features and weights proved best with F2WKNN for the wine dataset when compared to the other classifiers.

VI. CONCLUSIONS

This paper proposed binary encoded GA for identifying significant features, and real encoded GA and PSO for finding feature weights, for enhancing the performance of both the crisp and fuzzy KNN classifiers. Among the three proposed methods for improving the performance of the KNN classifier, there is no single method which is best for all the seven datasets. Overall, the two evolutionary methods, GA and PSO, have enhanced the performance of the KNN classifier when compared to the outcomes of well known classifiers like radial basis function, support vector machine, decision tree, Bayesian and Naïve Bayes, as well as to the feature weights identified by the information gain, gain ratio and Relief methods, for all seven experimented datasets. As part of future enhancement, the authors would like to extend the work on significant feature selection using binary PSO and binary cuckoo search algorithms. In addition to PSO and GA, feature weights identified by the basic and modified versions of the cuckoo search algorithm can be applied for further improving the performance of the KNN classifier.

VII. REFERENCES
[1] J. Han and M. Kamber, Data Mining: Concepts and Techniques, San Francisco, Morgan Kaufmann Publishers, 2001.
[2] Asha Gowda Karegowda, M.A. Jayaram, A.S. Manjunath, "Feature Subset Selection Problem using Wrapper Approach in Supervised Learning", International Journal on Computer Applications (IJCA), Vol. 1, pp. 13-17, 2010.
[3] S. Cost, S. Salzberg, "A weighted nearest neighbor algorithm for learning with symbolic features", Machine Learning, Vol. 10, No. 1, pp. 57-78, Jan. 1993.
[4] Asha Gowda Karegowda, M.A. Jayaram, A.S. Manjunath, Vidya T., Shyama, "Genetic Algorithm based Dimensionality Reduction for Improving Performance of k-means and fuzzy k-means clustering: A Case study for Categorization of Medical dataset", Proceedings of the Seventh International Conference on Bio-Inspired Computing: Theories and Applications (BIC-TA 2012), Gwalior, India, Advances in Intelligent Systems and Computing, Vol. 201, pp. 169-180, 2013.
[5] D. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison Wesley, 1989.
[6] R. Eberhart and J. Kennedy, "A new optimizer using particle swarm theory", Proceedings of the Sixth International Symposium on Micro Machine and Human Science, Nagoya, Japan, pp. 39-43, 1995.
[7] T.M. Cover, P.E. Hart, "Nearest Neighbor pattern classification", IEEE Transactions on Information Theory, Vol. IT-13, pp. 21-27, 1967.
[8] Asha Gowda Karegowda, M.A. Jayaram, A.S. Manjunath, "Combining Akaike's Information Criterion (AIC) and the Golden-Section Search Technique to find Optimal Numbers of K-Nearest Neighbors", International Journal of Computer Applications, Vol. 2, pp. 80-87, May 2010.
[9] Isaac Triguero, Joaquín Derrac, Salvador García, Francisco Herrera, "A Taxonomy and Experimental Study on Prototype Generation for Nearest Neighbor Classification", IEEE Transactions on Systems, Man, and Cybernetics, Part C, Vol. 42(1), pp. 86-100, 2012.
[10] Salvador García, Joaquín Derrac, Jose Ramón Cano, Francisco Herrera, "Prototype Selection for Nearest Neighbor Classification: Taxonomy and Empirical Study", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 34(3), pp. 417-435, 2012.
[11] Asha Gowda Karegowda, Rakesh Kumar Singh, M.A. Jayaram, A.S. Manjunath, "Improving Weighted K-Nearest Neighbor Feature Projections Performance with Different Weights Assigning Methods", International Conference on Computational Intelligence (ICCI 2010), December 9-11, 2010, Coimbatore, India.
[12] James M. Keller, Michael R. Gray, James A. Givens Jr., "A fuzzy K-nearest neighbor algorithm", IEEE Transactions on Systems, Man, and Cybernetics, Vol. SMC-15, No. 4, pp. 580-585, 1985.
[13] Asha Gowda Karegowda, M.A. Jayaram, A.S. Manjunath, "Application of Genetic Algorithm Optimized Neural Network connection weights for medical diagnosis of PIMA Indian diabetes", International Journal of Soft Computing, Vol. 2, No. 2, pp. 15-22, May 2011.
[14] Asha Gowda Karegowda, M.A. Jayaram, A.S. Manjunath, "Feature subset selection using cascaded GA and CFS: a filter approach in supervised learning", International Journal on Computer Applications, Vol. 23(2), pp. 1-10, 2011.
[15] Asha Gowda Karegowda, Shyama, Vidya T., M.A. Jayaram, A.S. Manjunath, "Improving Performance of K-Means Clustering by Initializing Cluster Centers Using Genetic Algorithm and Entropy Based Fuzzy Clustering for Categorization of Diabetic Patients", Proceedings of the International Conference on Advances in Computing, MSRIT, Bangalore, Advances in Intelligent Systems and Computing, Vol. 174, July 4-6, pp. 899-904, 2012.
[16] Asha Gowda Karegowda, M.A. Jayaram, "Significant Feature Set Driven, Optimized FFN for Enhanced Classification", International Journal of Computational Intelligence and Informatics, ISSN: 2231-0258, Vol. 2, No. 4, March 2013.
[17] Asha Gowda Karegowda, Seema Kumari, "Particle Swarm Optimization Algorithm Based k-means and Fuzzy c-means clustering", International Journal of Advanced Research in Computer Science and Software Engineering, Vol. 3, Issue 7, pp. 448-451, July 2013.

[Figure 1 chart: classification accuracy (y-axis) per dataset (x-axis) for KNN, Binary GA-WKNN, Real GA-WKNN, Real PSO-WKNN, Information Gain-WKNN, Gain Ratio-WKNN and Relief-WKNN]

Figure 1. Comparative performance of the crisp KNN classifier using binary encoded GA identified features and real encoded GA and PSO identified feature weights vs. information gain, gain ratio and Relief method identified weights for different datasets.

[Figure 2 chart: classification accuracy per dataset for F1KNN, Binary GA-F1WKNN, Real GA-F1WKNN, Real PSO-F1WKNN, Information Gain-F1WKNN, Gain Ratio-F1WKNN and Relief-F1WKNN]

Figure 2. Comparative performance of the F1WKNN classifier (with crisp method for membership assignment) using binary encoded GA identified features and real encoded GA and PSO identified feature weights vs. information gain, gain ratio and Relief methods for different datasets.

[Figure 3 chart: classification accuracy per dataset for F2KNN, Binary GA-F2WKNN, Real GA-F2WKNN, Real PSO-F2WKNN, Information Gain-F2WKNN, Gain Ratio-F2WKNN and Relief-F2WKNN]

Figure 3. Comparative performance of the F2WKNN classifier (with equation 5 for membership assignment) using binary encoded GA identified features and real encoded GA and PSO identified feature weights vs. information gain, gain ratio and Relief methods for different datasets.

[Figure 4 chart: classification accuracy per dataset for KNN, Binary GA-WKNN, Real GA-WKNN, Real PSO-WKNN, RBF, SVM, DT, Bayes and Naïve Bayes]

Figure 4. Comparative performance of the crisp KNN classifier using binary encoded GA identified features and real encoded GA and PSO identified feature weights vs. SVM, RBF, Decision Tree, Bayesian and Naïve Bayes classifiers for different datasets.

[Figure 5 chart: classification accuracy per dataset for F1KNN, Binary GA-F1WKNN, Real GA-F1WKNN, Real PSO-F1WKNN, RBF, SVM, Decision Tree, Bayesian and Naïve Bayes]

Figure 5. Comparative performance of the F1WKNN classifier (with crisp method for membership assignment) using binary encoded GA identified features and real encoded GA and PSO identified weights vs. SVM, RBF, Decision Tree, Bayesian and Naïve Bayes classifiers for different datasets.

[Figure 6 chart: classification accuracy per dataset for F2KNN, Binary GA-F2WKNN, Real GA-F2WKNN, Real PSO-F2WKNN, RBF, SVM, Decision Tree, Bayesian and Naïve Bayes]

Figure 6. Comparative performance of the F2WKNN classifier (with equation 5 for membership assignment) using binary encoded GA identified features and real encoded GA and PSO identified weights vs. SVM, RBF, Decision Tree, Bayesian and Naïve Bayes classifiers for different datasets.
