128 (IJCNS) International Journal of Computer and Network Security,

Vol. 1, No. 2, November 2009

Integrated Intrusion Detection using SCT


Selvakani Kandeeban¹ and R. S. Rajesh²

¹ Professor and Head, Department of Computer Applications, Francis Xavier Engineering College, Tirunelveli, Tamilnadu, India.
sselvakani@hotmail.com

² Reader, Department of Computer Science and Engineering, Manonmaniam Sundaranar University, Tirunelveli, Tamilnadu, India.
rs_rajesh@yahoo.co.in

Abstract: Attacks that target new vulnerabilities are being created much faster than in the past. For many years companies have relied on stateful firewalls, host-based antivirus and anti-spam solutions to keep their users and resources safe. But the landscape is changing quickly, and these traditional single-purpose point security devices are no longer proving adequate. The third International Knowledge Discovery and Data Mining Tools Competition data set (KDD Cup ’99) is used to train and test the feasibility of our proposed model. This paper mainly addresses the issue of identifying important input features for intrusion detection. It also addresses the related issue of ranking the importance of input features, which is itself a problem of great interest, since eliminating insignificant or useless inputs leads to a simplified problem and possibly faster and more accurate detection. The genetic algorithm employs only the eight most relevant features for each attack category for rule generation. This paper presents an intrusion detection model based on Soft Computing Techniques (SCT), namely mutual information, a genetic algorithm and a Radial Basis Function (RBF) neural network. The key idea is to take advantage of the classification abilities of the neural network for unknown attacks and of the genetic algorithm for known attacks.

Keywords: Genetic Algorithm, Information Gain, Knowledge synthesis, Radial Basis Function.

1. Introduction

The complexity, as well as the importance, of distributed computer systems and information resources is growing rapidly. Because of this, computers and computer networks are often exposed to computer crime. Many modern systems lack properly implemented security services; they contain a variety of vulnerabilities and can therefore be compromised easily. As network attacks have increased in number over the past few years, the efficiency of security systems such as firewalls has declined.

It is very important that the security mechanisms of a system are designed to prevent unauthorized access to system resources and data. However, building a completely secure system is impossible, and the least that can be done is to detect intrusion attempts so that action can be taken to repair the damage later. Organizations are increasingly implementing various systems that monitor IT security breaches. Intrusion detection systems (IDSs) have gained a considerable amount of interest within this area. The main task of an IDS is to detect an intrusion and, if necessary or possible, to undertake some measures eliminating the intrusions. Because most computer systems are vulnerable to attack, intrusion detection (ID) is a rapidly developing field. When unknown types of attacks need to be detected, these techniques have the advantage of automatically retraining detection models on input data. Existing IDSs [1, 3, 4] also rely heavily on human analysts to differentiate normal and abnormal network connections. This paper describes an integrated genetic algorithm and neural network for improving intrusion detection performance. First, the 41 features in the KDD Cup set are reduced to 8 features for each type of attack using mutual information, and a rule set is created with the genetic algorithm to improve the detection rate for known attacks. Our idea is to achieve a high detection rate by introducing a high level of generality when deploying the subset of the most important features of the dataset. As this also results in a high false positive rate, we deploy an additional set of rules to recheck the decision of the rule set for detecting attacks. Then the Radial Basis Function network is used to detect unknown intrusions [10]. The integrated model therefore improves the performance of detecting all intrusions. Our experimental results demonstrate efficiency and accuracy.

2. Related work

Mukkamala, G. I. Janoski and A. H. Sung [8] use Support Vector Machines (SVMs) and Neural Networks to identify important features for the 1998 DARPA intrusion detection data. They delete one feature at a time and build SVMs and Neural Networks using the remaining 40 features. The importance of the deleted feature depends on training time, testing time and accuracy for SVMs, or on overall accuracy, false positive rate and false negative rate for Neural Networks. The same evaluation process is repeated for each feature, and features are ranked according to their importance. They conclude that SVM and neural network classifiers using only the important features can achieve performance better than or comparable to classifiers that use all features.

The most closely related work is by Li and Zhang [12], where the general problem of GA-optimized feature selection and extraction is addressed. They apply a GA to optimize the feature weights of a k-nearest neighbour (kNN) classifier and to choose an optimal subset of features for a Bayesian classifier and a linear regression classifier. The
optimization framework of their work is based on the wrapper model [12].

One IDS tool that uses GAs to detect intrusions and is available to the public is the Genetic Algorithm as an Alternative Tool for Security Audit Trails Analysis (GASSATA). GASSATA finds, among all possible sets of known attacks, the subset of attacks that are the most likely to have occurred in a set of audit data [11]. Since there can be many possible attack types, finding the optimal subset is very expensive to compute, so GAs are used to search efficiently. The population to be evolved consists of vectors with a bit set for each attack that is comprised in the data set. Crossover and mutation converge the population to the most probable attacks.

A second tool that is implemented and undergoing more advanced enhancements is the Network Exploitation Detection Analyst Assistant (NEDAA). The Applied Research Laboratories of the University of Texas at Austin has developed the NEDAA [9], which uses different machine learning techniques, such as a finite state machine, a decision tree, and a GA, to generate artificial intelligence (AI) rules for IDS. One network connection and its related behavior can be translated to represent a rule to judge whether or not a real-time connection is considered an intrusion. These rules can be modeled as chromosomes inside the population. The population evolves until the evaluation criteria are met. The generated rule set can be used as knowledge inside the IDS for judging whether the network connection and behaviors are potential intrusions.

The extraction of knowledge in the form of rules has been successfully explored before on RBF networks using the hidden unit Rule Extraction (hREx) algorithm [7]. This work inspired the authors to develop knowledge synthesis, or knowledge insertion, by manipulating the RBF network parameters, although the information flow is in the opposite direction to extraction.

3. Methodology

As indicated in the introduction, the basic objective of this work is to determine the contribution of the 41 features in the KDD 99 intrusion detection datasets to attack detection.

3.1 Feature Selection

Formally, the mutual information of two discrete random variables X and Y can be defined as:

$$I(X;Y) = \sum_{y \in Y} \sum_{x \in X} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)} \tag{1}$$

where p(x,y) is the joint probability distribution function of X and Y, and p(x) and p(y) are the marginal probability distribution functions of X and Y respectively.

Intuitively, mutual information measures the information about X that is shared by Y: it measures how much knowing one of these variables reduces our uncertainty about the other. If X and Y are independent, then X contains no information about Y and vice versa, so their mutual information is zero: knowing X does not give any information about Y (and vice versa). If X and Y are identical then all information conveyed by X is shared with Y: knowing X provides all necessary information about Y and vice versa, therefore the mutual information is the same as the uncertainty contained in Y (or X) alone, namely the entropy of Y (or X: clearly if X and Y are identical they have equal entropy). In a specific sense [2], mutual information quantifies the distance between the joint distribution of X and Y and the product of their marginal distributions.

Decision Independent Correlation (DIC) is defined as the ratio between the mutual information and the uncertainty of the feature. DIC is expressed as

$$DIC_{X_j}(X_i, X_j) = \frac{I(X_i; X_j)}{H(X_j)} \tag{2}$$

$$DIC_{X_i}(X_i, X_j) = \frac{I(X_i; X_j)}{H(X_i)} \tag{3}$$

The Correlation Measure is defined as a measure to quantify the information redundancy between Xi and Xj with respect to Y as follows:

$$Q_Y(X_i, X_j) = \frac{I(Y; X_i) + I(Y; X_j) - I(Y; X_i X_j)}{H(Y)} \tag{4}$$

The ranked list of features is obtained by using a simple forward selection hill climbing search, starting with an empty set, evaluating each feature individually and forcing it to continue to the far side of the search space.

It has been shown that dependency or correlation measures qualify the accuracy of a decision to predict the value of one variable. However, the symmetrical uncertainty measure is not accurate enough to quantify the dependency among features with respect to a given decision. A critical point was neglected: the correlation or redundancy between features is strongly related to the decision variable. The feature subset is obtained as:

1. Generate feature set R from the ranked list of features.
2. For each feature, for each type of attack, calculate the mutual information between the feature Xi and the decision Y, I(Y;Xi).
3. Update the relevant feature set R by comparing the mutual information I(Y;Xi):
   if I(Y;Xi) ≥ δx then R ← R + {Xi}
   where δx is a user-defined threshold.
4. Create working set W by copying R.
5. Set goal set G = null.
6. While e(G) < δ2 do
      If W = null then break
      Choose Xk ∈ W subject to
         (i) Mutual information: I(Y;Xk) ≥ I(Y;Xl) for all l ≠ k, Xl ∈ W
         (ii) Correlation Measure: Qy(Xk,Xn) ≤ Qy(Xm,Xn) for all m ≠ k, Xn ∈ G
      W ← W - {Xk}
      G ← G + {Xk}
   End Loop
7. Obtain a feature subset from the intersection of all the attack subsets.

3.2 GA Rules Formation

By analyzing the dataset, rules will be generated in the rule set. These rules take an ‘if then’ form as follows:

if {condition} then {act}
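As a minimal sketch of how a rule of this form could be represented and evaluated, the Python fragment below stores the condition as a conjunction of (feature, operator, threshold) tests and returns the act when all of them hold. The class name `Rule`, the helper `classify`, the default label and all thresholds are assumptions made for illustration; they are not taken from the paper. The feature names `hot`, `count` and `rerror_rate` follow the KDD Cup '99 attribute names.

```python
import operator
from dataclasses import dataclass

# Comparison operators a condition may use (an illustrative choice).
OPS = {"<=": operator.le, ">": operator.gt, "==": operator.eq}


@dataclass(frozen=True)
class Rule:
    """One 'if {condition} then {act}' rule."""
    conditions: tuple  # (feature, op, threshold) triples, joined by "and"
    act: str           # label reported when the whole condition is true

    def matches(self, record: dict) -> bool:
        # The condition is true only if every individual test holds.
        return all(OPS[op](record[feature], threshold)
                   for feature, op, threshold in self.conditions)


def classify(record: dict, rules: list, default: str = "normal") -> str:
    """Return the act of the first matching rule, or the default label."""
    for rule in rules:
        if rule.matches(record):
            return rule.act
    return default


# Hypothetical thresholds, chosen only to make the example concrete.
smurf_rule = Rule(
    conditions=(("hot", "<=", 0.0),
                ("count", ">", 100),
                ("rerror_rate", "<=", 0.01)),
    act="smurf",
)

connection = {"hot": 0.0, "count": 511, "rerror_rate": 0.0}
print(classify(connection, [smurf_rule]))
```

With these assumed thresholds the sample connection satisfies all three tests, so the call prints `smurf`; a record that fails any test falls through to the default label.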
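The conditions in these rules are built from the features kept by the selection phase of Section 3.1. As a rough sketch of the quantities defined there, the fragment below estimates entropy, the mutual information of equation (1) and the correlation measure of equation (4) from discrete samples by simple counting. The function names, the use of base-2 logarithms and the toy data are assumptions; the paper does not state its logarithm base or its estimation procedure.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy H(X) in bits, estimated from observed frequencies."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """I(X;Y) as in equation (1), with probabilities estimated by counting."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # p(x,y) / (p(x) p(y)) simplifies to c * n / (count(x) * count(y)).
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


def correlation_measure(y, xi, xj):
    """Q_Y(Xi, Xj) as in equation (4); (Xi, Xj) is treated as one joint variable.

    Assumes H(Y) > 0, i.e. the decision variable is not constant.
    """
    joint_feature = list(zip(xi, xj))
    redundancy = (mutual_information(y, xi) + mutual_information(y, xj)
                  - mutual_information(y, joint_feature))
    return redundancy / entropy(y)


# Hypothetical toy data: one feature copies the label, the other is unrelated.
label = ["attack", "attack", "normal", "normal"]
copy_of_label = [1, 1, 0, 0]
unrelated = [0, 1, 0, 1]

print(mutual_information(label, copy_of_label))
print(mutual_information(label, unrelated))
print(correlation_measure(label, copy_of_label, copy_of_label))
```

On this toy data the feature that copies the label shares one full bit with it, the unrelated feature shares none, and two identical copies of the same feature are fully redundant (Q = 1), which matches the intuition given for equation (1).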

The condition in this format refers to the attributes in the rule set that form a network connection in the dataset; these attributes are the ones selected in the feature selection phase. The condition evaluates to ‘true’ or ‘false’, and the attack name is specified only if the condition is true. The act field in the ‘if-then’ format refers to an action taken once the condition is true, such as reporting an alert to the system administrator. For example, a rule in the rule set can be defined as follows:

if Number of “hot” indicators <= 0.0 and
   Number of connections to the same host as the connection in the past two seconds <= 500.82 and
   % of connections that have “REJ” errors > 0.21 and <= 0.01 and
   Number of connections to host <= 41.2 and > 112.3
then SMURF

In this genetic algorithm, each chromosome represents one learning rule to be evaluated. To evaluate a chromosome, an appropriately sized network was configured for each of the 20 tasks. An individual of each population consisted of genes, where each gene represented a certain feature and its value represented the value of the feature. Each GA is trained for 300 generations, where in each generation the 100 worst-performing individuals are replaced with newly generated ones. The same process was repeated for ten epochs and the results were analyzed.

The second part of the GA is the fitness function. The fitness function ‘F’ determines whether a rule is ‘good’, i.e. it detects intrusions, or ‘bad’, i.e. it does not detect intrusions. ‘F’ is calculated for each rule from the following equations:

Support = |A and B| / N
Confidence = |A and B| / |A|
Fitness = t1 * Support + t2 * Confidence

where N is the total number of records, |A| is the number of network connections matching the condition A, |A and B| is the number of records that match the rule, and t1 and t2 are the thresholds that balance the two terms.

When an intrusion occurs, it is notified. When an intrusion does not occur but the response confirms it as an intrusion, it is considered a false alarm. Once in a while the data set will have to be updated for new connections, and hence the rule set will also be updated.

3.3 Knowledge Synthesis by RBF

Knowledge synthesis is a technique intended for those situations in which no actual training data is available but some form of domain knowledge is at hand. The expert's knowledge is encoded as fuzzy sets and rules, which are used to synthesize new hidden unit parameters for incorporation into a new or existing network. The fuzzy rules describe a set of output classes and the possible input values denoting their characteristics. The objective of converting from fuzzy rules to RBF networks is to have the knowledge in a consistent format. It would be possible to have the domain knowledge in the form of a stand-alone fuzzy module, interacting with the RBF based system in some loosely or tightly coupled protocol. However, converting the fuzzy rules into the RBF architecture allows them to be subjected to further analysis by rule extraction and also avoids hybrid system integration issues.

4. Experiments and Results

We have used the open source machine learning framework WEKA (Waikato Environment for Knowledge Analysis), written at the University of Waikato, New Zealand. The algorithms can either be applied directly to a data set or called from our own Java code. The input data for WEKA classifiers is represented in ARFF (Attribute-Relation File Format), consisting of the list of all instances with the values for each instance separated by commas. As a result of data set training and testing, a confusion matrix is generated showing the number of instances assigned to each class.

Experiments were conducted to verify the performance of the intrusion detection approach based on the above discussion. All the experimental data is available from the corrected data set of KDD Cup 1999. Important features were identified based on the correlation measure and information gain. There were 21 types of intrusions in the test set but only 7 of them were chosen in the training set. Therefore the selected data also challenged the ability to detect unknown intrusions.

The main concern of feature reduction is one of false alarms and missed intrusion detection. In this work, we attempted to reduce the features that may be effectively used for intrusion detection without compromising security. We have specially focused on statistical techniques to test individual significance and mutual significance.

In this KDD data set each sample is unique, with 34 numerical features and 7 symbolic features. In the preprocessing task, we map symbolic attributes to numeric-valued attributes. Symbolic features like protocol_type (3 different symbols: tcp, udp, icmp), service (6 different symbols) and flag (11 different symbols) were mapped to integer values ranging from 1 to N, where N is the number of symbols.

Table 1: Information Gain Measures

| Attack type | Information Gain | Feature Type |
|-------------|------------------|--------------|
| DOS         | 1.351            | src_bytes    |
| Probe       | 1.3596           | count        |
| U2R         | 0.652            | dst_bytes    |
| R2L         | 1.1599           | service      |

TABLE I shows the highest IG value for each attack type. There are nine features with very small information gain which contribute very little to intrusion detection. Two features do not show any variation in the training set. Finally, for each type of attack an appropriate reduced feature subset was selected. The ranked feature list is shown in TABLE II.

In the normalization step, we linearly scale each of these features to the range [0.0, 1.0]. Features having smaller integer value ranges like duration [0, 58329],
num_compromised [0, 255] were scaled linearly to the range [0.0, 1.0]. Two features spanning a very large integer range, namely src_bytes [0, 693375640] and dst_bytes [0, 5203179], were scaled logarithmically to the ranges [0.0, 20.4] and [0, 15.5]. Boolean features, having values 0 or 1, were left unchanged.

It should be noted that the test data is not from the same probability distribution as the training data. Moreover, the test data includes novel attack types that have not appeared in the training data.

In the second stage, rules are formed from the reduced feature subset using the genetic algorithm on the KDD data set and tested on the KDD training set to observe their performance with respect to detection, false alarm rate and missed alarm rate. The only drawback is that the rules are biased toward the training data set. The genetic algorithm in the proposed design evaluates the rules and discards the bad rules while generating more rules to reduce the false alarm rate and to increase intrusion detection. The GA thus continues to detect intrusions and produce new rules, storing the good rules and discarding the bad ones.

dst_host_srv_count <= 227
| num_failed_logins <= 0
| | rerror_rate <= 0
| | | num_access_files <= 0
| | | | protocol_type = tcp
| | | | | dst_host_same_srv_rate <= 0.11
| | | | | | dst_host_serror_rate <= 0.01
| | | | | | dst_host_serror_rate > 0.01: warezmaster
| | | | | dst_host_same_srv_rate > 0.11
| | | | | | is_host_login <= 0: warezmaster
| | | | | | is_host_login > 0: multihop
| | | | protocol_type = udp: multihop
| | | | protocol_type = icmp: multihop
| | | num_access_files > 0: ftp_write
| | rerror_rate > 0: imap
| num_failed_logins > 0: guess_passwd
dst_host_srv_count > 227: guess_passwd

The summary of the results after RBF training is given as follows:

Correctly Classified Instances      916    99.7821 %
Incorrectly Classified Instances      2     0.2179 %
Kappa statistic                       0.9959
Mean absolute error                   0.0008
Root mean squared error               0.0269
Relative absolute error               0.4262 %
Root relative squared error           9.0025 %
Total Number of Instances           918

Confusion matrix showing accuracy of the original RBF network compared with synthesized RBF. The numbers represent test cases; those lying on the diagonal have been classified accurately, while those off the diagonal have been misclassified. The original network has an accuracy of 51.3% on the high speed data, but when it is modified by inserting domain rules that are characteristic of the nature of high speed data, the accuracy goes up to 95.0%.

We have performed experiments using the 50, 75, and 100 best-performing rules for detecting attacks. From TABLE III, we believe that our system outperforms the best-performing model reported in the literature. Moreover, our previous statement of reducing the false positive rate when deploying additional rule systems for detecting DoS attacks and normal connections is confirmed, as the false positive rate has decreased in each of the cases.

Table 2: Ranked List of Features

| Attack type | Ranked List |
|-------------|-------------|
| DOS   | 5,23,3,33,35,34,24,36,2,39,4,38,26,25,29,30,6,12,10,13,40,41,31,37,32,8,7,28,27,9,1,19,18,22,20,21,14,11,17,15,16 |
| Probe | 23,29,27,36,4,32,34,40,35,3,30,2,5,41,28,37,33,25,38,26,39,10,9,12,11,6,1,8,7,21,19,20,31,22,24,15,13,14,18,16,17 |
| U2R   | 6,3,13,15,12,14,18,19,16,17,20,4,5,1,2,10,11,7,9,8,35,36,32,34,33,40,41,37,39,38,24,25,21,23,22,29,31,30,26,28,27 |
| R2L   | 3,34,1,6,5,33,35,36,32,12,23,24,10,2,37,4,38,13,16,15,14,8,7,11,9,29,30,27,28,40,41,31,39,19,20,17,18,25,26,21,22 |

Ten kinds of network attacks are included in the training set, namely back, land, neptune, pod, smurf, teardrop, ipsweep, portsweep, buffer_overflow and guess_passwd. Fifteen kinds of network attacks are included in the testing set, namely perl, xlock, mailbomb, udpstorm, saint, back, land, neptune, pod, smurf, teardrop, ipsweep, portsweep, buffer_overflow and guess_passwd. The test dataset is similar to the training data set. The only difference is that the test data set includes some unknown attacks not occurring in the training data set.

   a    b    c    d    e    f   <-- classified as
  76    0    1    0    0    0 | a = back
   0    7    0    0    0    0 | b = land
   1    0  250    0    0    0 | c = neptune
   0    0    0   17    0    0 | d = pod
   0    0    0    0  566    0 | e = smurf
   0    0    0    0    0    0 | f = teardrop

Table 3: Performance of the Implemented System (DR = Detection Rate in %, FPR = False Positive Rate in %)

| No of rules | DR R2L | DR Probe | DR DoS | DR U2R | FPR DoS | FPR U2R | FPR R2L | FPR Probe |
|-------------|--------|----------|--------|--------|---------|---------|---------|-----------|
| 50          | 86.7   | 79.2     | 81.2   | 86.1   | 0.9     | 1.1     | 1.5     | 1.2       |
| 75          | 81.4   | 71.3     | 75.4   | 83.4   | 1.9     | 2.7     | 2.9     | 2.3       |
| 100         | 78.3   | 67       | 71     | 82.5   | 2.3     | 3.1     | 3.6     | 2.7       |

The time complexity is quite low. It requires m(n² - n)/2 operations for computing the pairwise feature correlation matrix, where m is the number of instances and n is the initial number of features. The feature selection requires (n² - n)/2 operations for a forward selection and a backward elimination. The hill climbing search is purely exhaustive, but the use of the stopping criterion makes the probability of exploring the entire search space small. In particular we
have stressed the message that feature selection can help to focus a learning algorithm only on the most relevant features inside a given dataset, and thus on the main aspects of the considered problem.

[Figure 1: chart of Detection Rate and False Positive Rate for U2R, R2L, Probe and DoS with 50, 75 and 100 rules; vertical axis from 0% to 100%]
Figure 1. ROC Curve showing ID Performance

The linear structure of the rule makes the detection process efficient in real-time processing of the traffic data. The evaluation of our approach showed that the hybrid method of using discrete and continuous features can achieve a better detection rate of network attacks. In order to increase the detection rate, we select the features that are appropriate for each type of network attack. That is also an added advantage.

One attraction of RBF networks is that there is a two-stage training procedure that is considerably faster than the methods used to train any other NN. Accuracy can be effectively evaluated by using ROC (Receiver Operating Characteristic) curves, which identify how the detection rate varies with the false positive rate. The detection rate (DR) increases as the false alarm rate increases. The DR is close to 95% when the false alarm rate is 1.9% to 2%. However, when the false alarm rate is close to 1%, the DR is only about 70%. The reason is the number of rules in the rule base for each run.

Figure 2 shows the performance comparison of this research work with four other important IDSs.

[Figure 2: chart titled "ROC DETECTION RATE COMPARISON"; vertical axis Percentage (0 to 120); horizontal axis Attack Type (Normal, DoS, Probe, U2R, R2L); series: Proposed Method, Binary Tree, LAMSTAR, SOM, ART]
Figure 2. Comparison with other IDS

When compared to other IDS [5, 6], this approach proposes an efficient algorithm for feature extraction to remove irrelevance during the data preparation period. Experimental results showed that the new decision dependent correlation measure can be used to select the near-optimal feature subset. The smaller number of features will result in a much faster rule generation process, and it will reduce the overhead in collecting data when used in a real network environment. The generated rules from the genetic algorithm DNA encoding are capable of identifying and classifying attack categories correctly. Once rules are extracted using the genetic algorithm, the rule base is then inserted back into a new network that has a similar problem domain and the desired potential. This is similar to the heuristics given to expert systems. Also, like the heuristics, the extracted/inserted rules may be refined, as more. Since the number of features used is minimal, this method not only improves the detection performance but also trims the time required for training.

5. Conclusion

After extensive study, we decided to come up with a unique solution and approached the problem with a new data set formatting and optimization technique. A library of attacks was created. This library was based on the benchmark provided by the MIT Lincoln Lab that was optimized by the KDD Cup. After the data was carefully formatted and optimized, neural networks were chosen due to their abilities to learn and classify. However, the detection rate is not good for some runs because the selection of crossover and mutation points in the corresponding operations is random. Trained neural networks can make decisions quickly, making it possible to use them in real-time detection. The effects of the large number of shared hidden units within the network are still an open issue.

The modification of existing RBF networks using heuristic rules has obvious benefits when used in certain situations. The use of knowledge synthesis only makes sense when the available data is insufficient to build a reliable classifier. In such a situation it is advantageous to use heuristic rules to modify an existing RBF network to detect infrequently encountered input vectors that would otherwise be misclassified. However, care must be taken when applying the domain rules.

Not all issues are resolved. A system of this type requires an agent to be running on every host that is to be protected. If this is not done, then any host without an agent is still vulnerable to attack. Further enhancements should be made to the rule learning technique by RBF for detecting any unknown attacks. Future work includes development of a detection scheme that is more general and able to handle all types of data as well as numerical data.

References

[1] Andrews and Geva, “Intrusion detection Rules and Networks”, Proceedings of the Rule Extraction from Trained Artificial Neural Networks Workshop, Artificial Intelligence and Simulation of Behaviour, Brighton UK, 1996.
[2] E. Eskin, A. Arnold, M. Prerau, L. Portnoy, S. Stolfo, “A geometric framework for unsupervised anomaly detection: Detecting intrusions in unlabeled data,” in Applications of Data Mining in Computer Security, Chapter 4, D. Barbara and S. Jajodia (editors), Kluwer, ISBN 1-4020-7054-3, 2002.
[3] Kayacik G., Zincir-Heywood N., and Heywood M. On the Capability of an SOM based Intrusion Detection System. In Proceedings of International Joint Conference on Neural Networks, 2003.
[4] Kazienko, P., and Dorosz, P. Intrusion Detection Systems (IDS) Part I - (network intrusions; attack symptoms; IDS tasks; and IDS architecture), http://www.windowsecurity.com, Apr 07, 2003.
[5] Kishore and M. V. Rao. A novel based algorithm for radial basis function network. In Proceedings of the IEEE International Conference on Neural Networks, volume 489, pages 2007–2011, 1997.
[6] Kruegel, C., and Toth, T. Using decision trees to improve signature-based intrusion detection. In Proc. Int’l Symp. Recent Advances in Intrusion Detection, 2003.
[7] McGarry, S. Wermter, and J. Mac-Intyre. Knowledge extraction from local function networks. In Seventeenth International Joint Conference on Artificial Intelligence, volume 2, pages 765–770, Seattle, USA, August 4th-10th 2001.
[8] Mukkamala, G. I. Janoski, and A. H. Sung. Intrusion detection using support vector machines. In Proceedings of the High Performance Computing Symposium - HPC 2002, pages 178–183, San Diego, CA, USA, April 2002.
[9] Sabhnani M., Serpen G., “Why Machine Learning Algorithms Fail in Misuse Detection on KDD Intrusion Detection Data Set”, In Journal of Intelligent Data Analysis, 2004.
[10] Sinclair, C., Pierce, L., and Matzner, S. An application of machine learning to Network Intrusion Detection, http://www.citeseer.nj.nec.com, 2004.
[11] Verwoerd T., and Hunt, R. Intrusion Detection Techniques and Approaches, http://www.Elsevier.com, 2001.
[12] Z. Zhang, J. Li, C. N. Manikopoulos, J. Jorgenson, and J. Ucles. HIDE: A hierarchical network intrusion detection system using statistical preprocessing and neural network classification. In Proceedings of the 2001 IEEE Workshop Information Assurance and Security, pages 85–90, 2002.

Authors Profile

Selvakani Kandeeban received the MCA degree from Manonmaniam Sundaranar University and the M.Phil degree from Madurai Kamaraj University. Presently she is working as Professor & Head, MCA Department, at Francis Xavier Engineering College, Tirunelveli. Previously she was with Jaya College of Engineering and Technology as an Assistant Professor, MCA Department. She has presented 4 papers in national conferences and 1 paper in an international conference. She has published 1 paper in a national journal and 8 papers in international journals. She is currently pursuing her PhD degree in Network Security.

Dr. R. S. Rajesh received his B.E and M.E degrees in Electronics and Communication Engineering from Madurai Kamaraj University, Madurai, India in the years 1988 and 1989 respectively, and completed his Ph.D in Computer Science and Engineering from Manonmaniam Sundaranar University in the year 2004. In September 1992 he joined Manonmaniam Sundaranar University, where he is currently working as Reader in the Computer Science and Engineering Department.