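\begin{align}
\mathrm{net}^{h}_{pj} &= \sum_{i=1}^{N} w^{h}_{ji}\, x_{pi} + \theta^{h}_{j} \tag{1} \\
i_{pj} &= f^{h}_{j}\left(\mathrm{net}^{h}_{pj}\right) \tag{2} \\
\mathrm{net}^{o}_{pk} &= \sum_{j=1}^{L} w^{o}_{kj}\, i_{pj} + \theta^{o}_{k} \tag{3} \\
o_{pk} &= f^{o}_{k}\left(\mathrm{net}^{o}_{pk}\right) \tag{4} \\
\delta^{o}_{pk} &= \left(y_{pk} - o_{pk}\right) f^{o\prime}_{k}\left(\mathrm{net}^{o}_{pk}\right) \tag{5} \\
\delta^{h}_{pj} &= f^{h\prime}_{j}\left(\mathrm{net}^{h}_{pj}\right) \sum_{k} \delta^{o}_{pk}\, w^{o}_{kj} \tag{6} \\
w^{o}_{kj}(t+1) &= w^{o}_{kj}(t) + \eta\, \delta^{o}_{pk}\, i_{pj} \tag{7} \\
w^{h}_{ji}(t+1) &= w^{h}_{ji}(t) + \eta\, \delta^{h}_{pj}\, x_{pi} \tag{8}
\end{align}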
In the learning of the back-propagation neural network, Eq. (1) and Eq. (2) in Table 3 give the functions of the hidden layer, and Eq. (3) and Eq. (4) give the functions of the output layer, shown as Layer 2 in Figure 4. Learning is then driven by teaching signals. As the name back-propagation indicates, the teaching signals take the errors in the output layer and propagate them backward; that is, the errors in the hidden layer are calculated from the errors in the output layer. The errors in the output layer and the hidden layer are calculated by Eq. (5) and Eq. (6), respectively. Learning is done by modifying the weights of the output layer and of the hidden layer. The weight modifications are calculated by Eq. (7) and Eq. (8), using the learning rate and the teaching signals of Eq. (5) and Eq. (6). Learning continues until the error between the output value of the output layer and the target reaches an acceptable level, or until the stopping conditions are satisfied.
Figure 4. Procedure of N-gram and NN using the SDX-Alg
This study trained a back-propagation neural network for intrusion detection, using normal behavior patterns of 42 items generated by the SDX-Alg. After learning was completed, intrusion detection performance was evaluated on trace data and the result was compared with that of the N-gram technique.
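The learning procedure of Eq. (1) to Eq. (8) can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in this study: the class name `BPN`, the sigmoid activation for both layers, the weight initialization range, and the bias updates are assumptions that the text does not specify.

```python
import math
import random


def sigmoid(z):
    """Assumed activation f; its derivative is f(z) * (1 - f(z))."""
    return 1.0 / (1.0 + math.exp(-z))


class BPN:
    """Hypothetical one-hidden-layer network trained pattern by pattern."""

    def __init__(self, n_in, n_hidden, n_out, eta=0.2, seed=0):
        rng = random.Random(seed)
        self.eta = eta  # learning rate
        # w_h[j][i] and theta_h[j]: hidden-layer weights and biases
        self.w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                    for _ in range(n_hidden)]
        self.theta_h = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        # w_o[k][j] and theta_o[k]: output-layer weights and biases
        self.w_o = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                    for _ in range(n_out)]
        self.theta_o = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]

    def forward(self, x):
        # Eq. (1) and (2): hidden-layer net input and output i_pj
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + th)
                  for row, th in zip(self.w_h, self.theta_h)]
        # Eq. (3) and (4): output-layer net input and output o_pk
        out = [sigmoid(sum(w * hj for w, hj in zip(row, hidden)) + th)
               for row, th in zip(self.w_o, self.theta_o)]
        return hidden, out

    def train_pattern(self, x, y):
        hidden, out = self.forward(x)
        # Eq. (5): output-layer error term
        delta_o = [(yk - ok) * ok * (1.0 - ok) for yk, ok in zip(y, out)]
        # Eq. (6): hidden-layer error term, back-propagated from delta_o
        delta_h = [hj * (1.0 - hj)
                   * sum(delta_o[k] * self.w_o[k][j] for k in range(len(out)))
                   for j, hj in enumerate(hidden)]
        # Eq. (7): output-layer weight update (bias update is an assumption)
        for k, dk in enumerate(delta_o):
            for j, hj in enumerate(hidden):
                self.w_o[k][j] += self.eta * dk * hj
            self.theta_o[k] += self.eta * dk
        # Eq. (8): hidden-layer weight update (bias update is an assumption)
        for j, dj in enumerate(delta_h):
            for i, xi in enumerate(x):
                self.w_h[j][i] += self.eta * dj * xi
            self.theta_h[j] += self.eta * dj
        # Squared error of this pattern, measured before the update
        return 0.5 * sum((yk - ok) ** 2 for yk, ok in zip(y, out))

    def fit(self, patterns, target_error=0.01, max_epochs=5000):
        """Train until the summed error is small enough or epochs run out."""
        for epoch in range(1, max_epochs + 1):
            error = sum(self.train_pattern(x, y) for x, y in patterns)
            if error <= target_error:
                break
        return epoch, error
```

Under these assumptions, a network shaped like the one in this study would be created as `BPN(42, 12, 1, eta=0.2)` and trained with `fit(patterns, target_error=0.01, max_epochs=5000)`, where `patterns` is a list of `(input, target)` pairs; the single output unit is itself an assumption.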
4 Simulation
To detect anomalous intrusions from system calls, this simulation applied neural networks using a Soundex algorithm and an N-gram technique. The procedure for comparing their performance is presented in Figure 4.
For host-based intrusion detection, we used the N-gram technique, which has a simple algorithm and a high detection rate. We constructed one normal behavior profile using the N-gram technique and another using the SDX-Alg and neural networks, and then compared the performance of the two models.
4.1 N-gram Technique
A normal behavior pattern was generated by applying the N-gram technique to the normal behavior data. The window size N of the N-gram technique was varied, and the intrusion detection rates on trace data were compared.
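The sliding-window comparison described above can be illustrated with a short Python sketch. The function names and the toy system call numbers are hypothetical and are not taken from the UNM data; the sketch only shows the general idea of collecting the N-grams of normal traces into a profile and flagging a session when it contains an N-gram that the profile lacks.

```python
def ngrams(calls, n):
    """Return every window of n consecutive system calls as a tuple."""
    return [tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)]


def build_profile(normal_traces, n):
    """Collect the distinct (redundancy-removed) N-grams of normal behavior."""
    profile = set()
    for trace in normal_traces:
        profile.update(ngrams(trace, n))
    return profile


def anomalous_patterns(calls, profile, n):
    """Distinct N-grams of a trace that do not exist in the normal profile."""
    return {gram for gram in ngrams(calls, n) if gram not in profile}


def detect_sessions(sessions, profile, n):
    """Map each session id to True if at least one anomalous N-gram occurs."""
    return {sid: bool(anomalous_patterns(calls, profile, n))
            for sid, calls in sessions.items()}


def detection_rate(flags):
    """Fraction of sessions flagged as anomalous."""
    return sum(flags.values()) / len(flags)
```

For example, calling `build_profile(normal_traces, 4)` and then `detect_sessions(trace_sessions, profile, 4)` corresponds to evaluating one window size; repeating this for N from 3 to 10 gives one detection rate per window size.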
First, profiles of the normal behaviors, intrusion behaviors and trace behaviors were constructed, and the results presented in Table 4 were obtained. As the window size N increased, the total number of normal patterns decreased, with some exceptions, whereas the number of patterns remaining after redundancy was removed tended to increase. The intrusion and trace data were then compared against the normal behavior data, and redundancy was removed from the patterns that do not exist in normal behavior.
Proceedings of the 10th IEEE Symposium on Computers and Communications (ISCC 2005)
1530-1346/05 $20.00 2005 IEEE
Table 4. Number of anomaly patterns according to window size for normal, intrusion and trace data

Window     Number of          Redundancy-removed patterns
size (N)   normal patterns    normal    intrusion    trace
3          809,997            440       21           18
4          227,584            570       176          177
5          227,385            693       55           38
6          227,186            811       66           46
7          226,987            910       72           55
8          226,788            996       78           64
9          226,640            1076      84           73
10         326,179            1153      90           80
As the window size N increased, the number of patterns detected in the intrusion and trace data increased, and the number of redundancy-removed patterns increased as well. In particular, at N=4 the number of redundancy-removed patterns in the intrusion and trace data rose sharply and reached its largest value. With N varied from 3 to 10, the optimal N was taken to be the one yielding the largest number of anomaly patterns that are not found in the normal data. A large number of redundancy-removed patterns means that there is more anomaly detection information that can be differentiated from the normal data.
With the window size N of the normal data varied, anomaly detection was performed on trace data composed of ten sessions; the results are presented in Table 5. In Table 4, with N=4, the comparison with normal data detected various kinds of anomaly patterns in the intrusion and trace data. In Table 5, however, the session detection rates were the same for N from 4 to 10.
Table 5. Number of Anomaly Detections According to Window Size of Normal Behaviors using N-gram

Window size N                          3       4 to 10
Detected sessions / whole sessions     8/10    9/10
Detection rate                         80%     90%
As N in the N-gram increased from 3 to 10, the number of detected patterns increased. When the window size increased from 3 to 4, the detection rate rose from 80% to 90%. Of the ten sessions of trace data, two sessions (PID 107 and PID 144) went undetected with N=3, but only one session (PID 144) went undetected with N=4. For the N-gram technique, the highest detection rate, 90%, was obtained when the window size was four or more. Increasing N further did not raise the detection rate; only the number of detected anomaly patterns increased. Therefore, by Occam's razor, the most effective window size is four.
Figure 5. Learning with 12 neurons in the hidden layer
Figure 6. Output value of NN for intrusion data and trace data
4.2 Over-fitting and Under-fitting of Neural Network Learning
For neural network learning, the number of neurons in the hidden layer was varied from 10 to 40 to investigate the learning rate and errors. The number of neurons in the hidden layer must be chosen so as to overcome over-fitting and under-fitting, which are disadvantages of neural network learning. Over-fitting means learning even the noise contained in the learning data, and under-fitting means that learning is not fully achieved. Figure 5 shows the case of twelve neurons in the hidden layer, in which learning was achieved in 428 epochs.
Figure 7. Detection Results by N-gram and the Proposed Method
For neural network learning, the number of neurons in the hidden layer was set to 12, and 199 normal behavior patterns were learned. The SDX-Alg transformed the system call data of each normal behavior pattern into a learning pattern of 42 items, and learning was performed with an error threshold of 0.01, a learning rate of 0.2, and at most 5000 epochs. Figure 6 shows the detection results obtained by feeding the intrusion and trace data into the trained neural network.
Figure 7 shows the detection results of the N-gram technique and of the NN technique using the SDX-Alg. Figure 7 (a) shows the detection results with N = 3, where eight of the ten sessions in the trace data were detected. Figure 7 (b) shows the detection results of the neural network using the SDX-Alg, as a distribution of feature vectors. Nine of the ten sessions in the trace data were detected, a detection rate of 90%.
4.3 Comparison of Anomaly Detection between Neural Networks using the Soundex Algorithm and the N-gram Technique
This study simulated anomaly detection on the system call data sets of the sendmail daemon provided by UNM, using neural networks with the SDX-Alg and an N-gram technique. The system call data were transformed into learning patterns of 42 items by the SDX-Alg, and NN learning was performed. For the N-gram technique, the window size was varied from 3 to 10 to detect anomalies, and the results are compared in Table 6.
Table 6. Comparative Analysis of the Proposed Method and N-gram

                                      N-gram (window size N)
Item                                  3          4          6          10          BPN
Repetition           # of patterns    809,997    227,584    227,186    326,179     199
                     Data size        9.6 MB     3.47 MB    4.97 MB    12.08 MB    22 KB
Repetition removed   # of patterns    440        570        811        1,153       41
                     Data size        5 KB       7 KB       14.4 KB    34 KB       5 KB
MDL                  Error            0.2        0.1        0.1        0.1         0.1
                     Complexity       0.997      0.999      1.000      1.000       0.999
                     Total            1.197      1.099      1.100      1.100       1.099
Epoch #                               -          -          -          -           428
Detection rate                        80%        90%        90%        90%         90%
MDL (Minimum Description Length) [12] is composed of the loss of errors L(D | H) and the loss of complexity L(H). The smaller the MDL value, the more effective the model. The loss of errors is defined as 1 - anomaly detection rate, and the loss of complexity is defined as 1 - occupancy rate of information in the data description space. Table 6 compares the proposed method and the N-gram technique by MDL.
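As a worked check of how the MDL totals arise, the two losses are simply added. The values below are taken from the Error and Complexity rows of Table 6; the error loss follows from the detection rate (1 - 0.8 = 0.2 for N=3 and 1 - 0.9 = 0.1 otherwise).

```latex
\begin{align*}
\mathrm{MDL}(H) &= L(D \mid H) + L(H) \\
\text{N-gram, } N=3:  &\quad 0.2 + 0.997 = 1.197 \\
\text{N-gram, } N=4:  &\quad 0.1 + 0.999 = 1.099 \\
\text{N-gram, } N=10: &\quad 0.1 + 1.000 = 1.100 \\
\text{BPN}:           &\quad 0.1 + 0.999 = 1.099
\end{align*}
```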
Among the window sizes tested, the N-gram technique was shown to be most effective with N=4. When the suggested method was compared with the N-gram technique at N=4, their detection rates were identical, but the suggested method was shown to be more effective in terms of model complexity.
When the window size of the N-gram was 3, 4 and 10, the detection rates were 80%, 90% and 90%, respectively. As the window size increased, the space needed to describe the patterns and the time needed to process them increased. When the SDX-Alg and neural networks were used, the detection rate was 90%. Compared with the N-gram technique with a window size of 3, the suggested method was superior in both detection rate and complexity. With the window size between 4 and 10, the detection rates of the two methods were the same, but the neural network technique was superior in time and space complexity.
5 Conclusion
This study applied the Soundex algorithm to solve the problem of variable-length data in a detection system that uses a neural network with supervised learning, a form of machine learning. Transforming variable-length system call data into fixed-length patterns with the Soundex algorithm made the learning algorithm of the neural network simple and overcame the space and time complexity required for learning aimed at intrusion detection. To detect host-based anomaly intrusions, we first classified sessions and generated the host's behavior patterns by transforming the variable-length data into fixed-length patterns. We then learned the normal behavior patterns with a supervised back-propagation neural network and used it to detect anomalous behaviors. Resolving the difficulties of variable-length data processing in this way contributed to the improvement of anomaly intrusion detection.
Compared with the N-gram technique with a window size of 3, the suggested neural network method showed a higher detection rate. When the window size was varied from 4 to 10, the detection rate of the suggested method was the same as that of the N-gram technique, 90%. However, in the time and space complexity of running the algorithm, intrusion detection by neural networks using the Soundex algorithm was superior.
Acknowledgment : This study was supported (in part)
by research funds from Chosun University, 2004.
References
[1] Leonid Portnoy, "Intrusion detection with unlabeled data using clustering," Undergraduate Thesis, Columbia University, 2000.
[2] Jack Marin, Daniel Ragsdale, and John Shurdu, "A Hybrid Approach to the Profile Creation and Intrusion Detection," Proceedings of DARPA Information Survivability Conference and Exposition, IEEE, 2001.
[3] Nong Ye and Xiangyang Li, "A Scalable Clustering Technique for Intrusion Signature Recognition," Proceedings of 2001 IEEE Workshop on Information Assurance and Security, 2001.
[4] Wenke Lee, Salvatore J. Stolfo, Philip K. Chan, Eleazar Eskin, Wei Fan, Matthew Miller, Shlomo Hershkop, and Junxin Zhang, "Real Time Data Mining-based Intrusion Detection," IEEE, 2001.
[5] Christina Warrender, Stephanie Forrest, and Barak Pearlmutter, "Detecting Intrusion Using System Calls: Alternative Data Models," 1998.
[6] http://www.archives.gov/research_room/genealogy/census/soundex.html
[7] S. Forrest, S. Hofmeyr, A. Somayaji and T. Longstaff, "A sense of self for Unix processes," In IEEE Symposium on Security and Privacy, pp.120-128, 1996.
[8] Steven A. Hofmeyr, Stephanie Forrest, and Anil Somayaji, "Intrusion Detection using Sequences of System Calls," Journal of Computer Security, Vol.6, pp.151-180, August 18, 1998.
[9] A. K. Ghosh, A. Schwarzbard and M. Shatz, "Learning program behavior profiles for intrusion detection," Proceedings of the 1st USENIX Workshop on Intrusion Detection and Network Monitoring, April, 1999.
[10] A. K. Ghosh, J. Wanken and F. Charron, "Detecting anomalous and unknown intrusions against programs," Proceedings of the 1998 Annual Computer Security Applications Conference (ACSAC '98), 1998.
[11] http://cs.unm.edu/~immsec/data/synth-sm.html
[12] Christopher M. Bishop, "Neural Networks for Pattern Recognition," Oxford University Press, pp.429-433, 1995.