
Automatic feature extraction using genetic programming: An application to epileptic EEG classification


Ling Guo, Daniel Rivero, Julián Dorado, Cristian R. Munteanu, Alejandro Pazos

Department of Information Technologies and Communications, University of A Coruña, Campus Elviña, 15071 A Coruña, Spain
Article info
Keywords:
Genetic programming
Feature extraction
K-nearest neighbor classifier (KNN)
Discrete wavelet transform (DWT)
Epilepsy
EEG classification
Abstract
This paper applies genetic programming (GP) to perform automatic feature extraction from an original feature database, with the aim of improving the discriminatory performance of a classifier while reducing the input feature dimensionality at the same time. The tree structure of GP naturally represents the features, and a new function created in this work automatically decides the number of features extracted. In experiments on two common epileptic EEG detection problems, the classification accuracy on the GP-based features is significantly higher than on the original features. Simultaneously, the dimension of the input features for the classifier is much smaller than that of the original features.
© 2011 Elsevier Ltd. All rights reserved.
1. Introduction
Feature extraction for classification seeks a transformation or mapping from the original features to a new feature space which maximizes the separability of the different classes. A classification problem cannot be properly solved if important interactions and relationships between the original features are not taken into consideration. Thus, many researchers agree that feature extraction is the most important key to any pattern recognition and classification problem: "The precise choice of features is perhaps the most difficult task in pattern recognition" (Micheli-Tzanakou, 2000); "an ideal feature extraction would yield a representation that makes the job of the classifier trivial" (Duda, Hart, & Stork, 2001). In most cases, feature extraction is done by humans, based on the researcher's knowledge, experience, and/or intuition.
Epilepsy is a type of neurological disorder. About 40 to 50 million people in the world suffer from epilepsy (Kandel, Schwartz, & Jessell, 2000). The significant characteristic of epilepsy is recurrent seizures. Epilepsy can have several physical, psychological and social consequences, including mood disorders, injuries and sudden death. Until now, the specific cause of epilepsy in individuals has often been unknown and the mechanisms behind the seizures are little understood. Thus, efforts toward its diagnosis and treatment are of great importance.
This work has a multidisciplinary nature, combining evolutionary computation, feature extraction and biomedical signal processing. It applies genetic programming (GP) to automatic feature extraction with the purpose of improving the discrimination performance of a K-nearest neighbor (KNN) classifier and decreasing the input feature dimension. The tree structure of GP naturally represents new features, and a new function created in this work for GP enables automatic feature extraction. The novel method is verified to be successful when applied to epileptic EEG classification problems. It is seen that the classification accuracy is greatly increased with the GP extracted features. At the same time, the input feature dimension is tremendously reduced, down to three or four features for epileptic EEG discrimination. In addition, through analyzing the expressions of the GP extracted features, informative measures useful for EEG classification are selected from the original features.
The paper is organized as follows: In Section 2, an introduction to EEG and epilepsy is presented to allow a general understanding of the nature of the application. Then, the discrete wavelet transform (DWT), an efficient non-stationary signal processing tool, is briefly described. Later, some basic aspects of genetic programming are presented, since this is the key technique of the proposed method. A short description of the K-nearest neighbor classifier is included in the same part. Previous work on genetic programming applied to feature extraction and on epileptic EEG classification is also described in this section. The epileptic EEG classification problem considered in this work is given in Section 3. In Section 4, a detailed explanation of the developed methodology is covered. Subsequently, the implementation and results of the proposed methodology on two different epileptic EEG classifications are discussed. Finally, conclusions and future work are included.
2. State of the art
2.1. Epilepsy and electroencephalogram (EEG)
Epilepsy is the second most prevalent neurological disorder in humans after stroke. It is characterized by recurring seizures in
0957-4174/$ - see front matter © 2011 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2011.02.118

Corresponding author. Tel.: +34 981 167000x1302; fax: +34 981 167160.
E-mail address: lguo@udc.es (L. Guo).
Expert Systems with Applications 38 (2011) 10425–10436
which abnormal electrical activity in the brain causes altered perception or behavior. Approximately one in every 100 persons will experience a seizure at some time in their life. Until now, the occurrence of an epileptic seizure has been unpredictable and its course of action is little understood.
Electroencephalography is the recording of the electrical activity of the brain, usually taken through several electrodes on the scalp. The EEG contains much valuable information relating to the different physiological states of the brain and is thus a very useful tool for understanding brain diseases such as epilepsy. The detection of epileptiform discharges occurring in the EEG is an important component in the diagnosis and treatment of epilepsy (Subasi, 2005a). However, EEG recordings usually contain very large amounts of data, and visual inspection for discriminating EEGs is a time-consuming and costly process. Much effort has therefore been devoted to developing epileptic EEG classification techniques.
2.2. Discrete wavelet transform
The EEG is a complicated and non-stationary signal and its characteristics are spatio-temporally dependent. Based on these properties, the DWT is chosen in this work for pre-analyzing the epileptic EEG signal (Adeli, Zhou, & Dadmehr, 2003). Recently, the wavelet transform (WT) has been widely applied in many engineering fields for solving various real-life problems. The Fourier transform (FT) of a signal obtains the frequency content of the signal but eliminates time information. The short-time Fourier transform (STFT) is a series of FTs with a fixed window size. Because a large window loses time resolution and a short window loses frequency resolution, there always exists a trade-off between time and frequency resolution: on the one hand, a good time resolution requires a short window with short time support; on the other hand, a good frequency resolution requires a long time window. The fixed window size (fixed time–frequency resolution) of the STFT is a constraint in some applications.
Contrary to the STFT, the WT provides a more flexible time–frequency representation of a signal by allowing the use of variable-sized analysis windows. The attractive feature of the WT is that it provides accurate frequency information at low frequencies and accurate time information at high frequencies (Adeli et al., 2003). This property is important in biomedical applications, because most signals in this field contain high-frequency information with short time duration and low-frequency information with long time duration. Through the wavelet transform, transient features are precisely captured and localized in both the time and frequency domains. The WT has commonly been considered one of the most powerful tools for EEG signal analysis.
The continuous wavelet transform (CWT) of a signal S(t) is defined as the correlation between S(t) and the wavelet function \psi_{a,b}, as follows (Chui, 1992):

CWT_{a,b} = |a|^{-1/2} \int_{-\infty}^{\infty} S(t)\, \psi^{*}\!\left(\frac{t-b}{a}\right) dt, \quad (1)

where a and b are called the scale (reciprocal of frequency) and translation (time localization) parameters, respectively. When a and b are taken as discrete numbers defined on the basis of powers of two, as follows:

a_j = 2^{j}, \quad b_{j,k} = 2^{j}k, \quad j, k \in \mathbb{Z}, \quad (2)

then the discrete wavelet transform is obtained and Eq. (1) becomes (Chui, 1992):

DWT_{j,k} = 2^{-j/2} \int_{-\infty}^{\infty} S(t)\, \psi^{*}\!\left(\frac{t-2^{j}k}{2^{j}}\right) dt. \quad (3)
Mallat (1989) developed an efficient way of implementing the DWT by passing the signal through a series of low-pass and high-pass filters. The DWT implementation procedure is schematically shown in Fig. 1, where the filters h[n] and g[n] correspond to high-pass and low-pass filters, respectively (Subasi, 2007). In the first stage, the signal is simultaneously passed through the h[n] and g[n] filters, with the cut-off frequency being one fourth of the sampling frequency. The outputs of the h[n] and g[n] filters are referred to as the detail (D1) and approximation (A1) coefficients of the first level, respectively. The same procedure is repeated on the first-level approximation coefficients to obtain the second-level coefficients. At each stage of this decomposition process, the frequency resolution is doubled through filtering and the time resolution is halved through downsampling. The coefficients A1, D1, A2 and D2 represent the frequency content of the original signal within the bands 0–f_S/4, f_S/4–f_S/2, 0–f_S/8 and f_S/8–f_S/4, respectively, where f_S is the sampling frequency of the original signal x[n].
2.3. Genetic programming
Genetic programming is an evolutionary technique used to create computer programs that represent approximate or exact solutions to a problem (Koza, 1992). GP works on the evolution of a given population. In this population, every individual represents a solution to the problem that is intended to be solved. GP looks for the best solution by means of a process based on the theory of evolution (Darwin, 1864), in which, starting from an initial population of randomly generated individuals, new individuals are produced from old ones over subsequent generations by means of crossover, selection and mutation operations. Based on natural selection, good individuals have more chances of surviving into the next generation. Thus, after successive generations, the best-so-far individual is obtained, corresponding to the final solution of the problem. The GP encoding for solutions is tree-shaped, so the user must specify the terminals (leaves of the tree) and the functions (nodes capable of having descendants) to be used by the evolutionary algorithm to build complex expressions. A fitness function, used for measuring the appropriateness of the individuals in the population, also has to be defined; it is the most critical point in the design of a GP system.
The wide application of GP to various environments and its success are due to its capability of being adapted to numerous different problems. Although one of the most common applications of GP is the generation of mathematical expressions (Rivero, Rabuñal, Dorado, & Pazos, 2005), it has also been used in other fields such as rule generation (Bot & Langdon, 2000), filter design (Rabuñal et al., 2003), and classification (Espejo, Ventura, & Herrera, 2010).
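As a minimal sketch of this tree encoding (an assumption for illustration, not the authors' implementation), an individual can be stored as nested tuples of function nodes over terminal leaves and evaluated recursively:

```python
import operator

# Function set: internal nodes; terminals are variable names or constants.
FUNCTIONS = {'+': operator.add, '-': operator.sub, '*': operator.mul}

def evaluate(tree, terminals):
    """Recursively evaluate a tree-shaped GP individual against a dict
    mapping terminal (variable) names to values."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return FUNCTIONS[op](evaluate(left, terminals),
                             evaluate(right, terminals))
    return terminals.get(tree, tree)  # variable name, or a numeric constant

# A toy individual encoding the expression (x1 * x2) + 3.
ind = ('+', ('*', 'x1', 'x2'), 3)
print(evaluate(ind, {'x1': 2.0, 'x2': 5.0}))  # 13.0
```

Crossover and mutation would then swap or replace sub-trees of such tuples; the fitness function scores each individual's evaluated output.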
2.4. K-nearest neighbor classifier
K-nearest neighbor (KNN) classifier (Cover & Hart, 1967) is a nonparametric, nonlinear and relatively simple classifier. It classifies a new sample by measuring its distance to a number of patterns kept in memory. The class that the KNN classifier assigns to the new sample is decided by the patterns that most resemble it, i.e. those with the smallest distance to it. The distance function commonly used in the KNN classifier is the Euclidean distance. Instead of taking the single nearest sample, a majority vote is normally taken among the k nearest neighbors. The parameter k has to be selected in practice; in this work, k is set to 3.
Fig. 1. Sub-band decomposition of DWT implementation.
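The classifier just described fits in a few lines; the toy training points and labels below are invented for illustration:

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """train: list of (feature_vector, label) pairs. Classify `query` by
    majority vote among its k nearest neighbors under Euclidean distance."""
    dists = sorted((math.dist(x, query), label) for x, label in train)
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

train = [((0.0, 0.0), 'normal'), ((0.1, 0.2), 'normal'),
         ((0.2, 0.1), 'normal'), ((3.0, 3.0), 'seizure'),
         ((3.1, 2.9), 'seizure')]
print(knn_classify(train, (0.1, 0.1)))  # normal
```

With k = 3, as in this work, ties between two classes cannot occur in a two-class problem, which is one practical reason for choosing an odd k.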
2.5. Previous work on genetic programming applied to feature extraction
Raymer, Punch, Goodman, and Kuhn (1996) applied GP to improve KNN classifier performance without feature reduction. For each attribute, they evolved a tree which is a function of the attribute itself and zero or more constants. They applied this to a biochemistry database and obtained better classification performance than a similar system based on a genetic algorithm.
In the work of Sherrah (1998), a feature extraction system was created using evolutionary computation. The individuals in his work are multi-trees, each tree encoding one feature, which means that the number of extracted features is determined beforehand. His system was not designed for the KNN classifier but for other classification systems, such as the minimum distance to means classifier, the parallelepiped classifier and the Gaussian maximum likelihood classifier.
Tackett (1993) developed a processing tree derived from GP for the classification of features extracted from images: measurements from the segmented images were weighted and combined through linear and non-linear operations. If the resulting value of the tree was greater than zero, the object was classified as a target.
Ebner and co-workers have evolved image processing operators using GP (Ebner & Rechnerarchitektur, 1998; Ebner & Zell, 1999). The authors posed interest point (IP) detection as an optimization problem, attempting to evolve the Moravec operator (Ebner & Rechnerarchitektur, 1998) using genetic programming. They reported a 15% localization error between the interest points detected by the evolved operator and those obtained using the Moravec detector.
Bot (2001) used GP to evolve new features for classification problems, adding them one at a time to a KNN classifier if the newly evolved feature improved the classification performance by a certain amount. Bot's approach is a greedy algorithm and therefore almost certainly sub-optimal.
Harvey et al. (2002) evolved pipelined image processing operations to transform multi-spectral input synthetic aperture radar (SAR) image planes into a new set of image planes, and a conventional supervised classifier was used to label the transformed features. Training data were used to derive a Fisher linear discriminant, and GP was applied to find a threshold to reduce the output of the discriminant-finding phase to a binary image. However, the discriminability is constrained in the discriminant-finding phase, and GP was only used as a one-dimensional search tool to find a threshold.
Kotani, Nakai, and Akazawa (1999) used GP to evolve polynomial combinations of raw features to feed into a KNN classifier and demonstrated an improvement in classification accuracy. The authors assumed in advance that the features were polynomial expressions, i.e. product sums of the original patterns.
Krawiec (2002) constructed a fixed-length decision vector using GP, proposing an extended method to protect useful blocks during the evolution. This protection method, however, results in over-fitting, as his experiments showed. Krawiec's results showed that for some datasets his feature extraction method actually produces worse classification performance than using the raw input data.
Guo, Jack, and Nandi (2005) evolved features using GP in a condition monitoring task, although it is not clear whether the elements in the vector of decision variables were evolved at the same time or hand selected after evolution.
Firpi, Goodman, and Echauz (2006) developed artificial features without physical meaning by means of GP, then applied those features to a KNN classifier for predicting epileptic seizures. The researchers evaluated the performance of the GP artificial features on IEEG data from seven patients and obtained satisfactory prediction accuracy. However, the maximum number of artificial features evolved through GP was predefined by the authors.
Recently, Sabeti, Katebi, and Boostani (2009) employed GP to select the best features from the original feature set to increase classifier performance on EEG classification problems. In that work, the aim of utilizing GP was to pick out the most important feature elements, not to create new artificial GP-based features.
In the present work, GP is used to create new features from an original feature database to improve KNN classifier performance and simultaneously decrease the input feature dimension for the classifier. The input feature dimension is automatically determined during GP evolution, not fixed beforehand or decided by humans.
2.6. Previous work on epileptic EEG classification
In this part, other researchers' work on epileptic EEG classification is briefly reviewed. Mohseni, Maghsoudi, Kadbi, Hashemi, and Ashourvan (2006) applied short-time Fourier transform (STFT) analysis to EEG signals and extracted features based on the pseudo Wigner–Ville and the smoothed-pseudo Wigner–Ville distributions. Those features were then used as inputs to an artificial neural network (ANN) for classification.
Kalayci and Ozdamar (1995) used the wavelet transform to capture some specific characteristic features of the EEG signals, combined with an ANN, to obtain satisfying classification results.
Nigam and Graupe (2004) described a method for automated detection of epileptic seizures from EEG signals using a multistage nonlinear pre-processing filter for extracting two features: relative spike amplitude and spike occurrence frequency. They then fed those features to a diagnostic artificial neural network.
In the work of Jahankhani, Kodogiannis, and Revett (2006), the EEGs were decomposed with the wavelet transform into different sub-bands and some statistical information was extracted from the wavelet coefficients. A radial basis function network (RBF) and a multi-layer perceptron network (MLP) were utilized as classifiers.
Subasi (2005b, 2006, 2007) decomposed the EEG signals into time–frequency representations using the discrete wavelet transform. Features based on the DWT were obtained and applied to different classifiers for epileptic EEG classification, such as a feed-forward error back-propagation artificial neural network (FEBANN), a dynamic wavelet network (DWN), a dynamic fuzzy neural network (DFNN) and a mixture of experts (ME).
Übeyli (2009) employed wavelet analysis with a combined neural network model to discriminate EEG signals. The EEGs were decomposed into time–frequency representations using the DWT and then statistical features were calculated. A two-level neural network model was then used to classify three types of EEG signals. The results showed that the combined neural network model achieved better classification performance than a stand-alone neural network model.
Ocak (2009) detected epileptic seizures based on approximate entropy (ApEn) and the discrete wavelet transform. EEG signals were first decomposed into approximation and detail coefficients using the DWT, and then ApEn values for each set of coefficients were computed. Finally, surrogate data analysis was applied to the ApEn values to classify EEGs.
Işık and Sezer (in press) investigated the wavelet transform for diagnosing epilepsy. The EEGs were decomposed into several sub-bands using the wavelet transform and a set of feature vectors was extracted. The dimensions of these feature vectors were reduced via principal component analysis and the signals were then classified as epileptic or healthy using multi-layer perceptron and Elman neural networks.
Instead of applying features derived from the DWT as inputs to classifiers to discriminate EEGs, other quantitative information from the signal time series has also been investigated. In Güler's work (Güler, Übeyli, & Güler, 2005), Lyapunov exponents were extracted from EEGs with Jacobi matrices and then applied as inputs to recurrent neural networks (RNNs) to obtain good classification results. Übeyli (2006b) classified the EEG signals by a combination of Lyapunov exponents and a fuzzy similarity index. Fuzzy sets were obtained from the feature sets (Lyapunov exponents) of the signals under study. The results demonstrated that the similarity between the fuzzy sets of the studied signals indicated the variabilities in the EEG signals; thus, the fuzzy similarity index could discriminate the different EEGs. In the work of Übeyli (2006a), the author used the computed Lyapunov exponents of the EEG signals as inputs of MLPNNs trained with the backpropagation, delta-bar-delta, extended delta-bar-delta, quick propagation, and Levenberg–Marquardt algorithms. The classification accuracy of the MLPNN trained with the Levenberg–Marquardt algorithm was 95% for discriminating healthy, seizure-free and seizure EEGs.
In the study presented by Übeyli and Güler (2007), decision making was performed in two stages: feature extraction by eigenvector methods and classification using classifiers trained on the extracted features. The inputs of these expert systems, composed of diverse or composite features, were chosen according to the network structures. The five-class classification accuracies of the expert system with diverse features (MME) and with composite features (ME) were 95.53% and 98.6%, respectively.
Besides ANNs, other types of classifiers have also been utilized for EEG discrimination, including linear discriminant analysis (LDA), multiclass support vector machines (SVMs), Bayesian classifiers, and nearest neighbor classifiers. LDA assumes a normal distribution of the data, with an equal covariance matrix for the two classes. The separating hyperplane is obtained by seeking the projection that maximizes the distance between the two classes' means and minimizes the intraclass variance. This technique has a very low computational requirement, which makes it suitable for online and real-time classification problems (Garrett, Peterson, Anderson, & Thaut, 2003). SVM also uses a discriminant hyperplane to identify classes. However, for SVM, the selected hyperplane is the one that maximizes the margins, i.e., the distance to the nearest training points; maximizing the margins is known to increase the generalization capabilities (Blankertz, Curio, & Müller, 2002). The main weakness of SVM is its relatively low execution speed. Übeyli (2008a) presented a multiclass support vector machine (SVM) with error correcting output codes (ECOC) for EEG classification. The features were extracted by eigenvector methods and used to train the novel classifier (multiclass SVM with ECOC) for the EEG signals. The Bayesian classifier aims at assigning to a feature vector the class with the highest probability. Bayes' rule is used to compute the so-called a posteriori probability that a feature vector belongs to a given class. Using the MAP (maximum a posteriori) rule and these probabilities, the class of the feature vector can be estimated (Fukunaga, 1990). Nearest neighbor classifiers are relatively simple. They assign a feature vector to a class according to its nearest neighbor(s), and they are discriminative non-linear classifiers (Garrett et al., 2003).
3. Problem description
The epileptic EEG classification problem described by Andrzejak et al. (2001) was considered in the current research. The whole dataset consists of five sets (denoted Z, O, N, F and S), each containing 100 single-channel EEG segments of 23.6 s duration, with a sampling rate of 173.6 Hz. These segments were selected and cut out from continuous multi-channel EEG recordings after visual inspection for artifacts, e.g., due to muscle activity or eye movements. Sets Z and O consist of segments taken from surface EEG recordings that were carried out on five healthy volunteers using a standardized electrode placement scheme. Volunteers were relaxed in an awake state with eyes open (Z) and eyes closed (O), respectively. Sets N, F and S originated from an EEG archive of presurgical diagnosis. Segments in set F were recorded from the epileptogenic zone, and those in set N from the hippocampal formation of the opposite hemisphere of the brain. While sets N and F contain only activity measured during seizure-free intervals, set S contains only seizure activity. Here, segments were selected from all recording sites exhibiting ictal activity.
In this work, two different classification problems are created from the described dataset and then used to test our method. In the first problem, two classes are examined: normal and seizure. The normal class includes only set Z, while the seizure class includes set S. The second problem includes three classes: normal, seizure-free and seizure. The normal class includes only set Z, the seizure-free class set F, and the seizure class set S. According to the previous description, the datasets consist of 200 and 300 EEG segments for these two problems, respectively.
4. Methodology
The current research applies genetic programming to automatically extract new features that improve classifier performance and reduce the input feature dimension simultaneously. Here, the term automatically has two meanings: one is that the expressions of the new features are automatically defined by GP; the other is that the number of new features is also automatically determined by GP.
Fig. 2 illustrates the approach of feature extraction using GP for a classification problem. The whole process consists of three stages:
Stage 1: creation of the original feature database. Feature extraction requires creating an original feature database as a prerequisite for the subsequent procedures. In this work, the original features are created on the basis of discrete wavelet transform analysis of the raw EEG signals.
Stage 2: genetic programming based feature extraction system. This is the main part of the proposed method. It consists of GP and a KNN classifier. GP non-linearly transforms the original features obtained in stage 1 into a set of new features to facilitate classification. The output of the system is the GP-based features.
Stage 3: classification. The purpose of this stage is to verify the efficiency of the GP-based features on the test data. The test data are processed by the GP-based features and then fed to a classifier for evaluating the classification performance.
The three stages are described in detail in the following.
4.1. Creation of the original feature database
Since the EEG is a non-stationary signal, the discrete wavelet transform is chosen to analyze the EEGs and help create the original features. The basic theory of the DWT was explained in Section 2.2.
The raw EEG signal is first decomposed into several sub-signals by the DWT, which include one approximation sub-signal and several detail sub-signals. Each sub-signal represents the original signal in a different frequency band. Then, five classic measures used in EEG signal analysis are calculated for each sub-signal. These measures were selected from the time, statistics and information theory domains to reveal the most important EEG signal characteristics. In order to define these measures, let S be a sampled signal with N samples, where S_i is the ith sample and S_{i-1} is the previous one. The definitions of the five measures are:
4.1.1. Mean value of the signal
Average (arithmetic mean) of the signal amplitudes:

\mu = \frac{1}{N} \sum_{i=1}^{N} S_i.

4.1.2. Standard deviation of the signal
The square root of the variance, where \mu represents the mean value of the signal:

\sigma = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (S_i - \mu)^2}.

4.1.3. Energy of the signal
Measures the average instantaneous energy of the signal:

E = \frac{1}{N} \sum_{i=1}^{N} S_i^2.

4.1.4. Curve length of the signal
Sum of the lengths of the vertical line segments between samples. It provides measures of both time and frequency characteristics:

CL = \sum_{i=2}^{N} |S_i - S_{i-1}|.

4.1.5. Skewness of the signal
Measure of the asymmetry of the data distribution, where \mu and \sigma represent the mean and standard deviation of the signal, respectively:

Skew = \frac{1}{N} \sum_{i=1}^{N} \left(\frac{S_i - \mu}{\sigma}\right)^3.

These five measures, calculated on each sub-signal of the EEG signal, are used to create the original feature database.
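The five definitions above transcribe directly into code; this is an illustrative sketch (the sample signal below is invented) rather than the authors' implementation:

```python
import math

def measures(S):
    """The five per-sub-signal measures of Section 4.1:
    mean, standard deviation, energy, curve length, skewness."""
    N = len(S)
    mean = sum(S) / N
    std = math.sqrt(sum((s - mean) ** 2 for s in S) / (N - 1))
    energy = sum(s * s for s in S) / N
    curve_len = sum(abs(S[i] - S[i - 1]) for i in range(1, N))
    skew = sum(((s - mean) / std) ** 3 for s in S) / N
    return mean, std, energy, curve_len, skew

m, sd, e, cl, sk = measures([1.0, 2.0, 4.0, 2.0, 1.0])
print(m, e, cl)  # 2.0 5.2 6.0
```

Applying `measures` to each of the five DWT sub-signals (A4, D1–D4) would yield the 25 original features per EEG segment.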
4.2. Genetic programming based feature extraction system
The genetic programming based feature extraction system developed in this work consists of genetic programming and a KNN classifier. The main part of the system is genetic programming. Each individual in GP represents a set of new features, which are non-linear transformations of the original features. Afterward, those new features are passed to the KNN classifier. The goodness of the new features is evaluated through the misclassification rate of the KNN classifier.
In the classic GP evolution process, each individual of the population represents one expression. When this expression is evaluated, only one feature is obtained. That is the most common situation in GP applications to feature extraction. In order to allow each individual to automatically generate more than one feature, in the current work a new function named F is created and added to the function set. It is a special function used to create the output new feature vector. The F function can appear at any position of the tree in GP. F has only one argument, and the output of F is just a copy of its input. However, the input argument of F is added to the list of the new feature vector. Thus, when evaluating an individual in GP, depending on the number of F nodes included in the tree, the same number of new features will be automatically generated.
The usage of the function F can be clearly described with the example shown in Fig. 3, where there are 10 input variables to a classifier. The GP tree in Fig. 3 contains two F nodes. When the tree is evaluated, the numerical value resulting from this evaluation is not used. What is interesting are the sub-trees that are children of each F node. Those sub-trees determine the expressions of the extracted new features: the first F node implies the creation of a feature with the expression x_2 − x_5, which is put into the new feature vector. The second F node implies the creation of a feature with the expression x_7 + x_10, which is also added to the same new feature vector. Thus, after finishing the evaluation of the whole tree in Fig. 3, a feature vector including two new features has been created. The feature dimension in this example has been decreased from the original 10 down to 2.
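This evaluation mechanism might be sketched as follows (hypothetical code, not the authors' system; it assumes sub-trees mirroring the Fig. 3 example, with 0-based indices into the original feature vector):

```python
def evaluate(tree, x, features):
    """Evaluate a GP tree over original feature vector x. Every F node
    returns its child's value unchanged but also appends that value to
    the shared `features` list, so one traversal yields all new features."""
    if isinstance(tree, tuple):
        op, *children = tree
        vals = [evaluate(c, x, features) for c in children]
        if op == 'F':                 # identity node that records a feature
            features.append(vals[0])
            return vals[0]
        if op == '+':
            return vals[0] + vals[1]
        if op == '-':
            return vals[0] - vals[1]
        if op == '*':
            return vals[0] * vals[1]
    return x[tree]                    # terminal: index of an original feature

# Mirrors Fig. 3: F(x2 - x5) and F(x7 + x10) inside a larger tree.
tree = ('+', ('F', ('-', 1, 4)), ('F', ('+', 6, 9)))
x = [float(i) for i in range(1, 11)]  # x1..x10 = 1..10
features = []
evaluate(tree, x, features)           # the returned value itself is discarded
print(features)  # [-3.0, 17.0]
```

The dimensionality reduction is implicit: however large the original vector x, the new feature vector contains exactly one entry per F node in the tree.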
4.2.1. The fitness function
The fitness function is the most important point in GP evolution. In the current work, the fitness function is defined as the misclassification rate of the classifier on the training data, as depicted in Fig. 4.
Fig. 2. Diagram of the GP based feature extraction for epileptic EEG classification.
Fig. 3. Example of a GP tree creating a feature vector that includes two new features.
L. Guo et al. / Expert Systems with Applications 38 (2011) 1042510436 10429
The detailed procedure for calculating the fitness value of each
individual is:
1. The training data, pre-processed by the original features, are randomly
split into a sub-training set (40% of the pre-processed training data)
and a validation set (60% of the pre-processed training data).
2. The GP tree is evaluated and the new features are extracted.
3. The KNN classifier is trained on the sub-training set processed by
the new features.
4. The trained KNN classifier classifies the validation set processed
by the new features, and the misclassification value e_i is obtained:
e_i = (N_validation - N_correct) / N_validation,
where N_validation is the number of samples in the validation set, and
N_correct is the number of samples correctly classified.
5. Steps 1-4 are repeated several times and the mean value of e_i is
calculated; this mean is the fitness value of the individual.
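The fitness computation can be sketched as follows. This is a stdlib-only illustration with a hand-rolled nearest-neighbour classifier; the function names, the number of repeats, and the toy data are assumptions, not the authors' GPLab code.

```python
import math
import random

def knn_predict(train_X, train_y, q, k=1):
    """Classify q by majority vote among its k nearest training samples."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], q))
    votes = [train_y[i] for i in order[:k]]
    return max(set(votes), key=votes.count)

def fitness(X, y, n_repeats=5, k=1, seed=0):
    """Mean misclassification rate e_i of a KNN over repeated random 40/60
    splits of the training data (0 is a perfect, run-terminating fitness)."""
    rng = random.Random(seed)
    errors = []
    for _ in range(n_repeats):
        idx = list(range(len(X)))
        rng.shuffle(idx)
        cut = int(0.4 * len(idx))                       # 40% sub-training set
        sub, val = idx[:cut], idx[cut:]                 # 60% validation set
        correct = sum(
            knn_predict([X[i] for i in sub], [y[i] for i in sub], X[j], k) == y[j]
            for j in val)
        errors.append((len(val) - correct) / len(val))  # e_i
    return sum(errors) / len(errors)

# Toy demo: six samples in a hypothetical 2-D GP-based feature space.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]]
y = [0, 0, 0, 1, 1, 1]
print(fitness(X, y))
```

Because the splits are re-drawn on every repeat, averaging e_i over several repeats reduces the variance of the fitness estimate, which is the point of step 5.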
After the GP evolution procedure ends, the best-of-all individual
is obtained. Consequently, the best new features for facilitating
classification on the training data are used as the output of the
GP based feature extraction system; these are called GP-based features.
One point should be mentioned: in the current work, the misclassification
rate of a KNN classifier is used to evaluate the fitness of GP individuals,
so the GP-based features obtained are optimized only for the KNN classifier.
Although it is possible that the GP-based features work well with other
classifiers, the performance cannot be guaranteed. However, the idea of
this methodology can be extended to other classification systems, such
as artificial neural networks.
4.3. Classication
Because the GP-based features are derived from the training
data, their performance on the classification problem has to be verified
on the test data. Since the KNN classifier was selected as the
component of the GP based feature extraction system, the KNN classifier
Fig. 4. Calculating the fitness of each individual in GP.
Table 1
Frequency bands of EEG signals with 4-level DWT decomposition.
Sub-signals   Frequency bands (Hz)   Decomposition level
D1            43.4–86.8              1
D2            21.7–43.4              2
D3            10.8–21.7              3
D4            5.4–10.8               4
A4            0–5.4                  4
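Each DWT level halves the remaining low-frequency band, so the edges in Table 1 follow from successively halving the Nyquist frequency (173.6/2 = 86.8 Hz). A quick stdlib check of that relationship:

```python
fs = 173.6                    # sampling frequency of the EEG recordings (Hz)
high = fs / 2                 # Nyquist frequency, 86.8 Hz
bands = []
for level in range(1, 5):     # detail sub-signals D1..D4
    low = high / 2
    bands.append((f"D{level}", round(low, 1), round(high, 1)))
    high = low
bands.append(("A4", 0.0, round(high, 1)))  # final approximation sub-signal
print(bands)
# -> [('D1', 43.4, 86.8), ('D2', 21.7, 43.4), ('D3', 10.8, 21.7),
#     ('D4', 5.4, 10.8), ('A4', 0.0, 5.4)]
```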
Fig. 5. Approximation and details of a sample normal EEG signal.
is also chosen as the classification system to verify the performance
of the GP-based features on the test data.
5. Results
In this section, the results of implementing the developed
methodology on the specific classification problems described in
Section 3 are discussed.
5.1. Original feature database
The original features are created in two steps. In the first
step, the discrete wavelet transform is applied to decompose the EEG
signal into several sub-signals within different frequency bands.
Selecting the number of decomposition levels and a suitable wavelet
function is also important for EEG signal analysis with DWT. In the
current research, the number of decomposition levels is chosen as 4,
as recommended by other works (Subasi, 2006), and the wavelet function
selected is Daubechies of order 4, which was also proven to be the most
suitable wavelet function for epileptic EEG signal analysis (Subasi, 2006).
The frequency bands corresponding to 4-level DWT decomposition with a
sampling frequency of 173.6 Hz on the EEG signal are shown in Table 1.
Figs. 5–7 show the five sub-signals (one approximation A4
and four details D1–D4) of a sample normal EEG signal (set Z), an
epileptic seizure-free EEG signal (set F) and an epileptic seizure EEG
signal (set S), respectively.
In the second step, after one raw EEG signal has been decomposed into
five sub-signals, which individually correspond to the different
frequency bands described in Table 1, the five classic measures
explained in Section 4.1 are calculated on each sub-signal to form the
original feature database. The dimension of the original feature space
is therefore 5 × 5 = 25. In order to simplify the description, a vector
X containing 25 variables corresponding to the components of the
original features is defined. The definition of each variable is shown
in Table 2.
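A minimal sketch of this second step, computing the five classic measures per sub-signal and concatenating them into the 25-dimensional vector X. The exact formula for curve length is an assumption of this sketch (sum of absolute successive differences); the DWT itself is taken as already done.

```python
import math

def classic_measures(sub):
    """Mean, standard deviation, energy, curve length and skewness of one
    DWT sub-signal (the five measures of Table 2; curve length here is the
    sum of absolute successive differences, an assumption of this sketch)."""
    n = len(sub)
    mean = sum(sub) / n
    var = sum((v - mean) ** 2 for v in sub) / n
    std = math.sqrt(var)
    energy = sum(v * v for v in sub)
    curve_len = sum(abs(b - a) for a, b in zip(sub, sub[1:]))
    skew = (sum((v - mean) ** 3 for v in sub) / n) / std ** 3 if std else 0.0
    return [mean, std, energy, curve_len, skew]

def original_features(sub_signals):
    """Concatenate the five measures over the five sub-signals D1..D4, A4,
    giving the 25-dimensional original feature vector X of Table 2."""
    feats = []
    for sub in sub_signals:
        feats.extend(classic_measures(sub))
    return feats

# Five toy sub-signals standing in for D1..D4 and A4 of one EEG epoch.
subs = [[float(i % 7) for i in range(64)] for _ in range(5)]
print(len(original_features(subs)))  # -> 25
```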
5.2. GP configurations
Before applying genetic programming to solve a real-world
problem, several types of parameters need to be defined in
advance.
5.2.1. The function set and the terminal set
Although there are many different functions which can be
used in GP, normally only a small subset of them is used simultaneously,
because the size of the search space increases exponentially
with the size of the function set. The function set employed in this
work is listed in Table 3. Since the functions need to satisfy the
closure property in GP, the square root, logarithm, and division
operators are implemented in a protected way. Protected division works
identically to ordinary division except that it outputs the value of
the numerator when its denominator is zero. In order to avoid complex
values, the protected square root operator applies an absolute value
operator to its input before taking the square root. Similarly, the
protected logarithm outputs zero for an argument of zero and applies
an absolute value operator to handle negative arguments.
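The three protected operators described above can be written directly (a short sketch; the function names are illustrative):

```python
import math

def pdiv(a, b):
    """Protected division: return the numerator when the denominator is zero."""
    return a if b == 0 else a / b

def psqrt(a):
    """Protected square root: take |a| first to avoid complex results."""
    return math.sqrt(abs(a))

def plog(a):
    """Protected natural log: 0 for a zero argument, |a| for negative ones."""
    return 0.0 if a == 0 else math.log(abs(a))

print(pdiv(3.0, 0.0), psqrt(-9.0), plog(0.0))  # -> 3.0 3.0 0.0
```

Protection like this guarantees that any tree the genetic operators produce evaluates to a finite real number, which is exactly the closure property the text refers to.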
Regarding the terminal set of this work, it includes 25 variables,
whose definitions are given in Table 2, and one random constant.
By essentially inputting the full original feature database to the
GP, it is expected that GP can explore and exploit
Fig. 6. Approximation and details of a sample epileptic seizure-free EEG signal.
the EEG signals, thereby increasing the chance of finding more
informative and successful features to discriminate EEGs.
5.2.2. The control parameters
These parameters are used to control the GP run. There exist
many possible combinations of the control parameters. For solving
the given problems, several different combinations of parameters
were tried; the parameters that returned the best results are shown
in Table 4.
5.2.3. The termination criteria
The GP execution is terminated when one of the following criteria
is met: the mean misclassification rate of the KNN classifier on the
training data is zero, or the maximum number of generations is reached.
GPLab software (Silva, 2007), a popular genetic programming toolbox
for Matlab, is employed in this work.
5.3. Results
Two different epileptic EEG classification problems from the
dataset described in Section 3 are used to verify the developed
methodology: the first is two-class classification (normal and
seizure), and the second is three-class classification (normal,
seizure-free and seizure). The raw dataset is pre-processed by
the original features.
5.3.1. Classification performance and input feature dimension
The classification accuracies of the 50 executions on the two
classification problems are shown in Fig. 8. The procedure for
calculating the classification accuracy is:
Fig. 7. Approximation and details of a sample epileptic seizure EEG signal.
Table 2
Definition of variables in the original feature space X.
Classic measure      Frequency bands (Hz)
                     43.4–86.8   21.7–43.4   10.8–21.7   5.4–10.8   0–5.4
Mean                 x1          x6          x11         x16        x21
Standard deviation   x2          x7          x12         x17        x22
Energy               x3          x8          x13         x18        x23
Curve length         x4          x9          x14         x19        x24
Skewness             x5          x10         x15         x20        x25
Table 3
The function set.
Name   Number of arguments   Operation
+      2                     Arithmetic add
−      2                     Arithmetic subtract
×      2                     Arithmetic multiply
/      2                     Protected division
log    1                     Protected natural logarithm
sqrt   1                     Protected square root
F      1                     Outputs the value of its input
Table 4
Control parameters in GP.
Initial population generation   Ramped half-and-half method
Maximum tree depth              9
Population size                 300
Number of generations           10
Crossover probability           80%
Mutation probability            20%
1. 30% of the whole pre-processed dataset is randomly selected and
used for obtaining the GP-based features.
2. The remaining 70% of the pre-processed dataset is processed with
and without the GP-based features. Each version is then randomly
split into a training subset (40%) and a testing subset (60%).
3. The KNN classifier is trained on the training subset, with and
without the GP-based features; two KNN classifiers are therefore
available for comparison.
4. The two trained KNN classifiers classify the testing subset with
and without the GP-based features, respectively, and the
classification accuracy is obtained.
5. Steps 2, 3 and 4 are repeated 300 times, and the mean classification
accuracy is taken as the result of one execution in Fig. 8.
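The repeated-split part of this protocol (steps 2-5) can be sketched generically; the classifier is abstracted as a scoring function, and the names below are illustrative rather than the authors' code:

```python
import random

def mean_accuracy(data, score_fn, n_repeats=300, seed=0):
    """Steps 2-5: repeatedly split the evaluation data 40/60 into training
    and testing subsets and average score_fn(train, test) over the repeats."""
    rng = random.Random(seed)
    accs = []
    for _ in range(n_repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        cut = int(0.4 * len(shuffled))      # 40% training subset
        train, test = shuffled[:cut], shuffled[cut:]
        accs.append(score_fn(train, test))  # e.g. KNN accuracy on `test`
    return sum(accs) / n_repeats

# Step 1 happens outside this function: 30% of the pre-processed dataset is
# reserved for obtaining the GP-based features, and the protocol below runs
# on the remaining 70%, once with and once without those features.
data = list(range(100))                      # stand-in for the 70% split
constant_scorer = lambda train, test: 1.0    # placeholder for a real KNN scorer
print(mean_accuracy(data, constant_scorer))  # -> 1.0
```

Running this once with a scorer built on the GP-based features and once with a scorer on the raw 25-dimensional features gives the paired accuracies plotted in Fig. 8.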
In Fig. 8, "GP–KNN classifier" and "KNN-alone classifier"
represent a KNN classifier with and without the GP-based features,
respectively. The upper graph in Fig. 8 compares the classification
accuracy of the two classifiers on the two-class EEG classification;
the bottom graph compares them on the three-class EEG classification.
From Fig. 8, it is obvious that the GP–KNN classifier has much
higher classification accuracy than the KNN-alone classifier on both
classification problems. This means that the GP-based features
greatly improve the discrimination performance of the KNN
classifier.
Another objective of the developed methodology is to reduce
the input feature dimension. The dimension of the original features
without GP based feature extraction is 25, as already mentioned in
Section 5.1. Fig. 9 depicts the dimension of the GP-based features
obtained in each of the 50 executions: for two-class EEG
classification, the maximum dimension of the GP-based features is 4,
and most are only 2 or 3. For three-class EEG classification, the
maximum dimension is 6, and most are from 2 to 5.
Table 5 summarizes the results of Figs. 8 and 9. The values of
classification accuracy and input feature dimension in the table are
mean values over the 50 executions. From the table, it is obvious
that with GP based feature extraction:
1. The classification performance of the KNN classifier has been
significantly improved. For two-class classification, the
improvement is more than 10%; for three-class classification,
it is more than 25%.
2. The input feature dimension for the classifier is enormously
reduced. The average number of input features is down to
2.32 for the two-class classification problem and 3.48 for the
three-class one. Most executions of the two-class problem required
2 or 3 input features, hence the average of 2.32; for the
three-class problem most required 3 or 4, hence the average of
3.48. It is logical that the input feature dimension for three-class
classification is higher than that of two-class, since the
three-class classification problem is more complicated and needs
more features to discriminate the different patterns.
5.3.2. GP-based features expression and analysis
In most cases, each execution of the genetic programming based
feature extraction obtained different expressions and a different
Fig. 8. Classification accuracy comparison of the GP–KNN classifier and the KNN-alone classifier on the testing subset of the two epileptic EEG classification problems.
number of GP-based features. The expressions may be as simple as
just one of the variables, or a complicated function of many
variables and constants. Several examples of the GP-based feature
expressions obtained in this work are listed below:
Normal–seizure EEG classification:
(√x24 - x17) - √x19;   x22;   |log(x24)|;   x19;   x24;   x14.
Normal–seizure free–seizure EEG classification:
x24;   x7/x19;   x7;   |log(x9)|;   x19/0.25355;   x19;   x24;   x12;   x7 - x19 - x24.
From the above expressions, it can be seen that the features extracted
by genetic programming are intuitively difficult to interpret. This
shows that GP can find combinations of the input variables and
Fig. 9. The dimension of GP-based features over 50 executions for the two epileptic EEG classification problems.
Table 5
Classification accuracy and input feature dimension of the two KNN classifiers on the two classification problems.
                       Normal and seizure EEG classification          Normal, seizure-free and seizure EEG classification
                       Accuracy (std)    Input feature dimension      Accuracy (std)    Input feature dimension
KNN-alone classifier   88.6% (1.12)      25                           67.2% (1.17)      25
GP–KNN classifier      99.2% (0.49)      2.32                         93.5% (1.20)      3.48
Table 6
Most important measures selected by GP for epileptic EEG classification.
Measure (frequency band in Hz)                             Normal–seizure   Normal–seizure free–seizure
x24  Curve length of the signal within 0–5.4               ✓                ✓
x22  Standard deviation of the signal within 0–5.4         ✓
x17  Standard deviation of the signal within 5.4–10.8      ✓                ✓
x12  Standard deviation of the signal within 10.8–21.7                      ✓
x19  Curve length of the signal within 5.4–10.8            ✓                ✓
x7   Standard deviation of the signal within 21.7–43.4                      ✓
x9   Curve length of the signal within 21.7–43.4                            ✓
functions that would not be found by humans. After examining all
the expressions of the GP-based features obtained in this work, it
can be found that only several measures within the original feature
database are useful for epileptic EEG classification. The selected
measures are listed in Table 6.
5.4. Comparison with other works
Many other methods have been proposed for epileptic EEG
signal classification, as described in Section 2.6. Table 7
presents a comparison between the results of the method developed
in this work and those of other proposed methods; only methods
evaluated on the same dataset are included. Both the classification
accuracy and the input feature dimension for the classifier are
listed in the table. For the two-class classification problem, the
accuracy obtained with our method is the second best presented for
this dataset, but the number of features required is the smallest
among the methods whose classification accuracy is higher than 99%.
For the three-class problem, the accuracy obtained with our method
is also the second best presented for this dataset, while the number
of features necessary for our method is the smallest.
6. Conclusions
In this work, genetic programming is applied to extract new features
from an original feature database for a classification problem. In
comparison with other pattern classification methods, genetic
programming based feature extraction automatically determines:
the dimensionality of the new features,
which measures in the original feature database are useful for
epileptic EEG classification,
how to non-linearly transform the selected original measures into
new features.
Implementation results showed that the proposed method significantly
improved the performance of the KNN classifier. Furthermore, a huge
reduction of the input feature dimension was also achieved by the
developed method: for the two problems given in this paper, three and
four features, respectively, are sufficient to attain high classification
accuracy. In addition, through natural evolution, the informative
measures useful for discrimination are selected by GP from the original
feature database. As for the features extracted by GP, it is hard to
explain their physical meaning. This shows that GP can discover hidden
relationships among the terminal and function sets which are difficult
for humans to find.
The limitation of the proposed approach is that the GP-based feature
extraction system is computationally expensive. An increase in the size
of the original feature database, as well as in the number of training
samples, would bring about a significant increase in the computation
cost, which makes the developed method inappropriate for real-time
applications.
7. Future work
Feature extraction using genetic programming has shown great
success on epileptic EEG classification problems. The same method
can be applied to a wider range of pattern recognition problems
that are important to humans, such as the detection and diagnosis
of Alzheimer's and Parkinson's diseases. The current methodology
was derived from the combination of genetic programming and the KNN
classifier; further work could investigate the performance of genetic
programming in combination with other classifier systems, such as
artificial neural networks. These are all interesting directions for
future work.
Acknowledgements
Ling Guo was financially supported through a fellowship of the
Agencia Española de Cooperación Internacional (AECI) and the
Spanish Ministry of Foreign Affairs.
References
Adeli, H., Zhou, Z., & Dadmehr, N. (2003). Analysis of EEG records in an epileptic
patient using wavelet transform. Journal of Neuroscience Methods, 123(1), 69–87.
Andrzejak, R., Lehnertz, K., Mormann, F., Rieke, C., David, P., & Elger, C. (2001).
Indications of nonlinear deterministic and finite-dimensional structures in time
series of brain electrical activity: Dependence on recording region and brain
state. Physical Review E, 64(6), 061907-1–061907-8.
Blankertz, B., Curio, G., & Müller, K. (2002). Classifying single trial EEG: Towards
brain computer interfacing. In Advances in neural information processing systems:
Proceedings of the 2002 conference (pp. 157–164). Cambridge, Massachusetts:
MIT Press.
Bot, M. (2001). Feature extraction for the k-nearest neighbour classifier with genetic
programming. Lecture Notes in Computer Science, 256–267.
Bot, M., & Langdon, W. (2000). Application of genetic programming to induction of
linear classification trees. In Genetic programming, proceedings of EuroGP'2000
(pp. 247–258). Berlin, Heidelberg: Springer-Verlag.
Chui, C. (1992). An introduction to wavelets. Boston: Academic Press.
Cover, T., & Hart, P. (1967). Nearest neighbor pattern classification. IEEE Transactions
on Information Theory, 13(1), 21–27.
Darwin, C. (1864). On the origin of species by means of natural selection or the
preservation of favoured races in the struggle for life. Cambridge, UK: Cambridge
University Press.
Duda, R., Hart, P., & Stork, D. (2001). Pattern classification. New York: Wiley.
Ebner, M., & Rechnerarchitektur, A. (1998). On the evolution of interest operators
using genetic programming. In Late breaking papers at EuroGP'98: The first
European workshop on genetic programming (pp. 6–10).
Ebner, M., & Zell, A. (1999). Evolving a task specific image operator. Lecture Notes in
Computer Science, 74–89.
Espejo, P., Ventura, S., & Herrera, F. (2010). A survey on the application of genetic
programming to classification. IEEE Transactions on Systems, Man, and
Cybernetics, Part C: Applications and Reviews, 40(2), 121–144.
Firpi, H., Goodman, E., & Echauz, J. (2006). On prediction of epileptic seizures by
means of genetic programming artificial features. Annals of Biomedical
Engineering, 34(3), 515–529.
Fukunaga, K. (1990). Introduction to statistical pattern recognition (2nd ed.). Boston:
Academic Press.
Table 7
The classification accuracy and the number of input features required for the epileptic EEG classification of our method compared to the results of other methods.
Researchers                  Method                                                         Dataset   Accuracy (%)   Number of input features
Srinivasan et al. (2005)     Time & frequency domain features – recurrent neural network    Z, S      99.6           5
Polat and Güneş (2007)       Fast Fourier transform – decision tree                         Z, S      98.72          129
Nigam and Graupe (2004)      Nonlinear pre-processing filter – diagnostic neural network    Z, S      97.2           2
Subasi (2007)                Discrete wavelet transform – mixture of expert model           Z, S      95             16
Tzallas et al. (2007)        Time–frequency analysis – artificial neural network            Z, S      99             13
This work                    GP-based feature extraction – KNN classifier                   Z, S      99.2           2.32
Sadati et al. (2006)         Discrete wavelet transform – adaptive neural fuzzy network     Z, F, S   85.9           6
Tzallas et al. (2007)        Time–frequency analysis – artificial neural network            Z, F, S   98.6           22
Übeyli (2008b)               Discrete wavelet transform – mixture of experts network        Z, F, S   93.17          30
This work                    GP-based feature extraction – KNN classifier                   Z, F, S   93.5           3.48
Garrett, D., Peterson, D., Anderson, C., & Thaut, M. (2003). Comparison of linear and
nonlinear methods for EEG signal classification. IEEE Transactions on Neural
Systems and Rehabilitation Engineering, 11(2), 141–144.
Güler, N., Übeyli, E., & Güler, I. (2005). Recurrent neural networks employing
Lyapunov exponents for EEG signals classification. Expert Systems with
Applications, 29(3), 506–514.
Guo, H., Jack, L., & Nandi, A. (2005). Feature generation using genetic programming
with application to fault classification. IEEE Transactions on Systems, Man, and
Cybernetics, Part B: Cybernetics, 35(1), 89–99.
Harvey, N., Theiler, J., Brumby, S., Perkins, S., Szymanski, J., Bloch, J., et al. (2002).
Comparison of GENIE and conventional supervised classifiers for multispectral
image feature extraction. IEEE Transactions on Geoscience and Remote Sensing,
40(2), 393–404.
Işik, H., & Sezer, E. (in press). Diagnosis of epilepsy from electroencephalography
signals using multilayer perceptron and Elman artificial neural networks and
wavelet transform. Journal of Medical Systems. doi:10.1007/s10916-010-9440-0.
Jahankhani, P., Kodogiannis, V., & Revett, K. (2006). EEG signal classification using
wavelet feature extraction and neural networks. In IEEE John Vincent Atanasoff
2006 international symposium on modern computing (JVA'06) (pp. 52–57).
Kalayci, T., & Ozdamar, O. (1995). Wavelet preprocessing for automated neural
network detection of EEG spikes. IEEE Engineering in Medicine and Biology
Magazine, 14(2), 160–166.
Kandel, E., Schwartz, J., & Jessell, T. (2000). Principles of neural science. New York:
McGraw-Hill, Health Professions Division.
Kotani, M., Nakai, M., & Akazawa, K. (1999). Feature extraction using evolutionary
computation. In Proceedings of the 1999 congress on evolutionary computation,
CEC 99 (Vol. 2, pp. 1230–1236).
Koza, J. (1992). Genetic programming: On the programming of computers by means of
natural selection. Cambridge, Massachusetts: MIT Press.
Krawiec, K. (2002). Genetic programming-based construction of features for
machine learning and knowledge discovery tasks. Genetic Programming and
Evolvable Machines, 3(4), 329–343.
Mallat, S. (1989). A theory for multiresolution signal decomposition: The wavelet
representation. IEEE Transactions on Pattern Analysis and Machine Intelligence,
11(7), 674–693.
Micheli-Tzanakou, E. (2000). Supervised and unsupervised pattern recognition:
Feature extraction and computational intelligence. Boca Raton, FL: CRC Press.
Mohseni, H., Maghsoudi, A., Kadbi, M., Hashemi, J., & Ashourvan, A. (2006).
Automatic detection of epileptic seizure using time-frequency distributions. In
IET 3rd international conference on advances in medical, signal and information
processing, MEDSIP 2006 (pp. 1–4).
Nigam, V., & Graupe, D. (2004). A neural-network-based detection of epilepsy.
Neurological Research, 26(1), 55–60.
Ocak, H. (2009). Automatic detection of epileptic seizures in EEG using discrete
wavelet transform and approximate entropy. Expert Systems with Applications,
36(2), 2027–2036.
Polat, K., & Güneş, S. (2007). Classification of epileptiform EEG using a hybrid
system based on decision tree classifier and fast Fourier transform. Applied
Mathematics and Computation, 187(2), 1017–1026.
Rabuñal, J., Dorado, J., Puertas, J., Pazos, A., Santos, A., & Rivero, D. (2003). Prediction
and modelling of the rainfall–runoff transformation of a typical urban basin
using ANN and GP. Applied Artificial Intelligence, 17(4), 329–343.
Raymer, M., Punch, W., Goodman, E., & Kuhn, L. (1996). Genetic programming for
improved data mining: Application to the biochemistry of protein interactions.
In Proceedings of the first annual conference on genetic programming
(pp. 375–380). Cambridge, Massachusetts: MIT Press.
Rivero, D., Rabuñal, J., Dorado, J., & Pazos, A. (2005). Time series forecast with
anticipation using genetic programming. In Computational intelligence and
bioinspired systems, 8th international work-conference on artificial neural
networks (pp. 968–975). Berlin, Heidelberg: Springer.
Sabeti, M., Katebi, S., & Boostani, R. (2009). Entropy and complexity measures for
EEG signal classification of schizophrenic and control participants. Artificial
Intelligence in Medicine, 47(3), 263–274.
Sadati, N., Mohseni, H., & Maghsoudi, A. (2006). Epileptic seizure detection using
neural fuzzy networks. In 2006 IEEE international conference on fuzzy systems
(pp. 596–600).
Sherrah, J. (1998). Automatic feature extraction for pattern recognition. Ph.D. Thesis,
The University of Adelaide.
Silva, S. (2007). GPLAB – a genetic programming toolbox for MATLAB.
Srinivasan, V., Eswaran, C., & Sriraam, N. (2005). Artificial neural network based
epileptic detection using time-domain and frequency-domain features. Journal
of Medical Systems, 29(6), 647–660.
Subasi, A. (2005a). Automatic recognition of alertness level from EEG by using neural
network and wavelet coefficients. Expert Systems with Applications, 28(4), 701–711.
Subasi, A. (2005b). Epileptic seizure detection using dynamic wavelet network.
Expert Systems with Applications, 29(2), 343–355.
Subasi, A. (2006). Automatic detection of epileptic seizure using dynamic fuzzy
neural networks. Expert Systems with Applications, 31(2), 320–328.
Subasi, A. (2007). EEG signal classification using wavelet feature extraction and a
mixture of expert model. Expert Systems with Applications, 32(4), 1084–1093.
Tackett, W. (1993). Genetic programming for feature discovery and image
discrimination. In Proceedings of the fifth international conference on genetic
algorithms, ICGA-93 (pp. 303–309).
Tzallas, A., Tsipouras, M., & Fotiadis, D. (2007). A time-frequency based method for
the detection of epileptic seizures in EEG recordings. In 20th IEEE international
symposium on computer-based medical systems, CBMS'07 (pp. 135–140).
Übeyli, E. (2006a). Analysis of EEG signals using Lyapunov exponents. Neural
Network World, 16(3), 257–273.
Übeyli, E. (2006b). Fuzzy similarity index employing Lyapunov exponents for
discrimination of EEG signals. Neural Network World, 16(5), 421–431.
Übeyli, E. (2008a). Analysis of EEG signals by combining eigenvector methods and
multiclass support vector machines. Computers in Biology and Medicine, 38(1),
14–22.
Übeyli, E. (2008b). Wavelet/mixture of experts network structure for EEG signals
classification. Expert Systems with Applications, 34(3), 1954–1962.
Übeyli, E. (2009). Combined neural network model employing wavelet coefficients
for EEG signals classification. Digital Signal Processing, 19(2), 297–308.
Übeyli, E., & Güler, I. (2007). Features extracted by eigenvector methods for
detecting variability of EEG signals. Pattern Recognition Letters, 28(5), 592–603.