
*Corresponding author. Tel.: +61-7-3365-3984; fax: +61-7-3365-4999.

E-mail address: ajantha@csee.uq.edu.au (A.S. Atukorale).


Neurocomputing 35 (2000) 165–176
Hierarchical overlapped neural gas network with
application to pattern classification
Ajantha S. Atukorale*, P.N. Suganthan¹
Department of Computer Science and Electrical Engineering, University of Queensland,
Brisbane, Qld. 4072, Australia
¹School of Electrical and Electronic Engineering, Nanyang Technological University, 639798,
Republic of Singapore
Received 12 June 1999; accepted 13 April 2000
Abstract
This paper describes our investigations into the neural gas (NG) network. The original neural
gas network is computationally expensive, as an explicit ordering of all distances between
synaptic weights and the training sample is necessary. This has a time complexity of O(Nlog N)
in its sequential implementation. An alternative scheme was proposed for the above explicit
ordering where it is carried out implicitly. In addition, a truncated weight updating rule was
used similar to Choy and Siu (IEEE Trans. Communications 46 (3) (1998) 301}304). By
implementing the above modi"cations, the NG algorithm was made to run faster in its
sequential implementation. A hierarchical overlapped neural gas architecture was developed on
top of the above modi"ed NG algorithm for the classi"cation of real world handwritten
numerals with high variations. This allowed us to obtain multiple classi"cations for each
sample presented, and the "nal classi"cation was made by fusing the individual classi"cations.
An excellent recognition rate for the NIST SD3 database was consequently obtained. 2000
Elsevier Science B.V. All rights reserved.
Keywords: Character recognition; Hierarchical overlapped architecture; Multiple classi"er
fusion; Neural gas network; Self-organizing maps
1. Introduction
Neural network models have been intensively studied for many years in an effort to
obtain superior performance compared to classical approaches. The self-organizing
feature map (SOFM) proposed by Kohonen [12] is one of the network paradigms
widely used for solving complex problems such as vector quantization, speech recog-
nition, combinatorial optimization, pattern recognition and modeling the structure of
the visual cortex. Kohonen's feature map is a special way of conserving the topologi-
cal relationships in input data, which also bears some limitations. In this model the
neighborhood relations between neural units have to be defined in advance. The
topology of the input space has to match the topology of the output space which is to
be represented. In addition, the dynamics of the SOFM algorithm cannot be described
as a stochastic gradient descent on any energy function. A set of energy functions, one
for each weight vector, seems to be the best description of the dynamics of the
algorithm [5].
Martinetz et al. [13] and Martinetz and Schulten [14] proposed the neural gas
(NG) network algorithm for vector quantization, prediction and topology representa-
tion a few years ago. The NG network model: (1) converges quickly to low-distortion
errors, (2) reaches a distortion error lower than that resulting from K-means cluster-
ing, maximum-entropy clustering and Kohonen's SOFM, (3) obeys a gradient descent
on an energy surface. Similar to the SOFM algorithm the NG algorithm uses
a soft-max adaptation rule (i.e., it not only adjusts the winner reference vector, but also
affects all the cluster centers depending on their proximity to the input signal). This is
mainly to generate a topographic map and also to avoid confinement to local minima
during the adaptation procedure. Despite all those advantages, the NG network
algorithm suffers from a high time complexity problem in its sequential implementa-
tion [4].
In this paper, we discuss how such a time complexity problem associated with the
NG algorithm can be reduced efficiently. In addition, by defining a hierarchical
overlapped structure [15] on top of the standard NG network, we obtained a hier-
archical overlapped neural gas (HONG) network model for the classification of
real-world handwritten digits with high variations.
The paper is organized as follows. After a brief review of the NG network algorithm
in Section 2, the proposed speed-up technique is presented in Section 3. In Section 4,
the functionality of the HONG network architecture is discussed, and in Section 5, the
experimental results are presented. The paper is concluded with a brief discussion in
Section 6.
2. The neural gas algorithm
In the neural gas algorithm, the synaptic weights w_i are adapted without any fixed topological arrangement of the neural units within the network. Instead, it utilizes a neighborhood-ranking of the synaptic weights w_i for a given data vector v. The synaptic weight changes Δw_i are not determined by the relative distances between the neural units within a topologically prestructured lattice, but by the relative distances between the neural units within the input space: hence the name neural gas network.

Information about the arrangement of the receptive fields within the input space is implicitly given by a set of distortions, D_v = {‖v − w_i‖, i = 1, …, N}, associated with each v, where N is the number of units in the network [14]. Each time an input signal v is presented, the ordering of the elements of the set D_v is necessary to determine the adjustment of the synaptic weights w_i. This ordering has a time complexity of O(N log N) in its sequential implementation. The resulting adaptation rule can be described as a winner-take-most instead of winner-take-all rule.
The presented input signal v is received by each neural unit i and induces excitations f_i(D_v) which depend on the set of distortions D_v. Assuming a Hebb-like rule, the adaptation step for w_i is given by adjusting

Δw_i = ε f_i(D_v)(v − w_i),   i = 1, …, N.   (1)
The step size ε ∈ [0,1] describes the overall extent of the modification (learning rate) and f_i(D_v) ∈ [0,1] accounts for the topological arrangement of the w_i within the input space. Martinetz and Schulten [14] reported that an exponential function exp(−k_i/λ) should give the best overall result for the excitation function f_i(D_v), compared to other choices like Gaussians, where λ determines the number of neural units significantly changing their synaptic weights with the adaptation step (1). The rank index k_i = 0, …, (N−1) describes the neighborhood-ranking of the neural units, with k_i = 0 being the closest synaptic weight (w_{i0}) to the input signal v, k_i = 1 being the second closest synaptic weight (w_{i1}) to v, and so on. That is, the set {w_{i0}, w_{i1}, …, w_{i(N−1)}} is the neighborhood-ranking of w_i relative to the given input vector v. The neighborhood-ranking index k_i depends on v and the whole set of synaptic weights W = {w_1, w_2, …, w_N}, and we denote this as k_i(v, W). The original NG network algorithm is summarized below.
NG1: Initialize the synaptic weights w_i randomly and the training parameter values (ε_i, ε_f, λ_i, λ_f), where ε_i, λ_i are the initial values of ε(t), λ(t) and ε_f, λ_f are the final values of ε(t), λ(t).
NG2: Present an input vector v and compute the distortions D_v.
NG3: Order the distortion set D_v in ascending order.
NG4: Adapt the weight vectors according to

Δw_i = ε(t) h_λ(k_i(v, W))(v − w_i),   i = 1, …, N,   (2)

where the parameters have the following time dependencies:

ε(t) = ε_i (ε_f/ε_i)^(t/t_max),   λ(t) = λ_i (λ_f/λ_i)^(t/t_max),   h_λ(k_i) = exp(−k_i/λ(t)).

NG5: Increment the time parameter t by 1.
NG6: Repeat NG2–NG5 until the maximum iteration t_max is reached.
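To make the procedure concrete, the following sketch implements one adaptation step of the original rule (2), including the explicit O(N log N) ranking of step NG3. It is an illustrative NumPy version, not the authors' implementation; the function name and the default parameter values are placeholders.

```python
import numpy as np

def ng_step(W, v, t, t_max, eps_i=0.5, eps_f=0.05, lam_i=10.0, lam_f=0.01):
    """One adaptation step of the original NG rule (2).

    W is an (N, d) array of synaptic weights and v a d-dimensional input;
    the default schedules are placeholders, not the values used in the paper.
    """
    eps = eps_i * (eps_f / eps_i) ** (t / t_max)    # epsilon(t)
    lam = lam_i * (lam_f / lam_i) ** (t / t_max)    # lambda(t)
    d = np.linalg.norm(W - v, axis=1)               # distortion set D_v
    k = np.argsort(np.argsort(d))                   # rank index k_i (0 = winner): the O(N log N) step
    W += eps * np.exp(-k / lam)[:, None] * (v - W)  # Delta w_i = eps(t) h_lambda(k_i) (v - w_i)
    return W
```

The double argsort realizes the explicit neighborhood-ranking; this is precisely the cost that the implicit scheme of Section 3 removes.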
3. Implicit ranking scheme
The NG network algorithm suffers from a high time complexity problem in its
sequential implementation [14,4]. In the original neural gas network, an explicit
ordering of all distances between synaptic weights and the training sample was
necessary. This has a time complexity of O(N log N) in a sequential implementation.
Recently some work has been done on speeding-up procedures for the sequential
implementation of the NG algorithm. Ancona et al. [1] discussed the questions of
sorting accuracy and sorting completeness. With theoretical analysis and experi-
mental evidence, they have concluded that partial, exact sorting (i.e., ordering the top
few winning units correctly and keeping all other units in the list unaffected) performs
better than complete but noisy sorting (i.e., ordering the top few winning units
correctly and subjecting all other remaining units to inexact sorting). Also, they have
concluded that even a few units in partial sorting is sufficient to attain a final
distortion equivalent to that attained by the original NG algorithm. Moreover, they
have concluded that correct identification of the best-matching unit becomes more
and more important while training proceeds. This is obvious, because as training
proceeds, the adaptation step (1) becomes equivalent to the K-means adaptation rule.
Choy and Siu [4] have also applied a partial distance elimination (PDE) method to
speed-up the NG algorithm in the above context.
In our investigations, we eliminated the explicit ordering (NG3 in the above summary) by employing the following implicit ordering metric:

m_i = (d_i − d_min)/(d_max − d_min),   (3)

where d_min and d_max are the minimum and maximum distances between the training sample and all reference units in the network, respectively, and d_i ∈ D_v, i = 1, …, N. The best matching winner unit will then have an index of 0, the worst matching unit will have an index of 1, and the other units will take values between 0 and 1 (i.e., m_i ∈ [0,1]).
By employing the above modification to the original NG algorithm discussed earlier, we modified the two entries NG3 and NG4 as follows.
NG3: Find d_min and d_max from the distortion set D_v.
NG4: Adapt the weight vectors according to

Δw_i = ε(t) h_λ′(m_i(v, W))(v − w_i),   i = 1, …, N,   (4)

where h_λ′(m_i) = exp(−m_i/λ′(t)) and λ′(t) = λ(t)/(N−1).
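A corresponding sketch of the modified step shows where the saving comes from: only d_min and d_max are extracted from D_v, so the ranking cost is O(N) and no sort is performed. As before, this is an illustrative NumPy version with placeholder parameter values, not the authors' code.

```python
import numpy as np

def ng_step_implicit(W, v, t, t_max, eps_i=0.5, eps_f=0.05, lam_i=10.0, lam_f=0.01):
    """One adaptation step using the implicit ranking metric m_i of Eqs. (3) and (4)."""
    N = len(W)
    eps = eps_i * (eps_f / eps_i) ** (t / t_max)              # epsilon(t)
    lam = lam_i * (lam_f / lam_i) ** (t / t_max) / (N - 1)    # lambda'(t) = lambda(t)/(N-1)
    d = np.linalg.norm(W - v, axis=1)                         # distortion set D_v
    m = (d - d.min()) / (d.max() - d.min())                   # implicit rank m_i in [0, 1], Eq. (3)
    W += eps * np.exp(-m / lam)[:, None] * (v - W)            # Eq. (4)
    return W
```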
The modification in (3) sped up the running time of the sequential implementation of the NG algorithm by about 17% on average (see Table 1).

Further modification to the weight update rule in (4) is done by rewriting it as

Δw_i = α(t)(v − w_i),

where α(t) = ε(t) exp(−m_i/λ′(t)). We now update only those neurons with a non-negligible effective overall learning rate α(t), as in [14,4]. Given a threshold for α(t) (say α(t) = 10⁻⁵), the weight update rule in (4) is modified by updating neurons in the following way:

α(t) ≥ 10⁻⁵  ⟹  exp(−m_i/λ′(t)) ≥ 10⁻⁵/ε(t).
Table 1
Comparison of the processing time, number of updates and recognition rate for the original NG algorithm (2), the implicit ranking metric (4), and the truncated weight update rule (6)

                         Original NG     Implicit ranking    Truncated update
Processing time (s)      152.93          27.72               N/A
No. of updates           106,152,000     106,152,000         707,957
Recognition rate (%)     96.83           96.81               96.98
Thus m_i/λ′(t) ≤ 5 log(10) + log(ε(t)). If we let r(t) = 5 log(10) + log(ε(t)), then since ε(t) = ε_i (ε_f/ε_i)^(t/t_max) it follows that

r(t) = 5 log(10) + log(ε_i) + log(ε_f/ε_i) t/t_max.   (5)

That is, update the weight vectors according to the following truncated weight update rule:

Δw_i = ε(t) exp(−m_i/λ′(t))(v − w_i)   if m_i ≤ r(t)λ′(t),
Δw_i = 0                               otherwise,   (6)

where r(t) is a parameter which depends only on t. Because of the above truncation, the weight update rule (4) will update only those weights with non-zero values of Δw_i.
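The truncation test m_i ≤ r(t)λ′(t) can be folded into the same sketch, so that only the few units with a non-negligible effective learning rate are touched; this is where the large reduction in the number of updates reported in Table 1 comes from. The parameter defaults below are again placeholders, and the 10⁻⁵ threshold is the example value used in the derivation above.

```python
import numpy as np

def ng_step_truncated(W, v, t, t_max, eps_i=0.5, eps_f=0.05, lam_i=10.0, lam_f=0.01):
    """Adaptation step with the truncated update rule (6); illustrative only."""
    N = len(W)
    eps = eps_i * (eps_f / eps_i) ** (t / t_max)                            # epsilon(t)
    lam = lam_i * (lam_f / lam_i) ** (t / t_max) / (N - 1)                  # lambda'(t)
    r = 5 * np.log(10) + np.log(eps_i) + np.log(eps_f / eps_i) * t / t_max  # r(t), Eq. (5)
    d = np.linalg.norm(W - v, axis=1)
    m = (d - d.min()) / (d.max() - d.min())                                 # implicit rank m_i
    upd = m <= r * lam                      # truncation test of (6): keep only alpha(t) >= 1e-5
    W[upd] += eps * np.exp(-m[upd] / lam)[:, None] * (v - W[upd])           # update survivors only
    return W
```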
These modifications were able to eliminate the explicit ranking mechanism com-
pletely, and reduce the number of weight updates by about 99% on average (see Table
1). This also sped up the sequential implementation of the NG algorithm by about
96% on average (see Table 1).
4. Hierarchical overlapped neural gas architecture
By retaining the essence of the original NG algorithm and our modification, we
have developed a hierarchical overlapped neural gas (HONG) network algorithm for
labeled pattern recognition.
The structure of the HONG network architecture is an adaptation of the hierarchi-
cal overlapped architecture developed for SOMs by Suganthan [15]. First, the
network is initialized with just one layer which is called the base layer. The number of
neurons in the base layer has to be chosen appropriately. In labeled pattern recogni-
tion applications, the number of distinct classes and the number of training samples
may be considered in the selection of the initial size of the network. Similar to the
SOM architecture, in the HONG network every neuron has an associated synaptic
weight vector which has the same dimension as the input feature vector. Once we have
selected the number of neurons in the base layer, we applied our modified version of the NG algorithm to adapt the synaptic weights of the neurons in the base network. Having completed the unsupervised NG learning, the neurons in the base layer were labeled using a simple voting mechanism. In order to fine tune the labeled network, we applied the supervised learning vector quantization (LVQ) algorithm [12]. The overlaps for each neuron in the base layer were then obtained. For instance, if we had 100 neurons in the base layer network, then we would have 100 separate second layer NG networks, one grown from each neuron in the base layer network (see Fig. 1).

Fig. 1. Hierarchical overlapped architecture showing three second layer NG networks grown from units A, B and C of the base NG network.
The overlapping is achieved by duplicating every training sample to train several
second layer NG networks. That is, the winning neuron as well as a number of
runners-up neurons will make use of the same training sample to train second layer
NG networks grown from those neurons in the base layer NG network. In Fig. 1 for
example, the overlapped NG network grown from neuron A is trained on samples
where the neuron A is the winner or one of the first few runners-up for all the training
samples presented to the base layer network. That is, if we have an overlap of 5 (i.e.,
the winner and 4 runners-up) for the training samples then each training sample is
being used to train five different second layer NG networks. Fig. 1 also shows the
overlap in the feature space of the two overlapped second layer NG networks
conceptually, assuming that the nodes A and B are adjacent to each other in the feature space (as we train up to five second layer NG networks using the same training sample, we claim that there is a partial overlap between several second layer NG networks). Once the partially overlapping training samples were obtained for each
of the second layer NG networks, we trained each of them as we trained the base layer
NG network earlier. The second layer NG networks were then labeled using a simple
voting mechanism. The testing samples were also duplicated, but to a lesser degree
(e.g., 3 times). Hence the testing samples fit well inside the feature space spanned by the
winner and several runners-up in the training data. In addition, this duplication of the
samples assists each HONG network to generate five independent classifications for every training sample and three independent classifications for every testing sample.
In order to combine the outputs [3,11] of the second layer NG networks, we employed the idea of confidence values. That is, we obtained a confidence value for every sample's membership in every class (j) of each overlapped second layer NG network (i.e., five for training and three for testing) using the following:

c_j = 1 − d_j/d_s,   (7)

where d_j is the minimum distance for every class j, d_s = Σ_j d_j and j = 0, …, 9 for numeral classification. This will define a confidence value (c_j) for the input pattern belonging to the jth class of an overlapped second layer NG network. The class which has the global minimum distance yields a confidence value closer to one (in the case of a perfect match, i.e., d_j = 0, the confidence value for that class becomes one). That is, the higher the confidence value for a class, the more likely a sample belongs to that class. We can also consider the above function as a basic probability assignment, because 0 ≤ c_j ≤ 1. We can define C^i, the collection of all ten confidence values of a second layer NG network, as

C^i = {c_j | j = 0, 1, …, 9},   (8)

where i = 1, 2, …, n and n is the number of overlaps considered (e.g., n = 5 for training and n = 3 for testing). Here onwards we refer to this as the confidence vector of that second layer NG network. For example, let us assume that we are considering three overlaps for testing data. Then we get three confidence vectors from the corresponding second layer NG networks. Given the individual confidence vectors, we can calculate the overall confidence vector of the HONG architecture by adding the individual confidence values according to their class label (see Fig. 1). We can then assign the class label of the test data according to the overall confidence vector (i.e., select the index of the maximum confidence value from the vector).
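As a concrete illustration of Eqs. (7) and (8) and of the fusion step, the following hypothetical sketch builds one confidence vector per second layer network from its per-class minimum distances and sums the vectors class-wise; the function names and the random distances are made up for the example.

```python
import numpy as np

def confidence_vector(dists):
    """Confidence values c_j = 1 - d_j/d_s for one second layer NG network,
    where dists[j] is the minimum distance for class j (Eq. (7))."""
    d = np.asarray(dists, dtype=float)
    return 1.0 - d / d.sum()

def fuse(conf_vectors):
    """Overall HONG decision: add the confidence vectors of the overlapped
    second layer networks class-wise and pick the largest entry."""
    overall = np.sum(conf_vectors, axis=0)   # overall confidence vector
    return int(np.argmax(overall))           # predicted class label

# Hypothetical usage with three overlaps (testing): each entry holds the ten
# per-class minimum distances produced by one second layer NG network.
dists = [np.random.rand(10) for _ in range(3)]
label = fuse([confidence_vector(d) for d in dists])
```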
The summary of the HONG network algorithm is given below.
HONG1: Initialize the synaptic weights and training parameters of the base NG network.
HONG2: Train the base NG network using the modified NG algorithm as explained in (4), using the neighborhood function defined in (6).
HONG3: Label the base NG network using a simple voting scheme.
HONG4: Fine tune the base map with the supervised LVQ algorithm.
HONG5: Obtain the overlaps for each unit in the base layer.
HONG6: Initialize the synaptic weights of the second level NG networks around their base layer unit's (i.e., root unit) value.
HONG7: Train each second layer NG network as in HONG2 using the overlapped samples obtained in HONG5.
HONG8: Label each of the second layer NG networks accordingly as in HONG3, and fine tune them as in HONG4.
HONG9: Obtain the final recognition rates by combining the confidence values generated by each of the second layer overlapped NG networks.
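Step HONG5 amounts to routing each training sample to the second layer networks grown from its base layer winner and the first few runners-up. The sketch below is a simplified illustration of that routing (the function name and the brute-force distance computation are our own), not the authors' implementation.

```python
import numpy as np

def assign_overlaps(X, W_base, n_overlap=5):
    """Collect, for every base layer unit, the indices of the training samples
    for which that unit is the winner or one of the first few runners-up."""
    buckets = [[] for _ in range(len(W_base))]
    for idx, v in enumerate(X):
        d = np.linalg.norm(W_base - v, axis=1)
        for unit in np.argsort(d)[:n_overlap]:   # winner plus (n_overlap - 1) runners-up
            buckets[unit].append(idx)
    return buckets
```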
5. Experimental results
5.1. Implicit ranking
Table 1 compares the corresponding processing times, number of updates and the
recognition rates using our proposed implicit ranking metric with those obtained in
the original NG algorithm. The simulations were performed using the parameters
given in Subsection 5.2 on a Pentium II, 350 MHz personal computer.
The overall processing time of the NG algorithm results from two distinct phases.
First, it consists of the distance calculation phase and, second, the sorting or the
ranking phase. Note that the distance calculation time is common to both explicit
sorting and implicit ranking algorithms. In Table 1, the processing time for 'Original NG' (t_s) refers to the time taken by the sorting procedure (using qsort(), the C library routine with time complexity O(N log N)). The common distance calculation time (t_d) taken with the given parameters is 154.51 s.
The percentage improvement in time (or the speed-up) of the implicit ranking metric over the sorting algorithm is given by

speed-up = ((t_s − t_r)/t_s) × 100%,   (9)

where t_s and t_r denote the explicit sorting time and the implicit ranking time, respectively, as shown in Fig. 2. In Table 1, the 'No. of updates' refers to the total number of updates performed in the training phase. This is given by the total number of training samples multiplied by the total number of neurons in the base network multiplied by the total number of iterations (here 106,152 samples × 250 neurons × 4 passes = 106,152,000). The 'Recognition rate' refers to the training rate of the base network using the given samples.
The results in the second and third columns of Table 1 compare the original NG algorithm against the implicit ranking metric defined in (3). This is a comparison between equations (2) and (4). We have achieved a speed-up of 81.87% with the proposed implicit ranking metric ((152.93 − 27.72)/152.93 ≈ 0.8187).

The results in the second and fourth columns of Table 1 compare the original NG algorithm against the truncated weight update rule. This is a comparison between equations (2) and (6). The truncated weight update rule has reduced the number of weight updates vastly, by more than 99%. Since the processing time for this involves the weight update time, we did not report its processing time. Also, this modification has increased the recognition rate slightly. This is due to the fact that very small weight updates are generally noisy, and eliminating them improves recognition accuracy.

Fig. 2. Processing time of the NG algorithm.
5.2. Character recognition
We performed experiments on handwritten numerals to test our proposed HONG
classifier. These handwritten numeral samples were extracted from the NIST SD3
database (which contains 223,124 isolated handwritten numerals written by 2100
writers and scanned at 300 dots per inch) provided by the American National Institute
of Standards and Technology (NIST) [7]. We partitioned the NIST SD3 database
into non-overlapping sets shown in Table 2. The test set comprises samples from 600
writers not used in the training and validation sets.
We restricted the number of upper layers of the overlapped NG networks to two.
The base layer consisted of 250 neurons. The number of neurons for each overlapped
NG network (second layer) was determined empirically by considering the available training samples for each of them. We found experimentally that min{300, max{35, (training_samples)/8}} was a good estimate for determining the number of neurons for the second layer. We used five overlaps for the training set and three overlaps for the testing set. Through trial and error, we discovered empirically that ε_i = 0.7, ε_f = 0.05, λ_i = 0.01, λ_f = 0.0001 and t_max = 4 × training_samples gave the best results for the proposed network. Using the above parameters in equation (5), we calculated the parameter r(t) = 11.156 − 6.2×10⁻⁶ t, which was used to truncate the weight update rule as described in (6).
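To make the two empirical formulas concrete, the lines below reproduce the second layer sizing heuristic and the truncation parameter r(t) of Eq. (5); with ε_i = 0.7, ε_f = 0.05 and t_max = 4 × 106,152 they give r(t) ≈ 11.156 − 6.2×10⁻⁶ t, matching the value quoted above. The function names are ours.

```python
import math

def second_layer_size(n_samples):
    """Empirical estimate min{300, max{35, training_samples/8}} from Section 5.2."""
    return min(300, max(35, n_samples // 8))

def r_of_t(t, t_max, eps_i=0.7, eps_f=0.05):
    """Truncation parameter r(t) of Eq. (5) for the epsilon schedule quoted above."""
    return 5 * math.log(10) + math.log(eps_i) + math.log(eps_f / eps_i) * t / t_max

# Sanity check: r_of_t(0, 4 * 106152) is about 11.156 and the slope is roughly
# -6.2e-6 per step, in agreement with r(t) = 11.156 - 6.2e-6 t.
```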
The feature extraction algorithm is briefly summarized below.
• Prior to the feature extraction operation, pre-processing operations were performed on the isolated numerals. This involved removing isolated blobs from the binary image based on a ratio test.
• The pre-processed digit was then centered and only the numeral part was extracted from the 128×128 binary image.
• The extracted binary image was rescaled to an 88×72 pixel resolution.
• Finally, each such binary image was subsampled into 8×8 blocks and the result was an 11×9 grey scale image with pixel values in the range [0, 64].
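The last step amounts to counting on-pixels block by block. A minimal sketch, assuming the 88×72 binary image is already pre-processed and stored as a NumPy array, is given below; it is illustrative only.

```python
import numpy as np

def extract_features(binary_img):
    """Subsample an 88x72 binary image into 8x8 blocks, giving an 11x9 grey
    image with values in [0, 64], flattened to a 99-element feature vector."""
    img = np.asarray(binary_img, dtype=float).reshape(11, 8, 9, 8)
    counts = img.sum(axis=(1, 3))   # on-pixel count per 8x8 block (0..64)
    return counts.reshape(-1)       # 99 features
```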
Table 2
Partitions of SD3 data set used in our experiments

Partition(s)   hsf_{0,1,2}   hsf_{0,1,2}   hsf_3
Size           106,152       53,076        63,896
Use            Training      Validation    Testing
Table 3
Recognition rates

               Training   Validation   Testing
Base network   99.31%     98.60%       98.84%
HONG           99.90%     99.30%       99.30%
As a result of the above feature extraction method, we were left with a feature vector of
99 elements. The recognition rates obtained using the above parameters are illustrated
in Table 3. As can be seen, the HONG architecture improves further on the high
classification rate provided by the base NG network. To the best of our knowledge,
the most successful results obtained for the NIST SD3 database were by Ha and
Bunke [8]. They used a total of 223,124 samples and obtained a recognition rate of
99.54% from a test set of 173,124 samples. They designed two recognition systems
based on two distinct feature extraction methods and used a fully connected feed-
forward three layer perceptron as the classifier for both feature extraction methods. In
addition, if the best score in the combined classifier was less than a fixed predefined
threshold, they replaced the normalization operation prior to feature extraction by
a set of perturbation processes which modeled the writing habits and instruments.
6. Conclusions
In this paper, we have proposed an implicit ranking scheme to speed up the
sequential implementation of the original NG algorithm. In contrast to Kohonen's
SOFM algorithm, the NG algorithm takes a smaller number of learning steps to
converge, does not require any prior knowledge about the structure of the network,
and its dynamics can be characterized by a global cost function.
We have also developed the HONG network architecture to obtain a better
classification on conflicting data. This is particularly important in totally unconstrained handwritten data, since they contain conflicting information within the same class due to the various writing styles and instruments used. The HONG network architecture systematically partitions the input space to avoid such situations by projecting the input data to different upper level NG networks (see Fig. 1). Since the training and the testing samples were duplicated in the upper layers, we obtained multiple classification decisions for every sample. The final classification was obtained by combining the individual classifications generated by the second level networks. We employed the idea of confidence values in obtaining the final classification. The
proposed architecture was tested on handwritten numerals extracted from the NIST
19 database.
Compared to the number of applications for Kohonen's SOFM, there are relatively
few for NG in the literature [2,6,9,10,16,17]. We hope, due to the speeding up method
that we have introduced for the sequential implementation, that there will be more
applications of the NG algorithm in the future.
Acknowledgements
The authors would like to thank Marcus Gallagher, Ian Wood and Hugo Navone
of the Neural Network Laboratory, University of Queensland, Australia, for their
invaluable support and comments. The authors would also like to thank the two
reviewers for their comments and suggestions that improved the quality of this
manuscript.
References
[1] F. Ancona, S. Ridella, S. Rovetta, R. Zunino, On the importance of sorting in neural gas training of vector quantizers, Proceedings of the ICNN-97, 1997, pp. 1804–1808.
[2] E. Ardizzone, A. Chella, R. Rizzo, Color image segmentation based on a neural gas network, in: M. Marinaro, P.G. Morasso (Eds.), International Conference on Artificial Neural Networks, 1994, pp. 1161–1164.
[3] S.-B. Cho, J.H. Kim, Multiple network fusion using fuzzy logic, IEEE Trans. Neural Networks 6 (2) (1995) 497–501.
[4] C.S.-T. Choy, W.-C. Siu, Fast sequential implementation of neural gas network for vector quantization, IEEE Trans. Commun. 46 (3) (1998) 301–304.
[5] E. Erwin, K. Obermayer, K. Schulten, Self-organizing maps: ordering, convergence properties and energy functions, Biol. Cybernet. 67 (1992) 47–55.
[6] M. Fontana, N. Borghese, S. Ferrari, Image reconstruction using improved neural gas, in: M. Marinaro, R. Tagliaferri (Eds.), Proceedings of the Seventh Italian Workshop on Neural Nets, 1996, pp. 260–265.
[7] M.D. Garris, Design, collection and analysis of handwriting sample image databases, Encyclo. Comput. Sci. Technol. 31 (16) (1994) 189–213.
[8] T.M. Ha, H. Bunke, Off-line handwritten numeral recognition by perturbation method, IEEE Trans. Pattern Anal. Mach. Intell. 19 (5) (1997) 535–539.
[9] T. Hofmann, J.M. Buhmann, An annealed neural gas network for robust vector quantization, in: C. von der Malsburg, W. von Seelen, J. Vorbruggen, B. Sendhoff (Eds.), Artificial Neural Networks, Springer, Germany, 1996, pp. 151–156.
[10] K. Kishida, H. Miyajima, M. Maeda, Destructive fuzzy modeling using neural gas network, IEICE Trans. Fundam. Electron. Commun. Comput. Sci. E80-A (9) (1997) 1578–1584.
[11] J. Kittler, Combining classifiers: a theoretical framework, Pattern Anal. Appl. 1 (1) (1998) 18–27.
[12] T. Kohonen, Self-Organizing Maps, Springer, Berlin, 1995.
[13] T. Martinetz, S.G. Berkovich, K. Schulten, Neural gas network for vector quantization and its application to time-series prediction, IEEE Trans. Neural Networks 4 (4) (1993) 218–226.
[14] T. Martinetz, K. Schulten, A neural-gas network learns topologies, in: T. Kohonen, K. Makisara, O. Simula, J. Kangas (Eds.), Artificial Neural Networks, North-Holland, Amsterdam, 1991, pp. 397–402.
[15] P.N. Suganthan, Hierarchical overlapped SOMs for pattern classification, IEEE Trans. Neural Networks 10 (1) (1999) 193–196.
[16] B. Zhang, M. Fu, H. Yan, Application of neural gas model in image compression, Proceedings of the IJCNN-98, Anchorage, Alaska, USA, 1998, pp. 918–921.
[17] B. Zhang, M. Fu, H. Yan, Handwritten digit recognition by neural gas model and population decoding, Proceedings of the IJCNN-98, Anchorage, Alaska, USA, 1998, pp. 1727–1731.
Ajantha Atukorale received his B.Sc. Degree with honors in Computer Science
from University of Colombo, Sri Lanka in 1995. He is a lecturer at University of
Colombo and currently reading for his Ph.D. degree in the Department
of Computer Science and Electrical Engineering, University of Queensland,
Australia. His research interests include Artificial Neural Networks, Pattern
Recognition and Fuzzy Systems. He is a student member of the IEEE and the
Australian Pattern Recognition Society.
Ponnuthurai Nagaratnam Suganthan received the BA and MA degrees in electrical
and information engineering from the University of Cambridge, UK in 1990 and
1994, respectively. He obtained his Ph.D. degree from Nanyang Technological
University, Singapore in 1996 in the fields of neural networks and pattern recogni-
tion. From August 1995 to September 1996, he worked as a pre-doctoral research
assistant at the University of Sydney. Between October 1996 and July 1999, he was
at the University of Queensland as a lecturer. Since July 1999 he has been an
assistant professor at Nanyang Technological University, Singapore. His research
interests include neural networks, pattern recognition, computer vision and
genetic algorithms. He is a senior member of the IEEE and an associate member of
the IEE.