
Applied Mathematics and Computation xxx (2015) xxx–xxx


An efficient initialization mechanism of neurons for Winner Takes All Neural Network implemented in the CMOS technology
Tomasz Talaśka a,*, Marta Kolasa a, Rafał Długosz a,b, Pierre-André Farine b
a UTP University of Science and Technology, Faculty of Telecommunications, Computer Science and Electrical Engineering, ul. Kaliskiego 7, 85-796 Bydgoszcz, Poland
b Institute of Microtechnology of Swiss Federal Institute of Technology in Lausanne, Rue de la Maladière 71B, CH-2002 Neuchâtel, Switzerland

Keywords: Winner Takes All Neural Network; Initialization mechanism; Convex Combination Method; CMOS implementation; Analog circuits; Low energy consumption

Abstract: The paper presents a new initialization mechanism based on the Convex Combination Method (CCM) for Kohonen self-organizing Neural Networks (NNs) realized in the CMOS technology. A proper selection of the initial values of the neuron weights has a strong impact on the quality of the overall learning process. Unfortunately, in case of real input data, e.g. biomedical data, proper initialization is not easy to perform, as the exact data distribution is usually unknown. Bad initialization can leave even 70–80% of the neurons inactive, which increases the quantization error and thus limits the classification abilities of the NN. The proposed initialization algorithm has a couple of important advantages. Firstly, it does not require knowledge of the data distribution in the input data space. Secondly, there is no need for an initial polarization of the neuron weights before starting the learning process. This feature is very convenient in transistor level realizations, because the programming lines, which in other approaches occupy a large chip area, are not required. We propose a modification of the original CCM algorithm. A new parameter, which in the proposed analog CMOS realization is represented by an external current, allows the behavior of the mechanism to be fitted to NNs containing different numbers of neurons. The investigations show that the modified CCM operates properly for NNs containing as many as 250 neurons. A single CCM block realized in the CMOS 180 nm technology occupies an area of 300 µm² and dissipates an average power of 20 µW at data rates of up to 20 MHz.

© 2015 Elsevier Inc. All rights reserved.

1. Introduction

Artificial neural networks (ANNs) are universal and efficient tools used in solving various problems including identification, classification, data compression and others. Many examples of successful applications of ANNs in such areas as electronics, electrical engineering, mechanical engineering, economics, health care and others can be found in the literature. In the artificial intelligence (AI) research area the main effort is usually focused on the development of new architectures of Neural Networks (NN) and more efficient learning algorithms. The aim of these investigations is to improve the quality, as well as the speed, of the learning process. In the majority of reported cases ANNs are realized in software, which is due to the large

* Corresponding author.
E-mail addresses: talaska@utp.edu.pl (T. Talaśka), markol@utp.edu.pl (M. Kolasa), rafal.dlugosz@gmail.com (R. Długosz), pierre-andre.farine@epfl.ch (P.-A. Farine).

http://dx.doi.org/10.1016/j.amc.2015.04.123
0096-3003/© 2015 Elsevier Inc. All rights reserved.


flexibility of such implementations. If the NN does not work properly after the implementation, then it can be easily and
quickly reprogrammed.
In our work we focus on ANNs realized as Application Specific Integrated Circuits (ASICs). Designing such systems is much more difficult than designing their software counterparts. However, in this case the Figure-of-Merit, defined, for example, as the data rate over the power dissipation, can be even several orders of magnitude larger than in counterpart software realizations [1–4]. The achieved improvements in computation power are mostly due to the parallel and asynchronous data processing realized in such NNs. Low power consumption, as well as low chip area, result from the possibility of accurately matching the internal structure of the circuit to the realized functions. This is possible mainly in case of ASICs realized in the 'full-custom' style. NNs trained in an unsupervised manner, such as Winner Takes All (WTA) and Winner Takes Most (WTM) self-organizing maps (SOMs), are of particular interest to us [4–7]. The rationale behind the selection of these algorithms is their high efficiency on the one hand, and their simplicity on the other. Learning algorithms in this case require only basic arithmetic operations and thus can be easily implemented at the transistor level.
The typically long design process of such systems is one of the essential reasons why hardware implementations are still quite rarely considered in practice in comparison with existing software implementations. While there is some rationale behind such arguments, the recent developments of electronic circuits, in particular in target applications where ultra-low energy consumption is of paramount relevance, justify new investigations in the area of hardware realized ANNs. For example, in wireless sensor networks (WSN) a substantial effort has been put into developing devices which are self-sufficient in energy supply [8–10]. This is a class of circuits in which hardware implemented ANNs may find a wide array of applications.
The basic question usually asked before starting to design a system of this type is whether to use an analog, digital or mixed technique. Analog circuits usually suffer from various phenomena such as transistor mismatch, leakage in analog memory cells, offsets in comparators, etc.; however, they offer a compact structure of circuits that can perform even very complex functions. Digital circuits, on the other hand, are much more complex, yet simultaneously more robust against the undesirable phenomena listed above. Depending on the type of the realized block and the target application, different techniques are used.
The work presented in this paper is a continuation of our former project, in which we developed from scratch a fully analog NN [4,5]. This is one of the reasons why the presented circuit is realized in the analog technique. However, we do not limit the presented work to implementation aspects only. In fact, we propose a new algorithm that can be classified as an efficient initialization mechanism but also, to some extent, as a modification of the WTA learning algorithm. The presented results can also be used in digital ASICs, as well as in pure software systems.

1.1. Problems associated with the initialization of neuron weights

The efficiency of training of ANNs depends on many circumstances. One of them is a proper polarization of the neuron weights before starting the learning phase. The weights can be viewed as coordinates that determine the location of particular neurons in an m-dimensional input data space, where m is the number of inputs of the NN. A proper distribution of neurons in this space before proceeding with the learning phase has a direct impact on the convergence speed of the learning process, as well as on the final training results [11–14]. In this work we focus on NNs trained in the unsupervised manner. In such networks properly selected initial values of the weights have a strong impact on the number of the so-called dead neurons [4]. Such neurons take part in the competition, but never win and therefore never become representatives of any data class. This increases the quantization error of the NN [4,15,16], as discussed in Section 2.2.
Various initialization methods can be found in the literature, but no universal method suitable for all learning algorithms exists. The initialization process of the neuron weights should reflect the data distribution in the training dataset. The problem is that such a distribution is usually not known in advance. For this reason, the initial values of the weights are usually determined either empirically or randomly. More sophisticated methods also exist, but they are usually too complex to be easily implemented at the transistor level [17,18]. The problem of the initialization can be considered from two different points of view. Since NNs are usually implemented in software, the computational complexity of a given initialization algorithm is of less relevance in this case. Taking into account the computational capabilities of computers used today, the most important criterion is the efficiency of a given algorithm. In case of NNs realized as ASICs the situation is the opposite. In this case we have to deal with various hardware inaccuracies and limitations that have an influence on the initialization process.
Various initialization methods have been proposed over the past twenty years [13,19–21,23,24]. In a very common and simple approach a random and uniform distribution of the weights over the input data space is applied [19–21]. This approach is very fast, which is one of its main advantages. However, it is not always effective, as it does not reflect the distribution of data over the input data space, which is usually not known. As regards ANNs realized in hardware and working in parallel, in which each neuron is a separate circuit, this method generates several additional problems. One of them is especially visible in large NNs, in which a net of programming lines connected to particular neuron weights makes the layout very complex. Due to the usually limited number of pins on the chip, the weights have to be programmed sequentially, which requires additional circuitry responsible for addressing particular memory cells. The two described problems can be classified as implementation issues. There are also complications that occur during the operation of the NN. Potentials on the programming lines usually differ from the values of the weights stored as voltages in the corresponding memory cells.


The potential difference increases the leakage effect, which disturbs the learning process to some extent [5]. This method has been applied in our previous prototype chip with the Winner Takes All (WTA) NN, realized in the CMOS 0.18 µm technology. It was a small network with only twelve weights, designed to verify the concepts of particular components, and therefore the described problem was insignificant in that case [4].
Another initialization method can be referred to as the linear approach. In this case a signal linearly increasing over time is provided sequentially to particular weights [21]. From the hardware complexity point of view, this approach is similar to the one in which random values are used. The only difference lies in the type of programming signals provided to the NN. This method also does not take into account the distribution of data in the learning dataset.
More sophisticated initialization algorithms take into account the distribution of data in the input data space. One of them relies on locating particular neurons at the positions of the first n learning patterns X drawn from a given dataset, where n is the number of neurons in the NN [25]. As patterns X in a typical set are placed in a random order, the distribution of the neurons after the initialization phase is more representative of the overall dataset than in the random and linear methods described earlier. This approach is more efficient than the previous two at the system level; however, in hardware realizations the problem with the necessity of providing additional programming lines to particular memory cells still remains. Hence, this approach is also not suitable for large NNs realized at the transistor level.
An interesting method that supports the learning process and to some extent can overcome the effects of bad
initialization relies on using the so-called conscience mechanism. Our previous investigations show that if this mechanism
is used together with the initialization mechanism, the number of dead neurons can be strongly reduced or eliminated, even
if the initialization is not optimal [4,14].
In this paper, we focus on another approach called the Convex Combination Method (CCM). This method, although proposed many years ago [26], has never been realized at the transistor level. In the literature only rare software implementations of this algorithm can be found [27]. This concept may be attractive for transistor level realizations, as in this case all neuron weights are assumed to have values always equal to $1/\sqrt{m}$ [26]. If the number of inputs, m, is fixed, the programming lines can be eliminated and particular weights can be initialized with constant signals generated inside the memory cells that store these weights. A very important advantage of the CCM method is that it automatically adjusts the initialization process to the distribution of the input data.
The novelty of the approach presented in this paper is twofold. We modified the existing CCM algorithm in such a way as to make it suitable for larger NNs. Our investigations carried out with the original CCM algorithm show that this method is efficient only in case of relatively small NNs. If the number of neurons increases above 10–20, the percentage of inactive neurons increases rapidly. The proposed modification enables the activation of a much larger number of neurons. We performed investigations for numbers of neurons of up to 250. The results obtained are important for both software and hardware implementations. While testing the proposed algorithm, we observed that the initial values of the weights can be zeroed in this case. This is an additional advantage that eliminates the necessity of any programming of the weights. In a hardware realization the weights can simply be reset in this case.
Other novelties presented in the paper are the implementation of the CCM algorithm in the CMOS technology, as well as the way in which the improvement described above has been achieved. In comparison with the original CCM algorithm, only one additional parameter (A) is required. This is important from the point of view of the hardware implementation as a specialized chip (ASIC), in which many CCM blocks will operate in parallel. In the proposed implementation the parameter A is an analog signal (a current) that requires only one additional branch. The weights are initialized by the same adaptation mechanism that is used to update the weights during the learning phase, which is also one of the advantages.
The paper is organized as follows. In the next Section we present the WTA learning algorithm, as well as an overview of such a NN realized by us earlier in the CMOS technology (2.1). In this Section we also discuss the ways in which the learning process of the WTA NN can be optimized (2.2) and the criteria for the assessment of this process (2.3). This background is necessary to enable a comparison of the solution used in the former chip with the new algorithm, which is described in Section 3.1. In Section 3.2 we present the transistor level implementation of the new mechanism. Verification of the new solution by means of the software model of the NN, as well as on the transistor level, is presented in Section 4. The conclusions are drawn in Section 5.

2. An overview of the learning process of the WTA NN

Winner Takes All Neural Networks belong to the group of networks trained without supervision. A typical learning process in such networks is illustrated schematically in Fig. 1. The learning process starts with generating, usually random, weights for all neurons (initialization). The subsequent learning phase is divided into epochs. During each epoch all patterns X from the input dataset are presented to the network in a random fashion. We denote a single presentation of a pattern X and the subsequent calculation sequence as a cycle. In a single, tth, cycle the WTA NN calculates the distance between the provided pattern X(t) and the vectors W(t) in particular neurons. In the next step the NN determines which neuron is located in the closest proximity to the given pattern X and this neuron is allowed to adapt its weights.
Different measures of the similarity between the X(t) and the W(t) vectors can be used. The most frequently used are the Euclidean (L2) and the Manhattan (L1) distance measures, defined respectively as:


Fig. 1. Illustration of the conventional learning process of the WTA NN.

$$d_{L2}(X(t), W_j(t)) = \sqrt{\sum_{l=1}^{m} \left( x_l(t) - w_{j,l}(t) \right)^2} \qquad (1)$$

and

$$d_{L1}(X(t), W_j(t)) = \sum_{l=1}^{m} \left| x_l(t) - w_{j,l}(t) \right|. \qquad (2)$$

In the equations above d is the distance between a given pattern X(t) and the weights vector W_j(t) of any, jth, neuron in the NN, while x_l(t) and w_{j,l}(t) are the lth components of the vectors X(t) and W_j(t), respectively.
In the L1 measure the squaring and rooting operations have been eliminated, which significantly simplifies the overall learning algorithm. In our former prototype chip we used an L2² measure that leads to the same results as the L2 one, but does not require a rooting circuit. The adaptation of the winning, ith, neuron is performed in accordance with the formula:

$$W_i(t+1) = W_i(t) + \eta \left( X(t) - W_i(t) \right), \qquad (3)$$

in which η is the learning rate. The weights of the other neurons that lose the competition remain unchanged in the WTA algorithm.
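For reference, the conventional WTA cycle defined by (1)–(3) can be summarized in a short software sketch. This is a minimal illustration only, not the chip itself: the array shapes, the dataset and the learning-rate value are assumptions made for the example.

```python
import numpy as np

def wta_epoch(X, W, eta):
    """One epoch of the conventional WTA algorithm, Eqs. (1)-(3).

    X : (z, m) array of learning patterns, W : (n, m) array of neuron weights.
    Only the winning neuron (smallest Euclidean distance) is adapted.
    """
    for x in np.random.permutation(X):        # patterns presented in random order
        d = np.linalg.norm(W - x, axis=1)     # L2 distances, Eq. (1)
        i = int(np.argmin(d))                 # winner = closest neuron
        W[i] += eta * (x - W[i])              # adaptation of the winner, Eq. (3)
    return W
```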

2.1. Realization of the WTA NN in the CMOS technology

The aim of our previous project was the implementation of a fully analog, current-mode, WTA NN in the CMOS 180 nm technology. The advantage of using the current-mode approach was the ability to use very small currents. This, in turn, enabled us to reduce the power dissipation to levels below 700 µW for data rates comparable with a typical PC. Using the current-mode approach allowed us to simplify the structure of the circuit, which, in turn, reduced the chip area to only 0.07 mm². The advantage of the current-mode becomes visible in a situation in which addition and subtraction operations predominate, realized in this case in a circuit node according to Kirchhoff's current law.
A general block diagram of the WTA NN implemented by us in the CMOS 180 nm technology is presented in Fig. 2. This circuit reflects the scheme of the learning process shown in Fig. 1. The distance calculation circuit (DCC) is responsible for the calculation of the distance between the X(t) and W_j(t) vectors in a given, jth, neuron. In the analog approach this block can be realized either as a voltage-mode [28,29] or a current-mode circuit [30–32]. Example circuits of this type have been reported in [32–34]. Our circuit has been realized in the current-mode technique for the reasons described above.
In the WTA NN, as well as in other self-organizing NNs, particular neurons compete with each other, but in contrast to other NNs of this type, in the WTA NN only the winning neuron is allowed to adapt its weights. To determine which neuron has won the competition in a given learning cycle, a Winner Selection Circuit (WSC) has to be applied. The winning neuron is the one that is located in the closest proximity to a given pattern X(t). We proposed a binary-tree WSC that allows for a precise and unambiguous selection of this neuron [6]. Finally, the adaptation of the weights of the winning neuron is performed by the adaptation mechanism (ADM) described in detail in [5].

2.2. Optimization of the learning process of the WTA NN

A significant problem encountered in WTA NNs is dead neurons, as described in Section 1.1. They reduce the number of classes that can be discriminated by the NN, thus increasing the mapping (quantization) error of the network. In practice,


Fig. 2. Block diagram illustrating the structure of the WTA Neural Network realized in the CMOS technology [4].

dead neurons waste the computational resources of the NN and increase the energy it consumes. This problem is less visible in the Winner Takes Most (WTM) algorithm, as the neighborhood mechanism used in this case activates most or even all neurons in the map [7]. Unfortunately, the WTM algorithm is more complex and slower than the WTA one.
The improvement of the learning process of the WTA NN usually relies on using an initialization mechanism, as well as the conscience mechanism. In our previous project, shown in Fig. 2, we used the simple random initialization mechanism described in the previous Section. As the prototype NN was small (4 neurons and 3 inputs, i.e. 12 weights), the programming lines used in this case did not increase the complexity of the chip significantly.
The realization of the conscience mechanism in an efficient way is a problem in itself. Several realizations of this component can be found in the literature [4,15,16]. The algorithm proposed by DeSieno in [16] is complex and therefore rather unsuitable for low power and low chip area hardware implementations. Ahalt et al. proposed a simpler solution [15]:

$$d_{\mathrm{cons}}(X(t), W_j(t)) = d_{L1/L2}(X(t), W_j(t)) \cdot L_{\mathrm{count}}(t) \cdot K, \qquad (4)$$

in which the number of wins (L_count) of a given neuron is multiplied by the distance d_{L1/L2}. In (4) K is a gain factor that determines the strength of the conscience mechanism and thus enables tuning of the overall learning process. On the basis of Ahalt's approach we proposed our own solution [4], more suitable for transistor level realization. The modification introduced by us consisted in substituting the multiplication operation with addition, which substantially simplified the circuit:

$$d_{\mathrm{cons}}(X(t), W_j(t)) = d_{L1/L2}(X(t), W_j(t)) + L_{\mathrm{count}}(t) \cdot K. \qquad (5)$$
The investigations carried out by means of the software model of the NN, as well as during subsequent measurements of the fabricated chip, showed that this modification did not have any negative impact on the learning process [4]. The real distance d_{L1/L2} between the X(t) pattern and a W_j(t) vector is in this case enlarged by a new signal that is proportional to the number of wins of a given neuron (L_count). As a result, the WSC responsible for detecting the winning neuron in each cycle receives the modified distances d_cons > d_{L1/L2}.
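The additive conscience rule (5) is simple enough to state in a few lines of a software model. The sketch below is only an illustration; the array shapes and the example value of K are assumptions, not parameters taken from the chip.

```python
import numpy as np

def winner_with_conscience(x, W, L_count, K=0.05):
    """Select the winner using the additive conscience rule, Eq. (5).

    The real distance is enlarged by L_count * K, so frequently winning
    neurons are penalized and rarely winning neurons get a chance to win.
    """
    d_real = np.linalg.norm(W - x, axis=1)    # d_L2(X(t), W_j(t))
    d_cons = d_real + L_count * K             # Eq. (5)
    i = int(np.argmin(d_cons))
    L_count[i] += 1                           # update the win counter of the winner
    return i
```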
Although the conscience mechanism improves the learning process, the transistor level design and optimization of this circuit is relatively complex. Our circuit reported in [4] was composed of three analog blocks that had to cooperate with each other. One of them was an analog counter used to count the wins of a given neuron. The number of wins was stored as a voltage on a capacitor. The counter was followed by a voltage-to-current converter, which was necessary as the overall NN operates in the current-mode. Since the counter itself suffered from a relatively high dependence of some parameters on the external temperature, it had to be compensated by an additional circuit [35]. Each neuron in that NN was equipped with a separate CONS block. This circuit occupied a relatively large area and substantially raised the energy consumed by the NN.


The MCCM initialization mechanism proposed in this paper is a continuation of our previous work. In the new prototype that is currently under development, the number of neurons will be substantially larger. We suppose that even a hundred neurons may be required, each possessing up to ten weights. In the new NN the previously used initialization mechanism would be inconvenient. For this reason, we propose the new approach based on the Convex Combination Method initially described in [26]. In our approach we introduce some modifications that make it more flexible than the conventional CCM. We denote the proposed algorithm as the Modified CCM (MCCM) (see Section 3). This block can be seen as an alternative not only to the conventional initialization mechanism, but it can substitute the CONS mechanism as well. Consequently, it allows for a substantial reduction of the power dissipation, as well as the chip area, of the NN.

2.3. Evaluation of the quality of the learning process of the WTA NN

The learning process of the WTA NN can be assessed by observing how the NN reacts to particular learning patterns X(t). In general, it is difficult to say what a proper outcome of the learning process is. One of the measures indicating that the learning process advances in the right direction is a decrease (in successive learning epochs) of the distances between the X signals and the weights W of the winning neurons. It can be assumed that the learning process is fully satisfactory if the NN performs a correct quantization of the area occupied by the learning dataset. In SOMs this process is called vector quantization, while the corresponding error is called the quantization error or a distortion measure [21]. This error is defined as follows:

$$E_q = \frac{1}{z} \sum_{t=1}^{z} \left\| X(t) - W_i(t) \right\|^2, \qquad (6)$$

where z is the number of all learning patterns in a given learning dataset, presented to the NN during a single learning epoch. Index i denotes the winning neuron in a given learning cycle, t.
The evaluation of the learning process on the basis of this criterion is not always optimal, as this error depends on the distribution of the input data, the range in which the weights are drawn (the initialization) and the number of neurons in the NN. When the number of neurons is larger than the number of data classes, the quantization error may be quite small, while dead neurons will still exist. This is due to the fact that this error is calculated only for the winning neurons. In other words, this criterion does not say anything about the number of dead neurons.
On the other hand, if the number of neurons is smaller than the number of data classes, while particular classes are far from each other, the quantization error can be very large even in case of an optimal final distribution of the neurons over the input data space. In this case a single neuron can represent multiple classes, often very distant from each other. For this reason, in addition to the quantization error it is reasonable to observe also other criteria associated with the learning process. One of them is the number of wins of particular neurons presented on a histogram. For particular neurons the numbers of wins should be more or less similar.
Other criteria include the observation of the areas of dominance of particular neurons in the input data space (the Voronoi diagram [22]), as well as the distribution of particular neurons after completing the learning phase. We use these criteria to evaluate the influence of the proposed initialization algorithm on the learning process.
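The criteria discussed above (quantization error, win histogram, dead-neuron count) are straightforward to compute in a software model of the NN. The helper below is a sketch under the same assumptions as the earlier listings; it illustrates Eq. (6) and the win statistics, and is not part of the chip.

```python
import numpy as np

def evaluate_epoch(X, W):
    """Quantization error, Eq. (6), and win histogram for one learning epoch."""
    wins = np.zeros(len(W), dtype=int)
    err = 0.0
    for x in X:
        d = np.linalg.norm(W - x, axis=1)
        i = int(np.argmin(d))
        wins[i] += 1
        err += d[i] ** 2                      # ||X(t) - W_i(t)||^2 for the winner
    E_q = err / len(X)                        # average over the z patterns
    dead = int(np.sum(wins == 0))             # neurons that never won (dead neurons)
    return E_q, wins, dead
```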

3. Proposed initialization mechanism based on the Convex Combination Method

Before we explain how the proposed mechanism works, it is worth noting that most of the blocks used in the WTA NN, shown in Fig. 2, do not need to be modified. The DCC, WSC and ADM blocks, which belong to the main signal processing path, remain unchanged. In the new version, shown in Fig. 3, we eliminate the CONS and the INIT blocks together with all programming lines (analog signal as well as digital addressing lines) and add the proposed mechanism denoted as MCCM.

3.1. An idea of the proposed mechanism

The proposed MCCM is a modification of the original CCM proposed in [26]. When looking from the point of view of particular neurons in the NN, in both the CCM and the MCCM approaches the input patterns X in particular learning cycles are modified in such a way as to get closer to the weight vectors W of these neurons. In fact, particular patterns X remain unchanged, while the CCM or MCCM blocks in particular neurons create images (different for each neuron) of these patterns. As a result, the neurons ''see'' the images as lying closer to their weight vectors than the corresponding patterns X. It can be compared to a ''fata morgana'' effect. This creates a chance for badly located neurons to join the competition.
In the MCCM approach particular input signals x_l are substituted with the modified signals x̂_{j,l} that are calculated as follows:

$$\hat{x}_{j,l}(t) = f(t) \cdot x_l(t) + \left[ 1 - f(t) \right] \cdot \left( w_{j,l}(t) + A \right). \qquad (7)$$

In this equation f(t) is a function whose output increases from a value slightly above 0, at the beginning of the initialization phase and the overall learning process, to 1, while A is a parameter introduced in the MCCM approach that enables adjustment of the proposed mechanism to different sizes of the NN. For f(t) = 1 the MCCM mechanism becomes inactive.


Fig. 3. Modified structure of the WTA Neural Network with the proposed MCCM mechanism.

In (7) we consider a single x–w pair, so the total number of created images x̂_{j,l}(t) equals the number of the weights in all neurons in the NN. For n neurons in the NN with m inputs, there are n·m different images of particular input signals, i.e. each neuron will have its own image of each input signal. Particular X̂_j(t) patterns can be seen as attractors that, as the value of the f(t) function increases, pull particular neurons to their final destinations.
In the original CCM the parameter A is not present or, in other words, A = 0. We have intentionally added this new variable because, as will be shown in Section 4, it helps to activate dead neurons even if the NN is relatively large. Taking (7) into account, in the MCCM algorithm the adaptation of the winning neuron is performed in accordance with the following formula:

$$W_i(t+1) = W_i(t) + \eta \left( \hat{X}_i(t) - W_i(t) \right). \qquad (8)$$

The NN equipped with the MCCM block operates in a very similar way to the conventional WTA NN (3). The ''only'' difference lies in substituting the X signals with the X̂_j signals in all calculations. Nevertheless, from the hardware point of view the DCC, WSC and ADM blocks operate in the same way as in the previous approach.
The learning process of the WTA NN with the new MCCM component is schematically illustrated in Fig. 4. In comparison with the conventional approach, shown in Fig. 1, in which the initialization and the learning phases are separated, now to some extent they overlap each other.
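A compact software sketch of one MCCM learning cycle, under the assumptions of the earlier listings: each neuron computes its own image of the pattern according to (7), the winner is selected on the basis of the distances to these images, and the winner is adapted toward its image according to (8). The function and argument names are illustrative only.

```python
import numpy as np

def mccm_cycle(x, W, f_t, A, eta):
    """One learning cycle of the WTA NN with the MCCM block, Eqs. (7)-(8).

    x : (m,) input pattern, W : (n, m) weights, f_t : current value of f(t),
    A : the additional MCCM parameter (A = 0 reproduces the original CCM).
    """
    X_hat = f_t * x + (1.0 - f_t) * (W + A)   # per-neuron images, Eq. (7)
    d = np.linalg.norm(X_hat - W, axis=1)     # each neuron "sees" its own image
    i = int(np.argmin(d))                     # winner selection
    W[i] += eta * (X_hat[i] - W[i])           # adaptation toward the image, Eq. (8)
    return i
```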
As opposed to the conventional WTA algorithm, in the proposed MCCM approach all vectors W_j can be zeroed before starting the learning process, i.e. no conventional initialization is required. This means that all X → W distances are equal, i.e. there is no distinct winner. To solve this problem, during the following cycles of the initialization phase particular neurons are arbitrarily selected as winners. In a software implementation this can be realized by selecting successive neurons in a loop. In an analog transistor level realization, even if all weights are equal, one of the neurons will always be indicated as a winner automatically. This is possible, as small offsets exist in the comparators used in the WSC block [6]. These offsets make some WSC inputs ''stronger''. This effect, negative in itself, plays here the positive role of an arbiter. The values of the offsets do not exceed 1% of the input signal range.
In (7) and (9) f(t) is a function whose output increases during the learning process from a value slightly above 0, at the beginning of the learning phase, to 1. As can be noticed, for f(t) = 0 and A = 0 particular modified patterns X̂_j(t) exactly overlap the corresponding vectors W_j(t) of particular neurons. This situation is not allowed. The adaptation of the weights of the winning neuron (8) is performed toward the modified patterns X̂_i(t), instead of X(t). If X̂_i(t) overlaps its corresponding neuron, then η(X̂_i(t) − W_i(t)) = 0, and the adaptation becomes inactive. For this reason, the initial values of f(t) should be greater than zero.
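In a software model the arbitration described above (weights reset to zero, no distinct winner in the first cycles) can be reproduced by forcing successive neurons to win while the distances remain equal, which plays the role of the small comparator offsets in the analog WSC. The sketch below illustrates this assumption; the tolerance value and array sizes are chosen only for the example.

```python
import numpy as np

def select_winner(d, cycle, tol=1e-9):
    """Arbitrate ties in the first cycles after a zero reset of the weights.

    When all distances are (numerically) equal there is no distinct winner,
    so successive neurons are picked in a round-robin fashion -- the software
    counterpart of the small comparator offsets in the analog WSC.
    """
    if np.ptp(d) < tol:                       # all distances equal -> no distinct winner
        return cycle % len(d)                 # arbitrary, round-robin selection
    return int(np.argmin(d))

W = np.zeros((10, 2))                         # weights zeroed: no programming lines needed
```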


Fig. 4. Illustration of the learning process of the WTA NN in case of using the CCM/MCCM algorithm.

The modification of the input patterns X is illustrated in Fig. 5 for selected values of the vectors X and W, for different values of the f(t) function and the parameter A. For A = 0 the modified patterns X̂_j(t) are always located on the line between a given X–W pair. Theoretically this leads to a situation in which all neurons modify their weights toward the same area of the input data space (if the input patterns are located close to each other). The parameter A can be seen as a positive bias that shifts the modified patterns X̂_j(t) toward larger values in the input data space, additionally introducing an angular shift. The latter feature causes the neurons to modify their weights toward different areas, which has a positive effect on the overall learning process. We typically observed this phenomenon during the simulation tests.
An interesting phenomenon can be observed in Fig. 5. The A parameter is not an independent variable, which means that its value is not an absolute bias. The distances by which the x signals are moved depend on the value of the f(t) function. At the beginning, when f(t) is small, the distances are larger. Then, as the learning process advances and f(t) becomes higher, the impact of the A parameter substantially decreases and the modified patterns X̂_j(t) become similar to their original counterparts X. This is an advantage, since it allows a certain constant value of the variable A, optimal for a given number of neurons in the NN, to be established, and only the f(t) function to be controlled.
For f(t) = 1 the modified patterns X̂_j(t) become equal to their corresponding patterns X(t) and the MCCM blocks are turned off. From this moment the NN operates as in the case of the conventional WTA algorithm. At this stage the neurons should already be located relatively close to particular data classes, i.e. the learning process is in the tuning phase. Taking this into account, the new initialization method can also be viewed as a kind of learning algorithm.

Fig. 5. Location of selected weight vectors W and corresponding learning patterns X and X̂ for different values of the f(t) function and the parameter A.


3.2. Hardware implementation of the MCCM block

The proposed algorithm offers several advantages when it comes to the hardware implementation. Depending on the
total number of the weights in the NN and the number of the NN inputs, the resultant circuit can occupy less chip area
and consume less power than the previous circuit [4]. This is due to the elimination of the programming lines previously
used by the INIT block, as well as counters and capacitors used by the CONS mechanism.
The proposed current-mode MCCM circuit is shown in Fig. 6. This is a modified version of the CCM circuit, whose concept was initially proposed in [36]. The MCCM blocks are located at the inputs of each neuron (see Fig. 3), as each neuron requires a separate image of each input signal. The value of the f(t) function, equal for all neurons, is controlled by the A0–A3 logic signals that control switches in two multi-output current mirrors composed of transistors M1–M5 and M11–M15. In this configuration, for given signals I_{x:l}(t) and I_{w:j,l}(t) we obtain 16 different levels of the resultant Î_{x:j,l}(t) signal, given by (9).
In case of the current-mode transistor-level implementation of the MCCM block, which is one of the main objectives of this paper, particular signals in (7) are represented by currents, as shown in Fig. 6, so (7) can be rewritten as:

$$\hat{I}_{x:j,l}(t) = f(t) \cdot I_{x:l}(t) + \left[ 1 - f(t) \right] \cdot \left( I_{w:j,l}(t) + I_A \right). \qquad (9)$$
The I_A current represents the A parameter used in (7). In general, the efficiency of the new initialization mechanism increases together with the value of the A parameter. It might therefore be assumed that the value of A could always be large and constant. This may be true in case of software realizations. In hardware, on the other hand, the power dissipation of the overall circuit increases together with the value of the A parameter. Therefore, it is always necessary to look for the smallest possible value of A for a given size of the NN.
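As a simple numerical illustration of (9), assume hypothetical values f(t) = 4/16, I_{x:l} = 7 µA, I_{w:j,l} = 2 µA and I_A = 1 µA (these numbers are chosen only for the example, not taken from the measurements). Then:

$$\hat{I}_{x:j,l} = 0.25 \cdot 7\,\mu\mathrm{A} + 0.75 \cdot \left( 2\,\mu\mathrm{A} + 1\,\mu\mathrm{A} \right) = 1.75\,\mu\mathrm{A} + 2.25\,\mu\mathrm{A} = 4\,\mu\mathrm{A}.$$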
The advantage of the proposed solution relies on using only one additional signal for different sizes of the NN. This simplifies the structure of the chip. This solution allows a programmable chip to be designed, in which the number of neurons is a parameter, while the initialization mechanism can be easily tuned by a single signal only.
The sizes of particular transistors have been selected in order to minimize the mismatch effect that affects the precision of the current mirrors. For input signals in the range 2–20 µA the optimal sizes of the transistors are W/L = 9/1 µm for the PMOS transistors and W/L = 3/1 µm for the NMOS transistors. We do not present a detailed analysis of this phenomenon here, as the analysis is very similar to the one we already performed in [5] for the ADM block that operates with similar currents.
In the CMOS 0.18 µm technology a single MCCM block with 34 transistors (the switches are realized as transmission gates) occupies an area of about 300 µm². For comparison, to obtain similar functionality using digital circuits, more than 2000 transistors are required, with a resultant area of about 5000–10,000 µm².

4. Verification of the proposed MCCM algorithm

The verification of the proposed algorithm presented in this Section has been carried out on two levels. First, we demonstrate the results of investigations performed by means of the software model of the WTA NN equipped with the proposed MCCM algorithm. The model allowed the concept of the new mechanism to be verified for different values of the particular parameters described above.
Fig. 6. The proposed circuit used in the Modified Convex Combination Initialization Method.


Fig. 7. Case study for different scenarios of the learning process for an example NN with 10 neurons: (top) learning (quantization) error for particular configurations of the learning algorithm; (bottom) number of wins of particular neurons in the NN: (left) the MCCM mechanism turned off (only 3 neurons are active) and the MCCM mechanism turned on but with A = 0 (3 neurons remain dead), (right) the MCCM mechanism active for A = 0.8 and A = 2.

These investigations became the basis for the subsequent implementation of the proposed MCCM mechanism at the transistor level. Software level investigations are mandatory, as the design process of an integrated circuit is time-consuming. After physical realization the structure of the chip cannot be modified any more. The realization of ASICs is very expensive and therefore no mistakes should be made.
Proper modeling allows the influence of various parameters and physical phenomena on the learning process of the NN to be assessed. There are a few phenomena that have a negative impact on the learning process. The main problems are usually caused by the leakage effect in the analog memory cells that are used to store the neuron weights, the charge injection effect that occurs during commutation of the switches used in the circuit, inaccuracies that occur during fabrication of the current mirrors (mismatch effect), etc. These phenomena have been included in the proposed model. The model has been implemented in the Java environment, while some of its modules, e.g. the generator of the learning patterns, in the Python environment. During the tests the initial values of the neuron weights were set near to zero. The small, non-zero values, different for particular weights, are the equivalent of the inaccuracies of the WSC component described earlier.

4.1. Tests methodology

Tests of the proposed MCCM circuit have been divided into two main stages. In the first one we investigated the behavior of the WTA NN without the MCCM mechanism, to obtain a point of reference for further investigations in which we tested the NN with the MCCM mechanism being active. At this stage we investigated the impact of the A parameter on the quality of the learning process. This enabled us to compare the MCCM approach with the conventional CCM, in which A = 0.
The tests were carried out for different numbers of neurons and learning epochs. The learning datasets used in the tests were composed of random data deployed in a given number of clusters, with randomly distributed centers and different dispersions of the patterns around these centers. We performed simulations for datasets in which 500–600 patterns X were deployed in 10 clusters (data classes).
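A sketch of how such a synthetic dataset can be generated (random cluster centers, a chosen dispersion around each center). The function name and the concrete default numbers are assumptions made for illustration, roughly matching the 10-cluster, 500-pattern setup described above.

```python
import numpy as np

def make_dataset(n_clusters=10, patterns_per_cluster=50, m=2,
                 dispersion=0.1, data_range=10.0, seed=0):
    """Random clustered dataset: centers drawn uniformly, Gaussian spread
    around each center given as a fraction of the maximum data range."""
    rng = np.random.default_rng(seed)
    centers = rng.uniform(0.0, data_range, size=(n_clusters, m))
    X = np.vstack([
        c + rng.normal(0.0, dispersion * data_range, size=(patterns_per_cluster, m))
        for c in centers
    ])
    return np.clip(X, 0.0, data_range)        # keep patterns inside the input range
```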
Taking into account various hardware constraints, we assumed that each input of the NN has to be normalized before it is provided to the network. The normalization is convenient, as in this case all MCCM blocks have the same structure. Nevertheless, we performed a series of tests in which the ranges of particular inputs differed by a factor of 10. In this case using the same value of the A parameter allowed proper results to be obtained.
Initially, the tests were carried out for ten neurons, and later we also considered larger networks with as many as 250 neurons. In all tests, the following cases were investigated:

• MCCM block switched off (conventional WTA without any initialization of the neuron weights and without the conscience mechanism),
• MCCM block switched on, A = 0 (conventional CCM),
• MCCM block switched on, A > 0.


Fig. 8. Location of particular neurons and corresponding data classes for two example cases: (top-left) MCCM turned off, (top-right) MCCM turned on, (bottom) corresponding Voronoi diagrams.

In the tests that were carried out we made several additional assumptions:

• All input signals were normalized to be in the range between 0 and 10. We did so, as the signals (currents) used in the MCCM circuit realized in the CMOS technology are in the range between 0 and 10 µA. The assumption about equal ranges of the inputs is convenient when taking the hardware implementation into consideration. In this case the transistor sizes in each MCCM block can be equal, which facilitates the hardware realization.
• One of the issues was how quickly the f(t) function should increase during the learning phase. Since in the proposed MCCM circuit the output signal of the f(t) function is controlled by 4 bits (A0–A3), its value changes with a step of 1/16. In the presented tests this was sufficient. However, if needed, it is possible to increase the resolution of the output of the f(t) function by adding extra bits in the circuit. During the tests we incremented the f(t) function from an initial value of 1/16, with a step of 1/16 after each epoch (see the sketch after this list).
• The initial value of the learning rate η is an important issue that has to be briefly clarified. In general, the larger the number of neurons in the NN, the larger the initial value of the learning rate should be. For smaller numbers of neurons the NN learned correctly for initial values of η at the level of 0.4–0.5, while for larger NNs better results were obtained for η ≈ 0.8.
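A minimal sketch of the f(t) schedule described in the list above, assuming the 4-bit control used in the proposed circuit; the generator structure and the commented usage are illustrative assumptions, not taken from the chip.

```python
def f_schedule(n_bits=4):
    """f(t) increased after each epoch with a step of 1/16 (4-bit control A0-A3)."""
    step = 1.0 / (2 ** n_bits)
    f = step                                  # start slightly above zero, as required
    while f < 1.0:
        yield f
        f = min(1.0, f + step)
    yield 1.0                                 # MCCM turned off: plain WTA tuning phase

# Illustrative use: one epoch per f(t) value, with a learning rate of about 0.8
# for larger networks (see the last item of the list above).
# for f_t in f_schedule():
#     for x in training_patterns:            # hypothetical dataset
#         mccm_cycle(x, W, f_t, A, eta=0.8)   # sketch from Section 3.1 above
```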

4.2. Verification of the proposed algorithm at the system level

Selected simulation results are compared in Fig. 7. The top diagram illustrates the quantization error for particular cases. The smallest error appears for the MCCM being active and A > 0.8, while the largest one occurs for the conventional WTA algorithm without the initialization mechanism. Without the use of the MCCM block only three neurons are active, i.e. 70% of the neurons remain dead. The activity of particular neurons in the considered scenarios is illustrated in Fig. 7 (bottom).


Fig. 9. Simulation results for 20 (left) and 30 (right) neurons in the NN.

It can be observed that although the quantization error for A = 0.8 and A = 2 finally becomes equal, the learning process is faster for the larger value of A.
Fig. 8 illustrates the location of the active neurons in relation to particular data classes. In the worst case particular active neurons represent 3–4 classes each. The situation becomes better when we activate the CCM block (i.e. the MCCM with A = 0). In this case only 30% of the neurons remain dead (see Fig. 7), whereas the resultant quantization error reaches a smaller value than previously. The best results are obtained when the MCCM mechanism is active and A = 2. The A parameter enables tuning of the adaptation mechanism and matching of its behavior to the number of neurons in the NN and the input dataset. The obtained results are similar to the ones obtained for the WTA NN with the conventional initialization mechanism and the CONS block.
The outcomes of the learning process can be illustrated in a convenient way by the use of the Voronoi diagram. The Voronoi area V_j can be understood as the fraction of the input data space that is represented by the jth neuron, i.e. the set of patterns X that comply with the following condition:

$$V_j = \left\{ X : \left\| X - W_j \right\| < \left\| X - W_k \right\|, \ \forall k \neq j \right\}, \qquad (10)$$

where k denotes any neuron in the NN other than j.
As a result, the overall input data space is divided into regions of domination of particular neurons. Any new pattern X that occurs at the inputs of the NN is located in one of these regions, which means it is represented by a given neuron. The Voronoi diagram is a good indicator of the activity of particular neurons and thus allows the effectiveness of particular learning algorithms to be assessed. Fig. 8 (bottom) illustrates the Voronoi diagram for the MCCM mechanism being inactive


(left) and turned on (A = 0.8). As can be observed, the MCCM mechanism activated all neurons and thus divided the input data set into a larger number of data classes.
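Condition (10) translates directly into assigning every pattern to its nearest neuron, which is how the Voronoi regions and the activity of particular neurons can be visualized in a software model. The sketch below is a minimal illustration under the same assumptions as the earlier listings.

```python
import numpy as np

def voronoi_labels(X, W):
    """Assign each pattern in X to the neuron whose Voronoi region it falls into,
    i.e. the neuron with the smallest ||X - W_j||, Eq. (10)."""
    d = np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2)   # (z, n) distance matrix
    return np.argmin(d, axis=1)                                 # index of dominating neuron
```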
In Fig. 9 we present results similar to those presented in Figs. 7 and 8, but for larger NNs with 20 and 30 neurons and for parameters similar to those in the previous test. The values of the A parameter are the same. Note that for 30 neurons and A = 0.8 the quantization error remained at a larger level than in the case of A = 2. This means that A = 0.8 is not sufficient in this case.
To demonstrate the meaning of the value of the A parameter we carried out a series of investigations presented in
Figs. 10–13. Particular tests were selected to cover various scenarios and to demonstrate the effectiveness of the proposed
algorithm.
Fig. 10 shows how quickly the number of dead neurons diminishes for different values of the A parameter, for an example case of 40 neurons in the NN. As can be observed, when the value of A becomes smaller, the process of elimination of inactive neurons takes longer. One can indicate a threshold value of A below which the learning process becomes ineffective. For the presented case of 40 neurons the threshold value equals 1. A trade-off can be observed here. For larger values of the A parameter the process of elimination of dead neurons is more rapid, but in case of a hardware implementation this leads to increased power consumption.
On the basis of Fig. 11 the threshold values of the A parameter can be determined for different numbers of neurons in the NN. We performed these investigations for three different datasets. In each of them the centers were randomly distributed. The differences between particular cases consist in different dispersions of the learning patterns around these centers (A: 2%, B: 10% and C: 20% of the maximum data range). The obtained results are similar, which is an important conclusion. This means that the determined threshold values of the A parameter can be used for different datasets. We also performed tests for cases in which the dispersions around particular centers were different within a single dataset. The results were similar.
In Figs. 12 and 13 we go more deeply into one of the scenarios shown in Fig. 11. The presented tests were performed for datasets with a dispersion of 10%, for different numbers of inputs of the NN. Fig. 12 presents the results for an equal number of patterns in particular classes (50 patterns per class). If the number of inputs increases, the input data space becomes less dense and the elimination of dead neurons is more effective. The results are presented for numbers of neurons varying in the range between 20 and 250. The value of the threshold level of the A parameter increases approximately linearly with the number of neurons. Fig. 13 presents selected results comparing the situations in which the number of patterns in each class is equal (Case A) and in which it varies between 10 and 100 patterns (Case B). The results are comparable, although a slightly larger A should be used if the number of patterns per cluster is different.
As can be seen, increasing the value of the A parameter improves the learning process in all presented situations. One might conclude that instead of optimizing the value of this parameter we could use one value large enough to be effective in all studied cases. This conclusion is not correct if we look from the point of view of a transistor level realization, because the power consumption increases together with the value of the A parameter.
For each number of neurons one can indicate a threshold value of the A parameter above which all neurons are stimulated. This value is more or less constant for a given number of neurons, regardless of the data distribution. However, since a small spread is observed, a safety margin at the level of 10% of the threshold should be considered.
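In practice the threshold values read from diagrams such as Figs. 11 and 14 can be tabulated, and the working value of A chosen with the 10% safety margin mentioned above. The sketch below only illustrates this procedure; apart from the value A_th = 1 reported for 40 neurons, the table contents are placeholders that must be read from the figures, not measured results.

```python
import numpy as np

# Placeholder table {number of neurons: threshold A_th}; only the 40-neuron
# entry is taken from the text, the remaining entries must be read from Fig. 11/14.
A_TH_TABLE = {40: 1.0}

def select_A(n_neurons, table=A_TH_TABLE, margin=0.10):
    """Pick the smallest safe A: interpolate the tabulated threshold, add a 10% margin."""
    sizes = np.array(sorted(table))
    thresholds = np.array([table[s] for s in sizes])
    a_th = np.interp(n_neurons, sizes, thresholds)
    return (1.0 + margin) * a_th
```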

Fig. 10. An influence of the value of the A parameter on the number of dead neurons. The number of dead neurons is determined after each epoch. The
results are shown for an example case of the NN with 40 neurons.


Fig. 11. An influence of the value of the A parameter on the number of dead neurons for different sizes of NN and different datasets after completing the
learning process.

The results shown in Fig. 12 allow a rough dependency between the number of neurons in the NN and the corresponding threshold value of the A parameter (A_th), described above, to be determined. The comparative diagram is shown in Fig. 14 for 2, 5 and 10 inputs of the NN. The general dependency is linear, with a saturation at the bottom values of A_th for small numbers of neurons. The presented diagram can be used to roughly determine the value of the A parameter for different network sizes.
In the previous Section we mentioned that the proposed method used in the WTA NN can replace the conscience mechanism as well as the conventional initialization mechanism. To compare the effectiveness of both approaches, we conducted a series of simulations. One of the criteria is the quantization error described above. If the MCCM method is used, in the initial phase of the learning process the quantization error exhibits larger values. However, the most important is the final value of this error. In both methods the final value is similar, with differences at the level of 5–10%. Both methods – if properly configured – are able to activate all neurons in the NN. Selected results comparing the activity of particular neurons for both methods are shown in Fig. 15.

4.3. Verification of the proposed MCCM circuit

The proposed MCCM circuit has been verified by means of transistor level simulations. A question can be asked whether such simulations are sufficient to correctly assess the performance of this circuit. To be sure that the circuit has been evaluated correctly, we performed a rigorous corner analysis, in which the circuit was tested for different values of the environment temperature (T: from −20 to +120 °C), for different supply voltages (V: from 1.2 to 1.8 V) and for different transistor models (P – process), namely typical, slow and fast. This (PVT) procedure allows reliable results to be obtained, which are usually sufficient when designing commercial chips before chip fabrication. It is also worth noting that the proposed mechanism is based on current-mode circuits similar to our previous circuits, which have been successfully measured [4,5].


Fig. 12. An influence of the value of the A parameter on the number of dead neurons for different sizes of NN and different numbers of inputs of the NN: (a)
for smaller NNs, (b) for larger NNs (up to 250 neurons).

Fig. 13. An influence of the value of the A parameter on the number of dead neurons for two scenarios: Case A: each cluster of data is composed of 50 learning patterns, Case B: the number of patterns in particular clusters varies between 10 and 100.

Selected simulation results are shown in Fig. 16. In this example, the f(t) function increases from 0 to 1 with a step of 1/15. The output signal Î_{x:j,l}(t), for selected values of the input current I_{x:l}(t) and of the current I_{w:j,l}(t) that represents a corresponding weight, is shown in diagram (a). The results for selected values of the input current are visible in the ranges 0–80 µs (I_x = 2 µA), 80–160 µs (I_x = 7 µA), 160–240 µs (I_x = 0 µA) and 240–320 µs (I_x = 2 µA). The circuit operates properly independently


Fig. 14. Threshold value of the A parameter above which all neurons in the NN are activated, as a function of the number of neurons, for different numbers of inputs of the NN.

Fig. 15. Comparison of the effectiveness of the proposed MCCM algorithm with the former approach based on the conscience mechanism [4]. The diagram illustrates the number of wins of particular neurons for an example case of a NN with 20 neurons. The learning process is divided into 20 epochs, each of which embraces 500 presentations of the learning patterns X.

In the 4th period w > x, resulting in a decreasing output signal for an increasing value of the f(t) function. This feature is important, as the virtual image has to be determined correctly in each case.
As shown in diagram (b), the power dissipation depends on the values of particular signals. The I_{w:j,l}(t) signal has the strongest impact on this parameter. The average power for the x and w signals varying between 1 and 7 µA is approximately equal to 20 µW. The smallest power dissipation always occurs at the beginning of the initialization phase, for f(t) ≈ 0, and then increases as f(t) → 1. To minimize the energy consumption of the overall NN, which is important in portable devices, the proposed MCCM circuit should be turned off immediately after completing the initialization phase. To make it possible, additional switches are used that enable the x signals to bypass this circuit. Turning off this circuit is also realized by setting the x and w signals to 0. This situation is shown in the period between 160 and 240 µs.
Diagram (c) illustrates the difference between the output signal Î_{x:j,l}(t) and a theoretical signal calculated on the basis of (9). The error usually does not exceed 1%, with maximum values of 2.5%, which is an acceptable level in this case.
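Equation (9) itself is given earlier in the paper; the behavior described above (the output equal to the weight for f(t) = 0, moving toward the input as f(t) grows, and therefore decreasing when w > x) is consistent with a convex-combination form x̂ = f(t)·x + (1 − f(t))·w. Assuming this form, the theoretical reference signal used in diagram (c) can be sketched as follows; the current values are illustrative, not those of Fig. 16.

```python
def virtual_input(i_x: float, i_w: float, f: float) -> float:
    """Convex combination of input and weight currents (assumed form of Eq. (9))."""
    return f * i_x + (1.0 - f) * i_w

# f(t) swept from 0 to 1 in steps of 1/15, as in the simulation of Fig. 16
steps = [k / 15 for k in range(16)]
i_x, i_w = 2e-6, 5e-6  # example currents in amperes; i_w > i_x, so the output decreases
for f in steps:
    print(f"f={f:.3f}  I_hat={virtual_input(i_x, i_w, f) * 1e6:.3f} uA")
```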

4.4. Comparative analysis of the circuit complexity of particular solutions

One of the important issues while designing any ASIC is the circuit complexity. It is not straightforward to evaluate which of the solutions is better in this regard. The structure of the CONS and INIT blocks used in the previous prototype strongly depends on the number of inputs and outputs of the NN. For particular values of these two parameters, the number of transistors, N_tr, and the number of programming paths, N_pr, can be expressed as follows:
$$
N_{tr} \approx \underbrace{\big(2 \cdot 1.6 \cdot (N_{in} \cdot N_{ne} + 1) \cdot \lfloor \log_2 (N_{in} \cdot N_{ne}) \rfloor + 6 \cdot (N_{in} \cdot N_{ne}) + 2\big)}_{\text{INIT block}} + \underbrace{50 \cdot N_{ne}}_{\text{CONS block}} \qquad (11)
$$


Fig. 16. Transistor level verification of the proposed MCCM circuit: (a) selected values of the f(t) function (I_A = 0), (b) resultant x̂_{i,j} signals for given w_{i,j} and x_i signals, (c) power dissipation of the circuit.

$$
N_{pr} = \lfloor \log_2 (N_{in} \cdot N_{ne}) \rfloor + 1 \qquad (12)
$$


where N_in · N_ne is the total number of weights in the NN (N_in is the number of inputs, while N_ne is the number of neurons in the NN). The factor 1.6 in (11) has been added to reflect the real structure of particular AND logic gates used in the INIT mechanism. The number of addressing paths given by (12) depends on the number of neuron weights. Each neuron must be equipped with an address decoder, which is based on a multi-input AND gate as well as NOT gates. We assumed that the maximum number of inputs of particular AND gates is 3 or 4, so larger gates are realized as a composition of smaller ones.
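For convenience, Eqs. (11) and (12) can be evaluated directly, e.g. as in the short sketch below; it reproduces only the counting formulas and does not model the actual transistor-level netlist.

```python
from math import floor, log2

def transistor_count(n_in: int, n_ne: int) -> float:
    """Approximate number of transistors in the INIT and CONS blocks, Eq. (11)."""
    n_w = n_in * n_ne  # total number of weights in the NN
    init_block = 2 * 1.6 * (n_w + 1) * floor(log2(n_w)) + 6 * n_w + 2
    cons_block = 50 * n_ne
    return init_block + cons_block

def programming_paths(n_in: int, n_ne: int) -> int:
    """Number of programming (addressing) paths, Eq. (12)."""
    return floor(log2(n_in * n_ne)) + 1

# Example: networks with 2, 5 and 10 inputs, as considered in Fig. 17
for n_in in (2, 5, 10):
    for n_ne in (16, 64, 256):
        print(n_in, n_ne, round(transistor_count(n_in, n_ne)), programming_paths(n_in, n_ne))
```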
The results of the analysis are presented in Fig. 17 for selected cases of 2, 5 and 10 inputs. We present the results for numbers of neurons up to 2048 (2, 4, 8, 16, …, 2048). However, in real applications, e.g. in medical healthcare systems used in the analysis of various biomedical signals, the number of neurons will not exceed 256 [37–39].
The top diagram in Fig. 17 illustrates the total number of transistors used only in the INIT blocks. Most of these transistors are used in the address decoders of particular neurons, which are digital circuits. For this reason, these transistors can have the minimum sizes specific to a given CMOS technology.


Fig. 17. A comparative analysis between the previous and the new version of the WTA NN. The number of transistors is shown vs. the number of neurons in the NN, with the number of inputs as a parameter.

The speed of the decoder is not critical, as even for minimum transistor sizes the commutation of the decoding block takes much less time than the subsequent programming phase of a corresponding weight. Particular INIT blocks are used only once during the overall learning process. The area of these transistors is small, but most of the area occupied by the INIT mechanism is taken up by the programming lines.
The bottom diagram in Fig. 17 illustrates the total number of transistors used in both the INIT and the CONS blocks. In both diagrams we also present the numbers of transistors for the new solution based on the MCCM. As can be seen when comparing the total number of transistors, the MCCM circuit occupies less area, but the area is only one of the benefits. The improved initialization abilities of the NN are much more important.

5. Conclusions

This paper presents a novel initialization mechanism of neuron weights for WTA Neural Networks realized at the transistor level. The idea of the proposed solution stems from the CCM mechanism proposed earlier in [26]. Compared to


the original approach, several modifications have been introduced, so that the proposed mechanism is more flexible and better suited for larger Neural Networks.
One of the proposed innovations is the introduction of an additional parameter that enables adjustment of the mechanism (in terms of minimizing the number of dead neurons) to NNs with different numbers of neurons. In the original CCM approach, it was difficult to obtain good performance for numbers of neurons larger than ten. This problem could be resolved, to some extent, by making certain assumptions about the initial values of the neuron weights. In contrast to the original CCM approach, in the proposed method the weights may have zero values at the beginning of the learning process. This facilitates the hardware implementation, which is one of the main objectives presented in this paper. Each case in which the weights have to be initially polarized requires the use of additional programming lines, which increases the complexity of the system.
The proposed algorithm has been verified for numbers of neurons in the NN varying between 10 and 250. Such a verification was necessary to check whether this solution is universal enough to be included in the new prototype of the WTA NN. In the next step the proposed algorithm has been designed as a current-mode analog circuit. The current-mode approach has been selected because in this case all summing and subtracting operations are performed simply in circuit junctions. As a result, the circuit features a simple structure, which saves the chip area of the overall NN. A single block realized in the CMOS 0.18 µm technology occupies an area of 300 µm². The average power dissipation of a single MCCM circuit equals about 20 µW.
The new solution is not only an initialization mechanism. It can be viewed as a learning algorithm used at the initial stage of the overall learning process. In the case of the hardware realization it substitutes several blocks used in the previous prototype, providing similar functionality.

Acknowledgments

The "Development of Novel Ultra Low Power, Parallel Artificial Intelligence Circuits for the Application in Wireless Body Area Network Used in Medical Diagnostics" project is realized within the POMOST programme of the Foundation for Polish Science, co-financed by the European Union from the Regional Development Fund.

References

[1] L. Gatet, H. Tap-Béteille, F. Bony, Comparison between analog and digital neural network implementations for range-finding applications, IEEE Trans. Neural Netw. 20 (3) (March 2009) 460–470.
[2] Y. Oike, M. Ikeda, K. Asada, A high-speed and low-voltage associative co-processor with exact Hamming/Manhattan-distance estimation using word-
parallel and hierarchical search architecture, IEEE J. Solid-State Circuits 39 (8) (Aug. 2004) 1383–1387.
[3] S. Sasaki, M. Yasuda, H.J. Mattausch, Digital associative memory for word-parallel Manhattan-distance-based vector quantization, in: 38th European Solid-State Circuits Conference (ESSCIRC), France, Sept. 2012, pp. 185–188.
[4] R. Długosz, T. Talaśka, W. Pedrycz, R. Wojtyna, Realization of the conscience mechanism in CMOS implementation of winner-takes-all self-organizing
neural networks, IEEE Trans. Neural Netw. 21 (Iss. 6) (June 2010) 961–971.
[5] R. Długosz, T. Talaśka, W. Pedrycz, Current-mode analog adaptive mechanism for ultra-low power neural networks, IEEE Trans. Circuits Syst.–II:
Express Briefs 58 (Iss. 1) (January 2011) 31–35.
[6] R. Długosz, T. Talaśka, Low power current-mode binary-tree asynchronous min/max circuit, Microelectron. J., Elsevier 41 (1) (2010) 64–73.
[7] M. Kolasa, R. Długosz, W. Pedrycz, M. Szulc, A programmable triangular neighborhood function for a Kohonen self-organizing map implemented on
chip, Neural Netw. 25 (2012) 146–160.
[8] T. Becker, M. Kluge, J. Schalk, K. Tiplady, C. Paget, U. Hilleringmann, T. Otterpohl, Autonomous sensor nodes for aircraft structural health monitoring,
IEEE Sensors J. 9 (11) (2009).
[9] M. Vodel, M. Lippmann, M.W. Hardt, Energy-efficient communication with wake-up receiver technologies and an optimised protocol stack, International Conference on Advances in ICT for Emerging Regions (ICTer), 2013.
[10] B. Latré, B. Braem, I. Moerman, C. Blondia, P. Demeester, A survey on wireless body area networks, Wireless Netw. 17 (1) (2011) 1–18.
[11] D. Nguyen, B. Widrow, Improving the learning speed of 2-layer neural networks by choosing initial values of the adaptive weights, International Joint
Conference on Neural Networks (IJCNN), San Diego, USA, pp. 21–26 (III), 1990.
[12] K. Kenni, K. Nakayama, H. Shimodaira, Estimation of initial weight and hidden units for fast learning of multi-layer neural network for pattern classification, International Joint Conference on Neural Networks (IJCNN), Washington, USA, vol. 3, pp. 1652–1656, 1999.
[13] Y.K. Kim, J.B. Ra, Weight value initialization for improving training speed in the backpropagation network, International Joint Conference on Neural
Networks (IJCNN), Seattle, USA, Vol. 3, pp. 2396–2401, 1991.
[14] T. Talaśka, R. Długosz, Initialization mechanism in Kohonen neural network implemented in CMOS technology, in: 11th European Symposium on
Artificial Neural Networks (ESANN), 2008, Bruges, Belgium, pp. 337–342.
[15] S.C. Ahalt, A.K. Krishnamurthy, P. Chen, D.E. Melton, Competitive learning algorithms for vector quantization, Neural Netw. 3 (1990) 131–134.
[16] D. DeSieno, Adding a conscience to competitive learning, IEEE Conf. Neural Netw. 1 (1988) 117–124.
[17] J.Y.F. Yam, T.W.S. Chow, A weight initialization method for improving training speed in feedforward neural network, Neurocomputing 30 (1–4) (Jan. 2000) 219–232.
[18] Y.F. Yam, T.W.S. Chow, C.T. Leung, A new method in determining initial weights of feedforward neural networks for training enhancement, Neurocomputing 16 (1) (1997) 23–32.
[19] G. Thimm, E. Fiesler, Neural network initialization in from neural to artificial neural computation, in: J. Mira, F. Sandoval (Eds.), International Workshop
on Artificial Neural Networks, pp. 535–542, Malaga, 1995.
[20] Y. Chen, F. Bastani, ANN with two-dendrite neurons and its weight initialization, in: International Joint Conference on Neural Networks (IJCNN),
Baltimore, USA, vol. 3, pp. 139–146, 1992.
[21] T. Kohonen, Self-Organizing Maps, Springer Verlag, Berlin, 2001.
[22] A. Okabe, B. Boots, K. Sugihara, S. Nok Chiu, Spatial Tessellations: Concepts and Applications of Voronoi Diagrams, John Wiley, 2000 (ISBN 0-471-98635-6).
[23] D. Graupe, Principles of artificial neural networks, Advanced Series on Circuits and Systems, vol. 6, World Scientific, 2007.
[24] N. Vora, S.S. Tambe, B.D. Kulkarni, Counterpropagation neural networks for fault detection and diagnosis, Comput. Chem. Eng. 21 (2) (1997) 177–185.


[25] M. Lehtokangas, J. Saarinen, Weight initialization with reference patterns, Neurocomputing 20 (1–3) (1998) 265–278.
[26] R. Hecht-Nielsen, Counterpropagation networks, Appl. Opt. 26 (1987) 4979–4984.
[27] S.N. Sivanandam, S. Sumathi, S.N. Deepa, Introduction to neural networks using MATLAB 6.0, Tata McGraw-Hill Computer engineering Series, 2006.
[28] A. Gopalan, A.H. Titus, A new wide range Euclidean distance circuit for neural network hardware implementations, IEEE Trans. Neural Netw. 14 (5) (Sep. 2003).
[29] G. Cauwenberghs, V. Pedroni, A low-power CMOS analog vector quantizer, IEEE J. Solid-State Circuits 32 (8) (1997) 1278–1283.
[30] Bin-Da Liu, Chuen-Yau Chen, Ju-Ying Tsao, A modular current-mode classifier circuit for template matching application, IEEE Trans. Circuits Syst. II:
Analog Digital Signal Processing 47 (2) (2000) 145–151.
[31] S. Vlassis, G. Fikos, S. Siskos, A floating gate CMOS Euclidean distance calculator and its application to hand-written digit recognition, Int. Conf. Image
Processing 3 (2001) 350–353.
[32] K. Bult, H. Wallinga, A class of analog CMOS circuits based on the square-law characteristic of an MOS transistor in saturation, IEEE J. Solid-State Circuits SC-22 (3) (1987) 357–365.
[33] R. Długosz, M. Kolasa, T. Talaśka, J. Pauk, R. Wojtyna, M. Szulc, K. Gargua, P.-A. Farine, Low power, low chip area, digital distance calculation circuit for self-organizing neural networks realized in the CMOS technology, Solid State Phenom. 199 (2013).
[34] T. Talaśka, R. Długosz, Current mode Euclidean distance calculation circuit for Kohonen's neural network implemented in CMOS 0.18 µm technology, IEEE Canadian Conference on Electrical and Computer Engineering (CCECE), Vancouver, Canada, 2007.
[35] R. Długosz, K. Iniewski, Flexible architecture of ultra-low-power current-mode interleaved successive approximation analog-to-digital converter for wireless sensor networks, VLSI Design, Hindawi Publishing, vol. 2007 (2007), Article ID 45269.
[36] R. Długosz, T. Talaśka, P.A. Farine, W. Pedrycz, Convex combination initialization method for Kohonen neural network implemented in the CMOS technology, in: International Conference Mixed Design of Integrated Circuits and Systems (MIXDES), 2012, Warsaw, Poland.
[37] S. Osowski, T.H. Linh, ECG beat recognition using fuzzy hybrid neural network, IEEE Trans. Biomed. Eng. 48 (11) (2001) 1265–1271.
[38] G. Valenza, A. Lanata, M. Ferro, E.P. Scilingo, Real-time discrimination of multiple cardiac arrhythmias for wearable systems based on neural networks,
Comput. Cardiology 35 (2008) 1053–1056.
[39] M. Lagerholm, G. Peterson, Clustering ECG complexes using Hermite functions and self-organizing maps, IEEE Trans. Biomed. Eng. 47 (7) (2000) 838–848.
