
Robust design of Artificial Neural Networks applying the Taguchi methodology and DoE

Ortiz-Rodríguez, J.M.1, 2, Martínez-Blanco, M.R.2, Vega-Carrillo, H.R.1, 2


Universidad Autónoma de Zacatecas
Unidades Académicas: (1)-Estudios Nucleares, (2)-Ingeniería Eléctrica
Av. R. López Velarde #801, Zacatecas, Zacatecas, C.P. 980000, Tel.: 014929239407, Ext. 1515
morvymm@yahoo.com.mx, mrosariomb@yahoo.com.mx, fermineutron@yahoo.com

Abstract

The integration of Artificial Neural Networks and optimization provides a tool for designing robust network parameters and improving their performance. The Taguchi method offers considerable benefits in time and accuracy when compared with the conventional trial-and-error neural network design approach. This work is concerned with the robust design of multilayer feedforward neural networks trained by the backpropagation algorithm and develops a systematic and experimental strategy which emphasizes the simultaneous optimization of the artificial neural network's parameters under various noise conditions. Here, we compare this method with conventional training methods. Attention is drawn to the advantages of Taguchi methods, which offer potential benefits in evaluating the network behavior.

1. Introduction

Although a lot of research on Artificial Neural Networks (ANNs) has concentrated on developing ANN models and training algorithms to improve the accuracy and convergence of the models, there is still a conventional problem in ANN design: determining a suitable set of structural and learning parameter values for an ANN remains a difficult task. Users have to choose the architecture and determine many of the parameters in a selected network [1], [2], [3] and [4]. The integration of neural networks and optimization provides a tool for designing neural network parameters and improving the network performance. The process optimization methods known as Taguchi methods and Design of Experiments (DoE) may be applied to the design and training of ANNs [5], [6], [7], [8], [9] and [10]. In this study, a systematic, methodological and experimental approach is introduced to obtain the optimum design of artificial neural networks. The Taguchi method and the DoE technique are the main techniques used. Unlike previous studies in the design of ANNs, the Taguchi method is used here to simplify the optimization problem.

1.1. Current and Proposed Methods for Optimizing Neural Networks

The size and training parameters of ANNs have a critical effect on their performance, so the Taguchi method and DoE methodology are applied to the optimization of the design parameters of ANNs. Being a “parallel” approach, the method offers considerable benefits in time and accuracy when compared with the conventional serial approach of trial and error [1] and [5]. The current practice of selecting the levels for neural network design parameters is based on trial and error, which is similar to the one-factor-at-a-time experiment [5]. If changing the level of one particular design parameter has no effect on the performance of the neural network, then a different design parameter is varied, and the experiment is repeated in a serial approach. In contrast, the Taguchi method and DoE methodology form a very powerful method based on a parallel process, where all the experiments are planned a priori and the results are analyzed after all the experiments are completed [11] and [12]. There are three types of DoE: one-factor-at-a-time, full factorial and fractional factorial.
There are several ways to design a Multilayered Feedforward Neural Network (MFFN) by using DoE. The full factorial experiment methodology is the most frequently used approach, where the experiments have 2^f possible combinations that must be tested (f = the number of factors, each at two levels). Therefore, it is very time-consuming when there are many factors. In order to minimize the number of tests required, fractional factorial experiments (FFEs) were

Proceedings of the Electronics, Robotics and Automotive Mechanics Conference (CERMA'06)


0-7695-2569-5/06 $20.00 © 2006
Authorized licensed use limited to: NATIONAL INSTITUTE OF TECHNOLOGY WARANGAL. Downloaded on March 05,2010 at 23:54:02 EST from IEEE Xplore. Restrictions apply.
developed. FFEs use only a portion of the total
possible combinations to estimate the effects of the main
factors and of some of the interactions.
Taguchi developed a family of FFE matrices, known
as Orthogonal Arrays (OAs), which can be utilized
in various situations. These matrices reduce the
number of experiments but still provide reasonably rich
information [5], [6], [7], [8], [9] and [10].
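
As a rough illustration of the saving an orthogonal array offers, the sketch below writes out the standard Taguchi L9(3^4) array (the design array this paper applies later) and checks its defining property: every pair of columns contains each combination of levels equally often. The function name `is_orthogonal` is a hypothetical helper introduced only for this example, not part of the authors' tool.

```python
from itertools import combinations, product

# Standard Taguchi L9(3^4) orthogonal array: 9 runs, 4 factors at 3 levels.
L9 = [
    (1, 1, 1, 1),
    (1, 2, 2, 2),
    (1, 3, 3, 3),
    (2, 1, 2, 3),
    (2, 2, 3, 1),
    (2, 3, 1, 2),
    (3, 1, 3, 2),
    (3, 2, 1, 3),
    (3, 3, 2, 1),
]


def is_orthogonal(array, levels):
    """Return True if every pair of columns holds each level pair equally often."""
    n_cols = len(array[0])
    expected = len(array) // (levels * levels)
    for a, b in combinations(range(n_cols), 2):
        counts = {}
        for row in array:
            key = (row[a], row[b])
            counts[key] = counts.get(key, 0) + 1
        for pair in product(range(1, levels + 1), repeat=2):
            if counts.get(pair, 0) != expected:
                return False
    return True


full_factorial_runs = 3 ** 4  # every combination of 4 factors at 3 levels
fractional_runs = len(L9)     # runs actually required by the array

print(is_orthogonal(L9, 3), full_factorial_runs, fractional_runs)
```

With four three-level factors, the full factorial plan needs 81 trainings while the array needs 9, at the cost of not resolving every interaction.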
In ANN design using the Taguchi methodology, the engineer must understand the application problem well and choose a suitable ANN model. In the selected model, the design parameters (factors) which need to be optimized have to be determined (planning stage). Using orthogonal arrays, simulations can be executed in a systematic way (experimentation stage). From the simulation results, the responses can be analyzed by level average analysis and the signal-to-noise (S/N) ratio of the Taguchi method (analysis stage). Finally, a confirmation experiment is conducted at the optimal design condition to check whether the robustness measure is close to the predicted value (confirmation stage) [6], [7], [8], [9], [10], [11] and [12].

2. Materials and methods

MFFN is one of the main classes of neural networks and it plays an important role in many types of problems in science and engineering, such as system identification, control, medicine, pattern recognition and, recently, neutron spectrometry and dosimetry in the nuclear sciences. As a tool for scientific computing and engineering applications, the morphology of MFFNs consists of many interconnected signal processing elements called neurons. These neurons form layered network configurations through only feedforward interlayer synaptic connections in terms of the neural signal flow [1], [2], [3] and [4]. In a MFFN the neurons are organized into layers with no feedback or cross connections. The lowest layer of the MFFN is the input layer, in which the processing elements receive all the weighted neural inputs and provide their outputs to the processing elements of the first hidden layer. The highest layer of the MFFN is the output layer. The outputs from a given layer are transmitted only to the higher layers. A basic structure of the MFFN is shown in figure 1.

Figure 1. A MFFN with input layer, output layer and (M-2) hidden layers

The key distinguishing characteristic of a MFFN with the backpropagation learning algorithm is that it forms a nonlinear mapping from a set of input stimuli to a set of outputs using features extracted from the input patterns. The neural network can be designed and trained to accomplish a wide variety of nonlinear mappings, some of which are very complex. This is because the neural units in the neural network learn to respond to features found in the input. In the design process of MFFNs trained by means of the backpropagation learning algorithm, users have to choose the architecture and determine many of the parameters in a selected network. Several learning factors, such as the initial weights, the learning rate, the number of hidden neural layers and the number of neurons in each layer, may be reselected if the iterative learning process does not converge quickly to the desired point. The trial-and-error technique is the usual way to get a better combination of network architecture and parameters; however, it does not use systematic methodologies for the identification of the “best” values. The process is treated very much as a serial “trial and error” exercise, consuming much time, and does not systematically target a near optimal solution.

2.1. Robust design of ANN of the type MFFN with backpropagation learning algorithm

Robust design problems are broadly classified into static and dynamic ones. The objective of robust design for a static problem is to achieve the smallest variation of the performance characteristic around its target value under various noise conditions. On the other hand, dynamic problems are classified into Continuous-Input-Continuous-Output (CICO), Continuous-Input-Digital-Output (CIDO), Digital-Input-Continuous-Output (DICO) and Digital-Input-Digital-Output (DIDO) cases. The objective of robust design for a dynamic problem is to achieve the smallest variation of the performance characteristic around its target values

over the range of the signal parameters under various noise conditions. In the robust design of ANNs in this work, the target is not fixed but variable; the target variable for training a neural network is therefore considered as a signal parameter, and the robust design of an ANN trained by the BP learning algorithm is treated as a dynamic problem.
Among the four cases of dynamic problems, only the CICO and DIDO cases (the DIDO problems considered have a binary input and a binary output, BIBO) can appear in ANN training, since the output of a backpropagation network must be of the same type as the target variable. In the context of a neural network, the CICO case corresponds to the prediction (or regression-type) problem while the BIBO case corresponds to the binary classification problem.
In order to design a high-performance MFFN, the parameters related to the training (learning rate, momentum) as well as the network structure (number of hidden layers and neurons) must be considered simultaneously, together with their interaction effects. In addition, there exist nuisance parameters in training. They include the initial set of weights, the division of the whole data set into the training and testing data sets, and the sizes of the training and testing data sets. It is also desired that the performance of a MFFN be robust to these nuisance parameters.

3. Results

Based on the Taguchi method and DoE methodology, a systematic and methodological strategy was developed to design robust ANNs trained by means of the backpropagation learning algorithm, as shown in figure 2.

Figure 2. Methodological and systematic strategy for robust design of MFFN

The networks designed by means of this methodological, systematic and experimental strategy can be either in the continuous or discrete domain. The methodology was implemented within the Matlab programming environment and using the JMP statistical program [13] and [14], and is composed of the planning, experimentation, analysis and confirmation stages.
The Robust Design of Artificial Neural Networks (RDANN) methodology was applied in nuclear sciences to design several neural networks in order to solve problems in the neutron spectrometry and dosimetry field [11], [15], [16], and [17]. The steps followed to obtain the optimum design of the artificial neural networks under consideration are as follows:

3.1. Stage I: Planning Stage.

In the planning stage it is necessary to identify the objective function and the design and noise variables. The objective function of the neutron spectra unfolding problem is the performance, or mean square error (MSE), of the output of the ANN, as shown in equation 1.

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(\Phi_E(E)_i^{ANN} - \Phi_E(E)_i^{ORIGINAL}\right)^2 \qquad (1)$$

where N is the number of trials, $\Phi_E(E)_i^{ORIGINAL}$ is the original spectrum and $\Phi_E(E)_i^{ANN}$ is the spectrum unfolded with the ANN.
Among the various parameters that affect the ANN performance, four design variables were selected, each at three levels, as shown in table I, where A and B are the number of neurons in the first and second hidden layers respectively, C is the momentum and D is the learning rate.

Table I. Design Variables and their levels

Design variables   Level 1   Level 2   Level 3
A                  N/2       N         2N
B                  0         N         2N
C                  0.001     0.1       0.3
D                  0.1       0.3       0.5

The noise variables, each at two levels, are shown in table II; in most cases they are not controlled by the user. The initial set of weights (U) and the size of the training and testing sets (V) are usually selected randomly. Once V is determined, which data of the whole data set are included in the training or testing data set (W) is randomly determined and not controlled by the user.
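
The planning stage can be sketched in a few lines: the objective function of equation 1 and a lookup that turns a row of level indices into the concrete settings of Table I. This is only an illustration under assumptions. The paper does not state what N stands for in Table I, so `n` is a hypothetical base size, N/2 is taken as integer division, and a level of 0 for B is read as "no second hidden layer". The names `mse`, `design_levels` and `decode` are introduced for this example only.

```python
def mse(unfolded, original):
    """Mean square error between unfolded and original values (equation 1)."""
    if len(unfolded) != len(original) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((u - o) ** 2 for u, o in zip(unfolded, original))
    return total / len(original)


def design_levels(n):
    """Table I levels for a base size n (the meaning of N is an assumption)."""
    return {
        "A": (n // 2, n, 2 * n),  # neurons in the first hidden layer
        "B": (0, n, 2 * n),       # neurons in the second hidden layer
        "C": (0.001, 0.1, 0.3),   # momentum
        "D": (0.1, 0.3, 0.5),     # learning rate
    }


def decode(row, levels):
    """Turn 1-based level indices for (A, B, C, D) into concrete settings."""
    return {name: levels[name][idx - 1] for name, idx in zip("ABCD", row)}


# Hypothetical usage: second row of an L9 array with a base size of 10.
settings = decode((1, 2, 2, 2), design_levels(10))
print(settings)
print(mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Each row of the design array then describes one network to train, and the MSE of that network under each noise condition is the response recorded for the analysis stage.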

Table II. Noise variables and their levels

Noise variables   Level 1           Level 2
U                 Set 1             Set 2
V                 6:4               8:2
W                 Training1/Test1   Training2/Test2

3.2. Stage II: Experimentation

After the factors and levels are determined, a suitable Orthogonal Array (OA) can be selected for the training process. For a robust experimental design, Taguchi suggests using two crossed OAs. The Taguchi OAs are denoted by L_r(S^c), where r is the number of rows, c is the number of columns and S is the number of levels in each column. A design variable is assigned to a column of the design OA, so that each row of the design OA represents a specific design of an ANN. Similarly, a noise variable is assigned to a column of the noise OA, each row of which then corresponds to a noise condition. Within the RDANN methodology, a tool was designed in Matlab® which automates the analysis, processing and presentation of the information; this tool significantly reduces the time employed in training, testing and processing all the information generated in the network design process. The RDANN Matlab® tool was used in the experimentation stage.

3.3. Stage III: Analysis

The signal-to-noise ratio (S/N) is a measure of both the location and the dispersion of the measured responses. It transforms the raw data to allow quantitative evaluation of the design parameters considering their mean and variation. It is measured in decibels using the formula:

$S/N = -10 \log_{10}(\mathrm{MSD})$ (2)

where MSD is a measure of the mean square deviation in performance. Since in every design more signal and less noise is desired, the best design will have the highest S/N ratio. In the analysis stage, the JMP statistical program was used to select the best values of the ANN being designed.

3.4. Stage IV: Confirmation

In this stage the value of the robustness measure at the optimal design condition is predicted, a confirmation experiment is conducted at the optimal design condition, the robustness measure for the performance characteristic is calculated, and it is checked whether the robustness measure is “close” to the predicted value. In the confirmation stage, the RDANN tool and JMP were used to select the best values of the final ANNs being designed.
The RDANN methodology was utilized to design two ANNs: one capable of solving the neutron spectrometry unfolding problem by using a data set of 187 neutron spectra compiled by the IAEA [18], and another for the neutron spectrometry unfolding problem with simultaneous dose calculation, using a 187 neutron spectra data set compiled by the IAEA [18] and 13 different equivalent doses per spectrum. In the application of the RDANN methodology, a crossed OA was used with the L9(3^4) and L4(2^3) configuration for the neutron spectra unfolding case and the L9(3^4) and L4(2^2) configuration for the neutron unfolding and dose calculation case, obtaining the following results.
In the ANN design, confirmation experiments were made by selecting the parameters with the optimal levels proposed in the analysis stage; with these, the different ANN architectures were trained and tested to determine the optimum ANN topologies. Table III shows the best learning and architectural parameters obtained after carrying out the RDANN methodology in the problems considered, which were calculated by using the JMP statistical program and confirmed by means of confirmation tests.

Table III. Best values in the confirmation stage of the neutron unfolding problem

                             A    B    C     D
Neutron spectra unfolding    14   0    0.1   0.1
Neutron unfolding and dose   26   39   0.1   0.1

Once the best ANN topologies were determined, a final training and testing was carried out in order to solve the two problems considered in this work. In the final ANN validation, the correlation and chi-square statistical tests were applied to the spectra used in the testing stage (37 of 187) to evaluate and validate the data obtained, comparing the target neutron spectra with the spectra obtained with the designed ANN.
Tables IV and V show that all neutron spectra and dose calculations pass the chi-square statistical test, which demonstrates that statistically there is no difference between the doses and neutron spectra reconstructed by the designed ANNs and the target neutron spectra. Similarly, tables VI and VII show the correlation statistical test applied to the 37 neutron spectra and doses, where it can be seen that the whole data set is near the optimum value of one. In these tables, SP is the spectrum number, χ² is the chi-square test and R is the correlation statistical test.
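
Returning to the analysis stage, the S/N ratio of equation 2 can be sketched as follows. The paper does not spell out how MSD is computed, so this example assumes the common smaller-the-better form, in which MSD is the mean of the squared responses (here, the MSE of one network design under each noise condition). Crossing a 9-row design array with a 4-row noise array gives 36 trainings, so each design row ends up with four responses. The names `sn_ratio` and `rank_designs` and the numbers in `example` are hypothetical.

```python
import math


def sn_ratio(responses):
    """S/N in decibels per equation 2, with MSD as the mean squared response."""
    if not responses:
        raise ValueError("need at least one response")
    msd = sum(y * y for y in responses) / len(responses)
    if msd == 0:
        return math.inf  # a perfect response has no finite S/N
    return -10.0 * math.log10(msd)


def rank_designs(results):
    """Return the design row with the highest S/N and all computed ratios."""
    ratios = {row: sn_ratio(values) for row, values in results.items()}
    best = max(ratios, key=ratios.get)
    return best, ratios


# Hypothetical MSE values: one list per design row, one entry per noise row.
example = {
    1: [0.10, 0.10, 0.10, 0.10],
    2: [0.01, 0.02, 0.01, 0.02],
}
best_row, ratios = rank_designs(example)
print(best_row, ratios)
```

A design whose error is both small and stable across the noise rows gets the highest ratio, which is the sense in which the selected network is robust.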

Table IV. Chi square test applied to the neutron spectra unfolding problem

SP   χ²        SP   χ²        SP   χ²        SP   χ²
1    0.1428    11   0.4494    21   18.0694   31   0.0248
2    0.1226    12   0.0253    22   0.2441    32   0.0262
3    0.1539    13   0.0339    23   0.0526    33   0.0555
4    0.2983    14   0.0915    24   0.2196    34   0.0666
5    0.0498    15   0.1029    25   0.0611    35   0.1474
6    0.0292    16   0.5833    26   0.8350    36   0.0780
7    0.6707    17   0.0424    27   0.2429    37   0.1825
8    0.0338    18   0.3581    28   0.1664
9    0.1028    19   0.1883    29   0.0132
10   0.0558    20   0.1191    30   0.0195

Table V. Chi square test applied to the neutron spectra unfolding and dose calculus problem

SP   χ²        SP   χ²        SP   χ²        SP   χ²
1    0.0810    11   0.1014    21   0.1079    31   6.0629
2    0.2307    12   0.3108    22   0.0745    32   0.0519
3    0.1085    13   0.0247    23   0.0770    33   0.2579
4    0.0293    14   0.1194    24   1.9519    34   0.2180
5    0.0224    15   0.0229    25   0.3212    35   0.6124
6    0.2411    16   0.0743    26   0.2405    36   0.2727
7    0.0625    17   0.3636    27   0.1057    37   0.0410
8    1.0789    18   0.2422    28   1.8461
9    0.3996    19   0.2108    29   0.0293
10   0.0116    20   0.1226    30   0.0570

Table VI. Correlation test applied to the neutron spectra unfolding problem

SP   R         SP   R         SP   R         SP   R
1    0.9944    11   0.9343    21   0.9904    31   0.9984
2    0.8651    12   0.9832    22   0.9734    32   0.9969
3    0.9814    13   0.9633    23   0.9768    33   0.9909
4    0.9788    14   0.8935    24   0.9857    34   0.9053
5    0.9840    15   0.9567    25   0.9889    35   0.8622
6    0.9992    16   0.8502    26   0.9957    36   0.9767
7    0.9358    17   0.9995    27   0.9995    37   0.8065
8    0.9883    18   0.9925    28   0.9703
9    0.9880    19   0.9839    29   0.9946
10   0.9942    20   0.8904    30   0.9925

Table VII. Correlation test applied to the neutron spectra unfolding and dose calculus problem

SP   R         SP   R         SP   R         SP   R
1    0.9986    11   0.9990    21   0.9748    31   0.2423
2    0.9989    12   0.9468    22   0.9963    32   0.9976
3    0.9664    13   0.9965    23   0.9924    33   0.9686
4    0.9953    14   0.9781    24   0.7328    34   0.9955
5    0.9985    15   0.9910    25   0.9627    35   0.9863
6    0.9791    16   0.9988    26   0.9984    36   0.9964
7    0.9983    17   0.9993    27   0.9428    37   0.9847
8    0.8896    18   0.9698    28   0.9426
9    0.9825    19   0.9560    29   0.9917
10   0.9975    20   0.9988    30   0.9974

Figures 3 to 6 show the best unfolded neutron spectra and the neutron spectra and dose calculations obtained at the final testing stage of the designed ANNs, compared with the target ones, applying the RDANN methodology.

Figure 3. Spectrum 10/37 (187 spectra and 13 equivalent doses)
Figure 4. Dose 4/37 (187 spectra and 13 equivalent doses)
Figure 5. Spectrum 29/37 (187 spectra)
Figure 6. Correlation test of spectrum 29/37 (187 spectra)
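
The two statistics reported in Tables IV to VII can be sketched as below. The paper does not give the exact formulas it used, so this example assumes the Pearson chi-square statistic with the target spectrum as the expected value, and the Pearson correlation coefficient for R. The function names and the toy spectra are hypothetical.

```python
import math


def chi_square(reconstructed, target):
    """Pearson chi-square statistic; bins with a zero target are skipped."""
    if len(reconstructed) != len(target):
        raise ValueError("spectra must have the same number of bins")
    return sum((r - t) ** 2 / t for r, t in zip(reconstructed, target) if t != 0)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical four-bin spectra, for illustration only.
target_spectrum = [1.0, 4.0, 2.0, 1.0]
ann_spectrum = [2.0, 3.0, 2.0, 1.0]
print(chi_square(ann_spectrum, target_spectrum))
print(pearson_r(ann_spectrum, target_spectrum))
```

A chi-square value below the critical value for the chosen significance level, together with R close to one, corresponds to the situation the tables report for most of the 37 test spectra.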

4. Conclusions

A systematic and experimental strategy was developed for the robust design of ANNs. In the proposed strategy, learning and structural parameters are simultaneously optimized under various noise conditions, and the robust design problem is formulated as a dynamic one, together with chi-square and correlation statistical analyses of the ANN output to validate the results obtained.
The proposed method significantly reduces the time required to prepare, process and present the information in an appropriate way to the designer and, in consequence, it significantly reduces the time spent in the search for the optimal topology of the neural network being designed. This leaves the researcher time to solve the problem at hand, rather than spending a long time determining the parameters of the network itself.
The developed approach is proposed as a general method to design ANNs of the MFFN type trained by means of the backpropagation learning algorithm, to determine the optimum parameters of the ANN being designed; it is useful for problems with continuous or digital data.
The proposed systematic and experimental approach is a useful alternative for the robust design of ANNs. It offers a convenient way of simultaneously considering design and noise variables, and of incorporating the concept of robustness in the ANN design process. The RDANN methodology was applied successfully in the nuclear sciences to solve neutron spectrum unfolding and dose calculation problems, designing robust ANNs in a short time.

5. References

[1] Gupta M.M., Lin J., and Homma N., “Static and Dynamic Neural Networks”, From Fundamentals to Advanced Theory, John Wiley & Sons, 2003.

[2] Haykin S., “Neural networks: A Comprehensive Foundation”, Prentice Hall, 1999.

[3] Jain, A.K., Mao J., and Mohiuddin, K.M., “Artificial Neural Networks: a tutorial”, IEEE, 1996.

[4] Lippmann R.P., “An Introduction to Computing with Neural Networks”, IEEE ASSP Magazine, 4(2), 1987.

[5] Packianather M.S. and Drake P.R., “Modelling Neural Network Performance Through Response Surface Methodology for Classifying Wood Veneer Defects”, IMechE (Proc. Instn. Mech. Engrs.), 218, Part B, 2004.

[6] Shyam M.N., “Robust Design”, Department of Aerospace Engineering, Indian Institute of Technology, Bombay, 2002.

[7] Lin, TY, and Tseng, CH., “Optimum Design for Artificial Neural Networks: an Example in a Bicycle Derailleur System”, Eng. Appl. Artificial Intelligence, 2000.

[8] Peterson G.E., St. Clair D.C., Aylward S., and Bond W., “Using Taguchi's Method of Experimental Design to Control Errors in Layered Perceptrons”, IEEE Transactions on Neural Networks, 6, 1995.

[9] Chen Y., Tam S.C., Chen W.L., Zheng H.Y., “Application of Taguchi Method in the Optimization of Laser Micro-Engraving of Photomasks”, Intern. J. Materials and Product Tech., 11(3-4), 1996.

[10] Jiju A. and Jiju Frenie A., “Teaching the Taguchi Method to Industrial Engineers”, MCB University Press, 50(4), 2001.

[11] Ortiz-Rodríguez, J.M., “Diseño Robusto de Redes Neuronales Aplicadas en la Espectrometría de Neutrones”, master's thesis, Unidad Académica de Estudios Nucleares, Universidad Autónoma de Zacatecas, 2005.

[12] Ortiz-Rodríguez J.M., Martínez-Blanco M.R., Vega-Carrillo H.R., “Diseño Robusto de Redes Neuronales Artificiales”, Encuentro de Investigación en Ingeniería Eléctrica (ENINVIE) 2006, UAZ, 2006.

[13] Matlab 7.0, Help Neural Network Toolbox, 2006.

[14] JMP, [online]. “The Statistical Discovery Software”. <http://www.jmp.com/support/jmpsoftwareupdates.shtml> [Accessed May 2005].

[15] Ortiz-Rodríguez J.M., Martínez-Blanco M.R., Arteaga-Arteaga T., Vega-Carrillo H.R., Hernández Dávila V.M., Manzanares-Acuña E., “Reconstrucción de Espectros de Neutrones a Partir del Sistema Espectrométrico de Esferas de Bonner”, XVIII Conferencia Internacional y VIII Congreso Nacional de Dosimetría de Estado Sólido, Zacatecas, Zac., 2005.

[16] Martínez-Blanco M.R., “Espectrometría de Neutrones y Cálculo de Dosis Equivalentes Empleando la Metodología de Diseño Robusto de Redes Neuronales Artificiales”, bachelor's thesis, Unidad Académica de Ingeniería Eléctrica, Universidad Autónoma de Zacatecas, 2006.

[17] Martínez-Blanco M.R., Ortiz-Rodríguez J.M., Vega-Carrillo H.R., “Espectrometría de Neutrones y Cálculo de Dosis Equivalentes, Aplicando la Metodología de Diseño Robusto de Redes Neuronales Artificiales”, ENINVIE 2006, UAZ, 2006.

[18] Vega-Carrillo H.R., Íñiguez M.P., “Catalogue to Select the Initial Guess Spectrum During Unfolding”, Nucl. Instrum. Meth. Phys. Res. A, 2002.

