
Hybrid model based on SVM with Gaussian loss function and adaptive Gaussian PSO

Qi Wu a,b,*, Shuyan Wu c, Jing Liu d

a School of Mechanical Engineering, Southeast University, Nanjing, Jiangsu 210096, China
b Key Laboratory of Measurement and Control of CSE (School of Automation, Southeast University), Ministry of Education, Nanjing, Jiangsu 210096, China
c Zhengzhou College of Animal Husbandry, Zhengzhou, Henan 450011, China
d College of Information Engineering, Shanghai Maritime University, Shanghai 200135, China

* Corresponding author at: School of Mechanical Engineering, Southeast University, Nanjing, Jiangsu 210096, China. Tel.: +86 25 51166581; fax: +86 25 511665260. E-mail address: wuqi7812@163.com (Q. Wu).
ARTICLE INFO

Article history: Received 15 March 2009; received in revised form 9 July 2009; accepted 16 July 2009.

Keywords: Support vector machine; Particle swarm optimization; Adaptive; Gaussian loss function; Forecasting
ABSTRACT

In view of the poor capability of the standard support vector machine (SVM) in handling white noise in input series, a new ν-SVM with a Gaussian loss function, called g-SVM, is put forward to suppress white noise. To seek the unknown parameters of the g-SVM, an adaptive normal Gaussian particle swarm optimization (ANPSO) algorithm is also proposed. The results of two applications show that the hybrid forecasting model based on the g-SVM and ANPSO is feasible and effective, and a comparison with other methods shows that it outperforms the standard ν-SVM and other traditional methods.

© 2009 Elsevier Ltd. All rights reserved.
1. Introduction
As an implementation of the structural risk minimization (SRM) principle, in which the generalization error is bounded by the sum of the training error and a confidence-interval term depending on the VC dimension, support vector machines (SVMs) (Vapnik, 1995) have recently attracted many researchers from the machine learning and pattern classification communities for fascinating properties such as high generalization performance and a globally optimal solution (Mohammadi and Gharehpetian, 2009; Übeyli, 2008; Yang et al., 2009; Frias-Martinez et al., 2006; Acır et al., 2006; Samanta et al., 2003; Liu and Chen, 2007; Hao, 2008). In an SVM, the original input space is mapped into a higher-dimensional feature space, in which an optimal separating hyperplane is constructed on the basis of SRM to maximize the margin between two classes, i.e., to maximize the generalization ability. The SVM was initially designed to solve pattern recognition problems (Mohammadi and Gharehpetian, 2009; Übeyli, 2008; Yang et al., 2009; Frias-Martinez et al., 2006; Acır et al., 2006; Samanta et al., 2003; Liu and Chen, 2007; Hao, 2008). Recently, with the introduction of Vapnik's ε-insensitive loss function, the SVM has been extended to function approximation and regression estimation problems (Tao et al., 2005; Goel and Pal, 2009; Osowski and Garanty, 2007; Colliez et al., 2006; Vong et al., 2006; Bergeron et al., 2005; Huang et al., 2005; Wang et al., 2005; Bao et al., 2005; Sun and Sun, 2003; Wu, 2009; Lute et al., 2009; Wu et al., 2008a, b).
In the SVR approach, the parameter ε controls the sparseness of the solution in an indirect way. However, it is difficult to come up with a reasonable value of ε without prior information about the accuracy of the output values. Schölkopf et al. (2000) modified the original ε-SVM and introduced the ν-SVM, in which a new parameter ν controls the number of support vectors and the number of points that lie outside the ε-insensitive tube. The value of ε in the ν-SVM is then traded off against model complexity and the slack variables via the constant ν.
However, some SVMs encounter difficulties in real applications (Hao, 2008; Tao et al., 2005; Wu et al., 2008a). Improved SVMs have been put forward to solve pattern recognition problems (Liu and Chen, 2007; Hao, 2008) and regression estimation problems (Huang et al., 2005; Wang et al., 2005). The standard SVM adopting the ε-insensitive loss function has good generalization capability in many applications (Mohammadi and Gharehpetian, 2009; Übeyli, 2008; Yang et al., 2009; Frias-Martinez et al., 2006; Acır et al., 2006; Samanta et al., 2003; Liu and Chen, 2007; Hao, 2008; Tao et al., 2005; Goel and Pal, 2009; Osowski and Garanty, 2007; Colliez et al., 2006; Vong et al., 2006; Bergeron et al., 2005; Huang et al., 2005; Wang et al., 2005; Bao et al., 2005), but it is difficult for it to handle normally distributed noise in the data set (Vapnik, 2000; Cao and Wang, 2007). Therefore, the main contribution of this paper is the modeling of a new ν-SVM that can penalize Gaussian noise in the input series.
In this paper, a new ν-SVM with a Gaussian loss function, called g-SVM, is proposed to penalize white noise in a series effectively, something that SVMs with the ε-insensitive loss function are unsuited to do. Most SVM formulations in the literature use slack variables to the first power, whereas our support vector machine uses squared slack variables while keeping the constraints unchanged. Our new SVM also differs from the least squares SVM of Vong et al. (2006): the least squares SVM likewise has squared slack variables, but its constraint conditions are modified into equality form.

The g-SVM is described in Section 2. Section 3 proposes a new PSO, called PSO with adaptive and normal Gaussian mutation (ANPSO), to obtain the optimal parameters of the g-SVM. Two applications based on the g-SVM and ANPSO are given in Section 4, where the g-SVM is also compared with the standard ν-SVM and the autoregressive moving average model (ARMA; Wu et al., 2008a). Section 5 draws the conclusions.
2. g-SVM
2.1. g-SVM model
Suppose the training sample set is T = {(x_1, y_1), ..., (x_i, y_i), ..., (x_l, y_l)}, where x_i ∈ R^d, y_i ∈ R, i = 1, ..., l. The ε-insensitive loss function can be described as follows:

$$c(x_i, y_i, f(x_i)) = |y_i - f(x_i)|_{\varepsilon} \tag{1}$$

where $|y_i - f(x_i)|_{\varepsilon} = \max\{0, |y_i - f(x_i)| - \varepsilon\}$ and ε is a given real number.
The standard ν-SVM with the ε-insensitive loss function can be described as follows:
$$\begin{aligned}
\min_{w,\,b,\,\xi^{(*)},\,\varepsilon}\quad & \tau(w, \xi^{(*)}, \varepsilon) = \frac{1}{2}\|w\|^2 + C\left(\nu\varepsilon + \frac{1}{l}\sum_{i=1}^{l}(\xi_i + \xi_i^*)\right) \\
\text{s.t.}\quad & (w \cdot x_i) + b - y_i \le \varepsilon + \xi_i \\
& y_i - (w \cdot x_i) - b \le \varepsilon + \xi_i^* \\
& \xi_i^{(*)} \ge 0,\ \varepsilon \ge 0
\end{aligned} \tag{2}$$

where w is a column vector of dimension d, C > 0 is a penalty factor, ξ_i^(*) (i = 1, ..., l) are slack variables, ν ∈ (0, 1] is an adjustable regularization parameter, and ε is an adjustable tube-size parameter.
Definition 1 (White noise). Noise with a normal distribution is called white noise. Gaussian noise is a special case of white noise. Generally, the standard Gaussian density model N(0, 1) is used to describe such noise.
However, with the ε-insensitive loss it is difficult to handle white noise in the input series (Cao and Wang, 2007). To remedy this shortcoming of the ε-insensitive loss, the Gaussian function is selected as the loss function of the ν-SVM. The Gaussian loss function can then be defined as follows:
$$c(x_i, y_i, f(x_i)) = |y_i - f(x_i)|_{\varepsilon} \tag{3}$$

where $|y_i - f(x_i)|_{\varepsilon} = \frac{1}{2}\|y_i - f(x_i)\|^2$ and ε is the controlling parameter of the ε-tube.
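As a tiny illustration (ours, not from the paper), the following contrasts the two losses of Eqs. (1) and (3) for a residual r = y_i - f(x_i): the ε-insensitive loss ignores small residuals entirely, while the Gaussian (squared) loss penalizes every residual, which is what lets the g-SVM absorb zero-mean noise.

```python
def eps_insensitive_loss(r: float, eps: float) -> float:
    """Eq. (1): max{0, |r| - eps}."""
    return max(0.0, abs(r) - eps)

def gaussian_loss(r: float) -> float:
    """Eq. (3): (1/2) r^2."""
    return 0.5 * r * r

print(eps_insensitive_loss(0.3, eps=0.5))  # 0.0: small residuals are ignored
print(gaussian_loss(0.3))                  # 0.045: every residual is penalized
```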
By integrating the Gaussian loss function into SVM theory, a new ν-SVM called g-SVM is proposed. The g-SVM can penalize normal Gaussian noise in the sample data and is described as follows:
$$\begin{aligned}
\min_{w,\,b,\,\xi^{(*)},\,\varepsilon}\quad & \tau(w, \xi^{(*)}, \varepsilon) = \frac{1}{2}\|w\|^2 + C\left(\nu\varepsilon + \frac{1}{l}\sum_{i=1}^{l}\frac{1}{2}(\xi_i^2 + \xi_i^{*2})\right) \\
\text{s.t.}\quad & (w \cdot x_i) + b - y_i \le \varepsilon + \xi_i \\
& y_i - (w \cdot x_i) - b \le \varepsilon + \xi_i^* \\
& \xi_i^{(*)} \ge 0,\ \varepsilon \ge 0
\end{aligned} \tag{4}$$
It is obvious that the slack variables ξ_i^(*) of the g-SVM, whose structure is illustrated in Fig. 1, appear squared, whereas those of the ν-SVM appear to the first power. Since the g-SVM shown in Fig. 1 has the same structure as the standard ν-SVM, it can also be solved by quadratic programming (QP). Obviously, the constraint conditions of the g-SVM are the same as those of the ν-SVM. Compared with the least squares support vector machine (LS-SVM) (Vong et al., 2006), the constraint conditions of the g-SVM take the form of inequalities, while those of the LS-SVM take the form of equalities.

Problem (4) is a quadratic programming (QP) problem. The steps of its solution are described as follows:
Step 1: Suppose the training sample set T = {(x_1, y_1), ..., (x_i, y_i), ..., (x_l, y_l)}, where x_i ∈ R^d, y_i ∈ R.

Step 2: Select the kernel function K, the regularization parameter ν and the penalty factor C. Construct the QP problem (4) of the g-SVM.
Step 3: By introducing Lagrangian multipliers, the dual problem can be defined as follows:

$$\begin{aligned}
\max_{\alpha,\,\alpha^*}\quad W(\alpha, \alpha^*) ={}& \sum_{i=1}^{l}(\alpha_i^* - \alpha_i)y_i - \frac{1}{2}\sum_{i,j=1}^{l}(\alpha_i^* - \alpha_i)(\alpha_j^* - \alpha_j)K(x_i, x_j) - \frac{l}{2C}\sum_{i=1}^{l}(\alpha_i^2 + \alpha_i^{*2}) \\
\text{s.t.}\quad & \sum_{i=1}^{l}(\alpha_i^* - \alpha_i) = 0,\quad \sum_{i=1}^{l}(\alpha_i^* + \alpha_i) \le C\nu \\
& \alpha_i,\ \alpha_i^* \in [0, C/l],\quad i = 1, \ldots, l
\end{aligned} \tag{5}$$

The Lagrangian multipliers α_i^(*) can be determined by solving problem (5).
Step 4: For a new input x, construct the following regression function:

$$f(x) = \sum_{i=1}^{l}(\alpha_i^* - \alpha_i)K(x_i, x) + b \tag{6}$$
Fig. 1. The architecture of g-SVM: the inputs x_1, x_2, ..., x_n are fed to kernel units K(x_1, x), K(x_2, x), ..., K(x_n, x), whose outputs are combined with the weights w_1, w_2, ..., w_n to produce the output y(x).
Select two scalars α_j (α_j ∈ (0, C/l)) and α_k^* (α_k^* ∈ (0, C/l)); the parameter b can then be computed by

$$b = \frac{1}{2}\left[(y_j + y_k) - \sum_{i=1}^{l}(\alpha_i^* - \alpha_i)\big(K(x_i, x_j) + K(x_i, x_k)\big)\right] \tag{7}$$
The parameter ε can be obtained from either of the following equations:

$$\varepsilon = \sum_{i=1}^{l}(\alpha_i^* - \alpha_i)K(x_i, x_j) + b - y_j \tag{8}$$

or

$$\varepsilon = y_k - \sum_{i=1}^{l}(\alpha_i^* - \alpha_i)K(x_i, x_k) - b \tag{9}$$
The weight vector w of problem (4) has a unique solution, while the Lagrangian multipliers α^(*) may have many solutions. Theorem 1, which explains the relationship between the Lagrangian multipliers α^(*) and the weight vector w, is given as follows:
Theorem 1. Let α^(*) = (α_1, α_1^*, ..., α_l, α_l^*) be an arbitrary solution of the dual problem (5). Then the unique solution for the weight vector w can be represented as follows:

$$w = \sum_{i=1}^{l}(\alpha_i^* - \alpha_i)x_i \tag{10}$$
The proof of Theorem 1 is arranged in Appendix A.
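To make Steps 1–4 concrete, the following is a minimal sketch (ours, not the authors' Matlab implementation) that solves the dual (5) with the off-the-shelf cvxopt QP solver and evaluates Eqs. (6) and (7). The Gaussian RBF kernel and all parameter values are illustrative assumptions, and taking the largest α and α* for Eq. (7) is a crude stand-in for picking multipliers strictly inside (0, C/l).

```python
import numpy as np
from cvxopt import matrix, solvers

def rbf_kernel(X1, X2, sigma):
    """Gaussian RBF kernel K(x, x') = exp(-||x - x'||^2 / (2 sigma^2))."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def train_gsvm(X, y, C=100.0, nu=0.5, sigma=1.0):
    """Solve the dual (5) for z = [alpha; alpha*], then recover b via Eq. (7)."""
    l = len(y)
    K = rbf_kernel(X, X, sigma)
    B = np.hstack([-np.eye(l), np.eye(l)])      # maps z = [alpha; alpha*] to alpha* - alpha
    P = B.T @ K @ B + (l / C) * np.eye(2 * l)   # quadratic term, incl. the (l/2C) penalty
    q = np.concatenate([y, -y]).astype(float)   # linear term of -W (we minimize -W)
    # constraints: 0 <= z <= C/l, sum(alpha + alpha*) <= C*nu, sum(alpha* - alpha) = 0
    G = np.vstack([-np.eye(2 * l), np.eye(2 * l), np.ones((1, 2 * l))])
    h = np.concatenate([np.zeros(2 * l), np.full(2 * l, C / l), [C * nu]])
    A = np.hstack([-np.ones(l), np.ones(l)]).reshape(1, -1)
    solvers.options['show_progress'] = False
    sol = solvers.qp(matrix(P), matrix(q), matrix(G), matrix(h), matrix(A), matrix(0.0))
    z = np.array(sol['x']).ravel()
    coef = z[l:] - z[:l]                        # alpha*_i - alpha_i
    # Eq. (7): b from two multipliers (largest alpha and alpha* as a simple stand-in)
    j, k = int(np.argmax(z[:l])), int(np.argmax(z[l:]))
    b = 0.5 * ((y[j] + y[k]) - coef @ (K[:, j] + K[:, k]))
    return coef, b

def predict(coef, b, X_train, X_new, sigma=1.0):
    """Eq. (6): f(x) = sum_i (alpha*_i - alpha_i) K(x_i, x) + b."""
    return rbf_kernel(X_new, X_train, sigma) @ coef + b

# example: fit 40 noisy samples of sinc(x)
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, (40, 1))
y = np.sinc(X).ravel() + rng.normal(0.0, 0.1, 40)
coef, b = train_gsvm(X, y)
y_hat = predict(coef, b, X, X)
```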
3. Particle swarm optimization with adaptive and normal Gaussian mutation
It is difficult to determine the optimal parameters of the SVM model. The cross-validation method traditionally used to seek optimal SVM parameters such as the penalty coefficient, controlling vector and kernel parameter is subject to empirical error. Evolutionary algorithms such as the genetic algorithm (Samanta et al., 2003) and particle swarm optimization (Kennedy and Eberhart, 1995) are now widely used for SVM parameter selection. Nevertheless, these standard evolutionary algorithms suffer from premature convergence and cannot guarantee the global optimum (Samanta et al., 2003; Wu et al., 2008b). To overcome this shortcoming, a new PSO with adaptive and normal Gaussian mutation operators, namely adaptive and normal Gaussian particle swarm optimization (ANPSO), is proposed and utilized to optimize the parameters of the g-SVM.
3.1. Standard particle swarm optimization
Similar to other evolutionary computation techniques (Samanta et al., 2003), PSO uses a set of particles representing potential solutions to the problem under consideration. The swarm consists of n particles. Each particle has a position X_i = (x_i1, x_i2, ..., x_ij, ..., x_im) and a velocity V_i = (v_i1, v_i2, ..., v_ij, ..., v_im), where i = 1, 2, ..., n and j = 1, 2, ..., m, and moves through an m-dimensional search space. According to the global variant of PSO, each particle moves towards its best previous position and towards the best particle g in the swarm. Denote the best previously visited position of the ith particle (the one giving the best fitness value) by P_i = (p_i1, p_i2, ..., p_ij, ..., p_im), and the best previously visited position of the swarm by pg = (pg_1, pg_2, ..., pg_j, ..., pg_m).
The change of position of each particle from one iteration to the next can be computed from the distance between the current position and its previous best position, and the distance between the current position and the best position of the swarm. The particle velocity and position are then updated by the following equations:

$$v_{ij}^{k+1} = w\, v_{ij}^{k} + c_1 r_1 (p_{ij} - x_{ij}^{k}) + c_2 r_2 (pg_j - x_{ij}^{k}) \tag{11}$$

$$x_{ij}^{k+1} = x_{ij}^{k} + v_{ij}^{k+1} \tag{12}$$
where w is called the inertia weight and is employed to control the impact of the previous history of velocities on the current one. Accordingly, the parameter w regulates the trade-off between the global and local exploration abilities of the swarm. A large inertia weight facilitates global exploration, while a small one tends to facilitate local exploration. A suitable value of the inertia weight w usually provides a balance between global and local exploration abilities, and consequently reduces the number of iterations required to locate the optimum solution. Here k denotes the iteration number, c_1 is the cognition learning factor, c_2 is the social learning factor, and r_1 and r_2 are random numbers uniformly distributed in the range [0, 1].
Thus, the particle flies through potential solutions towards P_i^k and pg^k in a navigated way, while still exploring new areas via the stochastic mechanism so as to escape from local optima. Since there is no intrinsic mechanism for controlling the velocity of a particle, it is necessary to impose a maximum value V_max on it. If the velocity exceeds this threshold, it is set equal to V_max; this limits the maximum travel distance at each iteration and prevents the particle from flying past good solutions. The PSO is terminated when a maximal number of generations is reached or when the best particle position cannot be improved further after a sufficiently large number of generations. PSO has shown its robustness and efficacy in solving function-value optimization problems in real-number spaces (Wu, 2009; Wu et al., 2008b; Zhao and Yin, 2009; Fei et al., 2009; Krohling and Coelho, 2006).
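To make Eqs. (11) and (12) concrete, here is a compact sketch of the standard PSO described above; it is a minimal illustration rather than the authors' implementation, and the minimization convention, the velocity clamp handling and the sphere test function are our assumptions.

```python
import numpy as np

def pso(fitness, dim, n_particles=30, iters=200, w=0.9, c1=2.0, c2=2.0,
        v_max=0.5, lo=-1.0, hi=1.0, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, (n_particles, dim))         # positions X_i
    v = rng.uniform(-v_max, v_max, (n_particles, dim))  # velocities V_i
    p = x.copy()                                        # personal bests P_i
    p_fit = np.apply_along_axis(fitness, 1, x)
    g = p[np.argmin(p_fit)].copy()                      # swarm best pg (minimization)
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + c1 * r1 * (p - x) + c2 * r2 * (g - x)  # Eq. (11)
        v = np.clip(v, -v_max, v_max)                      # impose V_max
        x = x + v                                          # Eq. (12)
        f = np.apply_along_axis(fitness, 1, x)
        better = f < p_fit
        p[better], p_fit[better] = x[better], f[better]
        g = p[np.argmin(p_fit)].copy()
    return g, p_fit.min()

# example: minimize the sphere function in 3 dimensions
best, val = pso(lambda z: float((z ** 2).sum()), dim=3)
```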
3.2. The PSO with adaptive and normal Gaussian mutation operators (ANPSO)
One of the major drawbacks of the standard PSO is its premature convergence, especially when handling problems with many local optima. Improved PSOs have been published to solve many real problems such as parameter identification (Cao and Wang, 2007) and constrained optimization problems (Zhao and Yin, 2009). Aiming at these drawbacks of the standard PSO, an adaptive mutation operator is proposed to regulate the inertia weight of the velocity by means of the fitness value of the objective function and the iteration variable. A normal Gaussian mutation operator (Zhao and Yin, 2009) is also employed to correct the direction of the particle velocity at the same time. Adaptive mutation is a highly efficient operator under real-number coding, and the quality of the solution is tied closely to the mutation operator. The aforementioned problem is addressed by incorporating adaptive mutation and normal Gaussian mutation into the previous velocity of the particle. Thus, the PSO with adaptive and normal Gaussian mutation operators (ANPSO) updates the particle velocity and position by the following equations:
$$v_{id}^{k+1} = (1 - \lambda)\, w_{id}^{k}\, v_{id}^{k} + \lambda N(0, \sigma_i^{k}) + c_1 r_1 (p_{id} - x_{id}^{k}) + c_2 r_2 (p_{gd} - x_{id}^{k}) \tag{13}$$

$$x_{id}^{k+1} = x_{id}^{k} + v_{id}^{k+1} \tag{14}$$

$$w_{id}^{k} = \left\{\beta\left[1 - f(x_i^{k})/f(x_m^{k})\right] + (1 - \beta)\right\} w_{id}^{0} \exp(-\alpha k^2) \tag{15}$$

$$\sigma_i^{k+1} = \sigma_i^{k} \exp\big(N_i(0, \Delta\sigma)\big) \tag{16}$$
where Δσ is the standard error of the normal Gaussian distribution, β is the adaptive coefficient, λ is the increment coefficient, and α is the attenuation coefficient controlling the particle velocity.

The first term of Eq. (13) denotes the adaptive mutation of the velocity inertia weight, based on the iteration variable k and the fitness value f(x_i^k). By Eq. (15), particles with larger fitness mutate within a smaller scope, while particles with smaller fitness mutate within a larger scope. The second term of Eq. (13) represents the normal Gaussian mutation based on the iteration variable k. At small iteration numbers the particles mutate over a large scope and search for local optima; at larger iteration numbers they mutate over a small scope, search for the optimum within a small space, and gradually approach the global optimum. The way the normal Gaussian mutation operator corrects the change of the particle velocity is expressed in Eqs. (13) and (16). In the normal Gaussian mutation strategy, the proposed velocity vector v^{k+1} = (v_1^{k+1}, v_2^{k+1}, ..., v_m^{k+1}) is composed of the last-generation velocity vector v^k = (v_1^k, v_2^k, ..., v_m^k) and the Gaussian perturbation vector σ^k = (σ_1^k, σ_2^k, ..., σ_m^k). As a controlling vector of the velocity vector, the Gaussian perturbation vector mutates itself by Eq. (16) at each iteration.

The adaptive and normal Gaussian mutation operators can restore the diversity loss of the population and improve the global searching capability of the proposed algorithm.
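The following sketch shows a single ANPSO update per Eqs. (13)–(16). Two reading assumptions of ours are built in: f(x_m^k) in Eq. (15) is taken as the best (largest) fitness in the current swarm, and the perturbation N(0, σ_i^k) is drawn independently per dimension.

```python
import numpy as np

def anpso_step(x, v, p, pg, sigma, k, f_vals, f_best, w0=0.9,
               lam=0.5, beta=0.8, alpha=2.0, c1=2.0, c2=2.0,
               d_sigma=0.5, rng=None):
    """One ANPSO iteration; x, v, p are (n, m), pg is (m,), sigma and f_vals are (n,)."""
    rng = rng or np.random.default_rng()
    n, m = x.shape
    # Eq. (15): adaptive inertia weight, decaying with the iteration number k
    # and with the particle's fitness ratio f(x_i^k) / f(x_m^k)
    w = (beta * (1.0 - f_vals / f_best) + (1.0 - beta)) * w0 * np.exp(-alpha * k ** 2)
    r1, r2 = rng.random((2, n, m))
    # Eq. (13): damped inertia term plus a Gaussian perturbation term
    v = ((1.0 - lam) * w[:, None] * v
         + lam * rng.normal(0.0, sigma[:, None], (n, m))
         + c1 * r1 * (p - x) + c2 * r2 * (pg - x))
    x = x + v                                              # Eq. (14)
    sigma = sigma * np.exp(rng.normal(0.0, d_sigma, n))    # Eq. (16)
    return x, v, sigma
```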
3.3. Parameter selection for the g-SVM based on ANPSO

To analyze the regression estimation performance of the g-SVM, two applications are studied in this paper. The forecasts are combined with the g-SVM in a hybrid model, shown in Fig. 2, that works as follows: ANPSO optimizes the parameters of the g-SVM; the obtained optimal parameters are input into the g-SVM, and the g-SVM then produces the forecasting results.

Many practical applications (Wu et al., 2008a, b) suggest that radial basis functions tend to perform well under general smoothness assumptions, so they should be considered especially when no additional knowledge of the data is available. In this paper the Gaussian radial basis function is used as the kernel function of the g-SVM.
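As a hedged sketch of the hybrid loop of Fig. 2, each ANPSO particle can encode a candidate (C, ν, σ) for the g-SVM, with its fitness being the forecasting error on a held-out set; train_gsvm and predict are the sketches from Section 2, and the validation split and search box below are illustrative assumptions.

```python
import numpy as np

def gsvm_particle_fitness(params, X_tr, y_tr, X_val, y_val):
    """Score one particle's (C, nu, sigma) by held-out forecasting error."""
    C, nu, sigma = params
    coef, b = train_gsvm(X_tr, y_tr, C=C, nu=nu, sigma=sigma)
    y_hat = predict(coef, b, X_tr, X_val, sigma=sigma)
    return float(np.mean((y_val - y_hat) ** 2))  # MSE as the fitness to minimize

# The ANPSO loop (anpso_step above) is then run over an assumed box such as
# C in [1, 1000], nu in (0, 1], sigma in [0.001, 10], stopping when the
# fitness reaches the accuracy limit of Section 4 (0.0002 on normalized
# samples) or at the iteration cap; the best particle is fed to the g-SVM.
```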
4. Applications
In many real applications, the observed input data cannot be measured precisely and are usually described in linguistic levels or ambiguous metrics. Traditional support vector regression (SVR) methods cannot cope with such qualitative information. It is well known that fuzzy logic is a powerful tool for dealing with fuzzy and uncertain data. All linguistic information about the influencing factors is processed by fuzzy comprehensive evaluation into numerical information. Suppose the number of variables is n, with n = n_1 + n_2, where n_1 and n_2 denote the numbers of fuzzy linguistic variables and crisp numerical variables, respectively. The linguistic variables are evaluated on several description levels, and a real number between 0 and 1 is assigned to each description level. Distinct numerical variables have different dimensions and should first be normalized. The following normalization is adopted:
normalization is adopted:
x
d
i

x
d
i
minx
d
i
j
l
i 1

maxx
d
i
j
l
i 1
minx
d
i
j
l
i 1

; d 1; 2; . . . ; n
2
17
where l is the number of samples, and x_i^d and x̄_i^d denote the original value and the normalized value, respectively. In fact, all the numerical variables in (1) through (16) are normalized values, although they are not marked with bars; a sketch of this pretreatment follows.
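The sketch below illustrates the data pretreatment: an assumed 0–1 mapping for the linguistic levels used in Table 6 (the exact level values are not given in the paper and are our assumptions), followed by the column-wise min–max normalization of Eq. (17) for the numerical variables.

```python
import numpy as np

# assumed numeric values for the linguistic description levels of Table 6
LEVELS = {'VL': 0.1, 'L': 0.3, 'M': 0.5, 'H': 0.7, 'VH': 0.9}

def normalize_numeric(X):
    """Eq. (17), applied column-wise; the zero-range guard is our addition."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return (X - lo) / span

# example: one Table 6 pattern (SC, MD, WGV are linguistic; CN, MS, FFN numeric)
ling = np.array([[LEVELS[a] for a in ('H', 'L', 'H')]])           # water bottle lid
num = normalize_numeric(np.array([[4.0, 0.56, 7.0], [1.0, 0.5, 3.0]]))
```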
The g-SVM and the proposed ANPSO algorithm have both been implemented in the Matlab 7.1 programming language. The following two experiments were run on a 1.80 GHz Core(TM) 2 CPU personal computer (PC) with 1.0 GB of memory under Microsoft Windows XP Professional. For the standard PSO, the inertia weight w_0 varies in (0, 1], and the positive acceleration constants (c_1, c_2) are generally set equal to 2. A larger inertia weight w_0 facilitates global exploration, while a smaller one tends to facilitate local exploration. Therefore, the initial inertia weight and positive acceleration constants in ANPSO are set as w_0 = 0.9 and c_1 = c_2 = 2. The standard error of the normal distribution (Δσ ∈ (0, 1]) is set to 0.5 by experimental confirmation. The adaptive coefficient (β ∈ (0, 1]) is set to 0.8. The Gaussian perturbation adjusts (corrects) the direction of the particle velocity through the increment coefficient (λ ∈ (0, 1]). The fitness accuracy for the normalized samples is 0.0002. The attenuation coefficient controlling the particle velocity (α > 1) is set to 2. To evaluate the forecasting capacity of the g-SVM, evaluation indexes such as the mean absolute error (MAE), mean absolute percentage error (MAPE) and mean square error (MSE) are used to assess the forecasting results of the g-SVM. The computational formulas of these indexes are shown in Table 1.
Example 1. The g-SVM model is applied to car sales forecasting.

In this experiment, car sales series are selected from past sales records of a typical company. The detailed characteristic data and sales series of these cars compose the corresponding training and testing sample sets. In forecasting the car sales series, six influencing factors, shown in Table 2, are taken into account: brand famous degree (BF), performance parameter (PP), form beauty (FB), sales experience (SE), oil price (OP) and dweller deposit (DD).
The optimal combination of parameters obtained by the ANPSO algorithm is C = 715.11, ν = 0.96 and σ = 0.01. Fig. 3 illustrates the forecasting results provided by the ANPSO g-SVM.

To analyze the forecasting capability of the proposed hybrid model (ANPSO g-SVM) based on ANPSO and the g-SVM, three further models (the autoregressive moving average (ARMA) model, the hybrid model (PSO ν-SVM) based on PSO and the standard ν-SVM, and the hybrid model (ANPSO ν-SVM) based on ANPSO and the standard ν-SVM) are
Fig. 2. The ANPSO optimizes the parameters of g-SVM: the input data undergo fuzzy pretreatment and normalization; ANPSO iterates, outputting the current combinational parameters and checking them against the accuracy limitation (N: continue iterating; Y: output the optimal combinational parameters); the optimal parameters are then passed to the g-SVM, which produces the forecasting results.
selected to handle the above car sales series. Their forecasting results are shown in Table 3.
The MAE, MAPE and MSE indexes are used to evaluate the forecasting capability of the four models, as shown in Table 4. The sample data of the latest 12 months are used as the testing set, and the corresponding forecasting results are used to analyze the forecasting performance of the above models. It is obvious that the forecasting accuracy provided by the support vector machines exceeds that of the autoregressive moving average (ARMA) model. For the same ν-SVM, the indexes (MAE, MAPE and MSE) provided by ANPSO are better than those of PSO. For the same ANPSO, the indexes provided by the ANPSO g-SVM with the Gaussian loss function are better than those of the ANPSO ν-SVM with the ε-insensitive loss function. Considering the nonlinear influence of the multidimensional series, it is found that the g-SVM can handle normal Gaussian noise in the input series effectively.
Example 2. The g-SVM model is applied to the estimation of product design time.

As global competition increases and product life cycles shorten, companies try to employ effective management to accelerate product development. However, product development projects often suffer schedule overruns, and in most cases the overruns are due to poor estimation. This is consistent with the saying "you can't control what you don't measure" (DeMarco, 1998). In the whole product development process (PDP), product design is an important phase, and the control and decision-making of product development are based on the pre-estimation of product design time (PDT). Nevertheless, PDP always involves brand-new or modified product designs, so the cycle time of the design process cannot be measured directly. Much attention has been focused on reducing the time/cost of product design (Griffin, 1997a, b; Bashir and Thomson, 2001; Seo et al., 2002), but little systematic research has been conducted on time estimation. Traditionally, approximate design time is determined empirically by designers within companies. With increasing market competition and product complexity, companies require more accurate and credible estimates.

To illustrate the time estimation method, the design of plastic injection molds is studied. An injection mold is a kind of single-piece-designed product, and the design process is usually driven by customer orders. Time factors with large influencing weights are gathered into a factor list, as shown in Table 5. The first three factors are expressed as linguistic information and the last three as numerical data.

In this experiment, 72 sets of molds with their corresponding design times are selected from past projects of a typical company. The detailed characteristic data and design times of these molds compose the corresponding patterns, as shown in Table 6. We train the g-SVM with 60 patterns, and the remaining ones are used for testing. The simulation environment of Example 2 is the same as that of Example 1.
To analyze the estimating capability of the ANPSO g-SVM, the models (ARMA, PSO ν-SVM and ANPSO ν-SVM) are also selected to
Table 1
The expressions of the selected error indexes.

Index  Appellation                      Formula                                 Requirement
MAE    Mean absolute error              (1/n) Σ_{t=1}^{n} |y_t − ȳ_t|           The smaller the better
MAPE   Mean absolute percentage error   (100/n) Σ_{t=1}^{n} |(y_t − ȳ_t)/y_t|   The smaller the better
MSE    Mean square error                (1/n) Σ_{t=1}^{n} (y_t − ȳ_t)²          The smaller the better
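For reference, the three indexes of Table 1 as plain functions, where y holds the observed values y_t and y_bar the forecasts ȳ_t:

```python
import numpy as np

def mae(y, y_bar):
    return float(np.mean(np.abs(y - y_bar)))

def mape(y, y_bar):
    return float(100.0 * np.mean(np.abs((y - y_bar) / y)))

def mse(y, y_bar):
    return float(np.mean((y - y_bar) ** 2))
```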
Table 2
Influencing factors of product sales forecasts.
Product characteristics Unit Expression Weight
Brand famous degree (BF) Dimensionless Linguistic information 0.9
Performance parameter (PP) Dimensionless Linguistic information 0.8
Form beauty (FB) Dimensionless Linguistic information 0.8
Sales experience (SE) Dimensionless Linguistic information 0.5
Oil price (OP) Dimensionless Linguistic information 0.8
Dweller deposit (DD) Dimensionless Numerical information 0.4
Fig. 3. The car sales forecasting results from the ANPSO g-SVM model.
Table 3
Comparison of forecasting results from four different models.

No.  Real value  ARMA  PSO ν-SVM  ANPSO ν-SVM  ANPSO g-SVM
1 1781 1500 1781 1769 1773
2 2561 1527 2489 2476 2499
3 258 1551 416 404 382
4 1135 1546 1171 1158 1136
5 2094 1536 2069 2056 2079
6 453 1562 554 541 520
7 1625 1555 1673 1686 1588
8 908 1583 1001 989 967
9 2047 1531 1957 1945 1967
10 580 1580 667 655 633
11 569 1617 696 684 662
12 914 1627 1008 994 972
Table 4
Error statistics of the four forecasting models.

Model        MAE     MAPE    MSE
ARMA         725.67  1.1695  655300
PSO ν-SVM    77.58   0.1299  78214
ANPSO ν-SVM  75.5    0.1195  70282
ANPSO g-SVM  54.75   0.0925  41726
handle the above series. Their estimating results are shown in
Table 7.
The MAE, MAPE and MSE indexes are also used to evaluate the estimating capability of the three models, as shown in Table 8. The sample data of the latest 12 molds are used as the testing set, and the corresponding estimating results are used to analyze the forecasting performance of the above models. The following conclusions can be drawn. For the same ν-SVM, the indexes (MAE, MAPE and MSE) provided by ANPSO are better than those of PSO. For the same ANPSO, the indexes provided by the ANPSO g-SVM with the Gaussian loss function are better than those of the ANPSO ν-SVM with the ε-insensitive loss function. Considering the nonlinear influence of the multidimensional series, it is obvious that the g-SVM can handle normal Gaussian noise in the input series effectively, and that ANPSO is a workable means for the g-SVM to seek its optimal parameters.

It is shown in Tables 3 and 4 of Example 1 and Tables 7 and 8 of Example 2 that the selection of the loss function is important for production sales forecasting. The optimal loss function in regression estimation is in fact related to the noise in the data (Vapnik, 2000). For normal additive noise, the Gaussian loss function is the best choice; for noise with a symmetric density, Huber's least-modulus function performs best (Cao and Wang, 2007). Considering the enterprise environment, some errors inevitably exist in the process of data gathering and estimation. Thus, the above forecasting (estimating) results are satisfactory. The application results indicate that the forecasting (estimating) method based on the g-SVM is feasible and effective.
5. Conclusion
In this paper, a new version of the SVM, named g-SVM, is proposed to handle white noise in the input data by integrating the Gaussian loss function into the ν-SVM. The performance of the g-SVM is evaluated on the above two examples, and the simulation results demonstrate that the g-SVM is effective in dealing with white noise in uncertain data and finite samples. Moreover, the parameter-choosing algorithm (ANPSO) presented here is shown to be a workable means for the g-SVM to seek its optimal parameters.

Compared to ARMA, the g-SVM has other attractive properties, such as strong learning capability in cases with small quantities of data, good generalization performance, insensitivity to noise and outliers (Cao and Wang, 2007), and steerable approximation parameters. Compared to the ν-SVM, the g-SVM can penalize the white noise in the data effectively.

In the two experiments, fixed adaptive coefficients (β, λ), a fixed control parameter Δσ of the normal mutation, and a fixed attenuation parameter α of the velocity control are adopted. How to choose an appropriate combination of these coefficients is not explored deeply in this paper; the study of the velocity changes under different settings of the above-mentioned parameters is a meaningful problem for future research.
Acknowledgements
This work is supported by the National Natural Science Foundation of China under Grant 60904043, the China Postdoctoral Science Foundation (20090451152), the Jiangsu Planned Projects for Postdoctoral Research Funds (0901023C) and the Shanghai Education Development Foundation Chenguang Project (2008CG55). We thank the editor-in-chief and three reviewers for their helpful comments, which greatly improved the article.
Table 5
Time factors of product characteristics for injection molds.
Product characteristics Unit Expression Weight
Structure complexity (SC) Dimensionless Linguistic information 0.9
Model difculty (MD) Dimensionless Linguistic information 0.7
Wainscot gauge variation (WGV) Dimensionless Linguistic information 0.7
Cavity number (CN) Dimensionless Numerical information 0.8
Mold size (height/diameter) (MS) Dimensionless Numerical information 0.55
Form feature number (FFN) Dimensionless Numerical information 0.55
Table 6
Learning and testing data.

No.  Name                 SC  MD  WGV  CN  MS    FFN  Desired output (h)
1    Global handle        L   L   L    4   3.1   3    23
2    Water bottle lid     H   L   H    4   0.56  7    45.5
3    Medicine lid         H   M   VL   4   1.5   6    37
4    Footbath basin       VL  VL  VL   1   0.5   3    10
5    Litter basket        L   M   H    1   2.1   12   42.5
6    Plastic silk flower  L   M   M    1   7.1   4    29.5
7    Dining chair         M   H   L    1   0.5   15   48
8    Spindling bushing    H   VL  L    2   8.07  2    30
9    Three-way pipe       H   L   L    1   0.45  5    24.5
10   Hydrant shell        VH  H   M    1   0.3   7    49
...
71   Paper-lead pulley    L   M   H    10  5     10   59
72   Winding tray         M   M   VH   12  7.9   2    69
Table 7
Comparison of estimating results from three different models.

No.  Real value  PSO ν-SVM  ANPSO ν-SVM  ANPSO g-SVM
1    69          70.41      70.32        70.02
2    59          59.79      59.59        59.17
3    61.5        62.79      62.53        62.43
4    55.5        57.82      57.37        55.86
5    50.5        51.71      52.10        52.26
6    28.5        31.74      30.93        30.27
7    64.5        65.85      65.39        65.31
8    70          71.31      71.33        71.02
9    39          38.98      39.34        39.60
10   42          43.89      43.55        43.03
11   67          68.48      68.29        68.02
12   95.5        94.26      94.88        95.08
Table 8
Error statistics of the three estimating models.

Model        MAE     MAPE    MSE
PSO ν-SVM    1.4625  0.0295  2.7050
ANPSO ν-SVM  1.2383  0.0251  1.8534
ANPSO g-SVM  0.9092  0.0186  1.0524
Appendix A. The proof of Theorem 1
Proof. Let H = ((x_i · x_j))_{l×l}, e = (1, ..., 1)^T, α = (α_1, ..., α_l)^T, α* = (α_1^*, ..., α_l^*)^T and y = (y_1, ..., y_l)^T. If α^(*) = (α_1, α_1^*, ..., α_l, α_l^*) is an arbitrary solution of the dual problem (5), then, in light of the Karush–Kuhn–Tucker (KKT) conditions, there exist multipliers b, s^(*), ε and n^(*) satisfying the following formulations:

$$H(\alpha^* - \alpha) - y + b e + \varepsilon e - s^* + n^* = 0 \tag{18}$$

$$-H(\alpha^* - \alpha) + y - b e + \varepsilon e - s + n = 0 \tag{19}$$

$$s^{(*)} \ge 0,\quad n^{(*)} \ge 0,\quad \varepsilon \ge 0 \tag{20}$$

$$n^{(*)T}\left(\alpha^{(*)} - \frac{C}{l}e\right) = 0,\quad s^{(*)T}\alpha^{(*)} = 0,\quad \varepsilon\big(e^T(\alpha + \alpha^*) - C\nu\big) = 0 \tag{21}$$

Since s^(*) ≥ 0, the following inequalities can be obtained from Eqs. (18)–(21):

$$H(\alpha^* - \alpha) - y + b e \ge -(\varepsilon e + n^*) \tag{22}$$

$$-H(\alpha^* - \alpha) + y - b e \ge -(\varepsilon e + n) \tag{23}$$

$$n^{(*)} \ge 0,\quad \varepsilon \ge 0 \tag{24}$$

Let w = Σ_{i=1}^{l}(α_i^* − α_i)x_i and identify ξ_i = n_i, ξ_i^* = n_i^*. Noting that the ith component of H(α* − α) equals (w · x_i), Eqs. (25)–(27) are equipollent to Eqs. (22)–(24):

$$(w \cdot x_i) + b - y_i \le \varepsilon + \xi_i \tag{25}$$

$$y_i - (w \cdot x_i) - b \le \varepsilon + \xi_i^* \tag{26}$$

$$\xi^{(*)} \ge 0,\quad \varepsilon \ge 0 \tag{27}$$

Hence, (w, b, ξ^(*), ε) satisfies the constraint conditions of problem (4).

Substituting Eqs. (18) and (19) (multiplied by α*^T and α^T, respectively) into the objective of the original problem (4) and applying the complementarity conditions (21), the objective value at (w, b, ξ^(*), ε) can be written as

$$\frac{1}{2}\|w\|^2 + C\left(\nu\varepsilon + \frac{1}{2l}\sum_{i=1}^{l}(\xi_i^2 + \xi_i^{*2})\right) = -\frac{1}{2}(\alpha^* - \alpha)^T H(\alpha^* - \alpha) + y^T(\alpha^* - \alpha) - \frac{l}{2C}\left(\|\alpha\|^2 + \|\alpha^*\|^2\right) \tag{28}$$

The right-hand side of Eq. (28) is exactly the objective function of the dual problem (5); that is, the objective value of the original problem (4) equals that of the dual. According to the Wolfe duality theorem, (w, b, ξ^(*), ε) is therefore a solution of the original problem (4), where w = Σ_{i=1}^{l}(α_i^* − α_i)x_i. This completes the proof of Theorem 1. □
References
Acır, N., Özdamar, Ö., Güzeliş, C., 2006. Automatic classification of auditory brainstem responses using SVM-based feature selection algorithm for threshold detection. Engineering Applications of Artificial Intelligence 19 (2), 209–218.
Bao, Y.K., Liu, Z.T., Guo, L., Wang, W., 2005. Forecasting stock composite index by fuzzy support vector machines regression. In: Proceedings of the 2005 International Conference on Machine Learning and Cybernetics, vol. 6, pp. 3535–3540.
Bashir, H.A., Thomson, V., 2001. Models for estimating design effort and time. Design Studies 22 (2), 141–155.
Bergeron, C., Cheriet, F., Ronsky, J., Zernicke, R., Labelle, H., 2005. Prediction of anterior scoliotic spinal curve from trunk surface using support vector regression. Engineering Applications of Artificial Intelligence 18 (8), 973–983.
Cao, L.J., Wang, X.M., 2007. Support Vector Machine Based Methods for Financial and Engineering Problems. Shanghai University of Finance and Economics Press.
Colliez, J., Dufrenois, F., Hamad, D., 2006. Optic flow estimation by support vector regression. Engineering Applications of Artificial Intelligence 19 (7), 761–768.
DeMarco, T., 1998. Controlling Software Projects: Management, Measurement, and Estimation. Yourdon Press, Englewood Cliffs, NJ.
Fei, S.W., Wang, M.J., Miao, Y.B., et al., 2009. Particle swarm optimization-based support vector machine for forecasting dissolved gases content in power transformer oil. Energy Conversion and Management 50 (6), 1604–1609.
Frias-Martinez, E., Sanchez, A., Velez, J., 2006. Support vector machines versus multi-layer perceptrons for efficient off-line signature recognition. Engineering Applications of Artificial Intelligence 19 (6), 693–704.
Goel, A., Pal, M., 2009. Application of support vector machines in scour prediction on grade-control structures. Engineering Applications of Artificial Intelligence 22 (2), 216–223.
Griffin, A., 1997a. Modeling and measuring product development cycle time across industries. Journal of Engineering and Technology Management 14 (1), 1–24.
Griffin, A., 1997b. The effect of project and process characteristics on product development cycle time. Journal of Marketing Research 34 (1), 24–35.
Hao, P.Y., 2008. Fuzzy one-class support vector machines. Fuzzy Sets and Systems 159 (18), 2317–2336.
Huang, C.J., Lai, W.K., Luo, R.L., Yan, Y.L., 2005. Application of support vector machines to bandwidth reservation in sectored cellular communications. Engineering Applications of Artificial Intelligence 18 (5), 585–594.
Kennedy, J., Eberhart, R., 1995. Particle swarm optimization. In: Proceedings of the IEEE International Conference on Neural Networks, pp. 1942–1948.
Krohling, R.A., Coelho, L.S., 2006. Coevolutionary particle swarm optimization using Gaussian distribution for solving constrained optimization problems. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 36 (6), 1407–1416.
Liu, Y.H., Chen, Y.T., 2007. Face recognition using total margin-based adaptive fuzzy support vector machines. IEEE Transactions on Neural Networks 18 (1), 178–192.
Lute, V., Upadhyay, A., Singh, K.K., 2009. Support vector machine based aerodynamic analysis of cable stayed bridges. Advances in Engineering Software 40 (9), 830–835.
Mohammadi, M., Gharehpetian, G.B., 2009. On-line voltage security assessment of power systems using core vector machines. Engineering Applications of Artificial Intelligence 22 (4–5), 695–701.
Osowski, S., Garanty, K., 2007. Forecasting of the daily meteorological pollution using wavelets and support vector machine. Engineering Applications of Artificial Intelligence 20 (6), 745–755.
Samanta, B., Al-Balushi, K.R., Al-Araimi, S.A., 2003. Artificial neural networks and support vector machines with genetic algorithm for bearing fault detection. Engineering Applications of Artificial Intelligence 16 (7–8), 657–665.
Schölkopf, B., Smola, A.J., Williamson, R.C., et al., 2000. New support vector algorithms. Neural Computation 12 (5), 1207–1245.
Seo, K.K., Park, J.H., Jang, D.S., Wallace, D., 2002. Approximate estimation of the product life cycle cost using artificial neural networks in conceptual design. International Journal of Advanced Manufacturing Technology 19 (6), 461–471.
Sun, Z.H., Sun, Y.X., 2003. Fuzzy support vector machine for regression estimation. In: IEEE International Conference on Systems, Man and Cybernetics, vol. 4, pp. 3336–3341.
Tao, Q., Wu, G.W., Wang, F.Y., 2005. Posterior probability support vector machines for unbalanced data. IEEE Transactions on Neural Networks 16 (6), 1561–1573.
Übeyli, E.D., 2008. Support vector machines for detection of electrocardiographic changes in partial epileptic patients. Engineering Applications of Artificial Intelligence 21 (8), 1196–1203.
Vapnik, V., 1995. The Nature of Statistical Learning Theory. Springer, New York.
Vapnik, V., 2000. The Nature of Statistical Learning Theory. Springer, New York.
Vong, C.M., Wong, P.K., Li, Y.P., 2006. Prediction of automotive engine power and torque using least squares support vector machines and Bayesian inference. Engineering Applications of Artificial Intelligence 19 (3), 277–287.
Wang, Y.Q., Wang, S.Y., Lai, K.K., 2005. A new fuzzy support vector machine to evaluate credit risk. IEEE Transactions on Fuzzy Systems 13 (6), 820–831.
Wu, Q., 2009. The forecasting model based on wavelet ν-support vector machine. Expert Systems with Applications 36 (4), 7604–7610.
Wu, Q., Yan, H.S., Yang, H.B., 2008a. A hybrid forecasting model based on chaotic mapping and improved support vector machine. In: Proceedings of the 9th International Conference for Young Computer Scientists, pp. 2701–2706.
Wu, Q., Yan, H.S., Yang, H.B., 2008b. A forecasting model based on support vector machine and particle swarm optimization. In: Proceedings of the 2008 Workshop on Power Electronics and Intelligent Transportation System, pp. 218–222.
Yang, S.Y., Huang, Q., Li, L.L., et al., 2009. An integrated scheme for feature selection and parameter setting in the support vector machine modeling and its application to the prediction of pharmacokinetic properties of drugs. Artificial Intelligence in Medicine 46 (2), 155–163.
Zhao, H.B., Yin, S., 2009. Geomechanical parameters identification by particle swarm optimization and support vector machine. Applied Mathematical Modelling 33 (10), 3997–4012.
