
STRUCTURAL CONTROL AND HEALTH MONITORING

Struct. Control Health Monit. 2016; 23:252–266


Published online 29 June 2015 in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/stc.1767

Performance improvement method of support vector machine-based model monitoring dam safety

Huaizhi Su1,2,*, Zhexin Chen3 and Zhiping Wen4


1 State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering, Hohai University, Nanjing 210098, China
2 College of Water Conservancy and Hydropower Engineering, Hohai University, Nanjing 210098, China
3 National Engineering Research Center of Water Resources Efficient Utilization and Engineering Safety, Nanjing 210098, China
4 Department of Computer Engineering, Nanjing Institute of Technology, Nanjing 211167, China

SUMMARY

Under the combined influence of material properties and loads, dam structural behavior exhibits time-varying nonlinear characteristics. To forecast dam structural behavior (displacement, stress, seepage, etc.), models monitoring dam safety are often built from prototype observations. However, the modeling process is usually carried out in an offline, static pattern, so the fitting and forecasting ability of a static model declines gradually as time goes on. This article focuses on the support vector machine (SVM)-based model monitoring dam safety and studies methods to improve the adaptability of the SVM model and to reduce the modeling time. Based on an impact analysis of the SVM parameters and input vector, a method for optimizing the SVM parameters and input vector is presented to improve the efficiency of building the SVM-based static model monitoring dam safety. To describe dynamically the time-varying mapping relationship between dam structural behavior (effect-quantity) and its cause (influence-quantity), a method is developed to update the above model in real time by making full use of new observations. The displacement of an actual dam is taken as an example to verify the modeling efficiency and forecasting ability. Copyright 2015 John Wiley & Sons, Ltd.

Received 13 November 2014; Revised 20 March 2015; Accepted 4 June 2015

KEY WORDS: dam safety; monitoring model; support vector machine; modeling efficiency enhancement; adaptability advancement

1. INTRODUCTION
Based on prototype observations of dam structural behavior, such as displacement, stress, strain, and seepage, methods from statistics, mechanics, and information science are often adopted to build the model monitoring dam safety. As a mathematical expression of the nonlinear mapping relationship between dam structural behavior (effect-quantity) and its cause (influence-quantity), the model monitoring dam safety can be used to evaluate and forecast the time-varying structural status during dam service [1–4]. In recent years, methods of signal processing and artificial intelligence have been widely applied to dam safety modeling, which has promoted the development of monitoring models of dam safety [5–8].
In fact, the construction of a monitoring model is equivalent to a machine learning problem. The artificial neural network (ANN) and the support vector machine (SVM) are usually chosen as the learning machine [9–13]. The dam effect-quantity and its influence-quantity are regarded as the output vector and input vector of the learning machine, and the observed data constitute the training sample set. The training operation yields a learning machine that is essentially an implicit expression of the monitoring model of dam safety.

*Correspondence to: Huaizhi Su, State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering, Hohai University, Nanjing 210098, China.
E-mail: su_huaizhi@hhu.edu.cn
Because it follows the structural risk minimization criterion, SVM has good generalization ability and can avoid overfitting. The nonlinear mapping from a low-dimensional space to a high-dimensional space is implemented with the kernel technique, which effectively overcomes the curse of dimensionality. The global optimal solution can be obtained by solving a convex quadratic programming problem. SVM has been used to build models monitoring dam seepage and displacement [13–15]. Meanwhile, methods such as the genetic algorithm (GA), the particle swarm optimization (PSO) algorithm, rough sets, the sequential minimal optimization algorithm, and principal component analysis have been adopted to improve the fitting ability of the SVM model monitoring dam safety [16–20].
In practice, the input dimension of SVM is still limited. Provided that the forecasting precision is ensured, choosing proper sample features can reduce the input space dimension, decrease the computational complexity, and save training time. To improve the modeling efficiency of SVM, the SVM input vector therefore needs to be optimized. In addition, the modeling process is usually carried out in an offline, static pattern: the training sample set is fixed, and all training samples are sent to the SVM at once. For an actual dam, under the combined influence of material properties and loads, the structural behavior exhibits typical nonlinear characteristics, and the mapping relationship between the dam effect-quantity and its influence-quantity is time-varying. The forecasting precision of a static model will therefore decrease gradually as time goes on. To describe this time-varying mapping relationship adaptively, new observations should be used as fully as possible to train the SVM dynamically, and the model monitoring dam safety needs to be updated in real time.
This paper focuses on improving the adaptability of the SVM-based model monitoring dam safety and reducing the modeling time. It is organized as follows. In Section 2, the statistical learning principle of SVM is introduced, the method for optimizing the SVM parameters and input vector is presented, and the SVM-based static model monitoring dam safety is built. Section 3 presents the method for updating the model monitoring dam safety in real time according to new observations. In Section 4, the displacement of an actual dam is analyzed with the proposed method, and two models monitoring dam displacement, namely the static model and the real time updated model, are built. Section 5 summarizes the main conclusions reached in this work.

2. SVM-BASED STATIC MODEL MONITORING DAM SAFETY


SVM is a learning machine based on statistical learning theory. Because of its global optimization and good generalization ability, SVM is widely applied to statistical classification and regression analysis under nonlinear and high-dimensional conditions [21,22].

2.1. Statistical learning principle of SVM


The following linear regression problem is discussed first. Based on a given sample set $(x_i, y_i)$ $(i = 1, 2, \ldots, n)$, where $x_i \in R^d$ is the input vector, whose components are called the input feature factors, and $y_i \in R$ represents the output, a linear function $f(x) = \omega \cdot x + b$ needs to be determined so that the output $y$ corresponding to a new input $x$ can be calculated by $y = f(x)$.
To ensure the robustness and sparseness of the SVM decision function, the following $\varepsilon$-insensitive loss function is adopted.

$$c(x, y, f(x)) = |y - f(x)|_{\varepsilon} = \max\{0,\ |y - f(x)| - \varepsilon\} \tag{1}$$

For a given $\varepsilon > 0$, the hyperplane $f(x) = \omega \cdot x + b$ is translated by $\varepsilon$ up and down along the $y$ axis. The swept region is called the $\varepsilon$-band of the hyperplane. If a sample point falls into the $\varepsilon$-band, namely the difference between the forecasted value $f(x)$ and the actual value $y$ of the sample point is less than $\varepsilon$, the forecasted value $f(x)$ is considered to cause no loss. If a sample point falls outside the $\varepsilon$-band, the loss $|y - f(x)| - \varepsilon$ is incurred.
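As a small illustration of Equation (1), the loss can be sketched in a few lines of Python. The function name and the numerical values below are illustrative assumptions and are not taken from the paper's data.

```python
def eps_insensitive_loss(y, f_x, eps):
    """Equation (1): zero inside the eps-band, linear in the excess outside it."""
    return max(0.0, abs(y - f_x) - eps)


# Illustrative values only (not dam observations).
inside = eps_insensitive_loss(y=1.00, f_x=1.005, eps=0.01)   # point inside the band
outside = eps_insensitive_loss(y=1.00, f_x=1.05, eps=0.01)   # point outside the band
```

A point inside the band contributes no loss at all, which is what makes the resulting decision function sparse.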
If all given sample points fall into the $\varepsilon$-band of the hyperplane $f(x) = \omega \cdot x + b$, namely the following condition is satisfied,

$$-\varepsilon \le y_i - (\omega \cdot x_i + b) \le \varepsilon, \quad i = 1, 2, \ldots, n \tag{2}$$

then the hyperplane $f(x) = \omega \cdot x + b$ is called the hard $\varepsilon$-band hyperplane of the given sample set, as shown in Figure 1. For a given sample set, if the hard $\varepsilon$-band hyperplane exists and $\varepsilon$ is small, it is reasonable to regard the hard $\varepsilon$-band hyperplane as the solution of the linear regression problem.
Based on a given sample set, two point sets, namely $D^+$ and $D^-$, are constructed by increasing or decreasing the $y$ value of each sample point by $\varepsilon$. Finding the hard $\varepsilon$-band hyperplane is equivalent to finding a hyperplane separating $D^+$ and $D^-$. The regression problem can be converted as follows.

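$$
\begin{aligned}
\min_{\omega,\, b}\quad & \frac{1}{2}\|\omega\|^2 \\
\text{s.t.}\quad & (\omega \cdot x_i + b) - y_i \le \varepsilon, \quad i = 1, 2, \ldots, n \\
& y_i - (\omega \cdot x_i + b) \le \varepsilon, \quad i = 1, 2, \ldots, n
\end{aligned} \tag{3}
$$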
Because a fitting error is allowed, slack variables $\xi_i, \xi_i^*$ $(i = 1, 2, \ldots, n)$ and a penalty factor $C$ are introduced. By the mapping $\Phi: x \rightarrow \Phi(x)$, the data are mapped from the low-dimensional input space to a high-dimensional feature space, and a nonlinear regression problem is transformed into a linear problem in the high-dimensional space. The original optimization problem of SVM regression can be described as

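$$
\begin{aligned}
\min_{\omega,\, b,\, \xi,\, \xi^*}\quad & \frac{1}{2}\|\omega\|^2 + C\sum_{i=1}^{n}\xi_i + C\sum_{i=1}^{n}\xi_i^* \\
\text{s.t.}\quad & (\omega \cdot \Phi(x_i) + b) - y_i \le \varepsilon + \xi_i, \\
& y_i - (\omega \cdot \Phi(x_i) + b) \le \varepsilon + \xi_i^*, \\
& \xi_i,\ \xi_i^* \ge 0, \quad i = 1, 2, \ldots, n
\end{aligned} \tag{4}
$$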

The above problem can be converted into the following dual problem by introducing the kernel function $K(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j)$.

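$$
\begin{aligned}
\min_{\alpha,\, \alpha^*}\quad & \frac{1}{2}\sum_{i=1,\, j=1}^{n}(\alpha_i^* - \alpha_i)(\alpha_j^* - \alpha_j)K(x_i, x_j) + \varepsilon\sum_{i=1}^{n}(\alpha_i^* + \alpha_i) - \sum_{i=1}^{n} y_i(\alpha_i^* - \alpha_i) \\
\text{s.t.}\quad & \sum_{i=1}^{n}(\alpha_i - \alpha_i^*) = 0, \quad 0 \le \alpha_i,\ \alpha_i^* \le C, \quad i = 1, 2, \ldots, n
\end{aligned} \tag{5}
$$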

Figure 1. Hard ε-band hyperplane.

The optimal solutions $\hat{\alpha}$, $\hat{\alpha}^*$ can be obtained by solving the above optimization problem with the Lagrange multipliers $(\alpha, \alpha^*)$ as the variables, and the solution of the original problem in Equation (4) can then be obtained. The following decision function can be given.

$$f(x) = \sum_{i=1}^{n}(\hat{\alpha}_i^* - \hat{\alpha}_i)K(x_i, x) + \hat{b} \tag{6}$$

In any solution pair $(\hat{\alpha}_i, \hat{\alpha}_i^*)$ of Equation (5), at most one component is nonzero. For all sample points $(x_i, y_i)$ inside the $\varepsilon$-band, $\hat{\alpha}_i = \hat{\alpha}_i^* = 0$, which means that these points make no contribution to the decision function. Only sample points $(x_i, y_i)$ with $\hat{\alpha}_i \ne 0$ or $\hat{\alpha}_i^* \ne 0$ affect the decision function; they are called support vectors. Therefore, the decision function obtained by SVM regression can be described as

$$f(x) = \sum_{i=1}^{s}(\hat{\alpha}_i^* - \hat{\alpha}_i)K(x_i, x) + \hat{b} \tag{7}$$

where s represents the support vector number. Figure 2 shows the structure of SVM implementing
regression analysis.

2.2. SVM parameter determination


When SVM is used to solve a nonlinear regression problem, choosing an appropriate kernel function in place of the inner product operation in the high-dimensional feature space prevents the computational complexity from increasing. The kernel function should not only satisfy the Mercer condition but also reflect the distribution characteristics of the training data. As a universal kernel function, the radial basis function (RBF) can be applied to samples of any distribution by reasonably selecting the parameter $\gamma$. In this paper, the RBF in Equation (8) is adopted. The optimization problem solved during SVM regression then becomes the convex quadratic programming problem in Equation (9).

$$K(x_i, x) = \exp\left(-\gamma\|x_i - x\|^2\right) \tag{8}$$

$$
\begin{aligned}
\min_{\alpha,\, \alpha^*}\quad & \frac{1}{2}\sum_{i=1,\, j=1}^{n}(\alpha_i^* - \alpha_i)(\alpha_j^* - \alpha_j)\exp\left(-\gamma\|x_i - x_j\|^2\right) + \varepsilon\sum_{i=1}^{n}(\alpha_i^* + \alpha_i) - \sum_{i=1}^{n} y_i(\alpha_i^* - \alpha_i) \\
\text{s.t.}\quad & \sum_{i=1}^{n}(\alpha_i - \alpha_i^*) = 0, \quad 0 \le \alpha_i,\ \alpha_i^* \le C, \quad i = 1, 2, \ldots, n
\end{aligned} \tag{9}
$$

Three parameters therefore need to be determined during SVM regression: the insensitive loss function parameter $\varepsilon$, the penalty factor $C$, and the RBF parameter $\gamma$. Different values of these parameters lead to very different generalization capability of the SVM regression model.
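To make the roles of the RBF parameter and of the support vectors concrete, the kernel of Equation (8) and the decision function of Equation (7) can be sketched as follows. This is a minimal sketch, not the authors' implementation: the support vectors, the coefficients (standing for the differences $\hat{\alpha}_i^* - \hat{\alpha}_i$), the bias, and the RBF parameter value are hypothetical numbers chosen only for illustration.

```python
import math


def rbf_kernel(xi, x, gamma):
    """Equation (8): K(xi, x) = exp(-gamma * ||xi - x||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, x))
    return math.exp(-gamma * sq_dist)


def decision_function(x, support_vectors, coeffs, b, gamma):
    """Equation (7): f(x) = sum_i coeffs[i] * K(x_i, x) + b.

    coeffs[i] stands for (alpha_i^* - alpha_i) of the i-th support vector.
    """
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, coeffs)) + b


# Hypothetical trained quantities, for illustration only.
svs = [[0.0, 0.0], [1.0, 1.0]]
coeffs = [0.5, -0.25]
f0 = decision_function([0.0, 0.0], svs, coeffs, b=0.1, gamma=0.5)
```

Only the support vectors enter the sum, so the cost of one forecast grows with the support vector number rather than with the size of the training set.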
The insensitive loss function parameter $\varepsilon$ controls the fitting error of SVM regression and affects the support vector number and the generalization ability. If $\varepsilon$ is too large, the precision cannot reach the expected standard and under-learning is caused. If $\varepsilon$ is too small, the fitting precision is overemphasized, which leads to over-learning. The insensitive loss function parameter $\varepsilon$ is generally chosen in the range 0.0001–0.1.

Figure 2. SVM structure.


The penalty factor $C$ adjusts the balance between the confidence range and the empirical error by controlling the punishment for errors. To avoid over-learning, a small $C$ is usually adopted provided that the expected precision is satisfied. The RBF parameter $\gamma$ affects the generalization ability of SVM by implicitly changing the mapping function. $C$ and $\gamma$ are usually determined simultaneously with a multi-parameter optimization algorithm. The conventional optimization algorithms, such as the grid search method, genetic algorithm, and particle swarm optimization, mostly follow the cross validation (CV) pattern [16–18,23–26]. The samples selected to build the SVM model are divided into two parts, namely the training set and the verifying set. The training set is used to train the SVM, and the verifying set is used to test it. The forecasting precision (cross validation mean square error, CVMSE) is taken as the indicator assessing the performance of the SVM model. The grid search method is adopted in this paper. The value ranges of $C$ and $\gamma$ are given, and a certain step length is chosen to generate the grid. The values of $C$ and $\gamma$ traverse the grid nodes, and for every pair of $C$ and $\gamma$ the above cross validation operation is implemented. The pair of $C$ and $\gamma$ that corresponds to high forecasting precision (namely small CVMSE) and small $C$ is regarded as the optimal parameter pair.
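The selection rule can be sketched as follows. This is one possible reading of the rule, under stated assumptions: `cv_mse(C, gamma)` is a hypothetical callback standing in for a real cross validation routine, the grids are arbitrary powers of two, and the tolerance used to decide that two CVMSE values are practically equal is an assumed parameter. Among all grid nodes whose CVMSE lies within the tolerance of the best one, the node with the smallest C is returned.

```python
import itertools
import math


def grid_search(cv_mse, C_grid, gamma_grid, tol=1e-4):
    """Return (C, gamma, cvmse) with small CVMSE and, among near-ties, small C."""
    scored = [(cv_mse(C, g), C, g)
              for C, g in itertools.product(C_grid, gamma_grid)]
    best_mse = min(score for score, _, _ in scored)
    # Keep every grid node whose CVMSE is within `tol` of the best value.
    near_best = [t for t in scored if t[0] - best_mse <= tol]
    # Smallest C first; equal C values are ordered by the lower CVMSE.
    mse, C, gamma = min(near_best, key=lambda t: (t[1], t[0]))
    return C, gamma, mse


def toy_cv_mse(C, gamma):
    """Toy stand-in for a real CVMSE surface (hypothetical, not dam data)."""
    return 0.002 + 0.001 * (math.log2(gamma) + 2) ** 2 + 0.00001 / C


C_grid = [2.0 ** k for k in range(-2, 5)]
gamma_grid = [2.0 ** k for k in range(-5, 3)]
best_C, best_gamma, best_mse = grid_search(toy_cv_mse, C_grid, gamma_grid)
```

On this toy surface the CVMSE barely depends on C, so the rule settles on the smallest C in the grid, which mirrors the preference for a small penalty factor stated above.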

2.3. Input feature optimization of SVM


To reduce the input space dimension of SVM while ensuring the prediction accuracy, a sensitivity analysis of the SVM output with respect to the SVM input is implemented to optimize the input features. The sensitivity of the output to an input feature can be expressed as the partial derivative of the SVM output with respect to that input feature. The sensitivity function of the $l$th feature factor $x_l$ in the SVM input $x$ on the SVM output $f(x)$ is defined as follows.

$$S(x_l) = \frac{1}{N}\sum_{k=1}^{N}\left|\frac{\partial f(x_k)}{\partial x_l}\right| \tag{10}$$

where $N$ represents the sample number of the training set; the partial derivative $\partial f(x_k)/\partial x_l$ is calculated as follows.

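$$
\begin{aligned}
\frac{\partial f(x)}{\partial x_l}
&= \frac{\partial}{\partial x_l}\left(\sum_{i=1}^{s}(\hat{\alpha}_i^* - \hat{\alpha}_i)K(x_i, x) + \hat{b}\right) \\
&= \frac{\partial}{\partial x_l}\left(\sum_{i=1}^{s}(\hat{\alpha}_i^* - \hat{\alpha}_i)\exp\left(-\gamma\sum_{j=1}^{d}(x_{ij} - x_j)^2\right) + \hat{b}\right) \\
&= 2\gamma\sum_{i=1}^{s}(\hat{\alpha}_i^* - \hat{\alpha}_i)\exp\left(-\gamma\sum_{j=1}^{d}(x_{ij} - x_j)^2\right)(x_{il} - x_l) \\
&= 2\gamma\sum_{i=1}^{s}(\hat{\alpha}_i^* - \hat{\alpha}_i)K(x_i, x)(x_{il} - x_l)
\end{aligned} \tag{11}
$$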

where s represents the support vector number.
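Equations (10) and (11) translate directly into code. The sketch below assumes that Equation (10) averages the absolute value of the partial derivative over the training samples; the support vectors, the coefficients (standing for $\hat{\alpha}_i^* - \hat{\alpha}_i$), the RBF parameter, and the sample points are hypothetical values used only to exercise the formulas.

```python
import math


def rbf_kernel(xi, x, gamma):
    """Equation (8): K(xi, x) = exp(-gamma * ||xi - x||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(xi, x)))


def partial_derivative(x, l, support_vectors, coeffs, gamma):
    """Equation (11): 2*gamma * sum_i coeffs[i] * K(x_i, x) * (x_il - x_l)."""
    return 2.0 * gamma * sum(
        c * rbf_kernel(sv, x, gamma) * (sv[l] - x[l])
        for sv, c in zip(support_vectors, coeffs))


def sensitivity(samples, l, support_vectors, coeffs, gamma):
    """Equation (10), read here as the mean absolute partial derivative."""
    total = sum(abs(partial_derivative(x, l, support_vectors, coeffs, gamma))
                for x in samples)
    return total / len(samples)


# Hypothetical support vectors, coefficients and training inputs.
svs = [[0.2, 0.1], [0.8, 0.9], [0.5, 0.4]]
coeffs = [0.7, -0.3, 0.4]
gamma = 0.5
X = [[0.1, 0.3], [0.6, 0.5], [0.9, 0.2]]
S = [sensitivity(X, l, svs, coeffs, gamma) for l in range(2)]
```

The bias term drops out of the derivative, so only the support vectors, their coefficients, and the RBF parameter are needed to rank the feature factors.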


From the sample data in the training set, the sensitivity of the SVM output to each feature factor in the SVM input can be obtained with Equation (10). The feature factor sensitivity can be regarded as an indicator of the contribution of the feature factor to the SVM output. The feature factor with the smallest contribution is eliminated, and the remaining feature factors constitute a new input vector. A new SVM model is built after the SVM parameters are determined again, and its CVMSE is compared with the best CVMSE. If the CVMSE change is less than the preset threshold, eliminating this feature factor has little effect on the forecasting precision of the SVM model, and the feature factor can be eliminated. If the CVMSE change is larger than the preset threshold, eliminating this feature factor decreases the forecasting precision of the SVM model, and the feature factor should be retained. The above process is repeated to determine whether other feature factors can be eliminated.


2.4. Construction process of SVM-based static model monitoring dam safety


There are two kinds of observed quantity on dam safety, namely dam structural behavior (effect-quantity) and its cause (influence-quantity). The effect-quantity includes displacement, stress, strain, seepage, etc. The influence-quantity includes water level (water pressure), temperature (water temperature, air temperature, internal temperature of dam body and dam foundation), sediment pressure, seismic load, etc. The influence-quantity and effect-quantity are regarded as the independent variables $(x_1, \ldots, x_n)$ and the dependent variable $y$, respectively. For an actual dam, the relationship between the dependent and independent variables is strongly nonlinear. Dam displacement is taken as an example. The dam displacement $\delta$ caused by the action of water load, temperature load, and other loads can be divided into the water level component $\delta_H$, the temperature component $\delta_T$, and the aging component $\delta_\theta$ [15], namely

$$\delta = \delta_H + \delta_T + \delta_\theta \tag{12}$$
The water level component $\delta_H$ represents the displacement change caused by the rotation and deformation of the dam body and its foundation under the action of water load and the self-weight of the dam body. It is usually elastic or recoverable. The water level component $\delta_H$ is mainly related to the upstream water depth, namely

$$\delta_H = \sum_{i=1}^{n} a_i H^i \tag{13}$$

where $H$ is the upstream water depth; $n$ is taken as 3 for a gravity dam and 4 for an arch dam.
The temperature component $\delta_T$ expresses the displacement caused by the temperature change of the dam body concrete and the dam foundation rock. When there are enough thermometers in the dam body and dam foundation to describe the dam temperature field, the temperature component $\delta_T$ can be calculated with Equation (14). For a concrete dam that has been running for many years, the internal temperature field can be regarded as quasi-steady. If no temperature observations are available for such a dam, the temperature component $\delta_T$ can be calculated with Equation (15).

$$\delta_T = \sum_{i=1}^{m_1} b_i T_i \quad \text{or} \quad \delta_T = \sum_{i=1}^{m_2} b_{1i}\bar{T}_i + \sum_{i=1}^{m_2} b_{2i}\beta_i \tag{14}$$

$$\delta_T = \sum_{i=1}^{m_3}\left(b_{1i}\sin\frac{2\pi i t}{365} + b_{2i}\cos\frac{2\pi i t}{365}\right) \tag{15}$$

where $T_i$ represents the observation of the $i$th thermometer; $\bar{T}_i$ and $\beta_i$ denote the average value of the observed temperature at the $i$th layer and the temperature gradient, respectively; $t$ represents the cumulative number of days between the beginning day and the monitoring day; $m_1$ is the number of thermometers in the dam; $m_2$ is the number of layers in which thermometers are installed; and $m_3$ is the number of cycles, usually taken as 1 or 2, which represents the annual cycle or the annual and semiannual cycles, respectively.
The aging component $\delta_\theta$ represents the creep and plastic deformation of the dam body and dam foundation, the autogenous volume deformation caused by dam cracks, and the irreversible displacement. It is usually described with the following combination of a linear function and a logarithmic function.

$$\delta_\theta = c_1\theta + c_2\ln\theta \tag{16}$$

where $\theta = t/100$.
The observed data series pairs $(x_k, y_k)$, $k = 1, 2, \ldots, n$, are selected. $x_k \in R^d$ and $y_k \in R$ are taken as the input vector and output of SVM, respectively. The first $N$ groups of observed data are used to train the SVM, and an SVM-based static model monitoring dam safety is built. To assess the forecasting ability of the built model, the mean square error (MSE) in Equation (17) and the squared correlation coefficient in Equation (18) are used.

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(f(x_i) - y_i\right)^2 \tag{17}$$


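$$r^2 = \frac{\left(N\sum_{i=1}^{N} f(x_i)\,y_i - \sum_{i=1}^{N} f(x_i)\sum_{i=1}^{N} y_i\right)^2}{\left(N\sum_{i=1}^{N} f(x_i)^2 - \left(\sum_{i=1}^{N} f(x_i)\right)^2\right)\left(N\sum_{i=1}^{N} y_i^2 - \left(\sum_{i=1}^{N} y_i\right)^2\right)} \tag{18}$$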
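The two indicators of Equations (17) and (18) can be computed as in the sketch below. The function names and the short observation and forecast series are illustrative assumptions, not values from the dam example.

```python
def mse(pred, obs):
    """Equation (17): mean square error between forecasts and observations."""
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)


def squared_correlation(pred, obs):
    """Equation (18): squared correlation coefficient r^2."""
    n = len(obs)
    s_p, s_o = sum(pred), sum(obs)
    s_po = sum(p * o for p, o in zip(pred, obs))
    s_pp = sum(p * p for p in pred)
    s_oo = sum(o * o for o in obs)
    numerator = (n * s_po - s_p * s_o) ** 2
    denominator = (n * s_pp - s_p ** 2) * (n * s_oo - s_o ** 2)
    return numerator / denominator


# Illustrative series only.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]
example_mse = mse(pred, obs)
example_r2 = squared_correlation(pred, obs)
```

Note that the squared correlation coefficient equals 1 for any forecast that is an exact linear function of the observations, even a biased one, which is why it is reported together with the MSE.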

Figure 3 shows the flowchart for building the SVM-based static model monitoring dam safety. The main steps are as follows.
(1) Select the feature factors forming an initial input set F0 of SVM according to Equations (12)–(16), determine the SVM parameters, and obtain the corresponding mean square error of cross validation and the minimum mean square error of cross validation, which are denoted as CVMSE0 and CVMSE*, respectively. The sample data corresponding to F0 are used to train the SVM, and a model (Model0) monitoring dam safety is built. Set the threshold A.
(2) Calculate the sensitivity of the SVM output to each feature factor in F0 with Equations (10) and (11).
(3) Eliminate the feature factor with the minimum sensitivity, and form the input set F from the remaining feature factors.
(4) According to the new input set F, determine the SVM parameters again, calculate the corresponding mean square error of cross validation (CVMSE), train the SVM, and build a new model (Model) monitoring dam safety. If CVMSE ≤ CVMSE*, let CVMSE* = CVMSE. If (CVMSE − CVMSE*) ≤ A, let the current optimal input set F0 = F and the current optimal model Model0 = Model, and return to Step (2). Otherwise, output the optimal input set F0 and the optimal model Model0, and exit the program.
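Steps (1) to (4) amount to a backward elimination loop, which can be sketched as follows. This is a sketch under stated assumptions rather than the authors' program: `evaluate(features)` is a hypothetical callback that re-tunes and retrains the SVM on the given feature factors and returns the CVMSE together with the sensitivity of each factor, and the toy callback below only imitates that behavior with made-up numbers.

```python
def backward_elimination(features, evaluate, threshold):
    """Drop the least sensitive factor while the CVMSE stays within `threshold`.

    evaluate(features) -> (cvmse, {feature: sensitivity}) is a hypothetical
    callback standing in for parameter search, training and Equation (10).
    """
    current = list(features)                      # F0
    cvmse, sens = evaluate(current)               # Step (1)
    best_cvmse = cvmse                            # CVMSE*
    while len(current) > 1:
        weakest = min(current, key=lambda name: sens[name])      # Steps (2)-(3)
        candidate = [name for name in current if name != weakest]
        cand_cvmse, cand_sens = evaluate(candidate)               # Step (4)
        best_cvmse = min(best_cvmse, cand_cvmse)
        if cand_cvmse - best_cvmse > threshold:
            break                                 # keep `weakest` and stop
        current, sens = candidate, cand_sens      # accept the smaller input set
    return current, best_cvmse


# Toy stand-in: each factor has a fixed made-up importance.
importance = {"H": 0.5, "H2": 0.3, "sin": 0.2, "noise": 0.00001}


def toy_evaluate(feats):
    missing = [k for k in importance if k not in feats]
    cvmse = 0.002 + 0.01 * sum(importance[k] for k in missing)
    return cvmse, {k: importance[k] for k in feats}


kept, best = backward_elimination(list(importance), toy_evaluate, threshold=1e-4)
```

In the toy run the nearly irrelevant factor is removed, while removing the next candidate would raise the CVMSE by more than the threshold, so the loop stops with three factors retained.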

Figure 3. Establishing process of SVM-based static model monitoring dam safety.


3. REAL TIME UPDATED MODEL MONITORING DAM SAFETY


3.1. Support vector feature analysis
For an SVM regression problem, assume that $\hat{\alpha}_{n \times 1}$ and $\hat{\alpha}^*_{n \times 1}$ are the solutions of the optimization problem in Equation (5). The following can be derived from the Karush–Kuhn–Tucker (KKT) conditions.
If $\hat{\alpha}_i = \hat{\alpha}_i^* = 0$, then the corresponding sample point $(x_i, y_i)$ falls inside the $\varepsilon$-band or on its boundary, namely $|y_i - f(x_i)| \le \varepsilon$ is satisfied. It is a non-support vector.
If $0 < \hat{\alpha}_i < C$ and $\hat{\alpha}_i^* = 0$, or $\hat{\alpha}_i = 0$ and $0 < \hat{\alpha}_i^* < C$, then the corresponding sample point $(x_i, y_i)$ falls on the $\varepsilon$-band boundary, namely $|y_i - f(x_i)| = \varepsilon$ is satisfied. It is a boundary support vector.
If $\hat{\alpha}_i = C$ and $\hat{\alpha}_i^* = 0$, or $\hat{\alpha}_i = 0$ and $\hat{\alpha}_i^* = C$, then the corresponding sample point $(x_i, y_i)$ falls outside the $\varepsilon$-band or on its boundary, namely $|y_i - f(x_i)| \ge \varepsilon$ is satisfied. It is a non-boundary support vector.
A new sample point $(x_i, y_i)$ has not participated in the previous training of SVM, so its Lagrange multipliers are $\alpha_i = \alpha_i^* = 0$. Therefore, the new sample point $(x_i, y_i)$ satisfies the KKT conditions if and only if $|y_i - f(x_i)| \le \varepsilon$.
If a new sample point satisfies the KKT conditions, retraining the SVM with it does not affect the support vector distribution, namely the decision function is not changed. Hence, if a new sample point satisfies $|y_i - f(x_i)| \le \varepsilon$, the model monitoring dam safety does not need to be reconstructed.
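The case analysis above can be summarized in a short sketch. The function names and the numerical tolerance are illustrative assumptions; the three labels follow the terminology used in this section.

```python
def classify_sample(alpha, alpha_star, C, tol=1e-12):
    """Label a training sample from its pair of Lagrange multipliers."""
    if alpha <= tol and alpha_star <= tol:
        return "non-support vector"            # inside the band or on its boundary
    if max(alpha, alpha_star) >= C - tol:
        return "non-boundary support vector"   # a multiplier has reached C
    return "boundary support vector"           # a multiplier lies strictly in (0, C)


def new_sample_satisfies_kkt(y_new, f_new, eps):
    """A new sample has alpha = alpha^* = 0, so KKT holds iff |y - f(x)| <= eps."""
    return abs(y_new - f_new) <= eps


# Illustrative checks with made-up numbers.
label = classify_sample(alpha=0.3, alpha_star=0.0, C=1.0)
needs_update = not new_sample_satisfies_kkt(y_new=1.00, f_new=1.05, eps=0.01)
```

The second function is the only test needed when a new observation arrives: it requires one forecast and no optimization.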
If a new sample point does not satisfy the KKT conditions, the support vector distribution will change after the SVM is retrained. An original support vector may become a non-support vector, or an original non-support vector may become a support vector, as shown in Figure 4. Figure 4 (a) illustrates the results of the regression analysis for seven training sample points. In Figure 4 (a), the two sample points marked with solid dots, namely No. 8 and No. 9, do not participate in the training of SVM. These two points fall outside the $\varepsilon$-band and do not satisfy the KKT conditions. Figure 4 (b) illustrates the results of the regression analysis for the new training sample set consisting of the seven original sample points and the two added points that do not satisfy the KKT conditions. The following can be seen from Figure 4 (b). The added point No. 8 becomes a support vector, and the added point No. 9 becomes a non-support vector. The original non-support vector No. 6 becomes a support vector, and the original support vector No. 7 becomes a non-support vector. For the sample points far away from the two added points, namely Nos. 1–5, the support vector distribution changes little, and the non-support vectors among them do not affect the decision function. Therefore, if a new sample point satisfies $|y_i - f(x_i)| > \varepsilon$, a new training sample set, made up of the newly added points, the points near them, and the support vectors of the original training sample set, needs to be used to train the SVM and reconstruct the model monitoring dam safety.

Figure 4. Support vector distribution diagram under the influence of new sample points: (a) before and (b) after the new sample points are added to the training set.


3.2. Construction process of real time updated model monitoring dam safety
Figure 5 shows the flowchart for dynamically updating the SVM-based model monitoring dam safety. The following steps need to be implemented.
(1) Select the first $m$ data pairs from the observed data series as the initial training sample set of SVM. Train the SVM and build a static model monitoring dam safety with the method presented in Section 2.4.
(2) Forecast the $(m+1)$th value $f(x_{m+1})$ of the effect-quantity with the built model.
(3) Calculate the error $|y_{m+1} - f(x_{m+1})|$ between the forecasted value $f(x_{m+1})$ and the observation $y_{m+1}$ collected in real time.
(4) If $|y_{m+1} - f(x_{m+1})| \le \varepsilon$ ($\varepsilon$ is the insensitive loss function parameter), let $m = m + 1$ and go to Step (2). Otherwise go to Step (5).
(5) Construct a new training sample set, namely {support vector samples in the original training sample set} ∪ {newly added sample ($y_{m+1}$)} ∪ {$l$ samples ($y_m, y_{m-1}, \ldots, y_{m-l+1}$) near $y_{m+1}$}. Update the length ($m$) of the training sample set.
(6) Train the SVM on the new training sample set, reconstruct the model monitoring dam safety, and go to Step (2).
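The update loop of Steps (1) to (6) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' program: `train(samples)` is a hypothetical callback that returns a forecasting function and the indices of the support vectors within the training set, and `toy_train` is a deliberately crude stand-in (it forecasts the mean of the training outputs) used only to exercise the control flow in place of a real SVM trainer.

```python
def update_in_real_time(stream, m, train, eps, n_recent):
    """stream: (x, y) pairs in time order; the first m pairs form the initial set.

    train(samples) -> (predict, support_indices) is a hypothetical callback.
    Returns the one-step forecasts and the number of model reconstructions.
    """
    training_set = list(stream[:m])                    # Step (1)
    predict, sv_idx = train(training_set)
    forecasts, retrain_count = [], 0
    for k in range(m, len(stream)):
        x_new, y_new = stream[k]
        f_new = predict(x_new)                         # Step (2)
        forecasts.append(f_new)
        if abs(y_new - f_new) <= eps:                  # Steps (3)-(4): KKT holds
            continue
        # Step (5): support vectors + the l most recent samples + the new sample.
        support = [training_set[i] for i in sv_idx]
        recent = list(stream[max(0, k - n_recent):k])
        new_set = []
        for sample in support + recent + [stream[k]]:
            if sample not in new_set:                  # avoid duplicated samples
                new_set.append(sample)
        training_set = new_set
        predict, sv_idx = train(training_set)          # Step (6)
        retrain_count += 1
    return forecasts, retrain_count


def toy_train(samples):
    """Crude stand-in for SVM training (assumption, for illustration only)."""
    mean_y = sum(y for _, y in samples) / len(samples)
    sv = [i for i, (_, y) in enumerate(samples) if abs(y - mean_y) >= 0.05]
    return (lambda x: mean_y), sv


# Made-up series whose level shifts from 1.0 to 2.0 after five observations.
stream = [((t,), 1.0) for t in range(5)] + [((t,), 2.0) for t in range(5, 10)]
forecasts, retrain_count = update_in_real_time(stream, 5, toy_train, 0.1, 2)
```

With a stationary series the loop never retrains, whereas after the level shift in the made-up series every new observation violates the band and triggers a reconstruction on a small training set.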

Figure 5. Establishing process of real time updated model monitoring dam safety.

4. EXAMPLE ANALYSIS
A roller compacted concrete gravity dam is taken as an example. The maximum dam height is 113.0 m, the length of the dam crest is 308.5 m, and the elevation of the dam crest is 179.0 m. One measuring device is arranged at the crest of the No. 5 dam section to observe the horizontal displacement along the river direction. Based on the collected observations, the proposed method is adopted to analyze and forecast the horizontal displacement of the No. 5 dam section crest. Figure 6 shows the time curve of the horizontal displacement measured daily from January 1, 2003 to December 31, 2007. Based on the pretreated observations, the proposed method is used to build two models monitoring dam displacement, namely the static model and the real time updated model, and the fitting and forecasting abilities of the two models are compared. It should be pointed out that the modeling method proposed in this paper is universal and can also be used to analyze and forecast other structural behavior (stress, seepage, etc.). Only the input and output of SVM need to be changed.

Figure 6. Time curve of measured horizontal displacement.

4.1. Static model monitoring horizontal displacement


The 1461 sample points from January 1, 2003 to December 31, 2006 are chosen to train the SVM and build the model monitoring horizontal displacement. The 364 sample points in 2007 are used to test the forecasting ability of the built model. The horizontal displacement is regarded as the SVM output. According to Equations (12)–(16), the following factors are taken to form an initial input set $F_0$ of SVM.

$$F_0 = \{x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8, x_9\} = \left\{H,\ H^2,\ H^3,\ \sin\frac{2\pi t}{365},\ \cos\frac{2\pi t}{365},\ \sin\frac{4\pi t}{365},\ \cos\frac{4\pi t}{365},\ \ln\theta,\ \theta\right\} \tag{19}$$
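For one observation day, the nine candidate factors of Equation (19) can be assembled as in the sketch below. The function name and the sample values of the water depth and day count are illustrative assumptions; the day count must be positive so that the logarithm is defined.

```python
import math


def initial_features(H, t):
    """Nine candidate factors of Equation (19) for water depth H and day count t."""
    theta = t / 100.0
    return [
        H, H ** 2, H ** 3,                    # water level component, Equation (13)
        math.sin(2 * math.pi * t / 365),      # annual cycle, Equation (15)
        math.cos(2 * math.pi * t / 365),
        math.sin(4 * math.pi * t / 365),      # semiannual cycle, Equation (15)
        math.cos(4 * math.pi * t / 365),
        math.log(theta), theta,               # aging component, Equation (16)
    ]


x = initial_features(H=100.0, t=365)          # hypothetical water depth and day
```

The water level factors in this example are several orders of magnitude larger than the periodic factors, which is the reason for the normalization step that follows.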

The dimensions and ranges of the feature factors in $F_0$ are different, which would probably reduce the fitting and forecasting precision. In Equation (19), the numerical difference between the feature factors of the water level component and the other feature factors is large. All feature factors in $F_0$ are therefore normalized with Equation (20).

$$\bar{x}_{ij} = \left(\bar{x}_{j\max} - \bar{x}_{j\min}\right)\frac{x_{ij} - x_{j\min}}{x_{j\max} - x_{j\min}} + \bar{x}_{j\min} \tag{20}$$

where $x_{ij}$ and $\bar{x}_{ij}$ are the $i$th values of the $j$th feature factor before and after normalization; $x_{j\min}$ and $x_{j\max}$ are the minimum and maximum values of the $j$th feature factor; and $\bar{x}_{j\min}$ and $\bar{x}_{j\max}$ are the parameters of the mapping range, which can be taken as 0 and 1 or 1 and 2.
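Equation (20) is a min-max scaling applied to each feature factor separately. A minimal sketch follows, with an illustrative function name and made-up water depths; a constant feature is rejected because the denominator of Equation (20) would be zero.

```python
def normalize_column(values, lo=0.0, hi=1.0):
    """Equation (20): map one feature factor linearly onto [lo, hi]."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        raise ValueError("a constant feature cannot be min-max scaled")
    scale = (hi - lo) / (v_max - v_min)
    return [scale * (v - v_min) + lo for v in values]


H_series = [95.0, 100.0, 110.0]               # hypothetical water depths
H_scaled = normalize_column(H_series)         # mapped onto [0, 1]
H_shifted = normalize_column(H_series, 1.0, 2.0)  # mapped onto [1, 2]
```

In a forecasting setting the minimum and maximum would normally be taken from the training period and reused for later samples; that choice is an assumption here, since the text does not state it.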
RBF is taken as the kernel function of SVM. The static model monitoring horizontal displacement is built by analyzing the influence of the SVM parameters on the performance of the SVM model and optimizing the feature factors of the SVM input.

4.1.1. Impact analysis for SVM parameters.

(1) Impact analysis for insensitive loss function parameter ε

The variation of the fitting and forecasting ability of the built model with the insensitive loss function parameter ε is analyzed. ε is taken in turn as 0.1, 0.05, 0.01, 0.005, 0.001, and 0.0001. For each given ε, the grid search method is adopted to determine the penalty factor C and the RBF parameter γ. Table I lists the calculated indicators assessing the fitting and forecasting ability of the built model under different ε. The following can be seen from Table I.

• As the insensitive loss function parameter ε decreases, the support vector number increases gradually.


Table I. Calculated indicators assessing the fitting and forecasting ability under different ε.
ε        Support vector number   Fitting MSE   Fitting r²   Forecasting MSE   Forecasting r²
0.1      18                      0.00293       0.9615       0.00332           0.9233
0.05     199                     0.00122       0.9809       0.00234           0.9456
0.01     905                     0.00082       0.9870       0.00197           0.9555
0.005    1135                    0.00082       0.9867       0.00198           0.9526
0.001    1385                    0.00081       0.9865       0.00199           0.9522
0.0001   1457                    0.00081       0.9865       0.00203           0.9514

• When the given ε is too large, the time-varying characteristic of the sample series cannot be learned sufficiently, and the fitting and forecasting precision is low. As ε becomes smaller, the fitting and forecasting precision improves gradually.
• When ε is reduced beyond a certain extent, the fitting precision is no longer greatly improved, while the forecasting precision is reduced, namely the over-learning phenomenon appears.
According to the above results, it is suggested that the insensitive loss function parameter ε be taken as 0.01.
(2) Impact analysis for penalty factor C
The penalty factor C is taken in turn as 0.01, 0.1, 1, 10, and 100, with the insensitive loss function parameter ε = 0.01 and the RBF parameter γ = 0.1895. Table II lists the calculated indicators assessing the fitting and forecasting ability of the built model under different C. The following can be seen from Table II.

• The support vector number decreases gradually as the penalty factor C increases.
• A smaller C means that the punishment for fitting loss is light, which can cause low fitting and forecasting precision.
• As C increases, the fitting precision improves gradually, but the forecasting precision increases and then decreases, namely the over-learning phenomenon appears, which reduces the generalization ability of the built model.
The above impact analysis for the penalty factor C suggests that a small C should be chosen provided that the expected fitting precision is ensured.
(3) Impact analysis for RBF parameter
The RBF parameter is taken in turn as 0.01, 0.1, 1, 10, and 100, with the insensitive loss
function parameter ε = 0.01 and the penalty factor C = 1.7411. Table III lists the calculated indicators
assessing the fitting and forecasting ability of the built model for different values of the RBF
parameter. It can be seen from Table III that, as the RBF parameter increases, the number of support
vectors decreases and then increases, and the fitting precision and the forecasting precision increase
and then decrease.

Table II. Calculated indicators assessing the fitting and forecasting ability under different C.
C       Support vector number   Fitting MSE   Fitting r²   Forecasting MSE   Forecasting r²
0.01    1223                    0.00431       0.9536       0.00683           0.9250
0.1     1043                    0.00147       0.9764       0.00182           0.9630
1       913                     0.00087       0.9860       0.00180           0.9596
10      875                     0.00061       0.9900       0.00327           0.9374
100     789                     0.00034       0.9944       0.01580           0.8421

Table III. Calculated indicators assessing the fitting and forecasting ability under different RBF parameters.
RBF parameter   Support vector number   Fitting MSE   Fitting r²   Forecasting MSE   Forecasting r²
0.01            1176                    0.00286       0.9554       0.00375           0.9107
0.1             942                     0.00105       0.9829       0.00193           0.9648
1               597                     0.00018       0.9970       0.00988           0.8435
10              201                     0.00005       0.9994       0.02197           0.8816
100             821                     0.00009       0.9995       0.06509           0.2861
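One plausible reading of the trend in Table III is that a large RBF parameter makes the kernel so narrow that each training sample resembles only itself, so the model memorizes the training set and forecasts poorly. The sketch below illustrates this under a stated assumption: the kernel is written in the common form K(x, z) = exp(-g‖x - z‖²), which may differ from the convention used in the paper, and the two input points are hypothetical.

```python
import math


def rbf_kernel(x, z, g):
    """Assumed RBF form K(x, z) = exp(-g * ||x - z||^2); g is the RBF parameter."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-g * sq_dist)


# Two hypothetical input points at unit distance from each other.
x, z = (0.0, 0.0), (1.0, 0.0)

# Kernel similarity for each RBF parameter value tested in Table III.
similarity = {g: rbf_kernel(x, z, g) for g in (0.01, 0.1, 1, 10, 100)}
for g, k in similarity.items():
    print(f"RBF parameter {g:>6}: K(x, z) = {k:.3e}")
```

Under this assumption the similarity between the two points falls from nearly 1 to practically 0 as the parameter grows, which is consistent with the very small fitting MSE and the poor forecasting r² in the last row of Table III.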
(4) Horizontal displacement monitoring model built by SVM with initial input vector F0
According to the above impact analysis for SVM parameters, the insensitive loss function
parameter ε is set to 0.01. The penalty factor C and the RBF parameter are then determined together,
using the cross validation mean square error (CVMSE) of the training sample set and the penalty factor
C as the indicators assessing the performance of the built model. When the CVMSE values of two
parameter pairs (C, RBF parameter) differ by no more than 10⁻⁴, the pair with the smaller C is
regarded as the better one. The final parameters are C = 1.7411 and RBF parameter = 0.1895. After the
three parameters are determined, the SVM with the initial input vector F0 given in Equation (19) is
trained, and the model monitoring horizontal displacement, denoted as Model 1, is built.
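The selection rule for C and the RBF parameter can be sketched as follows. This is one interpretation, not the authors' code: every pair whose CVMSE lies within the tolerance of the lowest CVMSE is treated as equally good, and the pair with the smallest C is then preferred. The function name and the candidate values are hypothetical.

```python
def select_parameters(candidates, tol=1e-4):
    """Pick (C, rbf_param, cvmse) from a list of such tuples.

    Pairs whose CVMSE is within `tol` of the lowest CVMSE count as
    equivalent; among them the smallest C wins (ties broken by CVMSE).
    """
    best_cvmse = min(cvmse for _, _, cvmse in candidates)
    near_best = [c for c in candidates if c[2] - best_cvmse <= tol]
    return min(near_best, key=lambda c: (c[0], c[2]))


# Hypothetical grid-search results, for illustration only.
candidates = [
    (8.0, 0.25, 8.40e-4),  # lowest CVMSE, but large C
    (2.0, 0.25, 8.90e-4),  # within 1e-4 of the lowest CVMSE, smaller C
    (0.5, 0.25, 1.20e-3),  # smallest C, but CVMSE too high
]

chosen = select_parameters(candidates)
print("chosen (C, RBF parameter, CVMSE):", chosen)
```

Preferring the smaller C among near-equivalent pairs follows the earlier observation that a large C encourages over-learning.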

4.1.2. Sensitivity analysis for input feature factors. Let the threshold A = 0.0001. The feature factors in
the SVM input are optimized by analyzing the sensitivity of the SVM output to each input feature
factor. The selection process is given in Table IV. After cos(4πt/365) and H are eliminated in turn, the
condition (CVMSE - CVMSE*) ≤ A is satisfied, which means that eliminating these two factors does
not noticeably harm the forecasting precision of the built model. When sin(4πt/365) is then
eliminated, (CVMSE - CVMSE*) > A, which means that eliminating this factor reduces the forecasting
precision, so this factor should be retained. It should be noted that, after one feature factor is
eliminated, the sensitivity of the SVM output to the remaining feature factors changes, so the
sensitivities must be recalculated with Equations (10) and (11).
Table IV lists the calculated indicators assessing the fitting and forecasting ability of the built model
during the feature factor selection. It can be seen from Table IV that Model 1, Model 2, and Model 3
have about the same fitting and forecasting ability, but the modeling time of Model 3 is reduced
from 346 s to 264 s. Equation (21) shows the optimal input vector F0 of the SVM, which keeps the
forecasting precision high with as few feature factors as possible.



$$F_0 = \left\{ H^2,\ H^3,\ \sin\frac{2\pi t}{365},\ \cos\frac{2\pi t}{365},\ \sin\frac{4\pi t}{365},\ \ln\theta,\ \theta \right\} \qquad (21)$$
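As a small illustration, the seven-component input vector of Equation (21) can be assembled as below. This is a sketch under assumptions: H is taken to be the water level factor, t the day count used in the periodic terms, and theta the time factor whose logarithm also enters the vector. Their exact definitions belong to Equation (19), which is not reproduced in this section, and the function name and sample values are hypothetical.

```python
import math


def optimal_input_vector(H, t, theta):
    """Build [H^2, H^3, sin(2*pi*t/365), cos(2*pi*t/365), sin(4*pi*t/365), ln(theta), theta].

    H, t and theta are assumed to be already computed for one observation;
    theta must be positive because its logarithm is taken.
    """
    w = 2.0 * math.pi * t / 365.0  # annual angular argument
    return [
        H ** 2,
        H ** 3,
        math.sin(w),
        math.cos(w),
        math.sin(2.0 * w),  # semi-annual term sin(4*pi*t/365)
        math.log(theta),
        theta,
    ]


# Hypothetical values for a single observation.
features = optimal_input_vector(H=2.0, t=365.0, theta=1.0)
print(features)
```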

Table IV. Fitting and forecasting ability of different models.

Indicator                    Model 1      Model 2      Model 3      Model 4
ε                            0.01         0.01         0.01         0.01
C                            1.7411       9.1896       3.0314       9.1896
RBF parameter                0.1895       0.1895       0.1895       0.1088
CVMSE                        8.66e-4      8.41e-4      9.37e-4      15e-4
CVMSE*                       8.66e-4      8.41e-4      8.41e-4      8.41e-4
CVMSE - CVMSE*               0            0            0.96e-4      6.59e-4
Sensitivity of feature x1    0.0727       0.0793       Eliminated   Eliminated
Sensitivity of feature x2    0.0936       0.0998       0.1090       0.0963
Sensitivity of feature x3    0.1097       0.1126       0.1652       0.1911
Sensitivity of feature x4    0.6007       0.5927       0.5937       0.6493
Sensitivity of feature x5    0.1632       0.1207       0.1194       0.1657
Sensitivity of feature x6    0.0652       0.0965       0.0847       Eliminated
Sensitivity of feature x7    0.0606       Eliminated   Eliminated   Eliminated
Sensitivity of feature x8    0.1913       0.2243       0.2035       0.2084
Sensitivity of feature x9    0.2124       0.2544       0.2066       0.2624
Fitting MSE                  0.0008       0.0007       0.0009       0.0014
Fitting r²                   0.9870       0.9884       0.9861       0.9777
Forecasting MSE              0.0019       0.0021       0.0018       0.0023
Forecasting r²               0.9555       0.9513       0.9568       0.9518
Modeling time (s)            346          313          264          246


Model 3 is taken as the optimal model, namely the final static model monitoring horizontal
displacement. Figure 7 shows the fitted and forecasted results of Model 3.

4.2. Real time updated model monitoring horizontal displacement


Based on the result obtained in Section 4.1.2, the SVM is structured as follows:
F0 = {H², H³, sin(2πt/365), cos(2πt/365), sin(4πt/365), ln θ, θ} is taken as the input vector of the SVM
and the horizontal displacement is regarded as the SVM output. The 365 sample points from January 1,
2003 to December 31, 2003 constitute the initial training set of the SVM. The insensitive loss function
parameter ε is set to 0.01. The penalty factor C and the RBF parameter, determined with the grid
search method, are C = 0.399 and RBF parameter = 1.7411. An initial model monitoring horizontal
displacement is built by training the SVM. The effect range (l) of a new sample point that does not
satisfy the KKT condition is set to 10. The 1460 sample points from January 1, 2004 to December 31,
2007 are used to update dynamically the model monitoring horizontal displacement, and the real time
forecast of horizontal displacement is implemented. Figure 8 shows the fitted and forecasted results
of the real time updated model. The fitting mean square error of the initial model is 0.000232 and the
forecasting mean square error of the real time updated model is 0.000580. Compared with the static
model built in Section 4.1.2 (forecasting MSE of 0.0018 for Model 3 in Table IV), the forecasting
precision of the real time updated model is higher.

Figure 7. Time curve on calculated result of static model: (a) time curve of horizontal displacement; (b) time curve of modeling error.

Figure 8. Time curve on calculated result of real time updated model: (a) time curve of horizontal displacement; (b) time curve of modeling error.
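The trigger for the real time update in Section 4.2 can be illustrated with a minimal sketch. It relies on a standard property of ε-insensitive support vector regression rather than on the authors' code: a new sample that is not yet a support vector satisfies the KKT conditions only if its residual lies inside the ε-tube, so a residual larger than ε signals that the model should be updated. The function names and the sample values are hypothetical, the predictions are held fixed for simplicity, and the handling of the effect range l is not modelled.

```python
def violates_kkt(observed, predicted, eps=0.01):
    """True if a new sample falls outside the eps-tube of the current model."""
    return abs(observed - predicted) > eps


def samples_needing_update(observed, predicted, eps=0.01):
    """Return the indices of new samples that would trigger a model update."""
    return [
        i for i, (y, f) in enumerate(zip(observed, predicted))
        if violates_kkt(y, f, eps)
    ]


# Hypothetical observations and forecasts of horizontal displacement.
observed = [1.00, 1.02, 1.05]
predicted = [1.005, 1.00, 1.045]

to_update = samples_needing_update(observed, predicted, eps=0.01)
print("indices triggering an update:", to_update)
```

Samples that stay inside the tube leave the model unchanged, so only the observations that contradict the current mapping cost any retraining time.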

5. CONCLUSIONS
To advance the adaptability of the SVM-based model monitoring dam safety and to reduce the modeling
time, methods are studied to optimize the key parameters and the input vector of the SVM. By making
the most use of new observations, a method is proposed to update the model monitoring dam safety
in real time. The displacement of one actual dam is taken as an example to compare the modeling
efficiency and forecasting ability of two monitoring models built in static and dynamic patterns.
(1) The impact of the SVM parameters on the performance of the built model is analyzed, and
the criteria for determining the SVM parameters are presented. Based on the restriction of the input
space dimension in training the SVM, a sensitivity-based optimization method is proposed to select
the input vector of the SVM. The SVM-based static model monitoring dam safety is built by combining
the above achievements. The example shows that, with the SVM key parameters and input vector
optimized, the modeling time is greatly reduced while the forecasting precision is maintained.
(2) By updating the model monitoring dam safety in real time, the time-varying nonlinear mapping
relationship between dam structural behavior (effect-quantity) and its cause (influence-quantity)
can be described more reasonably, and dam structural behavior can be forecasted more
accurately.

ACKNOWLEDGEMENTS

This research has been partially supported by the National Natural Science Foundation of China (SN: 51179066,
41323001, 51139001), the Jiangsu Natural Science Foundation (SN: BK2012036), the Doctoral Program
of Higher Education of China (SN: 20130094110010), the Non-profit Industry Financial Program of MWR
(SN: 201301061, 201201038), the Open Foundation of State Key Laboratory of Hydrology-Water Resources
and Hydraulic Engineering (SN: 20145027612), the Research Program on Natural Science for Colleges and
Universities in Jiangsu Province (SN: 14KJB520016), and the Science and Technology Innovation Foundation
by Nanjing Institute of Technology (SN: CKJ2010010).

REFERENCES

1. Mata J, Tavares de Castro A, S da Costa J. Constructing statistical models for arch dam deformation. Structural Control
and Health Monitoring 2014; 21(3):423437.
2. De Sortis A, Paoliani P. Statistical analysis and structural identication in concrete dam monitoring. Engineering Structures
2007; 29(1):110120.
3. Su HZ, Hu J, Wu ZR. A study of safety evaluation and early-warning method for dam global behavior. Structural Health
Monitoring 2012; 11(3):269279.
4. Gu CS, Zhao EF, Jin Y, Su HZ. Singular value diagnosis in dam safety monitoring effect values. Science China Technolog-
ical Sciences 2011; 54(5):11691176.
5. Stojanovic B, Milivojevic M, Ivanovic M, Milivojevic N, Divac D. Adaptive system for dam behavior modeling based on
linear regression and genetic algorithms. Advances in Engineering Software 2013; 65:182190.
6. Kang F, Li JJ, Xu Q. Damage detection based on improved particle swarm optimization using vibration data. Applied Soft
Computing 2012; 12(8):23292335.
7. Xi GY, Yue JP, Zhou BX, Tang P. Application of an articial immune algorithm on a statistical model of dam displacement.
Computers & Mathematics with Applications 2011; 62(10):39803986.
8. Xu C, Yue D, Deng C. Hybrid GA/SIMPLS as alternative regression model in dam deformation analysis. Engineering
Applications of Articial Intelligence 2011; 25(3):468475.
9. Mata J. Interpretation of concrete dam behaviour with articial neural network and multiple linear regression models.
Engineering Structures 2011; 33(3):903910.
10. Kao CY, Loh CH. Monitoring of long-term static deformation data of Fei-Tsui arch dam using articial neural network-
based approaches. Structural Control and Health Monitoring 2013; 20(3):282303.

Copyright 2015 John Wiley & Sons, Ltd. Struct. Control Health Monit. 2016; 23: 252266
DOI: 10.1002/stc
266 H. SU, Z. CHEN AND Z. WEN

11. Karimi I, Khaji N, Ahmadi MT, Mirzayee M. System identication of concrete gravity dams using articial neural networks
based on a hybrid nite element-boundary element approach. Engineering Structures 2010; 32(11):35833591.
12. Rankovi V, Grujovi N, Divac D, Milivojevi N, Novakovi A. Modelling of dam behaviour based on neuro-fuzzy
identication. Engineering Structures 2012; 35(1):107113.
13. Rankovi V, Grujovi N, Divac D, Milivojevi N. Development of support vector regression identication model for
prediction of dam structural behaviour. Structural Safety 2014; 48:3339.
14. Ou JP, Li H. Structural health monitoring in mainland China: review and future trends. Structural Health Monitoring 2010;
9(3):219231.
15. Su HZ, Wen ZP, Wu ZR. Study on an intelligent inference engine in early-warning system of dam health. Water Resources
Management 2011; 25(6):15451563.
16. Kao LJ, Chiu CC, Lu CJ, Chang CH. A hybrid approach by integrating wavelet-based feature extraction with MARS and
SVR for stock index forecasting. Decision Support Systems 2013; 54(3):12281244.
17. Kuang FJ, Xu WH, Zhang SY. A novel hybrid KPCA and SVM with GA model for intrusion detection. Applied Soft
Computing 2014; 18:178184.
18. Samanta B, Al-Balushi KR, Al-Araimi SA. Articial neural networks and support vector machines with genetic algorithm
for bearing fault detection. Engineering Applications of Articial Intelligence 2003; 16(78):657665.
19. Cheng L, Zheng DJ. Two online dam safety monitoring models based on the process of extracting environmental effect.
Advances in Engineering Software 2013; 57:4856.
20. Su HZ, Wu ZR, Wen ZP. Identication model for dam behavior based on wavelet network. Computer-Aided Civil and
Infrastructure Engineering 2007; 22(6):438448.
21. Vapnik V. The Nature of Statistical Learning Theory. Springer: New York, 1995.
22. Smola AJ, Scholkopf B. A tutorial on support vector regression. Statistics and Computing 2004; 14(3):199222.
23. Martnez Lpez FJ, Martnez Puertas S, Torres Arriaza JA. Training of support vector machine with the use of multivariate
normalization. Applied Soft Computing 2014; 24:11051111.
24. Wang XH, Mao HL, Zhu CM, Huang ZF. Damage localization in hydraulic turbine blades using kernel-independent
component analysis and support vector machines. Proceedings of the Institution of Mechanical Engineers, Part C: Journal
of Mechanical Engineering Science 2009; 223(2):525529.
25. Chapelle O, Vapnik V, Bousquet O, Mukherjee S. Choosing multiple parameters for support vector machines. Machine
Learning 2002; 46(13):131159.
26. Liu XL, Jia DX, Li H. Research on kernel parameter optimization of support vector machine in speaker recognition. Science
Technology and Engineering 2010; 10(7):16691673.
