Article info
Article history:
Received 25 March 2014
Received in revised form 27 May 2014
Accepted 30 May 2014
Available online 28 June 2014
Keywords:
Landslide
LiDAR
GIS
Remote sensing
Conditioning factors
Susceptibility mapping
Malaysia
Abstract
Landslide susceptibility, hazards, and risks have been extensively explored and analyzed in the past decades. However, choosing relevant conditioning factors in such analyses remains a challenging task. Landslide susceptibility mapping employs topographical, environmental, geological, and hydrological parameters. Some researchers assume that the precision of the generated susceptibility map increases with the number of conditioning factors. By contrast, other case studies show that a small number of conditioning factors is sufficient to produce landslide susceptibility maps of reasonable quality. This study investigates the effects of conditioning factors on landslide susceptibility mapping. Bukit Antarabangsa, Ulu Klang, Malaysia was selected as the study area because it is a catchment area with a high potential for landslide occurrence. A spatial database of 31 landslide locations was evaluated to map landslide-susceptible areas. Two datasets of conditioning factors were constructed in a GIS environment. The first dataset was derived from high-resolution airborne laser scanning data (LiDAR) and contains eight landslide conditioning factors: altitude, slope, aspect, curvature, stream power index (SPI), topographic wetness index (TWI), topographic roughness index (TRI), and sediment transport index (STI). The second dataset contains the same conditioning factors as the first dataset plus additional geological and environmental factors: soil, geology, land use/cover (LULC), distance from river, and distance from road. The two datasets were constructed to compare their efficiency in landslide susceptibility zonation. Three types of models, namely weights-of-evidence (WoE; bivariate statistical analysis), logistic regression (LR; multivariate statistical analysis), and a data-driven support vector machine (SVM), were implemented to assess the importance of the conditioning factors and to determine the optimal set. The area under the curve (AUC) was used to assess the obtained results. The prediction rates of WoE, LR, and SVM obtained from only the LiDAR-derived conditioning factors were 59%, 86%, and 84%, respectively. The prediction rates of WoE, LR, and SVM obtained from the second dataset were 65%, 66%, and 69%, respectively. The LiDAR-derived conditioning factors alone were sufficient to generate an accurate landslide susceptibility map. Using additional factors, such as geology and LULC, does not significantly increase the accuracy of the map. The findings of this study can be used as a reference for selecting landslide conditioning factors in future analyses.
© 2014 Elsevier Inc. All rights reserved.
1. Introduction
Landslides are a catastrophic phenomenon and a dynamic process
that contributes to the destruction and transformation of a given landscape (Lee & Pradhan, 2006). Various natural and man-made factors
trigger landslides (Guzzetti, Reichenbach, Cardinali, Galli, & Ardizzone,
2005). Meteorological variations, such as strong or continued precipitation, and tectonic forces, such as earthquakes, are the main factors that
trigger landslides (Huang et al., 2012), although natural forces, such as
rainfall, and human activities also trigger them (Guadagno, Martino, &
Corresponding author. Tel.: +60 3 89466383; fax: +60 3 89468470.
E-mail addresses: biswajeet24@gmail.com, biswajeet@lycos.com (B. Pradhan).
http://dx.doi.org/10.1016/j.rse.2014.05.013
0034-4257/© 2014 Elsevier Inc. All rights reserved.
regions (Domínguez-Cuesta, Jiménez-Sánchez, & Berrezueta, 2007). Differences between the characteristics of the factors should be evaluated to produce a landslide susceptibility map that employs various conditioning factors. The characteristics of conditioning factors vary from area to area; therefore, the first stage in generating a susceptibility map is to assess the importance of each factor (Nefeslioglu, Sezer, Gokceoglu, Bozkir, & Duman, 2010). Constructing the conditioning factors is a difficult task (Jibson & Keefer, 1989), and no specific rule exists to define how many conditioning factors are sufficient for a specific susceptibility analysis. Furthermore, no framework exists for the selection of conditioning factors. These factors are mostly chosen based on the opinions of experts. Numerous studies on susceptibility mapping have been reported in the literature. However, studies on selecting the proper conditioning factors are equally warranted. The lack of comprehensive research on this topic motivated the authors of this study to conduct such an analysis and provide directions for future studies. Landslides occur as a result of the effects of numerous conditioning factors, including meteorology, hydrology, geology, constructions, and geomorphic history (Metternicht, Hurni, & Gogu, 2005; Pradhan & Youssef, 2010). Nonetheless, covering all these factors in a single landslide susceptibility assessment is impossible (Moreiras, 2005). Domínguez-Cuesta et al. (2007) stated that conditioning factors can be grouped into two general categories: factors related to topography and factors related to landslides, geology, and vegetation. Two groups of spatial variables can be constructed further from these categories. The first group represents topographical factors and contains the quantitative variables of altitude, aspect, slope, and plan curvature, all of which can be acquired from the digital elevation model. The second group consists of qualitative variables that pertain to geology and vegetation.
The morphology of the slope, land use/cover (LULC) types, and geological foundation are some of the factors that can be used in landslide mapping (Gorsevski, Gessler, Boll, Elliot, & Foltz, 2006). Morphometric characteristics can be used to recognize various types of landslides (Glenn, Streutker, Chadwick, Thackray, & Dorsch, 2006). Some researchers have attempted to assess the impact of conditioning factors and identify the ones with the most significant impact. Donati and Turrini (2002) determined the most influential factors in landslide occurrences in southeast Umbria, east of Spoleto, and ranked them according to their importance. They identified factors that theoretically have a significant impact on landslides but that in reality had a different impact. Only some of the conditioning factors that they identified, such as lithology, exhibited the predicted impact; others, such as slope steepness and orientation, exerted a less important influence than predicted. Domínguez Cuesta, Jiménez Sánchez, and Rodríguez García (1999) assessed 209 landslide events from 1980 to 1994 in the Cantabrian Mountains in northwestern Spain and found that precipitation was the most influential conditioning factor.
Moreiras (2005) considered lithology and slope the most influential factors in landslide mapping based on the study area of the Rio Mendoza Valley in Argentina. Glenn et al. (2006) stated that topographic factors are highly influential parameters in landslide studies. They assessed the efficiency of laser scanning data (LiDAR)-derived topographic factors in characterizing landslide morphology and activity. According to Oh and Pradhan (2011) and Yilmaz (2009), adding the factors of altitude, topographic roughness index (TRI), stream power index (SPI), and distance from road to the spatial database enhances the accuracy of the final results. In some studies, altitude, slope, and aspect are the primary topographical attributes, whereas SPI, TRI, and topographic wetness index (TWI) are the secondary topographical attributes (Wilson & Gallant, 2000). Geology, slope angle, and LULC were determined to be the most influential conditioning factors in a study by Zêzere et al. (1999). Furthermore, Donati and Turrini (2002) found that geology is the most important factor and considered slope and related factors as secondary factors only.
This brief literature review shows that the conditioning factors
produce various impacts and, for each factor, only certain classes exert
Fig. 1. Landslide location map with the hill-shaded map of Bukit Antarabangsa, Ulu Klang, Malaysia.
landslides. The slide surface of most of these landslides in the study area is less than 4 m deep, and failure occurs during or immediately after intense rainfall (Pradhan, 2013). These incidents were divided into two datasets for training and testing of the WoE, LR, and SVM models. Based on the literature, the most common technique for splitting the inventory dataset is to choose 70% of the locations for training and 30% for validation (Tien Bui, Pradhan, Lofman, & Revhaug, 2012). The same method was applied in this study, and the training dataset (21 landslide locations) was used to build the models. The dependent layer was then produced; it contained pixel values of 0 and 1, which indicate the absence and presence of landslide events, respectively. The selected training and testing datasets are shown in Fig. 1.
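The 70/30 split described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used in the study: the function name `split_inventory`, the fixed random seed, and the integer IDs standing in for mapped landslide locations are all assumptions. Rounding 70% of 31 locations down gives the 21 training locations reported in the text.

```python
import random

def split_inventory(locations, train_fraction=0.7, seed=42):
    """Randomly split landslide locations into training and testing subsets."""
    shuffled = list(locations)
    random.Random(seed).shuffle(shuffled)          # reproducible shuffle (seed is arbitrary)
    n_train = int(len(shuffled) * train_fraction)  # rounds down: 70% of 31 -> 21
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical inventory: 31 IDs standing in for the mapped landslide locations.
inventory = list(range(1, 32))
train, test = split_inventory(inventory)
print(len(train), len(test))  # 21 10
```

Which locations end up in each subset depends on the seed; only the subset sizes are fixed by the 70/30 rule.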
3.2. Landslide conditioning factors
Defining and mapping an appropriate set of conditioning factors correlated to landslide events requires a priori knowledge of the main contributors to the landslides (Guzzetti, Carrara, Cardinali, & Reichenbach, 1999). These conditioning factors are terrain geology and morphology, slope, weather conditions, vegetation density, LULC, and man-made influences. Accessibility to thematic layers differs extensively depending on the kind, scale, and method of data collection. The conditioning factors used in this study were selected from those most commonly used by researchers and those that significantly influence the instability potential of the terrain. Two spatial databases that contain landslide-conditioning factors were designed and created. The first dataset was constructed directly from high-resolution airborne LiDAR data and contains eight landslide-conditioning factors: altitude, slope, aspect, curvature, stream power index (SPI), TWI, TRI, and sediment transport index (STI). The second dataset was constructed from the same conditioning factors plus additional geological and environmental factors: soil, geology, LULC, distance from river, and distance from road. All landslide-conditioning factors were entered into a GIS and
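Table 1
Spatial relationship between each conditioning factor and landslide occurrence extracted by using WoE and LR.

| Layer | Class | S(c) LiDAR | S(c) all |
|---|---|---|---|
| Altitude (m) | 38.75–58.69 | 16.19 | 16.19 |
| | 58.69–66.67 | 5.60 | 5.60 |
| | 66.67–76.64 | 10.51 | 10.51 |
| | 76.64–90.60 | 0.77 | 0.77 |
| | 90.60–112.53 | 43.41 | 43.41 |
| | 112.53–140.44 | 38.56 | 38.56 |
| | 140.44–166.37 | 19.22 | 19.22 |
| | 166.37–194.28 | 13.42 | 13.42 |
| | 194.28–230.17 | 1.16 | 1.16 |
| | 230.17–547.20 | 11.34 | 11.34 |
| Slope (degree) | 0–5.16 | 10.88 | 10.88 |
| | 5.16–10.67 | 1.69 | 1.69 |
| | 10.67–15.83 | 8.92 | 8.92 |
| | 15.83–20.65 | 7.01 | 7.01 |
| | 20.65–25.47 | 2.82 | 2.82 |
| | 25.47–30.63 | 8.96 | 8.96 |
| | 30.63–35.79 | 6.80 | 6.80 |
| | 35.79–41.99 | 2.45 | 2.45 |
| | 41.99–49.90 | 9.36 | 9.36 |
| | 49.90–87.76 | 4.001 | 4.001 |
| Aspect | North | 0 | 0 |
| | Northeast | 2.78 | 2.78 |
| | East | 0.32 | 0.32 |
| | Southeast | 3.78 | 3.78 |
| | South | 13.71 | 13.71 |
| | Southwest | 1.76 | 1.76 |
| | West | 26.08 | 26.08 |
| | Northwest | 1.11 | 1.11 |
| Curvature | Concave | 0.41 | 0.41 |
| | Flat | 5.14 | 5.14 |
| | Convex | 2.98 | 2.98 |
| SPI | 0–12.72 | 8.01 | 8.01 |
| | 12.72–13.52 | −9.50 | −9.50 |
| | 13.52–13.98 | 0.29 | 0.29 |
| | 13.98–14.33 | 1.27 | 1.27 |
| | 14.33–14.67 | 1.79 | 1.79 |
| | 14.67–15.13 | 3.55 | 3.55 |
| | 15.13–15.70 | 1.77 | 1.77 |
| | 15.70–16.39 | 1.52 | 1.52 |
| | 16.39–17.76 | 3.87 | 3.87 |
| | 17.76–29.22 | 14.07 | 14.07 |
| TWI | 2.41–5.71 | −8.98 | −8.98 |
| | 5.71–6.10 | 3.49 | 3.49 |
| | 6.10–6.50 | 0.83 | 0.83 |
| | 6.50–6.89 | 7.18 | 7.18 |
| | 6.89–7.41 | 4.46 | 4.46 |
| | 7.41–8.07 | 0.58 | 0.58 |
| | 8.07–9.01 | 5.51 | 5.51 |
| | 9.01–10.31 | 1.36 | 1.36 |
| | 10.31–12.56 | 5.12 | 5.12 |
| | 12.56–36.02 | 20.03 | 20.03 |
| TRI | 1.03–17.75 | 0.78 | 0.78 |
| | 17.75–25.18 | 3.38 | 3.38 |
| | 25.18–30.75 | 0.11 | 0.11 |
| | 30.75–36.32 | 4.42 | 4.42 |
| | 36.32–43.75 | −10.76 | −10.76 |
| | 43.75–51.18 | 3.77 | 3.77 |
| | 51.18–58.61 | 2.01 | 2.01 |
| | 58.61–67.90 | 4.38 | 4.38 |
| | 67.90–80.90 | 7.91 | 7.91 |
| | 80.90–474.84 | 19.55 | 19.55 |
| STI | 0 | 7.78 | 7.78 |
| | 0–0.80 | 8.31 | 8.31 |
| | 0.80–1.61 | 1.87 | 1.87 |
| | 1.61–2.42 | 3.93 | 3.93 |
| | 2.42–3.23 | 5.64 | 5.64 |
| | 3.23–4.04 | 2.14 | 2.14 |
| | 4.04–5.65 | 5.98 | 5.98 |
| | 5.65–8.08 | 7.47 | 7.47 |
| | 8.08–12.93 | 7.68 | 7.68 |
| | 12.93–206.08 | 12.52 | 12.52 |
| Soil | RGM | | 7.51 |
| | STP | | 5.64 |
| | DLD | | −21.73 |
| | LAA_COL | | 20.34 |

LR coefficients (columns LR LiDAR, LR LiDAR (sig), LR all, LR all (sig)), listed in source order because their row assignment is not recoverable: 0.023, 0.022, 0.003, 0.093, 0.094, 0.096, 0.468, 0.013, 0.013, 0.001, 0.003, 0.060, 0.044, 4.794, 2.520, 0.160, 0.121, 4.553, 0.341, 0.336, 0.025, 0.463, 0.032, 1.095, 0.899, 15.808, 0.566, 8.636, 0, 6.212, 0.974, 4.725, 0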
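Table 1 (continued)

| Layer | Class | S(c) all |
|---|---|---|
| Geology | Acid intrusives | −1.81 |
| | Vein quartz | 0 |
| | Schist | 1.795 |
| LULC | 1 | −8.10 |
| | 2 | 7.91 |
| | 3 | 0 |
| | 4 | 5.56 |
| Distance from river (m) | 0–50 | 11.35 |
| | 51–100 | −34.33 |
| | 101–200 | 24.65 |
| | >200 | 4.35 |
| Distance from road (m) | 0–50 | −13.44 |
| | 51–100 | 20.04 |
| | 101–200 | 10.40 |
| | 201–500 | 3.66 |
| | >500 | 0 |

LR coefficients (columns LR all, LR all (sig)), listed in source order because their row assignment is not recoverable: 7.763, 23.553, 0, 19.213, 33.999, 0, 15.815, 31.474, 0, 0, 0.009, 0.008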
Fig. 2. Input thematic layers: a) Altitude; b) Slope; c) Aspect; d) Curvature; e) SPI; f) TWI; g) TRI; h) STI; i) Soil; j) Geology; k) LULC; l) Distance from river; m) Distance from road.
from the slope and catchment area by using the following equations
under the assumption of steady-state conditions and uniform soil
properties.
$$\mathrm{SPI} = A_s \tan\beta$$
$$\mathrm{TWI} = \ln\left(\frac{A_s}{\tan\beta}\right)$$
$$\mathrm{TRI} = \sqrt{\left|\max{}^2 - \min{}^2\right|}$$
where max and min are the largest and smallest values of the cells in the nine rectangular neighborhoods of altitude, respectively. STI defines the procedure of slope failure and deposition (Fig. 2h) and is computed by using the following equation:
$$\mathrm{STI} = \left(\frac{A_s}{22.13}\right)^{0.6} \left(\frac{\sin\beta}{0.0896}\right)^{1.3}$$
where $\beta$ is the slope at each pixel, and $A_s$ is the upstream area. The measured TRI and STI range from 1 to 474.7 and from 0 to 206, respectively.
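The four indices defined above (SPI, TWI, TRI, and STI) can be evaluated per cell as in the following sketch. It is an illustration under stated assumptions, not the GIS workflow used in the study: the function names are hypothetical, the slope is assumed to be given in radians and to be greater than zero (TWI is undefined on perfectly flat cells), and the 3 x 3 altitude window is a made-up example.

```python
import math

def spi(upstream_area, slope_rad):
    """Stream power index: A_s * tan(beta)."""
    return upstream_area * math.tan(slope_rad)

def twi(upstream_area, slope_rad):
    """Topographic wetness index: ln(A_s / tan(beta)); requires slope > 0."""
    return math.log(upstream_area / math.tan(slope_rad))

def sti(upstream_area, slope_rad):
    """Sediment transport index: (A_s / 22.13)^0.6 * (sin(beta) / 0.0896)^1.3."""
    return (upstream_area / 22.13) ** 0.6 * (math.sin(slope_rad) / 0.0896) ** 1.3

def tri(window):
    """Topographic roughness index from a 3 x 3 window of altitudes:
    sqrt(|max^2 - min^2|) over the nine cells."""
    values = [v for row in window for v in row]
    return math.sqrt(abs(max(values) ** 2 - min(values) ** 2))

# Hypothetical cell: upstream area of 100 units, slope of 30 degrees.
beta = math.radians(30.0)
print(spi(100.0, beta), twi(100.0, beta), sti(100.0, beta))

# Hypothetical 3 x 3 altitude neighborhood (m).
window = [[100.0, 101.0, 102.0],
          [100.0, 101.0, 103.0],
          [99.0, 100.0, 104.0]]
print(tri(window))
```

In practice these quantities are computed for every pixel of the LiDAR-derived elevation model; the scalar functions above only show the arithmetic of each formula.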
In the second dataset, five more factors were added to the LiDAR-derived factors: soil, geology, LULC, distance from river, and distance from road. Soil types were gathered from the study area, which contains four soil types, as shown in Fig. 2i. Geology influences the shear strength of the rock mass, its permeability, and, accordingly, the probability of an increase in neutral pressure in the subsoil. Tectonically, the Selangor State forms part of the Sunda Shield. The geology of this region is almost steady, but urban sprawl results in deforestation and soil erosion, thereby causing severe danger to the slopes (Lee & Pradhan, 2007). Geological data were generated by digitizing geological boundaries, fieldwork, and interpretation of aerial photos. In some studies, geology is one of the most significant conditioning factors in the distribution of landslides. In this study, three geology types were available: acid intrusives, schist, and vein quartz (Fig. 2j). Agriculture is considered the main LULC type in the study area, particularly paddies, rubber, and oil palm. Oil palm and rubber planting increased as a result of the conversion of forests. Urbanization has changed many parts of the study area, primarily through deforestation. LULC is one of the most influential factors in landslide studies because vegetation density affects the soil structure. In tropical regions such as Malaysia, dense vegetation has a critical effect on residual soil. For instance, the effect of precipitation on the slope can be decreased through above-ground interception and storage; plant roots also produce water paths and act as a stabilizing net that strengthens the slope (Evett, Tolk, & Howell, 2006). An area with short, bare vegetation is
4. Methodology
Landslide susceptibility mapping methods can be divided into direct and indirect techniques (Van Westen, Rengers, & Soeters, 2003). In the direct method, an expert defines landslide susceptibility based on his or her opinion about terrain conditions. The indirect method, which is slightly more accurate than the direct method, can be conducted by using statistical or deterministic approaches. These methods recognize landslide-susceptible regions based on information obtained from the correlation between the landslide conditioning factors and the landslide distribution (Tien Bui, Pradhan, Lofman, Revhaug, & Dick, 2012). In recent years, indirect landslide susceptibility mapping has been extensively conducted with the aid of GIS. GIS is appropriate for this type of mapping, in which all landslide contributing factors are integrated with a landslide inventory map by using data integration approaches (Abdallah, Chorowicz, Bou Kheir, & Khawlie, 2005). Three methods were selected in this research to acquire weights for each conditioning factor and to produce landslide susceptibility maps. Each method belongs to a specific analysis category. The methodology flowchart is shown in Fig. 3.
4.1. Weight determination using WoE algorithm
WoE was used to produce statistically derived weights for all classes of the conditioning factor maps. Therefore, all the scale factors were reclassified as required by bivariate statistical analysis (BSA). Many classification techniques exist; however, the quantile method was chosen for this research because of its greater popularity compared with other methods (Tehrany, Pradhan, & Jebur, 2013). Altitude, slope, SPI, TWI, TRI, and STI were categorized into 10 equal-area classes. Curvature was classified into three classes: concave when negative, flat when zero, and convex when positive. However, the quantile classification scheme was not used for distance from road and distance from river, because only small buffers are important; a long distance from the river or road does not have any significant impact on landslide occurrence. As the distance from the river or road increases, landslide occurrence decreases (Pradhan, Sezer, Gokceoglu, & Buchroithner, 2010). The river and road buffers were chosen based on the occurrence of failures adjacent to the river and road. Hence, a 50-m buffer zone was chosen in the study area. The significance of each conditioning factor for landslide occurrence is expressed by the positive and negative weights
$$W_i^{+} = \log_e \frac{P\{B_i \mid S\}}{P\{B_i \mid \bar{S}\}}$$
and
$$W_i^{-} = \log_e \frac{P\{\bar{B}_i \mid S\}}{P\{\bar{B}_i \mid \bar{S}\}}$$
where $B_i$ and $\bar{B}_i$ represent the presence and absence of the landslide conditioning factor, respectively. Furthermore, $S$ denotes the existence of a landslide, whereas $\bar{S}$ represents its absence. The method was implemented by using individual factor maps, which include various categories, to demonstrate the presence or absence of a landslide. For each conditioning factor, $W_i^{+}$ was used for the pixels of a conditioning factor (shown as a class of the conditioning factor) to show the significance of the existence of the factor for landslide occurrence. The existence of the conditioning factor is favorable for landslide occurrence when $W_i^{+}$ is positive but unfavorable when $W_i^{+}$ is negative. Furthermore, the significance of the absence of the factor for landslide occurrence is shown by $W_i^{-}$. A positive $W_i^{-}$ indicates that the absence of the factor is favorable for landslide occurrence. Weights with higher values demonstrate that the conditioning factor is effective for susceptibility mapping, whereas conditioning factors with a zero weight show no correlation with landslide occurrence. Four possible combinations exist for each conditioning factor, of which the frequency, expressed as number of pixels, can be calculated in GIS. Lee and Choi (2004) and Pradhan, Oh, and Buchroithner (2010) provide detailed explanations of WoE modeling.
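The positive and negative weights can be computed directly from the four pixel-count combinations mentioned above. The following is a minimal sketch, assuming hypothetical pixel counts and a hypothetical function name `woe_weights`; it also assumes that none of the four counts is zero, since the logarithm is otherwise undefined.

```python
import math

def woe_weights(n_class_slide, n_class_noslide, n_other_slide, n_other_noslide):
    """Return (W+, W-) for one factor class from four pixel counts.

    n_class_slide    pixels in the class with a landslide        (B and S)
    n_class_noslide  pixels in the class without a landslide     (B and not S)
    n_other_slide    pixels outside the class with a landslide   (not B and S)
    n_other_noslide  pixels outside the class without a landslide
    """
    n_slide = n_class_slide + n_other_slide
    n_noslide = n_class_noslide + n_other_noslide
    # W+ = ln( P{B|S} / P{B|not S} )
    w_plus = math.log((n_class_slide / n_slide) / (n_class_noslide / n_noslide))
    # W- = ln( P{not B|S} / P{not B|not S} )
    w_minus = math.log((n_other_slide / n_slide) / (n_other_noslide / n_noslide))
    return w_plus, w_minus

# Hypothetical counts: the class holds 30% of landslide pixels
# but only 10% of the landslide-free pixels.
w_plus, w_minus = woe_weights(30, 1000, 70, 9000)
print(w_plus, w_minus)
```

With these counts, W+ is positive (the class is over-represented among landslide pixels) and W− is negative, which matches the interpretation given in the text.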
4.2. Weight determination using LR algorithm
The most common statistical technique used in landslide susceptibility mapping is multiple regression analysis. This method is represented as a linear equation of
$$Y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$$
$$p = \frac{1}{1 + e^{-Y}}$$
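The two LR equations above combine a linear predictor Y with the logistic function to give a probability between 0 and 1. A small sketch follows; the function name `landslide_probability` and the coefficient and factor values are hypothetical and are not taken from Table 1. The two-branch form is only a numerical precaution against overflow for large negative Y.

```python
import math

def landslide_probability(intercept, coefficients, factors):
    """Logistic regression probability: p = 1 / (1 + exp(-Y)),
    with Y = b0 + b1*x1 + ... + bn*xn."""
    y = intercept + sum(b * x for b, x in zip(coefficients, factors))
    if y >= 0:
        return 1.0 / (1.0 + math.exp(-y))
    e = math.exp(y)  # algebraically identical form that avoids overflow when y << 0
    return e / (1.0 + e)

# Hypothetical coefficients for two factors at one pixel.
p = landslide_probability(-1.5, [0.8, -0.3], [2.0, 1.0])
print(p)
```

A positive coefficient raises Y, and therefore p, as the factor value increases; a negative coefficient lowers it.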
where $w$ is a coefficient vector that describes the direction of the hyperplane in the feature space, $b$ is the offset of the hyperplane from the origin, and $\xi_i$ is the positive slack variable (Cortes & Vapnik, 1995). The optimal hyperplane is determined by solving the following optimization problem, which uses Lagrangian multipliers (Samui, 2008).
$$\text{Maximize} \quad \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j \,(x_i \cdot x_j), \tag{10}$$
$$\text{Subject to} \quad \sum_{i=1}^{n} \alpha_i y_i = 0, \qquad 0 \le \alpha_i \le C, \tag{11}$$
where $\alpha_i$ is the Lagrange multiplier, $C$ is the penalty, and the slack variable $\xi_i$ allows penalized constraint violation. The decision function, which will be used for classifying new data, can then be written as
$$g(x) = \operatorname{sign}\left( \sum_{i=1}^{n} y_i \alpha_i \,(x_i \cdot x) + b \right). \tag{12}$$
preservation and vegetation density, and can also affect the strength of the soil structure and, consequently, landsliding (Pourghasemi, Pradhan, Gokceoglu, & Moezzi, 2012). These results demonstrate that the West class of aspect has more unstable conditions than the others. The flat curvature attained the highest S(C) value of 5.14, which indicates high landslide probability in flat regions, because areas with this characteristic can retain water for a long time, thereby resulting in landslide occurrence. The last class of SPI (17.76–29.22) had the highest S(C) value, which was 14.07, whereas the second class (12.72–13.52) acquired the lowest weight of −9.50. The most significant region in TWI was the last class of 12.56–36.02, with the highest weight of 20.03, whereas the lowest weight (−8.98) was for the first class of 2.41–5.71.
The highest classes of SPI and TWI acquired the highest values of probability because of their impact on augmenting water pressure in the material, consequently reducing shear strength. This condition favors landslide occurrence in the catchment. SPI is an important factor because the erosive power of water runoff directly influences slope toe erosion. The highest weight (19.55) in the TRI layer belongs to the last class of 80.90–474.84, whereas the range of 36.32–43.75 acquired the lowest weight. Similar to TRI, the last class in STI achieved the highest weight, which indicates a high probability of landslide occurrence. The soil type LAA_COL attained the highest weight of 20.34, whereas DLD achieved the lowest value of −21.73. This finding revealed that the soil type LAA_COL has a weak structure, which led to slope failure in the study area. The weight of 1.795, which was the highest value in geology, is assigned to schist. This rock type is impenetrable, and its shear strength is lower than that of the others. Class two in LULC achieved the highest weight, which shows a greater landslide probability than the other LULC types. The highest weights of 24.65 and 11.35 in distance from the river were assigned to the classes of 101–200 m and 0–50 m, respectively. Infiltration in the areas adjacent to the river is high; consequently, water pressure in the material increases. In the case of distance from the road, the classes more than 50 m from the road showed a positive correlation with landslide occurrence, whereas the regions less than 50 m from the road showed a negative correlation.
6.1. Weights-of-evidence
Although the WoE values derived for both datasets were similar, the logistic coefficients obtained for the two datasets were completely different. This result is attributed to the fact that LR performs multivariate statistical analysis (MSA), which accounts for the correlation among the conditioning factors. Therefore, the results differed because the conditioning factors differed between the two datasets. The logistic coefficient, which represents the weight of each factor, was used to generate the landslide probability index, which ranges between 0 and 1. A positive coefficient shows that the existence of the factor in the catchment increases the probability of landslide occurrence, whereas a negative coefficient indicates a negative correlation between the factor and landslide occurrence (Chauhan, Sharma, & Arora, 2010). The results of the first dataset demonstrated that TRI had the highest positive correlation with landslide occurrence, with a logistic coefficient of 0.341. Except for altitude, slope, and SPI, the other factors received positive logistic coefficients. The outcomes from the second dataset revealed that geology (vein quartz) acquired the highest positive logistic coefficient of 33.999, whereas soil (DLD) achieved the lowest weight of −8.636. The landslide probability maps were derived by using the LR coefficients.
$$g(x) = \operatorname{sign}\left( \sum_{i=1}^{n} y_i \alpha_i K(x_i, x) + b \right) \tag{13}$$
where $K(\cdot,\cdot)$ is the kernel function. Pradhan (2013) and Tien Bui, Pradhan, Lofman, and Revhaug (2012) provide further information on the effect of each kernel type and its parameters. In the current research, the radial basis function (RBF) was employed as the kernel. This is the most common kernel type and is not sensitive to outliers (Marjanović et al., 2011; Yao et al., 2008). Furthermore, only one parameter, gamma ($\gamma$), has to be defined for a selected penalty ($C$). Selecting the kernel type is a difficult task in SVM analysis because landslide susceptibility mapping is a linearly non-separable problem. Thus, a cross-validation method was implemented to determine the optimal kernel parameters.
5. Validating the derived landslide susceptibility maps
Fig. 4. Landslide susceptibility maps derived from LiDAR-derived conditioning factors using a) WoE; b) LR (all factors); c) LR (significant factors); d) SVM.
in the other three maps are located in the north and the northeast around the boundary of the catchment. However, these areas are classified as having low susceptibility by WoE. Therefore, validation should be conducted to determine which method is more precise in detecting landslide-prone regions. Fig. 5 illustrates the maps generated by using the second dataset.
Two notable points can be observed from the maps derived from the second dataset. First, the SVM result shows a high degree of exaggeration because a large part of the catchment is classified as a susceptible zone. Second, unlike the other methods, WoE identifies different zones as landslide-prone areas. With the use of the AUC method, the success and prediction rates for the eight landslide susceptibility maps were measured by comparing the maps with the existing landslide locations. Fig. 6 presents the success and prediction rate curves for each method.
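One common way to obtain such a rate is to rank all cells by their susceptibility index, accumulate the share of landslide cells captured along that ranking, and integrate the resulting curve. The sketch below follows that idea with the trapezoidal rule; it is an assumption about the computation, not a description of the software used in the study, and the index values and landslide flags are invented. Using training landslides gives a success rate, and using the held-out testing landslides gives a prediction rate.

```python
def rate_curve_auc(index_values, landslide_flags):
    """Area under the cumulative landslide occurrence curve.

    Cells are ranked from highest to lowest susceptibility index; the x-axis is
    the cumulative share of cells and the y-axis the cumulative share of
    landslide cells. Ties in the index are not treated specially.
    """
    order = sorted(range(len(index_values)),
                   key=lambda i: index_values[i], reverse=True)
    total_slides = sum(landslide_flags)
    step = 1.0 / len(index_values)  # width of one cell on the x-axis
    area, previous, found = 0.0, 0.0, 0
    for i in order:
        found += landslide_flags[i]
        current = found / total_slides
        area += (previous + current) / 2.0 * step  # trapezoid for this cell
        previous = current
    return area

# Hypothetical map of ten cells: three landslide cells, mostly ranked near the top.
index_values = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
landslide_flags = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
auc = rate_curve_auc(index_values, landslide_flags)
print(round(auc * 100, 2))  # 81.67
```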
The success rate does not show the real efficiency of the derived results; this efficiency can instead be assessed by using the prediction rate. The prediction rates acquired for WoE with the first and second datasets were 59.00% and 65.00%, respectively. Furthermore, 86.00% and 66.00% were the prediction rates achieved by the LR method with the use of all factors of the first and second datasets, respectively. LR, which used only the significant factors of each dataset, obtained 89.00% and 77.00% prediction rates for the first and second datasets, respectively. Finally, SVM attained 84.00% and 69.00% prediction rates for the LiDAR-derived conditioning factor dataset and the second dataset, respectively. In terms of prediction rates, the results achieved from the first dataset (LiDAR dataset) by using LR and SVM showed higher efficiency because the accuracies were higher than those measured from the second dataset.
As mentioned, in the case of LR, the prediction abilities of the model for the two datasets were 86.00% and 66.00%, a difference of almost 20%. This could be due to the fact that some of the conditioning factors produced noise during processing (Chang, Chiang, & Hsu, 2007). To check the severity of the noise, multicollinearity between the input variables in the two datasets should be examined (Zhu & Huang, 2006). If a perfect linear relationship exists between the conditioning factors, then the estimates for a regression model cannot be uniquely computed. The term collinearity means that two conditioning factors are almost perfect linear combinations of one another. When more than two factors are involved, it is called multicollinearity. LR is sensitive to collinearities among the conditioning factors (Ozdemir, 2011). Hence, the inclusion of conditioning factors which
Fig. 5. Landslide susceptibility maps derived from the second dataset using a) WoE; b) LR (all factors); c) LR (significant factors); d) SVM.
fell into one of the geology types. Thus, the algorithm will be forced to provide the weight to the said region. Therefore, LiDAR-derived parameters were sufficient for precisely detecting susceptible areas in this case study. Adding more conditioning factors did not increase accuracy and could reduce precision. Van Westen et al. (2003) explained why adding geological and LULC data to the analysis did not have a significant effect on the final results. They used WoE to map susceptible areas by using various conditioning factors to detect the importance and impact of each factor. Their analysis showed that the use of detailed geomorphological information in bivariate statistical analysis enhanced the overall accuracy of the derived susceptibility map. However, other factor maps, such as geology, could not considerably increase the precision of the results.
Based on the more precise results achieved among the methods, high-elevation regions with steep slopes fall into highly susceptible zones. This finding reveals that slope and altitude have a considerable effect on landslide occurrence. Susceptible zones are composed of weak rocks such as vein quartz. As expected, low-slope areas and river networks showed very low landslide susceptibility.
In addition, the significance of each conditioning factor was evaluated by eliminating the factor, rerunning the model, and measuring the AUC. This analysis was done for all three methods (WoE, LR, and SVM) using the complete dataset. The results are listed in Table 3.
In the case of WoE, the prediction rate decreased significantly when the river factor was eliminated from the analysis, which indicates that the river has a significant impact on the performance of WoE. The most significant factors recognized by the LR analysis were altitude, SPI, TRI, STI, soil, and geology, because the AUC decreased when these factors were removed from the analysis. The same factors were selected by LR as significant factors prior to the susceptibility mapping, which can be used as evidence for the correctness of the results in Table 3. Other factors did not cause considerable variation in the accuracy. The same factors of altitude, SPI, TRI, STI, soil, and geology were detected as significant for the SVM analysis, because the prediction rates decreased when these factors were eliminated from the analysis.
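The elimination procedure described above amounts to a leave-one-factor-out loop. The sketch below shows its structure only. The names `factor_contributions` and `lookup_auc` are hypothetical, and `lookup_auc` is a stand-in for refitting the model: it simply returns the prediction rates (%) reported in this paper for LR with the significant factors of the second dataset (77.00 for the full model and the per-factor values as read from the accompanying table).

```python
def factor_contributions(factors, evaluate_auc):
    """Return the baseline AUC and the drop in AUC when each factor is left out."""
    baseline = evaluate_auc(factors)
    drops = {}
    for name in factors:
        reduced = [f for f in factors if f != name]
        drops[name] = baseline - evaluate_auc(reduced)
    return baseline, drops

# Stand-in for model refitting: reported prediction rates (%) without each factor.
FULL_MODEL_AUC = 77.00
AUC_WITHOUT = {"altitude": 75.19, "SPI": 73.01, "TRI": 73.31,
               "STI": 74.34, "soil": 71.19, "geology": 72.41}

def lookup_auc(factors):
    """Return the reported AUC for the given factor set (at most one factor removed)."""
    missing = [f for f in AUC_WITHOUT if f not in factors]
    return AUC_WITHOUT[missing[0]] if missing else FULL_MODEL_AUC

baseline, drops = factor_contributions(list(AUC_WITHOUT), lookup_auc)
for name, drop in sorted(drops.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {drop:.2f}")
```

In a real analysis `evaluate_auc` would retrain WoE, LR, or SVM on the reduced factor set and validate it against the testing landslides; the larger the drop, the more the factor contributes.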
Based on Fig. 6, when only the significant factors of the first dataset and the significant factors of the second dataset were utilized in LR, the difference between the prediction capabilities of the two models was reduced from
Fig. 6. Graphic representation of the cumulative frequency diagram presenting the cumulative landslide occurrence (%; y-axis) in landslide probability index rank (%; x-axis): a) success
rate; b) prediction rate.
Table 4
The relative importance of landslide conditioning factors for the significant parameters in both datasets.
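VIF between input variables, All (sig)

| Layer | Altitude | SPI | TRI | STI | Soil | Geology |
|---|---|---|---|---|---|---|
| Altitude | 1 | 2.484 | 1.164 | 2.495 | 2.376 | 2.499 |
| SPI | 1.553 | 1 | 1.554 | 1.040 | 1.561 | 1.562 |
| TRI | 1.165 | 2.488 | 1 | 2.454 | 2.501 | 2.471 |
| STI | 1.599 | 1.066 | 1.570 | 1 | 1.601 | 1.601 |
| Soil | 1.162 | 1.184 | 1.184 | 1.184 | 1 | 1.138 |
| Geology | 1.095 | 1.095 | 1.082 | 1.095 | 1.053 | 1 |

VIF between input variables, LiDAR (sig)

| Layer | Altitude | Slope | Aspect | TWI | TRI |
|---|---|---|---|---|---|
| Altitude | 1 | 2.345 | 4.035 | 4.034 | 1.110 |
| Slope | 2.484 | 1 | 4.275 | 4.231 | 1.154 |
| Aspect | 1.002 | 1.002 | 1 | 1.002 | 1.002 |
| TWI | 1.054 | 1.043 | 1.054 | 1 | 1.055 |
| TRI | 2.455 | 2.410 | 8.928 | 8.928 | 1 |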
20% to 12%. Therefore, the impacts of soil and geology on the model
were examined by removing one conditioning factor at a time and
recalculating the AUC of each model. The change in AUC when a specific factor was eliminated showed the contribution of that factor to the model
(Table 4).
It can be seen that all the factors used had a significant impact on the
LR performance. However, soil and geology contributed the most to
the model, as the prediction capability of the LR (all factors-sig)
model was reduced from 77.00% to 71.19% and 72.41%, respectively, when they were removed.
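As a worked illustration of this comparison, the sketch below computes the AUC drop for each factor relative to the 77.00% baseline quoted above. The values for soil and geology are those stated in the text. The values for the other four factors are assigned in the order in which the factors are listed for LR all (sig) in Table 4, which is an assumption about the table layout. The function and variable names are hypothetical.

```python
# Baseline prediction AUC (%) of the LR (all factors-sig) model, from the text.
BASELINE_AUC = 77.00

# Prediction AUC (%) after removing one factor at a time.
# Soil and geology are quoted in the text; the rest follow the listed
# order of Table 4 (assumed mapping).
auc_without = {
    "altitude": 75.19,
    "SPI": 73.01,
    "TRI": 73.31,
    "STI": 74.34,
    "soil": 71.19,
    "geology": 72.41,
}


def auc_drops(baseline, auc_after_removal):
    """Return the AUC decrease (percentage points) caused by each removal."""
    return {
        factor: round(baseline - auc, 2)
        for factor, auc in auc_after_removal.items()
    }


drops = auc_drops(BASELINE_AUC, auc_without)

# Factors ordered from the largest to the smallest contribution.
ranked = sorted(drops, key=drops.get, reverse=True)

for factor in ranked:
    print(f"without {factor}: -{drops[factor]:.2f} points")
```

Under these assumptions, soil and geology head the ranking with drops of 5.81 and 4.59 percentage points, in line with the statement above.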
7. Conclusion
In the last decade, widespread landslide damage led to considerable modifications in development strategies on unstable terrain,
with the Malaysian government requiring local planning specialists to
perform landslide susceptibility analyses at all stages of the development process. Construction of a spatial database is the basis of such
analysis, and it needs to be built from appropriate and precise
sources. A long-standing issue that remains unsolved in the literature is
the selection of the influential conditioning factors. Each researcher
chooses factors based on personal judgment or on the most used factors in
the literature. This research aimed to study the influence of two datasets
on landslide susceptibility analysis to examine whether pure LiDAR-derived conditioning factors are sufficient or whether additional factors, such
as soil and geology, are also needed. Three methods, namely, WoE, LR,
and SVM, which are BSA, MSA, and machine learning methods, respectively, were utilized. Each method was applied by using two datasets,
Table 3
The relative importance of landslide conditioning factors for the three models.

Layer               AUC (prediction)
                    WoE     LR      SVM
Without altitude    65.92   62.75   66.66
Without slope       63.12   67.88   69.15
Without aspect      66.62   66.17   70.61
Without curvature   65.80   66.71   70.32
Without SPI         65.84   63.55   65.45
Without TWI         66.99   67.03   70.70
Without TRI         64.15   62.49   68.46
Without STI         66.96   62.89   67.56
Without soil        65.14   60.91   63.42
Without geology     64.39   61.38   64.87
Without LULC        65.56   66.84   70.02
Without river       59.56   67.01   70.85
Without road        63.18   66.53   70.97

Table 4 data: AUC (prediction) of LR with the significant factors of each dataset

Layer               AUC (prediction)
LR LiDAR (sig)
Without altitude    87.07
Without slope       85.48
Without aspect      87.5
Without TWI         87.72
Without TRI         84.86
LR all (sig)
Without altitude    75.19
Without SPI         73.01
Without TRI         73.31
Without STI         74.34
Without soil        71.19
Without geology     72.41
Acknowledgments
This research was supported by UPM University Research Grant (0501-11-1283RU) to stimulate research under the RUGS scheme with project number 9344100. The authors would like to thank the National
Mapping Agency (JUPEM) and the Dept. of Mineral & Geosciences (JMG),
Malaysia, for providing the various datasets used in this paper. Thanks are also due
to three anonymous reviewers for their valuable and critical comments,
which helped us to improve the quality of an earlier version of the
manuscript.
References
Abdallah, C., Chorowicz, J., Bou Kheir, R., & Khawlie, M. (2005). Detecting major terrain
parameters relating to mass movements' occurrence using GIS, remote sensing and
statistical correlations, case study Lebanon. Remote Sensing of Environment, 99,
448461.
164
Althuwaynee, O. F., Pradhan, B., & Lee, S. (2012). Application of an evidential belief
function model in landslide susceptibility mapping. Computers & Geosciences, 44,
120135.
Ayalew, L., & Yamagishi, H. (2005). The application of GIS-based logistic regression for
landslide susceptibility mapping in the Kakuda-Yahiko Mountains, Central Japan.
Geomorphology, 65, 1531.
Ayalew, L., Yamagishi, H., & Ugawa, N. (2004). Landslide susceptibility mapping using
GIS-based weighted linear combination, the case in Tsugawa area of Agano River,
Niigata Prefecture, Japan. Landslides, 1, 7381.
Can, T., Nefeslioglu, H. A., Gokceoglu, C., Sonmez, H., & Duman, T. Y. (2005). Susceptibility
assessments of shallow earthows triggered by heavy rainfall at three subcatchments
by logistic regression analyses. Geomorphology, 72, 250271.
Chang, K. -T., Chiang, S. -H., & Hsu, M. -L. (2007). Modeling typhoon-and earthquakeinduced landslides in a mountainous watershed using logistic regression.
Geomorphology, 89, 335347.
Chauhan, S., Sharma, M., & Arora, M. K. (2010). Landslide susceptibility zonation of the
Chamoli region, Garhwal Himalayas, using logistic regression model. Landslides, 7,
411423.
Chen, H., & Lee, C. (2003). A dynamic model for rainfall-induced landslides on natural
slopes. Geomorphology, 51, 269288.
Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20, 273297.
Domnguez Cuesta, M. J., Jimnez Snchez, M., & Rodrguez Garca, A. (1999). Press archives as temporal records of landslides in the North of Spain: relationships between
rainfall and instability slope events. Geomorphology, 30, 125132.
Domnguez-Cuesta, M. J., Jimnez-Snchez, M., & Berrezueta, E. (2007). Landslides in
the Central Coaleld (Cantabrian Mountains, NW Spain): Geomorphological features, conditioning factors and methodological implications in susceptibility assessment. Geomorphology, 89, 358369.
Donati, L., & Turrini, M. (2002). An objective method to rank the importance of the
factors predisposing to landslides with the GIS methodology: application to an
area of the Apennines (Valnerina; Perugia, Italy). Engineering Geology, 63,
277289.
Evett, S. R., Tolk, J. A., & Howell, T. A. (2006). Soil prole water content determination.
Vadose Zone Journal, 5, 894907.
Garca, M., Riao, D., Chuvieco, E., Salas, J., & Danson, F. M. (2011). Multispectral and
LiDAR data fusion for fuel type mapping using Support Vector Machine and decision
rules. Remote Sensing of Environment, 115, 13691379.
Glenn, N. F., Streutker, D. R., Chadwick, D. J., Thackray, G. D., & Dorsch, S. J. (2006).
Analysis of LiDAR-derived topographic information for characterizing
and differentiating landslide morphology and activity. Geomorphology, 73,
131148.
Gokceoglu, C., Sonmez, H., Nefeslioglu, H. A., Duman, T. Y., & Can, T. (2005). The 17 March
2005 Kuzulu landslide (Sivas, Turkey) and landslide-susceptibility map of its near
vicinity. Engineering Geology, 81, 6583.
Gorsevski, P. V., Gessler, P. E., Boll, J., Elliot, W. J., & Foltz, R. B. (2006). Spatially and
temporally distributed modeling of landslide susceptibility. Geomorphology, 80,
178198.
Guadagno, F., Martino, S., & Mugnozza, G. S. (2003). Inuence of man-made cuts on the
stability of pyroclastic covers (Campania, southern Italy): a numerical modelling
approach. Environmental Geology, 43, 371384.
Guzzetti, F., Cardinali, M., Reichenbach, P., & Carrara, A. (2000). Comparing landslide
maps: a case study in the upper Tiber River Basin, central Italy. Environmental
Management, 25, 247263.
Guzzetti, F., Carrara, A., Cardinali, M., & Reichenbach, P. (1999). Landslide hazard evaluation: a review of current techniques and their application in a multi-scale study,
Central Italy. Geomorphology, 31, 181216.
Guzzetti, F., Reichenbach, P., Cardinali, M., Galli, M., & Ardizzone, F. (2005). Probabilistic landslide hazard assessment at the basin scale. Geomorphology, 72,
272299.
Huang, R., Pei, X., Fan, X., Zhang, W., Li, S., & Li, B. (2012). The characteristics and failure
mechanism of the largest landslide triggered by the Wenchuan earthquake, May
12, 2008, China. Landslides, 9, 131142.
Jebur, M. N., Pradhan, B., & Tehrany, M. S. (2013). Using ALOS PALSAR derived
high-resolution DInSAR to detect slow-moving landslides in tropical forest:
Cameron Highlands, Malaysia. Geomatics, Natural Hazards and Risk, 119,
http://dx.doi.org/10.1080/19475705.2013.860407.
Jebur, M. N., Pradhan, B., & Tehrany, M. S. (2014). Detection of vertical slope movement in
highly vegetated tropical area of Gunung pass landslide, Malaysia, using L-band
InSAR technique. Geosciences Journal, 18(1), 6168, http://dx.doi.org/10.1007/
s12303-013-0053-8.
Jibson, R. W., & Keefer, D. K. (1989). Statistical analysis of factors affecting landslide distribution in the New Madrid seismic zone, Tennessee and Kentucky. Engineering
Geology, 27, 509542.
Lee, S., & Choi, J. (2004). Landslide susceptibility mapping using GIS and the weight-ofevidence model. International Journal of Geographical Information Science, 18,
789814.
Lee, S., & Pradhan, B. (2006). Probabilistic landslide hazards and risk mapping on Penang
Island, Malaysia. Journal of Earth System Science, 115, 661672.
Lee, S., & Pradhan, B. (2007). Landslide hazard mapping at Selangor, Malaysia using
frequency ratio and logistic regression models. Landslides, 4, 3341.
Lee, S., & Sambath, T. (2006). Landslide susceptibility mapping in the Damrei Romel area,
Cambodia using frequency ratio and logistic regression models. Environmental
Geology, 50, 847855.
Lefsky, M., Cohen, W., & Spies, T. (2001). An evaluation of alternate remote sensing products for forest inventory, monitoring, and mapping of Douglas-r forests in western
Oregon. Canadian Journal of Forest Research, 31, 7887.
Marjanovi, M., Kovaevi, M., Bajat, B., & Voenlek, V. (2011). Landslide susceptibility
assessment using SVM machine learning algorithm. Engineering Geology, 123,
225234.
Metternicht, G., Hurni, L., & Gogu, R. (2005). Remote sensing of landslides: an analysis of
the potential contribution to geo-spatial systems for hazard assessment in mountainous environments. Remote Sensing of Environment, 98, 284303.
Moreiras, S. M. (2005). Landslide susceptibility zonation in the Rio Mendoza valley,
Argentina. Geomorphology, 66, 345357.
Nefeslioglu, H., Sezer, E., Gokceoglu, C., Bozkir, A., & Duman, T. (2010). Assessment of
landslide susceptibility by decision trees in the metropolitan area of Istanbul,
Turkey. Mathematical Problems in Engineering, 2010, 115.
Oh, H. J., & Pradhan, B. (2011). Application of a neuro-fuzzy model to landslidesusceptibility mapping for shallow landslides in a tropical hilly area. Computers &
Geosciences, 37, 12641276.
Ozdemir, A. (2011). Using a binary logistic regression method and GIS for evaluating and
mapping the groundwater spring potential in the Sultan Mountains (Aksehir,
Turkey). Journal of Hydrology, 405, 123136.
Pourghasemi, H. R., Jirandeh, A. G., Pradhan, B., Xu, C., & Gokceoglu, C. (2013). Landslide
susceptibility mapping using support vector machine and GIS at the Golestan
Province, Iran. Journal of Earth System Science, 122, 349369.
Pourghasemi, H. R., Pradhan, B., & Gokceoglu, C. (2012). Application of fuzzy logic and
analytical hierarchy process (AHP) to landslide susceptibility mapping at Haraz
watershed, Iran. Natural Hazards, 63, 965996.
Pourghasemi, H., Pradhan, B., Gokceoglu, C., & Moezzi, K. D. (2012). Landslide susceptibility mapping using a spatial multi criteria evaluation model at Haraz Watershed, Iran.
In B. Pradhan, & M. Buchroithner (Eds.), Terrigenous Mass Movements (pp. 2349).
Springer.
Pourghasemi, H. R., Pradhan, B., Gokceoglu, C., Mohammadi, M., & Moradi, H. R. (2012).
Application of weights-of-evidence and certainty factor models and their comparison
in landslide susceptibility mapping at Haraz watershed, Iran. Arabian Journal of
Geosciences, 6, 23512365.
Pradhan, B. (2013). A comparative study on the predictive ability of the decision tree,
support vector machine and neuro-fuzzy models in landslide susceptibility mapping
using GIS. Computers & Geosciences, 51, 350365.
Pradhan, B., & Lee, S. (2010a). Delineation of landslide hazard areas on Penang Island,
Malaysia, by using frequency ratio, logistic regression, and articial neural network
models. Environmental Earth Sciences, 60, 10371054.
Pradhan, B., & Lee, S. (2010b). Landslide susceptibility assessment and factor effect
analysis: backpropagation articial neural networks and their comparison with
frequency ratio and bivariate logistic regression modelling. Environmental Modelling
& Software, 25, 747759.
Pradhan, B., & Lee, S. (2010c). Regional landslide susceptibility analysis using backpropagation neural network model at Cameron Highland, Malaysia. Landslides, 7,
1330.
Pradhan, B., Mansor, S., Pirasteh, S., & Buchroithner, M. F. (2011). Landslide hazard and
risk analyses at a landslide prone catchment area using statistical based geospatial
model. International Journal of Remote Sensing, 32, 40754087.
Pradhan, B., Oh, H. J., & Buchroithner, M. (2010). Weights-of-evidence model applied to
landslide susceptibility mapping in a tropical hilly area. Geomatics, Natural Hazards
and Risk, 1, 199223.
Pradhan, B., Sezer, E. A., Gokceoglu, C., & Buchroithner, M. F. (2010). Landslide susceptibility mapping by neuro-fuzzy approach in a landslide-prone area (Cameron
Highlands, Malaysia). IEEE Transactions on Geoscience and Remote Sensing, 48,
41644177.
Pradhan, B., & Youssef, A.M. (2010). Manifestation of remote sensing data and GIS on
landslide hazard analysis using spatial-based statistical models. Arabian Journal of
Geosciences, 3, 319326.
Pradhan, B., Youssef, A., & Varathrajoo, R. (2010). Approaches for delineating landslide
hazard areas using different training sites in an advanced articial neural network
model. Geo-spatial Information Science, 13, 93102.
Ray, R. L., Jacobs, J. M., & Cosh, M. H. (2010). Landslide susceptibility mapping using
downscaled AMSR-E soil moisture: A case study from Cleveland Corral, California,
US. Remote Sensing of Environment, 114, 26242636.
Regmi, N. R., Giardino, J. R., & Vitek, J.D. (2010). Modeling susceptibility to landslides using
the weight of evidence approach: Western Colorado, USA. Geomorphology, 115,
172187.
Samui, P. (2008). Slope stability analysis: a support vector machine approach.
Environmental Geology, 56, 255267.
Tehrany, M. S., Pradhan, B., & Jebu, M. N. (2013). A comparative assessment between object and pixel-based classication approaches for land use/land cover mapping using
SPOT 5 imagery. Geocarto International, 119, http://dx.doi.org/10.1080/10106049.
2013.768300.
Tehrany, M. S., Pradhan, B., & Jebur, M. N. (2013). Spatial prediction of ood susceptible areas using rule based decision tree (DT) and a novel ensemble bivariate and multivariate statistical models in GIS. Journal of Hydrology, 504,
6979.
Tehrany, M. S., Pradhan, B., & Jebur, M. N. (2014). Flood susceptibility mapping using a
novel ensemble weights-of-evidence and support vector machine models in GIS.
Journal of Hydrology, http://dx.doi.org/10.1016/j.jhydrol.2014.03.008.
Tien Bui, D., Pradhan, B., Lofman, O., & Revhaug, I. (2012). Landslide susceptibility assessment in vietnam using support vector machines, decision tree, and Naive Bayes
Models. Mathematical Problems in Engineering, 2012, 115.
Tien Bui, D., Pradhan, B., Lofman, O., Revhaug, I., & Dick, O. B. (2012). Spatial prediction of landslide hazards in Hoa Binh province (Vietnam): a comparative assessment of the efcacy of evidential belief functions and fuzzy logic models. Catena,
96, 2840.
165
Zzere, J. L. S., de Brum Ferreira, A., & Rodrigues, M. L. S. (1999). The role of conditioning
and triggering factors in the occurrence of landslides: a case study in the area north of
Lisbon (Portugal). Geomorphology, 30, 133146.
Zhu, L., & Huang, J. -F. (2006). GIS-based logistic regression method for landslide
susceptibility mapping in regional scale. Journal of Zhejiang University Science A, 7,
20072017.