
Remote Sensing of Environment 152 (2014) 150–165

Contents lists available at ScienceDirect

Remote Sensing of Environment


journal homepage: www.elsevier.com/locate/rse

Optimization of landslide conditioning factors using very high-resolution airborne laser scanning (LiDAR) data at catchment scale
Mustafa Neamah Jebur 1, Biswajeet Pradhan *, Mahyat Shafapour Tehrany 1
Department of Civil Engineering, Faculty of Engineering, Geospatial Information Science Research Center (GISRC), University Putra Malaysia, 43400 UPM, Serdang, Selangor, Malaysia

Article info

Article history:
Received 25 March 2014
Received in revised form 27 May 2014
Accepted 30 May 2014
Available online 28 June 2014
Keywords:
Landslide
LiDAR
GIS
Remote sensing
Conditioning factors
Susceptibility mapping
Malaysia

Abstract
Landslide susceptibility, hazards, and risks have been extensively explored and analyzed in the past decades. However, choosing relevant conditioning factors in such analyses remains a challenging task. Landslide susceptibility mapping employs topological, environmental, geological, and hydrological parameters. Some researchers assume that as the number of conditioning factors increases, the precision of the generated susceptibility map increases. By contrast, other case studies show that a small number of conditioning factors is sufficient to produce landslide susceptibility maps of reasonable quality. This study investigates the effects of conditioning factors on landslide susceptibility mapping. Bukit Antarabangsa, Ulu Klang, Malaysia, was selected as the study area because it is a catchment area with a high potential for landslide occurrence. A spatial database of 31 landslide locations was evaluated to map landslide-susceptible areas. Two datasets of conditioning factors were constructed in a GIS environment. The first dataset was derived from high-resolution airborne laser scanning (LiDAR) data and contains eight landslide conditioning factors: altitude, slope, aspect, curvature, stream power index (SPI), topographic wetness index (TWI), topographic roughness index (TRI), and sediment transport index (STI). The second dataset contains the same conditioning factors as the first, with the addition of geological and environmental factors: soil, geology, land use/cover (LULC), distance from river, and distance from road. The two datasets were constructed to compare their efficiency in landslide susceptibility zonation. Three models, each representing a different type of analysis, were used to determine the optimal landslide conditioning factors: weights-of-evidence (WoE) (bivariate statistical analysis), logistic regression (LR) (multivariate statistical analysis), and data-driven support vector machine (SVM). The area under the curve (AUC) was used to assess the obtained results. The prediction rates of WoE, LR, and SVM obtained from only the LiDAR-derived conditioning factors were 59%, 86%, and 84%, respectively. The prediction rates of WoE, LR, and SVM obtained from the second dataset were 65%, 66%, and 69%, respectively. The LiDAR-derived conditioning factors were sufficient to generate an accurate landslide susceptibility map. Using additional factors, such as geology and LULC, does not significantly increase the accuracy of the map. The findings of this study can be used as a reference for future analyses in selecting data for landslide conditioning factors.
© 2014 Elsevier Inc. All rights reserved.

Corresponding author. Tel.: +60 3 89466383; fax: +60 3 89468470.
E-mail addresses: biswajeet24@gmail.com, biswajeet@lycos.com (B. Pradhan).
1 Tel.: +60 3 89466383; fax: +60 3 89468470.
http://dx.doi.org/10.1016/j.rse.2014.05.013
0034-4257/© 2014 Elsevier Inc. All rights reserved.

1. Introduction
Landslides are a catastrophic phenomenon and a dynamic process that contributes to the destruction and transformation of a given landscape (Lee & Pradhan, 2006). Various natural and man-made factors trigger landslides (Guzzetti, Reichenbach, Cardinali, Galli, & Ardizzone, 2005). Meteorological variations, such as strong or continued precipitation, and tectonic forces, such as earthquakes, are the main factors that trigger landslides (Huang et al., 2012), although natural forces, such as rainfall, and human activities also trigger them (Guadagno, Martino, & Mugnozza, 2003; Jebur, Pradhan, & Tehrany, 2013, 2014; Pradhan, Youssef & Varathrajoo, 2010). Given the many possible causes of landslides, mapping landslide susceptibility, hazards, and risks is essential to implementing mitigation strategies (Chen & Lee, 2003; Pradhan & Lee, 2010c; Ray, Jacobs, & Cosh, 2010). The landslide susceptibility map is the first stage of hazard and risk mapping, which determines the regions with the same probability of landslide occurrence in a given period of time (Pradhan, Mansor, Pirasteh, & Buchroithner, 2011; Pradhan & Youssef, 2010). Landslide susceptibility mapping is the evaluation of the proneness of the ground to landslides and the possibility that a landslide might take place at a specific terrain or under the influence of certain factors (Pourghasemi, Pradhan, & Gokceoglu, 2012). Landslide susceptibility is specified by using comparative qualitative and quantitative analyses of the conditioning factors observed in previously damaged
regions (Domínguez-Cuesta, Jiménez-Sánchez, & Berrezueta, 2007). Differences between the characteristics of the factors should be evaluated to produce a landslide susceptibility map that employs various conditioning factors. The characteristics of conditioning factors vary from area to area; therefore, the first stage in generating a susceptibility map is to assess the importance of each factor (Nefeslioglu, Sezer, Gokceoglu, Bozkir, & Duman, 2010). Constructing the conditioning factors is a difficult task (Jibson & Keefer, 1989), and no specific rule exists to define how many conditioning factors are sufficient for a specific susceptibility analysis. Furthermore, no framework exists for the selection of conditioning factors. These factors are mostly chosen based on the opinions of experts. Numerous studies on susceptibility mapping have been reported in the literature. However, studies on selecting the proper conditioning factors are scarce. The lack of comprehensive research on this topic motivated the authors of this study to conduct such an analysis and provide directions for future studies. Landslides occur as a result of the effects of numerous conditioning factors, including meteorology, hydrology, geology, constructions, and geomorphic history (Metternicht, Hurni, & Gogu, 2005; Pradhan & Youssef, 2010). Nonetheless, covering all these factors in a single landslide susceptibility assessment is impossible (Moreiras, 2005). Domínguez-Cuesta et al. (2007) stated that conditioning factors can be grouped into two general categories: factors related to topography and factors related to landslides, geology, and vegetation. Two groups of spatial variables can be constructed further from these categories. The first group represents topographical factors and contains the quantitative variables of altitude, aspect, slope, and plan curvature, all of which can be acquired from the digital elevation model. The second group consists of qualitative variables that pertain to geology and vegetation.
The morphology of the slope, land use/cover (LULC) types, and geological foundation are some of the factors that can be used in landslide mapping (Gorsevski, Gessler, Boll, Elliot, & Foltz, 2006). Morphometric characteristics can be used to recognize various types of landslides (Glenn, Streutker, Chadwick, Thackray, & Dorsch, 2006). Some researchers have attempted to assess the impact of conditioning factors and identify the ones with the most significant impact. Donati and Turrini (2002) determined the most influential factors in landslide occurrences in southeast Umbria, east of Spoleto, and ranked them according to their importance. They identified the factors that theoretically have a significant impact on landslides, but which in reality had a different impact. Only some of the conditioning factors that they identified, such as lithology, exhibited the predicted impact; the others, such as slope steepness and orientation, exerted a less important influence than was predicted. Domínguez Cuesta, Jiménez Sánchez, and Rodríguez García (1999) assessed 209 landslide events from 1980 to 1994 in the Cantabrian Mountains in northwestern Spain and found that precipitation was the most influential conditioning factor.
Moreiras (2005) considered lithology and slope as the most influential factors in landslide mapping based on the study area of the Rio Mendoza Valley in Argentina. Glenn et al. (2006) stated that topographic factors are highly influential parameters in landslide studies. They assessed the efficiency of laser scanning data (LiDAR)-derived topographic factors in characterizing landslide morphology and activity. According to Oh and Pradhan (2011) and Yilmaz (2009), adding the factors of altitude, topographic roughness index (TRI), stream power index (SPI), and distance from road to the spatial database enhances the accuracy of the final results. In some studies, altitude, slope, and aspect are the primary topographical attributes, whereas SPI, TRI, and topographic wetness index (TWI) are the secondary topographical attributes (Wilson & Gallant, 2000). Geology, slope angle, and LULC were determined as the most influential conditioning factors in a study by Zêzere et al. (1999). Furthermore, Donati and Turrini (2002) found that geology is the most important factor and considered slope and related factors as secondary factors only.
This brief literature review shows that the conditioning factors produce various impacts and that, for each factor, only certain classes exert a considerable impact on landslide occurrence (Donati & Turrini, 2002). Thorough research into the topic should fill research gaps and facilitate our understanding of these factors and the selection of appropriate data for susceptibility analysis. Although the choice and quality of the conditioning factors affect susceptibility maps, the efficiency of the method used in mapping also exerts a significant influence (Pradhan, 2013). As such, three different methods were selected for landslide susceptibility modeling in the present study to comprehensively assess the conditioning factors and to reduce the effect of the algorithm used in the evaluation. Each method belongs to one of the main groups of analysis. Weights-of-evidence (WoE) was chosen as a bivariate statistical method (Pourghasemi, Pradhan, Gokceoglu, Mohammadi, & Moradi, 2012; Tehrany, Pradhan, & Jebur, 2014), logistic regression (LR) was selected as a multivariate statistical method (Pradhan & Lee, 2010a), and support vector machine (SVM) was selected as a popular algorithm for the machine learning category (Pourghasemi, Jirandeh, Pradhan, Xu, & Gokceoglu, 2013). With several methods in use, the effect of any single technique on the final judgment is expected to diminish. Many techniques of assessing landslide susceptibility at the basin scale have been examined in the literature (Pradhan & Lee, 2010b, 2010c). The selected methods also have a favorable reputation in landslide susceptibility studies. Landslides are common in Bukit Antarabangsa, Ulu Klang, Malaysia, and they considerably alter and destabilize the area's terrain every year. This study was conducted in Bukit Antarabangsa, and it aims to ascertain the roles played in landslide occurrence by both the pure LiDAR-derived conditioning factors and the full dataset, which also includes geological and environmental factors.
2. General description of the study area
Bukit Antarabangsa is a region in Ulu Klang, Malaysia. It has an unstable soil structure and is highly susceptible to landslides (Fig. 1). In certain circumstances, this area exhibits mass movements that destroy property and claim lives. The geographical location of this catchment area is 3°14′N to 3°09′N latitude and 101°44′E to 101°47′E longitude. This region was chosen because of its frequent landslides over the past few years. The temperature in the study area can reach as high as 32 °C, and the average monthly precipitation ranges from 58 mm to 240 mm. Its dominant LULC types include vegetation and urban areas. Its main soil types are loam and clay (Althuwaynee, Pradhan, & Lee, 2012). Its geology contains three classes: acid intrusives, schist, and vein quartz. Precipitation is the main landslide causative factor in the area.
3. Data used
3.1. Landslide inventory
Acquiring a reliable inventory map is a vital task in susceptibility studies. Such information forms the basis of any subsequent analysis and directly affects final outcomes (Guzzetti, Cardinali, Reichenbach, & Carrara, 2000; Lee & Pradhan, 2006). Several methods of mapping landslide locations are available; however, remote sensing data and aerial photographs constitute the two main sources of data (Lefsky, Cohen, & Spies, 2001). In mapping the landslide inventory of the study area, the main scarp of each landslide was determined as a polygonal feature, as shown in rasterized records of government agencies, aerial photos, satellite images, and field surveys. A total of 31 landslide incidents were detected in the study area.

Fig. 1. Landslide location map with the hill-shaded map of Bukit Antarabangsa, Ulu Klang, Malaysia.

Most of the landslides were shallow rotational failures, with a few translational and flow types. However, for the current research, only the rotational failures were considered; the other types were removed because their occurrence was rare and negligible. Moreover, a few landslides that took place in gently sloping areas were not considered and were therefore eliminated from the research. Hence, the susceptibility maps that were generated in this study are valid for shallow rotational landslides. The slide surface of most of these landslides in the study area is less than 4 m deep, and failure occurs during or immediately after intense rainfall (Pradhan, 2013). These incidents were divided into two datasets for training and testing of the WoE, LR, and SVM models. Based on the literature, the most common technique of splitting the inventory dataset is choosing 70% of the locations for training and 30% for validation (Tien Bui, Pradhan, Lofman, & Revhaug, 2012). The same method was applied in this study, which yielded a training dataset of 21 landslide locations. The dependent layer was produced; it contained pixel values of 0 and 1, which indicate the absence and presence of landslide events, respectively. The selected training and testing datasets are shown in Fig. 1.
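The 70/30 split described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the random selection, the seed, the function name `split_inventory`, and the integer IDs standing in for landslide locations are all assumptions. Rounding the training count down is also an assumption, chosen because it reproduces the 21 training locations reported for the 31-landslide inventory.

```python
import random


def split_inventory(locations, train_fraction=0.7, seed=0):
    """Randomly split landslide locations into training and testing subsets.

    Hypothetical helper: the paper does not state how locations were drawn.
    """
    shuffled = list(locations)
    random.Random(seed).shuffle(shuffled)
    # Round down, so that 31 locations give 21 for training (assumption).
    n_train = int(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# 31 placeholder IDs standing in for the mapped landslide locations.
landslide_ids = list(range(1, 32))
train_ids, test_ids = split_inventory(landslide_ids)
print(len(train_ids), len(test_ids))
```

Fixing the seed keeps the split reproducible, which matters when the same training set has to be reused across the WoE, LR, and SVM models.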
3.2. Landslide conditioning factors
Defining and mapping an appropriate set of conditioning factors correlated to landslide events requires a priori knowledge of the main contributors to the landslides (Guzzetti, Carrara, Cardinali, & Reichenbach, 1999). These conditioning factors are terrain geology and morphology, slope, weather conditions, vegetation density, LULC, and man-made influences. Accessibility to thematic layers differs extensively depending on the kind, scale, and method of data collection. The conditioning factors used in this study are those most commonly used by researchers and those that significantly influence the instability potential of the terrain. Two spatial databases that contain landslide-conditioning factors were designed and created. The first dataset was constructed directly from high-resolution airborne LiDAR data and contains eight landslide-conditioning factors: altitude, slope, aspect, curvature, stream power index (SPI), TWI, TRI, and sediment transport index (STI). The second dataset contains the same conditioning factors, with the addition of geological and environmental factors: soil, geology, LULC, distance from river, and distance from road. All landslide-conditioning factors were entered into a GIS and transformed from vector to raster format with a 5 m × 5 m grid cell and an area of 4370 columns and 4532 rows. These data are accessible in Malaysia either as paper or as digital maps. The elements of the constructed spatial database are listed in Table 1.
The first database contains eight landslide-conditioning factors extracted from the LiDAR data. The LiDAR data were used to construct the altitude layer. The LiDAR vector point data were recorded over 31 km² of the Bukit Antarabangsa landslide and nearby areas on August 3, 2007, at nearly 40,000 points/s with a 25,000 Hz pulse rate frequency. The absolute accuracy of the LiDAR data should meet root-mean-square errors of 0.15 m in the vertical axis and 0.3 m in the horizontal axis. The slope, aspect, curvature, SPI, TWI, TRI, and STI were calculated based on the derived altitude. The landform of the catchment area ranges from very flat terrain, particularly in the swamp forest, unrestricted mining, savanna, and brushwood areas, to a mountainous area, with altitudes between 38 and 547 m above sea level (Fig. 2a) (Lee & Pradhan, 2007). The downslope component of gravity increases with the degree of the slope. The slope in the study area ranges from 0° to 87.7° (Fig. 2b). Aspect (Fig. 2c) affects weathering and consequently influences the shear strength of the rock mass in an indirect way. Although the correlation between aspect and landslide occurrence has long been explored, no specific agreement exists on the influence of this factor on landslide events (Gokceoglu, Sonmez, Nefeslioglu, Duman, & Can, 2005). Nevertheless, numerous researchers use aspect as a conditioning factor in landslide studies. Aspect is associated with the overall physiographic condition of the study area and influences rainfall distribution. A perpendicular correlation exists between the direction of the landslides and the overall physiographic trend of the area. Curvature was computed by using the curvature sub-tool of the ArcMap 3D analysis (Fig. 2d). SPI and TWI are hydrological factors that are mostly used in landslide research. SPI (Fig. 2e) is considered in some studies as a secondary topographical characteristic in landslide susceptibility mapping (Gokceoglu et al., 2005). TWI is widely used in depicting the influence of topography on the location and size of the saturated source areas of runoff generation (Fig. 2f). SPI and TWI were calculated

Table 1
Spatial relationship between each conditioning factor and landslide occurrence extracted by using WoE and LR.

Layer                    Class             S(c) LiDAR   S(c) all
Altitude (m)             38.75–58.69       16.19        16.19
                         58.69–66.67       5.60         5.60
                         66.67–76.64       10.51        10.51
                         76.64–90.60       0.77         0.77
                         90.60–112.53      43.41        43.41
                         112.53–140.44     38.56        38.56
                         140.44–166.37     19.22        19.22
                         166.37–194.28     13.42        13.42
                         194.28–230.17     1.16         1.16
                         230.17–547.20     11.34        11.34
Slope (degree)           0–5.16            10.88        10.88
                         5.16–10.67        1.69         1.69
                         10.67–15.83       8.92         8.92
                         15.83–20.65       7.01         7.01
                         20.65–25.47       2.82         2.82
                         25.47–30.63       8.96         8.96
                         30.63–35.79       6.80         6.80
                         35.79–41.99       2.45         2.45
                         41.99–49.90       9.36         9.36
                         49.90–87.76       4.001        4.001
Aspect                   North             0            0
                         Northeast         2.78         2.78
                         East              0.32         0.32
                         Southeast         3.78         3.78
                         South             13.71        13.71
                         Southwest         1.76         1.76
                         West              26.08        26.08
                         Northwest         1.11         1.11
Curvature                Concave           0.41         0.41
                         Flat              5.14         5.14
                         Convex            2.98         2.98
SPI                      0–12.72           8.01         8.01
                         12.72–13.52       9.50         9.50
                         13.52–13.98       0.29         0.29
                         13.98–14.33       1.27         1.27
                         14.33–14.67       1.79         1.79
                         14.67–15.13       3.55         3.55
                         15.13–15.70       1.77         1.77
                         15.70–16.39       1.52         1.52
                         16.39–17.76       3.87         3.87
                         17.76–29.22       14.07        14.07
TWI                      2.41–5.71         8.98         8.98
                         5.71–6.10         3.49         3.49
                         6.10–6.50         0.83         0.83
                         6.50–6.89         7.18         7.18
                         6.89–7.41         4.46         4.46
                         7.41–8.07         0.58         0.58
                         8.07–9.01         5.51         5.51
                         9.01–10.31        1.36         1.36
                         10.31–12.56       5.12         5.12
                         12.56–36.02       20.03        20.03
TRI                      1.03–17.75        0.78         0.78
                         17.75–25.18       3.38         3.38
                         25.18–30.75       0.11         0.11
                         30.75–36.32       4.42         4.42
                         36.32–43.75       10.76        10.76
                         43.75–51.18       3.77         3.77
                         51.18–58.61       2.01         2.01
                         58.61–67.90       4.38         4.38
                         67.90–80.90       7.91         7.91
                         80.90–474.84      19.55        19.55
STI                      0                 7.78         7.78
                         0–0.80            8.31         8.31
                         0.80–1.61         1.87         1.87
                         1.61–2.42         3.93         3.93
                         2.42–3.23         5.64         5.64
                         3.23–4.04         2.14         2.14
                         4.04–5.65         5.98         5.98
                         5.65–8.08         7.47         7.47
                         8.08–12.93        7.68         7.68
                         12.93–206.08      12.52        12.52
Soil                     RGM               -            7.51
                         STP               -            5.64
                         DLD               -            21.73
                         LAA_COL           -            20.34
Geology                  Acid intrusives   -            1.81
                         Vein quartz       -            0
                         Schist            -            1.795
LULC                     1                 -            8.10
                         2                 -            7.91
                         3                 -            0
                         4                 -            5.56
Distance from river (m)  0–50              -            11.35
                         51–100            -            34.33
                         101–200           -            24.65
                         > 200             -            4.35
Distance from road (m)   0–50              -            13.44
                         51–100            -            20.04
                         101–200           -            10.40
                         201–500           -            3.66
                         > 500             -            0

LR coefficients (columns LR LiDAR, LR LiDAR (sig), LR all, and LR all (sig)), listed in extraction order because their row and column alignment is not recoverable:
Altitude to Soil: 0.023; 0.022; 0.003; 0.093; 0.094; 0.096; 0.468; 0.013; 0.013; 0.001; 0.003; 0.060; 0.044; 4.794; 2.520; 0.160; 0.121; 4.553; 0.341; 0.336; 0.025; 0.463; 0.032; 1.095; 0.899; 15.808, 0.566, 8.636, 0; 6.212, 0.974, 4.725, 0
Geology to Distance from road: 7.763, 23.553, 0; 19.213, 33.999, 0, 15.815, 31.474, 0, 0, 0.009; 0.008

Constant for LR (all) = 9.181.
Constant for LR (all) (sig) = 8.884.
Constant for LR (LiDAR) = 7.137.
Constant for LR (LiDAR) (sig) = 6.831.

Fig. 2. Input thematic layers: a) Altitude; b) Slope; c) Aspect; d) Curvature; e) SPI; f) TWI; g) TRI; h) STI; i) Soil; j) Geology; k) LULC; l) Distance from river; m) Distance from road.


from the slope and catchment area by using the following equations under the assumption of steady-state conditions and uniform soil properties:

$$\mathrm{SPI} = A_s \tan\beta \qquad (1)$$

$$\mathrm{TWI} = \ln\left(\frac{A_s}{\tan\beta}\right) \qquad (2)$$

where $A_s$ is the specific catchment area (m² m⁻¹), and $\beta$ (radian) is the slope gradient (in °) (Regmi, Giardino, & Vitek, 2010). Fig. 2f shows that TWI is greater around the river than in other parts of the area. This phenomenon is caused by an increase in infiltration and water pressure and a corresponding decrease in shear strength. The SPI and TWI range from 0 to 29.2 and from 2.41 to 36.02, respectively. The TRI (Fig. 2g), a morphological factor that is broadly used in landslide analysis, was computed by using Eq. (3):

$$\mathrm{TRI} = \sqrt{\mathrm{Abs}\left(\max^2 - \min^2\right)} \qquad (3)$$

where max and min are the largest and smallest values of the cells in the nine-cell rectangular neighborhood of altitude, respectively. STI describes the process of slope failure and deposition (Fig. 2h) and is computed by using the following equation:

$$\mathrm{STI} = \left(\frac{A_s}{22.13}\right)^{0.6} \left(\frac{\sin\beta}{0.0896}\right)^{1.3} \qquad (4)$$

where $\beta$ is the slope at each pixel, and $A_s$ is the upstream area. The measured TRI and STI range from 1 to 474.7 and from 0 to 206, respectively.
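The four index equations can be evaluated per cell as in the sketch below. It is an illustration only: the function names are hypothetical, the slope is assumed to be supplied in radians, and the nine-cell altitude window is assumed to be passed as a flat list. Cells with zero slope would make TWI undefined (division by tan 0), so a real workflow would need a small minimum slope; that handling is not described in the text and is left out here.

```python
import math


def spi(a_s, beta):
    """Stream power index, Eq. (1): A_s * tan(beta), beta in radians."""
    return a_s * math.tan(beta)


def twi(a_s, beta):
    """Topographic wetness index, Eq. (2): ln(A_s / tan(beta)).

    Undefined for beta == 0; callers must avoid perfectly flat cells.
    """
    return math.log(a_s / math.tan(beta))


def tri(window):
    """Topographic roughness index, Eq. (3), from a 3 x 3 altitude window."""
    highest, lowest = max(window), min(window)
    return math.sqrt(abs(highest ** 2 - lowest ** 2))


def sti(a_s, beta):
    """Sediment transport index, Eq. (4)."""
    return (a_s / 22.13) ** 0.6 * (math.sin(beta) / 0.0896) ** 1.3


# Hypothetical cell: specific catchment area 150 m2/m on a 20 degree slope.
beta = math.radians(20.0)
print(spi(150.0, beta), twi(150.0, beta), sti(150.0, beta))
print(tri([101.2, 100.8, 100.5, 101.0, 100.9, 100.4, 100.7, 100.6, 100.1]))
```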
In the second dataset, five more factors were added to the LiDAR-derived factors: soil, geology, LULC, distance from river, and distance from road. Soil types were gathered from the study area, which contains four soil types, as shown in Fig. 2i. Geology influences the shear strength of the rock mass, its permeability, and accordingly, the probability of an increase in neutral pressure in the subsoil. Tectonically, the Selangor State forms part of the Sunda Shield. The geology of this region is almost stable, but urban sprawl results in deforestation and soil erosion, thereby causing severe danger to the slopes (Lee & Pradhan, 2007). Geological data were generated by digitizing geological boundaries, fieldwork, and interpretation of aerial photos. In some studies, geology is one of the most significant conditioning factors in the distribution of landslides. In this study, three geology types were available: acid intrusives, schist, and vein quartz (Fig. 2j). Agriculture is considered the main LULC type in the study area, particularly paddies, rubber, and oil palm. Oil palm and rubber planting increased as a result of the conversion of forests. Urbanization has changed many parts of the study area, primarily through deforestation. LULC is one of the most influential factors in landslide studies because vegetation and its density affect the soil structure. In tropical regions such as Malaysia, dense vegetation has a critical effect on residual soil. For instance, the effect of precipitation on the slope can be decreased through above-ground interception and storage; plant roots also produce water paths and serve as a stabilizing factor that strengthens the slope as a root net (Evett, Tolk, & Howell, 2006). An area with sparse vegetation or bare ground is probably the most susceptible to landslide occurrences. The LULC map was produced by using a high-resolution QuickBird image and the supervised classification method. Four LULC categories are seen in Fig. 2k. In the case of rivers, only the undercutting of side slopes might cause landslide initiation. Hence, river sections located on slopes of less than 15° were not used in the construction of the distance from river map because they do not affect landslide occurrence. Only river sections on slopes of more than 15° were considered for analysis (Fig. 2l). Road construction in mountainous and hilly areas raises the probability of landslide occurrence. Therefore, this conditioning factor is considered one of the important factors in landslide analysis. The existence of a road reduces the strength and stability of the slope structure. The presence of a road breaks the rock mass, thereby decreasing its strength (Donati & Turrini, 2002). Moreover, a road influences landslides only where it undercuts the slopes. Therefore, similar to the river map, only the roads located on slopes of more than 15° were used in the generation of the road map (Fig. 2m).
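The slope filter and distance classes described above can be sketched as follows. This is an assumption-laden illustration: the feature dictionaries, the `slope_deg` key, and both function names are hypothetical, and the class breaks are taken from the distance classes listed in Table 1 (0–50, 51–100, 101–200, and so on), with distances assumed to be in metres.

```python
import bisect

# Upper bounds of the buffer classes, as listed in Table 1 (assumed metres).
RIVER_BREAKS = [50, 100, 200]
RIVER_LABELS = ["0-50", "51-100", "101-200", ">200"]
ROAD_BREAKS = [50, 100, 200, 500]
ROAD_LABELS = ["0-50", "51-100", "101-200", "201-500", ">500"]


def steep_features(features, min_slope_deg=15.0):
    """Keep only river or road segments lying on slopes steeper than 15 deg."""
    return [f for f in features if f["slope_deg"] > min_slope_deg]


def buffer_class(distance_m, breaks, labels):
    """Return the distance class whose upper bound first covers distance_m."""
    return labels[bisect.bisect_left(breaks, distance_m)]


# Hypothetical road segments with the local slope at each segment.
roads = [{"id": "r1", "slope_deg": 8.0}, {"id": "r2", "slope_deg": 22.5}]
print([f["id"] for f in steep_features(roads)])
print(buffer_class(75.0, ROAD_BREAKS, ROAD_LABELS))
```

`bisect_left` places a distance equal to a break (for example exactly 50 m) in the lower class, which matches the inclusive upper bounds in the table.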

4. Methodology
Landslide susceptibility mapping methods can be divided into direct and indirect techniques (Van Westen, Rengers, & Soeters, 2003). In the direct method, an expert defines landslide susceptibility based on their opinion about terrain conditions. The indirect method, which is slightly more accurate than the direct method, can be conducted by using statistical or deterministic approaches. These methods recognize landslide-susceptible regions based on information obtained from the correlation between the landslide conditioning factors and the landslide distribution (Tien Bui, Pradhan, Lofman, Revhaug, & Dick, 2012). In recent years, indirect landslide susceptibility mapping has been extensively conducted with the aid of GIS. GIS is appropriate for this type of mapping, in which all landslide contributing factors are integrated with a landslide inventory map by using data integration approaches (Abdallah, Chorowicz, Bou Kheir, & Khawlie, 2005). Three methods were selected for use in this research to acquire weights for each conditioning factor and produce landslide susceptibility maps. Each method belongs to a specific analysis category. The methodology flowchart is shown in Fig. 3.
4.1. Weight determination using WoE algorithm
WoE was used to produce statistically derived weights for all classes of the conditioning factor maps. Because bivariate statistical analysis (BSA) requires classified inputs, all the scale factors were reclassified. Many classification techniques exist; however, the quantile method was chosen for this research because of its greater popularity compared with other methods (Tehrany, Pradhan, & Jebur, 2013). Altitude, slope, SPI, TWI, TRI, and STI were categorized into 10 equal-area classes. Curvature was classified into three classes and expressed as concave when negative, flat when zero, and convex when positive. However, in the case of road and river, the quantile classification scheme was not used. The reason is that only small buffers are important, and a long distance from the river or road does not have any significant impact on landslide occurrence. As the distance from the river or road increases, landslide occurrence decreases (Pradhan, Sezer, Gokceoglu, & Buchroithner, 2010). The buffer distance for rivers and roads was chosen based on the proximity of the observed failures to them. Hence, a 50-m buffer zone was chosen in the study area. The significance of each conditioning factor for landslide occurrence was calculated by comparing the landslide density within a class of a conditioning factor with that within the entire study area. In WoE, positive and negative weights ($W_i^+$ and $W_i^-$) were assigned to each class of the conditioning factor (e.g., each soil unit within a soil map) and were determined as follows:

$$W_i^+ = \log_e \frac{P\{B_i \mid S\}}{P\{B_i \mid \bar{S}\}} \qquad (5)$$

and

$$W_i^- = \log_e \frac{P\{\bar{B}_i \mid S\}}{P\{\bar{B}_i \mid \bar{S}\}} \qquad (6)$$

where $B_i$ and $\bar{B}_i$ represent the presence and absence of the landslide conditioning factor, respectively. Furthermore, $S$ shows the existence of a landslide, whereas $\bar{S}$ represents its absence. The method was implemented by using individual factor maps, which include various categories, to demonstrate the presence or absence of a landslide. For each conditioning factor, $W_i^+$ was used for the pixels of a conditioning factor (shown as a class of conditioning factor) to show the significance of the existence of the factor for landslide occurrence. The existence of the conditioning factor is favorable for landslide occurrence when $W_i^+$ is positive but unfavorable when $W_i^+$ is negative. Furthermore, the significance of the absence of the factor for landslide occurrence is shown by $W_i^-$. The case where $W_i^-$ is positive indicates that the absence of the factor is favorable for landslide occurrence. Weights with higher values demonstrate that the conditioning factor is effective for susceptibility mapping; however, conditioning factors with a zero weight show no correlation with landslide occurrence. Four possible combinations exist for each conditioning factor, of which the frequency, expressed as number of pixels, can be calculated in GIS. Lee and Choi (2004) and Pradhan, Oh, and Buchroithner (2010) provide detailed explanations of WoE modeling.
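As an illustration of how the two weights follow from the four pixel-count combinations mentioned above, the sketch below computes $W_i^+$ and $W_i^-$ for one factor class. The function name, argument names, and pixel counts are hypothetical and not taken from the study; the conditional probabilities are estimated as simple pixel ratios, which is the usual reading of the WoE definition but is stated here as an assumption.

```python
import math


def woe_weights(n_class_slide, n_class_stable, n_other_slide, n_other_stable):
    """Return (W+, W-) for one class of a conditioning factor.

    The four arguments are pixel counts for the combinations
    class present / absent x landslide present / absent.
    A zero count in any combination makes a weight undefined and raises
    an error here; handling of such classes is not covered by this sketch.
    """
    total_slide = n_class_slide + n_other_slide
    total_stable = n_class_stable + n_other_stable

    p_b_given_s = n_class_slide / total_slide          # P{B | S}
    p_b_given_not_s = n_class_stable / total_stable    # P{B | not S}
    p_not_b_given_s = n_other_slide / total_slide      # P{not B | S}
    p_not_b_given_not_s = n_other_stable / total_stable  # P{not B | not S}

    w_plus = math.log(p_b_given_s / p_b_given_not_s)
    w_minus = math.log(p_not_b_given_s / p_not_b_given_not_s)
    return w_plus, w_minus


# Hypothetical counts: the class holds half of the landslide pixels
# but only one tenth of the stable pixels.
w_plus, w_minus = woe_weights(10, 100, 10, 900)
print(round(w_plus, 3), round(w_minus, 3))
```

In this made-up case the class is over-represented among landslide pixels, so $W^+$ comes out positive and $W^-$ negative, consistent with the interpretation given in the text.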
4.2. Weight determination using LR algorithm
The most common statistical technique used in landslide susceptibility mapping is multiple regression analysis. This method is represented as the linear equation

$$Y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n \qquad (7)$$

where $Y$ shows the dependent layer made from the landslide inventory, which represents the existence (1) or absence (0) of a landslide, $b_0$ is the intercept of the model, $b_i$ ($i = 1, 2, \dots, n$) represents the LR coefficients, and $x_i$ ($i = 1, 2, \dots, n$) denotes the conditioning factors.
To predict the possibility of a landslide event in each pixel, the probability index was measured by using Eq. (8):

$$p = \frac{1}{1 + e^{-Y}} \qquad (8)$$

where $p$ is the landslide probability, which takes values between 0 and 1 on an S-shaped curve.
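A per-pixel evaluation of Eqs. (7) and (8) can be sketched as below. The function name, intercept, coefficients, and factor values are placeholders chosen for illustration; they are not the coefficients reported in Table 1.

```python
import math


def landslide_probability(intercept, coefficients, factors):
    """Apply Eq. (7) to get Y, then Eq. (8) to map Y onto the range (0, 1)."""
    y = intercept + sum(b * x for b, x in zip(coefficients, factors))
    return 1.0 / (1.0 + math.exp(-y))


# Hypothetical model with two conditioning factors for a single pixel.
p = landslide_probability(-2.0, [0.05, 0.3], [20.0, 4.0])
print(round(p, 3))
```

When Y equals zero the probability is exactly 0.5, and larger Y values move it toward 1 along the S-shaped curve.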
Lee and Sambath (2006) stated that LR, as a multivariate statistical analysis (MSA) method, is appropriate for predicting the existence or absence of a landslide based on the values of the conditioning factors. LR allows the use of any conditioning factor in the analysis because of the addition of a proper link function to the common linear regression model; that is, factors can be either scale or nominal, or any combination of both types, and do not necessarily have normal distributions. This characteristic makes the method more efficient than other statistical methods, which require such assumptions to be defined prior to the study. LR can perform the analysis in two ways: either by using all the entered conditioning factors or by using significant conditioning factors only. This characteristic of LR was considered in this research, and both the dataset of pure LiDAR conditioning factors and the full database of all conditioning factors were entered in LR and assessed in two stages. LR was first applied to all the pure LiDAR conditioning factors and then applied by using only the significant LiDAR-derived conditioning factors (altitude, aspect, curvature, SPI, and TRI) from the same dataset. LR was then performed by using the second dataset, which contains all the factors (LiDAR-derived factors plus geology and so on), and by using the significant factors of the same dataset, such as altitude, slope, curvature, distance from river, distance from road, SPI, TWI, TRI, and STI.

Fig. 3. Methodology flowchart.
4.3. Weight determination using SVM algorithm


SVM is a binary classifier and machine learning algorithm (Yao, Tham, & Dai, 2008). Its structure is more sophisticated than that of WoE and LR, thereby making it capable of dealing with non-linear applications such as basin structure (García, Riaño, Chuvieco, Salas, & Danson, 2011). The method reshapes nonlinear conditions into a linear format and separates the classes by generating a hyperplane from the training dataset (Tehrany, Pradhan, & Jebur, 2013a). The hyperplane is constructed in the original space of n coordinates ($x_i$ parameters in vector $x$) between the points of the two classes (Marjanović, Kovačević, Bajat, & Voženílek, 2011). The maximum margin of separation between the classes is determined, and the classification hyperplane is placed at the center of this margin. A point that lies above the hyperplane is classified as +1; otherwise, it is classified as -1. The training points closest to the optimal hyperplane are called support vectors. New data can be classified after acquiring the decision surface. Pradhan (2013) described the SVM modeling process as follows:
Consider a training dataset of instance-label pairs $(x_i, y_i)$ with $x_i \in \mathbb{R}^n$, $y_i \in \{1, -1\}$, and $i = 1, \dots, m$. In this study, $x$ is a vector of the input space that contains the LiDAR-derived conditioning factors. The two classes $\{1, -1\}$ represent the landslide pixels and non-landslide pixels. The SVM procedure finds the optimal separating hyperplane that divides the training dataset into the landslide and non-landslide classes $\{1, -1\}$. For the case of linearly separable data, a separating hyperplane can be defined as follows:

$$y_i (w \cdot x_i + b) \geq 1 - \xi_i, \qquad (9)$$

where $w$ is a coefficient vector that describes the direction of the hyperplane in the feature space, $b$ is the offset of the hyperplane from the origin, and $\xi_i$ is the positive slack variable (Cortes & Vapnik, 1995). The optimal hyperplane is determined by solving the following optimization problem, which uses Lagrange multipliers (Samui, 2008):

$$\text{Maximize} \quad \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j), \qquad (10)$$

$$\text{Subject to} \quad \sum_{i=1}^{n} \alpha_i y_i = 0, \quad 0 \leq \alpha_i \leq C, \qquad (11)$$

where $\alpha_i$ is the Lagrange multiplier, $C$ is the penalty, and the slack variable $\xi_i$ allows penalized constraint violation. The decision function, which will be used for classifying new data, can then be written as

$$g(x) = \operatorname{sign}\left( \sum_{i=1}^{n} y_i \alpha_i (x_i \cdot x) + b \right). \qquad (12)$$
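The linear decision function of Eq. (12) can be sketched in a few lines of Python. This is only an illustration on a hand-built two-point problem: the function name and the toy support vectors, labels, multipliers, and offset are assumptions chosen so that the equality constraint in Eq. (11) holds, and they are not taken from the study.

```python
def linear_svm_decision(x, support_vectors, labels, alphas, b):
    """Return +1 or -1 from sign(sum_i y_i * alpha_i * (x_i . x) + b)."""
    score = b
    for x_i, y_i, alpha_i in zip(support_vectors, labels, alphas):
        # Dot product between a support vector and the point to classify.
        dot = sum(u * v for u, v in zip(x_i, x))
        score += y_i * alpha_i * dot
    return 1 if score >= 0 else -1

# Toy problem: one landslide point (+1) and one non-landslide point (-1).
support_vectors = [(1.0, 1.0), (-1.0, -1.0)]
labels = [1, -1]
alphas = [0.25, 0.25]  # hypothetical multipliers with sum(alpha_i * y_i) = 0
b = 0.0

class_a = linear_svm_decision((2.0, 0.5), support_vectors, labels, alphas, b)
class_b = linear_svm_decision((-3.0, 1.0), support_vectors, labels, alphas, b)
```

Only the support vectors enter the sum, so a trained model needs to store just those training points, their labels, and their multipliers.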

In cases where a separating hyperplane cannot be found with the linear kernel function, the original input data may be transformed into a high-dimensional feature space by using a nonlinear kernel. The classification decision function is then written as follows:

$$g(x) = \operatorname{sign}\left( \sum_{i=1}^{n} y_i \alpha_i K(x_i, x) + b \right) \qquad (13)$$

where $K(x_i, x)$ is the kernel function. Pradhan (2013) and Tien Bui, Pradhan, Lofman, and Revhaug (2012) provide further information on the effect of each kernel type and its parameters. In the current research, the radial basis function (RBF) was employed as the kernel. This kernel is the most common kernel type and is not sensitive to outliers (Marjanović et al., 2011; Yao et al., 2008). Furthermore, only one parameter, gamma ($\gamma$), has to be defined for a selected penalty ($C$). Selecting the kernel type is a difficult task in SVM analysis because landslide susceptibility mapping is a linearly non-separable problem. Thus, a cross-validation method was implemented to determine the optimal kernel parameters.

5. Validating the derived landslide susceptibility maps

Validation was implemented by using the area under the curve (AUC) method to compare the maps with the existing landslide inventory. AUC is one of the most common techniques for evaluating the reliability of results, and it is used to determine the success and prediction rates. The success rate demonstrates how well the estimators perform with respect to the landslides used in building them. The real efficiency of the model can be recognized from the prediction rate. The measured probability index was sorted in descending order to compute the relative ranks for each prediction pattern. Consequently, the cell values, partitioned into 100 classes, were set on the vertical axis (y), with accumulated 1% intervals on the horizontal axis (x). The presence of the landslide locations (training and testing) in each interval was assessed, and the resultant success and prediction rates were measured.

6. Results and discussion

6.1. Weights-of-evidence

Weights were assigned to each class according to the ratio between the number of landslides per class and the area of each class. Table 1 lists the correlation between the classes of each conditioning factor and landslide occurrence. When performing BSA, the relationships among the conditioning factors are not considered. Therefore, the acquired weights for classes of the same conditioning factors in both datasets are similar. For instance, the altitude in both datasets attained the same WoE values for each of its classes. However, the final probability map for the second dataset was different from the map acquired by using the first dataset (pure LiDAR data) because of the additional conditioning factors in the second dataset.
Table 1 shows that altitude has a particular influence on the characteristics of the catchment. The altitude range of 90.60 to 112.53 exhibited the maximum weight (43.41) among the classes. This finding revealed that more instability can be expected for the terrain in higher-elevation areas. Hence, the region with the mentioned altitude showed maximum susceptibility to landslides in the study area. In the case of slope, the higher weights belonged to the slope range of 25.47–30.63, and the values above this range indicate the influence of sharp slopes in the occurrence of slope failure because of the increase in the vertical component of gravity. The class of West in the aspect layer showed the highest landslide probability in the study area, with an acquired weight of 26.08. Other classes showed lower impact on landslide occurrence compared with this class. Aspect influences moisture preservation and vegetation density, and can also affect the strength of the soil structure and, consequently, landsliding (Pourghasemi, Pradhan, Gokceoglu, & Moezzi, 2012). These results demonstrate that the class of West in aspect has more unstable conditions compared with the others. The flat curvature attained the highest S(C) value of 5.14, which indicates high landslide probability in flat regions, because an area with this characteristic can retain water for a long time, thereby resulting in landslide occurrence. The last class of SPI (17.76–29.22) had the highest S(C) value, which was 14.07, whereas the second class (12.723.52) acquired the lowest weight of 9.50. The most significant region in TWI was the last class of 12.56–36.02, with the highest weight of 20.03, whereas the lowest weight (8.98) was for the first class of 2.41–5.71.
The highest classes of SPI and TWI acquired the highest values of probability because of their impact on augmenting water pressure in the material, consequently reducing shear strength. This condition favors an increase in the potential of landslide occurrence in the catchment. SPI is an important factor because the erosive power of water runoff directly influences slope toe erosion. The highest weight (19.55) in the TRI layer belongs to the last class of 80.90–474.84. Furthermore, the range of 36.32–43.75 acquired the lowest weight. Similar to TRI, the last class in STI achieved the highest weight, which indicates a high probability of landslide occurrence. The soil type of LAA_COL attained the highest weight of 20.34, while DLD achieved the lowest value of -21.73. This finding revealed that the soil type of LAA_COL has a weak structure, which led to slope failure in the study area. The weight of 1.795, which was the highest value among the others, is assigned to Schist in geology. This rock type is impenetrable, and its shear strength is lower than that of the others. Class two in LULC achieved the highest weight, which shows greater probability compared with the other LULC types. The highest weights of 24.65 and 11.35 in distance from the river were assigned to the classes of 101–200 m and 0–50 m, respectively. Infiltration in the areas adjacent to the river is high. Consequently, water pressure in the material increases. In the case of distance from the road, the classes more than 50 m from the road showed positive correlation with landslide occurrence, whereas the regions that are less than 50 m from the road showed a negative impact on landsliding.

6.2. Logistic regression

Unlike the WoE values, which were similar for both datasets, the logistic coefficients obtained for the two datasets were completely different. This result is attributed to the fact that LR performs MSA, which accounts for the correlation among the conditioning factors. Therefore, the results were not similar because the conditioning factors differed between the two datasets. The logistic coefficient, which represents the weight of each factor, was used to generate the landslide probability index, which ranges between 0 and 1. A positive coefficient shows that the presence of the factor in the catchment increases the probability of landslide occurrence, whereas a negative coefficient indicates a negative correlation between the factor and landslide occurrence (Chauhan, Sharma, & Arora, 2010). The results of the first dataset demonstrated that TRI had the highest positive correlation with landslide occurrence, with a logistic coefficient of 0.341. Except for altitude, slope, and SPI, all other factors received a positive logistic coefficient. The outcomes from the second dataset revealed that geology (vein quartz) acquired the highest positive logistic coefficient of 33.999, whereas soil (DLD) achieved the lowest weight of 8.636. The landslide probability maps were derived by using the LR coefficients.

6.3. Support vector machine

When SVM was applied to both datasets, the landslide probability index was computed for each dataset. Given the internal procedure of SVM, the acquired weights for each class and each conditioning factor cannot be listed.
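One way to see why per-factor weights cannot be read off a kernel SVM is to write out an RBF decision function explicitly. The sketch below assumes the standard RBF form K(a, b) = exp(-gamma * ||a - b||^2), which the text does not spell out; the function names, the XOR-style points, and the multiplier, offset, and gamma values are hypothetical and hand-picked, not fitted to the study data.

```python
import math

def rbf_kernel(a, b, gamma):
    """Assumed standard RBF kernel: exp(-gamma * squared Euclidean distance)."""
    return math.exp(-gamma * sum((u - v) ** 2 for u, v in zip(a, b)))

def rbf_svm_decision(x, support_vectors, labels, alphas, b, gamma):
    """Return +1 or -1 from sign(sum_i y_i * alpha_i * K(x_i, x) + b)."""
    score = b
    for x_i, y_i, alpha_i in zip(support_vectors, labels, alphas):
        score += y_i * alpha_i * rbf_kernel(x_i, x, gamma)
    return 1 if score >= 0 else -1

# XOR-like layout that no straight line can separate.
support_vectors = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
labels = [1, 1, -1, -1]
alphas = [1.0, 1.0, 1.0, 1.0]  # hypothetical, not the result of training
predictions = [
    rbf_svm_decision(p, support_vectors, labels, alphas, b=0.0, gamma=2.0)
    for p in support_vectors
]
```

The score depends on distances to the support vectors rather than on one coefficient per conditioning factor, so there is no weight vector to tabulate as in WoE or LR.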


6.4. Landslide susceptibility maps


In landslide susceptibility analysis, eight continuous scales of values (from 0 to 1) were produced from the methods that were used. Values closer to 1 indicate that landslides are more likely to occur. These continuous scales are called probability indices and need to be divided into susceptibility classes to produce the landslide susceptibility maps. However, producing the landslide susceptibility maps is difficult because no specific framework exists for classifying continuous data (Ayalew, Yamagishi, & Ugawa, 2004). How continuous data should be transformed into classes remains uncertain in landslide susceptibility mapping, because susceptible zones are often determined by expert knowledge and opinion. Various classification schemes are available; four methods, namely, standard deviation, equal interval, natural breaks, and quantile, were examined in the current research (Ayalew & Yamagishi, 2005). Each method has its own way of categorizing values, and each approach may produce different outcomes. The standard deviation method determines the mean of the data and then divides the data into categories based on the standard deviation from the mean. However, the number of classes is fixed by this method, which is unsuitable in susceptibility studies because a specific number of classes is required. In addition, the equal interval technique is not useful because, unlike the other methods, it highlights one class of susceptibility. The natural breaks method defines the boundary of each class based on the inherent nature of the data, wherever a jump occurs in the values. Based on the derived histogram of the values obtained in this research, no jump was detected in the values. Hence, the mentioned schemes were not used in the current research. Classes with equal areas can be derived by using the quantile method, that is, each class has the same number of values when using this scheme. Therefore, the quantile method best suits the objective of this research. Five landslide susceptibility classes, namely, very low, low, moderate, high, and very high, were derived for each susceptibility map, as illustrated in Figs. 4 and 5. The generated maps reflect the potential of landsliding in Bukit Antarabangsa, Ulu Klang, Malaysia. Can, Nefeslioglu, Gokceoglu, Sonmez, and Duman (2005) stated that two facts should be considered to achieve an efficient landslide susceptibility map. First, the landslide inventories must overlap with the regions detected as highly probable. Second, the highly probable areas should not have large coverage. Fig. 4 illustrates the maps generated by using the first dataset.
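The quantile scheme described above can be illustrated with a short Python sketch. It is not the GIS routine used in the study: the function names, the simple rank-based cut rule, and the ten sample index values are assumptions made for illustration.

```python
def quantile_breaks(values, n_classes=5):
    """Cut values that split the sorted data into n_classes equal-count groups."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // n_classes] for k in range(1, n_classes)]

def susceptibility_class(value, breaks):
    """Class index: 0 = very low, 1 = low, 2 = moderate, 3 = high, 4 = very high."""
    return sum(value >= cut for cut in breaks)

LABELS = ["very low", "low", "moderate", "high", "very high"]

# Hypothetical probability index values for ten cells.
index_values = [i / 10 for i in range(10)]
breaks = quantile_breaks(index_values)
classes = [susceptibility_class(v, breaks) for v in index_values]
counts = [classes.count(c) for c in range(len(LABELS))]
```

With distinct values every class receives the same number of cells, which is the equal-area property that motivated the choice of this scheme; heavily tied values would make the counts uneven.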
Visually, the result acquired from WoE was considerably different from the other derived maps. The highly susceptible areas recognized in the other three maps are located in the north and the northeast around the boundary of the catchment. However, these areas are classified as having low susceptibility by WoE. Therefore, validation should be conducted to understand which method is more precise in detecting landslide-prone regions. Fig. 5 illustrates the maps generated by using the second dataset.

Fig. 4. Landslide susceptibility maps derived from LiDAR-derived conditioning factors using a) WoE; b) LR (all factors); c) LR (significant factors); d) SVM.

Two notable points can be observed from the maps derived from the second dataset. First, the result of SVM shows a high degree of exaggeration because a large part of the catchment is classified as a susceptible zone. Second, WoE shows different zones as landslide-prone areas, unlike the other methods. With the use of the AUC method, the success and prediction rates for the eight landslide susceptibility maps were measured by comparing them with the existing landslide locations. Fig. 6 presents the AUC curve for each method.
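The ranking procedure behind the success and prediction rates can be sketched as follows. This is a simplified illustration, not the authors' implementation: cells are ranked by descending probability index, the cumulative share of landslide cells is read at equal-area steps, and the area is taken with the trapezoidal rule. The function names and the four-cell example are assumptions.

```python
def rate_curve(prob_index, is_landslide, n_bins=100):
    """Cumulative share of landslide cells captured as the ranked area grows."""
    # Rank cells from the highest to the lowest probability index.
    order = sorted(range(len(prob_index)), key=lambda i: prob_index[i], reverse=True)
    cumulative = [0]
    for i in order:
        cumulative.append(cumulative[-1] + is_landslide[i])
    total = cumulative[-1]
    n = len(order)
    # Read the curve at n_bins equal-area steps (1% steps when n_bins = 100).
    return [cumulative[(k * n) // n_bins] / total for k in range(n_bins + 1)]

def area_under_curve(curve):
    """Trapezoidal area under a curve sampled at equal x-steps on [0, 1]."""
    step = 1.0 / (len(curve) - 1)
    return sum((a + b) / 2.0 * step for a, b in zip(curve, curve[1:]))

# Hypothetical four-cell map: landslides in the two highest or two lowest cells.
prob_index = [0.9, 0.8, 0.2, 0.1]
auc_good = area_under_curve(rate_curve(prob_index, [1, 1, 0, 0], n_bins=4))
auc_poor = area_under_curve(rate_curve(prob_index, [0, 0, 1, 1], n_bins=4))
```

Passing the training landslides as `is_landslide` would give a success-rate curve, and passing the held-out testing landslides would give a prediction-rate curve.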
The success rate does not show the real efficiency of the derived results; this efficiency can be recognized from the prediction rate. The prediction rates acquired for WoE with the first and second datasets were 59.00% and 65.00%, respectively. Furthermore, 86.00% and 66.00% were the prediction rates achieved by the LR method with the use of all factors of the first and second datasets, respectively. LR, which used only the significant factors of each dataset, obtained 89.00% and 77.00% prediction rates for the first and second datasets, respectively. Finally, SVM attained 84.00% and 69.00% prediction rates for the LiDAR-derived conditioning factor dataset and the second dataset, respectively. In terms of prediction rates, the results achieved from the first dataset (LiDAR dataset) by using LR and SVM showed reasonably higher efficiency because the produced accuracies were higher than those measured from the second dataset.
As mentioned, in the case of LR, the prediction abilities of the model for the two datasets were 86.00% and 66.00%, a difference of almost 20%. This could be due to the fact that some of the conditioning factors produced noise during processing (Chang, Chiang, & Hsu, 2007). To check the severity of the noise, multi-collinearity between the input variables in the two datasets should be examined (Zhu & Huang, 2006). If a perfect linear relationship exists between the conditioning factors, then the estimates for a regression model cannot be uniquely computed. The term collinearity means that two conditioning factors are almost perfect linear combinations of each other. When more than two factors are involved, it is called multi-collinearity. LR is sensitive to collinearities among the conditioning factors (Ozdemir, 2011). Hence, the inclusion of conditioning factors which are not significant in the LR models will result in a reduction of the prediction accuracy. Therefore, significant factors of each dataset (LR LiDAR sig and LR all sig) were used in the multi-collinearity analysis. Table 2 presents the results of the multi-collinearity analysis with the variance inflation factor (VIF).

Fig. 5. Landslide susceptibility maps derived from the second dataset using a) WoE; b) LR (all factors); c) LR (significant factors); d) SVM.

A VIF greater than 10 indicates the existence of multi-collinearity (Ozdemir, 2011). Based on the acquired results, the highest VIF in this study was less than 10, showing that there was no serious multi-collinearity between the conditioning factors.
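The text does not state how the VIF values were computed. As a hedged sketch under the common definition VIF = 1 / (1 - R^2), simplified here to the two-factor case in which R^2 equals the squared Pearson correlation, the check against the threshold of 10 could look like this. The function names and sample values are hypothetical.

```python
import math

def pearson_r(a, b):
    """Pearson correlation coefficient of two equally long samples."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def pairwise_vif(a, b):
    """Two-factor special case of VIF = 1 / (1 - R^2), with R^2 = r^2."""
    r = pearson_r(a, b)
    return 1.0 / (1.0 - r * r)

# Hypothetical factor samples: uncorrelated and moderately correlated pairs.
vif_independent = pairwise_vif([1, 2, 3, 4], [1, -1, -1, 1])  # r = 0
vif_correlated = pairwise_vif([1, 2, 3, 4], [2, 1, 4, 3])     # r = 0.6
serious = vif_correlated > 10  # threshold quoted from Ozdemir (2011)
```

A VIF of 1 means the two factors share no linear information, and the value grows without bound as the correlation approaches 1 in absolute value.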
Based on the derived AUCs presented in Fig. 6, WoE showed lower prediction accuracy for the susceptibility map derived from the LiDAR-derived factors because the method itself was less efficient. This can be understood from the achieved success rate, which indicated that this algorithm is inappropriate for the current landslide modeling. The success rate shows how well the algorithm can delineate the prone areas by using the training locations. Therefore, a low success rate reveals that the algorithm is inappropriate for the specific analysis. Hence, the lower accuracy derived from WoE by using the LiDAR dataset does not represent a deficiency of this dataset but the possible weakness of this bivariate statistical method. This finding can be attributed to the generalizations that are inherent in bivariate statistical analysis (BSA) methods, which assume that landslides occur under the same combination of factors throughout the study area (Van Westen et al., 2003). WoE, which follows a linear framework, is not well suited to non-linear applications, such as a landslide with complex structure. Furthermore, WoE neglects the impact of each conditioning factor on landslides. These conditions can lead to unreliable judgment about the acquired results. BSA methods tend to simplify the factors that contribute to landslide occurrence by taking only those factors that can be mapped easily, such as slope or geology, in a study area. This negative characteristic of BSA methods affects the assessment of the conditioning factors. Both the conditioning factors and the algorithm should be efficient to achieve precise and reliable results.
The acquired results showed that, despite the fewer conditioning factors, the derived susceptibility map was more precise because of the efficiency of the LiDAR-derived factors that are correlated with landslide occurrence. Adding other factors, such as geology, to the dataset interfered with the correlation and identified some regions as landslide-prone areas that could otherwise be ignored; for example, when a single landslide location falls into one of the geology types, the algorithm is forced to assign weight to that region. Therefore, LiDAR-derived parameters were sufficient for precisely detecting susceptible areas in this case study. Adding more conditioning factors did not increase accuracy and could reduce precision. Van Westen et al. (2003) explained why adding geological and LULC data into the analysis did not have a significant effect on the final results. They used WoE to map susceptible areas by using various conditioning factors to detect the importance and impact of each factor. Their analysis showed that the use of detailed geomorphological information in bivariate statistical analysis enhanced the overall accuracy of the derived susceptibility map. However, other factor maps, such as geology, could not considerably increase the precision of the results.
Based on the more precise results achieved among the methods, high-elevation regions with a sharp slope fall into highly susceptible zones. This finding reveals that slope and altitude have a considerable effect on landslide occurrence. Susceptible zones are composed of weak rocks such as vein quartz. As expected, low-slope areas and river networks showed very low landslide susceptibility.
In addition, the significance of each conditioning factor was evaluated by eliminating the factor, rerunning the model, and measuring the AUC. This analysis was done for all three methods (WoE, LR, and SVM) using the complete dataset. The results are listed in Table 3.
In the case of WoE, the prediction rate decreased significantly when the river factor was eliminated from the analysis, which indicates that the river has a significant impact on the performance of WoE. The most significant factors recognized by the LR analysis were altitude, SPI, TRI, STI, soil, and geology, because the AUC decreased when these factors were removed from the analysis. The same factors were selected by LR as significant factors prior to the susceptibility mapping, which can be used as evidence for the correctness of the results in Table 3. Other factors did not cause considerable variation in the acquired accuracy. The same factors (altitude, SPI, TRI, STI, soil, and geology) were detected as significant for the SVM analysis, as the prediction rates decreased when these factors were eliminated from the analysis.
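The elimination procedure can be sketched as a small leave-one-factor-out loop. In the sketch below, `evaluate_auc` stands for a full model run and is a hypothetical callable; to keep the example self-contained it is replaced by a lookup of the LR prediction AUCs reported in Table 3, with the 66.00% prediction rate reported in the text for LR on all factors of the second dataset used as the baseline.

```python
def leave_one_out_drops(factors, evaluate_auc):
    """AUC lost when each factor is removed; positive means the factor helped."""
    baseline = evaluate_auc(tuple(factors))
    drops = {}
    for removed in factors:
        kept = tuple(f for f in factors if f != removed)
        drops[removed] = baseline - evaluate_auc(kept)
    return drops

# LR prediction AUC (%) after removing each factor, as listed in Table 3.
TABLE3_LR = {
    "altitude": 62.75, "slope": 67.88, "aspect": 66.17, "curvature": 66.71,
    "SPI": 63.55, "TWI": 67.03, "TRI": 62.49, "STI": 62.89, "soil": 60.91,
    "geology": 61.38, "LULC": 66.84, "river": 67.01, "road": 66.53,
}
FACTORS = tuple(TABLE3_LR)

def lookup_auc(kept):
    """Stand-in for a model run: return the reported AUC for this factor set."""
    missing = [f for f in FACTORS if f not in kept]
    return 66.00 if not missing else TABLE3_LR[missing[0]]

drops = leave_one_out_drops(FACTORS, lookup_auc)
influential = sorted(f for f, d in drops.items() if d > 0)
```

With these reported values, the factors whose removal lowers the AUC are altitude, SPI, TRI, STI, soil, and geology, which matches the set named in the text.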
Based on Fig. 6, when only the significant factors of the first dataset and the significant factors of the second dataset were utilized in LR, the difference between the prediction capabilities of the two models was reduced from 20% to 12%. Therefore, the impacts of soil and geology on the model were examined by removing each conditioning factor in turn and recalculating the AUC of these models. The change in AUC when eliminating a specific factor showed the contribution of that factor to the model (Table 4).
It can be seen that all the used factors had a significant impact on the LR performance. However, soil and geology contributed more to the model, as the prediction capability of the LR (all factors-sig) model was reduced from 77.00% to 71.19% and 72.41%, respectively.

Fig. 6. Graphic representation of the cumulative frequency diagram presenting the cumulative landslide occurrence (%; y-axis) in landslide probability index rank (%; x-axis): a) success rate; b) prediction rate.

Table 2
Multi-collinearity diagnostics of the significant parameters for both datasets.

Layer: All (sig)
Input variables   VIF
                  Altitude   SPI     TRI     STI     Soil    Geology
Altitude          1          1.553   1.165   1.599   1.162   1.095
SPI               2.484      1       2.488   1.066   1.184   1.095
TRI               1.164      1.554   1       1.570   1.184   1.082
STI               2.495      1.040   2.454   1       1.184   1.095
Soil              2.376      1.561   2.501   1.601   1       1.053
Geology           2.499      1.562   2.471   1.601   1.138   1

Layer: LiDAR (sig)
Input variables   VIF
                  Altitude   Slope   Aspect  TWI     TRI
Altitude          1          2.484   1.002   1.054   2.455
Slope             2.345      1       1.002   1.043   2.410
Aspect            4.035      4.275   1       1.054   8.928
TWI               4.034      4.231   1.002   1       8.928
TRI               1.110      1.154   1.002   1.055   1

Table 3
The relative importance of landslide conditioning factors for the three models.

Layer               AUC (prediction)
                    WoE     LR      SVM
Without altitude    65.92   62.75   66.66
Without slope       63.12   67.88   69.15
Without aspect      66.62   66.17   70.61
Without curvature   65.80   66.71   70.32
Without SPI         65.84   63.55   65.45
Without TWI         66.99   67.03   70.70
Without TRI         64.15   62.49   68.46
Without STI         66.96   62.89   67.56
Without soil        65.14   60.91   63.42
Without geology     64.39   61.38   64.87
Without LULC        65.56   66.84   70.02
Without river       59.56   67.01   70.85
Without road        63.18   66.53   70.97

Table 4
The relative importance of landslide conditioning factors for the significant parameters in both datasets.

Layer               AUC (prediction), LR LiDAR (sig)
Without altitude    87.07
Without slope       85.48
Without aspect      87.5
Without TWI         87.72
Without TRI         84.86

Layer               AUC (prediction), LR all (sig)
Without altitude    75.19
Without SPI         73.01
Without TRI         73.31
Without STI         74.34
Without soil        71.19
Without geology     72.41

7. Conclusion

In the last decade, the vast extent of landslide damage led to considerable modifications in development strategies on unstable terrain, with the Malaysian government requiring local planning specialists to perform landslide susceptibility analyses at all stages of the development process. Construction of a spatial database is the basis of such analysis, which needs to be conducted by using appropriate and precise sources. A long-standing issue that remains unsolved in the literature is the selection of the influential conditioning factors. Each researcher chooses factors based on personal opinion or on the most used factors in the literature. This research aimed to study the influence of two datasets in landslide susceptibility analysis to examine whether pure LiDAR-derived conditioning factors are sufficient or if additional factors, such as soil and geology, are also needed. Three methods, namely, WoE, LR, and SVM, which are BSA, MSA, and machine learning methods, respectively, were utilized. Each method was applied by using two datasets, and the landslide probability index was derived accordingly. LR was implemented twice for each dataset because this method can detect the significant factors in a dataset. Therefore, LR was applied once by using the whole dataset and then once by using only the significant factors. Four maps were derived by using LR. In total, eight landslide susceptibility maps were obtained from all methods. AUC was used for validation, and success and prediction rates were computed.
Given the inadequacy of WoE, questionable results on the performance of each dataset were obtained. However, LR and SVM showed reasonably precise results, which enabled reliable analysis of the effect of each dataset in landslide susceptibility analysis. These methods demonstrated the efficiency of the first dataset (pure LiDAR-derived factors) in landslide susceptibility mapping to some extent. The success and prediction rates for WoE were 60.00% and 59.00%, respectively, for the first dataset and 75.00% and 65.00%, respectively, for the second dataset. In the case of LR, the maps derived by using only significant factors were more precise than the others. The overall assessment showed that the north and northeast of the catchment are highly susceptible to landslide occurrence. Based on the results, the most susceptible region is situated on steep mountain slopes where weak rocks, such as vein quartz, exist. Slope and altitude are suitable for defining flatlands that will rarely be subjected to landslide occurrence.
Based on the findings of this research, LiDAR-derived conditioning factors may be sufficient in cases where other geological and environmental factors, such as soil and geology, are unavailable. Using these factors might reduce the effort and difficulty of data collection, which is time consuming and in some cases a difficult task. The results of this research may provide planners and researchers with a proper perspective about the effect of conditioning factors in future analysis. The difficulty in obtaining high accuracy is related to the fact that each kind of landslide has its own set of conditioning factors, which should be evaluated separately. However, in the current and similar studies, this aspect is neglected because of the difficulty in data collection for each type of landslide. Furthermore, as in other indirect susceptibility mapping, expert knowledge is not involved, and the model is run by a GIS expert, not by an earth scientist. Hence, uncertainty in the measurements cannot be avoided.

Acknowledgment

This research was supported by UPM University Research Grant (0501-11-1283RU) to stimulate research under the RUGS scheme with project number 9344100. The authors would like to thank the National Mapping Agency (JUPEM) and the Dept. of Mineral & Geosciences (JMG), Malaysia, for providing the various datasets used in this paper. Thanks are due to three anonymous reviewers for their valuable and critical comments, which helped us to improve the quality of an earlier version of the manuscript.

References
Abdallah, C., Chorowicz, J., Bou Kheir, R., & Khawlie, M. (2005). Detecting major terrain
parameters relating to mass movements' occurrence using GIS, remote sensing and
statistical correlations, case study Lebanon. Remote Sensing of Environment, 99,
448461.

164

M.N. Jebur et al. / Remote Sensing of Environment 152 (2014) 150165

Althuwaynee, O. F., Pradhan, B., & Lee, S. (2012). Application of an evidential belief
function model in landslide susceptibility mapping. Computers & Geosciences, 44,
120135.
Ayalew, L., & Yamagishi, H. (2005). The application of GIS-based logistic regression for
landslide susceptibility mapping in the Kakuda-Yahiko Mountains, Central Japan.
Geomorphology, 65, 1531.
Ayalew, L., Yamagishi, H., & Ugawa, N. (2004). Landslide susceptibility mapping using
GIS-based weighted linear combination, the case in Tsugawa area of Agano River,
Niigata Prefecture, Japan. Landslides, 1, 7381.
Can, T., Nefeslioglu, H. A., Gokceoglu, C., Sonmez, H., & Duman, T. Y. (2005). Susceptibility
assessments of shallow earthows triggered by heavy rainfall at three subcatchments
by logistic regression analyses. Geomorphology, 72, 250271.
Chang, K. -T., Chiang, S. -H., & Hsu, M. -L. (2007). Modeling typhoon-and earthquakeinduced landslides in a mountainous watershed using logistic regression.
Geomorphology, 89, 335347.
Chauhan, S., Sharma, M., & Arora, M. K. (2010). Landslide susceptibility zonation of the
Chamoli region, Garhwal Himalayas, using logistic regression model. Landslides, 7,
411423.
Chen, H., & Lee, C. (2003). A dynamic model for rainfall-induced landslides on natural
slopes. Geomorphology, 51, 269288.
Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20, 273297.
Domnguez Cuesta, M. J., Jimnez Snchez, M., & Rodrguez Garca, A. (1999). Press archives as temporal records of landslides in the North of Spain: relationships between
rainfall and instability slope events. Geomorphology, 30, 125132.
Domnguez-Cuesta, M. J., Jimnez-Snchez, M., & Berrezueta, E. (2007). Landslides in
the Central Coaleld (Cantabrian Mountains, NW Spain): Geomorphological features, conditioning factors and methodological implications in susceptibility assessment. Geomorphology, 89, 358369.
Donati, L., & Turrini, M. (2002). An objective method to rank the importance of the
factors predisposing to landslides with the GIS methodology: application to an
area of the Apennines (Valnerina; Perugia, Italy). Engineering Geology, 63,
277289.
Evett, S. R., Tolk, J. A., & Howell, T. A. (2006). Soil prole water content determination.
Vadose Zone Journal, 5, 894907.
Garca, M., Riao, D., Chuvieco, E., Salas, J., & Danson, F. M. (2011). Multispectral and
LiDAR data fusion for fuel type mapping using Support Vector Machine and decision
rules. Remote Sensing of Environment, 115, 13691379.
Glenn, N. F., Streutker, D. R., Chadwick, D. J., Thackray, G. D., & Dorsch, S. J. (2006).
Analysis of LiDAR-derived topographic information for characterizing
and differentiating landslide morphology and activity. Geomorphology, 73,
131–148.
Gokceoglu, C., Sonmez, H., Nefeslioglu, H. A., Duman, T. Y., & Can, T. (2005). The 17 March
2005 Kuzulu landslide (Sivas, Turkey) and landslide-susceptibility map of its near
vicinity. Engineering Geology, 81, 65–83.
Gorsevski, P. V., Gessler, P. E., Boll, J., Elliot, W. J., & Foltz, R. B. (2006). Spatially and
temporally distributed modeling of landslide susceptibility. Geomorphology, 80,
178–198.
Guadagno, F., Martino, S., & Mugnozza, G. S. (2003). Influence of man-made cuts on the
stability of pyroclastic covers (Campania, southern Italy): a numerical modelling
approach. Environmental Geology, 43, 371–384.
Guzzetti, F., Cardinali, M., Reichenbach, P., & Carrara, A. (2000). Comparing landslide
maps: a case study in the upper Tiber River Basin, central Italy. Environmental
Management, 25, 247–263.
Guzzetti, F., Carrara, A., Cardinali, M., & Reichenbach, P. (1999). Landslide hazard evaluation: a review of current techniques and their application in a multi-scale study,
Central Italy. Geomorphology, 31, 181–216.
Guzzetti, F., Reichenbach, P., Cardinali, M., Galli, M., & Ardizzone, F. (2005). Probabilistic landslide hazard assessment at the basin scale. Geomorphology, 72,
272–299.
Huang, R., Pei, X., Fan, X., Zhang, W., Li, S., & Li, B. (2012). The characteristics and failure
mechanism of the largest landslide triggered by the Wenchuan earthquake, May
12, 2008, China. Landslides, 9, 131–142.
Jebur, M. N., Pradhan, B., & Tehrany, M. S. (2013). Using ALOS PALSAR derived
high-resolution DInSAR to detect slow-moving landslides in tropical forest:
Cameron Highlands, Malaysia. Geomatics, Natural Hazards and Risk, 1–19,
http://dx.doi.org/10.1080/19475705.2013.860407.
Jebur, M. N., Pradhan, B., & Tehrany, M. S. (2014). Detection of vertical slope movement in
highly vegetated tropical area of Gunung pass landslide, Malaysia, using L-band
InSAR technique. Geosciences Journal, 18(1), 61–68, http://dx.doi.org/10.1007/s12303-013-0053-8.
Jibson, R. W., & Keefer, D. K. (1989). Statistical analysis of factors affecting landslide distribution in the New Madrid seismic zone, Tennessee and Kentucky. Engineering
Geology, 27, 509–542.
Lee, S., & Choi, J. (2004). Landslide susceptibility mapping using GIS and the weight-of-evidence model. International Journal of Geographical Information Science, 18,
789–814.
Lee, S., & Pradhan, B. (2006). Probabilistic landslide hazards and risk mapping on Penang
Island, Malaysia. Journal of Earth System Science, 115, 661–672.
Lee, S., & Pradhan, B. (2007). Landslide hazard mapping at Selangor, Malaysia using
frequency ratio and logistic regression models. Landslides, 4, 33–41.
Lee, S., & Sambath, T. (2006). Landslide susceptibility mapping in the Damrei Romel area,
Cambodia using frequency ratio and logistic regression models. Environmental
Geology, 50, 847–855.
Lefsky, M., Cohen, W., & Spies, T. (2001). An evaluation of alternate remote sensing products for forest inventory, monitoring, and mapping of Douglas-fir forests in western
Oregon. Canadian Journal of Forest Research, 31, 78–87.

Marjanović, M., Kovačević, M., Bajat, B., & Voženílek, V. (2011). Landslide susceptibility
assessment using SVM machine learning algorithm. Engineering Geology, 123,
225–234.
Metternicht, G., Hurni, L., & Gogu, R. (2005). Remote sensing of landslides: an analysis of
the potential contribution to geo-spatial systems for hazard assessment in mountainous environments. Remote Sensing of Environment, 98, 284–303.
Moreiras, S. M. (2005). Landslide susceptibility zonation in the Rio Mendoza valley,
Argentina. Geomorphology, 66, 345–357.
Nefeslioglu, H., Sezer, E., Gokceoglu, C., Bozkir, A., & Duman, T. (2010). Assessment of
landslide susceptibility by decision trees in the metropolitan area of Istanbul,
Turkey. Mathematical Problems in Engineering, 2010, 1–15.
Oh, H. J., & Pradhan, B. (2011). Application of a neuro-fuzzy model to landslide-susceptibility mapping for shallow landslides in a tropical hilly area. Computers &
Geosciences, 37, 1264–1276.
Ozdemir, A. (2011). Using a binary logistic regression method and GIS for evaluating and
mapping the groundwater spring potential in the Sultan Mountains (Aksehir,
Turkey). Journal of Hydrology, 405, 123–136.
Pourghasemi, H. R., Jirandeh, A. G., Pradhan, B., Xu, C., & Gokceoglu, C. (2013). Landslide
susceptibility mapping using support vector machine and GIS at the Golestan
Province, Iran. Journal of Earth System Science, 122, 349–369.
Pourghasemi, H. R., Pradhan, B., & Gokceoglu, C. (2012). Application of fuzzy logic and
analytical hierarchy process (AHP) to landslide susceptibility mapping at Haraz
watershed, Iran. Natural Hazards, 63, 965–996.
Pourghasemi, H., Pradhan, B., Gokceoglu, C., & Moezzi, K. D. (2012). Landslide susceptibility mapping using a spatial multi criteria evaluation model at Haraz Watershed, Iran.
In B. Pradhan, & M. Buchroithner (Eds.), Terrigenous Mass Movements (pp. 23–49).
Springer.
Pourghasemi, H. R., Pradhan, B., Gokceoglu, C., Mohammadi, M., & Moradi, H. R. (2012).
Application of weights-of-evidence and certainty factor models and their comparison
in landslide susceptibility mapping at Haraz watershed, Iran. Arabian Journal of
Geosciences, 6, 2351–2365.
Pradhan, B. (2013). A comparative study on the predictive ability of the decision tree,
support vector machine and neuro-fuzzy models in landslide susceptibility mapping
using GIS. Computers & Geosciences, 51, 350–365.
Pradhan, B., & Lee, S. (2010a). Delineation of landslide hazard areas on Penang Island,
Malaysia, by using frequency ratio, logistic regression, and artificial neural network
models. Environmental Earth Sciences, 60, 1037–1054.
Pradhan, B., & Lee, S. (2010b). Landslide susceptibility assessment and factor effect
analysis: backpropagation artificial neural networks and their comparison with
frequency ratio and bivariate logistic regression modelling. Environmental Modelling
& Software, 25, 747–759.
Pradhan, B., & Lee, S. (2010c). Regional landslide susceptibility analysis using backpropagation neural network model at Cameron Highland, Malaysia. Landslides, 7,
13–30.
Pradhan, B., Mansor, S., Pirasteh, S., & Buchroithner, M. F. (2011). Landslide hazard and
risk analyses at a landslide prone catchment area using statistical based geospatial
model. International Journal of Remote Sensing, 32, 4075–4087.
Pradhan, B., Oh, H. J., & Buchroithner, M. (2010). Weights-of-evidence model applied to
landslide susceptibility mapping in a tropical hilly area. Geomatics, Natural Hazards
and Risk, 1, 199–223.
Pradhan, B., Sezer, E. A., Gokceoglu, C., & Buchroithner, M. F. (2010). Landslide susceptibility mapping by neuro-fuzzy approach in a landslide-prone area (Cameron
Highlands, Malaysia). IEEE Transactions on Geoscience and Remote Sensing, 48,
4164–4177.
Pradhan, B., & Youssef, A. M. (2010). Manifestation of remote sensing data and GIS on
landslide hazard analysis using spatial-based statistical models. Arabian Journal of
Geosciences, 3, 319–326.
Pradhan, B., Youssef, A., & Varathrajoo, R. (2010). Approaches for delineating landslide
hazard areas using different training sites in an advanced artificial neural network
model. Geo-spatial Information Science, 13, 93–102.
Ray, R. L., Jacobs, J. M., & Cosh, M. H. (2010). Landslide susceptibility mapping using
downscaled AMSR-E soil moisture: A case study from Cleveland Corral, California,
US. Remote Sensing of Environment, 114, 2624–2636.
Regmi, N. R., Giardino, J. R., & Vitek, J. D. (2010). Modeling susceptibility to landslides using
the weight of evidence approach: Western Colorado, USA. Geomorphology, 115,
172–187.
Samui, P. (2008). Slope stability analysis: a support vector machine approach.
Environmental Geology, 56, 255–267.
Tehrany, M. S., Pradhan, B., & Jebur, M. N. (2013). A comparative assessment between object and pixel-based classification approaches for land use/land cover mapping using
SPOT 5 imagery. Geocarto International, 1–19, http://dx.doi.org/10.1080/10106049.2013.768300.
Tehrany, M. S., Pradhan, B., & Jebur, M. N. (2013). Spatial prediction of flood susceptible areas using rule based decision tree (DT) and a novel ensemble bivariate and multivariate statistical models in GIS. Journal of Hydrology, 504,
69–79.
Tehrany, M. S., Pradhan, B., & Jebur, M. N. (2014). Flood susceptibility mapping using a
novel ensemble weights-of-evidence and support vector machine models in GIS.
Journal of Hydrology, http://dx.doi.org/10.1016/j.jhydrol.2014.03.008.
Tien Bui, D., Pradhan, B., Lofman, O., & Revhaug, I. (2012). Landslide susceptibility assessment in Vietnam using support vector machines, decision tree, and Naive Bayes
Models. Mathematical Problems in Engineering, 2012, 1–15.
Tien Bui, D., Pradhan, B., Lofman, O., Revhaug, I., & Dick, O. B. (2012). Spatial prediction of landslide hazards in Hoa Binh province (Vietnam): a comparative assessment of the efficacy of evidential belief functions and fuzzy logic models. Catena,
96, 28–40.

Van Westen, C., Rengers, N., & Soeters, R. (2003). Use of geomorphological information in
indirect landslide susceptibility assessment. Natural Hazards, 30, 399–419.
Wilson, J. P., & Gallant, J. C. (2000). Terrain analysis: principles and applications. New York:
John Wiley & Sons.
Yao, X., Tham, L., & Dai, F. (2008). Landslide susceptibility mapping based on support
vector machine: a case study on natural slopes of Hong Kong, China. Geomorphology,
101, 572–582.
Yilmaz, I. (2009). A case study from Koyulhisar (Sivas-Turkey) for landslide susceptibility
mapping by artificial neural networks. Bulletin of Engineering Geology and the
Environment, 68, 297–306.

Zêzere, J. L. S., de Brum Ferreira, A., & Rodrigues, M. L. S. (1999). The role of conditioning
and triggering factors in the occurrence of landslides: a case study in the area north of
Lisbon (Portugal). Geomorphology, 30, 133–146.
Zhu, L., & Huang, J.-F. (2006). GIS-based logistic regression method for landslide
susceptibility mapping in regional scale. Journal of Zhejiang University Science A, 7,
2007–2017.
