Erratum published on 29 December 2018, see Remote Sens. 2019, 11(1), 57.
Article

A Novel Integrated Approach of Relevance Vector Machine Optimized by Imperialist Competitive Algorithm for Spatial Modeling of Shallow Landslides

1 Geographic Information Science Research Group, Ton Duc Thang University, Ho Chi Minh City, Vietnam
2 Faculty of Environment and Labour Safety, Ton Duc Thang University, Ho Chi Minh City, Vietnam
3 Department of Geomorphology, Faculty of Natural Resources, University of Kurdistan, Sanandaj 66177-15175, Iran
4 Department of Rangeland and Watershed Management, Faculty of Natural Resources, University of Kurdistan, Sanandaj 66177-15175, Iran
5 Faculty of Civil Engineering, Institute of Research and Development, Duy Tan University, P809-K7/25 Quang Trung, Danang 550000, Vietnam
6 Geotechnical Engineering and Artificial Intelligence Research Group (GEOAI), University of Transport Technology, Hanoi 100803, Vietnam
7 Faculty of Geography, VNU University of Science, 334 Nguyen Trai, ThanhXuan, Hanoi 100803, Vietnam
8 Faculty of Information Technology, Hanoi University of Mining and Geology, Pho Vien, Bac Tu Liem, Hanoi 100803, Vietnam
9 Young Researchers and Elites Club, North Tehran Branch, Islamic Azad University, Tehran P.O. Box 19585/466, Iran
10 Department of Geoinformation, Faculty of Geoinformation and Real Estate, Universiti Teknologi Malaysia (UTM), 81310 Johor Bahru, Malaysia
11 Geological Research Division, Korea Institute of Geoscience and Mineral Resources (KIGAM), 124, Gwahak-ro Yuseong-gu, Daejeon 34132, Korea
12 Department of Geophysical Exploration, Korea University of Science and Technology, 217, Gajeong-ro Yuseong-gu, Daejeon 34113, Korea
* Authors to whom correspondence should be addressed.
Remote Sens. 2018, 10(10), 1538; https://0-doi-org.brum.beds.ac.uk/10.3390/rs10101538
Submission received: 16 August 2018 / Revised: 16 September 2018 / Accepted: 16 September 2018 / Published: 25 September 2018

Abstract

This research aims at proposing a new artificial intelligence approach (namely RVM-ICA) which is based on the Relevance Vector Machine (RVM) and the Imperialist Competitive Algorithm (ICA) optimization for landslide susceptibility modeling. A Geographic Information System (GIS) spatial database was generated from Lang Son city in Lang Son province (Vietnam). This GIS database includes a landslide inventory map and fourteen landslide conditioning factors. The suitability of these factors for landslide susceptibility modeling in the study area was verified by the Information Gain Ratio (IGR) technique. A landslide susceptibility prediction model based on RVM-ICA and the GIS database was established by training and prediction phases. The predictive capability of the new approach was evaluated by calculations of sensitivity, specificity, accuracy, and the area under the Receiver Operating Characteristic curve (AUC). In addition, to assess the applicability of the proposed model, two state-of-the-art soft computing techniques including the support vector machine (SVM) and logistic regression (LR) were used as benchmark methods. The results of this study show that RVM-ICA with AUC = 0.92 achieved a high goodness-of-fit based on both the training and testing datasets. The predictive capability of RVM-ICA outperformed those of SVM with AUC = 0.91 and LR with AUC = 0.87. The experimental results confirm that the newly proposed model is a very promising alternative to assist planners and decision makers in the task of managing landslide prone areas.

1. Introduction

Landslides are large ground movements that are mainly caused by either climatic or geophysical factors. These natural hazards often occur along one or several slip surfaces due to shear displacement of soil and rock [1,2]. Besides climatic and geophysical factors, environmental changes caused by humans, which alter the landforms, have also been recognized as triggering factors of landslides [3]. A landslide is a mass movement of slope-forming materials that occurs where the shear stress on a slope is higher than the shear strength of the slope-forming materials [4].
Despite the fact that landslides occur frequently worldwide, the mechanism, the scale, and the risk of landslides in many regions have not been thoroughly explored and understood [5,6,7,8,9,10,11]. Research on landslide susceptibility and mitigation in many developing countries, especially in Vietnam, is still limited. This fact has been demonstrated through a high number of human casualties and economic losses caused by landslide incidences [12].
Earthquakes and rainfall are widely recognized as the two main causes of landslides [13]. In Vietnam, due to its typical geographic and climatic features, rainfall is the dominant factor of landslide occurrence [14]. In recent years, because of global climate change, extreme weather phenomena, such as torrential rainfalls and tropical rainstorms, have appeared frequently [15,16]. Therefore, there is an urgent need to construct a landslide susceptibility model with particular consideration of rainfall as a triggering factor. This model may provide immense help in assessing the likelihood of landslide occurrence in the future and in establishing appropriate strategies for hazard mitigation [17,18].
Studies on engineering geology, slope stability, and landsliding were conducted during the 1960s and 1970s. Some of these studies were carried out by consulting engineering geologists and were not published [19]. Nevertheless, a number of papers on engineering geology, landslides, and slope stability were published during this period. Work in the 1970s was largely based on landslide mapping, through which the effects of conditioning or controlling factors, such as slope exposure, on landslide occurrence became known [19]. The U.S. Geological Survey undertook extensive regional mapping of landslide deposits, analysis of the costs of damage produced by landsliding, and preparation of regional slope maps, while slope stability maps were the most important products of the 1970s [20,21]. In other words, studies of landslide susceptibility and hazard mapping date back to the 1970s, with the works of Nilsen et al. [19] and Kienholz [22,23].
A susceptibility map for a region can significantly decrease the negative effects of landslides. Based on such a map, intact areas that are potentially damaged by future landslides can be clearly detected. Accordingly, landslide hazard mitigation and prevention can be achieved either by constructing landslide defense structures or by relocation of the local communities and industrial facilities in highly susceptible areas to safer areas. In practice, the establishment of a susceptibility map is the most important step in landslide hazard management [18,24]. This is because this map is an effective tool to quantify the level of landslide hazard reflected by the spatial likelihood of the landslide occurrences with respect to various conditions of topography, geography, vegetation, climate, and land-use [25,26,27].
In recent years, the advancements of GIS technology have paved the way for various data-driven landslide susceptibility models. GIS has been extensively used and demonstrated its feasibility and effectiveness for many types of natural hazard mapping including rainfall-induced shallow landslides [26,28,29,30]. The GIS database, which consists of a historical landslide inventory and a set of landslide conditioning factors, is first established. Statistical and machine learning models are then employed to generalize a decision boundary that distinguishes landslide areas from the entire study region.
Statistical models including step-wise weight ratio assessment (SWARA) [31], multi-criteria decision evaluation (MCDE) [32,33], logistic regression (LR) [34,35,36], bivariate and multivariate regression [37,38], analytical hierarchy process (AHP) [4,39,40], multivariate adaptive regression spline (MARS) [41,42], weight of evidence (WoE) [43,44], evidential belief function (EBF) [45,46,47], discriminant analysis (DA) [48], index of entropy (IOE) [49], random forest (RF) [50,51], and Chebyshev theorem [52] have proven their feasibility in solving the task of interest. Among the above-mentioned models, LR has been applied widely and efficiently in many landslide studies [53]. The main principle of the LR algorithm is to apply maximum likelihood estimation to form a multivariate regression relation between independent variables (influencing factors) and a dependent variable (landslide or non-landslide status); such a relation is then used to estimate the probability of a landslide occurrence. However, due to their superior capability in multivariate and nonlinear modeling, machine learning-based methods have gained more attention from the academic community.
Various machine learning-based approaches for mapping landslide susceptibility have been put forward, including support vector machines (SVM) [54,55,56], adaptive neuro-fuzzy inference system (ANFIS) [57,58,59], artificial neural network (ANN) [45], fuzzy Shannon entropy [60], decision trees (DT) [54], naive Bayes (NB) [61], naive Bayes tree (NBT) [62], rotation forest ensembles (RFE) [63], multivariate adaptive regression spline (MARS) [61], Bayesian logistic regression (BLR) [64], logistic model tree (LMT) [63], and fuzzy logic system [65].
Among these, SVM is also a powerful method for landslide modeling [6]. The SVM method uses statistical learning theory to find an optimum linear hyper-plane to separate the two classes of landslide and non-landslide. In this study, the nonlinear data are converted into linearly separable data in a high-dimensional feature space using the radial basis function (RBF) kernel function (KF) [66]. Recent works reported by Reichenbach, et al. [67] and Huang and Zhao [68] point out an increasing trend of applying GIS-based machine learning models for landslide susceptibility mapping. Although a variety of models have been proposed, the investigation of other advanced machine learning approaches for tackling the problem of interest remains necessary. This is because data sets collected in different study regions have unique characteristics. Thus, a model may perform well in one region but may not be suitable for another region. This fact has been clearly demonstrated in the comparative works of Pham, et al. [69], Wang, et al. [70], and Bui, et al. [45].
In addition, when applying machine learning models, an important issue is how to appropriately determine their hyper parameters. The task of hyper parameter setting is widely known as the model selection problem [71]. This is a crucial task because hyper parameters affect the learning process and therefore strongly influence its generalization capability. If the model parameters are not set properly, the constructed machine learning models suffer from either over fitting or under fitting, which clearly undermines their prediction processes.
It is also to be noted that the problem of model selection is far from being trivial. The first challenge is that a machine learning method may have many hyper parameters. The second challenge is that the hyper parameters are searched in continuous ranges; thus, there are infinite parameter combinations. To tackle these challenges, many researchers have resorted to metaheuristic algorithms such as the Artificial Bee Colony [72], the Differential Flower Pollination [73], the Particle Swarm Optimization [74], the Symbiotic Organisms Search [75], the Firefly Algorithm [76], and the Genetic Algorithm [77]. The integration of machine learning and metaheuristic algorithms has been demonstrated to be an effective tool for solving complex problems including landslide spatial prediction [78,79].
In this research context, the current study aims at proposing a new artificial intelligence method for landslide spatial modeling that hybridizes the Relevance Vector Machine (RVM) [80] and the Imperialist Competitive Algorithm (ICA) [81]. RVM is a powerful machine learning method, which operates on the theory of Bayesian inference. Successful implementations of RVM in various fields have been reported by previous studies [82]; nevertheless, its application in rainfall-induced shallow landslide modeling is still very limited. Therefore, our research work is an attempt to fill this gap in the literature. It is of note that the learning phase of RVM requires the determination of the Radial Basis Function width as a hyper parameter. For determining this parameter of RVM, the ICA metaheuristic method is employed.
To the best of our knowledge, none of the previous studies have investigated this integration of RVM and ICA for spatial prediction of rainfall-induced shallow landslides or of other natural hazards. In addition, a GIS database for the study area of Lang Son district in the northern part of Vietnam was established. The performance of the new hybrid model, named RVM-ICA, was compared with those of the SVM and LR. The receiver operating characteristic (ROC) curve and other statistical indicators (sensitivity, specificity, and accuracy) were selected as performance measurement metrics.

2. Study Area and Data Acquisition

2.1. Description of the Study Area

Lang Son city serves as the case study for implementing the RVM-ICA model. The area under investigation for landslide susceptibility is located in Lang Son province (Figure 1). The study area belongs to the northeastern part of Vietnam; it lies between the longitudes of 106°41′34″E–106°48′32″E and the latitudes of 21°49′43″N–21°57′13″N. The study area covers about 101.3 km². The elevation of the study area varies from 214 to 800 m above standard sea level with a mean elevation of 325.6 m. The slope angles range from 0 to 80°. Six lithological categories are observed in the study area including tuff, sandstone, siltstone, Quaternary deposits (gritstone, breccia, sand, and clay), basalt, and conglomerate [83,84]. Tuff occupies the largest area (29.2%), followed by siltstone (26.4%), Quaternary deposits of gritstone, breccia, sand, and clay (19.8%), conglomerate (11.6%), sandstone (9.2%), and basalt (3.8%). The Quaternary sediments consist mainly of grit/gritstone (sandstones composed of angular sand grains and small pebbles), breccia (composed of angular particles larger than two millimeters in diameter), pebbles, boulders, sand, and clay. Seven land use types covering the study area are productive forest land (PDL), paddy land (PL), barren land (BL), protective forest land (PTL), residential areas (RA), crop land (PCL), and water surface land (WSL). PDL dominates the landscape with 32.3% of the area, followed by PL (25.9%), BL (19.0%), RA (13.2%), PTL (7.5%), WSL (1.6%), and PCL (0.4%) [85]. The soil types were classified into eight categories including ferralic acrisols (FA) (76.4%), dystric gleysols (DG) (6.3%), rhodic ferralsols (RF) (7.4%), eutric fluvisols (EF) (5.6%), plinthic acrisols (PA) (1.6%), dystric fluvisols (DF) (1.2%), water surface (WS) (1.2%), and rocky mountain surface (RMS) (0.2%) [85].
Monsoon winds (seasonal reversing winds) affect the study area by altering the regional precipitation [85]. The average annual rainfall varies from 1200 to 1600 mm. The typical rainy season is from May to September. Based on historical records, high amount of rainfall is the main landslide triggering factor in the study area. According to meteorological data analyses, during the past 20 years, the annual average temperature ranged from 17 to 22 °C. The highest and lowest average monthly temperatures are 27.5 °C in July and 12.5 °C in January, respectively. Additionally, the annual average relative humidity in the study area is between 80% and 85%. The highest humidity is recorded in August while the lowest humidity is observed in December [85].

2.2. Data Preparation

2.2.1. Landslide Inventory Map

For the purpose of landslide spatial modeling, information regarding the landslide inventory and landslide influencing factors must be collected from the study area. The landslide inventory contains landslide events that occurred in the past. The landslide affecting factors cover a variety of aspects including geomorphological, geological, hydrological, and climatic factors which influence the likelihood of landslide occurrences.
In the study area, shallow soil slides and debris flows triggered by rainfall are the subject of interest. The reason is that no earthquake-induced landslides have been recorded in the region. It is of note that a small number of rock fall events were removed from the inventory because of their rarity, insignificant damage, and different mechanism. The landslide inventory map used in this study was constructed by three means:
(i) The locations of landslides that occurred before 2003 were identified by interpreting aerial photographs with a resolution of about 1 m (obtained from the Aerial Photo-Topography Company, 2003) and from field survey data.
(ii) The landslide inventory maps established in 2006 and 2009 were obtained from previous research works [86,87].
(iii) Some recent landslide locations were identified during field work by Nguyen et al. [88].
The rationale of the landslide prediction model is based on the assumption that the factors that triggered landslide events in the past will cause landslide occurrences in the future [79]. Thus, determining the landslide conditioning factors is a critical issue for constructing an accurate landslide susceptibility map [89]. In this study, a total of 101 landslide locations were collected to establish the landslide inventory map (see Figure 1 and Figure 2). They are soil mixed boulder slides, which occurred during the last 15 years. These landslide locations were obtained from three projects carried out in the previous research work of Bui et al. [85]. The 101 landslide locations are divided into two groups: group 1, which includes 69 locations, is used for model training, and group 2, which consists of 32 locations, is employed for model validation. The numbers of pixels in the two groups are 2410 and 1045, respectively. In other words, the training set occupies ~70% of the data and the validation set ~30%. To complete the data set used for machine learning model establishment, data samples belonging to non-landslide locations are randomly sampled from the GIS database with the help of the ArcGIS 10.2 package.
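The split described above can be sketched in a few lines of Python. This is only an illustration: the text does not state how the 101 locations were assigned to the two groups, so the random shuffle, the seed, and the function name `split_inventory` are assumptions; only the counts (101, 69, 32) and the pixel numbers (2410, 1045) come from the text.

```python
import random

def split_inventory(location_ids, n_train, seed=0):
    """Randomly split landslide location IDs into training and validation groups.

    A random assignment is an assumption of this sketch, not a stated
    procedure of the study.
    """
    rng = random.Random(seed)
    shuffled = list(location_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

# 101 inventory locations, 69 of which are used for training (counts from the text).
train_ids, valid_ids = split_inventory(range(101), n_train=69)

# Share of the training group by location count and by pixel count.
share_by_location = len(train_ids) / 101
share_by_pixel = 2410 / (2410 + 1045)
print(len(train_ids), len(valid_ids))
print(round(share_by_location, 3), round(share_by_pixel, 3))
```

Both shares are close to the ~70% quoted in the text, whether counted by locations or by pixels.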

2.2.2. Landslide Conditioning Factors

Fourteen landslide conditioning factors are used in this study and are categorized into two groups: geo-morphometrical and geo-environmental factors. The geo-morphometrical factors include slope degree, slope length, aspect, curvature, elevation, topographic wetness index (TWI), stream power index (SPI), sediment transport index (STI), valley depth, and toposhape. The geo-environmental factors comprise lithology, land use, soil type, and distance to faults. The selection of these factors is based on a literature review [8,45], a previous analysis by Nguyen et al. [88], and the availability of data in the study region.
To retrieve the geo-morphometrical factors, a digital elevation model (DEM) was constructed from the National Topographic Maps at a scale of 1:5000 for Lang Son city. The resolution of the DEM is 5 m × 5 m and its size is 2567 × 2954 pixels. In this study, a raster resolution of 20 m was used for all the conditioning factors in the landslide modeling process. Based on the constructed DEM, slope degree, slope length, aspect, curvature, elevation, TWI, SPI, STI, valley depth, and toposhape were extracted using the ArcGIS 10.2 software package. Continuous values of these factors (except aspect) were reclassified into classes using the Jenks natural breaks optimization method [67] available in ArcGIS 10.2, as suggested and explained by Hung et al. [90].
Slope angle is the major parameter for slope stability analysis [91] and is very frequently used in landslide susceptibility studies [92]. The slope map was generated and classified into six classes (Figure 3a, Table 1).
Slope length has been considered an important factor in landslide activity since longer slope lengths increase the potential of erosive agents to dislodge and transport materials downslope [93]. The slope length was prepared with five classes (Figure 3b, Table 1).
The slope aspect describes the direction of the slope and has an important effect on the rainfall, wind, and sunlight exposure. Consequently, this factor has been frequently applied in landslide susceptibility analyses by many scholars [94]. The aspect map was built with nine classes including flat, north, northeast, east, southeast, south, southwest, west, and northwest (Figure 3c, Table 1).
Surface curvature at any point is the curvature of a line formed by the intersection of the surface with a plane of a specific orientation passing through this point. The value of the curvature can be either below, above, or equal to 0.05, representing the concave, convex, or flat shaped curvatures, respectively [95]. A curvature map was generated in five categories (Figure 3d, Table 1).
Elevation is one of the important factors in the occurrence of landslides. The height of an area affects the loading on the slope and thus enhances the chances of landslides if the sliding plane has a dip (orientation) towards the open excavation [96]. The elevation map was classified into seven classes (Figure 3e, Table 1).
The topographic wetness index (TWI) is a hydrological factor that is affected by the slope and the specific catchment area of a watershed [97]. Moreover, the probability of landslide occurrence decreases when the TWI increases [98]. The TWI is calculated as follows:
$$\mathrm{TWI} = \ln\left(\frac{\alpha}{\tan\beta}\right)$$
where α is the cumulative upslope area draining through a point (per unit contour length) and β is the angle of the slope at the point. Accordingly, the TWI map for the study area was classified into six classes (Figure 3f, Table 1).
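As a minimal illustration of the TWI formula for a single raster cell, the following Python sketch may help. The function name `twi`, the sample input values, and the clamping of flat cells to a small minimum slope are assumptions made here (the formula is undefined where tan β = 0); they are not part of the study's ArcGIS workflow.

```python
import math

def twi(specific_catchment_area, slope_deg, min_slope_deg=0.1):
    """Topographic wetness index ln(alpha / tan(beta)) for one cell.

    The slope is clamped to min_slope_deg so that flat cells do not
    cause a division by zero (an assumption of this sketch).
    """
    beta = math.radians(max(slope_deg, min_slope_deg))
    return math.log(specific_catchment_area / math.tan(beta))

# Hypothetical cells with the same upslope area but different slopes.
gentle = twi(200.0, 5.0)
steep = twi(200.0, 40.0)
print(gentle > steep)  # gentler slopes yield a larger TWI
```

For a fixed upslope area, the index grows as the slope flattens, which is why high TWI values mark cells where water tends to accumulate.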
The stream power index (SPI) indicates the erosion power of the stream in a region [99]. This hydrological factor is defined based on the specific catchment area ($A_s$), which is proportional to the discharge (Q) in a watershed. It is computed as follows:
$$\mathrm{SPI} = A_s \tan\beta$$
where $A_s$ is the specific catchment area, and β is the local slope gradient (in degrees) [100].
The SPI map was prepared and categorized into six classes (Figure 3g, Table 1). In addition, the sediment transport index (STI) indicates the amount of sediment transported by overland flow. This hydrological factor is based on the catchment evolution erosion theories and the transport capacity limiting sediment flux [101,102]. The STI is calculated from the following formula [103]:
$$\mathrm{STI} = \left(\frac{A_s}{22.13}\right)^{0.6} \left(\frac{\sin\beta}{0.0896}\right)^{1.3}$$
where $A_s$ is the specific catchment area (m²/m) and β is the slope gradient [100]. In this study, the STI map was divided into six classes (Figure 3h, Table 1).
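The SPI and STI formulas can be evaluated per cell as in the short Python sketch below. The function names and the conversion from degrees to radians are choices made for this illustration; the constants 22.13, 0.6, 0.0896, and 1.3 are those given in the STI formula.

```python
import math

def spi(a_s, slope_deg):
    """Stream power index A_s * tan(beta), with beta given in degrees."""
    return a_s * math.tan(math.radians(slope_deg))

def sti(a_s, slope_deg):
    """Sediment transport index (A_s / 22.13)^0.6 * (sin(beta) / 0.0896)^1.3."""
    beta = math.radians(slope_deg)
    return (a_s / 22.13) ** 0.6 * (math.sin(beta) / 0.0896) ** 1.3

# Reference cell: A_s = 22.13 m2/m and sin(beta) = 0.0896 give STI = 1.
reference_slope = math.degrees(math.asin(0.0896))
print(sti(22.13, reference_slope))
print(spi(10.0, 45.0))
```

The reference cell shows how the two normalizing constants work: both ratios equal one there, so the index measures transport capacity relative to that standard cell.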
The valley depth (VD) factor is defined as the difference in elevation between the pixel and the upstream ridge; this factor has a major contribution in the slope instability assessment [104]. The VD map for this study was constructed with six classes (Figure 3i, Table 1).
Toposhape, one of the important conditioning factors for landslide occurrence in the current study, was created and categorized into ten classes, namely ridge, saddle, flat, ravine, convex hillside, saddle hillside, slope hillside, concave hillside, inflection hillside, and unknown hillside, as suggested in [105,106] (Figure 3j, Table 1).
Land use has a significant influence on slope instability and is widely considered in landslide susceptibility studies [107]. The land use map of the study area with seven classes (Figure 3k, Table 1) was constructed based on the land use status map at 1:50,000 scale; this land use map was provided by the local authority [88]. The dominant land use type in the study area is forest land (43.4%), of which 35.7% and 7.7% belong to productive and protective forests, respectively. Additionally, paddy land covers 21.5% of the total study area, followed by barren land (20.4%) and residential areas (6.9%).
Soil has been considered an important factor in the occurrence of landslides due to its important role in instability of the slopes [108]. The soil type map with a scale of 1:100,000 was obtained from National Pedology Maps (NPM) [88]. Ferralic acrisols (FA) is the dominant soil type in the total study area (76.4%). It is followed by rhodic ferralsols (RF) (7.4%), dystric gleysols (DG) (6.3%), eutric fluvisols (EF) (5.6%), plinthic acrisols (PA) (1.6%), and dystric fluvisols (DF) (1.2%). Water surface (WS) occupies about 1.2% of the area and rocky mountain surface (RMS) forms 0.2% of the study region (Figure 3l, Table 1).
Lithology plays a very important role in the landslide occurrences. Soft and weathered rocks are more vulnerable than hard unjointed rocks thus lithological units have different vulnerability degrees to landslides [109,110]. In this study, the lithology map is compiled based on four tiles of the Geological and Mineral Resources Map (GMRM) of Vietnam at the scale of 1:50,000 [84] in ArcGIS 10.2. The study area consists of various types of lithological units (Figure 3m, Table 1).
Moreover, distance to faults is one of the most important affecting factors of landslides as slope may fail along the faults depending on the nature and orientation of the faults [111]. The distance to faults map with five categories (Figure 3n) was constructed from the fault lines of the lithological data with the help of ArcGIS 10.2 (Table 1). Table 1 and Figure 3 show all fourteen conditioning factors and their classifications used in the current study.

3. Methodology

3.1. Relevance Vector Machine

Relevance Vector Machine (RVM), proposed by Tipping [112], is a popular machine learning technique that can construct nonlinear classification models with probabilistic outcomes. The algorithm addresses a supervised pattern classification problem relying on a training set of feature vectors $X = \{x_n\}_{n=1}^{N}$ with the outputs $C = \{c_n\}_{n=1}^{2}$. Herein, $c_1$ and $c_2$ denote the landslide and non-landslide data samples, respectively. In landslide modeling, the task of interest is to establish a nonlinear classification model that assigns the set of feature vectors to two decision regions, which correspond to the two aforementioned class labels. The problem at hand is non-trivial due to the multivariate, complex, and uncertain nature of landslide prediction.
Thus, RVM is deemed very suitable for modeling the task of interest. Similar to the concept of SVM, RVM first maps training data samples from the original input space to a higher dimensional space (called feature space). Accordingly, a hyperplane used for discriminating the data can be constructed in the feature space. The concept of RVM is demonstrated in Figure 4.
Without loss of generality, the outputs $c_i$ are assigned one of two possible outcomes: 0 for no landslide occurrence and 1 for landslide occurrence. The conditional distribution of landslide occurrence, given the landslide predisposing factors and the probabilistic model, is as follows:
$$P(c_i \mid x, w) = \sigma(y)$$
where $\sigma(y) = \frac{1}{1 + e^{-y}}$ represents the logistic sigmoid function and $y$ denotes a linearly weighted sum of $M$ basis functions:
$$y(x, w) = \sum_{m=1}^{M} w_m \varphi_m(x) + w_0 = w \cdot \varphi$$
where $\varphi$ represents a Gaussian radial basis function. Its functional form is stated as follows:
$$\varphi(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^2}{2 r^2}\right)$$
where $r$ represents the width of the radial basis function (RBF).
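To make the prediction step concrete, the three expressions above (RBF basis, weighted sum, logistic sigmoid) can be chained as in the following Python sketch. The centres, weights, and bias are hypothetical values chosen for illustration; in a trained RVM they would be the relevance vectors and their learned weights.

```python
import math

def rbf(x_i, x_j, r):
    """Gaussian radial basis function with width r."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-sq_dist / (2.0 * r ** 2))

def sigmoid(y):
    """Logistic sigmoid 1 / (1 + exp(-y))."""
    return 1.0 / (1.0 + math.exp(-y))

def predict_proba(x, centres, weights, bias, r):
    """Landslide probability: sigmoid of the weighted sum of basis functions."""
    y = bias + sum(w * rbf(x, c, r) for w, c in zip(weights, centres))
    return sigmoid(y)

# Hypothetical model with two basis centres in a 2-D feature space.
centres = [(0.0, 0.0), (1.0, 1.0)]
weights = [2.0, -2.0]
p = predict_proba((0.0, 0.0), centres, weights, bias=0.0, r=1.0)
print(p)
```

A smaller width r makes each basis function decay faster with distance, which is why r controls how smooth the resulting classification boundary is.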
In this method, weights must be determined using a prior distribution function over the vector to formulate a Bayesian training criterion. Additionally, to prevent an over-fitting problem (very large values of w), achieving small weights in order to make a smooth classification boundary is the main objective [80,114]. This can be attained by assigning a zero-mean Gaussian distribution on the weight as follows [79]:
$$p(w \mid \alpha) = \prod_{m=1}^{M} \mathcal{N}(w_m \mid 0, \alpha_m^{-1})$$
where $\alpha$ is a vector of independent hyper-parameters; each element $\alpha_m$ of the vector dictates how far its associated weight $w_m$ may vary from zero. Additionally, a sparse weight prior distribution is attained by assigning a different variance parameter for each weight [112]. Thus, the prior distribution over $w$ is written in the following form [79]:
$$p(w \mid \alpha) = \prod_{m=1}^{M} \mathcal{N}(w_m \mid 0, \alpha_m^{-1}) = (2\pi)^{-M/2} \prod_{m=1}^{M} \alpha_m^{1/2} \exp\left(-\frac{\alpha_m w_m^2}{2}\right)$$
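The closed form of the prior can be checked numerically against the product of the individual zero-mean Gaussian densities. The Python sketch below does this for hypothetical weights and hyper-parameters; the function names are illustrative only.

```python
import math

def normal_pdf(w, variance):
    """Density of a zero-mean Gaussian with the given variance at w."""
    return math.exp(-w * w / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)

def prior_product(w, alpha):
    """Product of N(w_m | 0, 1/alpha_m) over all weights."""
    p = 1.0
    for w_m, a_m in zip(w, alpha):
        p *= normal_pdf(w_m, 1.0 / a_m)
    return p

def prior_closed_form(w, alpha):
    """(2*pi)^(-M/2) * prod(alpha_m^(1/2) * exp(-alpha_m * w_m^2 / 2))."""
    p = (2.0 * math.pi) ** (-len(w) / 2.0)
    for w_m, a_m in zip(w, alpha):
        p *= math.sqrt(a_m) * math.exp(-a_m * w_m ** 2 / 2.0)
    return p

# Hypothetical weights and hyper-parameters.
w = [0.3, -1.2, 0.0]
alpha = [1.0, 4.0, 100.0]
print(prior_product(w, alpha), prior_closed_form(w, alpha))
```

The larger a hyper-parameter is, the more tightly its weight is concentrated around zero; a very large value therefore effectively removes the corresponding basis function from the model.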
Given initial values of the hyper-parameter $\alpha$ and $p(w \mid C, \alpha) \propto P(C \mid w)\, p(w \mid \alpha)$, the most probable weight µ can be found by maximizing the penalized log-likelihood function over $w$ [79]:
$$\log\{P(C \mid w)\, p(w \mid \alpha)\} = \sum_{i=1}^{N} \left[c_i \log y_i + (1 - c_i) \log(1 - y_i)\right] - 0.5\, w^{T} A w$$
where $y_i = \sigma\{y(x_i)\}$ and $A = \mathrm{diag}(\alpha_1, \alpha_2, \ldots, \alpha_M)$.
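The objective above can be evaluated directly for a candidate weight vector, as in the following Python sketch. It reads $y_i$ as the sigmoid output for sample i, which is the usual convention in the RVM literature and an interpretation made here; the design matrix `phi`, the labels, and the hyper-parameters are hypothetical.

```python
import math

def sigmoid(y):
    return 1.0 / (1.0 + math.exp(-y))

def penalized_log_likelihood(phi, c, w, alpha):
    """Bernoulli log-likelihood minus the 0.5 * w^T A w penalty.

    phi   : list of N rows, each holding M basis-function values
    c     : list of N class labels (0 or 1)
    w     : list of M weights
    alpha : list of M hyper-parameters (the diagonal of A)
    """
    log_lik = 0.0
    for row, c_i in zip(phi, c):
        y_i = sigmoid(sum(w_m * f for w_m, f in zip(w, row)))
        log_lik += c_i * math.log(y_i) + (1 - c_i) * math.log(1.0 - y_i)
    penalty = 0.5 * sum(a_m * w_m ** 2 for a_m, w_m in zip(alpha, w))
    return log_lik - penalty

# Hypothetical two-sample, two-basis example.
phi = [[1.0, 0.2], [0.1, 1.0]]
c = [1, 0]
print(penalized_log_likelihood(phi, c, [1.0, -1.0], [1.0, 1.0]))
```

With all weights at zero every prediction is 0.5 and the penalty vanishes, so the objective reduces to N·log(0.5); larger hyper-parameters penalize the same non-zero weights more heavily.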
Furthermore, a least squares algorithm (LSA) is applied to the weights iteratively, and the Laplace approximation procedure is then used to solve the above optimization problem; the most probable weight µ and its covariance Σ are computed as follows [79]:
$$\mu = \Sigma \, \varphi^{T} B \, C, \qquad \Sigma = \left[\varphi^{T} B \, \varphi + A\right]^{-1}$$
where $B = \mathrm{diag}(\beta_1, \beta_2, \ldots, \beta_N)$ with $\beta_i = \sigma\{y(x_i)\} \cdot [1 - \sigma\{y(x_i)\}]$.
The remarkable result when the optimization process terminates is that many elements of the hyper-parameter vector α approach infinity; therefore, the weight vector w has only a few non-zero elements, and the corresponding training vectors are considered relevance vectors [112]. After the training process finishes, the model weight vector w is used to predict the posterior probability of the class label $c_i$ given an input vector x using Equation (4), i.e., the conditional distribution $P(c_i \mid x, w) = \sigma(y)$.
It is of note that the performance of RVM is dependent on the setting of the parameter r, which is the RBF basis width. This parameter strongly influences the smoothness of the classification boundary and therefore affects the model predictive capability [79]. A too smooth boundary may under-fit the data; meanwhile, a too rough boundary may lead to over-fitting [71]. The process of selecting a model parameter appropriately is widely known as a model selection problem. Since model selection can be formulated as an optimization task, we employed the Imperialist Competitive Algorithm (ICA) in this study as a powerful metaheuristic to solve the model selection problem for RVM. The ICA algorithm is described in the subsequent part of the study.

3.2. Imperialist Competitive Algorithm (ICA)

The Imperialist Competitive Algorithm (ICA), proposed by Atashpaz-Gargari and Lucas [81], is inspired from the field of human social evolution. This algorithm belongs to the group of swarm intelligence, which can effectively deal with continuous optimization problems [115]. As other metaheuristics, ICA is specifically designed for solving optimization problems in which no exact algorithm can be applied in polynomial time, also called NP-hard problem. The task of finding an optimal value for the machine learning model can be classified as an NP-hard problem since interaction between the model structure and the training data is very complex and there is an infinite number of possible values of the tuning parameter. Since successful applications of ICA have been widely observed [116], this algorithm can be helpful to assist the training phase of RVM by selecting an optimal width of the radial basis function.
ICA is a population-based stochastic search metaheuristic inspired by imperialistic competition [81]. The algorithm mimics the social policy of imperialism in the real world. When an empire rises, it dominates more colonies and exploits their resources; when an empire falls, other empires compete to take its possessions. In ICA, individuals within the population represent countries, which interact with each other to form empires that possess colonies (Figure 5).
ICA commences with an initial population and a pre-specified objective function. Based on the objective function value, the most powerful countries are chosen as imperialists and the others become their colonies. The algorithm then simulates the competition among imperialists to acquire more colonies. The best imperialist typically has a better chance of occupying more colonies. A population of ICA is illustrated in Figure 6.
After colonies have been assigned to each imperialist, these colonies move towards their corresponding imperialists. These movements of colonies are demonstrated in Figure 6. In this figure, it is noted that α represents a uniformly distributed random number, generated as follows [117]:
$$\alpha \sim U(0,\ \theta \times S)$$
where θ is a constant, typically greater than 1 (e.g., 1.5), and S is the distance between the colony and the imperialist.
During the competition process, the weakest empire gradually loses its colonies and the more powerful empires attempt to obtain them. Moreover, a colony exchanges positions with its imperialist when the colony becomes more powerful than the imperialist [118]. An empire that has no colonies collapses, and eventually the most powerful empire dominates all other empires and represents the final solution to the optimization problem at hand.
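The steps described in this subsection can be sketched for a one-dimensional search, which is the setting relevant to tuning a single basis width. This is a simplified illustration, not the authors' implementation: colonies are assigned round-robin rather than in proportion to imperialist power, an empire's power is taken as its imperialist's cost only, and the competition step is deterministic. The function name `ica_minimize` and all default values are hypothetical.

```python
import random

def ica_minimize(cost, lower, upper, n_countries=20, n_imperialists=4,
                 theta=1.5, n_iterations=100, seed=0):
    """Simplified 1-D Imperialist Competitive Algorithm (illustrative only)."""
    rng = random.Random(seed)
    countries = sorted((rng.uniform(lower, upper) for _ in range(n_countries)),
                       key=cost)
    # Each empire is [imperialist position, list of colony positions].
    empires = [[imp, []] for imp in countries[:n_imperialists]]
    for i, colony in enumerate(countries[n_imperialists:]):
        empires[i % n_imperialists][1].append(colony)  # simplified assignment

    for _ in range(n_iterations):
        for empire in empires:
            imp, colonies = empire
            for k, col in enumerate(colonies):
                # Assimilation: move towards the imperialist by alpha ~ U(0, theta * S).
                s = abs(imp - col)
                alpha = rng.uniform(0.0, theta * s)
                step = alpha if imp >= col else -alpha
                colonies[k] = min(upper, max(lower, col + step))
            # Position exchange: a colony that beats its imperialist takes over.
            if colonies:
                best = min(colonies, key=cost)
                if cost(best) < cost(imp):
                    colonies[colonies.index(best)] = imp
                    empire[0] = best
        # Competition: the weakest empire loses a colony to the strongest one.
        if len(empires) > 1:
            empires.sort(key=lambda e: cost(e[0]))
            strongest, weakest = empires[0], empires[-1]
            if weakest[1]:
                loser = max(weakest[1], key=cost)
                weakest[1].remove(loser)
                strongest[1].append(loser)
            else:
                # An empire without colonies collapses and is absorbed.
                strongest[1].append(weakest[0])
                empires.pop()
    return min((e[0] for e in empires), key=cost)

# Toy objective with a known minimum at 0.3 on the interval [0, 1].
best_width = ica_minimize(lambda r: (r - 0.3) ** 2, lower=0.0, upper=1.0)
```

Because θ is greater than 1, a colony can overshoot its imperialist, which lets the search explore both sides of the current best position.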

3.3. Performance Evaluation

3.3.1. Statistical-Based Measures

Statistical index-based measures were applied to evaluate the performance of the proposed model and to compare it with benchmark soft computing models. In this study, sensitivity (recall), specificity, precision (positive predictive value, PPV), and accuracy were used. These criteria are defined in terms of four possible outcomes: True Positive (TP), False Positive (FP), True Negative (TN), and False Negative (FN) [119]. TP and FP are the numbers of pixels classified correctly and incorrectly as landslide, respectively. Likewise, TN and FN are the numbers of pixels classified correctly and incorrectly as non-landslide, respectively [4]. The goodness-of-fit (using the training dataset) and the prediction power (using the validation dataset) of the landslide susceptibility models are evaluated with these statistical measures.
Sensitivity (recall), specificity, and accuracy are computed as follows [120]:
$$\text{Sensitivity} = \frac{TP}{TP + FN}; \quad \text{Specificity} = \frac{TN}{FP + TN}; \quad \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
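As a small worked example, the measures can be computed directly from the four confusion-matrix counts. The counts below are hypothetical and are not taken from the tables of this study; precision is computed with its standard definition, TP / (TP + FP).

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the statistical measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),               # recall, true positive rate
        "specificity": tn / (fp + tn),               # true negative rate
        "precision": tp / (tp + fp),                 # positive predictive value
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }

# Hypothetical counts for 100 landslide and 100 non-landslide pixels.
metrics = classification_metrics(tp=80, fp=10, tn=90, fn=20)
```

For these counts, sensitivity is 0.80, specificity is 0.90, and accuracy is 0.85.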

3.3.2. Receiver Operating Characteristic (ROC) Curve

The ROC curve is also widely used to assess the predictive power and quality of probabilistic classification models. The global performance of a landslide model is evaluated by the area under the ROC curve (AUROC) [121]. Note that the ROC curve is constructed by plotting the sensitivity (true positive rate) against 1 − specificity (false positive rate). The AUROC can be computed as:
$$\text{AUROC} = \frac{TP + TN}{P + N}$$
The AUROC, which varies between 0.5 and 1, quantifies this global performance. The closer the AUROC is to one, the more accurate the prediction of the model. AUROC values of 0.5–0.6, 0.6–0.7, 0.7–0.8, 0.8–0.9, and 0.9–1 indicate insufficient, moderate, good, very good, and excellent performance, respectively [84].
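The construction of the ROC curve can be sketched as a threshold sweep over the predicted susceptibility indexes, with the area obtained by the trapezoidal rule. This is the standard construction and is an assumption about how the curve would be built in practice; it is not the closed-form expression printed above. The labels and scores are hypothetical.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs for all thresholds."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points

def auc_trapezoid(points):
    """Area under a piecewise-linear curve given as (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical pixels: 1 = landslide, 0 = non-landslide.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = auc_trapezoid(roc_points(labels, scores))
```

Here eight of the nine landslide/non-landslide pairs are ranked correctly, so the area is 8/9, or about 0.89.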

3.4. The Proposed Integration Approach Based on RVM and ICA for Spatial Modeling of Rainfall-Induced Shallow Landslides

3.4.1. GIS Database, Training and Validation Datasets

In order to construct a machine learning based model for shallow landslide prediction, it is necessary to establish a GIS database. The GIS database used in this study is illustrated in Figure 7. The locations of past shallow landslide occurrences, maps of topographic features, land use data, and the map of geological features were processed and integrated into a single database. In total, fourteen input variables were employed to model the status of rainfall-induced shallow landslides. The data were acquired, processed, and integrated with the ArcGIS and IDRISI Selva packages. To produce the susceptibility maps, the susceptibility indexes of all pixels in the area were calculated by the RVM inference process, the SVM, and the LR models. The SVM prediction model was implemented with two hyperparameters, the kernel width (γ = 0.7) and the regularization parameter (C = 0.5). Thereafter, the pixels with their susceptibility indexes were converted to GIS data using the ArcGIS 10.2 software (ESRI, California, USA) and a supplementary module developed in C++ by the authors.
To construct the shallow landslide prediction model, information on 6908 pixels within the map of the study region was collected. Among them, 3453 pixels were associated with landslide occurrences. Since we formulate landslide modeling as a supervised learning task, landslide pixels are assigned the class label C1 = 1 and non-landslide pixels the class label C2 = 0. The whole data set is divided into two subsets: a Training Set (70%) used for model construction and a Validation Set (30%) employed for model testing. To facilitate the modeling process, all continuous and discrete conditioning factors were converted from categorical classes into continuous values and normalized to the range 0.01 to 0.99 by the frequency ratio method described in Tien Bui et al. [85].
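The 70/30 division can be illustrated with a short sketch. The random shuffling, the seed, and the rounding of the split point are assumptions, since the text does not state how the pixels were drawn; pixel identifiers stand in for the actual samples.

```python
import random

def split_dataset(samples, train_fraction=0.7, seed=42):
    """Shuffle the samples and split them into training and validation subsets."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

# 6908 pixels, represented here only by hypothetical integer identifiers.
pixel_ids = range(6908)
training_set, validation_set = split_dataset(pixel_ids)
```

Under these assumptions, the 6908 pixels yield 4836 training samples and 2072 validation samples.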

3.4.2. The Proposed Model Structure

The overall structure of the RVM-ICA model which is a combination of RVM and ICA algorithms is shown in Figure 8. RVM serves as a pattern recognition algorithm to distinguish data samples with a landslide label from data samples with a non-landslide label. This machine learning method was employed to generalize a classification boundary from the collected data set. This classification boundary is capable of recognizing the status of landslide occurrences in the study area. The prediction results of RVM were then used to construct a landslide susceptibility map for the city of Lang Son. As aforementioned, the model construction of RVM necessitates an appropriate setting of the RBF basis width. Therefore, in this study, we employed ICA to fine-tune this parameter of RVM and achieve the most desirable prediction performance.
At the first iteration, the RBF basis width of RVM is generated randomly between 0 and 1. The RVM model associated with this basis width is built with the data from the training set. To better evaluate the quality of an RVM model, we further divide the training set into two groups: data for model construction (70%), denoted as Group 1, and data for model prediction (30%), denoted as Group 2. The data in Group 1 are used to train the RVM model; the data in Group 2 serve as unknown patterns. In the model evaluation of RVM, the following objective function is used:
$$f_{RVM} = \frac{1}{CSR_{Group1} + CSR_{Group2}}$$
where $CSR_{Group1}$ and $CSR_{Group2}$ denote the classification success rates on Group 1 and Group 2, respectively.
The purpose of separating the data in the training set is to alleviate the over-fitting problem. One could simply identify the most suitable RBF basis width by considering the prediction performance of the model on the whole training set. However, the prediction results on the training set alone are not good indicators of model generalization because of over-fitting, which generally occurs when a model learns the training data very well but predicts poorly on data outside the training set.
Therefore, the data in Group 2 are used to penalize over-fitted models and identify a good value for the RBF basis width. Once the objective function has been defined, the ICA operation can start. During the search, ICA gradually identifies suitable values of the tuning parameter and discards inappropriate ones. When the stopping criterion is satisfied, the ICA operation stops and a desirable tuning parameter has been identified. The optimized RVM-ICA model is then re-trained with the whole training set, and the prediction outcome on the validation set is obtained.
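The role of Group 2 in the objective function can be illustrated with a minimal sketch. A simple kernel-vote classifier is used as a stand-in for RVM (it is not an RVM and is chosen only to keep the example self-contained); the objective is the reciprocal of the summed classification success rates, so smaller values are better. All data values and function names are hypothetical.

```python
import math

def rbf(x, center, width):
    """Gaussian RBF similarity between two points (assumed kernel form)."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, center)) / width ** 2)

def predict(x, train, width):
    """Stand-in classifier (NOT an RVM): vote by summed RBF similarity."""
    score = sum((1 if y == 1 else -1) * rbf(x, xi, width) for xi, y in train)
    return 1 if score > 0 else 0

def csr(data, train, width):
    """Classification success rate: fraction of samples predicted correctly."""
    return sum(predict(x, train, width) == y for x, y in data) / len(data)

def objective(width, group1, group2):
    """f = 1 / (CSR_Group1 + CSR_Group2); the model is fitted on Group 1 only."""
    return 1.0 / (csr(group1, group1, width) + csr(group2, group1, width))

# Hypothetical normalized samples: ((factor 1, factor 2), class label).
group1 = [((0.1, 0.2), 0), ((0.2, 0.1), 0), ((0.8, 0.9), 1), ((0.9, 0.8), 1)]
group2 = [((0.15, 0.15), 0), ((0.85, 0.85), 1)]

f_overfit = objective(0.001, group1, group2)  # memorizes Group 1, fails on Group 2
f_smooth = objective(0.5, group1, group2)     # generalizes to Group 2
```

The very narrow width classifies Group 1 perfectly but misclassifies a Group 2 sample, so its objective value (about 0.67) is worse than that of the wider width (0.5). A function of this shape is what a search routine such as ICA would minimize over the basis width.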

4. Results and Analysis

4.1. Factor Selection Using Information Gain Ratio (IGR)

In landslide prediction, it is necessary to make a preliminary assessment of the relevance of the influencing factors, so that essential factors are selected and irrelevant factors are removed from further analysis [122]. In this study, the Information Gain Ratio (IGR) algorithm, a commonly used feature selection method [91], was selected to quantify the prediction power of the aforementioned landslide conditioning factors. The IGR, as a filtering approach, is an entropy-based method that retains only the factors that are important for landslide occurrence. The method ranks features in decreasing order of their information gain ratio [123]. Let C be a training dataset composed of n input samples, each belonging to one of two class labels (landslide or non-landslide). The IGR for a specific landslide conditioning factor (CF) and the training data (C) is obtained with the following equations [63,124]:
$$IGR(C, CF) = \frac{Entropy(C) - Entropy(C, CF)}{SplitEntropy(C, CF)}$$
$$Entropy(C) = -\sum_{i=1}^{2} \frac{n(L_i, CF)}{|C|} \log_2 \frac{n(L_i, CF)}{|C|}$$
$$Entropy(C, CF) = \sum_{j=1}^{m} \frac{|C_j|}{|C|} Entropy(C_j)$$
$$SplitEntropy(C, CF) = -\sum_{j=1}^{m} \frac{|C_j|}{|C|} \log_2 \frac{|C_j|}{|C|}$$
The results of the factor evaluation based on IGR are shown in Table 2. Slope is the most significant factor for shallow landslide modeling in this study because it has the highest average predictive ability (0.601 ± 0.002), followed by STI (0.378 ± 0.005), aspect (0.230 ± 0.003), SPI (0.217 ± 0.004), TWI (0.215 ± 0.003), land use (0.155 ± 0.002), curvature (0.152 ± 0.003), toposhade (0.121 ± 0.002), lithology (0.118 ± 0.002), elevation (0.119 ± 0.001), slope length (0.072 ± 0.003), distance to faults (0.068 ± 0.002), soil type (0.055 ± 0.001), and valley depth (0.023 ± 0.001).
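The gain ratio computation used in this subsection can be sketched for a single categorical factor. The sketch follows the standard definition, in which the conditional entropy is the weighted average of the entropies of the subsets of C induced by the classes of the factor; the factor classes and labels are hypothetical and do not reproduce the values in Table 2.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def gain_ratio(factor_values, labels):
    """Information gain ratio of one categorical conditioning factor."""
    total = len(labels)
    subsets = {}
    for value, label in zip(factor_values, labels):
        subsets.setdefault(value, []).append(label)
    # Weighted average entropy of the subsets induced by the factor classes.
    conditional = sum(len(s) / total * entropy(s) for s in subsets.values())
    # Split entropy penalizes factors with many small classes.
    split = -sum(len(s) / total * math.log2(len(s) / total)
                 for s in subsets.values())
    if split == 0:
        return 0.0  # the factor has a single class and carries no information
    return (entropy(labels) - conditional) / split

# Hypothetical pixels: 1 = landslide, 0 = non-landslide.
landslide = [1, 1, 1, 0, 0, 0]
slope_class = ["steep", "steep", "steep", "gentle", "gentle", "gentle"]
igr_informative = gain_ratio(slope_class, landslide)

igr_uninformative = gain_ratio(["a", "b", "a", "b"], [1, 1, 0, 0])
```

A factor whose classes separate the two labels perfectly receives a gain ratio of 1, whereas a factor whose classes each contain both labels in equal proportion receives 0.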

4.2. Training and Validation Process

In this study, based on the training dataset, the modeling of landslide occurrences was performed and analyzed. Based on the optimization outcome of ICA, the optimal RVM hyper parameter, which is the basis width, was found to be 0.979. Therefore, the proposed model was trained with the basis width of 0.979. The training result of the proposed model is reported in Table 3. It can be observed that the proposed model has very high values of sensitivity (87%), specificity (96.6%), as well as accuracy (91.3%). The training result with the collected dataset demonstrates that the learning phase of the proposed RVM-ICA has been achieved successfully, reflected by a goodness-of-fit between the actual and predicted class labels.
Additionally, the proposed model was evaluated using the validation dataset to assess its prediction power in shallow landslide modeling. The validating results of the proposed RVM-ICA are reported in Table 4 and Figure 9. It can be seen that the proposed model also obtained high values of sensitivity (84.1%), specificity (90.6%), and accuracy (87.1%). Moreover, the predictive results of the ROC curve analysis pointed out that the proposed model has a very high value of AUC (0.92) (see Figure 9). The observed validation results are in line with the training outcomes. In summary, both the training and validating phases indicate that the proposed model achieved a very good predictive capability for shallow landslide modeling.

4.3. Construction of the Susceptibility Map

Since RVM-ICA achieved a very good predictive result with the data set collected from the study area, this hybrid intelligent model was employed to prepare a landslide susceptibility map for the city of Lang Son (Vietnam). The landslide susceptibility map for the study area (Figure 10) was constructed with five susceptibility classes: very low (40%), low (20%), moderate (15%), high (15%), and very high (10%) [125,126]. These classes were determined by reclassifying the susceptibility indexes of all pixels in the map.
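The reclassification can be sketched as follows, assuming that the percentages are shares of the study-area pixels ranked by susceptibility index (the text does not state this explicitly). The function name and the index values are hypothetical.

```python
def classify_by_area(indexes, shares=(0.40, 0.20, 0.15, 0.15, 0.10)):
    """Assign each pixel to a susceptibility class by its rank-based area share."""
    labels = ("very low", "low", "moderate", "high", "very high")
    # Pixel positions ordered from the lowest to the highest index.
    order = sorted(range(len(indexes)), key=lambda i: indexes[i])
    classes = [None] * len(indexes)
    start, cumulative = 0, 0.0
    for label, share in zip(labels, shares):
        cumulative += share
        # The last class always extends to the final pixel.
        if label == labels[-1]:
            end = len(indexes)
        else:
            end = round(cumulative * len(indexes))
        for i in order[start:end]:
            classes[i] = label
        start = end
    return classes

# Twenty hypothetical pixels with indexes 0.00, 0.05, ..., 0.95.
indexes = [i / 20 for i in range(20)]
classes = classify_by_area(indexes)
```

For twenty pixels, this yields 8, 4, 3, 3, and 2 pixels in the five classes from very low to very high.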
To validate the reliability of the produced susceptibility map, the landslide inventory map, which includes the actual landslide locations, was overlaid on the newly constructed map. The graphic curve [127] was plotted with the percentage of landslide pixels on the y-axis and, on the x-axis, the percentage of pixels of the susceptibility classes sorted from high to low susceptibility index. The curve shows that most of the actual landslide pixels fall in the high and very high classes, whereas very few fall in the low and very low classes. The AUC was obtained with the models for the whole study area (Figure 10). These results indicate that the landslide susceptibility map produced by the proposed RVM-ICA model is highly reliable for practical landslide hazard management.
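The construction of such a curve can be sketched as follows: pixels are sorted from high to low susceptibility index, and the cumulative percentage of landslide pixels is recorded against the cumulative percentage of map pixels. The trapezoidal area and all data values are illustrative assumptions.

```python
def cumulative_landslide_curve(indexes, is_landslide):
    """Return (% of map pixels, % of landslide pixels) points, high index first."""
    order = sorted(range(len(indexes)), key=lambda i: indexes[i], reverse=True)
    total = sum(is_landslide)
    curve, found = [(0.0, 0.0)], 0
    for rank, i in enumerate(order, start=1):
        found += is_landslide[i]
        curve.append((100.0 * rank / len(order), 100.0 * found / total))
    return curve

def area_under_curve(curve):
    """Trapezoidal area under the percentage curve, rescaled to the range 0 to 1."""
    area = sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(curve, curve[1:]))
    return area / 10000.0

# Ten hypothetical pixels, three of which are mapped landslides.
indexes = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
is_landslide = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
curve = cumulative_landslide_curve(indexes, is_landslide)
auc_map = area_under_curve(curve)
```

In this toy case all landslide pixels are found within the 40% of the map with the highest indexes, and the area under the curve is about 0.82.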

4.4. Model Comparison

Since the proposed model is newly introduced for landslide prediction, its predictive capability should be compared with that of well-known existing methods. In this study, two benchmark models, LR and SVM, were selected for comparison. The results of the model comparison are reported in Table 3 and Table 4, and Figure 8. It can be seen that the sensitivity, specificity, accuracy, and AUC of the proposed model are better than those of the benchmark models (SVM and LR) in both the training and validation phases. According to these results, the proposed model outperformed the SVM and LR models in shallow landslide prediction.

5. Discussion

Landslide modelling has been studied throughout the world using both qualitative and quantitative methods. Among these, machine learning and evolutionary algorithms have been increasingly favored [4,124]. The aim of these techniques is to provide an accurate method for preparing a reliable landslide susceptibility map. In the current study, we presented a new hybrid model that couples the relevance vector machine (RVM) with the imperialist competitive algorithm (ICA) as an optimizer, namely ICA-RVM, for spatial prediction of landslides in Lang Son city. Additionally, two machine learning algorithms, LR and SVM, were used as benchmarks for the proposed model.
The ICA-RVM had not previously been explored for spatial prediction of landslides. We selected fourteen conditioning factors for the modelling process. The information gain ratio was used to determine the most important factors. The results indicated that slope angle was the most significant factor for landslide occurrence in the study area. This result is reasonable because slope is well known as the most important factor affecting landslide occurrence [128]. The outcomes of the IGR-based analysis also show that all factors are relevant to landslide modeling, as the average predictive ability of all factors is high; therefore, all fourteen influencing factors were selected for landslide modeling in this study.
The analysis showed that the proposed model had the highest goodness-of-fit on the training dataset and the highest prediction performance on the validation dataset (see Table 3 and Table 4). In particular, the ICA-RVM model (accuracy = 87.1%) improved on the accuracy of LR (79.8%) and SVM (84.6%) by 7.3 and 2.5 percentage points, respectively. The validity of the landslide susceptibility maps prepared using the new hybrid model and the two benchmark algorithms, LR and SVM, was checked with the validation dataset and the ROC curve (Figure 8). The results in this figure indicated that the ICA-RVM outperformed the LR and SVM benchmark algorithms. The reason for the improvement in prediction accuracy is that RVM-ICA is a hybrid model which inherits the advantages of both machine learning and metaheuristic approaches [72]. Additionally, the proposed model uses RVM as the classifier, which has several advantages over other machine learning approaches. The first advantage is that RVM has only one hyperparameter, which significantly eases the model selection phase. The second advantage is that the model constructed by RVM generally retains fewer vectors (relevance vectors) than the number of support vectors in a standard SVM. This means that the RVM model is less complex than the standard SVM [78]; therefore, RVM is less prone to over-fitting than SVM. Moreover, the finding that the SVM model outperforms the LR model for landslide prediction in the present study agrees with the findings of previous studies [129].

6. Concluding Remark

In this study, RVM-ICA, which is a combination of two advanced computational intelligence methods of RVM and ICA, was proposed for spatial landslide modeling with the case study of Lang Son city (northern part of Vietnam). This area has experienced many serious landslides in recent years during the monsoon season. The proposed model was constructed and validated with a dataset generated from the GIS database of the study area. The data set includes 101 historical landslide locations and fourteen landslide influencing factors. The predictive capability of the proposed model was verified by the employment of the area under the ROC curve (AUROC), and three statistical indexes (sensitivity, specificity, and accuracy).
In addition, two benchmark models, SVM and LR, were used to compare and validate the applicability of the new shallow landslide prediction model. The analysis showed that the new hybrid model obtained a very good training performance (accuracy = 91.3%) and a desirable predictive outcome on the validation data set (accuracy = 87.1%). More specifically, the performance of the proposed model is better than that of the two benchmark models (SVM and LR). These facts indicate that the hybrid machine learning model is a very promising alternative for shallow landslide modeling. In short, the hybrid approach combining an advanced machine learning classifier with a metaheuristic optimization method is capable of producing an accurate landslide susceptibility map. One significant advantage of the hybrid RVM-ICA model is that the model parameter selection phase is carried out automatically with the help of the metaheuristic algorithm. This means that the training and prediction phases of the proposed model can be performed autonomously with minimum human intervention. Thus, the model can be used by local authorities without much domain knowledge in machine learning. This advantage and the experimental outcome indicate that RVM-ICA can be very helpful for planners and decision makers in the task of landslide hazard management in the study area. The designed hybrid model can be applied for the same purposes to other cities with similar weather and topographic conditions; however, caution and careful verification are necessary.

Author Contributions

D.T.B., H.S., A.S., K.C., N.D.H., B.T.P., Q.T.B., C.T.T., M.P., B.B.A., and L.S. contributed equally to the work. D.T.B., N.D.H., and B.T.P. collected field data and conducted the landslide mapping and analysis. D.T.B., H.S., A.S., K.C., Q.T.B., and C.T.T. wrote the manuscript. M.P., B.B.A., and L.S. provided critical comments in planning this paper and edited the manuscript. All the authors discussed the results and edited the manuscript.

Funding

This research was supported by the Basic Research Project of the Korea Institute of Geoscience and Mineral Resources (KIGAM), funded by the Ministry of Science and ICT, and by Universiti Teknologi Malaysia (UTM) based on Research University Grant (Q.J130000.2527.17H84).

Acknowledgments

We express our thanks to the Editor-in-Chief of Remote Sensing and to our three anonymous reviewers. With their comments and suggestions, we were able to significantly improve the quality of our paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Varnes, D.J. Slope movement types and processes. Spec. Rep. 1978, 176, 11–33.
  2. Dai, F.; Lee, C.; Ngai, Y.Y. Landslide risk assessment and management: An overview. Eng. Geol. 2002, 64, 65–87.
  3. Guzzetti, F.; Reichenbach, P.; Ardizzone, F.; Cardinali, M.; Galli, M. Estimating the quality of landslide susceptibility models. Geomorphology 2006, 81, 166–184.
  4. Shirzadi, A.; Bui, D.T.; Pham, B.T.; Solaimani, K.; Chapi, K.; Kavian, A.; Shahabi, H.; Revhaug, I. Shallow landslide susceptibility assessment using a novel hybrid intelligence approach. Environ. Earth Sci. 2017, 76, 60–78.
  5. Malamud, B.D.; Turcotte, D.L.; Guzzetti, F.; Reichenbach, P. Landslide inventories and their statistical properties. Earth Surf. Process. Landf. 2004, 29, 687–711.
  6. Chau, K.; Sze, Y.; Fung, M.; Wong, W.; Fong, E.; Chan, L. Landslide hazard analysis for hong kong using landslide inventory and gis. Comput. Geosci. 2004, 30, 429–443.
  7. Van Den Eeckhaut, M.; Verstraeten, G.; Poesen, J. Morphology and internal structure of a dormant landslide in a hilly area: The collinabos landslide (Belgium). Geomorphology 2007, 89, 258–273.
  8. Shahabi, H.; Hashim, M. Landslide susceptibility mapping using gis-based statistical models and remote sensing data in tropical environment. Sci. Rep. 2015, 5, 9899.
  9. Van Westen, C.J.; Castellanos, E.; Kuriakose, S.L. Spatial data for landslide susceptibility, hazard, and vulnerability assessment: An overview. Eng. Geol. 2008, 102, 112–131.
  10. Remondo, J.; Bonachea, J.; Cendrero, A. A statistical approach to landslide risk modelling at basin scale: From landslide susceptibility to quantitative risk assessment. Landslides 2005, 2, 321–328.
  11. Maharaj, R.J. Landslide processes and landslide susceptibility analysis from an upland watershed: A case study from St. Andrew, Jamaica, West Indies. Eng. Geol. 1993, 34, 53–79.
  12. Hoang, N.-D.; Tien Bui, D. Spatial prediction of rainfall-induced shallow landslides using gene expression programming integrated with gis: A case study in Vietnam. Nat. Hazards 2018, 92, 1871–1887.
  13. Martha, T.R.; Roy, P.; Govindharaj, K.B.; Kumar, K.V.; Diwakar, P.; Dadhwal, V. Landslides triggered by the June 2013 extreme rainfall event in parts of Uttarakhand state, India. Landslides 2015, 12, 135–146.
  14. Thanh, L.N.; De Smedt, F. Application of an analytical hierarchical process approach for landslide susceptibility mapping in a luoi district, thua thien hue province, Vietnam. Environ. Earth Sci. 2012, 66, 1739–1752.
  15. Barry, R.G.; Chorley, R.J. Atmosphere, Weather and Climate; Routledge: Abingdon, UK, 2009.
  16. Salinger, M.J. Climate variability and change: Past, present and future—An overview. In Increasing Climate Variability and Change; Springer: Berlin, Germany, 2005; pp. 9–29.
  17. Lima, P.; Steger, S.; Glade, T.; Tilch, N.; Schwarz, L.; Kociu, A. Landslide Susceptibility Mapping at National Scale: A First Attempt for Austria; Springer International Publishing: Cham, Switzerland, 2017; pp. 943–951.
  18. Jaiswal, P.; van Westen, C.J.; Jetten, V. Quantitative landslide hazard assessment along a transportation corridor in southern India. Eng. Geol. 2010, 116, 236–250.
  19. Nilsen, T.H. Relative Slope Stability and Land-Use Planning in the San Francisco Bay Region, California; US Government Printing Office: Washington, DC, USA, 1979; Volume 944.
  20. Nilsen, T.H.; Brabb, E.E. Current slope-stability studies in the san francisco bay region. J. Res. US Geol. Surv. 1973, 1, 327–431.
  21. Nilsen, T.; Brabb, E.E. Slope stability studies in the san francisco bay region, California. Geol. Soc. Am. Rev. Eng. Geol. 1977, 3, 235–243.
  22. Kienholz, H. Maps of geomorphology and natural hazards of Grindelwald, Switzerland: Scale 1:10,000. Arct. Alp. Res. 1978, 10, 169–184.
  23. Fell, R.; Corominas, J.; Bonnard, C.; Cascini, L.; Leroi, E.; Savage, W.Z. Guidelines for landslide susceptibility, hazard and risk zoning for land use planning. Eng. Geol. 2008, 102, 85–98.
  24. Moosavi, V.; Niazi, Y. Development of hybrid wavelet packet-statistical models (WP-SM) for landslide susceptibility mapping. Landslides 2016, 13, 97–114.
  25. Chen, J.; Zeng, Z.; Jiang, P.; Tang, H. Application of multi-gene genetic programming based on separable functional network for landslide displacement prediction. Neural Comput. Appl. 2016, 27, 1771–1784.
  26. Meinhardt, M.; Fink, M.; Tünschel, H. Landslide susceptibility analysis in central Vietnam based on an incomplete landslide inventory: Comparison of a new method to calculate weighting factors by means of bivariate statistics. Geomorphology 2015, 234, 80–97.
  27. Ballabio, C.; Sterlacchini, S. Support vector machines for landslide susceptibility mapping: The staffora river basin case study, Italy. Math. Geosci. 2012, 44, 47–70.
  28. Matin, M.A.; Chitale, V.S.; Murthy, M.S.R.; Uddin, K.; Bajracharya, B.; Pradhan, S. Understanding forest fire patterns and risk in nepal using remote sensing, geographic information system and historical fire data. Int. J. Wildl. Fire 2017, 26, 276–286.
  29. Tien Bui, D.; Hoang, N.D. A bayesian framework based on a gaussian mixture model and radial-basis-function fisher discriminant analysis (baygmmkda v1.1) for spatial prediction of floods. Geosci. Model Dev. 2017, 10, 3391–3409.
  30. Samodra, G.; Chen, G.; Sartohadi, J.; Kasama, K. Comparing data-driven landslide susceptibility models based on participatory landslide inventory mapping in purwosari area, Yogyakarta, Java. Environ. Earth Sci. 2017, 76, 184–203.
  31. Dehnavi, A.; Aghdam, I.N.; Pradhan, B.; Varzandeh, M.H.M. A new hybrid model using step-wise weight assessment ratio analysis (Swara) technique and adaptive neuro-fuzzy inference system (Anfis) for regional landslide hazard assessment in iran. Catena 2015, 135, 122–148.
  32. Ahmed, B. Landslide susceptibility mapping using multi-criteria evaluation techniques in chittagong metropolitan area, bangladesh. Landslides 2015, 12, 1077–1095.
  33. Feizizadeh, B.; Blaschke, T. An uncertainty and sensitivity analysis approach for gis-based multicriteria landslide susceptibility mapping. Int. J. Geogr. Inf. Sci. 2014, 28, 610–638.
  34. Shirzadi, A.; Saro, L.; Joo, O.H.; Chapi, K. A gis-based logistic regression model in rock-fall susceptibility mapping along a mountainous road: Salavat abad case study, Kurdistan, Iran. Nat. Hazards 2012, 64, 1639–1656.
  35. Shahabi, H.; Hashim, M.; Ahmad, B.B. Remote sensing and gis-based landslide susceptibility mapping using frequency ratio, logistic regression, and fuzzy logic methods at the central Zab Basin, Iran. Environ. Earth Sci. 2015, 73, 8647–8668.
  36. Othman, A.A.; Gloaguen, R.; Andreani, L.; Rahnama, M. Improving landslide susceptibility mapping using morphometric features in the mawat area, Kurdistan region, NE Iraq: Comparison of different statistical models. Geomorphology 2018, 319, 147–160.
  37. Shahabi, H.; Ahmad, B.; Khezri, S. Evaluation and comparison of bivariate and multivariate statistical methods for landslide susceptibility mapping (case study: Zab basin). Arab. J. Geosci. 2013, 6, 3885–3907.
  38. Süzen, M.L.; Doyuran, V. A comparison of the gis based landslide susceptibility assessment methods: Multivariate versus Bivariate. Environ. Geol. 2004, 45, 665–679.
  39. Shahabi, H.; Khezri, S.; Ahmad, B.B.; Hashim, M. Landslide susceptibility mapping at central zab basin, iran: A comparison between analytical hierarchy process, frequency ratio and logistic regression models. Catena 2014, 115, 55–70.
  40. Komac, M. A landslide susceptibility model using the analytical hierarchy process method and multivariate statistics in perialpine Slovenia. Geomorphology 2006, 74, 17–28.
  41. Wang, L.-J.; Guo, M.; Sawada, K.; Lin, J.; Zhang, J. Landslide susceptibility mapping in mizunami city, Japan: A comparison between logistic regression, bivariate statistical analysis and multivariate adaptive regression spline models. Catena 2015, 135, 271–282.
  42. Felicísimo, Á.M.; Cuartero, A.; Remondo, J.; Quirós, E. Mapping landslide susceptibility with logistic regression, multiple adaptive regression splines, classification and regression trees, and maximum entropy methods: A comparative study. Landslides 2013, 10, 175–189.
  43. Tan, Y.; Guo, D.; Xu, B. A geospatial information quantity model for regional landslide risk assessment. Nat. Hazards 2015, 79, 1385–1398.
  44. Lee, S.; Choi, J. Landslide susceptibility mapping using gis and the weight-of-evidence model. Int. J. Geogr. Inf. Sci. 2004, 18, 789–814.
  45. Bui, D.T.; Tuan, T.A.; Klempe, H.; Pradhan, B.; Revhaug, I. Spatial prediction models for shallow landslide hazards: A comparative assessment of the efficacy of support vector machines, artificial neural networks, kernel logistic regression, and logistic model tree. Landslides 2016, 13, 361–378.
  46. Chen, W.; Shahabi, H.; Shirzadi, A.; Li, T.; Guo, C.; Hong, H.; Li, W.; Pan, D.; Hui, J.; Ma, M. A novel ensemble approach of bivariate statistical-based logistic model tree classifier for landslide susceptibility assessment. Geocarto Int. 2018, 1–23.
  47. Althuwaynee, O.F.; Pradhan, B.; Lee, S. Application of an evidential belief function model in landslide susceptibility mapping. Comput. Geosci. 2012, 44, 120–135.
  48. He, S.; Pan, P.; Dai, L.; Wang, H.; Liu, J. Application of kernel-based fisher discriminant analysis to map landslide susceptibility in the qinggan river delta, three gorges, china. Geomorphology 2012, 171, 30–41.
  49. Chen, W.; Xie, X.; Peng, J.; Shahabi, H.; Hong, H.; Bui, D.T.; Duan, Z.; Li, S.; Zhu, A.-X. Gis-based landslide susceptibility evaluation using a novel hybrid integration approach of bivariate statistical based random forest method. Catena 2018, 164, 135–149.
  50. Chen, W.; Zhang, S.; Li, R.; Shahabi, H. Performance evaluation of the gis-based data mining techniques of best-first decision tree, random forest, and naïve bayes tree for landslide susceptibility modeling. Sci. Total Environ. 2018, 644, 1006–1018.
  51. Ngoc-Thach, N.; Ngo, D.B.-T.; Xuan-Canh, P.; Hong-Thi, N.; Thi, B.H.; NhatDuc, H.; Dieu, T.B. Spatial pattern assessment of tropical forest fire danger at thuan chau area (vietnam) using gis-based advanced machine learning algorithms: A comparative study. Ecol. Inform. 2018, 46, 74–85.
  52. Ercanoglu, M.; Dağdelenler, G.; Özsayin, E.; Alkevlı, T.; Sönmez, H.; Özyurt, N.N.; Kahraman, B.; Uçar, İ.; Çetınkaya, S. Application of chebyshev theorem to data preparation in landslide susceptibility mapping studies: An example from yenice (Karabük, Turkey) region. J. Mt. Sci. 2016, 13, 1923–1940.
  53. Mandal, S.; Mandal, K. Modeling and mapping landslide susceptibility zones using gis based multivariate binary logistic regression (LR) model in the rorachu river basin of eastern Sikkim Himalaya, India. Model. Earth Syst. Environ. 2018, 4, 69–88.
  54. Hong, H.; Pradhan, B.; Xu, C.; Bui, D.T. Spatial prediction of landslide hazard at the yihuang area (China) using two-class kernel logistic regression, alternating decision tree and support vector machines. Catena 2015, 133, 266–281.
  55. Hong, H.; Liu, J.; Zhu, A.-X.; Shahabi, H.; Pham, B.T.; Chen, W.; Pradhan, B.; Bui, D.T. A novel hybrid integration model using support vector machines and random subspace for weather-triggered landslide susceptibility assessment in the wuning area (China). Environ. Earth Sci. 2017, 76, 652.
  56. Tehrany, M.S.; Jones, S.; Shabani, F.; Martínez-Álvarez, F.; Bui, D.T. A novel ensemble modeling approach for the spatial prediction of tropical forest fire susceptibility using logitboost machine learning classifier and multi-source geospatial data. Theor. Appl. Climatol. 2018, 1–17. [Google Scholar] [CrossRef]
  57. Pradhan, B.; Sezer, E.A.; Gokceoglu, C.; Buchroithner, M.F. Landslide susceptibility mapping by neuro-fuzzy approach in a landslide-prone area (cameron highlands, Malaysia). IEEE Trans. Geosci. Remote Sens. 2010, 48, 4164–4177. [Google Scholar] [CrossRef]
  58. Chen, W.; Panahi, M.; Tsangaratos, P.; Shahabi, H.; Ilia, I.; Panahi, S.; Li, S.; Jaafari, A.; Ahmad, B.B. Applying population-based evolutionary algorithms and a neuro-fuzzy system for modeling landslide susceptibility. Catena 2019, 172, 212–231. [Google Scholar] [CrossRef]
  59. Tien Bui, D.; Khosravi, K.; Li, S.; Shahabi, H.; Panahi, M.; Singh, V.; Chapi, K.; Shirzadi, A.; Panahi, S.; Chen, W. New hybrids of anfis with several optimization algorithms for flood susceptibility modeling. Water 2018, 10, 1210. [Google Scholar] [CrossRef]
  60. Shadman Roodposhti, M.; Aryal, J.; Shahabi, H.; Safarrad, T. Fuzzy shannon entropy: A hybrid gis-based landslide susceptibility mapping method. Entropy 2016, 18, 343. [Google Scholar] [CrossRef]
  61. Pham, B.T.; Bui, D.T.; Pourghasemi, H.R.; Indra, P.; Dholakia, M. Landslide susceptibility assesssment in the uttarakhand area (India) using gis: A comparison study of prediction capability of naïve bayes, multilayer perceptron neural networks, and functional trees methods. Theor. Appl. Climatol. 2017, 128, 255–273. [Google Scholar] [CrossRef]
  62. Shirzadi, A.; Chapi, K.; Shahabi, H.; Solaimani, K.; Kavian, A.; Ahmad, B.B. Rock fall susceptibility assessment along a mountainous road: An evaluation of bivariate statistic, analytical hierarchy process and frequency ratio. Environ. Earth Sci. 2017, 76, 152–169. [Google Scholar] [CrossRef]
  63. Chen, W.; Shirzadi, A.; Shahabi, H.; Ahmad, B.B.; Zhang, S.; Hong, H.; Zhang, N. A novel hybrid artificial intelligence approach based on the rotation forest ensemble and naïve bayes tree classifiers for a landslide susceptibility assessment in langao county, China. Geomat. Nat. Hazards Risk 2017, 8, 1955–1977. [Google Scholar] [CrossRef]
  64. Tien Bui, D.; Shahabi, H.; Shirzadi, A.; Chapi, K.; Pradhan, B.; Chen, W.; Khosravi, K.; Panahi, M.; Bin Ahmad, B.; Saro, L. Land subsidence susceptibility mapping in south korea using machine learning algorithms. Sensors 2018, 18, 2464. [Google Scholar] [CrossRef] [PubMed]
  65. Chen, W.; Peng, J.; Hong, H.; Shahabi, H.; Pradhan, B.; Liu, J.; Zhu, A.-X.; Pei, X.; Duan, Z. Landslide susceptibility modelling using gis-based machine learning techniques for chongren county, Jiangxi province, China. Sci. Total Environ. 2018, 626, 1121–1135. [Google Scholar] [CrossRef] [PubMed]
  66. Fatemi Aghda, S.M.; Bagheri, V.; Razifard, M. Landslide susceptibility mapping using fuzzy logic system and its influences on mainlines in lashgarak region, Tehran, Iran. Geotech. Geol. Eng. 2018, 36, 915–937. [Google Scholar] [CrossRef]
  67. Pham, B.T.; Tien Bui, D.; Prakash, I. Bagging based support vector machines for spatial prediction of landslides. Environ. Earth Sci. 2018, 77, 146–161. [Google Scholar] [CrossRef]
  68. Reichenbach, P.; Rossi, M.; Malamud, B.D.; Mihir, M.; Guzzetti, F. A review of statistically-based landslide susceptibility models. Earth-Sci. Rev. 2018, 180, 60–91. [Google Scholar] [CrossRef]
  69. Huang, Y.; Zhao, L. Review on landslide susceptibility mapping using support vector machines. Catena 2018, 165, 520–529. [Google Scholar] [CrossRef]
  70. Pham, B.T.; Tien Bui, D.; Prakash, I. Landslide susceptibility assessment using bagging ensemble based alternating decision trees, logistic regression and J48 decision trees methods: A comparative study. Geotech. Geol. Eng. 2017, 35, 2597–2611. [Google Scholar] [CrossRef]
  71. Wang, L.-J.; Guo, M.; Sawada, K.; Lin, J.; Zhang, J. A comparative study of landslide susceptibility maps using logistic regression, frequency ratio, decision tree, weights of evidence and artificial neural network. Geosci. J. 2016, 20, 117–136. [Google Scholar] [CrossRef]
  72. Bishop, C. Pattern Recognition and Machine Learning; Springer Science + Business Media: Singapore, 2006. [Google Scholar]
  73. Tien Bui, D.; Tuan, T.A.; Hoang, N.-D.; Thanh, N.Q.; Nguyen, D.B.; Van Liem, N.; Pradhan, B. Spatial prediction of rainfall-induced landslides for the lao cai area (Vietnam) using a hybrid intelligent approach of least squares support vector machines inference model and artificial bee colony optimization. Landslides 2016, 14, 447–458. [Google Scholar] [CrossRef]
  74. Hoang, N.-D.; Tien Bui, D.; Liao, K.-W. Groutability estimation of grouting processes with cement grouts using differential flower pollination optimized support vector machine. Appl. Soft Comput. 2016, 45, 173–186. [Google Scholar] [CrossRef]
  75. Xue, X. Prediction of slope stability based on hybrid PSO and LSSVM. J. Comput. Civ. Eng. 2017, 31, 04016041. [Google Scholar] [CrossRef]
  76. Prayogo, D.; Susanto, Y.T.T. Optimizing the prediction accuracy of friction capacity of driven piles in cohesive soil using a novel self-tuning least squares support vector machine. Adv. Civ. Eng. 2018, 2018, 1–9. [Google Scholar] [CrossRef]
  77. Qi, C.; Fourie, A.; Ma, G.; Tang, X.; Du, X. Comparative study of hybrid artificial intelligence approaches for predicting hangingwall stability. J. Comput. Civ. Eng. 2018, 32, 04017086. [Google Scholar] [CrossRef]
  78. Hong, H.; Tsangaratos, P.; Ilia, I.; Liu, J.; Zhu, A.X.; Xu, C. Applying genetic algorithms to set the optimal combination of forest fire related variables and model forest fire susceptibility based on data mining models. The case of Dayu county, China. Sci. Total Environ. 2018, 630, 1044–1056. [Google Scholar] [CrossRef] [PubMed]
  79. Chen, W.; Panahi, M.; Pourghasemi, H.R. Performance evaluation of gis-based new ensemble data mining techniques of adaptive neuro-fuzzy inference system (ANFIS) with genetic algorithm (GA), differential evolution (DE), and particle swarm optimization (PSO) for landslide spatial modelling. Catena 2017, 157, 310–324. [Google Scholar] [CrossRef]
  80. Hoang, N.-D.; Tien-Bui, D. A novel relevance vector machine classifier with cuckoo search optimization for spatial prediction of landslides. J. Comput. Civ. Eng. 2016, 30, 04016001. [Google Scholar] [CrossRef]
  81. Tipping, M.E. The relevance vector machine. Adv. Neural Inf. Process. Syst. 2000, 12, 652–658. [Google Scholar]
  82. Atashpaz-Gargari, E.; Lucas, C. Imperialist Competitive Algorithm: An Algorithm for Optimization Inspired by Imperialistic Competition. In Proceedings of the 2007 IEEE Congress on Evolutionary Computation, Singapore, 25–28 September 2007; pp. 4661–4667. [Google Scholar]
  83. Imani, M.; Kao, H.-C.; Lan, W.-H.; Kuo, C.-Y. Daily sea level prediction at chiayi coast, taiwan using extreme learning machine and relevance vector machine. Glob. Planet. Chang. 2018, 161, 211–221. [Google Scholar] [CrossRef]
  84. Bui, D.T.; Pradhan, B.; Revhaug, I.; Tran, C.T. A comparative assessment between the application of fuzzy unordered rules induction algorithm and J48 decision tree models in spatial prediction of shallow landslides at lang son city, Vietnam. In Remote Sensing Applications in Environmental Research; Springer: Berlin, Germany, 2014; pp. 87–111. [Google Scholar]
  85. Quoc, N.D.; Hung, L.; Huyen, D.T. Geological Map; Institute of Geosciences and Mineral Resources: Hanoi, Vietnam, 1992. [Google Scholar]
  86. Bui, D.T.; Pradhan, B.; Revhaug, I.; Nguyen, D.B.; Pham, H.V.; Bui, Q.N. A novel hybrid evidential belief function-based fuzzy logic model in spatial prediction of rainfall-induced shallow landslides in the lang son city area (Vietnam). Geomat. Nat. Hazards Risk 2015, 6, 243–271. [Google Scholar] [CrossRef]
  87. Tam, V.; Tuy, P.; Nam, N.; Tuan, L.; Tuan, N.; Trung, N.; Thang, D.; Ha, P. Geohazard Investigation in Some Key Areas of the Northern Mountainous Area of Vietnam for the Planning of Socio-Economic Development; Vietnam Institute of Geosciences and Mineral Resources: Hanoi, Vietnam, 2006; Volume 83, pp. 56–62. [Google Scholar]
  88. Nguyen, Q.-K.; Tien Bui, D.; Hoang, N.-D.; Trinh, P.; Nguyen, V.-H.; Yilmaz, I. A novel hybrid approach based on instance based learning classifier and rotation forest ensemble for spatial prediction of rainfall-induced shallow landslides using gis. Sustainability 2017, 9, 813. [Google Scholar] [CrossRef]
  89. Jebur, M.N.; Pradhan, B.; Tehrany, M.S. Using alos palsar derived high-resolution dinsar to detect slow-moving landslides in tropical forest: Cameron highlands, Malaysia. Geomat. Nat. Hazards Risk 2015, 6, 741–759. [Google Scholar] [CrossRef]
  90. Hung, L.Q.; Van, N.T.H.; Duc, D.M.; Van Son, P.; Khanh, N.H.; Binh, L.T. Landslide susceptibility mapping by combining the analytical hierarchy process and weighted linear combination methods: A case study in the upper Lo River catchment (Vietnam). Landslides 2016, 13, 1285–1301. [Google Scholar] [CrossRef]
  91. Huang, S.L.; Yamasaki, K. Slope failure analysis using local minimum factor-of-safety approach. J. Geotech. Eng. 1993, 119, 1974–1989. [Google Scholar] [CrossRef]
  92. Yalcin, A. Gis-based landslide susceptibility mapping using analytical hierarchy process and bivariate statistics in ardesen (Turkey): Comparisons of results and confirmations. Catena 2008, 72, 1–12. [Google Scholar] [CrossRef]
  93. Gomez, H.; Kavzoglu, T. Assessment of shallow landslide susceptibility using artificial neural networks in jabonosa river basin, venezuela. Eng. Geol. 2005, 78, 11–27. [Google Scholar] [CrossRef]
  94. Clerici, A.; Perego, S.; Tellini, C.; Vescovi, P. A gis-based automated procedure for landslide susceptibility mapping by the conditional analysis method: The baganza valley case study (Italian Northern Apennines). Environ. Geol. 2006, 50, 941–961. [Google Scholar] [CrossRef]
  95. Ohlmacher, G.C. Plan curvature and landslide probability in regions dominated by earth flows and earth slides. Eng. Geol. 2007, 91, 117–134. [Google Scholar] [CrossRef]
  96. Walker, L.R.; Shiels, A.B. Landslide Ecology; Cambridge University Press: Cambridge, UK, 2012. [Google Scholar]
  97. Wilson, J.P.; Gallant, J.C. Terrain Analysis: Principles and Applications; John Wiley & Sons: Mississauga, ON, Canada, 2000. [Google Scholar]
  98. Ozdemir, A. Landslide susceptibility mapping using bayesian approach in the Sultan Mountains (Akşehir, Turkey). Nat. Hazards 2011, 59, 1573–1607. [Google Scholar] [CrossRef]
  99. Poudyal, C.P.; Chang, C.; Oh, H.-J.; Lee, S. Landslide susceptibility maps comparing frequency ratio and artificial neural networks: A case study from the Nepal Himalaya. Environ. Earth Sci. 2010, 61, 1049–1064. [Google Scholar] [CrossRef]
  100. Moore, I.D.; Wilson, J.P. Length-slope factors for the revised universal soil loss equation: Simplified method of estimation. J. Soil Water Conserv. 1992, 47, 423–428. [Google Scholar]
  101. Pradhan, A.M.S.; Kim, Y.-T. Relative effect method of landslide susceptibility zonation in weathered granite soil: A case study in Deokjeok-ri creek, South Korea. Nat. Hazards 2014, 72, 1189–1217. [Google Scholar] [CrossRef]
  102. Devkota, K.C.; Regmi, A.D.; Pourghasemi, H.R.; Yoshida, K.; Pradhan, B.; Ryu, I.C.; Dhital, M.R.; Althuwaynee, O.F. Landslide susceptibility mapping using certainty factor, index of entropy and logistic regression models in gis and their comparison at mugling–narayanghat road section in Nepal Himalaya. Nat. Hazards 2013, 65, 135–165. [Google Scholar] [CrossRef] [Green Version]
  103. Jebur, M.N.; Pradhan, B.; Tehrany, M.S. Optimization of landslide conditioning factors using very high-resolution airborne laser scanning (Lidar) data at catchment scale. Remote Sens. Environ. 2014, 152, 150–165. [Google Scholar] [CrossRef]
  104. Hong, H.; Chen, W.; Xu, C.; Youssef, A.M.; Pradhan, B.; Tien Bui, D. Rainfall-induced landslide susceptibility assessment at the chongren area (China) using frequency ratio, certainty factor, and index of entropy. Geocarto Int. 2016, 32, 139–154. [Google Scholar] [CrossRef]
  105. Caniani, D.; Pascale, S.; Sdao, F.; Sole, A. Neural networks and landslide susceptibility: A case study of the urban area of Potenza. Nat. Hazards 2008, 45, 55–72. [Google Scholar] [CrossRef]
  106. Ercanoglu, M.; Kasmer, O.; Temiz, N. Adaptation and comparison of expert opinion to analytical hierarchy process for landslide susceptibility mapping. Bull. Eng. Geol. Environ. 2008, 67, 565–578. [Google Scholar] [CrossRef]
  107. Zhao, C.; Chen, W.; Wang, Q.; Wu, Y.; Yang, B. A comparative study of statistical index and certainty factor models in landslide susceptibility mapping: A case study for the Shangzhou district, shaanxi province, China. Arab. J. Geosci. 2015, 8, 9079–9088. [Google Scholar] [CrossRef]
  108. Ayalew, L.; Yamagishi, H. The application of gis-based logistic regression for landslide susceptibility mapping in the Kakuda-Yahiko mountains, central Japan. Geomorphology 2005, 65, 15–31. [Google Scholar] [CrossRef]
  109. Yesilnacar, E.; Topal, T. Landslide susceptibility mapping: A comparison of logistic regression and neural networks methods in a medium scale study, Hendek region (Turkey). Eng. Geol. 2005, 79, 251–266. [Google Scholar] [CrossRef]
  110. Nefeslioglu, H.A.; Duman, T.Y.; Durmaz, S. Landslide susceptibility mapping for a part of tectonic Kelkit Valley (Eastern Black sea region of Turkey). Geomorphology 2008, 94, 401–418. [Google Scholar] [CrossRef]
  111. Gorum, T.; Fan, X.; van Westen, C.J.; Huang, R.Q.; Xu, Q.; Tang, C.; Wang, G. Distribution pattern of earthquake-induced landslides triggered by the 12 may 2008 wenchuan earthquake. Geomorphology 2011, 133, 152–167. [Google Scholar] [CrossRef]
  112. Tipping, M.E. Sparse bayesian learning and the relevance vector machine. J. Mach. Learn. Res. 2001, 1, 211–244. [Google Scholar]
  113. Pardo, B.A. Professional employment. IEEE Trans. Signal Process. 2014, 62, 4298–4310. [Google Scholar]
  114. Bishop, C.M.; Tipping, M.E. Variational relevance vector machines. In Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence, San Francisco, CA, USA, 30 June–3 July 2000; pp. 46–53. [Google Scholar]
  115. Hosseini, S.; Al Khaled, A. A survey on the imperialist competitive algorithm metaheuristic: Implementation in engineering domain and directions for future research. Appl. Soft Comput. 2014, 24, 1078–1094. [Google Scholar] [CrossRef]
  116. Sadowski, L.; Nikoo, M. Corrosion current density prediction in reinforced concrete by imperialist competitive algorithm. Neural Comput. Appl. 2014, 25, 1627–1638. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  117. Rao, R.V. Teaching-learning-based optimization algorithm. In Teaching Learning Based Optimization Algorithm; Springer: Berlin, Germany, 2016; pp. 9–39. [Google Scholar]
  118. Kaveh, A.; Talatahari, S. Optimum design of skeletal structures using imperialist competitive algorithm. Comput. Struct. 2010, 88, 1220–1229. [Google Scholar] [CrossRef]
  119. Fawcett, T. An introduction to roc analysis. Pattern Recognit. Lett. 2006, 27, 861–874. [Google Scholar] [CrossRef]
  120. Pham, B.T.; Shirzadi, A.; Bui, D.T.; Prakash, I.; Dholakia, M. A hybrid machine learning ensemble approach based on a radial basis function neural network and rotation forest for landslide susceptibility modeling: A case study in the Himalayan area, India. Int. J. Sediment Res. 2018, 33, 157–170. [Google Scholar] [CrossRef]
  121. Pham, B.T.; Bui, D.T.; Prakash, I.; Nguyen, L.H.; Dholakia, M. A comparative study of sequential minimal optimization-based support vector machines, vote feature intervals, and logistic regression in landslide susceptibility assessment using gis. Environ. Earth Sci. 2017, 76, 371. [Google Scholar] [CrossRef]
  122. Micheletti, N.; Foresti, L.; Robert, S.; Leuenberger, M.; Pedrazzini, A.; Jaboyedoff, M.; Kanevski, M. Machine learning feature selection methods for landslide susceptibility mapping. Math. Geosci. 2014, 46, 33–57. [Google Scholar] [CrossRef]
  123. Forman, G. An extensive empirical study of feature selection metrics for text classification. J. Mach. Learn. Res. 2003, 3, 1289–1305. [Google Scholar]
  124. Abedini, M.; Ghasemian, B.; Shirzadi, A.; Shahabi, H.; Chapi, K.; Pham, B.T.; Bin Ahmad, B.; Tien Bui, D. A novel hybrid approach of bayesian logistic regression and its ensembles for landslide susceptibility assessment. Geocarto Int. 2018, 14, 447–458. [Google Scholar] [CrossRef]
  125. Lee, S. Comparison of landslide susceptibility maps generated through multiple logistic regression for three test areas in Korea. Earth Surf. Process. Landf. 2007, 32, 2133–2148. [Google Scholar] [CrossRef]
  126. Chung, C.J.F.; Fabbri, A.G. Validation of spatial prediction models for landslide hazard mapping. Nat. Hazards 2003, 30, 451–472. [Google Scholar] [CrossRef]
  127. Chung, C.-J.F.; Fabbri, A.G.; Van Westen, C.J. Multivariate regression analysis for landslide hazard zonation. In Geographical Information Systems in Assessing Natural Hazards; Springer: Berlin, Germany, 1995; pp. 107–133. [Google Scholar]
  128. Ohlmacher, G.C.; Davis, J.C. Using multiple logistic regression and gis technology to predict landslide hazard in Northeast Kansas, USA. Eng. Geol. 2003, 69, 331–343. [Google Scholar] [CrossRef]
  129. Pham, B.T.; Pradhan, B.; Tien Bui, D.; Prakash, I.; Dholakia, M.B. A comparative study of different machine learning methods for landslide susceptibility assessment: A case study of uttarakhand area (India). Environ. Model. Softw. 2016, 84, 240–250. [Google Scholar] [CrossRef]
Figure 1. Location of the study area and landslide inventory.
Figure 2. Photos of two recent landslides in the Lang Son area (taken in October 2016 by Ha Viet Pham).
Figure 3. Landslide conditioning factors used in the study area: (a) Slope; (b) Slope length; (c) Aspect; (d) Curvature; (e) Elevation; (f) TWI; (g) SPI; (h) STI; (i) Valley depth; (j) Topo-shape; (k) Land use (RA: Residential Area; PTL: Protective Forest Land; PDL: Productive Forest Land; PL: Paddy Land; BL: Barren Land; PCL: Perennial Crop Land; WSL: Water Surface Land); (l) Soil type (FA: Ferralic Acrisols; DG: Dystric Gleysols; PA: Plinthic Acrisols; WS: Water Surface; DF: Dystric Fluvisols; EF: Eutric Fluvisols; RF: Rhodic Ferralsols; RMS: Rocky Mountain Surface); (m) Lithology; (n) Distance to faults.
Figure 4. Illustrated concept of RVM classification model [113].
Figure 5. A framework population of the ICA algorithm.
Figure 6. Initial population of ICA.
Figure 7. The established GIS database.
Figure 8. The proposed RVM-ICA used for landslide spatial modeling.
Figure 9. Area under the ROC curve (AUC) of the three models using the validation dataset.
Figure 10. Landslide susceptibility map using the RVM model optimized by the ICA algorithm for the study area.
Table 1. Landslide conditioning factors and their classes.

| No. | Factor | Classes |
|---|---|---|
| 1 | Slope (degree) | (1) 0–8.1; (2) 8.2–15.1; (3) 15.2–24.3; (4) 24.4–31.5; (5) 31.6–41.1; (6) 41.2–80.4 |
| 2 | Slope length (m) | (1) 0–8.6; (2) 8.7–34.8; (3) 34.9–61.1; (4) 61.2–84.9; (5) 85–774.2 |
| 3 | Aspect | (1) Flat; (2) N; (3) NE; (4) E; (5) SE; (6) S; (7) SW; (8) W; (9) NW |
| 4 | Curvature | (1) <−2.1; (2) −2–−0.1; (3) −0.1–0.1; (4) 0.1–2.1; (5) >2.1 |
| 5 | Elevation (m) | (1) 214.7–269.7; (2) 269.8–295.6; (3) 295.7–326.5; (4) 326.6–373.6; (5) 373.7–413.9; (6) 432–553.4; (7) 553.5–800.2 |
| 6 | TWI | (1) 1.3–4.2; (2) 4.3–5.3; (3) 5.4–6.4; (4) 6.5–8.1; (5) 8.2–9.4; (6) 9.5–20.8 |
| 7 | SPI | (1) 0–20.2; (2) 20.3–80.8; (3) 80.9–151.8; (4) 151.9–211.8; (5) 211.9–270.1; (6) >270.1 |
| 8 | STI | (1) 0–6.8; (2) 6.9–22; (3) 22.1–38.2; (4) 38.3–44.2; (5) 44.3–72.9; (6) >72.9 |
| 9 | Valley depth (m) | (1) −128.8–−12.1; (2) −12–13.4; (3) 13.5–32.2; (4) 32.3–53.7; (5) 53.8–85.9; (6) 86–213.4 |
| 10 | Toposhade | (1) Ridge; (2) Saddle; (3) Flat; (4) Ravine; (5) Convex hillside; (6) Saddle hillside; (7) Slope hillside; (8) Concave hillside; (9) Inflection hillside; (10) Unknown hillside |
| 11 | Land use | (1) RA; (2) PTL; (3) PDL; (4) PL; (5) BL; (6) PCL; (7) WSL |
| 12 | Soil type | (1) FA; (2) DG; (3) PA; (4) WS; (5) DF; (6) EF; (7) RF; (8) RMS |
| 13 | Lithology | (1) Tuff; (2) Sandstone; (3) Siltstone; (4) Quaternary; (5) Basalt; (6) Conglomerate |
| 14 | Distance to fault (m) | (1) 0–100; (2) 100–200; (3) 200–300; (4) 300–400; (5) >400 |
Table 2. Predictive ability of the fourteen influencing factors using the Information Gain Ratio technique.

| No. | Conditioning Factor | Average Predictive Ability | Standard Deviation |
|---|---|---|---|
| 1 | Slope (degree) | 0.601 | 0.002 |
| 2 | STI | 0.378 | 0.005 |
| 3 | Aspect | 0.230 | 0.003 |
| 4 | SPI | 0.217 | 0.004 |
| 5 | TWI | 0.215 | 0.003 |
| 6 | Land use | 0.155 | 0.002 |
| 7 | Curvature | 0.152 | 0.003 |
| 8 | Toposhade | 0.121 | 0.002 |
| 9 | Lithology | 0.118 | 0.002 |
| 10 | Elevation (m) | 0.119 | 0.001 |
| 11 | Slope length (m) | 0.072 | 0.003 |
| 12 | Distance to fault (m) | 0.068 | 0.002 |
| 13 | Soil type | 0.055 | 0.001 |
| 14 | Valley depth (m) | 0.023 | 0.001 |
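
The ranking in Table 2 rests on the Information Gain Ratio of each conditioning factor with respect to the landslide/non-landslide label. As a minimal sketch of the basic formula only (information gain divided by the split information of the factor), the snippet below computes the ratio for one categorical factor. The function names and the toy `slope_class` and `landslide` lists are hypothetical and are not taken from the study data; the averaging and standard deviations reported in Table 2 are not reproduced here.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(factor_classes, labels):
    """Information gain of `labels` given `factor_classes`, divided by split information."""
    n = len(labels)
    # Group the labels by the class of the conditioning factor.
    groups = {}
    for cls, y in zip(factor_classes, labels):
        groups.setdefault(cls, []).append(y)
    # Expected entropy of the labels after splitting on the factor.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information penalizes factors with many small classes.
    split_info = entropy(factor_classes)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: slope class per sample and landslide label (1 = landslide).
slope_class = [1, 1, 2, 2, 3, 3, 3, 3]
landslide = [0, 0, 0, 1, 1, 1, 1, 1]
print(round(gain_ratio(slope_class, landslide), 3))
```

A factor whose classes separate landslide from non-landslide samples well receives a higher ratio, which matches the way slope dominates the ranking in Table 2.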
Table 3. Model performance using the training dataset.

| Statistical Index | Relevance Vector Machine | Logistic Regression | Support Vector Machine |
|---|---|---|---|
| True positive | 2338 | 2200 | 2319 |
| True negative | 2061 | 2040 | 2109 |
| False positive | 72 | 209 | 91 |
| False negative | 349 | 370 | 301 |
| Sensitivity (%) | 87.0 | 85.6 | 88.5 |
| Specificity (%) | 96.6 | 90.7 | 95.9 |
| Accuracy (%) | 91.3 | 88.0 | 91.9 |
Table 4. Model validation using the validation dataset.

| Statistical Index | Relevance Vector Machine | Logistic Regression | Support Vector Machine |
|---|---|---|---|
| True positive | 945 | 798 | 857 |
| True negative | 867 | 869 | 911 |
| False positive | 90 | 246 | 187 |
| False negative | 178 | 176 | 135 |
| Sensitivity (%) | 84.1 | 81.9 | 86.4 |
| Specificity (%) | 90.6 | 77.9 | 83.0 |
| Accuracy (%) | 87.1 | 79.8 | 84.6 |
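
The sensitivity, specificity, and accuracy rows of Tables 3 and 4 follow directly from the four confusion-matrix counts. The short sketch below recomputes them from the validation counts in Table 4 using the standard definitions; the function name `confusion_metrics` and the dictionary layout are illustrative choices, not part of the original study.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Return sensitivity, specificity and accuracy in percent, rounded to one decimal."""
    sensitivity = tp / (tp + fn)                # share of landslide samples correctly detected
    specificity = tn / (tn + fp)                # share of non-landslide samples correctly detected
    accuracy = (tp + tn) / (tp + tn + fp + fn)  # share of all samples correctly classified
    return {
        "sensitivity": round(100 * sensitivity, 1),
        "specificity": round(100 * specificity, 1),
        "accuracy": round(100 * accuracy, 1),
    }


# Validation-dataset counts as listed in Table 4.
validation_counts = {
    "RVM": dict(tp=945, tn=867, fp=90, fn=178),
    "LR": dict(tp=798, tn=869, fp=246, fn=176),
    "SVM": dict(tp=857, tn=911, fp=187, fn=135),
}

for model, counts in validation_counts.items():
    print(model, confusion_metrics(**counts))
```

Recomputing the indices this way is a quick consistency check on the tabulated percentages, and the same function applies unchanged to the training counts in Table 3.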

Share and Cite

MDPI and ACS Style

Tien Bui, D.; Shahabi, H.; Shirzadi, A.; Chapi, K.; Hoang, N.-D.; Pham, B.T.; Bui, Q.-T.; Tran, C.-T.; Panahi, M.; Bin Ahmad, B.; et al. A Novel Integrated Approach of Relevance Vector Machine Optimized by Imperialist Competitive Algorithm for Spatial Modeling of Shallow Landslides. Remote Sens. 2018, 10, 1538. https://0-doi-org.brum.beds.ac.uk/10.3390/rs10101538