Comparing the Performance of a Logistic Regression and a Random Forest Model in Landslide Susceptibility Assessments. the Case of Wuyaun Area, China

  • Hong H
  • Tsangaratos P
  • Ilia I
  • et al.

Abstract

The main objectives of the study were to apply a Logistic Regression and a Random Forest model to construct a landslide susceptibility map of the Wuyuan area, China, and to compare their results by performing non-parametric and linear regression analyses. Thirteen landslide variables were analyzed, namely: lithology, soil, slope, aspect, altitude, topographic wetness index, stream power index, stream transport index, plan curvature, profile curvature, distance to roads, distance to rivers and distance to faults. A total of 255 sites classified as landslide and 255 sites classified as non-landslide were separated into a training dataset (70%) and a validation dataset (30%). The outcomes of each model were compared and validated using statistical evaluation measures, the receiver operating characteristic, and the area under the success and prediction rate curves. The presence of linear correlation between the two models was estimated by performing a simple linear regression analysis. The most accurate model was Random Forest, which correctly identified 98.32% of the instances during the training phase, followed by Logistic Regression (87.43%). During the validation phase, the Random Forest model achieved a classification accuracy of 85.52%, while the Logistic Regression model achieved an accuracy of 80.92%. The areas under the success and prediction rate curves for the Random Forest model were calculated to be 0.9805 and 0.9324, respectively, while the Logistic Regression model showed slightly lower predictive performance, at 0.9372 and 0.8903, respectively. Finally, a non-parametric analysis found the two models to be significantly different. Strong evidence of a linear relationship between the two models exists, with a p-value less than 0.0001 at a 95% confidence level and an R² value of 0.6993, indicating that 69.93% of the variability in the Logistic Regression model can be explained by variation in the Random Forest model.
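
As a rough illustration of two of the measures named in the abstract, the sketch below computes an area under the ROC curve and the R² of a simple linear regression between two models' susceptibility scores. It is a minimal sketch and not the authors' implementation: the helper names `auc` and `r_squared`, and the six toy scores, are assumptions introduced here for illustration and are not data from the study.

```python
from statistics import mean


def auc(scores, labels):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    Returns the fraction of (landslide, non-landslide) pairs in which the
    landslide site receives the higher score; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def r_squared(x, y):
    """Coefficient of determination of the least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    # For a single predictor, R^2 equals the squared Pearson correlation.
    return (sxy * sxy) / (sxx * syy)


# Hypothetical susceptibility scores for six sites (NOT data from the study).
labels = [1, 1, 1, 0, 0, 0]  # 1 = landslide, 0 = non-landslide
rf_scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]  # assumed Random Forest outputs
lr_scores = [0.7, 0.9, 0.6, 0.4, 0.3, 0.2]  # assumed Logistic Regression outputs

rf_auc = auc(rf_scores, labels)
lr_auc = auc(lr_scores, labels)
r2 = r_squared(rf_scores, lr_scores)
print(f"RF AUC={rf_auc:.4f}  LR AUC={lr_auc:.4f}  R^2={r2:.4f}")
```

Read this way, an R² of 0.6993 means that a straight line fitted between the two models' outputs accounts for 69.93% of the variance in the Logistic Regression scores. It does not mean the two models agree on 69.93% of the sites.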

Cite

CITATION STYLE

APA

Hong, H., Tsangaratos, P., Ilia, I., Chen, W., & Xu, C. (2017). Comparing the Performance of a Logistic Regression and a Random Forest Model in Landslide Susceptibility Assessments. the Case of Wuyaun Area, China. In Advancing Culture of Living with Landslides (pp. 1043–1050). Springer International Publishing. https://doi.org/10.1007/978-3-319-53498-5_118
