EGU23-733
https://doi.org/10.5194/egusphere-egu23-733
EGU General Assembly 2023
© Author(s) 2023. This work is distributed under
the Creative Commons Attribution 4.0 License.

Landslide Susceptibility Model based on Random Forest classification

Flavius Sirbu
  • Institute for Advanced Environmental Studies, West University of Timisoara, Romania, flavius.sirbu@e-uvt.ro

Random Forest (RF) is a classification algorithm that has been used successfully in geomorphological and hazard mapping (Sîrbu et al., 2019). It builds a defined number of decision trees, each trained on a random sample drawn with replacement from the original training data. Because of this, the algorithm is especially robust to errors and outliers in the training data, and it is well suited to producing uncertainty estimates that describe the variability of results for each classified feature. Its output can also be used, through different methods, to rank the independent variables used in the classification.
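The resampling idea described above can be illustrated with a minimal sketch in Python (standard library only). It is not the model used in this study, which was written in R. The toy one-feature "stump" stands in for a full decision tree, the random selection of variables at each split that a real RF performs is omitted, and the data values and function names are hypothetical.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) items from rows at random, with replacement."""
    return [rng.choice(rows) for _ in rows]

def fit_stump(sample):
    """Toy stand-in for a decision tree: one threshold on one feature,
    placed midway between the two class means of the sample."""
    pos = [x for x, y in sample if y == 1]
    neg = [x for x, y in sample if y == 0]
    if not pos or not neg:
        # Degenerate bootstrap sample: only one class was drawn.
        label = 1 if pos else 0
        return lambda x: label
    mean_pos = sum(pos) / len(pos)
    mean_neg = sum(neg) / len(neg)
    threshold = (mean_pos + mean_neg) / 2
    higher_is_positive = mean_pos >= mean_neg
    return lambda x: int((x > threshold) == higher_is_positive)

def vote_fraction(trees, x):
    """Share of trees voting for class 1; values near 0.5 signal
    disagreement between trees, i.e. higher uncertainty."""
    votes = Counter(tree(x) for tree in trees)
    return votes[1] / len(trees)

# Hypothetical training data: (standard deviation of slope, landslide yes/no).
train = [(2.0, 0), (3.0, 0), (4.0, 0), (5.0, 0),
         (8.0, 1), (9.0, 1), (10.0, 1), (11.0, 1)]

rng = random.Random(42)
trees = [fit_stump(bootstrap_sample(train, rng)) for _ in range(101)]

high = vote_fraction(trees, 12.0)  # far inside the landslide range
low = vote_fraction(trees, 1.0)    # far inside the non-landslide range
```

Because every tree sees a slightly different resample, a single outlier influences only some of the trees, and the spread of the votes gives a simple per-unit measure of how stable the classification is.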

The present study was performed on a provided data set from central Italy, containing 7,360 slope units that cover an area of 4,095 km². The slope units are classified twice, based on different methodologies, into units with or without landslides. Each slope unit also has 26 attributes, which were used as independent variables (Alvioli et al., 2022). The slope units are treated as spatially independent of each other and were randomly split 70%-30% into training and validation data, respectively.
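A random 70%-30% split of this kind can be sketched as follows in Python. This is only an illustration under stated assumptions: the slope units are represented by hypothetical integer identifiers, the function name and the seed are invented for the example, and the study's own R code may perform the split differently.

```python
import random

def split_train_validation(unit_ids, train_fraction=0.7, seed=0):
    """Shuffle the unit identifiers and cut them into two disjoint parts."""
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical identifiers for the 7,360 slope units.
unit_ids = range(7360)
train_ids, validation_ids = split_train_validation(unit_ids)
```

Treating the units as spatially independent is what makes a purely random split like this admissible; a spatially blocked split would be the alternative if that assumption were dropped.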

The model was set up as computer code in the R software environment. It uses different libraries to integrate the input data, run the algorithm, validate and measure the performance of the model, and finally produce the output data. Most model settings were kept at their default values; the number of classification trees (ntree) was the only important setting that was fine-tuned, to a value of 1501, based on different model runs.

The results of the two classifications (one for each classification of the dependent variable) are relatively similar, which again indicates the robustness of the RF algorithm to minor and medium changes in the input data. The first classification had an AUC (area under the curve) value of 0.829, compared with 0.817 for the second classification. For each classification, a ranking of the independent variables was produced, with the standard deviation of slope being the most important predictor. Other predictors with relatively high importance were elevation and curvatures.
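For readers unfamiliar with the AUC metric, the sketch below shows one way it can be computed in Python, assuming the curve in question is the ROC curve. In that case AUC equals the probability that a randomly chosen unit with landslides receives a higher predicted score than a randomly chosen unit without, with ties counted as one half. The scores and labels are hypothetical and are not results from this study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of
    positive and negative units (ties count as half a win)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one unit of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical predicted landslide probabilities for six validation units.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
example_auc = auc(scores, labels)
```

The pairwise form is quadratic in the number of units and is meant for clarity only; a rank-based implementation would be preferable for a data set with thousands of slope units.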

The results show that RF is an effective classifier, which can be used with relatively few custom settings and on almost any data set to produce a reliable susceptibility map. Its integration with the R software makes it easy to run the whole process virtually automatically. The computer code for the model will be made freely available.

How to cite: Sirbu, F.: Landslide Susceptibility Model based on Random Forest classification, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-733, https://doi.org/10.5194/egusphere-egu23-733, 2023.