Integrating XBoost and SHAP for Enhanced Interpretability in Landslide Susceptibility Assessment: A Case Study in North-western Peloponnese, Greece.
- 1National Technical University of Athens, School of Mining and Metallurgical Engineering, Greece (marisafrousiou@gmail.com)
- 2National Technical University of Athens, School of Mining and Metallurgical Engineering, Greece (gilia@metal.ntua.gr)
- 3National Technical University of Athens, School of Mining and Metallurgical Engineering, Greece (dimkasmas@gmail.com)
- 4National Technical University of Athens, School of Mining and Metallurgical Engineering, Greece (ivanapetropoulou@gmail.com)
Landslides, acknowledged as significant geohazards affecting both human infrastructure and the natural environment, have been the subject of intensive research aimed at pinpointing areas at risk of instability. This task involves the complex modelling of landslide-related variables, which requires both knowledge-based and data-driven methodologies. The challenge is heightened by the often intricate and obscure processes that trigger landslides, be they natural or anthropogenic. Over the past two decades, the application of artificial intelligence, specifically machine learning algorithms, has transformed landslide susceptibility evaluation. These methodologies, encompassing fuzzy logic, decision trees, artificial neural networks, ensemble methods, and evolutionary algorithms, have demonstrated notable accuracy and dependability. A significant recent development in this field is the incorporation of eXplainable AI (XAI) techniques into landslide susceptibility models. XAI tools, such as SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME), offer a window into the otherwise opaque decision-making processes of AI models, thus demystifying the "black box" aspect of conventional AI systems.
The primary aim of this study was to employ the XBoost algorithm together with SHAP methods for an in-depth landslide susceptibility assessment. The methodology was divided into five phases: (i) creating the inventory map, (ii) selecting, classifying, and weighting the landslide-influencing variables, (iii) conducting multicollinearity analysis, (iv) applying and testing the developed model, and (v) evaluating the predictive performance of various models and analyzing the results.
The computational work was performed in R and Python, while ArcGIS 10.5 was used to compile data and produce detailed landslide susceptibility maps. The approach was tested in the North-western Peloponnese region of Greece, known for its frequent landslide occurrences. Nine variables were considered: elevation, slope angle, aspect, plan curvature, profile curvature, distance to faults, distance to river networks, lithology cover, and hydrolithology cover. Together with the landslide locations, these were used to generate the training and test datasets. The Frequency Ratio method was applied to discern the correlation between each variable and landslide occurrence and to assign weight values to each class. Multicollinearity analysis further helped in identifying any collinearity among the variables.
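The class weighting step can be illustrated with a minimal sketch of the Frequency Ratio in its commonly used form: the share of landslide cells falling in a class divided by the share of the study area that the class occupies. The abstract does not give the authors' exact formulation or code, so the function name `frequency_ratio`, the slope classes, and the toy raster cells below are all hypothetical.

```python
from collections import Counter


def frequency_ratio(classes, landslide):
    """Return {class: FR}, assuming the common definition
    FR = (landslide cells in class / all landslide cells)
         / (cells in class / all cells).
    """
    if len(classes) != len(landslide):
        raise ValueError("classes and landslide must have the same length")
    total_cells = len(classes)
    total_slides = sum(landslide)
    if total_cells == 0 or total_slides == 0:
        raise ValueError("need at least one cell and one landslide cell")

    cells_per_class = Counter(classes)
    slides_per_class = Counter(c for c, y in zip(classes, landslide) if y)

    return {
        c: (slides_per_class[c] / total_slides) / (n / total_cells)
        for c, n in cells_per_class.items()
    }


# Hypothetical toy data: ten raster cells, each with a slope class and a
# landslide flag (1 = landslide present, 0 = absent).
slope_class = ["0-15"] * 6 + ["15-30"] * 3 + [">30"] * 1
landslide = [0, 0, 0, 0, 0, 1, 1, 1, 0, 1]

fr = frequency_ratio(slope_class, landslide)
for name, value in fr.items():
    print(f"{name}: FR = {value:.3f}")
```

An FR above 1 means a class holds more landslides than its areal share would suggest, and an FR below 1 means fewer. In this toy case the steepest class gets the largest weight.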
SHAP values were used to rank features according to their importance, offering a transparent view of variable contributions. The evaluation phase involved calculating the model's predictive power using metrics such as classification accuracy, sensitivity, specificity, and the area under the success and prediction rate curves (AUC). This approach, combining XBoost and SHAP methods, presents a refined model for understanding and predicting landslide susceptibility, aiming for more accurate and interpretable hazard assessments. The results highlight the high performance of the XBoost algorithm in terms of accuracy, sensitivity, specificity, and AUC values. The SHAP method indicates that slope angle was the most important feature in this model for landslide susceptibility. Other features, such as elevation, distance to river network, and lithology cover, also contribute to the model's predictions, though to a lesser extent and with more mixed effects. Aspect, profile curvature, plan curvature, distance to fault, and hydrolithology cover appear to have a more moderate or minimal impact on the model’s predictions.
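The evaluation metrics named above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' code: the labels, the susceptibility scores, and the 0.5 decision threshold are assumptions, and AUC is computed through its rank interpretation (the probability that a randomly chosen landslide cell scores higher than a randomly chosen non-landslide cell).

```python
def classification_metrics(y_true, scores, threshold=0.5):
    """Accuracy, sensitivity, and specificity at an assumed threshold."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Each (positive, negative) pair counts 1 if the positive scores higher,
    0.5 on a tie, and 0 otherwise.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = landslide, 0 = non-landslide, with model scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

metrics = classification_metrics(y_true, scores)
area = auc(y_true, scores)
print(metrics, area)
```

In landslide studies the success rate curve is usually computed on the training set and the prediction rate curve on the test set, so the same `auc` function would be applied to each subset separately.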
How to cite: Frousiou, M. S., Ilia, I., Kasmas, D., and Petropoulou, I.: Integrating XBoost and SHAP for Enhanced Interpretability in Landslide Susceptibility Assessment: A Case Study in North-western Peloponnese, Greece., EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-9683, https://doi.org/10.5194/egusphere-egu24-9683, 2024.