- ¹Department of Earth Sciences, University of Florence, Florence, Italy (rajendranshobha.ajin@unifi.it, alessio.gatto@unifi.it, nicola.nocentini@unifi.it, riccardo.fanti@unifi.it)
- ²Department of General and Historical Geology, Institute of Geology, Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
- ³Department of Geosciences, University of Cincinnati, Cincinnati, United States of America
The Carpathian region of Ukraine is highly prone to landslides owing to its complex geology, steep and rugged topography, high precipitation, and human-induced changes in land use. This study employed the CatBoost algorithm to evaluate landslide susceptibility in the Transcarpathian region (Zakarpattia Oblast) of Ukraine in two phases, followed by a performance comparison. A landslide inventory of 697 recorded landslides was used, with a 70:30 data split. In the first phase, ten predisposing factors were considered, and multicollinearity was assessed using Variance Inflation Factor (VIF) values to check that no strongly correlated factors were present. The model was then trained and its performance evaluated.
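The VIF screening described above can be illustrated with a minimal sketch. It assumes NumPy is available and relies on the standard identity that the VIF of factor j equals the j-th diagonal entry of the inverse correlation matrix. The factor names (`slope`, `rainfall`, `roughness`) and the synthetic data are hypothetical; the abstract does not list the ten factors or describe the software used.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return the VIF of each column of X (rows = samples, columns = factors)."""
    corr = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    # VIF_j = 1 / (1 - R_j^2), which equals the j-th diagonal entry
    # of the inverse of the predictor correlation matrix.
    return np.diag(np.linalg.inv(corr))

# Synthetic, hypothetical factors (not data from the study)
rng = np.random.default_rng(0)
slope = rng.normal(size=500)
rainfall = rng.normal(size=500)
# Built to be strongly correlated with slope, so its VIF should be large
roughness = slope + 0.1 * rng.normal(size=500)

X = np.column_stack([slope, rainfall, roughness])
vif = variance_inflation_factors(X)

# Factors with VIF below 10 (the threshold named in the abstract) pass the check
passes = vif < 10
for name, value, ok in zip(["slope", "rainfall", "roughness"], vif, passes):
    print(f"{name}: VIF = {value:.2f}, passes = {ok}")
```

In this toy case the two deliberately collinear factors exceed the threshold while the independent one does not; in the study itself, all ten factors stayed below 10.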
In the second phase, the Boruta feature selection algorithm was applied to eliminate irrelevant factors. The CatBoost model was then retrained on the retained factors, and its predictive performance was assessed. Finally, the two models were compared to quantify how performance changed after applying the Boruta algorithm. Model performance was evaluated using the Receiver Operating Characteristic (ROC) curve and additional metrics: Accuracy, F1-score, Precision, and Recall.
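For reference, the four threshold-based metrics named above follow directly from confusion-matrix counts. The sketch below uses the standard definitions; the function name and the example counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute Accuracy, Precision, Recall and F1-score from confusion-matrix counts.

    Assumes at least one predicted positive and one actual positive,
    so that no denominator is zero.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # share of predicted landslides that are real
    recall = tp / (tp + fn)      # share of real landslides that are detected
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts for illustration only
metrics = classification_metrics(tp=30, fp=10, fn=20, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```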
All ten factors yielded VIF values below the threshold of 10 and were therefore retained for modelling. Before the Boruta algorithm was applied, the model performed poorly, with an area under the ROC curve (AUC) of 0.644 (64.4%), an Accuracy of 0.600, an F1-score of 0.643, a Precision of 0.614, and a Recall of 0.674. Boruta rejected four predisposing factors as irrelevant, leaving six factors for the subsequent analysis. After feature selection, the model achieved a fair AUC of 0.731 (73.1%), an Accuracy of 0.683, an F1-score of 0.725, a Precision of 0.676, and a Recall of 0.781. This corresponds to an improvement of 0.087 (8.7 percentage points) in AUC, 0.083 in Accuracy, 0.082 in F1-score, 0.062 in Precision, and 0.107 in Recall.
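The reported gains can be reproduced from the before and after scores with simple arithmetic. The short sketch below does so, and additionally checks that each reported F1-score matches the harmonic mean of the reported Precision and Recall. The dictionary and function names are illustrative; the numeric values are those stated in the abstract.

```python
# Scores reported in the abstract, before and after Boruta feature selection
before = {"AUC": 0.644, "Accuracy": 0.600, "F1": 0.643, "Precision": 0.614, "Recall": 0.674}
after = {"AUC": 0.731, "Accuracy": 0.683, "F1": 0.725, "Precision": 0.676, "Recall": 0.781}

# Absolute gain per metric, rounded to the three decimals used in the abstract
gains = {name: round(after[name] - before[name], 3) for name in before}

def f1_from(precision, recall):
    """F1-score as the harmonic mean of Precision and Recall."""
    return 2 * precision * recall / (precision + recall)

# Consistency check: recompute F1 from the reported Precision and Recall
f1_check_before = round(f1_from(before["Precision"], before["Recall"]), 3)
f1_check_after = round(f1_from(after["Precision"], after["Recall"]), 3)

print(gains)
print(f1_check_before, f1_check_after)
```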
Despite this improvement, the model did not achieve high evaluation scores. A possible reason is the limited quality of the input data, which ongoing research is attempting to address by refining datasets and updating landslide inventories. Nevertheless, the improvement underlines the importance of selecting relevant factors carefully in order to generate robust outputs. Moreover, applying machine learning techniques in the Transcarpathian region, where methodological advances have so far been limited, represents an important step for landslide risk management in Ukraine. The insights from this modelling serve as a preliminary step towards the future design of regional-scale early warning systems.
How to cite: Ajin, R. S., Gatto, A., Nocentini, N., Hadiatska, K., Ivanik, O., Kravchenko, D., Petrushenko, E., and Fanti, R.: Enhancing landslide susceptibility modelling through feature selection: A machine learning approach in the Ukrainian Carpathians, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16568, https://doi.org/10.5194/egusphere-egu26-16568, 2026.