Using machine learning for defining distributed monitoring variables correlated to the occurrence of rainfall-induced shallow landslides and debris flows: a case study in Campania region, Italy
- 1University of Salerno, Department of Civil Engineering, Italy
- 2ICAR-CNR, Italy
- 3Fondazione CMCC, Italy
Rainfall-induced shallow landslides and debris flows often cause casualties and significant damage to property. Territorial landslide early warning systems are recognized as an important countermeasure to avoid or reduce fatalities during rainfall events. A reliable warning model is a key component of these systems. Warning models operating over large areas usually relate the occurrence of landslides to rainfall monitoring data by adopting appropriate thresholds (e.g., intensity-duration, cumulated rainfall-duration, hourly/daily rainfall indicators). The increasing availability of large sets of atmospheric and land monitoring data represents an opportunity to upgrade and improve existing landslide warning models. At the same time, treating such data appropriately may pose a significant challenge to analysts who are used to dealing with much smaller amounts of data.
The objective of this preliminary study is to demonstrate that machine learning techniques can be effectively used to process monitoring data at regional scale, with the aim of defining and selecting the variables that best correlate with the initiation of shallow landslides and debris flows. The machine learning models have been tested in one of the warning zones defined by the regional civil protection agency for hydrogeological risk management in Campania (Italy). Two categories of data are used for the analyses: distributed monitoring data and a landslide inventory. The monitoring variables are derived from the fifth generation of ECMWF atmospheric reanalysis (ERA5), available with a spatial resolution of about 31 km and a temporal resolution of 1 h (http://dx.doi.org/10.24381/cds.adbb2d47). Data on landslide events come from “FraneItalia”, a geo-referenced, openly available catalogue of Italian landslides created by consulting online news from 2010 onwards (http://dx.doi.org/10.17632/zygb8jygrw.2). Different machine learning models have been defined, trained, and tested to relate the occurrence of landslides in the case study area to multiple variables arising from different combinations of the adopted monitoring data, mainly rainfall and soil water content. The performance of these models is evaluated by means of standard contingency tables and skill scores. The best performing variables are used to define an optimal multivariate threshold to be adopted in the landslide warning model. The results of the optimal model are also compared with the outcomes of an application of a more classical exceedance probability statistical methodology based on cumulated rainfall-duration thresholds.
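As an illustration of the evaluation step, the following minimal Python sketch shows how a contingency table and a few common skill scores could be computed for a warning threshold applied to a single monitoring variable. It is not the procedure used in the study: the function names, the single-variable threshold, the choice of scores (probability of detection, probability of false detection, false alarm ratio, true skill statistic), and the convention of returning 0 for undefined ratios are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Contingency:
    hits: int               # warning issued, landslide reported
    misses: int             # no warning, landslide reported
    false_alarms: int       # warning issued, no landslide reported
    correct_negatives: int  # no warning, no landslide reported


def contingency(values, events, threshold):
    """Count the four outcomes when a warning is issued for value >= threshold."""
    c = Contingency(0, 0, 0, 0)
    for value, event in zip(values, events):
        warning = value >= threshold
        if warning and event:
            c.hits += 1
        elif not warning and event:
            c.misses += 1
        elif warning and not event:
            c.false_alarms += 1
        else:
            c.correct_negatives += 1
    return c


def _ratio(numerator, denominator):
    # Assumption for this sketch: an undefined ratio (0/0) is reported as 0.0.
    return numerator / denominator if denominator else 0.0


def skill_scores(c):
    """Return a dictionary of skill scores derived from a contingency table."""
    pod = _ratio(c.hits, c.hits + c.misses)                             # probability of detection
    pofd = _ratio(c.false_alarms, c.false_alarms + c.correct_negatives) # probability of false detection
    far = _ratio(c.false_alarms, c.hits + c.false_alarms)               # false alarm ratio
    return {"POD": pod, "POFD": pofd, "FAR": far, "TSS": pod - pofd}


def best_threshold(values, events, candidates):
    """Pick the candidate threshold with the highest true skill statistic."""
    return max(
        candidates,
        key=lambda t: skill_scores(contingency(values, events, t))["TSS"],
    )
```

Here `values` would hold one monitoring variable per time step (for instance a cumulated rainfall amount) and `events` a matching sequence of 0/1 flags from the landslide inventory. A multivariate threshold, as pursued in the study, would replace the single comparison `value >= threshold` with a rule combining several variables.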
How to cite: Calvello, M., Pecoraro, G., Esposito, M., Pota, M., Rianna, G., and Reder, A.: Using machine learning for defining distributed monitoring variables correlated to the occurrence of rainfall-induced shallow landslides and debris flows: a case study in Campania region, Italy, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-5272, https://doi.org/10.5194/egusphere-egu22-5272, 2022.