EGU24-15873, updated on 09 Mar 2024
https://doi.org/10.5194/egusphere-egu24-15873
EGU General Assembly 2024
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Machine-learning based feature selection for a regional flood damage model

Daniela Rodriguez Castro1, Kasra Rafiezadeh Shahi2, Nivedita Sairam2, Melanie Fischer2, Guilherme Samprogna Mohor3, Annegret Thieken3, Benjamin Dewals1, and Heidi Kreibich2
  • 1University of Liège, Civil and Environmental Engineering, Belgium (drodriguez@uliege.be)
  • 2Helmholtz Centre Potsdam German Research Centre for Geosciences GFZ, Section Hydrology, Telegrafenberg, 14473 Potsdam, Germany
  • 3Institute for Environmental Sciences and Geography, University Potsdam, Potsdam, Germany

After the 2021 floods in Europe, independent data collection initiatives were undertaken in the impacted areas of Belgium and Germany. The resulting datasets at residential building level contain valuable information on hazard characteristics, vulnerability of exposed assets, socio-economic factors and coping capacity of the inhabitants and the emergency services (i.e., emergency and precautionary measures). A transnational analysis of these datasets enhances our understanding of flood damage mechanisms.

The analysed data come from 420 and 609 standardized surveys of private households affected by the 2021 floods in Belgium and Germany, respectively. Of the German surveys, 277 were from Rhineland-Palatinate and 332 from North Rhine-Westphalia. A set of 64 potentially damage-influencing variables was harmonized across the datasets. In a first step, descriptive statistics of the selected variables were computed for three regions: the Vesdre valley in Belgium, the Ahr valley in Rhineland-Palatinate (Germany) and the affected regions in North Rhine-Westphalia (Germany).

In a second step, the most influential variables for predicting flood damage to residential buildings were identified by means of feature selection. Two linear approaches (multilinear regression with k-best predictors and Elastic Net regression) and two non-linear techniques (Random Forest and Conditional Inference Trees) were applied. Total building loss and total contents loss were used as target variables. Based on different evaluation metrics, the most important variables describing absolute building damage and absolute contents damage in the three analysed areas were identified.
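As an illustration of the k-best idea only, the following minimal sketch ranks candidate predictors by a univariate regression F-score and keeps the top k. It is not the authors' implementation: the data are synthetic, the variable names (`water_depth`, `building_area`, `noise_a`, `noise_b`) and the helper functions are hypothetical, and the abstract does not state which scoring function was actually used.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def f_score(r, n):
    """Univariate regression F-statistic from r, with 1 and n - 2 degrees of freedom."""
    r2 = min(r * r, 1.0 - 1e-12)  # guard against division by zero when |r| = 1
    return r2 / (1.0 - r2) * (n - 2)


def select_k_best(features, target, k):
    """Return the k feature names with the highest F-score, plus all scores."""
    n = len(target)
    scores = {
        name: f_score(pearson_r(column, target), n)
        for name, column in features.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Synthetic stand-in for survey data (assumption: NOT the real survey variables).
rng = random.Random(0)
n_households = 200
features = {
    "water_depth": [rng.uniform(0, 3) for _ in range(n_households)],
    "building_area": [rng.uniform(50, 250) for _ in range(n_households)],
    "noise_a": [rng.gauss(0, 1) for _ in range(n_households)],
    "noise_b": [rng.gauss(0, 1) for _ in range(n_households)],
}
# Toy target: loss driven by depth and area plus random scatter.
building_loss = [
    40.0 * depth + 0.4 * area + rng.gauss(0, 5)
    for depth, area in zip(features["water_depth"], features["building_area"])
]

selected, scores = select_k_best(features, building_loss, k=2)
print(selected)
```

A univariate score of this kind ignores interactions between predictors, which is one reason to compare it with multivariate and non-linear methods such as Elastic Net or Random Forest.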

Commonalities and differences in flood characteristics and damage in the three regions will be presented and interpreted in detail.

How to cite: Rodriguez Castro, D., Rafiezadeh Shahi, K., Sairam, N., Fischer, M., Samprogna Mohor, G., Thieken, A., Dewals, B., and Kreibich, H.: Machine-learning based feature selection for a regional flood damage model, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-15873, https://doi.org/10.5194/egusphere-egu24-15873, 2024.