EGU25-18240, updated on 15 Mar 2025
https://doi.org/10.5194/egusphere-egu25-18240
EGU General Assembly 2025
© Author(s) 2025. This work is distributed under the Creative Commons Attribution 4.0 License.
Oral | Tuesday, 29 Apr, 08:55–09:05 (CEST) | Room N2
Flood damage in the residential sector: on the value of transnational datasets for robust feature selection
Maria Paula Avila1, Daniela Rodriguez Castro1, Thijs Endendijk2, Dillenardt Lisa3, Guntu Ravikumar4, Sébastien Erpicum1, Annegret Thieken3, Jeroen Aerts2, Kreibich Heidi4, and Dewals Benjamin1
  • 1Liège, Civil and Environmental Engineering, Belgium (mpavila@uliege.be)
  • 2Vrije Universiteit Amsterdam, Institute for Environmental Studies (IVM), Environmental Economics, Netherlands
  • 3Institute of Environmental Science and Geography, University of Potsdam, Potsdam, Germany
  • 4Helmholtz Centre Potsdam German Research Centre for Geosciences GFZ, Section Hydrology, Telegrafenberg, 14473 Potsdam, Germany

Feature selection is an essential step in the development of empirical flood damage models based on machine learning techniques. So far, most models of this type have been developed using data from a single region or country, and few of them use harmonized transboundary datasets. Here, we harmonized 38 variables present in the datasets of three flood damage surveys conducted in Germany (n = 516), the Netherlands (n = 409) and Belgium (n = 320) after the 2021 mega-floods in Europe. After performing data imputation and a multicollinearity check, we used linear and non-linear machine learning algorithms to assess permutation importance and identify the features that most influence flood damage. The results of the four models suggest that, besides hazard variables such as water depth and human stability, the location of the heating system (in the basement or on a higher floor) is among the most important features for both building and contents damage.
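
As an illustration of the permutation importance idea mentioned above, the following minimal sketch shuffles one feature at a time and records how much the prediction error increases. It is not the authors' pipeline: the data are synthetic, the feature names (`water_depth_m`, `heating_in_basement`, `unrelated_noise`) are hypothetical, and `toy_model` is a hand-written stand-in for a fitted linear or non-linear model.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between two equally long sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE when a single feature column is shuffled.

    `predict` maps one row (list of floats) to a prediction, so any
    fitted model can be plugged in; nothing here is model-specific.
    """
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and y
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = mse(y, [predict(row) for row in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Synthetic data only (assumption): no survey values are reproduced here.
FEATURES = ["water_depth_m", "heating_in_basement", "unrelated_noise"]
data_rng = random.Random(42)
X = [
    [data_rng.uniform(0.0, 2.0), float(data_rng.random() < 0.5), data_rng.gauss(0.0, 1.0)]
    for _ in range(300)
]


def toy_model(row):
    """Hypothetical stand-in for a fitted model; ignores the third feature."""
    depth, heating, _noise = row
    return 0.3 * depth + 0.1 * heating


# Synthetic relative damage: model output plus small random scatter.
y = [toy_model(row) + data_rng.gauss(0.0, 0.02) for row in X]

importances = permutation_importance(toy_model, X, y)
ranking = sorted(zip(FEATURES, importances), key=lambda kv: kv[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.4f}")
```

Because the stand-in model never reads the third feature, shuffling it leaves the error unchanged and its importance is zero, whereas shuffling water depth degrades the predictions most. In practice the importance would be evaluated on held-out data for each fitted model.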

Subsequently, we analysed low and high ranges of water depth separately, using the median value (0.6 m) as the splitting criterion. In the lower range, water depth appears to be the dominant driver for both types of damage; for building damage specifically, its importance far exceeds that of any other variable. In contrast, for water depths above 0.6 m, other factors outweigh water depth. For contents damage, building footprint area becomes the most important factor across all models. For building damage, some hazard (e.g. human stability), exposure (e.g. building size) and vulnerability (e.g. hazard knowledge) variables have an importance comparable to that of water depth. Hence, our results indicate that multivariable models appear particularly necessary for modelling flood damage induced by high and extreme hazard conditions.
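
The median split described above can be sketched as follows. This is an assumption-laden illustration rather than the authors' code: the records are synthetic, so the computed threshold will differ from the 0.6 m found in the survey data, and the tuple layout `(water_depth_m, relative_damage)` is hypothetical.

```python
import random
from statistics import median

# Synthetic records (assumption): (water_depth_m, relative_damage) pairs.
rng = random.Random(0)
records = [(round(rng.uniform(0.05, 2.5), 2), rng.random()) for _ in range(200)]

# Use the median water depth as the splitting criterion.
threshold = median(depth for depth, _ in records)

# Records at or below the median form the low range, the rest the high range.
low = [r for r in records if r[0] <= threshold]
high = [r for r in records if r[0] > threshold]

print(f"median water depth: {threshold:.2f} m")
print(f"low range: {len(low)} records, high range: {len(high)} records")
```

Feature importance would then be assessed separately on `low` and `high`, which is how a driver that dominates in one range can be outweighed by other variables in the other.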

How to cite: Avila, M. P., Rodriguez Castro, D., Endendijk, T., Lisa, D., Ravikumar, G., Erpicum, S., Thieken, A., Aerts, J., Heidi, K., and Benjamin, D.: Flood damage in the residential sector: on the value of transnational datasets for robust feature selection, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-18240, https://doi.org/10.5194/egusphere-egu25-18240, 2025.