EGU23-16626, updated on 26 Feb 2023
EGU General Assembly 2023
© Author(s) 2023. This work is distributed under
the Creative Commons Attribution 4.0 License.

Danish national early warning system for flash floods based on a gradient boosting machine learning framework

Grith Martinsen1, Yann Sweeney2, Jonas Wied Pedersen1,3, Roxana Alexandru2, Sergi Capape2, Charlotte Harris2, Michael Butts1, and Maria Diaz2
  • 1Danish Meteorological Institute, Flooding and Hydrology, Lyngbyvej 100, Copenhagen, Denmark
  • 2Faculty, 160 Old Street, London, Great Britain
  • 3DTU Sustain, Technical University of Denmark, Anker Engelunds Vej 1, Kgs. Lyngby, Denmark

Fluvial and flash floods can have devastating effects if they occur without warning. In Denmark, managing flood risk and carrying out preventative emergency service actions have been the sole responsibility of local municipalities. However, motivated by the disastrous 2021 floods in Central Europe, the Danish government has recently appointed the Danish Meteorological Institute (DMI) as the national authority for flood warnings in Denmark, and DMI is in the process of building capacity to fulfill this role.


One of the most cost-effective ways to mitigate flood damage is a well-functioning early warning system. Flood warning systems can rely on various methods, ranging from human interpretation of meteorological and hydrological data to advanced hydrological modelling. The aim of this study is to generate short-range streamflow predictions in Danish river systems with lead times of 4-12 hours. To do so, we train and test models with hourly data from 172 catchments.
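
A minimal sketch of how such a forecasting task can be framed as a supervised learning problem, assuming lagged hourly streamflow and precipitation as inputs and the streamflow a fixed number of hours ahead as the target. The function name `make_supervised` and the parameters `n_lags` and `lead` are hypothetical and chosen for illustration only; the abstract does not state how the training samples were actually assembled.

```python
import numpy as np


def make_supervised(flow, precip, n_lags, lead):
    """Build (X, y) pairs from two aligned hourly series.

    For each issue time t, the feature row holds the last `n_lags`
    streamflow values followed by the last `n_lags` precipitation
    values (both up to and including t). The target is the streamflow
    `lead` hours after t. All names here are illustrative assumptions.
    """
    flow = np.asarray(flow, dtype=float)
    precip = np.asarray(precip, dtype=float)
    if flow.shape != precip.shape or flow.ndim != 1:
        raise ValueError("flow and precip must be 1-D series of equal length")
    if n_lags < 1 or lead < 1:
        raise ValueError("n_lags and lead must be at least 1")

    # Issue times that have a full lag window behind them and a target ahead.
    issue_times = np.arange(n_lags - 1, len(flow) - lead)
    rows = [
        np.concatenate((flow[t - n_lags + 1 : t + 1],
                        precip[t - n_lags + 1 : t + 1]))
        for t in issue_times
    ]
    X = np.array(rows)
    y = flow[issue_times + lead]
    return X, y, issue_times


if __name__ == "__main__":
    # Synthetic series, purely to show the shapes involved.
    hours = np.arange(48, dtype=float)
    flow = 5.0 + np.sin(hours / 6.0)
    precip = np.where(hours % 12 == 0, 2.0, 0.0)
    X, y, t = make_supervised(flow, precip, n_lags=6, lead=4)
    print(X.shape, y.shape)
```

One such data set would be built per lead time (for example 4, 8 and 12 hours), so that each model only sees information available at the time the forecast is issued.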


Machine learning (ML) models have in many cases been shown to outperform traditional hydrological models and offer efficient ways to learn patterns in historical data. Here, we investigate streamflow predictions with LightGBM, a gradient boosting framework that employs tree-based ML algorithms and is developed and maintained by Microsoft (Ke et al., 2017). The main argument for choosing a tree-based algorithm is its inherent ability to represent the rapid dynamics often observed during flash floods. The main advantages of LightGBM over other tree-based algorithms are efficient training and lower memory consumption. We benchmark LightGBM’s performance against persistence, linear regression, and various LSTM setups from the NeuralHydrology library (Kratzert et al., 2022).
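
The persistence benchmark is the simplest of the baselines named above: the forecast for any lead time is the most recent observation. The sketch below shows one way to score it, assuming an hourly series without gaps. The function names and the use of mean absolute error are illustrative assumptions, not the evaluation setup of the study.

```python
import numpy as np


def persistence_forecast(flow, lead):
    """Return (predictions, targets) for a persistence forecast.

    The prediction for time t + lead is the observed flow at time t,
    so both arrays have length len(flow) - lead.
    """
    flow = np.asarray(flow, dtype=float)
    if lead < 1 or lead >= len(flow):
        raise ValueError("lead must be between 1 and len(flow) - 1")
    predictions = flow[:-lead]   # value known at issue time t
    targets = flow[lead:]        # value observed at t + lead
    return predictions, targets


def mean_absolute_error(pred, obs):
    """Average absolute difference between forecast and observation."""
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.mean(np.abs(pred - obs)))


if __name__ == "__main__":
    # Synthetic rising limb: persistence lags further behind as lead grows.
    flow = np.array([1.0, 2.0, 4.0, 8.0, 16.0, 12.0, 9.0, 7.0])
    for lead in (1, 2, 4):
        pred, obs = persistence_forecast(flow, lead)
        print(lead, mean_absolute_error(pred, obs))
```

Because persistence always reproduces a flood peak exactly `lead` hours late, it is a natural reference for judging peak timing.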


We evaluate the algorithm trained with different sets of input features. This analysis includes model explainability methods, such as SHAP, and the results indicate that simply using lagged real-time observations of streamflow together with precipitation leads to the best-performing and most parsimonious models. The results show that the LightGBM setup outperforms the benchmarks and generates predictions with high Kling-Gupta Efficiency scores (> 0.9) in most catchments. Compared to the persistence benchmark, it shows especially strong improvements in peak timing errors.
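
For reference, the Kling-Gupta Efficiency combines correlation, variability ratio and bias ratio into a single score, with 1 indicating a perfect match. The sketch below assumes the original 2009 formulation, KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2); the abstract does not state which variant was used, so this choice is an assumption.

```python
import numpy as np


def kling_gupta_efficiency(sim, obs):
    """Kling-Gupta Efficiency in its original (2009) form.

    r     : linear correlation between simulated and observed flow
    alpha : ratio of standard deviations (sim / obs)
    beta  : ratio of means (sim / obs)
    """
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    if sim.shape != obs.shape or sim.ndim != 1:
        raise ValueError("sim and obs must be 1-D series of equal length")

    r = np.corrcoef(sim, obs)[0, 1]
    alpha = sim.std() / obs.std()
    beta = sim.mean() / obs.mean()
    return float(1.0 - np.sqrt((r - 1.0) ** 2
                               + (alpha - 1.0) ** 2
                               + (beta - 1.0) ** 2))


if __name__ == "__main__":
    obs = np.array([1.0, 2.0, 3.0])
    print(kling_gupta_efficiency(obs, obs))        # perfect simulation
    print(kling_gupta_efficiency(obs + 1.0, obs))  # constant positive bias
```

A simulation with perfect timing and variability but a constant offset is penalised only through the bias term, which makes the three components useful for diagnosing where a model falls short.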

How to cite: Martinsen, G., Sweeney, Y., Pedersen, J. W., Alexandru, R., Capape, S., Harris, C., Butts, M., and Diaz, M.: Danish national early warning system for flash floods based on a gradient boosting machine learning framework, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-16626, 2023.