EGU26-17335, updated on 14 Mar 2026
https://doi.org/10.5194/egusphere-egu26-17335
EGU General Assembly 2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
Oral | Thursday, 07 May, 11:45–11:55 (CEST)
 
Room 3.29/30
A Comparison of Data-Driven Regionalization Frameworks for Large-Scale Hydrological Modelling
Peter Salamon1, Olivier Chalifour1,4, Carlo Russo2, Stefania Grimaldi1, and Maria Luisa Taccari3
  • 1European Commission, Joint Research Centre, Ispra, Italy
  • 2UniSystems, Milan, Italy
  • 3European Centre for Medium-Range Weather Forecasts (ECMWF), Reading, UK
  • 4Concordia University, Montreal, Canada

Parameter regionalization remains a major challenge for large-scale hydrological modeling, especially in data-scarce regions where direct calibration against streamflow observations is not feasible. In this study, we compare two data-driven regionalization frameworks with a classical approach based on the physical and climatic proximity of catchments. The first framework couples an emulator of the modified Kling–Gupta efficiency (KGE′) with an evolutionary calibration framework built on DEAP (Distributed Evolutionary Algorithms in Python). A deep neural network (DNN) and a random forest (RF) are trained on the most recent calibration dataset of the Copernicus Emergency Management Service Global Flood Awareness System (CEMS GloFASv5), which includes over 5000 catchments spanning a wide range of hydroclimatic, physiographic and land use conditions. Static catchment attributes and long-term climatic descriptors serve as predictors, and the target variable is the KGE′ obtained from an extensive parameter history generated by DEAP-based evolutionary calibration of the hydrological model OS LISFLOOD. To promote robust generalization across climates, we split the dataset into training, validation, and testing subsets using a climate-stratified sampling strategy that preserves key indicators such as aridity, mean precipitation, and precipitation seasonality. Once trained, the emulator is embedded within an evolutionary algorithm to identify parameter sets that maximize the emulated KGE′ for target catchments, thereby avoiding repeated hydrological simulations. The second framework uses a surrogate modeling approach that combines an LSTM-based emulator of OS LISFLOOD with a reinforcement learning-driven regionalization strategy. The surrogate model is trained to reproduce OS LISFLOOD's dynamic behavior, while a separate LSTM agent explores the parameter space and iteratively proposes parameter sets. This exploration is guided by a reward function based on the KGE′ computed from the surrogate model outputs, which enables efficient parameter optimization without direct hydrological simulations. The transfer and optimization of parameters are governed by similarities learned implicitly in a latent feature space. Both data-driven approaches are evaluated by running OS LISFLOOD with the inferred parameters and comparing the resulting KGE′ with that achieved by the conventional regionalization method, which relies on explicit physical or geographical distance metrics. Preliminary results suggest that both data-driven methods reproduce large-scale spatial patterns of model performance and yield KGE′ values comparable to those obtained with the classical approach. Although performance is currently similar to that of existing methodologies, the results suggest that emulator-based optimization and surrogate modeling are viable alternatives for large-scale regionalization, with potential for further refinement.
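Since the modified KGE serves as emulator target, reward signal and evaluation metric, a minimal sketch of how such a score can be computed may be useful. The abstract does not state which modification is used; the sketch below assumes the common form of Kling et al. (2012), KGE′ = 1 − sqrt((r − 1)² + (β − 1)² + (γ − 1)²), where r is the linear correlation, β the ratio of mean flows and γ the ratio of coefficients of variation. The function name `modified_kge` and its arguments are illustrative and not taken from OS LISFLOOD or the authors' code.

```python
import math

def modified_kge(sim, obs):
    """Modified Kling-Gupta efficiency, ASSUMING the Kling et al. (2012) form.

    sim, obs: equal-length sequences of simulated and observed discharge.
    Assumes non-constant series with non-zero means.
    """
    # Keep only time steps where both values are present
    pairs = [(s, o) for s, o in zip(sim, obs)
             if not (math.isnan(s) or math.isnan(o))]
    n = len(pairs)
    mean_s = sum(s for s, _ in pairs) / n
    mean_o = sum(o for _, o in pairs) / n
    sd_s = math.sqrt(sum((s - mean_s) ** 2 for s, _ in pairs) / n)
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for _, o in pairs) / n)
    cov = sum((s - mean_s) * (o - mean_o) for s, o in pairs) / n

    r = cov / (sd_s * sd_o)                    # linear correlation
    beta = mean_s / mean_o                     # bias ratio
    gamma = (sd_s / mean_s) / (sd_o / mean_o)  # variability ratio (CV_sim / CV_obs)
    return 1.0 - math.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)

observed = [1.0, 2.0, 3.0, 4.0, 5.0]
perfect = modified_kge(observed, observed)
doubled = modified_kge([2.0 * q for q in observed], observed)
```

Under this assumed definition a simulation identical to the observations scores 1. Doubling every simulated value leaves r and γ at 1 but gives β = 2, so the score drops to 0. This illustrates that the variability term in this form is insensitive to pure volume bias.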

How to cite: Salamon, P., Chalifour, O., Russo, C., Grimaldi, S., and Taccari, M. L.: A Comparison of Data-Driven Regionalization Frameworks for Large-Scale Hydrological Modelling, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17335, https://doi.org/10.5194/egusphere-egu26-17335, 2026.