EGU26-13767, updated on 14 Mar 2026
https://doi.org/10.5194/egusphere-egu26-13767
EGU General Assembly 2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
Poster | Wednesday, 06 May, 08:30–10:15 (CEST), Display time Wednesday, 06 May, 08:30–12:30 | Hall A, A.48
Machine Learning Emulator for Large-Sample Hydrologic Model Calibration across Multiple FUSE Structures
Shadi Hatami (1), Nicolás Vásquez (1), Cyril Thébault (1), Wouter Knoben (1), Darri Eythorsson (1), Simon Michael Papalexiou (2,1), and Martyn Clark (1)
  • (1) Department of Civil Engineering, Schulich School of Engineering, University of Calgary, Calgary, Canada
  • (2) Institute of Global Water Security, Hamburg University of Technology (TUHH), Hamburg, Germany

Large-sample hydrologic studies often require calibrating multiple model structures across numerous catchments, which can be computationally intensive with traditional optimization algorithms. Recent advances in machine learning (ML) offer an alternative: computationally frugal calibration strategies that rely on model emulators. Such approaches leverage information across sites, enabling improved calibration efficiency and parameter transferability to unseen catchments. However, emulator-based exploration of the parameter space is challenging because of emulator error and the high dimensionality of that space. In this work, we investigate ML-based emulation and optimization strategies designed to improve parameter-space exploration, with the broader goal of supporting reproducible and computationally feasible large-sample hydrologic simulation. To this end, we use the Framework for Understanding Structural Errors (FUSE), which systematically represents alternative process formulations through multiple model configurations. We apply the framework to calibrate FUSE for 1,070 catchments across North America that span a wide range of hydroclimatic conditions. We develop Random Forest (RF) and Quantile Random Forest (QRF) emulators that approximate how the Kling–Gupta Efficiency (KGE) depends on model parameters and catchment attributes. RF provides point estimates, whereas QRF captures predictive uncertainty through conditional quantiles. These emulators are integrated into two calibration strategies: (1) a standard Genetic Algorithm (GA) that efficiently searches for high-performing parameter sets, and (2) a two-step hybrid optimizer that first performs a broad global search using Markov chain Monte Carlo sampling and then refines promising solutions using local gradient-based optimization.
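For readers unfamiliar with the emulator's target metric, the following is a minimal sketch of the KGE computation. It assumes the original 2009 formulation (correlation, variability ratio, and bias ratio); the abstract does not state which KGE variant is used, and the function name `kge` is chosen here for illustration only.

```python
import math


def kge(sim, obs):
    """Kling-Gupta Efficiency, 2009 formulation (assumed variant).

    KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2), where
    r is the linear correlation, alpha = sd(sim) / sd(obs), and
    beta = mean(sim) / mean(obs).
    """
    n = len(obs)
    if n != len(sim) or n < 2:
        raise ValueError("sim and obs must have equal length of at least 2")

    mean_sim = sum(sim) / n
    mean_obs = sum(obs) / n

    # Population standard deviations and covariance.
    sd_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in sim) / n)
    sd_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / n)
    if sd_sim == 0.0 or sd_obs == 0.0 or mean_obs == 0.0:
        raise ValueError("KGE is undefined for constant series or zero-mean observations")
    cov = sum((s - mean_sim) * (o - mean_obs) for s, o in zip(sim, obs)) / n

    r = cov / (sd_sim * sd_obs)   # correlation component
    alpha = sd_sim / sd_obs       # variability component
    beta = mean_sim / mean_obs    # bias component
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)
```

A perfect simulation gives a KGE of 1. In an emulator setting, each training row would pair one parameter set and the catchment's attributes with the KGE obtained from the corresponding model run.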
By exploring the parameter space more fully and avoiding premature convergence, the two-step strategy captures a more diverse ensemble of near-optimal parameter solutions. This diversity is particularly valuable for emulator-based calibration, as it allows the emulator to be retrained iteratively on a broader range of the parameter space, improving robustness and reducing reliance on narrowly sampled regions. These improvements are expected to support more stable parameter estimates and improved hydrologic simulations across a large sample of catchments. Overall, this hybrid framework enables reproducible and computationally efficient calibration across multiple model structures and more than a thousand catchments, providing a scalable pathway for integrating ML emulators into large-sample hydrologic modeling workflows.
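The two-step idea described above can be sketched as follows. This is an illustrative toy, not the authors' implementation: `toy_emulator` is a hypothetical smooth function standing in for an emulator's predicted KGE, parameters are assumed to be scaled to the unit hypercube, the global step is a plain random-walk Metropolis sampler, and the local step uses finite-difference gradient ascent. All function names, step sizes, and the temperature are assumptions. A tree-based emulator is piecewise constant, so how the local step obtains gradients in that case is not specified in the abstract.

```python
import math
import random


def toy_emulator(theta):
    """Hypothetical smooth stand-in for an emulator's predicted KGE."""
    optimum = (0.3, 0.7)
    return 1.0 - sum((t - c) ** 2 for t, c in zip(theta, optimum))


def metropolis_search(score, dim, n_steps=2000, step=0.1, temp=0.05, seed=0):
    """Global step: random-walk Metropolis over the unit hypercube.

    Returns every accepted state as a (score, parameters) pair.
    """
    rng = random.Random(seed)
    current = [rng.random() for _ in range(dim)]
    current_score = score(current)
    visited = [(current_score, current)]
    for _ in range(n_steps):
        proposal = [t + rng.gauss(0.0, step) for t in current]
        if any(t < 0.0 or t > 1.0 for t in proposal):
            continue  # proposals outside the bounds are rejected
        proposal_score = score(proposal)
        # Accept with probability min(1, exp(delta / temp)).
        delta = (proposal_score - current_score) / temp
        if rng.random() < math.exp(min(0.0, delta)):
            current, current_score = proposal, proposal_score
            visited.append((current_score, current))
    return visited


def refine(score, theta, lr=0.1, n_iter=200, eps=1e-5):
    """Local step: gradient ascent with central finite differences."""
    theta = list(theta)
    for _ in range(n_iter):
        grad = []
        for i in range(len(theta)):
            up, down = theta[:], theta[:]
            up[i] += eps
            down[i] -= eps
            grad.append((score(up) - score(down)) / (2.0 * eps))
        # Take an ascent step and clip back into the unit hypercube.
        theta = [min(1.0, max(0.0, t + lr * g)) for t, g in zip(theta, grad)]
    return theta


def two_step_search(score, dim, n_keep=5, seed=0):
    """Run the global sampler, then refine the best accepted states."""
    visited = metropolis_search(score, dim, seed=seed)
    best = sorted(visited, key=lambda pair: pair[0], reverse=True)[:n_keep]
    refined = [refine(score, theta) for _, theta in best]
    return [(score(theta), theta) for theta in refined]


if __name__ == "__main__":
    for value, theta in two_step_search(toy_emulator, dim=2):
        print(round(value, 6), [round(t, 4) for t in theta])
```

On this single-peaked toy surface all refined candidates collapse onto the same optimum. On a multimodal surface, the accepted states kept from the sampler could fall in different basins, which is the kind of diversity the abstract describes as useful for iterative retraining.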

How to cite: Hatami, S., Vásquez, N., Thébault, C., Knoben, W., Eythorsson, D., Papalexiou, S. M., and Clark, M.: Machine Learning Emulator for Large-Sample Hydrologic Model Calibration across Multiple FUSE Structures, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13767, https://doi.org/10.5194/egusphere-egu26-13767, 2026.