EGU25-6278, updated on 14 Mar 2025
https://doi.org/10.5194/egusphere-egu25-6278
EGU General Assembly 2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Oral | Wednesday, 30 Apr, 10:55–11:05 (CEST) | Room 3.16/17
Training Surrogates with Knowledge and Data: A Bayesian Hybrid Modelling Strategy
Anneli Guthke1, Philipp Luca Reiser1, and Paul Bürkner2
  • 1University of Stuttgart, Stuttgart Center for Simulation Science, Stuttgart, Germany (anneli.guthke@simtech.uni-stuttgart.de)
  • 2TU Dortmund University, Dortmund, Germany

Physics-based hydrological modelling provides great opportunities for risk assessment and water resources management. However, diagnostic model evaluation and quantitative uncertainty assessment remain a challenge: (1) model choices, boundary conditions, and prior assumptions about input, parameter, or data uncertainty might be hard to formulate or justify; (2) rigorous propagation of uncertainties breaks down when the analysed model structure is not “true”; and (3) a full propagation of uncertainties is often computationally prohibitive for complex models.

Alternative approaches promote the extraction of information directly from data, thereby avoiding overly strict physics-based constraints and the pitfalls of uncertainty quantification. Challenges of these data-driven approaches include the lack of, or difficulty in achieving, explainability, transparency, and transferability to unseen scenarios.

To explore the frontier where those two perspectives (should) converge, we investigate the potential of surrogate models (computationally cheaper, data-driven representations of complex models) as a binding link with several potential benefits: (1) they alleviate the computational burden and thereby allow for a fully Bayesian uncertainty analysis; (2) they are flexible enough to overcome structural deficits of the original complex model, thereby enabling a better predictive performance; and (3) because they are data-driven, information from available data can be fused elegantly into their training process.

Methodologically, we propose a weighted data-integrated training of surrogates via two competing approaches that differ technically, but also philosophically. These approaches reveal complementary insights about the strengths and weaknesses of the physics-based model and about the additional information in the available data, thereby facilitating deeper system understanding and improved (hybrid) modelling. We demonstrate the proposed workflow on didactic examples and a real-world case study. We expect this approach to be generally useful for modelling dynamic systems, as it contributes to more realistic uncertainty assessment and opens up ways for model development.
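
To make the idea of weighting simulation output against observations concrete, the following is a minimal, purely illustrative sketch. It is not either of the two approaches referred to above, which the abstract does not specify. As assumptions, it uses a cubic polynomial fitted by weighted least squares as a stand-in surrogate (not a Bayesian one), a hypothetical toy simulator `sin(x)`, a hypothetical "true" system `sin(x) + 0.3 x`, and a weight `w` in [0, 1] that shifts the training loss from simulator output (`w = 0`) to observed data (`w = 1`). All function names are invented for this example.

```python
import numpy as np


def design_matrix(x, degree=3):
    """Polynomial features [1, x, x^2, ...] (assumed surrogate form)."""
    return np.vander(np.asarray(x, dtype=float), degree + 1, increasing=True)


def fit_weighted_surrogate(x_sim, y_sim, x_obs, y_obs, w, degree=3):
    """Minimise (1 - w) * MSE(simulator runs) + w * MSE(observations)."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    # Row scaling by the square root of each weight turns the weighted
    # objective into an ordinary least-squares problem.
    s_sim = np.sqrt((1.0 - w) / len(x_sim))
    s_obs = np.sqrt(w / len(x_obs))
    a = np.vstack([s_sim * design_matrix(x_sim, degree),
                   s_obs * design_matrix(x_obs, degree)])
    b = np.concatenate([s_sim * np.asarray(y_sim, dtype=float),
                        s_obs * np.asarray(y_obs, dtype=float)])
    coef, *_ = np.linalg.lstsq(a, b, rcond=None)
    return coef


def predict(coef, x):
    """Evaluate the fitted polynomial surrogate at x."""
    return design_matrix(x, len(coef) - 1) @ coef


def mse(coef, x, y):
    """Mean squared error of the surrogate against targets y."""
    return float(np.mean((predict(coef, x) - np.asarray(y, dtype=float)) ** 2))


# Hypothetical toy setup: the simulator misses a linear trend of the system.
rng = np.random.default_rng(0)
x_sim = np.linspace(0.0, 3.0, 50)          # many cheap-to-store simulator runs
y_sim = np.sin(x_sim)                      # simulator output (structurally deficient)
x_obs = np.linspace(0.2, 2.8, 8)           # few noisy observations
y_obs = np.sin(x_obs) + 0.3 * x_obs + rng.normal(0.0, 0.05, size=x_obs.size)

for w in (0.0, 0.5, 1.0):
    coef = fit_weighted_surrogate(x_sim, y_sim, x_obs, y_obs, w)
    print(f"w={w:.1f}  MSE vs simulator={mse(coef, x_sim, y_sim):.4f}  "
          f"MSE vs observations={mse(coef, x_obs, y_obs):.4f}")
```

Comparing the two error columns across values of `w` gives a rough impression of how far the simulator and the data disagree in this toy setting. A Bayesian treatment, as envisaged in the abstract, would additionally yield uncertainty estimates rather than a single fitted curve.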

How to cite: Guthke, A., Reiser, P. L., and Bürkner, P.: Training Surrogates with Knowledge and Data: A Bayesian Hybrid Modelling Strategy, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-6278, https://doi.org/10.5194/egusphere-egu25-6278, 2025.