EGU26-1806, updated on 13 Mar 2026
https://doi.org/10.5194/egusphere-egu26-1806
EGU General Assembly 2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
Poster | Wednesday, 06 May, 10:45–12:30 (CEST), Display time Wednesday, 06 May, 08:30–12:30, Hall A, A.106
Making surrogates robust against model misspecification: A residual-aware combination of Gaussian processes and U-Net architectures
Waqas Ahmed (1,2), Ahsan Qasam Khan (1), and Wolfgang Nowak (1)
  • (1) University of Stuttgart, Institute for Modelling Hydraulic and Environmental Systems, Stochastic Simulation and Safety Research for Hydrosystems, Germany (waqas.ahmed@iws.uni-stuttgart.de)
  • (2) Mehran University of Engineering and Technology, US Pakistan Center for Advanced Studies in Water, Jamshoro, Sindh, Pakistan

High-fidelity groundwater (GW) models are powerful tools for simulating complex subsurface processes and can predict groundwater levels with high accuracy, provided that high-quality input data are available. In many real-world applications, however, such data are rare: inputs are often noisy, sparse, and poorly resolved in space, which compromises the predictive power of these models. This presents a fundamental challenge: the high-fidelity model is available, but its application is limited by the low quality of the data typically encountered in operational settings. Physics-based simulations can overcome data scarcity by generating synthetic training datasets, but they do not address poor data quality, in particular the lack of spatial resolution in key inputs such as hydraulic conductivity (K), net recharge (R = N–ET), and pumping rates. These inputs should ideally be spatially distributed, but in practice they are often poorly resolved or available only as point measurements. This raises a critical question: should we deliberately degrade high-quality synthetic data during training to match the expected quality of the application data, or can we develop a surrogate model that is inherently robust to the data quality gap?
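
To make the first option concrete, degrading a high-resolution input field could look like the minimal Python sketch below. It assumes a 2-D hydraulic conductivity field on a regular grid and coarsens it by block-wise geometric averaging, which is one common upscaling choice and not necessarily the one used in this work. The function name `degrade_field`, the grid size, the block size, and the synthetic log-normal field are all hypothetical and serve only as illustration.

```python
import numpy as np


def degrade_field(field, block):
    """Coarsen a 2-D positive field by block-wise geometric averaging,
    then expand it back to the original grid as piecewise-constant blocks.

    Hypothetical helper for illustration; not taken from the abstract.
    """
    field = np.asarray(field, dtype=float)
    ny, nx = field.shape
    if ny % block or nx % block:
        raise ValueError("grid dimensions must be divisible by block")
    # Average log-values inside each block (geometric mean of the field).
    log_blocks = np.log(field).reshape(ny // block, block, nx // block, block)
    coarse = np.exp(log_blocks.mean(axis=(1, 3)))
    # Repeat every coarse value over its block to recover the original shape.
    return np.kron(coarse, np.ones((block, block)))


# Synthetic stand-in for a high-fidelity conductivity field (assumed values).
rng = np.random.default_rng(0)
k_high = np.exp(rng.normal(loc=-9.0, scale=1.0, size=(64, 64)))

# Low-fidelity version: only 4 x 4 distinct zones remain.
k_low = degrade_field(k_high, block=16)
```

The degraded field keeps the grid shape expected by the simulator while discarding all variability below the block scale, which mimics inputs that are only known zone by zone.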

We propose the latter: a novel approach that trains a deep learning model to be aware of, and compensate for, the residuals that arise from a lack of input fidelity. The method tightly integrates the U-Net deep learning architecture, the physics-based MODFLOW model, and Gaussian process regression models into a hybrid training and prediction pipeline for building a residual-aware surrogate model. We tested this modelling approach on a study area in Germany, where we generated multi-fidelity training datasets with the MODFLOW-2005 simulator by varying the fidelity of the input permeability field. The hybrid approach is suited to surrogate modelling in settings where models of multiple fidelity levels are available but inference is required only for low-fidelity inputs.
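
The abstract does not specify how the three components are coupled, so the following Python sketch only illustrates the general idea of a residual-aware correction and should not be read as the authors' pipeline. It assumes that a Gaussian process is fitted to the residual between a high-fidelity and a low-fidelity output at a few locations and then added back to the low-fidelity prediction. Simple analytic 1-D curves stand in for the MODFLOW runs and the U-Net prediction; the kernel, its hyperparameters, and all variable names are hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2, variance=1.0):
    """Squared-exponential kernel between two 1-D coordinate arrays."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / length_scale**2)


def gp_posterior_mean(x_train, y_train, x_test, noise=1e-6):
    """Posterior mean of a zero-mean GP with an RBF kernel."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_st = rbf_kernel(x_test, x_train)
    return k_st @ np.linalg.solve(k_tt, y_train)


# Hypothetical stand-ins for simulated heads along a 1-D transect.
x = np.linspace(0.0, 1.0, 101)
head_high = 10.0 - 2.0 * x + 0.3 * np.sin(2.0 * np.pi * x)  # high-fidelity inputs
head_low = 10.0 - 2.0 * x                                   # low-fidelity inputs

# Residuals are assumed to be known only at a few training locations.
idx = np.arange(0, 101, 10)
residual_obs = head_high[idx] - head_low[idx]

# The GP interpolates the residual; adding it corrects the low-fidelity output.
residual_gp = gp_posterior_mean(x[idx], residual_obs, x)
head_corrected = head_low + residual_gp

rmse_low = np.sqrt(np.mean((head_high - head_low) ** 2))
rmse_corrected = np.sqrt(np.mean((head_high - head_corrected) ** 2))
```

In this toy setting the corrected prediction has a lower error than the uncorrected low-fidelity one. Whether and how such a correction enters the training of the U-Net in the presented method is not stated in the abstract.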

How to cite: Ahmed, W., Khan, A. Q., and Nowak, W.: Making surrogates robust against model misspecification: A residual-aware combination of Gaussian processes and U-Net architectures, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1806, https://doi.org/10.5194/egusphere-egu26-1806, 2026.