Information-Theoretic Bayesian Active Learning for Surrogate Training and Inverse Modeling in Subsurface Transport Applications

Maria Fernanda Morales Oreamuno; Tim Brünnette; Stefania Scheurer; Sergey Oladyshkin; Wolfgang Nowak

doi:https://doi.org/10.5194/egusphere-egu26-4056

Session ERE5.7

EGU26-4056, updated on 13 Mar 2026

EGU General Assembly 2026

© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.

University of Stuttgart, IWS, LS3, Stuttgart, Germany (maria.morales@iws.uni-stuttgart.de)

Running detailed, physics-based numerical simulations of subsurface transport is often computationally expensive. This becomes a challenge when calibrating models against observed data using methods that require a large number of model runs, such as Bayesian inference. To address this challenge, surrogate models are frequently used to approximate simulation outputs. Surrogates are trained using input-output pairs generated by the physics-based model. Traditional approaches typically rely on space-filling designs that uniformly cover the entire parameter space. However, for high-dimensional problems, this becomes impractical and tends to waste computational resources on parameter regions that are either physically irrelevant or contradict available measurement data.

To overcome these limitations, we use a Bayesian Active Learning (BAL) framework that iteratively selects the training points that are most informative for Bayesian inference given the available measurements. We employ Gaussian Processes and Bayesian-Polynomial Chaos Expansions as surrogates, both of which provide probability distributions for their predictions. Our approach uses these predictive distributions to evaluate candidate training points with information-theoretic criteria. To account for measurement uncertainty and to prevent the algorithm from over-sampling local likelihood maxima, we investigate different strategies for representing observations within the selection process. The information-theoretic criteria are integrated into a multi-objective scoring function that balances global exploration (reducing surrogate uncertainty) with targeted exploitation (refining high-likelihood regions). Additionally, we demonstrate how observations from early time steps can iteratively guide the selection of training points to improve predictive accuracy for later, critical periods of the transport process.
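The selection loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the one-dimensional toy simulator `expensive_model`, the kernel settings, the observation `y_obs`, and the particular scoring terms (predictive entropy for exploration, an observation log-likelihood inflated by surrogate variance for exploitation, combined with a fixed `weight`) are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length=0.3, variance=1.0):
    """Squared-exponential kernel between 1-D point sets a and b."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length) ** 2)

def gp_predict(x_train, y_train, x_query, jitter=1e-6):
    """Zero-mean GP posterior mean and variance at x_query."""
    K = rbf_kernel(x_train, x_train) + jitter * np.eye(len(x_train))
    Ks = rbf_kernel(x_train, x_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    v = np.linalg.solve(L, Ks)
    mean = Ks.T @ alpha
    # Prior variance is 1.0; clip to avoid tiny negative values from round-off.
    var = np.clip(1.0 - np.sum(v ** 2, axis=0), 1e-12, None)
    return mean, var

def expensive_model(theta):
    """Hypothetical stand-in for the physics-based simulator."""
    return np.sin(3.0 * theta) + 0.5 * theta

def unit_scale(s):
    """Rescale a score vector to [0, 1] so the two objectives are comparable."""
    span = s.max() - s.min()
    return (s - s.min()) / span if span > 0 else np.zeros_like(s)

def bal_scores(mean, var, y_obs, sigma_obs, weight=0.5):
    """Assumed multi-objective score: weighted exploration plus exploitation."""
    # Exploration: differential entropy of the Gaussian predictive distribution.
    entropy = 0.5 * np.log(2.0 * np.pi * np.e * var)
    # Exploitation: log-likelihood of the observation, with surrogate variance
    # added to the measurement variance.
    total_var = sigma_obs ** 2 + var
    log_like = -0.5 * (np.log(2.0 * np.pi * total_var)
                       + (y_obs - mean) ** 2 / total_var)
    return weight * unit_scale(entropy) + (1.0 - weight) * unit_scale(log_like)

def run_bal(n_iter=8, y_obs=0.9, sigma_obs=0.1, weight=0.5):
    """Greedy loop: score candidates, run the model at the best one, refit."""
    candidates = np.linspace(0.0, 2.0, 201)
    chosen = np.zeros(len(candidates), dtype=bool)
    chosen[[0, 100, 200]] = True  # small initial design
    x_train = candidates[chosen]
    y_train = expensive_model(x_train)
    for _ in range(n_iter):
        mean, var = gp_predict(x_train, y_train, candidates)
        scores = bal_scores(mean, var, y_obs, sigma_obs, weight)
        scores[chosen] = -np.inf  # never re-select an evaluated point
        idx = int(np.argmax(scores))
        chosen[idx] = True
        x_train = np.append(x_train, candidates[idx])
        y_train = np.append(y_train, expensive_model(candidates[idx]))
    return x_train, y_train

x_train, y_train = run_bal()
print(np.round(np.sort(x_train), 2))
```

In this sketch, setting `weight` closer to 1 favors exploration of uncertain regions, while values closer to 0 concentrate new training points where the surrogate already agrees with the observation.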

We test this method on analytical benchmarks and on subsurface transport models. The framework is evaluated in terms of convergence speed and posterior accuracy relative to existing active learning strategies and reference solutions derived from the full physics-based model. Overall, the proposed goal-oriented strategy aims to reduce the number of expensive model evaluations required for surrogate training, improving the efficiency of subsurface characterization, model calibration and predictive modeling.
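One possible way to quantify posterior accuracy against a reference solution is a Kullback-Leibler divergence between grid-based posteriors. The sketch below is only an assumed illustration: the abstract does not state which metric is used, and `reference_model`, `crude_surrogate` (a cubic Taylor approximation), the uniform prior, and the observation values are hypothetical.

```python
import numpy as np

def grid_posterior(model, theta, y_obs, sigma_obs):
    """Normalised posterior weights on a grid, assuming a uniform prior
    and a Gaussian measurement error with standard deviation sigma_obs."""
    log_like = -0.5 * ((y_obs - model(theta)) / sigma_obs) ** 2
    w = np.exp(log_like - log_like.max())  # subtract max for stability
    return w / w.sum()

def kl_divergence(p, q, eps=1e-300):
    """Discrete KL(p || q) in nats; terms with p == 0 contribute nothing."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / np.maximum(q[mask], eps))))

def reference_model(theta):
    """Hypothetical stand-in for the full physics-based model."""
    return np.sin(3.0 * theta) + 0.5 * theta

def crude_surrogate(theta):
    """Hypothetical surrogate: cubic Taylor expansion of reference_model at 0."""
    return 3.5 * theta - 4.5 * theta ** 3

theta = np.linspace(0.0, 0.6, 601)
p_ref = grid_posterior(reference_model, theta, y_obs=0.9, sigma_obs=0.1)
p_sur = grid_posterior(crude_surrogate, theta, y_obs=0.9, sigma_obs=0.1)

# Smaller values mean the surrogate-based posterior is closer to the reference.
print(kl_divergence(p_ref, p_sur))
```

Tracking such a divergence against the number of model evaluations gives one simple view of convergence speed, under the assumptions stated above.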

How to cite: Morales Oreamuno, M. F., Brünnette, T., Scheurer, S., Oladyshkin, S., and Nowak, W.: Information-Theoretic Bayesian Active Learning for Surrogate Training and Inverse Modeling in Subsurface Transport Applications, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4056, https://doi.org/10.5194/egusphere-egu26-4056, 2026.

This contribution takes part in the OSPP contest.