EGU26-20548, updated on 14 Mar 2026
https://doi.org/10.5194/egusphere-egu26-20548
EGU General Assembly 2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
Poster | Wednesday, 06 May, 10:45–12:30 (CEST) | Display time: Wednesday, 06 May, 08:30–12:30 | Hall A, A.103
A Deep Learning surrogate for groundwater storage change prediction at regional scale (Duero River Basin, Spain)
Hector Aguilera¹, Víctor Gómez-Escalonilla², Eva García Tricás¹, Olga García Menéndez¹, África de la Hera-Portillo¹, Manuel Rodríguez del Rosario², and Pedro Martínez-Santos²
  • ¹ Instituto Geológico y Minero de España (IGME-CSIC), Ríos Rosas 23, 28003 Madrid (h.aguilera@igme.es)
  • ² Departamento de Geodinámica, Estratigrafía y Paleontología, Facultad de Ciencias Geológicas, Universidad Complutense de Madrid (UCM), C/José Antonio Novais 12, 28040 Madrid, Spain.

Accurate, high-resolution estimation of groundwater storage changes (GWSC) is critical for sustainable water management, particularly in semi-arid basins facing increasing climatic and anthropogenic pressures. Traditional process-based hydrogeological models often fall short because of their computational cost, coarse resolution, and poor performance in data-scarce regions. This study presents a data-driven surrogate modelling framework that addresses these limitations by fusing large-scale model outputs with local observations to generate high-resolution GWSC estimates with explicit uncertainty information.

We demonstrate the framework in the Duero Basin, an 80,000 km² pilot basin in central Spain. The methodology begins with a multi-step hybrid data conditioning process. First, total water storage (TWS) outputs from the Terrestrial Systems Modelling Platform (TSMP, 11 km resolution) are corrected and downscaled using in-situ groundwater level (GWL) observations via spatiotemporal kriging. This yields a corrected GWSC target variable with an explicit pixel-level uncertainty flag (low, moderate, high). The conditioned dataset is then used to train a Spatiotemporal Transformer (STT) deep learning model designed to capture complex spatiotemporal dependencies. The STT uses 48 months of historical data to forecast GWSC 12 months ahead, incorporating static features (e.g., geology, land use, socio-economic variables) and dynamic features (precipitation, potential evapotranspiration, temperature). An uncertainty-aware training scheme uses the uncertainty flags both as an input feature and as weights in the loss function. A key architectural choice is a "late fusion" concatenation strategy that enhances spatial awareness: a learned coordinate embedding, generated by a small Multi-Layer Perceptron (MLP) from geographic coordinates, is concatenated to the transformer's final-layer outputs before prediction. This allows the model to learn and correct for persistent, location-specific biases (e.g., systematic differences between southeastern and northwestern aquifer dynamics) without disrupting the core temporal attention mechanisms, thereby stabilizing training and improving regional accuracy. The STT’s performance is benchmarked against an XGBoost model, and the two models are combined into an optimal linear ensemble.
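
The late-fusion step and the uncertainty-weighted loss described above can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the layer sizes, the tanh activation, the random initialisation, the per-flag weights in `FLAG_WEIGHTS`, and all function names are assumptions made for illustration, and a random vector stands in for the transformer's final-layer output.

```python
import math
import random

# Hypothetical loss weights per uncertainty flag; the abstract gives no values.
FLAG_WEIGHTS = {"low": 1.0, "moderate": 0.5, "high": 0.2}


def init_dense(n_in, n_out, rng):
    """Return (weights, bias) for a dense layer with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    """Apply a dense layer to a single input vector."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def coordinate_embedding(lon, lat, params):
    """Small MLP mapping normalised (lon, lat) to a location embedding."""
    hidden = [math.tanh(v) for v in dense([lon, lat], params["hidden"])]
    return dense(hidden, params["out"])


def late_fusion_predict(temporal_features, lon, lat, params):
    """Concatenate the temporal encoder output with the coordinate
    embedding, then map the fused vector to the forecast horizon."""
    embedding = coordinate_embedding(lon, lat, params)
    fused = temporal_features + embedding  # list concatenation = late fusion
    return dense(fused, params["head"])


def weighted_mse(pred, target, flags):
    """Mean squared error with each sample weighted by its uncertainty flag."""
    weights = [FLAG_WEIGHTS[f] for f in flags]
    total = sum(w * (p - t) ** 2 for w, p, t in zip(weights, pred, target))
    return total / sum(weights)


rng = random.Random(0)
D_MODEL, D_EMB, HORIZON = 8, 4, 12  # illustrative sizes; 12 = forecast months
params = {
    "hidden": init_dense(2, 16, rng),
    "out": init_dense(16, D_EMB, rng),
    "head": init_dense(D_MODEL + D_EMB, HORIZON, rng),
}

# Stand-in for the transformer's final-layer output for one pixel.
temporal_features = [rng.gauss(0.0, 1.0) for _ in range(D_MODEL)]
# Coordinates assumed to be normalised to [-1, 1] beforehand.
forecast = late_fusion_predict(temporal_features, 0.3, -0.1, params)

# High-uncertainty samples contribute less to the loss than low-uncertainty ones.
loss = weighted_mse([1.0, 2.0, 3.0], [1.0, 1.0, 1.0],
                    ["low", "moderate", "high"])
```

Because the coordinate embedding enters only after the temporal encoder, a location-specific offset can be learned in the prediction head without altering the attention computation, which is the motivation given above for fusing late rather than at the input.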

Results show that the ensemble model outperforms the individual models, with R² values of 0.82 (training), 0.46 (validation) and 0.44 (test). Spatial analysis reveals that predictive skill is highest in areas where data conditioning yielded low uncertainty. Feature importance analysis ranks precipitation, evapotranspiration, and water demands as the most influential predictors. The framework generates spatially explicit maps of GWSC and associated uncertainty across the basin.
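
As an illustration of how a linear ensemble of two models can be fitted and scored with R², the sketch below computes a least-squares blending weight in closed form. The abstract does not state how the ensemble weights were obtained, so the convex-combination form `w * STT + (1 - w) * XGBoost` is an assumption, the function names are hypothetical, and the numbers are toy values invented for the example, not results from the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_blend_weight(y_true, pred_a, pred_b):
    """Least-squares weight w for the blend w * pred_a + (1 - w) * pred_b."""
    num = sum((a - b) * (t - b) for t, a, b in zip(y_true, pred_a, pred_b))
    den = sum((a - b) ** 2 for a, b in zip(pred_a, pred_b))
    return num / den


def blend(pred_a, pred_b, w):
    """Combine two prediction series with weight w on the first."""
    return [w * a + (1.0 - w) * b for a, b in zip(pred_a, pred_b)]


# Toy validation data, invented for illustration only.
y_val = [0.0, 1.0, 2.0, 3.0]
stt_val = [0.5, 1.5, 2.5, 3.5]    # hypothetical STT predictions (biased high)
xgb_val = [-0.5, 0.5, 1.5, 2.5]   # hypothetical XGBoost predictions (biased low)

w = fit_blend_weight(y_val, stt_val, xgb_val)
ensemble_val = blend(stt_val, xgb_val, w)

r2_stt = r_squared(y_val, stt_val)
r2_ensemble = r_squared(y_val, ensemble_val)
```

In practice the weight would be fitted on the validation split and the ensemble scored on the held-out test split, so that the reported test R² is not influenced by the blending step.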

We conclude that integrating process-model outputs with local observations through geostatistical conditioning provides a viable pathway for creating reliable training data for deep learning surrogates. The proposed STT-based framework offers a scalable, computationally efficient alternative to traditional models for operational groundwater monitoring and forecasting. Its modular design facilitates transfer to other basins, a step towards improving groundwater resource management in data-scarce and hydrogeologically complex regions worldwide.

How to cite: Aguilera, H., Gómez-Escalonilla, V., García Tricás, E., García Menéndez, O., de la Hera-Portillo, Á., Rodríguez del Rosario, M., and Martínez-Santos, P.: A Deep Learning surrogate for groundwater storage change prediction at regional scale (Duero River Basin, Spain), EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20548, https://doi.org/10.5194/egusphere-egu26-20548, 2026.