EGU26-12440, updated on 14 Mar 2026
https://doi.org/10.5194/egusphere-egu26-12440
EGU General Assembly 2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
Poster | Friday, 08 May, 14:00–15:45 (CEST), Display time Friday, 08 May, 14:00–18:00 | Hall A, A.19
Large-domain transferability of machine learning metamodels for predicting water transit time to baseflow
Mario Soriano¹ and Reed Maxwell²
  • ¹National University of Singapore, Department of Geography, Singapore, Singapore (mario.soriano@nus.edu.sg)
  • ²Princeton University, Department of Civil & Environmental Engineering and High Meadows Environmental Institute, New Jersey, United States of America (reedmaxwell@princeton.edu)

Transit time to baseflow is the time that elapses between a water parcel entering a catchment as precipitation and exiting the system via discharge. It is a key concept that links climate variability, hydrological transport, and biogeochemical processes, with broad implications for surface water and groundwater quality, resource sustainability, and vulnerability to climate change impacts. Transit time distributions can be inferred from spatially resolved time-series measurements of environmental tracer concentrations, but such observations are typically available at only a limited number of locations, such as highly instrumented catchments. Across large regions, physically based numerical models have been shown to describe transit time distributions accurately when compared to tracer data, but these models often require extensive computational resources.
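As an illustration of the definitions above, transit time and its link to tracer data can be written in a common lumped, steady-state form. The notation here (τ_i, w_i, p, C_in, C_out) is assumed for illustration only and is not necessarily the formulation used in this study.

```latex
% Transit time of a single water parcel (or tracked particle) i:
% exit time at the stream minus entry time as precipitation.
\tau_i = t_i^{\mathrm{out}} - t_i^{\mathrm{in}}

% Mean transit time over N parcels, each weighted by the water mass w_i it carries.
\bar{\tau} = \frac{\sum_{i=1}^{N} w_i \, \tau_i}{\sum_{i=1}^{N} w_i}

% Under steady-state flow, a transit time distribution p(\tau) relates the
% tracer concentration in the input, C_in, to that in the discharge, C_out.
C_{\mathrm{out}}(t) = \int_{0}^{\infty} C_{\mathrm{in}}(t - \tau) \, p(\tau) \, \mathrm{d}\tau,
\qquad \int_{0}^{\infty} p(\tau) \, \mathrm{d}\tau = 1
```

Inferring p(τ) from tracer data amounts to inverting the last relation, which is why dense tracer time series are needed.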

In this study, we examine machine learning approaches for efficient prediction of transit time, specifically investigating their spatial transferability across multiple large domains. We employ a continental-scale, physically based hydrologic model coupled with Lagrangian particle tracking to quantify transit time to baseflow metrics in four large river basins in the conterminous USA: Upper Colorado (290,000 km²), Missouri (1,350,000 km²), Upper Mississippi (490,000 km²), and Ohio (420,000 km²). We use results from the physically based model to train machine learning metamodels that predict transit time metrics at multiple spatial aggregation units, with input predictors describing topography, climate, and geology. Functional input-output relationships learned by the metamodels are assessed using model-agnostic explainability techniques and evaluated against theoretical, physically based relationships. Spatial cross-validation frameworks are used to evaluate cross-domain predictive accuracy and to characterize the influence of input data quantity and of distribution similarity between training and target regions. Results from the analysis help elucidate the potential utility and limitations of machine learning metamodels for computationally efficient prediction of transit time metrics in data-scarce regions.
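The spatial cross-validation idea can be sketched as a leave-one-basin-out loop: train on three basins, predict in the fourth, and repeat. The example below is a minimal sketch under stated assumptions and does not reproduce the study's workflow. The data are purely synthetic, the single predictor and the target are hypothetical, and an ordinary least-squares line stands in for the machine learning metamodel. All function names are illustrative.

```python
import math
import random

# Basin names are taken from the abstract; everything else is hypothetical.
BASINS = ["Upper Colorado", "Missouri", "Upper Mississippi", "Ohio"]


def make_synthetic_samples(seed=0, n_per_basin=50):
    """Generate purely synthetic (predictor, target) pairs for each basin.

    The predictor is a made-up index and the target a made-up transit time
    metric; neither comes from the study.
    """
    rng = random.Random(seed)
    samples = {}
    for k, basin in enumerate(BASINS):
        shift = 0.5 * k  # each basin covers a different predictor range
        xs = [shift + rng.random() for _ in range(n_per_basin)]
        ys = [1.0 + 0.8 * x + rng.gauss(0.0, 0.1) for x in xs]
        samples[basin] = (xs, ys)
    return samples


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept.

    A simple stand-in for the metamodel, not a machine learning method.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def leave_one_basin_out(samples):
    """Hold out each basin in turn; return the test RMSE per held-out basin."""
    scores = {}
    for held_out in samples:
        train_x, train_y = [], []
        for basin, (xs, ys) in samples.items():
            if basin != held_out:
                train_x.extend(xs)
                train_y.extend(ys)
        slope, intercept = fit_line(train_x, train_y)
        test_x, test_y = samples[held_out]
        sq_err = [(slope * x + intercept - y) ** 2 for x, y in zip(test_x, test_y)]
        scores[held_out] = math.sqrt(sum(sq_err) / len(sq_err))
    return scores


if __name__ == "__main__":
    for basin, rmse in leave_one_basin_out(make_synthetic_samples()).items():
        print(f"{basin}: RMSE = {rmse:.3f}")
```

Holding out whole basins, rather than random samples, keeps spatially correlated points from appearing in both training and test sets, so the error reflects transfer to an unseen region. The per-basin shift in the synthetic predictor mimics a difference in input distribution between training and target regions.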

How to cite: Soriano, M. and Maxwell, R.: Large-domain transferability of machine learning metamodels for predicting water transit time to baseflow, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12440, https://doi.org/10.5194/egusphere-egu26-12440, 2026.