EGU26-15497, updated on 14 Mar 2026
https://doi.org/10.5194/egusphere-egu26-15497
EGU General Assembly 2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
Oral | Monday, 04 May, 16:35–16:45 (CEST)
Room 2.44
Leveraging machine learning and large-scale datasets to elucidate the spatial and temporal dynamics of C, N and P
Felipe Saavedra1, Pia Ebeling2, Lan Remeta1, Rohini Kumar3, Tam V. Nguyen1, Christian Siebert1, and Ralf Merz1
  • 1Helmholtz Centre for Environmental Research - UFZ, Catchment Hydrology, Halle, Germany (felipe.saavedra@ufz.de)
  • 2Helmholtz Centre for Environmental Research - UFZ, Hydrogeology, Leipzig, Germany
  • 3Helmholtz Centre for Environmental Research - UFZ, Computational Hydrosystems, Leipzig, Germany

Carbon (C), nitrogen (N) and phosphorus (P) are key macronutrients controlling ecosystem functioning; however, human activities have severely disturbed both their concentration levels and their C:N:P ratios, with potential consequences for ecosystem health. Large-scale assessment of C, N and P concentrations from in situ data at high temporal resolution remains challenging because the available data are discontinuous and spatially limited.

 

To address this gap, we leveraged the recently published low-frequency (biweekly to monthly) German water-quality dataset QUADICA v2 (Ebeling et al., 2025) to develop three regional deep learning models that predict daily concentrations of dissolved organic carbon (DOC), nitrate (NO3) and phosphate (PO4). These species are commonly used as proxies for the reactive and bioavailable fractions of C, N and P, with NO3 representing the dominant form of dissolved inorganic nitrogen in the study catchments. We selected catchments with at least 20 years of concentration data and at least 200 samples for each compound, together with discharge observations, resulting in a total of 155 catchments. For each compound, we trained a single Long Short-Term Memory (LSTM) model across all catchments. Model performance is satisfactory in most catchments, with median Kling–Gupta efficiencies of 0.55, 0.62 and 0.45 for DOC, NO3 and PO4, respectively (averaged across cross-validation folds).
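As an illustration of the performance metric reported above, the following minimal sketch computes the Kling–Gupta efficiency for one catchment and the median across catchments. It assumes the original 2009 formulation (correlation, variability ratio and bias ratio); the abstract does not state which KGE variant was used. The function names and the NaN convention for days without a sample are hypothetical.

```python
import numpy as np


def kling_gupta_efficiency(obs, sim):
    """KGE of simulated vs. observed concentrations (assumed 2009 formulation).

    Days without an observation are expected to be NaN in `obs`; they are
    dropped, so a daily prediction series can be scored against
    low-frequency samples.
    """
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    valid = ~(np.isnan(obs) | np.isnan(sim))
    obs, sim = obs[valid], sim[valid]

    r = np.corrcoef(obs, sim)[0, 1]   # linear correlation
    alpha = sim.std() / obs.std()     # variability ratio
    beta = sim.mean() / obs.mean()    # bias ratio
    return 1.0 - np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2)


def median_kge(pairs):
    """Median KGE over an iterable of (obs, sim) pairs, one pair per catchment."""
    return float(np.median([kling_gupta_efficiency(o, s) for o, s in pairs]))
```

A KGE of 1 indicates a perfect match; values decrease as correlation, variability or mean bias deviate from the observations.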

 

We used SHapley Additive exPlanations (SHAP) to explain the spatial and temporal variability in predicted concentrations. Results for spatial variability indicate that DOC is mainly controlled by topographic and climatic factors, NO3 by land use and soil properties, and PO4 by geology, climate and point sources. For temporal variability, we further clustered catchments into groups with similar dominant drivers based on temporal SHAP values. For DOC and NO3, the clusters are mainly explained by precipitation and temperature variability. In contrast, PO4 exhibits three distinct clusters, characterized by precipitation and temperature, by discharge, or by seasonality. Our results demonstrate that low-frequency water-quality data combined with deep learning and explainable AI can provide new insights into daily C, N and P dynamics at large scale. This basis allows us to further characterize C, N and P archetypes, nutrient interactions and their dominant drivers.
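The grouping of catchments by dominant temporal drivers can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the clustering algorithm, so a small k-means is used here as a stand-in; the SHAP values are assumed to be already computed and stored as an array of shape (catchments, time steps, features); and the data below are synthetic. The driver names come from the abstract, all other names are hypothetical.

```python
import numpy as np

# Driver names taken from the abstract; the array layout is an assumption.
FEATURES = ["precipitation", "temperature", "discharge", "seasonality"]


def importance_shares(shap_values):
    """Mean |SHAP| per feature and catchment, normalised to sum to 1 per catchment."""
    mean_abs = np.abs(shap_values).mean(axis=1)
    return mean_abs / mean_abs.sum(axis=1, keepdims=True)


def assign(X, centres):
    """Label each row of X with the index of its nearest centre."""
    dist = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
    return dist.argmin(axis=1)


def kmeans(X, k, n_iter=100):
    """Plain k-means with deterministic farthest-point initialisation."""
    centres = [X[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centres], axis=0)
        centres.append(X[np.argmax(d)])
    centres = np.array(centres)
    for _ in range(n_iter):
        labels = assign(X, centres)
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
        if np.allclose(updated, centres):
            break
        centres = updated
    return assign(X, centres), centres


# Synthetic stand-in for temporal SHAP values: 6 catchments, 365 days, 4 drivers.
rng = np.random.default_rng(0)
scale = np.array([
    [3.0, 3.0, 0.3, 0.3],  # precipitation and temperature dominate
    [3.0, 3.0, 0.3, 0.3],
    [0.3, 0.3, 3.0, 0.3],  # discharge dominates
    [0.3, 0.3, 3.0, 0.3],
    [0.3, 0.3, 0.3, 3.0],  # seasonality dominates
    [0.3, 0.3, 0.3, 3.0],
])
synthetic_shap = rng.normal(size=(6, 365, 4)) * scale[:, None, :]

shares = importance_shares(synthetic_shap)
labels, centres = kmeans(shares, k=3)
# Name of the strongest driver in each cluster centre.
dominant = [FEATURES[i] for i in centres.argmax(axis=1)]
```

Normalising to shares makes catchments comparable regardless of their absolute concentration levels, so clusters reflect which drivers dominate and not how large the SHAP values are.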

How to cite: Saavedra, F., Ebeling, P., Remeta, L., Kumar, R., Nguyen, T. V., Siebert, C., and Merz, R.: Leveraging machine learning and large-scale datasets to elucidate the spatial and temporal dynamics of C, N and P, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15497, https://doi.org/10.5194/egusphere-egu26-15497, 2026.