EGU26-13835, updated on 14 Mar 2026
https://doi.org/10.5194/egusphere-egu26-13835
EGU General Assembly 2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
Oral | Monday, 04 May, 16:45–16:55 (CEST) | Room 2.44
Large-Sample Nitrate Forecasting with a Regional LSTM: Multi-Source Inputs, Station Screening, and Attribution Analysis
Jiayi Tang (1), Kwok Chun (2), and Ana Mijic (1)
  • (1) Imperial College London, Department of Civil and Environmental Engineering, London, UK
  • (2) University of the West of England, School of Architecture and Environment, Bristol, UK

Characterizing large-scale spatiotemporal variability in river water quality and its drivers is challenging because monitoring data are irregular in time and space, and the underlying hydrological, biogeochemical, and human processes are complex and interconnected. These difficulties are acute for nitrate, whose behaviour reflects interacting natural and human drivers that vary across catchments and over time. In contrast to large-sample streamflow prediction, where measurements are frequent and relatively stable, large-sample water quality prediction must cope with sparse, uneven sampling and with human-driven changes in both pressures and responses.

To meet these challenges, we designed a domain-guided, four-step workflow that emphasizes realistic handling of irregular monitoring data and trains a regional LSTM so that sites can share information and learn common patterns from many catchments. First, we assign one monitoring station to each catchment outlet using distance along the river network and apply quality checks to identify comparatively reliable sites. Second, we select and process input variables around nitrate-relevant processes and human activities (e.g., meteorology, land use, agricultural and urban pressures). Third, we train a single, England-wide long short-term memory (LSTM) model on historical records and evaluate it with time-based tests within catchments and space-based tests across catchments (regions) to assess temporal and spatial generalization. Finally, we apply attribution analysis to separate the roles of meteorological variability and static catchment characteristics and to examine how the dominant drivers vary spatially, in support of national upscaling.
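The two evaluation modes in the third step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `temporal_split` and `spatial_split`, the catchment IDs, the cutoff date, the 20% hold-out fraction, and the toy nitrate values are all hypothetical.

```python
import random
from datetime import date


def temporal_split(samples, cutoff):
    """Split one catchment's (date, value) samples at a cutoff date.

    Samples before the cutoff go to training, the rest to testing,
    so the test period is never seen during training.
    """
    train = [s for s in samples if s[0] < cutoff]
    test = [s for s in samples if s[0] >= cutoff]
    return train, test


def spatial_split(catchment_ids, test_fraction=0.2, seed=0):
    """Hold out whole catchments so test sites are unseen in training."""
    ids = sorted(catchment_ids)
    random.Random(seed).shuffle(ids)  # seeded for a reproducible split
    n_test = max(1, round(len(ids) * test_fraction))
    return set(ids[n_test:]), set(ids[:n_test])


# Hypothetical toy records: irregular (date, nitrate) samples per catchment.
records = {
    "C001": [(date(2015, 3, 1), 5.2), (date(2017, 9, 12), 4.9), (date(2019, 6, 10), 4.8)],
    "C002": [(date(2016, 1, 20), 7.1)],
    "C003": [(date(2014, 11, 5), 2.3), (date(2020, 2, 2), 2.6)],
    "C004": [(date(2018, 7, 30), 6.0)],
    "C005": [(date(2013, 4, 14), 3.4)],
}

cutoff = date(2018, 1, 1)  # assumed cutoff, for illustration only
time_train, time_test = temporal_split(records["C001"], cutoff)
train_ids, test_ids = spatial_split(records)
```

The time-based split tests whether the model extrapolates forward within a catchment it has seen; the space-based split tests whether it transfers to catchments it has never seen.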

Using nitrate measurements from the Environment Agency Water Quality Archive, the LSTM ingests diverse input categories to generate daily nitrate predictions at more than 2000 Water Framework Directive (WFD) catchment outlets. Results show that predictive skill varies across catchments; station screening improves generalization relative to models trained on all stations; and attribution reveals differing roles of meteorological drivers versus static properties across contrasting catchment settings. Overall, the framework produces daily predictions from irregular and limited observations, provides interpretable, water-quality-focused insights into drivers at scale, and offers a large-scale view of how nitrate controls vary in space. It also supports future work on transfer learning and local fine-tuning to enable scalable assessment and management.
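One common way to fit a daily model against sparse samples is to score the predictions only on days that have an observation. The abstract does not state which loss the authors used, so the sketch below is an assumption for illustration; the function name `masked_mse` and the toy values are hypothetical.

```python
import math


def masked_mse(predictions, observations):
    """Mean squared error over days that have an observation.

    Days without a sample are marked NaN in `observations` and skipped,
    so the daily prediction series can be longer than the sample record.
    """
    pairs = [(p, o) for p, o in zip(predictions, observations) if not math.isnan(o)]
    if not pairs:
        return math.nan  # no observed days to score
    return sum((p - o) ** 2 for p, o in pairs) / len(pairs)


# Five daily predictions, but only two days were actually sampled.
daily_pred = [4.0, 4.2, 4.1, 3.9, 4.5]
sparse_obs = [math.nan, 4.0, math.nan, math.nan, 5.0]
loss = masked_mse(daily_pred, sparse_obs)
```

Only the second and fifth days contribute to the loss here; the other three predictions are still produced but are not penalized.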

How to cite: Tang, J., Chun, K., and Mijic, A.: Large-Sample Nitrate Forecasting with a Regional LSTM: Multi-Source Inputs, Station Screening, and Attribution Analysis, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13835, https://doi.org/10.5194/egusphere-egu26-13835, 2026.