- 1Northwest University, China (tanzhenyu@nwu.edu.cn)
- 2Plymouth Marine Laboratory, UK
The water quality of lakes and reservoirs is influenced by atmospheric and land-use pressures, requiring actionable insights for effective management. Given the unique nature of each water body, data-driven modelling provides a practical solution for identifying sensitivities to these pressures, circumventing the complexity of hydrological-biogeochemical models. Remote sensing technologies offer consistent, multi-temporal, and multi-scale water quality monitoring, while global weather forecasting models enable predictions of key environmental parameters. Integrating these datasets facilitates a systematic examination of catchment-to-lake dynamics.
This study introduces a unified approach to modelling relationships between satellite-derived water quality metrics, such as Chlorophyll-a (Chl-a) concentration and turbidity, and the meteorological drivers acting on catchments. Using multivariate autoregressive models, we aim to identify the influence of environmental factors on water quality variations and to determine which sub-basins exert the greatest influence on lake dynamics. Ultimately, we anticipate that these data-driven models can be used to predict short-term water quality changes.
The study focuses on small and medium-sized lakes and reservoirs in the United Kingdom, using Sentinel-2 MSI observations to derive high-resolution water quality datasets. ERA5-Land hourly reanalysis data provided the meteorological variables influencing water quality, including wind, lake mixed-layer temperature, solar radiation, precipitation, and runoff. Both datasets were aggregated into five-day time series to bridge the irregular observation intervals caused by orbital patterns and cloud cover. The aggregated data were normalized and stabilized to account for differing variable magnitudes before being input into the autoregressive models.
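The aggregation and normalization step can be illustrated with a minimal sketch. It assumes simple averaging within fixed five-day bins and z-score scaling; the abstract does not specify either choice, and the function names `aggregate_bins` and `zscore` as well as the sample values are hypothetical.

```python
from datetime import date
from statistics import mean, pstdev


def aggregate_bins(samples, start, bin_days=5):
    """Average (date, value) samples into consecutive bins of `bin_days` days.

    Assumes every sample date is on or after `start`. Bins without any
    observation (for example because of cloud cover) are returned as None.
    """
    grouped = {}
    for day, value in samples:
        index = (day - start).days // bin_days
        grouped.setdefault(index, []).append(value)
    n_bins = max(grouped) + 1
    return [mean(grouped[i]) if i in grouped else None for i in range(n_bins)]


def zscore(series):
    """Scale the non-missing values to zero mean and unit variance.

    Assumes the series is not constant (non-zero standard deviation).
    """
    valid = [v for v in series if v is not None]
    mu, sigma = mean(valid), pstdev(valid)
    return [None if v is None else (v - mu) / sigma for v in series]


# Hypothetical Chl-a retrievals on irregular overpass dates (not study data).
chl_a = [
    (date(2023, 6, 1), 4.0),
    (date(2023, 6, 3), 6.0),
    (date(2023, 6, 7), 8.0),
    (date(2023, 6, 18), 11.0),
]
binned = aggregate_bins(chl_a, start=date(2023, 6, 1))
scaled = zscore(binned)
```

In this sketch the third bin stays empty because no observation falls in it. How such gaps are filled before model fitting is a separate choice that the abstract does not describe.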
Vector Autoregression (VAR) was used to assess long-term environmental influences on water quality, using Impulse Response Function (IRF) analysis and Forecast Error Variance Decomposition (FEVD). Because VAR models rely on historical data, they enabled analysis of prolonged effects, with an optimal lag order of four time steps. In contrast, Autoregressive Integrated Moving Average with Explanatory Variables (ARIMAX) models incorporated contemporaneous meteorological inputs, allowing for short-term impact analysis. ARIMAX models also enabled near-term water quality predictions using forecasted meteorological variables. At the sub-basin level, models were evaluated using the Fréchet distance, which quantifies the similarity between time-series curves. By comparing Fréchet distances across sub-basins, the relative contributions of each sub-basin to lake water quality variations were determined.
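To make the curve-similarity measure concrete, the following sketch computes the discrete Fréchet distance between two time series treated as sequences of (time, value) points. The discrete variant, the Euclidean point metric, and the example series are assumptions made for illustration; the abstract does not state which formulation was used.

```python
from math import hypot


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two sequences of (t, value) points."""
    n, m = len(p), len(q)
    # ca[i][j] holds the coupling distance between prefixes p[:i+1] and q[:j+1].
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = hypot(p[i][0] - q[j][0], p[i][1] - q[j][1])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                # Advance along p, along q, or along both, whichever is cheapest.
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]


# Hypothetical normalized series: one observed, two simulated (not study data).
observed = [(0, 0.0), (1, 1.0), (2, 0.0)]
sim_a = [(0, 0.0), (1, 1.5), (2, 0.0)]
sim_b = [(0, 1.0), (1, 2.0), (2, 1.0)]

dist_a = discrete_frechet(observed, sim_a)  # 0.5
dist_b = discrete_frechet(observed, sim_b)  # 1.0
```

Here `sim_a` follows the observed curve more closely than `sim_b`, so it has the smaller distance. Reading a smaller distance as a closer match is the interpretation assumed in this sketch.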
Our findings suggest that: 1) VAR models explained the temporal variability in lake water quality variables with a good fit (R² > 0.82 for Chl-a and R² > 0.69 for turbidity); 2) VAR models relied heavily on lake water quality values from preceding time steps, with an optimal lag order of four. The first lag contributed the most, with a mean weight of 0.61 (σ = 0.45) for Chl-a concentration and 0.71 (σ = 0.46) for turbidity; 3) Catchment drivers exhibited weights of up to 2.3% at the second time horizon, with their influence increasing over time while the contribution from water quality observations decreased; 4) ARIMAX models demonstrated high accuracy in simulating lake water quality variations (R² > 0.83 for Chl-a and R² > 0.68 for turbidity), showing promise for future water quality predictions.
How to cite: Tan, Z., Simis, S., and Warren, M.: Data-Driven Modelling of Lake Water Quality Response to Catchment Dynamics, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-11981, https://doi.org/10.5194/egusphere-egu25-11981, 2025.