- ¹Department of Environmental and Biological Sciences, University of Eastern Finland, Kuopio, Finland.
- ²School of Engineering, University of Warwick, Coventry, UK.
Lakes cover only about three percent of the Earth’s land surface, yet they are a critical component of the hydrosphere and provide substantial ecosystem services. Sustained monitoring and modeling of lake ecological status are therefore essential. Within the European Union, the Water Framework Directive provides a harmonized framework for assessing lake ecological status using biological quality elements such as phytoplankton, aquatic flora, benthic invertebrates, and fish, classifying lakes into five status classes: high, good, moderate, poor, and bad. However, sparse and infrequent field-based ecological measurements limit spatial and temporal coverage, particularly for near-real-time assessments.
We present a national-scale machine-learning framework for ecological status classification of 2,487 Finnish lakes using routinely available water-quality and morphometric variables, including total nitrogen, total phosphorus, turbidity, conductivity, pH, color, dissolved oxygen, Secchi depth, maximum depth, and lake surface area. Multiple classification models were evaluated, including Random Forest, XGBoost, Support Vector Machine, Artificial Neural Network, and TabNet. Model uncertainty was explicitly quantified using a Bayesian neural network. An ensemble of models achieved a macro F1 score of 0.67 and a Matthews correlation coefficient of 0.50 under five-fold cross-validation.
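For readers less familiar with the two headline metrics, the following is a minimal sketch of how a macro F1 score and a multiclass Matthews correlation coefficient can be computed from a confusion matrix over the five status classes. It is illustrative only and not the authors' evaluation code: the function names, the integer class encoding (0 = high through 4 = bad), and the toy labels are all assumptions.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm


def macro_f1(cm):
    """Unweighted mean of per-class F1 scores."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp  # predicted as class k but truly another class
    fn = cm.sum(axis=1) - tp  # truly class k but predicted as another class
    denom = 2 * tp + fp + fn
    # Classes with no true and no predicted samples get an F1 of 0 here.
    f1 = np.divide(2 * tp, denom, out=np.zeros_like(tp), where=denom > 0)
    return float(f1.mean())


def multiclass_mcc(cm):
    """Multiclass generalisation of the Matthews correlation coefficient."""
    cm = cm.astype(float)
    correct = np.trace(cm)
    total = cm.sum()
    true_counts = cm.sum(axis=1)
    pred_counts = cm.sum(axis=0)
    num = correct * total - true_counts @ pred_counts
    den = np.sqrt(
        (total**2 - pred_counts @ pred_counts)
        * (total**2 - true_counts @ true_counts)
    )
    return float(num / den) if den > 0 else 0.0


# Hypothetical labels: 0 = high, 1 = good, 2 = moderate, 3 = poor, 4 = bad.
y_true = [0, 1, 1, 2, 2, 2, 3, 3, 4, 4]
y_pred = [0, 1, 2, 2, 2, 1, 3, 2, 4, 4]
cm = confusion_matrix(y_true, y_pred, n_classes=5)
print("macro F1:", round(macro_f1(cm), 3))
print("MCC:", round(multiclass_mcc(cm), 3))
```

Macro averaging gives each status class equal weight regardless of how many lakes fall into it, which matters when the extreme classes are rare.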
The Bayesian neural network achieved the lowest Brier score (0.44) and the lowest Expected Calibration Error (0.04) among the evaluated models, indicating the best probabilistic calibration. Mean total predictive uncertainty across all lakes was 0.12; uncertainty was lowest for the high and bad ecological status classes and highest for the intermediate classes, reflecting transitional ecological conditions and class overlap. These results demonstrate that data-efficient machine-learning models, combined with explicit uncertainty quantification, can support cost-effective and scalable ecological status assessment for lakes with limited monitoring data.
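The calibration and uncertainty measures named above can be sketched as follows. This is a hedged illustration, not the authors' implementation: it assumes the multiclass Brier score summed over classes (range 0 to 2), a top-label Expected Calibration Error with ten equal-width confidence bins, and total predictive uncertainty taken as the entropy of the mean class probabilities over posterior draws. The abstract does not state which definitions were used, and all function names and the synthetic draws are hypothetical.

```python
import numpy as np


def brier_score(probs, y_true):
    """Mean over samples of the squared error summed across classes."""
    probs = np.asarray(probs, dtype=float)
    y_true = np.asarray(y_true)
    onehot = np.eye(probs.shape[1])[y_true]
    return float(np.mean(np.sum((probs - onehot) ** 2, axis=1)))


def expected_calibration_error(probs, y_true, n_bins=10):
    """Top-label ECE: bin-weighted gap between accuracy and confidence."""
    probs = np.asarray(probs, dtype=float)
    y_true = np.asarray(y_true)
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == y_true).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return float(ece)


def total_predictive_entropy(mc_probs):
    """Entropy of the mean prediction over posterior draws.

    mc_probs has shape (n_draws, n_samples, n_classes).
    """
    mean_p = np.asarray(mc_probs, dtype=float).mean(axis=0)
    return -np.sum(mean_p * np.log(mean_p + 1e-12), axis=1)


# Synthetic stand-in for Bayesian neural network output: 50 posterior
# draws of class probabilities for 8 lakes and 5 status classes.
rng = np.random.default_rng(0)
mc_probs = rng.dirichlet(np.ones(5), size=(50, 8))
y_true = rng.integers(0, 5, size=8)
mean_probs = mc_probs.mean(axis=0)

print("Brier:", brier_score(mean_probs, y_true))
print("ECE:", expected_calibration_error(mean_probs, y_true))
print("Per-lake entropy:", total_predictive_entropy(mc_probs))
```

Under these definitions, a lake whose averaged prediction is spread across neighbouring classes, such as good and moderate, yields higher entropy than one predicted confidently as high or bad.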
The proposed framework enhances national-scale reporting, supports prioritization of restoration efforts, and provides uncertainty-aware decision support for lake management, particularly in Arctic–Boreal regions.
Keywords: Ecological status assessment, Water quality, Finnish lakes, Machine learning, Near-real-time monitoring.
How to cite: Mahdian, M., Abolfathi, S., Kukkonen, J., and Kolehmainen, M.: Uncertainty-Aware Data-Driven Framework for Near Real-Time Lake Ecological Status Assessment under the EU Water Framework Directive, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-23205, https://doi.org/10.5194/egusphere-egu26-23205, 2026.