A lake water level prediction method based on data augmentation and Physics-Informed Neural Networks with imbalanced data

Lingjiang Lu; Tao Yan; Yongcan Chen; Haoran Wang; Tong Yang; Zhaowei Liu

doi:https://doi.org/10.5194/egusphere-egu26-4139

Session HS4.7

EGU26-4139, updated on 13 Mar 2026

EGU General Assembly 2026

© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.

Lingjiang Lu¹, Tao Yan¹, Yongcan Chen¹, Haoran Wang², Tong Yang³, and Zhaowei Liu¹

¹State Key Laboratory of Hydroscience and Engineering, Tsinghua University, Beijing 100084, China
²Sichuan Energy Internet Research Institute, Tsinghua University, Chengdu 610042, China
³Chongqing University, Chongqing 400044, China

Over the past three decades, lake water level fluctuations have intensified due to climate change and increasing water demand, creating an urgent need for accurate and efficient prediction methods. However, existing deep learning-based surrogates often suffer from two major limitations: hyper-parameter selection lacks physically informed guidance, which increases computational cost, and extreme water level samples are scarce, which leads to imbalanced datasets and reduced accuracy. To address these limitations, this study proposes a novel Physics-Informed Neural Network (PINN) framework that integrates data augmentation with physically guided hyper-parameter selection. The framework takes boundary water level time series as input, incorporates mass-conservation constraints, and applies a clustering-based augmentation method to enrich extreme-event samples. Its applicability was validated on the Lower Lake of Nansi Lake in China. Evaluation using Root Mean Squared Error (RMSE) and Nash–Sutcliffe Efficiency (NSE) shows that incorporating physical constraints robustly improves predictive accuracy, with performance surpassing that of a classical LSTM model. Physically guided hyper-parameter selection further improves both training efficiency and accuracy, and the proposed augmentation method reduces RMSE by 69.1% under extreme conditions. Compared with an existing augmentation method, the proposed method shortens training time by 63.35% while delivering better prediction performance. The final surrogate achieves RMSE = 0.021 m and NSE > 0.94 (against observations) while requiring only 2.42% of the computational time of a traditional hydrodynamic model. These results highlight the framework's potential for reliable real-world water level prediction and its transferability to other hydrological systems.
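
For readers unfamiliar with the two evaluation metrics named in the abstract, the following minimal Python sketch computes RMSE and NSE from paired observed and simulated water levels using their standard definitions. The function names and the sample values are illustrative assumptions and are not taken from the study.

```python
import math


def rmse(observed, simulated):
    """Root Mean Squared Error, in the same units as the inputs (e.g. metres)."""
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency.

    1.0 is a perfect fit; 0.0 means the prediction is no better than
    always predicting the mean of the observations.
    """
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("NSE is undefined when all observations are equal")
    return 1.0 - residual / variance


# Hypothetical water levels in metres (illustrative only, not study data).
observed = [32.10, 32.25, 32.40, 32.80, 33.10]
simulated = [32.12, 32.23, 32.43, 32.78, 33.07]

print(f"RMSE = {rmse(observed, simulated):.3f} m")
print(f"NSE  = {nse(observed, simulated):.3f}")
```

Because NSE normalises the squared error by the variance of the observations, it is dimensionless, whereas RMSE keeps the units of the water level; reporting both, as the abstract does, gives an absolute and a relative view of the same errors.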

How to cite: Lu, L., Yan, T., Chen, Y., Wang, H., Yang, T., and Liu, Z.: A lake water level prediction method based on data augmentation and Physics-Informed Neural Networks with imbalanced data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4139, https://doi.org/10.5194/egusphere-egu26-4139, 2026.