Revealing the key factors and uncertainties in data-driven hydrological prediction using Explainable Artificial Intelligence techniques

Ye Tian; Weili Tan; Xing Yuan

doi:https://doi.org/10.5194/egusphere-egu24-14666

Session HS3.5

EGU24-14666, updated on 09 Mar 2024

EGU General Assembly 2024

© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Ye Tian¹, Weili Tan², and Xing Yuan¹

¹School of Hydrology and Water Resources, Nanjing University of Information Science and Technology, Nanjing, China
²China Institute of Water Resources and Hydropower Research, Beijing, China

Deep learning models are widely used for streamflow prediction, but they are often regarded as "black boxes" because they lack interpretability. To address this issue, recent work has turned to Explainable Artificial Intelligence (XAI) methods that make these models more transparent. In this study, we use XAI techniques to investigate the influence of precipitation uncertainty on data-driven modeling and to elucidate the hydrological significance of deep learning streamflow modeling in both the temporal and spatial dimensions. To this end, we propose an LSTM model for time series prediction and a CNN-LSTM model that fuses spatial and temporal information. Both models are driven by five reanalysis datasets. We quantify the contribution of precipitation before peak flow to the runoff simulation in order to identify the most important runoff generation processes in each river basin. In addition, we use visualization techniques to analyze the relationship between the weights of the convolutional layers in our models and the distribution of precipitation features, with the aim of gaining insight into the mechanisms underlying the models' predictions.
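The abstract does not state which attribution method was used to quantify the contribution of precipitation before peak flow. As a rough illustration only, the sketch below uses simple occlusion: the precipitation in one part of the lookback window is set to zero and the resulting drop in predicted peak flow is taken as that part's contribution. The names `toy_model` and `occlusion_contributions`, the decay factor, and the split into "recent" and "historical" days are all assumptions made for this example; `toy_model` is a hypothetical stand-in for a trained LSTM, not the authors' model.

```python
from typing import Callable, Dict, Sequence


def toy_model(precip: Sequence[float]) -> float:
    """Hypothetical stand-in for a trained streamflow model.

    Returns a flow value for the last day of the window as an
    exponentially decaying sum of past precipitation (assumed decay 0.8).
    """
    decay = 0.8
    n = len(precip)
    return sum(p * decay ** (n - 1 - i) for i, p in enumerate(precip))


def occlusion_contributions(
    model: Callable[[Sequence[float]], float],
    precip: Sequence[float],
    recent_days: int,
) -> Dict[str, float]:
    """Estimate how much recent vs. historical precipitation contributes.

    The last `recent_days` values count as "recent"; everything earlier
    counts as "historical". Each contribution is the drop in the model
    output when that part of the window is replaced by zeros.
    """
    if not 0 < recent_days < len(precip):
        raise ValueError("recent_days must be between 1 and len(precip) - 1")

    split = len(precip) - recent_days
    baseline = model(precip)

    # Occlude the recent part, keep the historical part.
    without_recent = list(precip[:split]) + [0.0] * recent_days
    # Occlude the historical part, keep the recent part.
    without_historical = [0.0] * split + list(precip[split:])

    return {
        "baseline": baseline,
        "recent": baseline - model(without_recent),
        "historical": baseline - model(without_historical),
    }


if __name__ == "__main__":
    # Seven days of precipitation (mm); the last value is the peak-flow day.
    window = [10.0, 0.0, 0.0, 0.0, 0.0, 0.0, 5.0]
    print(occlusion_contributions(toy_model, window, recent_days=3))
```

For a linear stand-in like this one, the two contributions sum to the baseline prediction. A real LSTM is nonlinear, so the contributions would generally not be additive, and gradient-based or Shapley-style attributions may be more appropriate.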

Our results reveal several key findings. In the high-altitude areas of the upper reaches of the Yangtze River, floods were caused by a combination of snowmelt runoff, historical precipitation, and recent precipitation. In the middle reach of the Yangtze River, floods were induced by the combined effect of historical and recent precipitation, except in the Ganjiang River, where historical precipitation events played the major role in controlling flood events. Visualizing the convolutional layers showed that areas with high convolutional layer weights had a greater impact on the model's predictions. In the upper-reach river basins, the weight distribution of the convolutional layers closely resembled the spatial distribution of multi-year average precipitation. In the middle reach, the weight distribution of the convolutional layers correlated strongly with the monthly maximum precipitation in the basin. Overall, this study provides insight into the potential of deep learning models for streamflow prediction and improves our understanding of the impacts of precipitation in the Yangtze River Basin.
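The abstract does not say how the similarity between the convolutional weight distribution and the precipitation fields was measured. One plausible way to put a number on such a comparison is a Pearson correlation between two gridded maps, sketched below. It assumes that a 2-D importance map on the same grid as the precipitation field has already been derived from the convolutional layers; how to derive that map is not covered here, and the function names are hypothetical.

```python
from math import sqrt
from typing import List, Sequence

Grid = Sequence[Sequence[float]]


def flatten(grid: Grid) -> List[float]:
    """Flatten a 2-D grid (list of rows) into a single list of values."""
    return [value for row in grid for value in row]


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant map")
    return cov / sqrt(var_a * var_b)


def spatial_similarity(weight_map: Grid, precip_map: Grid) -> float:
    """Correlate absolute weight magnitudes with a precipitation field.

    Both maps are assumed to share the same grid (same rows and columns).
    """
    if [len(r) for r in weight_map] != [len(r) for r in precip_map]:
        raise ValueError("maps must have identical shapes")
    magnitudes = [abs(w) for w in flatten(weight_map)]
    return pearson(magnitudes, flatten(precip_map))


if __name__ == "__main__":
    # Toy 2x2 grids: weights and multi-year mean precipitation (made-up values).
    weights = [[0.1, 0.2], [0.3, 0.4]]
    mean_precip = [[500.0, 700.0], [900.0, 1100.0]]
    print(spatial_similarity(weights, mean_precip))
```

The same function could be applied to a monthly maximum precipitation field instead of the multi-year mean, which would allow the two comparisons described above to be set side by side for each basin.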

How to cite: Tian, Y., Tan, W., and Yuan, X.: Revealing the key factors and uncertainties in data-driven hydrological prediction using Explainable Artificial Intelligence techniques, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-14666, https://doi.org/10.5194/egusphere-egu24-14666, 2024.