Multivariate multi-horizon streamflow forecasting for extremes and their interpretation using an explainable deep learning architecture
- Indian Institute of Science, Bengaluru, Civil Engineering, India (meeramohan789@gmail.com, dasikanagesh@gmail.com)
Streamflow is affected by numerous factors, such as solar radiation, underlying surface conditions, and atmospheric circulation, which introduce nonlinearity, uncertainty, and randomness into streamflow time series. Diverse conventional and Deep Learning (DL) models have been applied to recognize complex patterns and discover nonlinear relationships in hydrological time series. Incorporating multiple variables in DL models can match or improve streamflow forecasts and may also improve predictions of extreme values. Multivariate approaches surpass univariate ones by including additional time series as explanatory variables, and deep neural networks (DNNs) excel in multi-horizon time series forecasting, outperforming classical models. However, determining the relative contribution of each variable to streamflow remains challenging due to the black-box nature of DL models.
We propose using the Temporal Fusion Transformer (TFT), an advanced deep-learning technique, to model streamflow across various temporal scales while incorporating multiple variables. TFT's attention-based architecture enables high-performance multi-horizon forecasting with interpretable insights into temporal dynamics. Additionally, the model identifies the significance of each input variable, recognizes persistent temporal patterns, and highlights extreme events. Although it has been applied in a few studies across different domains, the full potential of this model remains largely unexplored. The study focused on Sundargarh, an upper catchment of the Mahanadi basin in India, with the aim of capturing pristine flow conditions. QGIS was employed to delineate the catchment, and daily streamflow data from 1982 to 2020 were obtained from the Central Water Commission. Input variables included precipitation, potential evaporation, temperature, and soil water volume at different depths. Precipitation and temperature data were obtained from the India Meteorological Department (IMD), while the other variables were sourced from the ECMWF fifth-generation reanalysis (ERA5). Hyperparameter tuning was conducted using the Optuna optimization framework, known for its efficiency and easy parallelization. The model, trained using a quantile loss function with different combinations of quantiles, performed best with upper quantiles. Evaluations using R² and the Nash-Sutcliffe efficiency (NSE) indicated good performance in monthly streamflow predictions for the testing sets, particularly for low and medium flows, which were predicted with confidence. While peak flows were well predicted at certain timesteps, they were underpredicted at others. Unlike other ML algorithms, TFT can learn seasonality and lag patterns directly from raw training data, and can identify the most important variables.
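As an illustration of the two evaluation metrics named above, the following is a minimal sketch in Python with NumPy. It is not the code used in the study: the function names are hypothetical, and R² is assumed here to mean the squared Pearson correlation between observed and simulated flows, which is a common convention in hydrology but is not stated in the abstract.

```python
import numpy as np


def nash_sutcliffe_efficiency(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    residual_ss = np.sum((observed - simulated) ** 2)
    variance_ss = np.sum((observed - observed.mean()) ** 2)
    return float(1.0 - residual_ss / variance_ss)


def r_squared(observed, simulated):
    """Squared Pearson correlation (an assumed definition of R²)."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    r = np.corrcoef(observed, simulated)[0, 1]
    return float(r ** 2)


# Made-up monthly flows, for illustration only.
obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
perfect = obs.copy()
climatology = np.full_like(obs, obs.mean())

nse_perfect = nash_sutcliffe_efficiency(obs, perfect)
nse_climatology = nash_sutcliffe_efficiency(obs, climatology)
r2_biased = r_squared(obs, 2.0 * obs + 1.0)
```

An NSE of 1 indicates a perfect match, while an NSE of 0 means the forecast is no better than predicting the observed mean. Because the squared correlation is insensitive to a linear bias (the last example still gives a value of 1), the two metrics are usually read together.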
The model was trained on periods of different lengths to check whether performance improved with longer data records. To gain a better understanding of how distinct sub-processes affect streamflow patterns at various time scales, the model was also applied at pentad and daily scales. The evaluation at extreme values prompted an investigation into improving predictions by adjusting the quantile loss function. Given the computational expense of daily streamflow forecasting using TFT with multiple variables, parallel computing was employed. Results demonstrated considerable accuracy, but validating TFT's interpretive abilities requires testing against alternative ML models.
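The role of upper quantiles in the loss can be sketched with the standard quantile (pinball) loss. This is a minimal NumPy illustration under stated assumptions, not the training code of the study: the function names and the example quantile set are hypothetical, and averaging over quantiles is one possible way to combine them.

```python
import numpy as np


def quantile_loss(observed, predicted, q):
    """Pinball loss for a single quantile q in (0, 1)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    error = observed - predicted
    # Under-prediction (error > 0) is weighted by q,
    # over-prediction (error < 0) by (1 - q).
    return float(np.mean(np.maximum(q * error, (q - 1.0) * error)))


def multi_quantile_loss(observed, predictions_by_quantile):
    """Average pinball loss over a dict {quantile: predictions}."""
    losses = [
        quantile_loss(observed, pred, q)
        for q, pred in predictions_by_quantile.items()
    ]
    return float(np.mean(losses))


# Made-up peak flow of 10 units, missed by 2 units in each direction.
under = quantile_loss([10.0], [8.0], q=0.9)   # forecast too low
over = quantile_loss([10.0], [12.0], q=0.9)   # forecast too high

# Hypothetical set of quantiles and forecasts, for illustration only.
combined = multi_quantile_loss(
    [10.0], {0.5: [9.0], 0.9: [11.0]}
)
```

With q = 0.9, missing a peak from below costs nine times more than overshooting it by the same amount, which is one reason that a loss weighted toward upper quantiles can favor better predictions of high flows.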
How to cite: Mohan, M. and Kumar D, N.: Multivariate multi-horizon streamflow forecasting for extremes and their interpretation using an explainable deep learning architecture, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-451, https://doi.org/10.5194/egusphere-egu24-451, 2024.