EGU24-571, updated on 08 Mar 2024
https://doi.org/10.5194/egusphere-egu24-571
EGU General Assembly 2024
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Hydrological Significance of Input Sequence Lengths in LSTM-Based Streamflow Prediction

Farzad Hosseini Hossein Abadi1, Cristina Prieto Sierra1, Grey Nearing2, Cesar Alvarez Diaz1, and Martin Gauch2
Farzad Hosseini Hossein Abadi et al.
  • 1IHCantabria—Instituto de Hidráulica Ambiental de la Universidad de Cantabria, Santander, Spain (farzad.hosseini@unican.es)
  • 2Google Research

Abstract

Hydrological modeling of flashy catchments, susceptible to floods, represents a significant practical challenge.  Recent application of deep learning, specifically Long Short-Term Memory networks (LSTMs), have demonstrated notable capability in delivering accurate hydrological predictions at daily and hourly time intervals (Gauch et al., 2021; Kratzert et al., 2018).

In this study, we leverage a multi-timescale LSTM (MTS-LSTM (Gauch et al., 2021)) model to predict hydrographs in flashy catchments at hourly time scales. Our primary focus is to investigate the influence of model hyperparameters on the performance of regional streamflow models. We present methodological advancements using a practical application to predict streamflow in 40 catchments within the Basque Country (North of Spain).

Our findings show that 1) hourly and daily streamflow predictions exhibit high accuracy, with Nash-Sutcliffe Efficiency (NSE) reaching values as high as 0.941 and 0.966 for daily and hourly data, respectively; and 2) hyperparameters associated with the length of the input sequence exert a substantial influence on the performance of a regional model. Consistently optimal regional values, following a systematic hyperparameter tuning, were identified as 3 years for daily data and 12 weeks for hourly data. Principal component analysis (PCA) shows that the first principal component explains 12.36% of the variance among the 12 hyperparameters. Within this set of hyperparameters, the input sequence lengths for hourly data exhibit the highest load in PC1, with a value of -0.523; the load of the input sequence length for daily data is also very high (-0.36). This suggests that these hyperparameters strongly contribute to the model performance.

Furthermore, when utilizing a catchment-scale magnifier to determine optimal hyperparameter settings for each catchment, distinctive sequence lengths emerge for individual basins. This underscores the necessity of customizing input sequence lengths based on the “uniqueness of the place” (Beven, 2020), suggesting that each catchment may demand specific hydrologically meaningful daily and hourly input sequence lengths tailored to its unique characteristics. In essence, the true input sequence length of a catchment may encapsulate hydrological information pertaining to water transit over short and long-term periods within the basin. Notably, the regional daily sequence length aligns with the highest local daily sequence values across all catchments.

In summary, our investigation stresses the critical role of the input sequence length as a hyperparameter in LSTM networks. More broadly, this work is a step towards a better understanding and achieving accurate hourly predictions using deep learning models.

 

Keywords

Hydrological modeling; Streamflow Prediction; LSTM networks; Hyperparameters configurations; Input sequence lengths

 

References:

Beven, K. (2020). Deep learning, hydrological processes and the uniqueness of place. Hydrological Processes, 34(16), 3608–3613. doi:10.1002/hyp.13805

Gauch, M., Kratzert, F., Klotz, D., Nearing, G., Lin, J., and Hochreiter, S. (2021). Rainfall–runoff prediction at multiple timescales with a single Long Short-Term Memory network, Hydrol. Earth Syst. Sci., 25, 2045–2062, DOI:10.5194/hess-25-2045-2021.

Kratzert, F., Klotz, D., Brenner, C., Schulz, K., & Herrnegger, M. (2018). Rainfall--runoff modelling using Long Short-Term Memory (LSTM) networks. Hydrology and Earth System Sciences, 22(11), 6005–6022. DOI:10.5194/hess-22-6005-2018.

 

How to cite: Hosseini Hossein Abadi, F., Prieto Sierra, C., Nearing, G., Alvarez Diaz, C., and Gauch, M.: Hydrological Significance of Input Sequence Lengths in LSTM-Based Streamflow Prediction, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-571, https://doi.org/10.5194/egusphere-egu24-571, 2024.

Supplementary materials

Supplementary material file

Comments on the supplementary material

AC: Author Comment | CC: Community Comment | Report abuse

supplementary materials version 1 – uploaded on 16 Apr 2024, no comments