EGU25-10650, updated on 15 Mar 2025
https://doi.org/10.5194/egusphere-egu25-10650
EGU General Assembly 2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
The relationship between theoretical maximum prediction limits of the LSTM and network size
Daniel Klotz1,2, Sanika Baste3, Ralf Loritz3, Martin Gauch4, and Frederik Kratzert2
  • 1IT:U Interdisciplinary Transformation University, Linz, Austria
  • 2Google Research, Vienna, Austria
  • 3Institute of Water and Environment, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
  • 4Google Research, Zurich, Switzerland

Machine learning is increasingly important for rainfall–runoff modelling. In particular, the community has widely adopted the Long Short-Term Memory (LSTM) network. One of the most important established best practices in this context is to train LSTMs on a large number of diverse basins (Kratzert et al., 2019; 2024). Intuitively, the reason for this practice is that training deep learning models on small and homogeneous data sets (e.g., data from only a single hydrological basin) leads to poor generalization, especially for high flows.

 

To examine this behavior, Kratzert et al. (2024) use a theoretical maximum prediction limit for LSTMs. This limit is computed as the L1 norm (i.e., the sum of the absolute values of the vector components) of the learned weight vector that relates the hidden states to the estimated streamflow. Hence, for random weight vectors we could obtain larger theoretical limits simply by increasing the size of the network (i.e., the number of parameters). However, since LSTMs are trained using gradient descent, the relationship for trained models is more intricate.
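
The L1-norm limit can be illustrated with a minimal NumPy sketch. It rests on the fact that each LSTM hidden state is an output gate value times a tanh and therefore lies in (-1, 1), so a linear output layer w·h cannot exceed the sum of |w_i| in magnitude. Everything else here is an assumption made for illustration and not taken from the abstract: the function names `l1_limit` and `random_head`, the omission of a bias term, the choice of hidden sizes, and the initialization U(-1/sqrt(n), 1/sqrt(n)), which is a common default for linear layers. The stand-in hidden states are random numbers, not the output of an actual LSTM.

```python
import numpy as np


def l1_limit(weights):
    """Sum of absolute weights: an upper bound on |w . h| when all |h_i| < 1."""
    return float(np.sum(np.abs(weights)))


def random_head(hidden_size, rng):
    """Hypothetical untrained output weights.

    Assumption: weights are drawn from U(-1/sqrt(n), 1/sqrt(n)), a common
    default initialization for linear layers. Other schemes scale differently.
    """
    bound = 1.0 / np.sqrt(hidden_size)
    return rng.uniform(-bound, bound, size=hidden_size)


rng = np.random.default_rng(0)

# Average limit of random (untrained) weight vectors for several network sizes.
hidden_sizes = [16, 64, 256, 1024]
limits = {}
for n in hidden_sizes:
    samples = [l1_limit(random_head(n, rng)) for _ in range(200)]
    limits[n] = float(np.mean(samples))
    print(f"hidden size {n:5d}: mean L1 limit {limits[n]:.2f}")

# Check the bound itself on stand-in hidden states squashed into (-1, 1).
w = random_head(64, rng)
h = np.tanh(rng.normal(size=(1000, 64)))
max_abs_output = float(np.max(np.abs(h @ w)))
print(f"largest |w . h| = {max_abs_output:.2f}, limit = {l1_limit(w):.2f}")
```

Under this assumed initialization the expected limit is sqrt(n)/2, so it grows with the hidden size even though each individual weight shrinks. A different initialization would give a different scaling law, and the sketch says nothing about what gradient descent does to the weights afterwards.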

 

This contribution explores the relationship between the theoretical limit and the network size. In particular, we will look at how increasing the network size of untrained models increases the prediction limit, and contrast this with the scaling behavior of trained models.



How to cite: Klotz, D., Baste, S., Loritz, R., Gauch, M., and Kratzert, F.: The relationship between theoretical maximum prediction limits of the LSTM and network size, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-10650, https://doi.org/10.5194/egusphere-egu25-10650, 2025.