Do We Need Deep Learning Models? Assessing the Complexity of Machine Learning Models for Seasonal Streamflow Forecasting Across Diverse Subbasins

Amin Elshorbagy; Duc-Hai Nguyen; Muhammad Naveed Khaliq; Fisaha Unduche; M. Khaled Akhtar

doi:https://doi.org/10.5194/egusphere-egu26-8281

EGU26-8281, updated on 14 Mar 2026

EGU General Assembly 2026

© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.

Amin Elshorbagy¹, Duc-Hai Nguyen¹, Muhammad Naveed Khaliq², Fisaha Unduche³, and M. Khaled Akhtar⁴

¹Civil, Geological, and Environmental Engineering, University of Saskatchewan, Saskatoon, Canada (amin.elshorbagy@usask.ca)
²Ocean, Coastal, and River Engineering Research Centre, National Research Council Canada, Ottawa, Canada
³Manitoba Transportation and Infrastructure, Hydrologic Forecast Centre, Winnipeg, Canada
⁴Alberta Environment and Protected Areas, Government of Alberta, Edmonton, Canada

Over the past few years, the use of machine learning (ML) models in hydrology has shifted towards deep learning (DL), perhaps because of the increasing availability of big data and ready-to-deploy software tools. The underlying assumption seems to be that more complex (DL) models are desirable; however, there have been fewer efforts to systematically investigate and validate this perception. Deep learning models like LSTM have greatly improved sequential data forecasting, but their success depends on large labeled datasets, limiting their effectiveness in data-scarce domains. This study investigates whether complex ML models offer significant advantages over simpler approaches in predicting seasonal streamflow in Canada. Using a comprehensive case study, we examine multiple subbasin types (mountain, prairie, and non-prairie), along with headwater and downstream locations, exhibiting both natural and human-influenced regulated flow regimes. These variations introduce distinct hydrological behaviors, making them an ideal testbed for assessing model complexity requirements. Our case study includes 135 subbasins from the Canadian Nelson-Churchill River Basin, which spans the vast area from the Rocky Mountains to Hudson Bay, at a monthly temporal resolution and with spatial scales on the order of 200 km² to ~1.0 × 10⁶ km², as reflected by the drainage areas of the subbasins.

We implemented a suite of ML techniques, ranging from traditional algorithms to advanced DL architectures. Specifically, we compared models based on Artificial Neural Networks (ANNs), Random Forests (RF), Long Short-Term Memory (LSTM) networks, attention-based LSTM networks, and stacked LSTM configurations. Each model was trained and tested using historical flow data across multiple subbasins, with performance evaluated through metrics such as the Nash-Sutcliffe Efficiency and Percent Mean Bias Error. We also experimented with alternative sets of input features, i.e., (i) all potential hydrometeorological inputs, (ii) correlation and partial mutual information-based inputs, and (iii) causality-based inputs.
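As an illustration of the two evaluation metrics named above, the following is a minimal sketch rather than the authors' implementation. The Nash-Sutcliffe Efficiency follows its standard definition. The abstract does not define Percent Mean Bias Error, so the form used here (total forecast error as a percentage of total observed flow, positive for overestimation) is an assumption, as are the function names and the sample flow values.

```python
from typing import Sequence


def _check_inputs(observed: Sequence[float], simulated: Sequence[float]) -> None:
    # Both series must cover the same, non-empty set of time steps.
    if len(observed) == 0 or len(observed) != len(simulated):
        raise ValueError("observed and simulated must be non-empty and of equal length")


def nash_sutcliffe_efficiency(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    _check_inputs(observed, simulated)
    mean_obs = sum(observed) / len(observed)
    residual_ss = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance_ss = sum((o - mean_obs) ** 2 for o in observed)
    if variance_ss == 0:
        raise ValueError("observed series has zero variance; NSE is undefined")
    return 1.0 - residual_ss / variance_ss


def percent_mean_bias_error(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Assumed definition: 100 * sum(sim - obs) / sum(obs).

    Sign conventions differ between studies; here a positive value
    means the model overestimates flow on average.
    """
    _check_inputs(observed, simulated)
    total_obs = sum(observed)
    if total_obs == 0:
        raise ValueError("sum of observed flows is zero; percent bias is undefined")
    return 100.0 * sum(s - o for o, s in zip(observed, simulated)) / total_obs


# Hypothetical monthly flows (m^3/s), for illustration only.
obs = [10.0, 20.0, 30.0, 40.0]
sim = [12.0, 18.0, 33.0, 39.0]
nse = nash_sutcliffe_efficiency(obs, sim)
pbias = percent_mean_bias_error(obs, sim)
print(f"NSE = {nse:.3f}, percent bias = {pbias:.1f}%")
```

An NSE of 1 indicates a perfect match, while a value of 0 means the model is no better than predicting the observed mean, which makes the metric a convenient common scale for comparing simple and complex models on the same subbasin.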

Our findings reveal that while simpler models like RF and ANNs perform adequately in certain contexts, particularly in headwater subbasins with natural flow regimes, complex architectures, such as LSTM and stacked LSTM configurations, demonstrate superior performance for downstream and regulated basins, where flow patterns exhibit higher variability and nonlinearity. In contrast, attention-based LSTM networks do not appear to outperform other options in certain basins. Interestingly, the benefits of DL models are not uniform across all subbasin types; prairie and non-prairie basins show mixed results, suggesting that model complexity should be tailored to basin characteristics rather than universally applied. These results highlight the importance of context-driven model selection to inform operational forecasting. Thus, water managers can leverage simpler models in less complex basins to reduce computational costs and data requirements, while reserving advanced architectures for highly regulated or downstream basins where accuracy gains justify the added complexity. This approach can potentially optimize resource allocation, enhance forecast reliability, and support informed decision-making in water allocations, reservoir operations, and drought/flood preparedness. Additionally, in the absence of data and computational constraints, multi-model outputs can be synthesized further through fusion modelling techniques to enhance overall prediction accuracy.
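The abstract does not say which fusion modelling technique is intended. As one simple possibility, the sketch below combines forecasts from several models with weights inversely proportional to each model's mean squared error. This weighting scheme, the function names, the model labels, and the sample values are all assumptions made for illustration, not the authors' method.

```python
from typing import Dict, List, Sequence


def inverse_error_weights(
    observed: Sequence[float], forecasts: Dict[str, Sequence[float]]
) -> Dict[str, float]:
    """Weight each model by 1 / MSE, normalised so the weights sum to 1."""
    eps = 1e-12  # guards against division by zero for a perfect model
    inverse_mse = {}
    for name, series in forecasts.items():
        if len(series) != len(observed) or len(observed) == 0:
            raise ValueError(f"forecast '{name}' does not match the observed series")
        mse = sum((o - f) ** 2 for o, f in zip(observed, series)) / len(observed)
        inverse_mse[name] = 1.0 / (mse + eps)
    total = sum(inverse_mse.values())
    return {name: value / total for name, value in inverse_mse.items()}


def fuse_forecasts(
    forecasts: Dict[str, Sequence[float]], weights: Dict[str, float]
) -> List[float]:
    """Weighted average of the model forecasts at each time step."""
    n_steps = len(next(iter(forecasts.values())))
    return [
        sum(weights[name] * forecasts[name][t] for name in forecasts)
        for t in range(n_steps)
    ]


# Hypothetical calibration-period flows and forecasts, for illustration only.
observed = [10.0, 20.0, 30.0]
model_forecasts = {
    "rf": [12.0, 22.0, 32.0],    # MSE = 4
    "lstm": [11.0, 21.0, 31.0],  # MSE = 1
}
weights = inverse_error_weights(observed, model_forecasts)
fused = fuse_forecasts(model_forecasts, weights)
print(weights, fused)
```

In practice the weights would be estimated on a calibration period and then applied to forecasts from a separate period; the single series here serves both roles only to keep the example short.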

How to cite: Elshorbagy, A., Nguyen, D.-H., Khaliq, M. N., Unduche, F., and Akhtar, M. K.: Do We Need Deep Learning Models? Assessing the Complexity of Machine Learning Models for Seasonal Streamflow Forecasting Across Diverse Subbasins, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8281, https://doi.org/10.5194/egusphere-egu26-8281, 2026.