HS3.6 | Hydroinformatics: data analytics, machine learning, hybrid modelling, optimisation
EDI
Including Arne Richter Awards for Outstanding ECS Lecture
Convener: Claudia Bertini (ECS) | Co-conveners: Alessandro Amaranto (ECS), Niels Schuetze, Pascal Horton
Orals | Thu, 01 May, 14:00–15:45 (CEST), Room C; Fri, 02 May, 08:30–12:30 (CEST), Room 3.16/17
Posters on site | Attendance Thu, 01 May, 16:15–18:00 (CEST) | Display Thu, 01 May, 14:00–18:00 | Hall A
Posters virtual | Attendance Tue, 29 Apr, 14:00–15:45 (CEST) | Display Tue, 29 Apr, 08:30–18:00 | vPoster spot A
Hydroinformatics has emerged over the last decades to become a recognised and established field of independent research within the hydrological sciences. It is concerned with the development and application of mathematical modelling, information technology, systems science and computational intelligence tools in hydrology. Hydroinformatics nowadays also deals with collecting, handling, analysing and visualising Big Data sourced from remote sensing, Internet of Things (IoT), earth and climate models, and defining tools and technologies for smart water management solutions.
This session aims to provide an active forum in which to demonstrate and discuss the integration and appropriate application of emergent techniques and technologies in water-related contexts.
Topics addressed in the session include:
* Predictive and exploratory models based on the methods of statistics, computational intelligence, machine learning and data science: neural networks, fuzzy systems, genetic programming, cellular automata, chaos theory, etc.
* Methods for analysing Big Data and complex datasets (remote sensing, IoT, earth system models, climate models): principal and independent component analysis, time series analysis, clustering, information theory, etc.
* Optimisation methods associated with heuristic search procedures (various types of genetic and evolutionary algorithms, randomised and adaptive search, etc.) and their application to hydrology and water resources systems
* Multi-model approaches and hybrid modelling approaches that blend process-based (mechanistic) and data-driven (machine learning) models
* Data assimilation, model reduction in integrated modelling, and High-Performance Computing (HPC) in water modelling
* Novel methods for analysing and quantifying model uncertainty and sensitivity
* Smart water data models and software architectures for linking different types of models and data sources
* IoT and Smart Water Management solutions
* Digital Twins for hydrology and water resources
Applications could belong to any area of hydrology or water resources, such as rainfall-runoff modelling, hydrometeorological forecasting, sedimentation modelling, analysis of meteorological and hydrologic datasets, linkages between numerical weather prediction and hydrologic models, model calibration, model uncertainty, optimisation of water resources, smart water management.

Orals: Thu, 1 May | Room C

Chairpersons: Claudia Bertini, Alessandro Amaranto, Pascal Horton
14:00–14:05
Artificial Intelligence in Hydrology
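14:05–14:35 | EGU25-5863 | ECS | solicited | Highlight | Arne Richter Awards for Outstanding ECS Lecture | On-site presentation
Long Short-Term Memory networks in hydrology: From free-time project to Google’s operational flood forecasting model
Frederik Kratzert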

Long Short-Term Memory networks (LSTMs) have been around since the early 1990s, but only in the last few years have they gained significant popularity in the hydrological sciences. Related publication counts have grown exponentially, and LSTMs power some of the largest-scale operational flood forecasting systems.

In this presentation, I'll look back at my relatively short career as a student and researcher at the intersection of hydrology and machine learning. I don't claim to have introduced LSTMs to hydrology, but I'll share my own experience helping to develop this modeling approach into what it is today. We will look at what I saw in this neural network architecture, and why I thought it was well suited for hydrologic applications.

The tale goes as follows: Once upon a time, in a land (not so) far far away, a (not so young) master's student of environmental engineering was teaching himself the dark arts of machine learning (ML). While studying ML for automated fish detection, he stumbled upon the LSTM architecture. Having just concluded a course on the design of conceptual hydrological models, he noticed the underlying similarity between the LSTM and these established approaches and, more generally, the conceptual approach to modeling the water cycle. With one of his dearest colleagues and friends, he started to work night and day (actually more nights than days) to see if the LSTM was indeed suitable for hydrology. From initial attempts at emulating the ABC and HBV models to the first real-world experiments in individual catchments, the LSTM showed great potential. But it was not until he discovered the CAMELS dataset and started experimenting with large-sample hydrology that he fully understood the potential of LSTMs for applications in hydrology. Equipped with nothing more than his first GPU, he embarked on a quest to explore the wondrous lands of academia. Countless nights were spent at the computer, forging transatlantic friendships, conducting experiments and writing publications. Eventually, he ascended to the ranks of PhDs by defending his research against Reviewer #2 and the high council of the PhD committee. Fast forward to today: LSTMs are widely used and, among other things, power Google’s current operational, global-scale flood forecasting model. And thus, the now (not so) old research scientist lived happily ever after with his wife and his children, and continues, to this day, to do much the same as he had in those earlier years.

If there is one thing I would like you to take away from this talk, it is encouragement for young scientists to stay curious, to follow their own ideas, not to be discouraged by initial pushback, and not to be afraid of reaching out to more senior researchers. I also want to advocate strongly for open science, reproducibility, collaboration, benchmarking and open data sharing as ways to advance science.

How to cite: Kratzert, F.: Long Short-Term Memory networks in hydrology: From free-time project to Google’s operational flood forecasting model, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-5863, https://doi.org/10.5194/egusphere-egu25-5863, 2025.

14:35–14:50
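14:50–15:00 | EGU25-3093 | On-site presentation
Positioning ML Models for Spatial and Temporal Modeling of River Flows Through Causality and Information Content Analyses
Amin Elshorbagy, Duc-Hai Nguyen, M. Naveed Khaliq, M. Khaled Akhtar, and Fisaha Unduche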

The use of artificial intelligence (AI) and machine learning (ML) approaches in various scientific and engineering disciplines has grown exponentially over recent years. This upsurge also includes applications of physics-guided ML models and explainable AI. However, in addition to the difficulties involved in identifying relevant model inputs, the advantages, contributions, and credibility of ML models remain open challenges, especially when these models are evaluated against the perceptual hydrologic understanding of the system in question. In this study, we investigate some of these challenges using the case of seasonal streamflow forecasting with lead times of up to three months in several hydrologically challenging river basins of the Prairie Provinces of Canada (i.e., Alberta, Saskatchewan, and Manitoba).

Multiple ML techniques, including Random Forest (RF) and Long Short-Term Memory (LSTM) models, are used to produce ensemble forecasts for 135 sub-basins of the Nelson-Churchill River Basin, which spans the vast area from the Rocky Mountains to Hudson Bay, at a monthly temporal resolution and at spatial scales on the order of 200 km² to ~1.0 × 10⁶ km², as reflected by the drainage areas of the sub-basins. A large set of potential inputs (105 predictors) is used in this study. These potential inputs include hydrometeorological variables derived from the Daymet database and Environment and Climate Change Canada’s hydrometric network, hydrometeorological forecasts from the European Centre for Medium-Range Weather Forecasts, and various static attributes of all sub-basins.

Pearson’s correlation coefficient (CC) and Partial Mutual Information (PMI) were used, as model-agnostic methods, to analyze the set of potential predictors and identify the most appropriate inputs for seasonal flow forecasting prior to ML model development. Subsequently, modeling experiments were designed to investigate the ML model performance and to test the usefulness of the CC- and PMI-based techniques on the modeling results. The model-agnostic and model-dependent findings were compared and analyzed in light of the perceptual understanding of the hydrological system. Furthermore, the Convergent Cross-Mapping (CCM) method was used with selected variables to explore causal, rather than correlational, relationships and to interpret the results with the aim of developing ethical and responsible ML (ERML) models. We define ERML models as data-driven models that are transparent and hydrologically explainable.
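The two-stage input screening described above can be illustrated with a minimal sketch on synthetic data. It is not the authors' implementation: PMI is commonly computed with nonparametric conditional expectations and kernel density estimates, whereas this sketch swaps in linear-regression residuals and a histogram estimate of mutual information. The function names, the CC threshold of 0.3, and the toy predictors are all hypothetical.

```python
import numpy as np

def pearson_abs(X, y):
    """Absolute Pearson correlation of each column of X with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    num = (Xc * yc[:, None]).sum(axis=0)
    den = np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum())
    return np.abs(num / den)

def hist_mutual_info(a, b, bins=10):
    """Histogram estimate of mutual information (in nats) between a and b."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p = joint / joint.sum()
    pa = p.sum(axis=1, keepdims=True)   # marginal of a, shape (bins, 1)
    pb = p.sum(axis=0, keepdims=True)   # marginal of b, shape (1, bins)
    mask = p > 0
    return float((p[mask] * np.log(p[mask] / (pa @ pb)[mask])).sum())

def residual(v, Z):
    """Remove the linear effect of the already selected predictors Z from v.
    (Assumption: a linear stand-in for the conditional expectation in PMI.)"""
    A = np.column_stack([np.ones(len(v)), Z])
    coef, *_ = np.linalg.lstsq(A, v, rcond=None)
    return v - A @ coef

def select_inputs(X, y, cc_threshold=0.3, n_select=2):
    """Stage 1: keep predictors with |CC| >= threshold.
    Stage 2: greedily add the predictor sharing the most information with
    the part of y not yet explained by the predictors already selected."""
    candidates = [int(j) for j in np.flatnonzero(pearson_abs(X, y) >= cc_threshold)]
    selected = []
    while candidates and len(selected) < n_select:
        Z = X[:, np.array(selected, dtype=int)]
        y_res = residual(y, Z)
        scores = [hist_mutual_info(residual(X[:, j], Z), y_res) for j in candidates]
        best = candidates[int(np.argmax(scores))]
        selected.append(best)
        candidates.remove(best)
    return selected

# Synthetic demo: x1 is a noisy copy of x0 (redundant), x3 is pure noise.
rng = np.random.default_rng(42)
n = 2000
x0 = rng.normal(size=n)
x1 = x0 + 0.5 * rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x0, x1, x2, x3])
y = x0 + x2 + 0.1 * rng.normal(size=n)

selected = select_inputs(X, y)
print(sorted(selected))
```

In this toy setting the correlation screen alone would retain the redundant copy `x1`, while the partial-information step is intended to drop it once `x0` has been selected, which mirrors the filtering role the abstract attributes to PMI.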

The preliminary results of this study indicate that PMI is quite effective in filtering some of the CC-based selections, which might form multiple equifinal sets of predictors. This step is critical for identifying the most relevant and necessary inputs. Despite the coarse spatial and temporal resolutions, which complicate crisp hydrologic perceptions, the CCM method seems to support the selection of various input variables with hydrologic causality, strengthening the transparency and credibility of ML models.

How to cite: Elshorbagy, A., Nguyen, D.-H., Khaliq, M. N., Akhtar, M. K., and Unduche, F.: Positioning ML Models for Spatial and Temporal Modeling of River Flows Through Causality and Information Content Analyses, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-3093, https://doi.org/10.5194/egusphere-egu25-3093, 2025.

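15:00–15:10 | EGU25-2018 | ECS | On-site presentation
Advancing AI and Deep Learning Applications in Hydrological Prediction: Insights on Regional Model Development
Farzad Hosseini, Cristina Prieto, and Cesar Álvarez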

The application of artificial intelligence and deep learning (DL) in hydrological sciences presents significant challenges and opportunities, particularly in regional and large-scale modeling. Building on the foundational works of Valiela (2001) and Beven (2020), which underscore the importance of catchment-wise performance evaluation and the uniqueness of place in regional model comparisons, this study investigates the nuanced implementation of deep neural networks (DNNs), specifically Long Short-Term Memory (LSTM) networks, for regional rainfall-runoff predictions. Insights from recent advancements in LSTM-based rainfall-runoff modeling (Kratzert et al., 2024) and ensemble learning of catchment-wise regional LSTMs (Hosseini et al., 2024, 2025) emphasize the critical role of network architecture and training strategies.

The findings reveal that regionally optimized DNNs with identical neurons (e.g., LSTM cells) but differing architectures (hyperparameters) can exhibit meaningfully distinct behaviors on the same dataset. For instance, one model captured region-wide generalizable patterns by greedily prioritizing overall accuracy in natural basins but underperformed in specific catchments, whereas another optimized version emphasized anomalies (e.g., data deficiencies or snow processes) or human-induced influences (regulated flows), leading to improved accuracy in specific locations. Ensemble deep learning, combined with systematic hyperparameter optimization of regional LSTMs, effectively mitigates these discrepancies by synthesizing diverse learning perspectives into robust and accurate predictions, in line with the “wisdom of the crowd” principle (Surowiecki, 2004). This approach enhances the potential scalability of “one-size-fits-all” large-scale hydrological DNNs, advancing the development of high-accuracy regional hydrological models.
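The “wisdom of the crowd” effect mentioned above can be illustrated with a minimal sketch on synthetic data. The ensemble members here are hypothetical stand-ins (an observed series plus independent errors), not trained LSTMs; the unweighted mean and the Nash-Sutcliffe efficiency (NSE) are assumptions chosen for illustration, since the abstract does not state how members are combined or scored.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean of obs."""
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

rng = np.random.default_rng(0)

# Synthetic "observed" daily streamflow with a seasonal cycle (arbitrary units).
days = np.arange(365)
obs = 5.0 + 3.0 * np.sin(2.0 * np.pi * days / 365.0)

# Hypothetical ensemble members: each one is the observation plus its own
# independent error, standing in for differently configured regional models.
n_members = 8
members = np.stack(
    [obs + rng.normal(0.0, 1.0, size=obs.size) for _ in range(n_members)]
)

# Combine members with a simple unweighted mean (an assumption of this sketch).
ensemble_mean = members.mean(axis=0)

member_scores = [float(nse(obs, m)) for m in members]
ensemble_score = float(nse(obs, ensemble_mean))

print("best single member NSE:", max(member_scores))
print("ensemble mean NSE:     ", ensemble_score)
```

Because the member errors in this sketch are independent, averaging cancels much of them; real ensemble members typically have correlated errors, so the gain in practice would be smaller than in this idealized case.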

Despite computational challenges, the findings underscore the potential of large-scale hydrological models powered by intelligent agents and environment-aware frameworks (Russell & Norvig, 2020), and emphasize the transformative interplay of DL architectures, ensemble strategies, and scalability in AI-driven hydrological modeling.

References

Valiela, I., 2001, Doing Science: Design, Analysis, and Communication of Scientific Research, Oxford University Press.

Beven, K., 2020, Deep learning, hydrological processes and the uniqueness of place, Hydrol. Process., 34 (16), pp. 3608-3613.

Kratzert, F., et al., 2024, HESS Opinions: Never train a Long Short-Term Memory (LSTM) network on a single basin, Hydrol. Earth Syst. Sci., 28 (17), pp. 4187-4201.

Hosseini, F., et al., 2024, Hyperparameter optimization of regional hydrological LSTMs by random search, J. Hydrol., 643, 132003, https://doi.org/10.1016/j.jhydrol.2024.132003.

Hosseini, F., et al., 2025, Ensemble learning of catchment-wise optimized LSTMs enhances regional rainfall-runoff modelling, J. Hydrol., 646, 132269, https://doi.org/10.1016/j.jhydrol.2024.132269.

Surowiecki, J., 2004, The Wisdom of Crowds: Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economies, Societies, and Nations, Doubleday.

Russell, S., and Norvig, P., 2020, Artificial Intelligence: A Modern Approach, Pearson.

How to cite: Hosseini, F., Prieto, C., and Álvarez, C.: Advancing AI and Deep Learning Applications in Hydrological Prediction: Insights on Regional Model Development, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-2018, https://doi.org/10.5194/egusphere-egu25-2018, 2025.