HS3.6 | Hydroinformatics: data analytics, machine learning, hybrid modelling, optimisation
Including Arne Richter Awards for Outstanding ECS Lecture
Convener: Claudia Bertini (ECS) | Co-conveners: Alessandro Amaranto (ECS), Niels Schuetze, Pascal Horton
Orals | Thu, 01 May, 14:00–15:45 (CEST), Room C | Fri, 02 May, 08:30–12:30 (CEST), Room 3.16/17
Posters on site | Attendance Thu, 01 May, 16:15–18:00 (CEST) | Display Thu, 01 May, 14:00–18:00 | Hall A
Posters virtual | Attendance Tue, 29 Apr, 14:00–15:45 (CEST) | Display Tue, 29 Apr, 08:30–18:00 | vPoster spot A

Orals: Thu, 1 May | Room C

The oral presentations are given in a hybrid format supported by a Zoom meeting featuring on-site and virtual presentations. The button to access the Zoom meeting appears just before the time block starts.
Chairpersons: Claudia Bertini, Niels Schuetze
14:00–14:05
Artificial Intelligence in Hydrology
14:05–14:35 | EGU25-5863 | ECS | solicited | Highlight | Arne Richter Awards for Outstanding ECS Lecture | On-site presentation
Frederik Kratzert

Long Short-Term Memory networks (LSTMs) have been around since the early '90s, but only in the last few years have LSTMs gained significant popularity in the hydrological sciences. Related publication counts have grown exponentially, and LSTMs power some of the largest-scale operational flood forecasting systems.

In this presentation, I'll look back at my relatively short career as a student and researcher at the intersection of hydrology and machine learning. I don't claim to have introduced LSTMs to hydrology, but I'll share my own experience helping to develop this modeling approach into what it is today. We will look at what I saw in this neural network architecture, and why I thought it was well suited for hydrologic applications.

The tale goes as follows: Once upon a time, in a land (not so) far far away, a (not so young) master student of environmental engineering was teaching himself the dark arts of machine learning (ML). While studying ML for automated fish detection, he stumbled upon the LSTM architecture. Having just concluded a course on the design of conceptual hydrological models, he noticed the underlying similarity between the LSTM and these established approaches — and more generally, the conceptual approach for modeling the water cycle. With one of his dearest colleagues and friends, he started to work night and day (actually more nights than days) to see if the LSTM is indeed suitable for hydrology. From initial attempts at emulating the ABC and HBV models, to first real-world experiments in individual catchments, the LSTM was showing great potential. But it was not until he discovered the CAMELS dataset and started experimenting with large-sample hydrology that he fully understood the potential of LSTMs for applications in hydrology. Equipped with nothing more than his first GPU, he embarked on a quest to explore the wondrous lands of academia. Countless nights were spent on the computer, forging transatlantic friendships, conducting experiments and writing publications. Eventually, he ascended to the ranks of PhDs by defending his research against Reviewer #2 and the high council of the PhD committee. Fast forward in time, today, LSTMs are widely used and among others, power Google’s current operational, global-scale flood forecasting model. And thus, the now (not so) old research scientist lived happily ever after with his wife and his children, and continues, to this day, to do much the same as he had in those earlier years.

If there is one thing I would like you to take away from this talk, it is this: I hope my presentation motivates young scientists to stay curious, to follow their own ideas, to not get demotivated by initial pushback, and to not be afraid of reaching out to more senior researchers. I also want to strongly advocate the importance of open science, reproducibility, collaboration, benchmarking and open data sharing to advance science.

How to cite: Kratzert, F.: Long Short-Term Memory networks in hydrology: From free-time project to Google’s operational flood forecasting model, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-5863, https://doi.org/10.5194/egusphere-egu25-5863, 2025.

14:35–14:50
14:50–15:00 | EGU25-3093 | On-site presentation
Amin Elshorbagy, Duc-Hai Nguyen, M. Naveed Khaliq, M. Khaled Akhtar, and Fisaha Unduche

The use of artificial intelligence (AI) and machine learning (ML) approaches in various scientific and engineering disciplines has grown exponentially over recent years. This upsurge also includes applications of physics-guided ML models and explainable AI. However, in addition to the difficulties involved in the identification of relevant model inputs, the advantages, contributions, and credibility of ML models are still open challenges, especially when these models are evaluated against the perceptual hydrologic understanding of the system in question. In this study, we aim to investigate some of these challenges using the case of seasonal streamflow forecasting with lead times up to three months in several hydrologically challenging river basins of the Prairie Provinces of Canada (i.e., Alberta, Saskatchewan, and Manitoba).

Multiple ML techniques, including Random Forest (RF) and Long Short-Term Memory (LSTM) models, are used to produce ensemble forecasts for 135 sub-basins of the Nelson-Churchill River Basin, comprising the vast area from the Rocky Mountains to Hudson Bay, at a monthly temporal resolution and spatial scales on the order of 200 km² to ~1.0 × 10⁶ km², as reflected by the drainage areas of all sub-basins. A large set of potential inputs (105 predictors) is used in this study. These potential inputs include hydrometeorological variables derived from the Daymet database, Environment and Climate Change Canada’s hydrometric network, and hydrometeorological forecasts from the European Centre for Medium-Range Weather Forecasts, as well as various static attributes of all sub-basins.

Pearson’s correlation coefficient (CC) and Partial Mutual Information (PMI) were used, as model-agnostic methods, to analyze the set of potential predictors and identify the most appropriate inputs for seasonal flow forecasting prior to ML model development. Subsequently, modeling experiments were designed to investigate ML model performance and test the usefulness of the CC- and PMI-based selections for the modeling results. The model-agnostic and model-dependent findings were compared and analyzed in light of the perceptual understanding of the hydrological system. Furthermore, the Convergent Cross-Mapping (CCM) method was used with selected variables to further explore causal, rather than correlational, relationships and interpret the results with the aim of developing ethical and responsible ML (ERML) models. We define ERML models as data-driven models that are transparent and hydrologically explainable.
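
For readers unfamiliar with this kind of model-agnostic screening, a minimal two-stage sketch in Python is given below: correlation screening followed by an information-based re-ranking. It uses scikit-learn's mutual_info_regression as a generic stand-in for the partial mutual information computation (the actual PMI algorithm differs), and all data, thresholds and variable names are purely illustrative.

import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 105))                               # hypothetical predictor matrix (months x predictors)
y = X[:, 3] + 0.5 * X[:, 10] + rng.normal(scale=0.5, size=300)  # hypothetical seasonal-flow target

# Step 1: correlation-based screening (the |CC| threshold is illustrative)
cc = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
cc_selected = np.where(np.abs(cc) > 0.2)[0]

# Step 2: information-based refinement of the CC-selected set
mi = mutual_info_regression(X[:, cc_selected], y, random_state=0)
ranked = cc_selected[np.argsort(mi)[::-1]]
print("Top predictors by mutual information:", ranked[:10])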

The preliminary results of this study indicate that PMI is quite effective in filtering some of the CC-based selections, which might otherwise form multiple equifinal sets of predictors. This step is critical for identifying the most relevant and necessary inputs. In spite of the coarse spatial and temporal resolutions, which complicate crisp hydrologic perceptions, the CCM method seems to support the selection of various input variables with hydrologic causality, strengthening the transparency and credibility of ML models.

How to cite: Elshorbagy, A., Nguyen, D.-H., Khaliq, M. N., Akhtar, M. K., and Unduche, F.: Positioning ML Models for Spatial and Temporal Modeling of River Flows Through Causality and Information Content Analyses, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-3093, https://doi.org/10.5194/egusphere-egu25-3093, 2025.

15:00–15:10 | EGU25-2018 | ECS | On-site presentation
Farzad Hosseini, Cristina Prieto, and Cesar Álvarez

The application of artificial intelligence and deep learning (DL) in hydrological sciences presents significant challenges and opportunities, particularly in regional and large-scale modeling. Building on the foundational works of Valiela (2001) and Beven (2020)—which underscore the importance of catchment-wise performance evaluation and the uniqueness of place in regional model comparisons—this study investigates the nuanced implementation of deep neural networks (DNNs), specifically Long Short-Term Memory (LSTM) networks, for regional rainfall-runoff predictions. Insights from recent advancements in LSTM-based rainfall-runoff modeling (Kratzert et al., 2024) and ensemble learning of catchment-wise regional LSTMs (Hosseini et al., 2024, 2025) emphasize the critical role of network architecture and training strategies.

Findings reveal that regionally optimized DNNs with identical numbers of neurons (e.g., LSTM cells) but differing architectures (hyperparameters) can exhibit meaningfully distinct behaviors on the same dataset. For instance, one model captured region-wide generalizable patterns by greedily prioritizing overall accuracy in natural basins while underperforming in specific catchments, whereas another optimized version emphasized anomalies (e.g., data deficiencies or snow processes) or human-induced influences (regulated flows), leading to improved accuracy in specific locations. Ensemble deep learning, combined with systematic hyperparameter optimization of regional LSTMs, effectively mitigates these discrepancies by synthesizing diverse learning perspectives into robust and accurate predictions, in line with the “wisdom of the crowd” principle (Surowiecki, 2004). This approach enhances the potential scalability of “one-size-fits-all” large-scale hydrological DNNs, advancing the development of high-accuracy regional hydrological models.

Despite computational challenges, the findings underscore the potential of large-scale hydrological models powered by intelligent, environment-aware agents (Russell & Norvig, 2020), emphasizing the transformative interplay of DL architectures, ensemble strategies, and scalability in AI-driven hydrological modeling.

References

Valiela, I., 2001, Doing Science: Design, Analysis & Communication of Scientific Research, Oxford University Press

Beven, K., 2020, Deep learning, hydrological processes & the uniqueness of place, Hydrol. Process., 34 (16), pp. 3608-3613

Kratzert, F., et al., 2024, HESS Opinions: Never train a Long Short-Term Memory (LSTM) network on a single basin, HESS, 28 (17), pp. 4187-4201

Hosseini, F., et al., 2024, Hyperparameter optimization of regional hydrological LSTMs by random search, J. Hydrol., 643, 132003, https://doi.org/10.1016/j.jhydrol.2024.132003

Hosseini, F., et al., 2025, Ensemble learning of catchment-wise optimized LSTMs enhances regional rainfall-runoff modelling, J. Hydrol., 646, 132269, https://doi.org/10.1016/j.jhydrol.2024.132269

Surowiecki, J., 2004, The Wisdom of Crowds: Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economies, Societies, and Nations. Doubleday.

Russell, S., & Norvig, P., 2020. Artificial intelligence: A modern approach. Pearson

How to cite: Hosseini, F., Prieto, C., and Álvarez, C.: Advancing AI and Deep Learning Applications in Hydrological Prediction: Insights on Regional Model Development, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-2018, https://doi.org/10.5194/egusphere-egu25-2018, 2025.

15:10–15:20 | EGU25-1686 | ECS | On-site presentation
Yiheng Du and Ilias G. Pechlivanidis

Post-processing large-scale hydrological models remains a significant challenge, particularly in ungauged basins, where limited observations hinder accurate representation of local hydrological conditions. In this study, we propose a machine learning (ML)-based approach for regionalizing and post-processing simulated streamflow from the E-HYPE hydrological model across the pan-European domain. Using Long Short-Term Memory (LSTM) models, we explored E-HYPE post-processing with two different regionalization strategies: (1) individual models trained for basins belonging to clusters of hydrological similarity (Cluster-Specific Model), and (2) a single model incorporating the hydrological clusters as categorical variables (Cluster-Informed Global Model). Performance was evaluated using multiple evaluation metrics (Mean Absolute Error, MAE; Nash-Sutcliffe Efficiency, NSE; and log transformed NSE, log-NSE) under a K-fold cross-validation framework allowing for spatial and temporal testing. Furthermore, the improvements at each location were assessed by examining different hydrological signatures, including mean, high (Q90) and low (Q20) streamflow situations, using the E-HYPE simulations as benchmark. Results show that both regionalization strategies achieve improvements in performance over raw simulations, including the ungauged basins (e.g. those that are excluded from the training dataset). The Cluster-Informed Global Model effectively balances regionalization and accuracy, outperforming the Cluster-Specific Model in both spatial and temporal testing, and it also shows enhanced representation of hydrological signatures. Building on these results, the Cluster-Informed Global Model was applied to all the catchments in E-HYPE, providing an updated pattern of hydrological signatures across the European domain. These findings highlight the potential of ML-based regionalization strategies to enhance hydrological model outputs and hence process understanding, particularly in data-scarce regions, potentially providing a framework for AI-enhancement of large-scale hydro-climate services.
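
As a rough illustration of the "cluster-informed" strategy (not the authors' E-HYPE code), the sketch below shows how a basin's hydrological-similarity cluster can be appended to every timestep as a one-hot categorical feature before a single global LSTM is trained; the number of clusters, array shapes and variable names are assumptions.

import numpy as np

n_clusters = 8

def build_inputs(dynamic, cluster_id):
    """Append a one-hot cluster code to every timestep of a basin's input series.

    dynamic: array (timesteps, n_dynamic) of simulated flow plus meteorological forcings
    cluster_id: integer hydrological-similarity cluster of the basin
    """
    onehot = np.zeros((dynamic.shape[0], n_clusters))
    onehot[:, cluster_id] = 1.0
    return np.concatenate([dynamic, onehot], axis=1)

x = build_inputs(np.random.rand(730, 5), cluster_id=3)
print(x.shape)  # (730, 13): 5 dynamic features + 8 cluster indicator columns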

How to cite: Du, Y. and Pechlivanidis, I. G.: Advancing ungauged catchment hydrology through regionalized ML-based post-processing, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-1686, https://doi.org/10.5194/egusphere-egu25-1686, 2025.

15:20–15:30 | EGU25-12083 | On-site presentation
Andras Bardossy, Jochen Seidel, and Eduardo Acuna

Artificial intelligence plays an increasingly significant role in many areas of our lives. Its applications in hydrology are becoming more common, and many authors have reported excellent results in modelling rainfall and predicting floods. However, alongside the successes, it is also important to understand the limitations of these models. This study presents various issues with potentially significant impacts on applications, using an LSTM model and the CAMELS-US and CAMELS-GB datasets.

The first important point is the problem of data quality. Hydrological observations are uncertain, with the largest error in observed discharge occurring with the highest measurements and the largest relative error with the smallest values. The error structure can change considerably due to alterations in riverbed geometry. Furthermore, areal rainfall is estimated based on point observations and is often biased, especially for extreme values (Bárdossy and Anwar 2023). Poor or variable quality of observational data can lead to suboptimal model outcomes. LSTM models act as bias correctors for many catchments by violating physical principles. For instance, water balances in catchments in the CAMELS-GB data are incorrect in more than 30% of the cases because evaporation is unrealistically high, which is compensated for by the LSTM models.
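
The long-term water-balance screening referred to above boils down to simple bookkeeping; a hedged sketch follows, assuming multi-year catchment means of precipitation P, actual evapotranspiration ET and discharge Q (all in mm/yr, values invented), where a negative closure term flags catchments in which ET and runoff together exceed precipitation.

import numpy as np

# hypothetical long-term catchment means in mm/yr
P  = np.array([850., 920., 640., 1100.])
ET = np.array([480., 700., 520.,  640.])
Q  = np.array([350., 310., 180.,  400.])

closure = P - ET - Q          # long-term storage change should be close to zero
suspect = closure < 0         # ET + Q > P is physically implausible over long periods
print("closure (mm/yr):", closure)
print("share of suspect catchments:", suspect.mean())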

The purpose of modelling is not to repeat what is already known but rather to predict behaviour under varying weather conditions or changing catchment characteristics. Thus, it is important to investigate how these models respond under altered conditions. An increase in precipitation results in inappropriate increases in evaporation in more than 60% of cases in the CAMELS-GB test series. Therefore, the use of these models for climate change studies is questionable.

A major advantage of using LSTMs for hydrology is their ability to provide regional models for a large number of catchments. This is significantly different from the usual modelling of individual catchments. Several studies use static catchment attributes for regional modelling. However, integrating these static attributes changes the model structure. It is shown that replacing the catchment attributes with a similar number of random numbers can yield comparably good results. Therefore, the models may not be reliably applicable to uncalibrated catchments or to changes within the catchments.

A frequently discussed problem with the application of AI to hydrological prediction of extreme events is its tendency not to extrapolate beyond the range of its training data. However, this is only a limited issue due to regional modelling. By modelling specific discharges, insights from catchments where extreme floods have occurred can be transferred to other catchments. This allows for the simulation of scenarios exceeding the maximum values previously observed in a single catchment.

How to cite: Bardossy, A., Seidel, J., and Acuna, E.: Is Artificial Intelligence the Ultimate Solution for Hydrological Modelling?, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-12083, https://doi.org/10.5194/egusphere-egu25-12083, 2025.

15:30–15:45

Orals: Fri, 2 May | Room 3.16/17

The oral presentations are given in a hybrid format supported by a Zoom meeting featuring on-site and virtual presentations. The button to access the Zoom meeting appears just before the time block starts.
Chairpersons: Pascal Horton, Alessandro Amaranto
Advanced methods in Hydroinformatics
08:30–08:40 | EGU25-7742 | ECS | On-site presentation
Byeongwon Lee, Hyemin Jeong, Younghun Lee, and Sangchul Lee

Parameter calibration of complex environmental models remains a significant challenge in watershed management, particularly when integrating multiple biogeochemical processes. Reinforcement learning (RL) has emerged as a promising approach for solving complex optimization problems, thanks to its ability to learn optimal strategies through continuous interaction and feedback. This study presents SWAT-C-RL, a novel approach that combines the Soil and Water Assessment Tool-Carbon (SWAT-C) and RL for efficient multi-objective parameter calibration. We implement a multi-agent degenerate proximal policy optimization framework that uniquely addresses the structural characteristics of SWAT-C models by optimizing both hydrological and carbon cycle parameters simultaneously. Each agent specializes in distinct parameter sets while coordinating through a shared reward mechanism, enabling comprehensive model calibration with reduced computational demands. The methodology was validated across two geographically and environmentally distinct watersheds: the Tuckahoe Creek Watershed (TCW, 220.7 km²) in the United States and the Miho River Watershed (MRW, 1,855 km²) in South Korea. The two watersheds, with their varying sizes, climate patterns, topography, and land use distributions, provided a test of the model's adaptability in simulating both water and carbon dynamics. Model performance will be evaluated using Nash-Sutcliffe Efficiency (NSE) and Percent Bias (P-bias) metrics, and the performance of SWAT-C-RL will be compared against traditional Sequential Uncertainty Fitting version 2 (SUFI-2) calibration. The findings from this study are expected to show the potential of integrated reinforcement learning approaches in environmental modeling, particularly for complex multi-objective calibration problems.
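
As an illustration only (not the SWAT-C-RL implementation), the sketch below shows a basic agent-environment loop for model calibration: an action in [0, 1] is mapped to physical parameter ranges, the hydrological model is run, and an NSE-based reward is returned. run_swatc is a hypothetical placeholder for a real SWAT-C execution, and all bounds and values are invented.

import numpy as np

def run_swatc(params):
    # stand-in for an actual SWAT-C model run (hypothetical); returns a daily flow series
    rng = np.random.default_rng(0)
    return (0.5 + params[0]) * rng.gamma(2.0, 3.0, size=365)

class SwatCalibrationEnv:
    """Toy sketch of the calibration loop as an RL environment: the agent proposes
    scaled parameter values, the environment runs the model and returns an
    NSE-based reward that all agents share."""
    def __init__(self, bounds, observed):
        self.bounds = np.asarray(bounds, dtype=float)     # (n_params, 2): lower/upper limits
        self.observed = np.asarray(observed, dtype=float)

    def step(self, action):
        # map actions in [0, 1] to physical parameter ranges
        params = self.bounds[:, 0] + np.asarray(action) * (self.bounds[:, 1] - self.bounds[:, 0])
        sim = run_swatc(params)
        nse = 1.0 - np.sum((sim - self.observed) ** 2) / np.sum((self.observed - self.observed.mean()) ** 2)
        return nse                                        # shared reward

env = SwatCalibrationEnv(bounds=[(0.0, 1.0), (10.0, 400.0)], observed=run_swatc(np.array([0.4, 100.0])))
print(env.step(np.array([0.4, 0.3])))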

How to cite: Lee, B., Jeong, H., Lee, Y., and Lee, S.: A reinforcement learning approach for parameter optimization in the SWAT-C model, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-7742, https://doi.org/10.5194/egusphere-egu25-7742, 2025.

08:40–08:50 | EGU25-20258 | ECS | On-site presentation
Mario Alberto Ponce-Pacheco, Linnaea Cahill, Ashray Tyagi, Anukool Nagi, Prashant Pastore, and Saket Pande

Increasing competition for water resources and rainfall variability driven by climate change have led to irrigation water scarcity, particularly in drought-prone regions such as Vidarbha, Maharashtra (India). Enhancing irrigation water efficiency is essential for sustainable agricultural intensification. However, adopting new technologies poses a risk for farmers, as it requires a significant investment of time and financial resources to modify their practices. In this context, we have developed a mobile application that implements a hybrid model combining a sociohydrological approach with a KPCA-based structural error model, providing farmers with timely information to support decision-making in the adoption of new irrigation technologies and the implementation of Good Agricultural Practices (GAPs), such as irrigation and fertilization. Although the model explains 20% of the observed variance in yields at the plot scale, its main purpose is to provide farm-scale predictions that encourage the adoption of GAPs. In this work, we venture into providing users with more precise forecasts that are directly applicable to forward-looking field advisories. By integrating higher-resolution data (e.g., Sentinel-2A) and exploring Bayesian methods along with machine learning techniques, the accuracy of the state variables, such as biomass and water storage, was enhanced. This advancement is incorporated into the mobile application to provide timely and precise forecasts to farmers in their decision-making process when implementing GAPs.

How to cite: Ponce-Pacheco, M. A., Cahill, L., Tyagi, A., Nagi, A., Pastore, P., and Pande, S.: Enhancing smallholder sociohydrological predictions at plot scale by novel data assimilation of high-resolution soil moisture and biomass data, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-20258, https://doi.org/10.5194/egusphere-egu25-20258, 2025.

08:50–09:00 | EGU25-5127 | ECS | On-site presentation
Eduardo Acuna, Frederik Kratzert, Daniel Klotz, Martin Gauch, Manuel Álvarez Chaves, Ralf Loritz, and Uwe Ehret

Long Short-Term Memory (LSTM) networks have demonstrated state-of-the-art performance for rainfall-runoff hydrological modeling. However, most studies focus on daily-scale predictions, limiting the benefits of sub-daily (e.g. hourly) predictions in applications like flood forecasting. Moreover, training an LSTM exclusively on sub-daily data is computationally expensive and may lead to model-learning difficulties due to the extended sequence lengths. In this study, we introduce a new architecture, the multi-frequency LSTM (MF-LSTM), designed to use inputs of various temporal frequencies to produce sub-daily (e.g. hourly) predictions at a moderate computational cost. Building on two existing methods previously proposed by coauthors of this study, the MF-LSTM processes older inputs at coarser temporal resolutions than more recent ones. The MF-LSTM can handle different temporal frequencies, with different numbers of input dimensions, in a single LSTM cell, enhancing generality and simplicity of use. Our experiments, conducted on 516 basins from the CAMELS-US dataset, demonstrate that the MF-LSTM retains state-of-the-art performance while offering a simpler design. Moreover, the MF-LSTM architecture achieved a 5x reduction in processing time compared to models trained exclusively on hourly data.
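
A hedged sketch of the general multi-frequency idea (not the MF-LSTM code itself): forcings older than a recent window are aggregated from hourly to daily resolution, so the sequence ultimately passed to a recurrent cell is far shorter than the full hourly record. The 14-day window and array shapes are illustrative.

import numpy as np

hourly = np.random.rand(365 * 24, 4)      # one year of hourly forcings, 4 variables

def multi_frequency_sequence(hourly, recent_days=14):
    """Aggregate everything older than `recent_days` to daily means,
    keep the most recent window at hourly resolution."""
    cut = hourly.shape[0] - recent_days * 24
    old_daily = hourly[:cut].reshape(-1, 24, hourly.shape[1]).mean(axis=1)
    recent_hourly = hourly[cut:]
    return old_daily, recent_hourly

daily_part, hourly_part = multi_frequency_sequence(hourly)
print(daily_part.shape, hourly_part.shape)  # (351, 4) (336, 4): 687 steps instead of 8760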

 

Reference

Acuña Espinoza, E., Kratzert, F., Klotz, D., Gauch, M., Álvarez Chaves, M., Loritz, R., & Ehret, U. (2024). Technical note: An approach for handling multiple temporal frequencies with different input dimensions using a single LSTM cell. EGUsphere, 2024, 1–12. https://doi.org/10.5194/egusphere-2024-3355

How to cite: Acuna, E., Kratzert, F., Klotz, D., Gauch, M., Álvarez Chaves, M., Loritz, R., and Ehret, U.: An approach for handling multiple temporal frequencies with different input dimensions using a single LSTM cell, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-5127, https://doi.org/10.5194/egusphere-egu25-5127, 2025.

09:00–09:10 | EGU25-14478 | On-site presentation
Dean Meason, Konstantinos Andreadis, Barbara Höck, Guilherme Cassales, Serajis Salekin, Priscilla Lad, Don White, Chanatda Somchit, Bruce Dudley, James Griffins, Jing Yang, Alec Dempster, Albert Bifet, João Palma, Adedamola Wuraola, and Amanda Matson

Land-use intensification and climate change are increasing pressure on water availability and use around the world. It is becoming urgent to understand hydrological cycles in order to manage water availability for natural and human systems. Forests cover 31% of global land area and are crucial for storing and releasing precipitation; however, it is difficult to quantify forest hydrology processes and apply the learnings from one watershed to another.

In New Zealand, the 5-year Forest Flows MBIE Endeavour Research Programme (https://www.scionresearch.com/science/sustainable-forest-and-land-management/Forest-flows-research-programme) investigated these challenges with the novel integration of various terrestrial and remote sensing data in Pinus radiata (D. Don) plantation forests. A total of 1,717 terrestrial sensors were deployed above and below ground in wireless IoT sensor networks across five watersheds spanning a range of climatic and physiographic regions. The Kafka Big Data Pipeline streamed, cleaned and stored the 360,000 observations collected every 24 hours. The fusion of temporally rich terrestrial data and spatially rich remote sensing data provided new insights into the mechanistic drivers of forest hydrological processes at the point (tree), watershed, and forest scales. Forest Flows used both traditional and machine learning methodologies, as well as process-based modelling, to quantify tree water use and watershed water storage and release.

 This presentation will introduce a novel deep learning (DL) framework applied to Big Data in environmental science, with a particular focus on the DL-based Neural Ordinary Differential Equations (NODE) Hydrological Framework. This innovative approach enabled high-precision super-resolution predictions of forest soil moisture derived from NASA's Soil Moisture Active Passive (SMAP) Mission, downscaling from a 9 km to a 1 km spatial resolution. Additionally, the framework provided reliable predictions for regions lacking direct observational products. We will demonstrate how this DL methodology can be leveraged to predict evapotranspiration, as well as surface and subsurface water fluxes, at fine spatial and temporal resolutions within forest ecosystems. The potential applications of this approach extend beyond forest environments, offering insights for other complex environmental Big Data challenges.

How to cite: Meason, D., Andreadis, K., Höck, B., Cassales, G., Salekin, S., Lad, P., White, D., Somchit, C., Dudley, B., Griffins, J., Yang, J., Dempster, A., Bifet, A., Palma, J., Wuraola, A., and Matson, A.: Forest Flows: the integration of remote sensing and terrestrial big data to quantify forest hydrological fluxes at multiple scales, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-14478, https://doi.org/10.5194/egusphere-egu25-14478, 2025.

09:10–09:20 | EGU25-12182 | ECS | On-site presentation
Giuditta Smerilli, Luca Lombardo, Anna Basso, Alberto Viglione, and Attilio Castellarin

Hydrological simulation in ungauged basins is essential for analysing extreme events and reconstructing historical data. A major challenge is deriving consistent model parameters that reflect basin characteristics. Regionalization methods address this by transferring information from gauged to ungauged basins, linking catchment attributes to model parameters.

An innovative approach - PArameter Set Shuffling (PASS) - uses a machine learning decision tree algorithm to establish relationships between locally calibrated parameters and basin descriptors, enabling spatially distributed and lumped parameter predictions. PASS has yielded valid results with semi-distributed hydrological models in flat terrains such as Germany and in more complex regions like the Alpine areas, but its application to lumped models remains largely unexplored.

This study investigates the performance of PASS for regionalizing an hourly lumped rainfall-runoff model, GR5H, in the eastern mountainous region of Emilia-Romagna, Italy. Specifically, the method was applied to a pool of 23 medium-small mountainous basins, using hourly discharge data covering up to 20 years for many of the catchments considered. The selection of the study region is motivated by the devastating 2023-2024 floods, which caused casualties, significant losses and widespread displacement, while extensive levee breaches and damaged river gauges hindered accurate flood flow measurements.

KGE and NSE were adopted as efficiency measures in the calibration process and two independent analyses were conducted, providing additional insight into the potential, strengths, and weaknesses of these two metrics. The results demonstrate that the PASS procedure enables the attainment of good regional model efficiencies without significant loss of performance when transitioning from calibration to leave-one-out cross-validation, confirming the robustness of the methodology in handling complex terrains and diverse hydrological conditions with a simpler hydrological model. These findings highlight the potential of PASS to streamline parameter estimation for ungauged basins and provide a reliable tool for hydrological modelling with reduced computational complexity.
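
As a generic illustration of the regionalization step (not the exact PASS implementation), the sketch below maps basin descriptors to locally calibrated parameter sets with a decision-tree regressor and evaluates transferability in a leave-one-out setting; the descriptor count, parameter count and tree depth are assumptions.

import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import LeaveOneOut

rng = np.random.default_rng(2)
descriptors = rng.random((23, 6))    # hypothetical basin attributes for the 23 basins
calibrated  = rng.random((23, 5))    # locally calibrated GR5H parameter sets

preds = []
for train_idx, test_idx in LeaveOneOut().split(descriptors):
    tree = DecisionTreeRegressor(max_depth=4, random_state=0)
    tree.fit(descriptors[train_idx], calibrated[train_idx])
    preds.append(tree.predict(descriptors[test_idx])[0])   # parameters for the left-out ("ungauged") basin
print(np.array(preds).shape)   # (23, 5): one regionalized parameter set per basin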

How to cite: Smerilli, G., Lombardo, L., Basso, A., Viglione, A., and Castellarin, A.: Regional Calibration of a Lumped Hourly Hydrological Model Using a Decision-Tree Approach, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-12182, https://doi.org/10.5194/egusphere-egu25-12182, 2025.

09:20–09:30 | EGU25-9418 | On-site presentation
Hideaki Kitauchi and Kozo Sato

 

ABSTRACT

Fluid flow around right-angle faults with a finite conductivity fracture is not only a basis for understanding the motion of underground fluids such as water, oil and carbon dioxide in carbon capture and storage, but is also important, for example, for planning and installing an artificial fracture for irrigation in dry areas. In this study we present an optimal analytic solution of this fluid flow, as shown in Figure 1.

The flow consists of two parts: the flow around the right-angle faults and the flow around a finite conductivity fracture. The latter exhibits singular behavior near the fracture edges, whereas the former is non-singular everywhere. The solution is thus expressed as the sum of non-singular and singular solutions. Because the non-singular part is analytic everywhere within and on the problem boundary, Cauchy’s integral formula holds, and the non-singular solution can be determined by the complex variable boundary element method (Sato 2015). The singular solution is expressed as a combination of partial sums of different Laurent series expansions with multiple poles. We solve the non-singular and singular solutions simultaneously using implicit singularity programming (Sato 2015).
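
In symbols, this decomposition can be sketched in a generic, hedged form (the exact expansion used by the authors may differ) as

\Omega(z) \;=\; \Omega_{\mathrm{ns}}(z) \;+\; \sum_{j}\sum_{k=1}^{K_j} \frac{a_{jk}}{(z - z_j)^{k}},

where \Omega_{\mathrm{ns}} is analytic within and on the boundary (so Cauchy's integral formula applies and the complex variable boundary element method can determine it), z_j are the assumed pole positions near the fracture edges, and the coefficients a_{jk} are found together with \Omega_{\mathrm{ns}} by implicit singularity programming.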

There is arbitrariness in choosing the positions of the multiple poles appearing in the singular solution. In order to find optimal positions of the poles, we evaluate the discharge at an arbitrary point, for example the black dot in Figure 1, in the solution for some given poles. By changing the positions of the poles in the singular solution, we examine the convergence of the discharges so as to find the optimal positions of the poles, that is, the optimal solution. We also try to explain the reason for the optimal positions of the poles from a mathematical point of view.

 

Figure 1. An example of the solution for flow around right-angle faults with a finite conductivity fracture. The right-angle faults are x and y axes intersecting the origin, the fracture a red bold line. Red thin lines represent streamlines, blue equipotential lines. A black dot is an arbitrary point at which the discharge is evaluated. Near the right-angle faults fluid flow goes along the faults, while around the finite conductivity fracture flow goes perpendicular to the fracture.

 

REFERENCES

Sato, K., 2015: Complex Analysis for Practical Engineering, Springer.

How to cite: Kitauchi, H. and Sato, K.: An optimal analytic solution for flow around right-angle faults with a finite conductivity fracture, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-9418, https://doi.org/10.5194/egusphere-egu25-9418, 2025.

09:30–09:40 | EGU25-2422 | ECS | On-site presentation
Jongyun Byun, Hyeon-Joon Kim, Jongjin Baik, and Changhyun Jun

Abstract

Accurate estimation of snowfall intensity is critical for effective winter weather management, transportation safety, and hydrological forecasting. Traditional approaches predominantly rely on ground-based sensors and radar systems, which are often spatially sparse and costly to install and maintain. In this study, we propose a novel convolutional neural network (CNN)-based framework for estimating snowfall intensity using images captured by closed-circuit television (CCTV) cameras, which are gaining attention as prominent IoT sensing devices. This approach capitalizes on the extensive availability of CCTV infrastructure, enabling high-frequency and localized monitoring of snowfall patterns. The proposed model is trained using matched datasets comprising snowfall intensity values obtained from PARSIVEL, a type of disdrometer capable of measuring particle information at ground observation stations, and CCTV data captured simultaneously. The study area, Daegwallyeong in Gangwon Province, South Korea, is highly suitable for snowfall observations, with an average of more than 10 snowy days per month during the winter season from December to February. A notable feature of this framework is its ability to estimate snowfall intensity values from CCTV data by leveraging convolutional neural networks. Furthermore, a dedicated preprocessing step was implemented to extract snowfall particles from the original images, thereby enhancing the accuracy of snowfall intensity estimation. Experimental results demonstrate that the CNN-based framework developed in this study is highly effective for estimating snowfall intensity using CCTV data. Moreover, the incorporation of snowfall particle extraction during preprocessing significantly improved estimation accuracy compared to scenarios where particle extraction was not applied.
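
The sketch below shows, in generic form, the kind of image-to-scalar regression implied here: a small convolutional feature extractor followed by a pooled linear head that maps a preprocessed (particle-extracted) CCTV frame to a single snowfall intensity. The architecture, input size and framework are illustrative assumptions, not the authors' model.

import torch
import torch.nn as nn

class SnowfallCNN(nn.Module):
    """Toy CNN regressor: preprocessed grayscale CCTV frame -> snowfall intensity."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1))

    def forward(self, x):
        return self.head(self.features(x))

model = SnowfallCNN()
frame = torch.rand(8, 1, 128, 128)   # a batch of hypothetical preprocessed frames
print(model(frame).shape)            # torch.Size([8, 1]): one intensity per frame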

Acknowledgement

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2024-00334564), Korea Meteorological Administration Research and Development Program under Grant RS-2023-00243008, and Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (RS-2023-00272105).

How to cite: Byun, J., Kim, H.-J., Baik, J., and Jun, C.: CNNs-Based Snowfall Intensity Estimation Model Utilizing CCTV Data, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-2422, https://doi.org/10.5194/egusphere-egu25-2422, 2025.

09:40–09:50 | EGU25-7869 | On-site presentation
Bing-Chen Jhong

Typhoon-induced inundation is a critical issue in Taiwan, particularly under the intensifying impacts of extreme climate events. This study focuses on developing an AI-based hourly inundation forecasting model for real-time applications. Observational data, including rainfall, inundation depth, and sewer water levels from different typhoon events, were utilized as input factors. A traditional input factor selection method was employed to identify input variables for Support Vector Machine (SVM)-based models. Nine inundation reference points were selected, and an SVM-based forecasting model was developed for each point. To enhance forecasting accuracy and address potential overfitting issues, a novel model integrating Long Short-Term Memory (LSTM) networks with Multi-Objective Genetic Algorithm (MOGA), referred to as LSTM-MOGA, was proposed. This model automates the selection of influential input factors while optimizing forecasting performance. The study was conducted in Yilan County, Taiwan, and model validation was performed using cross-validation methods. The results indicate that, although SVM models with traditional input selection methods performed better in 3 out of 9 inundation reference points, the LSTM-MOGA model demonstrated superior forecasting accuracy in the remaining 6 points. Moreover, SVM models exhibited significant overfitting issues, with negative CE values during the testing phase, suggesting substantial underestimation in forecasting inundation depths during typhoon events. Conversely, the LSTM-MOGA model avoided overfitting, maintaining stable and reliable performance across both training and testing phases. The proposed LSTM-MOGA framework provides a robust solution for real-time inundation forecasting during typhoon events, with potential applications for disaster management and water resource planning. The outcomes of this study are expected to serve as valuable references for hydrological disaster mitigation and decision-making by water resource management agencies.

How to cite: Jhong, B.-C.: Typhoon-Induced Hourly Inundation Forecasting by Integrating Long Short-Term Memory and Multi-Objective Genetic Algorithms, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-7869, https://doi.org/10.5194/egusphere-egu25-7869, 2025.

09:50–10:00 | EGU25-9285 | ECS | Virtual presentation
Viraj Vidura Herath Herath Mudiyanselage, Lucy Marshall, Abhishek Saha, Sun Han Neo, Sanka Rasnayaka, and Sachith Seneviratne

Diffusion models have emerged as state-of-the-art generative AI models in computer vision, excelling in generating high-fidelity and diverse images. These models surpass previous architectures in quality, generalizability, and stability. However, their potential remains largely untapped in water resource applications, including flood mapping.

This research introduces a diffusion model-based super-resolution approach to upscale coarse-grid hydrodynamic model outputs to fine-grid accuracy in a computationally efficient manner. Aligning with the theory-guided data science (TGDS) paradigm, the proposed model functions as a hybrid TGDS model.

The process begins by running a coarse-grid hydrodynamic model over the area of interest, with mesh resolution selected to enable simulation completion within several minutes. Acting as a corrective layer, the diffusion model refines these coarse estimates to align with high-resolution model outputs. In this study, the HEC-RAS model is employed to generate both coarse and fine-grid flood maps for model training and testing. The subgrid formulation within HEC-RAS incorporates fine-scale topographic details within each grid cell, significantly enhancing computational efficiency and accuracy. Additionally, the subgrid topography maps both coarse-grid and fine-grid mesh-level water level estimates onto the underlying terrain resolution, enabling compatibility with both structured and unstructured meshes.

The primary objective is to rapidly produce high-resolution flood maps, addressing the impracticality of fine-grid hydrodynamic models for operational flood management and probabilistic flood design due to their high computational demands. Once the coarse-grid model is executed at a catchment scale, the diffusion model can quickly generate high-resolution flood maps for user-specified areas. Within the diffusion model, digital elevation models (DEMs) and corresponding coarse-grid flood depth estimates serve as conditioning signals. The model processes flood depth raster patches of 128x128 pixels for both flood maps and DEM data. This raster size effectively balances spatial coverage and computational efficiency.

The proposed approach is tested on four large Australian catchments: Wollombi, Chowilla, Burnett River, and Lismore. Unlike general diffusion models focused on natural images, models trained for these catchments converged faster due to the strong correlation between coarse and fine-grid model outputs. The resulting flood depth maps closely matched fine-grid model predictions, outperforming popular U-Net-based super-resolution models in accuracy. Notably, a model trained on data from one catchment demonstrated strong generalizability, performing well on other catchments with minimal transfer learning.

While diffusion models traditionally have slower inference speeds due to iterative image generation from random noise, this study significantly reduced inference time by initiating the denoising process with noisy coarse-grid images instead of pure random noise. Future research will focus on further reducing inference times by transitioning from pixel-wise to latent-space diffusion models.
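
The inference shortcut described above can be sketched with the standard DDPM forward-noising formula: instead of starting the reverse process from pure noise at the final timestep, the upsampled coarse-grid flood map is forward-diffused to an intermediate timestep t0 and denoised from there. The schedule, timestep and patch size below are illustrative, and the conditional denoiser itself is omitted.

import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas_bar = np.cumprod(1.0 - betas)

def noisy_start(coarse_upsampled, t0, rng):
    """Forward-diffuse the upsampled coarse flood map to timestep t0 (DDPM forward process)."""
    eps = rng.standard_normal(coarse_upsampled.shape)
    return np.sqrt(alphas_bar[t0]) * coarse_upsampled + np.sqrt(1.0 - alphas_bar[t0]) * eps

rng = np.random.default_rng(4)
coarse = np.random.rand(128, 128)            # hypothetical upsampled coarse-grid depth patch
x_t0 = noisy_start(coarse, t0=300, rng=rng)  # start reverse denoising at t0=300 instead of t=999
# ...a denoiser conditioned on the DEM and coarse depths would then run t0 -> 0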

How to cite: Herath Mudiyanselage, V. V. H., Marshall, L., Saha, A., Neo, S. H., Rasnayaka, S., and Seneviratne, S.: Physics-Informed Generative AI for High-Resolution Flood Mapping, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-9285, https://doi.org/10.5194/egusphere-egu25-9285, 2025.

10:00–10:10 | EGU25-20529 | Virtual presentation
Narsappa Sudarshan and Seelam Naga Poojitha

Water distribution networks (WDNs) are critical infrastructure systems that must balance cost-efficiency and reliability while adapting to evolving water demand. A significant challenge in designing WDNs lies in addressing future demand uncertainties caused by factors such as population growth, urbanization, and climate change. This study presents a single-objective optimization framework focused on minimizing the investment cost of network pipes while ensuring system reliability, measured by the Network Resilience Index. A penalty function is integrated into the objective function to validate feasibility by satisfying minimum head requirements at all nodes.

A key feature of the proposed approach is the incorporation of phasing design, which allows for the gradual expansion of the network in alignment with projected demand growth. Phasing design ensures that infrastructure investments are staged strategically, reducing upfront costs and preventing overdesign in the early stages. This approach also provides flexibility, enabling network upgrades to be planned and executed in response to evolving demand patterns. By optimizing each phase, engineers can design a system that balances immediate needs with long-term goals, ultimately minimizing costs while maintaining reliable service.

To address demand uncertainty, a probabilistic model is employed, representing growth rates as discrete random variables with assigned probabilities. This approach enables the consideration of multiple demand scenarios across all phases of the network's lifecycle. By evaluating a range of potential future conditions, the methodology ensures robust performance under various scenarios, enhancing the network's adaptability.

Optimization is conducted using advanced algorithms, specifically Differential Evolution (DE), which is well suited for complex nonlinear problems. The framework is validated using benchmark problems, including the Two-Loop Network (TLN). Results demonstrate that the phasing design approach, coupled with probabilistic demand modeling and advanced optimization techniques, produces cost-effective and reliable solutions.
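
A minimal sketch of the penalized single-objective formulation, solved here with SciPy's differential_evolution: pipe-diameter classes are the decision variables, head deficits are penalized, and a toy function stands in for the hydraulic solver (a real study would call a solver such as EPANET). All costs, lengths and head values are invented for illustration.

import numpy as np
from scipy.optimize import differential_evolution

unit_cost = np.array([50., 80., 120., 180., 260., 360.])                 # hypothetical cost per metre by diameter class
lengths = np.array([1000., 800., 1200., 600., 900., 700., 500., 400.])   # 8 pipes

def simulate_heads(diam_classes):
    # placeholder for a hydraulic solver; larger pipes -> higher nodal heads
    return 20.0 + 3.0 * diam_classes.mean() - np.arange(6)

def penalised_cost(x):
    classes = np.clip(np.round(x), 0, 5).astype(int)
    cost = np.sum(unit_cost[classes] * lengths)
    deficit = np.maximum(30.0 - simulate_heads(classes), 0.0)   # illustrative minimum-head requirement
    return cost + 1e6 * deficit.sum()                           # penalty keeps infeasible designs out

res = differential_evolution(penalised_cost, bounds=[(0, 5)] * 8, seed=1, maxiter=50, tol=1e-6)
print(np.round(res.x), res.fun)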

This study highlights the critical role of phasing design in ensuring efficient resource allocation, flexibility in network development, and robustness against future uncertainties. By incorporating demand uncertainty and leveraging optimization techniques, the proposed framework supports the sustainable development of WDNs, providing a practical tool for engineers to address the dual challenges of cost minimization and reliability in real-world applications.

Keywords: Water distribution network; Phasing design; Differential evolution; Demand uncertainty; Network Resilience Index

How to cite: Sudarshan, N. and Poojitha, S. N.: Optimal and Phasing Design of Water Distribution Networks in View of Demand Uncertainty, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-20529, https://doi.org/10.5194/egusphere-egu25-20529, 2025.

Coffee break
Chairpersons: Niels Schuetze, Pascal Horton
Machine Learning - case studies
10:45–10:55 | EGU25-17459 | ECS | On-site presentation
Giulio Palcic, Guido Ascenso, Matteo Giuliani, and Andrea Castelletti

Drought is a prolonged dry period characterized by a lack of precipitation leading to water shortages. Due to climate change driven by human activities, Europe is experiencing a dramatic rise in temperatures at a rate unparalleled elsewhere in the world. Consequently, precipitation patterns are shifting, and regions like Northern Italy are increasingly experiencing episodes of drought, including extreme ones. Prolonged drought periods have devastating effects on the economy of sectors heavily reliant on water availability, such as agriculture, industry, energy production, and inland waterway transport, while also jeopardizing water resources for civilian use and the health of ecosystems. These sectors could benefit from the availability of subseasonal-to-seasonal (S2S) drought forecasts to trigger anticipatory actions. However, the accuracy of existing dynamical forecast systems often falls short of the standards needed for effective integration into basin management.

To address this limitation, we propose a framework that leverages information from teleconnections, global climate variables, and meteorological data. This approach is applied to predict inflows for Lake Como (Italy) with lead times of 1 to 6 months, which are crucial for long-term reservoir management and strategic water allocation. Our framework comprises three modules. The first module investigates major climatic oscillations to determine patterns in climate variables influencing lake inflows. Mutual information masking is then applied to identify the most significant variables. The global climate variables, after being filtered using mutual information, are aggregated through Principal Component Analysis (PCA), which reduces the dimensionality of the data and captures essential spatial features, thus enhancing the model’s ability to focus on the most relevant global patterns. The second module applies a feature selection algorithm based on mutual information to construct input datasets composed of the principal components of the global variables and local meteorological variables. The third and final module performs regression to predict cumulated inflows from the selected input variables using Random Forest models.
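
A compressed sketch of the three modules on synthetic data (purely illustrative shapes and thresholds, and without the lead-time shifting of the target): mutual-information masking of the global fields, PCA compression, a second mutual-information selection over principal components plus local predictors, and a Random Forest regression.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(5)
global_fields = rng.random((240, 500))   # 20 years of monthly global climate fields, flattened grid cells
local_meteo   = rng.random((240, 6))     # local meteorological predictors
inflow        = rng.random(240)          # cumulated Lake Como inflow (target)

# Module 1: mutual-information masking of grid cells, then PCA compression
mi_cells = mutual_info_regression(global_fields, inflow, random_state=0)
masked = global_fields[:, mi_cells > np.quantile(mi_cells, 0.9)]
pcs = PCA(n_components=5).fit_transform(masked)

# Module 2: mutual-information-based selection over principal components + local variables
candidates = np.hstack([pcs, local_meteo])
mi = mutual_info_regression(candidates, inflow, random_state=0)
selected = candidates[:, np.argsort(mi)[::-1][:8]]

# Module 3: Random Forest regression, last 12 months held out
rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(selected[:-12], inflow[:-12])
print(rf.predict(selected[-12:]).shape)   # (12,) forecasts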

Results highlight the promising performances achieved by the framework, demonstrating its ability to generate accurate forecasts and outperform the subseasonal and seasonal large-scale ensemble forecasts produced by the European Flood Awareness System (EFAS). The model achieves a Mean Absolute Percentage Error (MAPE) of 6.73% and a skill score of 0.96 for 1-month-ahead forecasts, 6.17% MAPE and 0.98 skill score for 3-month-ahead forecasts, and 6.00% MAPE with a skill score of 0.85 for 6-month-ahead predictions, showcasing its reliability across varying lead times.

This framework advances automated data-driven modeling for robust hydrological forecasting by employing a novel combination of filter- and wrapper-based feature selection techniques. The optimal input dataset is autonomously selected based on the predictive performance of the Random Forest model.

How to cite: Palcic, G., Ascenso, G., Giuliani, M., and Castelletti, A.: Bridging Global Teleconnections and Local Data for Subseasonal-to-Seasonal Forecasting of Lake Como Inflows, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-17459, https://doi.org/10.5194/egusphere-egu25-17459, 2025.

10:55–11:05 | EGU25-15297 | ECS | Virtual presentation
Namitha Saji, Vinayakam Jothiprakash, and Bellie Sivakumar

In this study, an attempt is made to examine the effect of temporal scale on the prediction of rainfall and runoff in the Savitri River basin, Maharashtra, India. The rainfall data from six stations and runoff data from four stations in the Savitri River basin are used here. The complexity of each series is analysed first with the False Nearest Neighbour (FNN) method; then the nonlinear prediction method with a local approximation approach is employed, and one-step-ahead predictions are made. The local approximation prediction involves phase space reconstruction at the optimum embedding dimension ‘m’, followed by identifying the nearest neighbours ‘k’ based on the Euclidean distance between the vectors in the phase space. The one-step-ahead prediction is made by taking the mean of the ‘k’ nearest neighbours in the phase space reconstructed at the optimum dimension ‘m’. For each series, 80% of the data length is used for phase space reconstruction and then 20% of the data is used for testing the accuracy of prediction. Three statistical evaluation measures, correlation coefficient (CC), Nash-Sutcliffe efficiency (NSE), and normalized root mean square error (NRMSE), are used to determine the performance of the method. The FNN analysis reveals that the noise level in the hourly rainfall is higher than in the daily rainfall, whereas the noise level in the daily runoff series is higher than in the hourly runoff series. Since noise in the data limits the accuracy of prediction (i.e., the prediction error is always greater than the noise level), this may be an indication of better predictability of daily rainfall and hourly runoff than of hourly rainfall and daily runoff, respectively. The prediction results for the daily rainfall showed good prediction, with CC values ranging between 0.56 and 0.69, whereas the hourly rainfall resulted in poor prediction, with CC values between 0.46 and 0.51. In the case of daily runoff, the local approximation method gave good prediction (CC in the range of 0.67 to 0.87), and hourly runoff showed very good prediction (CC in the range of 0.98 to 0.99). The findings of the local approximation approach are in line with the predictability identified from the FNN analysis.
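
The local approximation predictor lends itself to a compact sketch: delay-embed the series, find the k nearest historical states to the current state, and average their successors. The embedding parameters and the test signal below are illustrative.

import numpy as np

def local_approximation_forecast(series, m, tau, k):
    """One-step-ahead prediction by averaging the successors of the k nearest
    neighbours of the current state in an m-dimensional delay-embedded phase space."""
    n = len(series) - (m - 1) * tau
    vectors = np.column_stack([series[i * tau : i * tau + n] for i in range(m)])  # delay vectors
    current = vectors[-1]
    history, successors = vectors[:-1], series[(m - 1) * tau + 1:]
    dist = np.linalg.norm(history - current, axis=1)
    nearest = np.argsort(dist)[:k]
    return successors[nearest].mean()

x = np.sin(np.linspace(0, 60, 600)) + 0.05 * np.random.default_rng(6).standard_normal(600)
print(local_approximation_forecast(x, m=3, tau=1, k=5))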

How to cite: Saji, N., Jothiprakash, V., and Sivakumar, B.: Effect of temporal scale on prediction using local approximation approach of hydrologic series in the Savitri basin in India, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-15297, https://doi.org/10.5194/egusphere-egu25-15297, 2025.

11:05–11:15 | EGU25-9768 | On-site presentation
Martin Gauch, Frederik Kratzert, Daniel Klotz, Guy Shalev, Deborah Cohen, and Oren Gilon

Deep Learning models for streamflow prediction are now more than five years old (Kratzert et al., 2018, 2019), and lumped LSTMs, trained on as many basins and forcing products as we can get our hands on, continue to set the state of the art. Or do they?

While traditional hydrologic modeling has long moved beyond lumped modeling, Deep Learning methods are only now starting to leverage the inherent graphical topology of rivers through graph neural networks (GNNs). Such models come with their own set of challenges, both from an engineering standpoint (e.g., dealing with the sheer amount of data from many small sub-basins) and from a modeling standpoint (e.g., ensuring generalization to ungauged basins along the river graph). Yet, GNNs promise more accurate predictions, the ability to assimilate real-time up- and downstream data, to make predictions at arbitrary points along a river, and to integrate knowledge about human intervention.

We present a Deep Learning semi-distributed hydrologic model that combines the time-series capabilities of LSTMs with a learned GNN routing mechanism. The model is trained on streamflow data from all around the world, providing predictions that are strong competitors to their lumped counterparts—especially on large, ungauged rivers.
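
As a rough illustration of the routing idea (not the presented model), the sketch below performs one message-passing step over a small river graph in which each sub-basin node adds a transformed version of its upstream neighbours' states to its own; the graph, state dimension and transform are invented.

import numpy as np

# hypothetical river graph: edge (u, v) means sub-basin u drains into v
edges = [(0, 2), (1, 2), (2, 3)]
n_nodes, dim = 4, 8
h = np.random.rand(n_nodes, dim)           # per-node states, e.g. from an LSTM over local forcings
W_route = np.random.rand(dim, dim) * 0.1   # learned routing transform (random stand-in)

def routing_step(h):
    """One message-passing step: each node adds transformed upstream states to its own."""
    out = h.copy()
    for u, v in edges:
        out[v] += h[u] @ W_route
    return out

h = routing_step(h)
print(h.shape)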



Kratzert, F., Klotz, D., Brenner, C., Schulz, K., and Herrnegger, M.: Rainfall–runoff modelling using Long Short-Term Memory (LSTM) networks, Hydrol. Earth Syst. Sci., 22, 6005–6022, 2018.

Kratzert, F., Klotz, D., Shalev, G., Klambauer, G., Hochreiter, S., and Nearing, G.: Towards learning universal, regional, and local hydrological behaviors via machine learning applied to large-sample datasets, Hydrol. Earth Syst. Sci., 23, 5089–5110, 2019.

How to cite: Gauch, M., Kratzert, F., Klotz, D., Shalev, G., Cohen, D., and Gilon, O.: Towards Deep Learning River Network Models, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-9768, https://doi.org/10.5194/egusphere-egu25-9768, 2025.

11:15–11:25 | EGU25-14915 | ECS | Virtual presentation
Insaf Aryal and Natthachet Tangdamrongsub

 

Hydrological modeling is essential for understanding water balance dynamics, yet physical models often face limitations such as computational inefficiency, insufficient representation of complex processes, and challenges in integrating diverse data sources. To address these limitations, data-driven models offer enhanced predictive capabilities, scalability, and real-time analysis potential. In this study, a Long Short-Term Memory (LSTM) model was developed to simulate the water balance using open-source ERA-5 reanalysis data, trained with outputs from the Noah-MP land surface model conducted over Thailand as a case study. The trained LSTM model was subsequently transferred to other basins with varying land use, topography, and climatic conditions, enabling an evaluation of its adaptability across diverse environments. Seasonal performance was assessed to understand the model's sensitivity to climatic variability. To enhance the accuracy of water balance predictions, satellite datasets, including GRACE-derived terrestrial water storage, GLEAM-derived evapotranspiration, and SMAP-derived surface soil moisture, were assimilated into the data-driven model, improving its representation of hydrological processes. Model performance was assessed using observations, yielding notable results: correlation coefficients (R) of 0.98, 0.89, 0.99, and 0.99; and RMSE values of 16.77, 8, 5.2, and 0.01 for runoff, evaporation, groundwater, and soil moisture, respectively. This study highlights the potential of combining data-driven approaches and satellite data assimilation to improve water balance modeling,  providing accurate hydrological predictions across regions with diverse landscapes and climatic regimes.

How to cite: Aryal, I. and Tangdamrongsub, N.: Data-Driven Surrogate Modeling with Satellite Data Assimilation: Advancing Basin-Scale Hydrology for Water Balance Simulation, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-14915, https://doi.org/10.5194/egusphere-egu25-14915, 2025.

11:25–11:35 | EGU25-5440 | On-site presentation
Jeonghwan Baek, Jungi Moon, Sangjin Jung, Sungmin Suh, Seunghyeon Lee, Chanhae Ok, and Jongcheol Pyo

Heavy metal contamination in river sediments poses serious risks to human health, particularly through the use of river water as a drinking source and the bioaccumulation of pollutants in aquatic ecosystems. Industrial wastewater discharge and soil erosion caused by rainfall introduce heavy metals into rivers. These metals undergo adsorption and deposition processes, accumulating in sediments where natural removal is exceedingly slow. Moreover, current sediment contamination assessments rely on direct sampling and chemical analysis, which are time-consuming and costly. To enable more efficient monitoring of heavy metals, there is a growing need for predictive modeling using machine learning techniques.

This study aims to identify the optimal machine learning model for predicting heavy metal concentrations in river sediments. The target heavy metals include Zn, Cu, Ni, Cd, and Hg. For model development and validation, nine years of data from South Korea's four major rivers (Han, Nakdong, Yeongsan, and Geum Rivers) were utilized. Considering the imbalance in the dataset, caused by the distinct characteristics of heavy metal inflows from polluted wastewater discharges from industrial areas and other sources, preprocessing techniques such as Z-score normalization and MinMaxScaler were employed to standardize the data. Three approaches were evaluated: Convolutional Neural Networks (CNNs), Random Forest, and a hybrid CNN-RF model combining CNN parameters with Random Forest. Among these, the Random Forest model demonstrated relatively higher accuracy than the others. By leveraging machine learning techniques, this study offers a practical alternative to traditional methods, overcoming temporal and spatial limitations while significantly reducing the time and costs associated with sediment monitoring.

How to cite: Baek, J., Moon, J., Jung, S., Suh, S., Lee, S., Ok, C., and Pyo, J.: Evaluating Machine Learning Models for Predicting Heavy Metal Contamination in Sediments of South Korea's Four Major Rivers, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-5440, https://doi.org/10.5194/egusphere-egu25-5440, 2025.

11:35–11:45
|
EGU25-5441
|
On-site presentation
Chanhae Ok, Jungi Moon, Jeongwhan Baek, Sungmin Suh, Sangjin Jung, Seunghyeon Lee, and Jongcheol Pyo

Algal blooms caused by eutrophication are regarded as a serious issue in many regions, including Korea’s Four Major Rivers. Accurately measuring chlorophyll-a (Chl-a) in water is essential to propose effective solutions for addressing this problem. However, it is difficult to obtain water quality data for all desired regions through direct measurement. By utilizing remote sensing to collect a large amount of data from various water bodies, an accurate and rapid model to estimate Chl-a concentration can be developed, playing a crucial role in addressing the algae problem.

This study utilized Sentinel-3 OLCI (Ocean and Land Colour Instrument) data with a spatial resolution of approximately 300 meters. Bio-optical algorithms were applied to estimate Chl-a concentration. Bio-optical algorithms vary depending on the parameters used, such as radiance, reflectance, and inherent optical properties (IOPs). In this study, IOPs were used so that the inherent properties of the water itself could be exploited. Accurate IOP estimation is important because the coefficients of IOP estimation algorithms are influenced by regional and temporal variability. The Bottom of Atmosphere (BOA) reflectance, derived from OLCI EFR radiance data using the C2RCC processor, was used to estimate IOPs. Based on the derived IOPs, bio-optical algorithms were applied to estimate Chl-a concentration. Reinforcement learning was then employed to refine the IOP estimation process, dynamically adjusting the coefficients to improve Chl-a accuracy across varying conditions. Observed Chl-a data from the Water Environment Information System were used for model training and validation. This study therefore aims to estimate and map algal concentrations across Korea’s Four Major Rivers. A reflectance-based NDWI was calculated to delineate inland water bodies, and the reflectance data were incorporated into the Chl-a reinforcement learning model developed in this study to generate detailed spatial maps. This study is expected to contribute to solving green algae problems and to water quality management by enabling more accurate and rapid Chl-a estimation that is robust to regional and temporal variations.

How to cite: Ok, C., Moon, J., Baek, J., Suh, S., Jung, S., Lee, S., and Pyo, J.: Building a Reinforcement learning Model for Estimating Reliable Algae Concentration: Widely Applicable Correction Factors, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-5441, https://doi.org/10.5194/egusphere-egu25-5441, 2025.

11:45–11:55
|
EGU25-7543
|
On-site presentation
J. Andres Estupiñan-Camero and J. Sebastian Hernandez-Suarez

Evapotranspiration (ET) is of paramount importance due to its crucial role in the water cycle, moving water from land to the atmosphere. This process is critical for sustaining atmospheric rivers and guiding water management operations. Process-based hydrological modeling is commonly used to predict ET in various ecosystems. However, while plant growth dynamics are better understood in temperate regions, the accuracy of ET predictions in tropical areas remains limited. This reduced accuracy is primarily due to challenges in simulating the Leaf Area Index (LAI), the intensity of mass and energy exchanges, and the prevalence of energy-limited conditions.

In this study, we explore the potential of data-driven models to estimate LAI in the tropics and improve ET predictions. We implemented a Multilayer Perceptron (MLP) model trained using climatological variables from ERA-5 and CHIRPS as inputs. The model's performance was evaluated using LAI values from MODIS at the Cesar River Watershed in northern Colombia, South America.

Comparisons between the selected MLP model and SWAT reveal an improvement over the default LAI simulated by the latter. In particular, SWAT underestimates foliage growth and fails to capture the bimodal behavior observed in the study area. The MLP model, tested at the watershed and Hydrologic Response Unit (HRU) scales, demonstrated promising results. The performance of the proposed MLP model was evaluated using shuffled and sequential schemes, achieving validation Nash-Sutcliffe efficiencies between 0.5 and 0.99 at the tested scales. In addition, the results show that the MLP model is especially sensitive to the seasonal component of relative humidity. By leveraging remote sensing data, data-driven models become a promising tool to simulate remotely sensed LAI with greater accuracy. This potential of the MLP model to significantly improve LAI and ET predictions can enhance hydrologic models' reliability, especially under shifting environmental conditions, and offers an enhanced outlook for better simulating multiple water compartments in the tropics.
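
A minimal sketch of an MLP evaluated with the Nash-Sutcliffe efficiency (scikit-learn; the predictors and LAI values are synthetic, not the ERA-5/CHIRPS/MODIS data used here):

    # Minimal sketch: MLP regression of LAI from climatological predictors,
    # scored with the Nash-Sutcliffe efficiency. Data are synthetic placeholders.
    import numpy as np
    from sklearn.neural_network import MLPRegressor
    from sklearn.model_selection import train_test_split

    def nse(obs, sim):
        """Nash-Sutcliffe efficiency: 1 - sum((obs-sim)^2) / sum((obs-mean(obs))^2)."""
        obs, sim = np.asarray(obs), np.asarray(sim)
        return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

    rng = np.random.default_rng(1)
    X = rng.random((1000, 6))                                   # e.g. precipitation, temperature, humidity
    y = 0.5 * X[:, 0] + 0.3 * X[:, 1] + 0.1 * rng.random(1000)  # placeholder LAI signal

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)
    mlp = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=1)
    mlp.fit(X_tr, y_tr)
    print("validation NSE:", nse(y_te, mlp.predict(X_te)))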

How to cite: Estupiñan-Camero, J. A. and Hernandez-Suarez, J. S.: Leaf Area Index prediction in the Tropics using Machine Learning and Remote Sensing, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-7543, https://doi.org/10.5194/egusphere-egu25-7543, 2025.

11:55–12:05
|
EGU25-16098
|
ECS
|
On-site presentation
Ghunwa Shah and Tomasz Kolerski

Climate change in the Gdańsk region, particularly in terms of precipitation, is marked by an increase in the intensity of individual rainfall events, while the annual total precipitation remains relatively stable. High-intensity rainfall often triggers flood surges, as seen in two major episodes in July 2016 and 2017, which caused flash floods in the catchments of streams flowing through the city.

In this context, accurate precipitation forecasting is crucial for safeguarding the city against flooding. This study aims to predict precipitation over the Oliwski Stream watershed using data-driven machine learning techniques, focusing on hourly and daily precipitation prediction. The dataset comprises observed temperature and rainfall data from three stations surrounding the watershed, sourced from the municipal monitoring system (Oliwa IBW and Matemblewo stations) and the national meteorological network (Gdańsk Airport), covering the period from 2005 to 2024.

Three machine learning regression models—Artificial Neural Network Multilayer Perceptron (ANN-MLP), Multiple Linear Regression (MLR), and Random Forest (RF)—will be applied for rainfall forecasting. Model performance will be evaluated using statistical metrics, including Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and the coefficient of determination (R²). This study will be helpful for water managers and researchers in the future.
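
The planned comparison could look roughly like the sketch below (scikit-learn; the data are synthetic placeholders, not the station records described above):

    # Minimal sketch: compare MLR, RF and ANN-MLP regressors with MAE, RMSE and R2.
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.neural_network import MLPRegressor
    from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(2)
    X = rng.random((2000, 4))        # e.g. lagged rainfall and temperature from nearby stations
    y = X @ np.array([1.5, 0.5, 0.2, 0.1]) + 0.1 * rng.random(2000)   # placeholder hourly rainfall

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=2)
    models = {
        "MLR": LinearRegression(),
        "RF": RandomForestRegressor(n_estimators=200, random_state=2),
        "ANN-MLP": MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=2),
    }
    for name, m in models.items():
        m.fit(X_tr, y_tr)
        p = m.predict(X_te)
        rmse = mean_squared_error(y_te, p) ** 0.5
        print(name, "MAE", mean_absolute_error(y_te, p), "RMSE", rmse, "R2", r2_score(y_te, p))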

How to cite: Shah, G. and Kolerski, T.: Machine Learning Approach to Rainfall Prediction in the Oliwski Stream Watershed, Gdańsk, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-16098, https://doi.org/10.5194/egusphere-egu25-16098, 2025.

12:05–12:15
|
EGU25-152
|
ECS
|
On-site presentation
Apoorva Bamal, Md Galal Uddin, and Agnieszka I. Olbert

Climate change is one of the most critical global challenges, disrupting complex hydro-climatic systems and significantly affecting the quantity and quality of water resources. To mitigate the adverse effects of climate change on water resources, its impact must be measured accurately. To the best of the authors’ knowledge, there are no specific approaches available that can be utilized to effectively detect or evaluate the degree of impact that various hydro-climatic factors have on water quality. Addressing these challenges, the research introduced a comprehensive framework for assessing the impact of various hydro-climatic factors on surface water quality (WQ). In terms of novelty, the developed framework considered a range of vital hydro-climatic and WQ indicators. While most existing studies focus on specific WQ indicator(s) or hydro-climatic variable(s), this study was the first attempt to develop a tool combining a set of hydro-climatic variables and WQ indicators in order to determine the impact of hydro-climatic factors on WQ. To achieve this, the study utilized 23 years of historical data (2000-2022) for eight hydro-climatic variables, including precipitation, temperature, evapotranspiration, wind speed, surface run-off, total run-off, solar radiation, and relative humidity in County Cork, and nine WQ indicators, including temperature, total organic nitrogen, dissolved oxygen, pH, salinity, molybdate-reactive phosphorus, biological oxygen demand, transparency, and dissolved inorganic nitrogen in Cork Harbour (2007-2022). Advanced machine learning (ML) and artificial intelligence (AI) techniques were employed to analyze long-term, high-dimensional hydro-climatic data patterns. To detect the historical patterns in the dataset(s), the study developed 15 ML/AI models to predict the patterns of the eight hydro-climatic variables and the overall WQ trend, using the recently developed and widely utilized Irish Water Quality Index (IEWQI). Moreover, advanced statistical methods were also applied to validate the reliability and trend patterns of the ML/AI results.

The research also explored the relationship between the eight hydro-climatic variables and the overall WQ trend (IEWQI scores) by creating two scenarios (actual trends and simulated trends) to evaluate the impact of these variables on water quality in Cork Harbour. ANN-MLP outperformed the other 14 ML/AI algorithms in predicting the trends of the different hydro-climatic variables (except evaporation) and the IEWQI scores, while for evaporation, the hybrid model (CNN+RNN+DNN) performed best. The advanced statistical approaches confirmed that both hybrid models were effective for identifying historical trends in high-dimensional hydro-climatic data.

Therefore, the findings suggest that hybrid models can effectively predict trends and data patterns in high-dimensional data, such as hydro-climatic variables, with a high degree of confidence (95%) in understanding the historical data characteristics. Additionally, the research suggests that the scalability and applicability of these hybrid models should be further explored using different datasets. It also encourages additional research to assess the impact of hydro-climatic variables on WQ, considering the spatio-temporal resolution of domains. Moreover, the developed framework could effectively aid policymakers, water resource managers, and researchers in formulating strategies to assess changes in WQ due to various hydro-climatic events and promote sustainable resource management.

How to cite: Bamal, A., Uddin, M. G., and Olbert, A. I.: A comprehensive framework for assessing the hydro-climatic impacts on water quality using data-driven methods, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-152, https://doi.org/10.5194/egusphere-egu25-152, 2025.

12:15–12:25
|
EGU25-7353
|
ECS
|
On-site presentation
Elaf Seif, Essam Shaaban, Ahmed Azzam, and Abdallah Ragab

Air Gap Membrane Distillation (AGMD) is a promising desalination technology with significant potential for addressing the negative environmental impacts of brine disposal. However, the interplay of operational parameters significantly impacts its performance, making optimization a challenging task. This research focuses on brine desalination as a means to mitigate the negative environmental impacts of brine disposal. By optimizing the AGMD process, the study aims to provide a sustainable solution for handling brine while producing freshwater.

An ANN model was trained and validated using experimental data in which membrane pore size, feed salinity and feed flow rate were varied, to predict two critical performance metrics: permeate flux and specific thermal energy consumption (STEC). Different activation functions and numbers of neurons were tested. The ReLU activation function with 25 neurons was found to be the most effective, resulting in an RMSE of 0.068. The model achieved R² values of 0.92, 0.9123, and 0.9005 for the training, validation, and test datasets, respectively. For the combined dataset, the model achieved an R² value of 0.9156. While flux predictions yielded a slightly lower R² value of 0.8697, STEC predictions achieved the highest R² value of 0.9316, showing higher precision in the prediction of energy consumption metrics.

As for optimization, results for the 0.2 µm membrane reveal that optimal salinity levels depend on the feed flow rate. At higher flow rates (> 1.5 lpm), a salinity of 65,000 ppm achieves superior performance, producing higher flux with relatively lower STEC compared to lower salinities. For the 0.45 µm membrane, higher salinity levels of 65,000 ppm generally result in lower STEC for a given flux across all flow rates. As indicated by the Pareto front, the 0.2 µm membrane offers a more energy-efficient balance between water production and energy use than the 0.45 µm membrane.

Differential evolution is then applied to predict optimal performance metrics by assigning different weights to flux and STEC. This approach allows for the identification of operating conditions that best meet specific application needs, ensuring a tailored balance between water production and energy efficiency. By addressing the challenges of brine desalination through AGMD, this study provides a pathway for reducing the environmental risks associated with brine disposal. It also contributes to sustainable water management strategies by enabling the efficient recovery of freshwater from brine.
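
A minimal sketch of such a weighted optimization (SciPy; the surrogate functions and bounds below are illustrative assumptions, not the trained ANN):

    # Minimal sketch: weighted-sum optimization of flux and STEC with SciPy's
    # differential evolution over assumed operating variables.
    from scipy.optimize import differential_evolution

    def predicted_flux(x):      # placeholder surrogate: x = [feed_flow_lpm, salinity_ppm]
        flow, salinity = x
        return 2.0 * flow - 1e-5 * salinity          # illustrative only

    def predicted_stec(x):      # placeholder surrogate for specific thermal energy consumption
        flow, salinity = x
        return 100.0 / flow + 2e-4 * salinity        # illustrative only

    def objective(x, w_flux=0.5, w_stec=0.5):
        # maximize flux, minimize STEC -> minimize this weighted combination
        return -w_flux * predicted_flux(x) + w_stec * predicted_stec(x)

    bounds = [(0.5, 2.5), (35_000, 65_000)]          # feed flow rate (lpm), feed salinity (ppm)
    result = differential_evolution(objective, bounds, seed=0)
    print("optimal operating point:", result.x, "objective:", result.fun)

Changing the weights in the objective traces out different trade-offs between water production and energy use, which is the role the abstract assigns to the weighted differential evolution step.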

How to cite: Seif, E., Shaaban, E., Azzam, A., and Ragab, A.: Multi-Parameter Optimization of Brine Desalination using Machine Learning, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-7353, https://doi.org/10.5194/egusphere-egu25-7353, 2025.

12:25–12:30

Posters on site: Thu, 1 May, 16:15–18:00 | Hall A

The posters scheduled for on-site presentation are only visible in the poster hall in Vienna. If authors uploaded their presentation files, these files are linked from the abstracts below.
Display time: Thu, 1 May, 14:00–18:00
Chairperson: Alessandro Amaranto
A.11
|
EGU25-3400
Sungmin Suh, Sangjin Jung, Jungi Moon, Jeonghwan Baek, Seunghyun Lee, Chanhae Ok, and Jongcheol Pyo

Fecal coliforms are thermotolerant bacteria excreted from warm-blooded animals into soil and water, contaminating water bodies through runoff and resuspension of sediments. This contamination poses significant public health risks, especially during summer recreational activities, leading to waterborne diseases such as diarrhea, typhoid, cholera, and dysentery. Monitoring and managing fecal coliform levels in recreational waters are crucial for public health and environmental safety. However, variability in fecal coliform concentrations due to human and wildlife activities complicates management. This study aims to enhance water safety and public health by utilizing Sentinel-2 band reflectance data and backscattering albedo to understand the relationship between fecal coliform concentrations and reflectance in rivers, in order to generalize the fecal coliform management model.

In this study, we constructed a Sentinel-2 dataset covering the period from January 2017 to December 2022 for the Han, Nakdong, Geum, and Yeongsan Rivers in South Korea. To accurately align the water quality monitoring stations with the Sentinel-2 data, we ensured that the latitude and longitude coordinates were free from clouds and not located on bridges. Monitoring stations that did not meet the specified conditions, namely an NDWI (Normalized Difference Water Index) above 0.1 and a HOT (Haze-Optimized Transformation) below 0.05, were removed during preprocessing. For the remaining preprocessed data points, this study converted the reflectance values of 10 Sentinel-2 bands (2, 3, 4, 5, 6, 7, 8, 8A, 11, and 12) into backscattering albedo. This approach was taken to account for the characteristics of fecal coliform, which is colorless. Model training was performed using CNN (Convolutional Neural Network), ANN (Artificial Neural Network), Random Forest, and XGBoost models. The CNN successfully predicted the trend of fecal coliform in all the rivers and showed superior performance compared to the other models. The results of this study are expected to provide a basis for fecal coliform management using Sentinel-2 band reflectance data in the four major rivers of South Korea and other regions around the world.
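
A minimal sketch of the screening step (the band values are placeholders, and the HOT expression used here is the common blue-minus-half-red form, an assumption rather than the authors' exact implementation):

    # Minimal sketch: keep station matchups with NDWI above 0.1 and HOT below 0.05.
    green, nir = 0.06, 0.02          # placeholder Sentinel-2 B3 and B8 reflectances at a station
    blue, red = 0.04, 0.03           # placeholder B2 and B4 reflectances

    ndwi = (green - nir) / (green + nir)
    hot = blue - 0.5 * red           # Haze-Optimized Transformation (assumed form)

    if ndwi > 0.1 and hot < 0.05:
        print("matchup retained for model training")
    else:
        print("matchup discarded during preprocessing")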

How to cite: Suh, S., Jung, S., Moon, J., Baek, J., Lee, S., Ok, C., and Pyo, J.: Machine Learning-Driven Estimation of Fecal Coliform Concentrations Using Sentinel-2 Imagery in South Korea, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-3400, https://doi.org/10.5194/egusphere-egu25-3400, 2025.

A.12
|
EGU25-4192
|
ECS
Mohammad Taani, Falk Händel, Catalin Stefan, and Traugott Scheytt

The increasing water scarcity around the world has led to widespread interest in the implementation of managed aquifer recharge (MAR) systems, which offer the potential for storing surface water underground for future use or for environmental benefits. MAR has proven to be an effective approach for addressing spatial and temporal water shortages and mitigating climate change impacts on global water resources. Nevertheless, when designing MAR systems, competing objectives must be balanced, such as optimizing recharge efficiency while reducing operational costs. To address these trade-offs and aid decision-making, this study aims to develop a novel framework for the multi-objective optimization of MAR systems. The paper introduces the first design steps and the general structure of a framework that integrates the capabilities of the existing web-based groundwater modelling platform INOWAS (www.inowas.com) with a hybrid evolutionary algorithm. The framework effectively explores complex solution spaces by combining groundwater models set up on the INOWAS platform using tools from the MODFLOW family (MODFLOW-2005, MT3DMS, SEAWAT) with global search capabilities (using a genetic algorithm) and local refinement methods (using the simplex algorithm). This allows the simulation of specific MAR challenges, such as optimization of recharge well locations by maximizing the removal of total dissolved solids (TDS) at recovery wells, the water recovery efficiency, the recharge rate, the overall economic feasibility, etc. Solutions are expressed as Pareto fronts, which represent a set of optimal trade-off solutions that are non-dominated with respect to each other but superior to the rest of the solutions in the search space. To achieve a tool that eventually provides a robust framework for planners, engineers, and policymakers to design and manage MAR systems effectively, typical MAR scenarios will be defined to identify and classify boundary conditions and limiting factors together with the objectives to be optimized.

How to cite: Taani, M., Händel, F., Stefan, C., and Scheytt, T.: Web-based Framework for Multi-Objective Optimization of Managed Aquifer Recharge, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-4192, https://doi.org/10.5194/egusphere-egu25-4192, 2025.

A.13
|
EGU25-4909
Zhe Zhu, Yu Li, Guangtao Fu, and Chi Zhang

Pollution Source Identification (PSI) based on watershed environmental sensing (IoT, low-cost sensors, etc.) is a key topic in hydroinformatics and watershed water resources and quality management, and timely, accurate PSI is crucial for reducing water environmental risks. Machine learning-based PSI directly maps water environmental observations to source information, offering high computational efficiency and emerging as a new research trend. However, the high uncertainty and spatial sparsity of water environmental observations force machine learning-based PSI methods to face a trade-off between PSI accuracy and data volume and quality requirements, creating an urgent need for data-demand-reduction strategies to facilitate the practical adoption of PSI in water management. Therefore, this study proposes an X-T-C image recognition-based ResNet machine learning PSI method coupled with data inpainting techniques (InRes-PSI). InRes-PSI converts spatial coordinates (X), time (T), and pollutant concentration (C) into 2D images and realizes end-to-end localization and reconstruction through multi-feature convolution, reducing the interference of data uncertainty. In addition, InRes-PSI integrates an image inpainting strategy to fill missing data under sparse monitoring conditions, thereby ensuring reliable PSI with fewer data and reducing the data volume demand of PSI. Tests on real and semi-synthetic river cases show that InRes-PSI effectively handles non-point pollution uncertainty, improving PSI accuracy by 6.27% and 7.72% compared to the Batch-Matching method and LeNet, respectively. As for data-demand reduction, the inpainting strategy enables reliable PSI even when half of the grid data are missing, which can reduce the density of stations by about 55% in a real watershed. Additionally, we discovered a logarithmic relationship between the river flow field characteristic (Péclet number) and sensor deployment density, indicating that diffusion-dominated rivers require higher sensor density. This finding can provide an intuitive and transferable design for water-environment sensing and digital watershed management.
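
A minimal sketch of the X-T-C rasterization idea (NumPy; grid sizes and observation indices are illustrative, not the InRes-PSI implementation):

    # Minimal sketch: rasterize sparse (X, T, C) observations into a 2D image for a
    # CNN, with NaNs marking the gaps an inpainting step would later fill.
    import numpy as np

    n_x, n_t = 32, 64                                   # spatial and temporal grid size
    obs_x = np.array([3, 10, 10, 25])                   # station grid indices (assumed)
    obs_t = np.array([5, 5, 40, 12])                    # time-step indices (assumed)
    obs_c = np.array([0.8, 1.2, 0.9, 0.3])              # pollutant concentrations

    image = np.full((n_x, n_t), np.nan)                 # X-T plane, concentration as pixel value
    image[obs_x, obs_t] = obs_c
    print("observed pixels:", np.count_nonzero(~np.isnan(image)), "of", image.size)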

How to cite: Zhu, Z., Li, Y., Fu, G., and Zhang, C.: Optimal Pollution Source Identification via machine learning approach based on X-T-C image recognition and Inpainting, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-4909, https://doi.org/10.5194/egusphere-egu25-4909, 2025.

A.14
|
EGU25-5151
|
ECS
Henning Müller, Marvin Hempel, Jens Heger, and Kai Schröter

Water Management in low-lying coastal regions of Germany is characterized by controlled drainage of polder areas. Flood risk in these coastal polders depends on the storage and drainage capacity of the infrastructure and the effectiveness of drainage control. Current operations rely on on-site specialists who base their decisions on expertise, system status, and ad-hoc interpretation of weather and tidal forecasts to manage the system and meet variable target stages. Effective management requires the consideration of flood and tidal dynamics of the adjacent marine or fluvial systems as well as the flood dynamics within the polder. Climate change significantly impacts these factors, driving adaptation needs for drainage management for low-lying coastal regions.

To address these challenges, we develop a model-based approach for optimizing drainage operations in a German coastal polder, aligning water and energy objectives to enhance flood risk and water resource management through increased operational flexibility. The model system incorporates deep learning-based forecasts of drainage volumes and water levels, surrogate models of drainage processes and wind energy availability, operational status data, and meteorological and tidal forecasts to optimize short-term sluice and pump operations of the primary drainage infrastructure via mixed-integer linear programming. We show that this integrated optimization approach reschedules pumping operations to coincide with high energy availability periods, thus reducing costs and enhancing renewable energy utilization while meeting the drainage management objectives. This approach is also applicable for anticipatory drainage management, facilitating preemptive adjustments to drainage operations in response to impending flood events or prolonged drought conditions, thereby mitigating associated risks.
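
A toy sketch of such a mixed-integer schedule (SciPy; the price signal, pump capacity and target volume are assumptions, not the operational model):

    # Minimal sketch: schedule binary pump on/off decisions over 24 hours to meet a
    # drainage volume at minimum energy cost, via SciPy's MILP interface.
    import numpy as np
    from scipy.optimize import milp, LinearConstraint, Bounds

    hours = 24
    price = np.random.default_rng(3).uniform(0.05, 0.30, hours)   # assumed energy price signal
    pump_capacity = 1500.0                                         # m3 drained per pumped hour (assumed)
    required_volume = 12_000.0                                     # m3 to drain within the horizon (assumed)

    c = price * pump_capacity                                      # cost of pumping in each hour
    volume_constraint = LinearConstraint(np.full((1, hours), pump_capacity),
                                         lb=required_volume, ub=np.inf)
    res = milp(c,
               constraints=volume_constraint,
               integrality=np.ones(hours),                         # integer (here binary) decisions
               bounds=Bounds(0, 1))
    print("pump in hours:", np.flatnonzero(res.x > 0.5), "total cost:", res.fun)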

How to cite: Müller, H., Hempel, M., Heger, J., and Schröter, K.: Integrated modelling and control optimization for adaptive drainage management in coastal lowlands, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-5151, https://doi.org/10.5194/egusphere-egu25-5151, 2025.

A.15
|
EGU25-5432
SangJin Jung, SungMin Suh, JunGi Moon, JeongHwan Baek, SeungHyeon Lee, ChanHae Ok, and Jongcheol Pyo

High concentrations of chlorophyll-a (Chl-a) in aquatic systems pose serious environmental and public health concerns. Chl-a, a primary marker of phytoplankton biomass, is often associated with the proliferation of harmful algal blooms (HABs). These blooms produce toxins that not only threaten marine organisms but also have far-reaching impacts on human health and aquatic ecosystems. These toxins can degrade water quality, disrupt food webs, and result in significant fish mortality. When these harmful substances contaminate drinking water sources, they can cause a range of health problems, from short-term illnesses to chronic diseases.

Despite the importance of predicting Chl-a levels, earlier research has largely focused on water quality parameters without adequately considering the dynamic nature of river hydrology. This study bridges that gap by leveraging satellite data to enhance predictive accuracy. Sentinel-2 imagery was utilized to monitor water quality, while Sentinel-1 data captured the hydrological characteristics of rivers. To forecast Chl-a, four machine learning models were deployed, with their performance evaluated through Nash-Sutcliffe Efficiency (NSE) and Root Mean Square Error (RMSE) metrics. Additionally, the study used Shapley Additive Explanations (SHAP) to unravel the contribution of individual water quality variables and satellite-derived data to the prediction process.

By integrating hydrological factors with water quality predictions, this research provides a more holistic understanding of river systems. Such insights are vital for optimizing the operation of water management structures like dams and weirs. Moreover, the incorporation of retention time analysis offers a proactive approach to monitoring and preventing HABs, enabling more effective management of aquatic ecosystems under varying environmental conditions worldwide.

How to cite: Jung, S., Suh, S., Moon, J., Baek, J., Lee, S., Ok, C., and Pyo, J.: Water Quality Prediction Using Machine Learning with Hydrologic factors and Satellite Imagery Integration, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-5432, https://doi.org/10.5194/egusphere-egu25-5432, 2025.

A.16
|
EGU25-7069
|
ECS
Karthik Ramesh, Laura Ramsamy, Patricia Sullivan, Nicholas Leach, Victor Padilha, Graham Reveley, Sally Woodhouse, Joe Stables, James Brennan, Aidan Starr, and Claire Woodcock

Advancements in the fields of remote sensing and high-performance computing have facilitated higher resolution and coverage of global flood risk maps. However, the distribution and availability of streamflow data from in-situ gauges are not uniform, posing significant challenges. Machine learning (ML) offers a powerful framework to augment streamflow datasets by leveraging diverse data sources, such as remote sensing, climate reanalysis, and hydrological simulations. This study explores the application of ML techniques to generate synthetic streamflow data for ungauged basins, enhancing the coverage and quality of global flood models for commercial applications. By integrating the principles of physical hydrology with data-driven approaches, we demonstrate that ML can effectively capture the spatial and temporal dynamics of streamflow in regions with scarce observational data or seasonal variation in flows. Key methods include supervised learning algorithms trained on gauged basins to predict streamflow and create a synthetic dataset of streamflow observations. Validation using global hydrological benchmarks indicates that the ML-augmented datasets significantly improve flood prediction accuracy, particularly in data-sparse regions.

How to cite: Ramesh, K., Ramsamy, L., Sullivan, P., Leach, N., Padilha, V., Reveley, G., Woodhouse, S., Stables, J., Brennan, J., Starr, A., and Woodcock, C.: Machine Learning to augment global flood modelling in ungauged basins, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-7069, https://doi.org/10.5194/egusphere-egu25-7069, 2025.

A.17
|
EGU25-7429
|
ECS
Camilla Giulia Billari, Marc Girona-Mata, Kevin Wheeler, Andrea Marinoni, and Edoardo Borgomeo

Hydrological analysis and prediction with sparse and discontinuous data remain a key challenge for water resources planning and climate adaptation, especially in large river basins across the Global South.  Traditional stochastic hydrology methods and process-based models often fall short in their attempts to capture the complexity of these systems. Recent efforts to apply machine learning for river discharge imputation (assigning values to any data gaps in the target variable) and reconstruction (the inclusion of other proxy data to further inform imputation, such as climatic variables) show promise in creating complete historical datasets based on a limited set of discontinuous observations. However, these methods have not been tested on datasets from large river basins with a high proportion of missing values. Here, we address this gap and investigate the suitability of machine learning methods for streamflow imputation and reconstruction in a case study of the Nile River basin. We tested a range of common regression models, imputers (algorithms designed specifically for the purpose of estimating missing data points but with limited flexibility), and Conditional Neural Processes (CNPs, models that leverage the advantages of both deep neural networks and Gaussian Processes). We modelled 13 stations with different observational periods to fill a dataset with 53% missing values between 1900-2002. The first set of benchmarking experiments relied solely on spatio-temporal gauged streamflow data as input to the models (imputation). The second set also incorporated climate proxies from ECMWF ERA5 reanalysis data to model streamflow from 1964-2002 (reconstruction). For this, we took monthly average precipitation, temperature, relative humidity, wind speed, and soil moisture data.

Imputation experiments found that random forest and gradient-boosting regressors achieved the most consistent mean and median scores for Root Mean Squared Error (RMSE), Coefficient of Determination (R2), and Nash-Sutcliffe Efficiency (NSE) across all stations. Bayesian ridge regression and the CNP performed the worst on these metrics. Reconstruction experiments using the same models with the added input of climate proxies yielded similar findings, with gradient-boosting regression again outperforming the other methods. The CNP showed a marked improvement in metric performance when these proxies were included, while the regressors modelled the data less accurately. This suggests that contextual data benefit the meta-learning capabilities of the CNP, but provide too much information for the regressions to capture. The CNP was the only well-performing model tested that provided uncertainty estimates for its predictions. Nearly all models achieved an average NSE > 0.7 across all stations in all experiments, suggesting that machine learning methods can be a reliable and scalable streamflow imputation approach. The approach developed in this study can be applied to other river basins with sparse observations to build more complete hydrological datasets for water resources management and planning applications.
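
A minimal sketch of gap filling with a gradient-boosting regressor scored by NSE (scikit-learn; the gauge series are synthetic, not the Nile records):

    # Minimal sketch: fill gaps at one gauge from neighbouring gauges, scoring the
    # fit on artificially masked values (roughly mimicking the 53% gap fraction).
    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor

    rng = np.random.default_rng(4)
    neighbours = rng.random((1200, 5))                         # monthly flows at 5 nearby gauges
    target = neighbours @ np.array([0.4, 0.3, 0.1, 0.1, 0.1]) + 0.05 * rng.random(1200)

    mask = rng.random(1200) < 0.53                             # ~53% of the target series "missing"
    model = GradientBoostingRegressor(random_state=4)
    model.fit(neighbours[~mask], target[~mask])                # train where the target is observed
    filled = model.predict(neighbours[mask])                   # impute the gaps
    nse = 1 - np.sum((target[mask] - filled) ** 2) / np.sum((target[mask] - target[mask].mean()) ** 2)
    print("NSE on masked values:", round(nse, 3))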

How to cite: Billari, C. G., Girona-Mata, M., Wheeler, K., Marinoni, A., and Borgomeo, E.: Machine Learning for Reconstructing Streamflow Time Series: An Application to the Nile River, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-7429, https://doi.org/10.5194/egusphere-egu25-7429, 2025.

A.18
|
EGU25-7759
Park Jongpyo, Darae Kim, Soohyun Kim, and Yonghyeon Gwon

A flood analysis of the Hyoja drainage basin (Gwanghwamun area) in Seoul was conducted using the XP-SWMM model, which includes 315 pipeline data records and 293 subbasins. The model was created by considering topographical factors (e.g., pipeline slope and buildings). Curve number (CN) values were determined for each subbasin based on land use and detailed soil maps for estimating infiltration. Additionally, a digital terrain model (DTM) was generated using a 1:5,000 digital topographical map and survey data.

In the XP-SWMM model, the evaluation of ponding varies depending on the ponding conditions (e.g., Ponding Allowed (PA), Link Spill Crest to 2D (2D), and No Ponding (NP)), leading to considerable differences in the calculation of hydraulic heads at the nodes. The Hyoja drainage basin, the focus of this study, features a steep slope in the upstream area and hilly terrain downstream, indicating potential limitations in accurately modeling flood volume.

To overcome the limitations of the XP-SWMM model, the surface slope and lowlands were incorporated when establishing node conditions. Areas with a slope of less than 5% were assigned the “2D” option, whereas areas with a slope greater than 5% were assigned the “NP” condition. In addition, supplementary modeling was conducted by combining certain subbasins to assess the impact of runoff from upstream areas on downstream flooding.

The flood volume analysis based on the ponding conditions under 95 mm/hr rainfall revealed that the “NP” condition tended to underestimate flood volume, while the “2D” condition tended to overestimate it by assuming ponding was present even in sloped areas. As a result, the “2D” condition was applied to basins with a surface slope of 5% or less, while the “NP” condition was applied to areas that were not lowlands. Lowlands were designated as areas with an elevation below 40.0 m, accounting for approximately 20% of the total area.

When comparing the flood volume before and after merging the subbasins, the flood volume in upstream areas with a slope of 5% or more decreased, whereas the volume in the downstream Gwanghwamun area increased. This situation was attributed to runoff from the upstream basin, which added to the increase in the downstream flood volume.

The study findings showed that by setting node ponding conditions that consider basin slope and lowland conditions, flooding events that closely resemble actual occurrences can be simulated, so the limitations of the XP-SWMM model can be overcome. Furthermore, merging subbasins has proven to be an effective approach for analyzing the interactions between upstream and downstream regions and assessing the impacts of critical flood zones. Technical decisions made when designing a model should reflect the characteristics of the basin and topographical factors.

Acknowledgements

This work was supported by Korea Planning & Evaluation Institute of Industrial Technology funded by the Ministry of the Interior and Safety (MOIS, Korea). [Development and Application of Advanced Technologies for Urban Runoff Storage Capability to Reduce the Urban Flood Damage / RS-2024-00415937]

 

 

How to cite: Jongpyo, P., Kim, D., Kim, S., and Gwon, Y.: Assessment of Model Applicability Based on Node Ponding Conditions of the Hyoja Drainage Basin (Gwanghwamun area), EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-7759, https://doi.org/10.5194/egusphere-egu25-7759, 2025.

A.19
|
EGU25-7767
Inkyeong Sim, Kyoungdo Lee, Yonghyeon Gwon, and Darae Kim

Recently, Korea has faced several challenges associated with limited water supply and agricultural water shortages, driven by an increase in the frequency of sudden droughts and a decrease in the water storage rate of reservoirs. Reservoirs play a crucial role in the water supply during periods of drought. Enhancing water supply efficiency through optimized reservoir management is becoming increasingly important for efficient use of the water stored in reservoirs during droughts. Therefore, this study aimed to assess the water supply capacity of new reservoirs based on different scenarios to ensure their functionality during emergencies.

Two scenarios were developed and applied in this study based on the installation and operating conditions of a new water source with a daily capacity of 7,000 m³ to analyze and evaluate its water supply capacity. First, a scenario was developed based on water usage; second, another scenario was created based on precipitation levels. The water supply capacity was then assessed for each scenario.

The reservoir inflow was calculated using the natural flow determined by the TANK rainfall-runoff model with a soil moisture storage structure. The flow rate used to analyze the water supply capacity of the water source was derived from 10 years of flow data (2012–2021).

In the water supply capacity analysis using the first scenario, the reservoir tracking method was used to analyze the daily time series of the reservoir inflow, evaporation, and water supply volume. Subsequently, an annual assessment was conducted to evaluate the impact of water shortages on the daily water supply. The water supply capacity review estimated that, under the operation condition of using 7,000 m³ of water per day, there would be 367 days (out of a total of 3,652 days) with water shortages over a 10-year period. The normal supply guarantee rate was 89.9%. Furthermore, water usage conditions ranging from 8,000 m³ to 14,000 m³ were applied in succession. The number of days with water shortages ranged from a minimum of 469 to a maximum of 903 days, with the supply guarantee rate varying between 75.3% and 87.2%.

Subsequently, the precipitation scale scenario was implemented based on the operating condition of 7,000 m³ per day, with precipitation levels ranging from 10 mm to 43.5 mm and a water storage rate capped at 100%. The application of the scenario revealed that the number of supply days, depending on the precipitation level, ranged between 6 and 27 days. 

 

Acknowledgements

This work was supported by the Korea Environment Industry & Technology Institute (KEITI) through the Water Management Program for Drought Project, funded by the Korea Ministry of Environment (MOE) (RS-2023-00230286).

 

How to cite: Sim, I., Lee, K., Gwon, Y., and Kim, D.: Scenario-Based Assessment of Water Supply Capacity for New Reservoirs, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-7767, https://doi.org/10.5194/egusphere-egu25-7767, 2025.

A.20
|
EGU25-7993
|
ECS
|
Conrad Brendel, René Capell, Alena Bartosova, Mark Horan, and Duong Bui

HYPEtools is an open-source R package for hydroinformatics that simplifies hydrological modeling and data analysis through a suite of tools for data management, visualization, interpretation, and exploration. Although initially conceived as a companion toolbox for the HYPE hydrological model, HYPEtools has since grown into a standalone package with a diverse collection of functions which can be used as standalone tools or as building blocks for larger scripts, workflows, and apps. Applications of the package include (1) data manipulation, conversion, and aggregation, (2) analysis and summarization of complex datasets, (3) plotting and mapping, and (4) interactive data exploration. Case studies demonstrate how HYPEtools can be used to facilitate hydrological workflows and analyses, both independently of and within HYPE modeling contexts. First, HYPEtools is used to streamline the analysis and mapping of water transfers in South Africa. Then, HYPE model setups of river basins in Vietnam illustrate how HYPEtools can assist with the development, calibration, and validation of hydrological and water quality models.


Brendel, C., R. Capell, and A. Bartosova. (2024) Rational gaze: Presenting the open-source HYPEtools R package for analysis, visualization, and interpretation of hydrological models and datasets. Environmental Modelling & Software, 178, 106094. https://doi.org/10.1016/j.envsoft.2024.106094

How to cite: Brendel, C., Capell, R., Bartosova, A., Horan, M., and Bui, D.: Hyping up HYPEtools, an open-source R package for analysis, visualization, and interpretation of hydrological models and datasets, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-7993, https://doi.org/10.5194/egusphere-egu25-7993, 2025.

A.21
|
EGU25-9405
|
ECS
Alberto Mena, Rafael J. Bergillos, Javier Paredes-Arquiola, Abel Solera, and Joaquín Andreu

The Tagus-Segura aqueduct (TSA) is a strategic water transfer scheme and the largest hydraulic infrastructure in Spain. It consists of a 286 km-long pipeline that connects the Bolarque reservoir, in the Tagus River Basin, to the Talave reservoir, in the Segura River Basin, which is one of the most water-stressed Mediterranean basins.

To ensure the sustainable management of the system, a series of water transfer rules was created that establishes the monthly transferred volume according to the total water volume stored in the Entrepeñas and Buendía reservoirs, located in the headwaters of the Tagus River Basin, and the inflows to these reservoirs in the previous twelve months.

Artificial intelligence methods, such as Artificial Neural Networks (ANN), have become very popular in streamflow forecasting applications due to their simple implementation, low requirement of hydrological data and good prediction performance. Accurate and reliable streamflow forecasting may have a significant impact on water resources management, especially for reservoir operation optimization.

This work focuses on the development of ANN models to predict the monthly inflows to the Entrepeñas and Buendía reservoirs. For each reservoir, multi-layer perceptron ANNs with backpropagation were trained using monthly historical data of inflows and precipitation. To identify the model with the best performance, various tests were conducted involving different combinations of hyperparameters as well as varying sets of explanatory variables. The models were evaluated using the Nash-Sutcliffe Efficiency (NSE) coefficient to assess their predictive accuracy in each of the subsets: training, validation, and testing.

The best fit was achieved by incorporating several lags from the original series along with precipitation data including a single lag. This combination resulted in a fit to the full series with NSE values exceeding 0.7 for the inflows to both reservoirs.

These models could be used to support the management of water resources in the TSA system. By identifying future trends in water resource availability, decision-makers can implement more efficient strategies to optimize water allocation, ensure sustainability, and mitigate the effect of potential droughts.

How to cite: Mena, A., Bergillos, R. J., Paredes-Arquiola, J., Solera, A., and Andreu, J.: Machine Learning Approaches to Optimize Water Management in the Tagus-Segura Aqueduct System, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-9405, https://doi.org/10.5194/egusphere-egu25-9405, 2025.

A.22
|
EGU25-9515
|
ECS
Shiwei Yang and Ruifeng Liang

The construction of reservoirs has altered river water temperatures, consequently impacting aquatic ecosystems. In this study, we investigated the influence of cascade reservoirs on river water temperature, focusing on six cascade reservoirs in the Lancang River Basin. Seasonal and trend decomposition (STL) and the Pettitt test were employed to analyze the characteristics of water temperature changes. Trend analysis and the Pettitt test identified critical water temperature change points that align with reservoir construction times. Comparison of water temperature data before and after reservoir construction shows that reservoir construction significantly changes the annual water temperature regime, with a significant increase in low-temperature water. Among all reservoirs, Xiaowan (XW) and Nuozhadu (NZD), two reservoirs with high regulation capacity, have a particularly prominent impact on water temperature. Ecological operation is an effective way to improve reservoir outflow water temperature, and it relies on accurate outflow water temperature prediction. Compared to numerical models, machine learning models have the advantages of high efficiency and nonlinear fitting; hence, they can be used to predict reservoir outflow water temperature. However, most machine learning models in previous studies exhibit poor interpretability. To simulate and predict reservoir outflow water temperature, four machine learning algorithms—Support Vector Regression (SVR), Random Forest (RF), Light Gradient Boosting Machine (LightGBM), and eXtreme Gradient Boosting (XGBoost)—were applied to the XW and NZD reservoirs. Hyperparameters were tuned using Bayesian optimization. Results indicated that the XGBoost model performed best, achieving the highest prediction accuracy (RMSE ≤ 0.25°C, R² = 0.98), with a maximum prediction error of less than 1°C. RF and LightGBM also demonstrated strong performance, while SVR showed relatively lower accuracy. To improve the interpretability of the machine learning models, we used the Shapley additive explanations (SHAP) method to quantify the importance of the input features. SHAP analysis of the XGBoost model revealed that thermal input factors, such as reservoir inflow temperature (Tin) and inflow discharge (Qin), were the most influential variables affecting outflow water temperature, followed by reservoir operation factors, including outflow discharge (Qout) and water level (WL). Air temperature (Tair) had the least impact. The research framework and results can provide a reference for reservoir ecological regulation and watershed ecological environment management.
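
A minimal sketch of the XGBoost-plus-SHAP workflow (the variable set follows the abstract, but the values are synthetic and the Bayesian hyperparameter tuning is omitted):

    # Minimal sketch: XGBoost model of outflow temperature with SHAP feature ranking.
    import numpy as np
    import pandas as pd
    import xgboost as xgb
    import shap

    rng = np.random.default_rng(5)
    X = pd.DataFrame({
        "Tin": rng.uniform(8, 22, 2000),       # inflow water temperature
        "Qin": rng.uniform(500, 6000, 2000),   # inflow discharge
        "Qout": rng.uniform(500, 6000, 2000),  # outflow discharge
        "WL": rng.uniform(1150, 1240, 2000),   # water level
        "Tair": rng.uniform(0, 30, 2000),      # air temperature
    })
    y = 0.8 * X["Tin"] + 0.001 * X["Qin"] + 0.05 * X["Tair"] + rng.normal(0, 0.3, 2000)

    model = xgb.XGBRegressor(n_estimators=300, max_depth=4).fit(X, y)
    shap_values = shap.TreeExplainer(model).shap_values(X)
    importance = np.abs(shap_values).mean(axis=0)       # mean |SHAP| per input feature
    print(dict(zip(X.columns, importance.round(3))))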

How to cite: Yang, S. and Liang, R.: A Novel Perspective on Exploring Reservoir Impacts on River Water Temperature Using Machine Learning, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-9515, https://doi.org/10.5194/egusphere-egu25-9515, 2025.

A.23
|
EGU25-10814
Xuan Ji

The VIC model, as a large-scale, semi-distributed hydrological model, has been widely used in basin-to-global-scale applications, including hydrological dataset construction, trend analysis of hydrological fluxes and states, data evaluation and assimilation, forecasting, coupled climate modeling, and climate change impact assessment. However, on the one hand, since the VIC model was developed on the Linux/Unix platform, it lacks an interactive visual interface, which makes it inconvenient to apply. On the other hand, the VIC model requires a large amount of complex work in data preparation, parameter file creation, and calibration, which places high demands on users' geographic data processing capabilities and programming skills. These limitations, to some extent, make the model daunting for beginners and hinder its popularization and application.

This study, based on the ArcGIS environment and using the Python Add-In mode, developed a user-friendly, low-threshold visual modeling tool for the VIC model, enabling model construction, data processing, parameter calibration, and result display on a unified platform. Firstly, users can complete all modeling data preparation simply by operating the interface. Secondly, the tool enables the VIC model to be invoked seamlessly on the Windows platform and introduces multi-process parallel processing to enhance the operational efficiency of the VIC model. In addition, the tool integrates the Genetic Algorithm (GA) and the SCE-UA algorithm, allowing for lumped, non-lumped, and multi-site collaborative parameter calibration, providing users with diverse options for their research. The tool has been tested in the Brahmaputra River Basin, the Mekong River Basin, the Irrawaddy River Basin, and the upper Yangtze River, confirming its convenience and efficiency in practical applications. We believe that this tool is friendly and attractive to beginners in hydrological modeling and can promote the popularization and application of the VIC hydrological model.

How to cite: Ji, X.: ArcVIC: An ArcGIS-based Tool for the VIC (Variable Infiltration Capacity) Model, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-10814, https://doi.org/10.5194/egusphere-egu25-10814, 2025.

A.24
|
EGU25-13977
Maria Kireeva, Artem Gorbarenko, and Vsevolod Moreydo

The last decade has been crucial for the development of artificial intelligence in all spheres of human life. The interest in this field can be explained by two factors: the increase in computational capacity and the availability of large datasets in both qualitative and quantitative terms. This study is devoted to the application of the LSTM neural network to modeling and forecasting daily discharge time series for rivers of the East European Plain fed predominantly by snowmelt or mixed sources. A unique CAMELS_ru dataset, including both dynamic and static characteristics for 75 rivers, was created. Reanalysis data, geospatial grids, and time series of meteorological and hydrological characteristics covering a period of 70 years, from 1950 to 2019, were collected and processed. As part of the input data preparation, information on soil parameters, forestation, geological structure, and averaged climatic and hydrological parameters was obtained. Based on the existing LSTM architecture, a model implementation for the selected rivers was created. The dataset was partitioned in the following ratio: 60% for the training sample, 10% for the validation sample, and 30% for the test sample.

The Nash-Sutcliffe coefficient is close to 0.9 in most cases, which indicates that the model has sufficient predictive ability. The model captures the main patterns and trends in the existing data well, and the low value of the RMSE to STD ratio confirms that it is able to predict the time series with high accuracy. However, forecasting historical extreme events that lie beyond existing time-series data remains a “mission impossible” due to the overall concept of data-driven (DD) models.

An important modeling experiment was conducted on a reduced sample of 5 years, which confirmed that the high modeling performance is a consequence of the long data series rather than an error. Additionally, it is assumed that in the context of using neural networks, there is no need to limit the time series to the last 30 years because of climate variability. The approach underlying the use of neural networks allows the model to account for climate dynamics when building internal relationships, thus ensuring good data reproduction and high-quality modeling. The current version of the model is a promising starting point for the development of data-driven models at a major regional scale.

How to cite: Kireeva, M., Gorbarenko, A., and Moreydo, V.: Harnessing data-driven insights: advanced modeling of discharge time-series for the East European plain, application and potential, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-13977, https://doi.org/10.5194/egusphere-egu25-13977, 2025.

A.25
|
EGU25-14255
Yangwan Kim and Jongmin Park

Recently, global climate change has led to intensified heavy rainfall and extreme floods. As a result, analyzing, monitoring, and predicting flood risks caused by extreme rainfall and water level fluctuations have become increasingly important. In South Korea, river water levels are continuously monitored using telemeter (TM) based water level observations. However, the uneven distribution of water level gauges makes it difficult to obtain water level data over ungauged streams or basins. This poses a significant constraint on developing spatial monitoring and prediction systems for extreme flood events.

To address these challenges, this study proposes a deep learning-based algorithm for estimating river water levels using Sentinel-1 Synthetic Aperture Radar (SAR), which provides spatially continuous observations over ungauged basins. Sentinel-1 SAR has the advantage of providing observations in all weather conditions, and its data, based on surface roughness characteristics, are widely used in urban flooding and flood mapping research.

This study extends beyond water body and flood monitoring by aiming to estimate river water levels based on Sentinel-1 SAR data and the variation in backscatter intensity due to river water level changes. It seeks to overcome the limitations of existing observation systems and provide a new methodology that can contribute to flood response and water resource management.

In this study, a Long Short-Term Memory (LSTM)-based water level estimation model was developed, with σ⁰VH, σ⁰VV, and the local incidence angle from Sentinel-1 C-band SAR (from 2015 to 2024), together with the Day of Year (DOY), considered as input variables. For the training datasets, water level observations from telemeter stations were used. In order to find the optimized set of input variables, this study generated sets of input data scenarios based on Fisher’s Chi-Square test, and model performance was examined using multiple statistical indices (e.g., correlation coefficient [R], Root Mean Square Error [RMSE], Mean Absolute Error [MAE], and the Index of Agreement [IOA]). Overall, results indicated that at 4 out of 11 stations, LSTM models with all four input variables yielded the best statistical performance. In particular, the Nasan Bridge station, located in Hampyeong, South Korea, yielded the best statistical results, with an R of 0.77, RMSE of 0.30 m, MAE of 0.25 m, and IOA of 0.63. However, other locations yielded relatively low statistical results, which can be attributed to their relatively less dynamic water level variations. For future study, separating the training period based on rainfall patterns and explicitly considering meteorological information could help to enhance overall model performance.
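
For reference, the evaluation metrics named above can be computed as in the sketch below (NumPy; the observed and simulated levels are placeholders, and the IOA follows Willmott's standard formulation):

    # Minimal sketch: R, RMSE, MAE and Willmott's Index of Agreement on placeholder arrays.
    import numpy as np

    def index_of_agreement(obs, sim):
        obs, sim = np.asarray(obs), np.asarray(sim)
        num = np.sum((obs - sim) ** 2)
        den = np.sum((np.abs(sim - obs.mean()) + np.abs(obs - obs.mean())) ** 2)
        return 1.0 - num / den

    obs = np.array([1.2, 1.5, 1.1, 2.0, 1.8])        # placeholder observed water levels (m)
    sim = np.array([1.1, 1.6, 1.0, 1.7, 1.9])        # placeholder LSTM estimates (m)

    r = np.corrcoef(obs, sim)[0, 1]
    rmse = np.sqrt(np.mean((obs - sim) ** 2))
    mae = np.mean(np.abs(obs - sim))
    print(f"R={r:.2f}, RMSE={rmse:.2f} m, MAE={mae:.2f} m, IOA={index_of_agreement(obs, sim):.2f}")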

Acknowledgement: This work was supported by the Korea Environment Industry & Technology Institute (KEITI) through the R&D Program for Innovative Flood Protection Technologies against Climate Crisis, funded by the Korea Ministry of Environment (MOE) (RS-2023-00218873).

This work was also supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2024-00416443).

How to cite: Kim, Y. and Park, J.: Water Level Estimation by using Sentinel-1 C-band SAR and Deep Learning approach, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-14255, https://doi.org/10.5194/egusphere-egu25-14255, 2025.

A.26
|
EGU25-16887
|
ECS
Basil Kraft, William H. Aeberhard, and Lukas Gudmundsson

Neural networks are increasingly used in hydrological applications. In streamflow modeling, long short-term memory (LSTM) networks have demonstrated considerable skill in lumped configurations, where hydrological and meteorological properties are averaged at the catchment scale. However, such averaging may mask important sub-catchment dynamics and routing processes. Process-based, semi-distributed models address these limitations by partitioning catchments into smaller hydrological response units (HRUs) for more detailed simulations, albeit at higher computational cost and with added complexity.

This research proposes a semi-distributed deep learning approach, merging the computational efficiency of neural networks with the spatial fidelity of HRU-based models. By explicitly modeling streamflow routing at the sub-catchment level, the framework seeks to provide improved streamflow predictions, whilst providing spatially explicit runoff predictions at sub-catchment scale.

We developed and tested our approach on a fine-grained grid of 20’000 HRU polygons over Switzerland. Despite the fine spatial resolution, a routed forward run for multiple decades is computed within minutes. The proposed framework has the potential to deliver real-time, spatially resolved forecasts to support improved water resource management, risk mitigation, and early warning efforts.

How to cite: Kraft, B., Aeberhard, W. H., and Gudmundsson, L.: Deep learning for efficient semi-distributed streamflow modeling, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-16887, https://doi.org/10.5194/egusphere-egu25-16887, 2025.

A.28
|
EGU25-19634
Luisa-Bianca Thiele, Gerret Lose, Alexander Verworn, and Markus Wallner

Climate change is causing an increase in extreme, high-intensity rainfall events, which are locally and temporally limited and pose a significant risk to urban stormwater drainage. Particularly affected are densely populated areas and urban hardscapes, where substantial damage potential is expected. The hydraulics of the drainage network can be calculated with a high degree of accuracy using spatially and temporally high-resolution rainfall data. Numerical stormwater models are used for this purpose. However, such models have the disadvantage of long computation times, which can exceed the lead time of a forecast depending on the application. Our aim is to improve forecasting and early warning systems for the operational optimisation of drainage network control and hazard prevention in the stormwater drainage system using artificial neural networks (ANN).

The study area is part of the city of Osnabrück in Germany. To quantify the risk of overflow, a hydrodynamic stormwater model is run with 1896 real and synthetic rainfall events with a temporal resolution of 5 minutes, durations between 15 and 60 minutes, and return periods between 1 and 100 years. The catchment area is divided into 1×1 km pixels, and one of four risk categories is assigned to each pixel based on the sum of the overflow over the time steps. To check whether the risk categories can be reduced by controlling the drainage network, three different control scenarios are calculated hydrodynamically in addition to the uncontrolled state, so that 7584 (4 × 1896) simulations are available for training the ANN. Finally, the ANN will be evaluated for its suitability to support decision-making on an optimised control scenario in real time. A particular challenge is evaluating the AI model when comparing risk categories across neighbouring pixels.
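
To illustrate the classification task (an assumed setup with hypothetical features and placeholder labels, not the study's ANN), the Python sketch below trains a small multilayer perceptron that maps per-pixel rainfall-event and control-scenario descriptors to one of four risk categories.

import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples = 7584                       # 4 control scenarios x 1896 rainfall events
# Hypothetical per-pixel features: rainfall depth, peak intensity, duration,
# return period, control-scenario index (all synthetic here)
X = rng.random((n_samples, 5))
y = rng.integers(0, 4, n_samples)      # four risk categories (placeholder labels)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
clf.fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))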

How to cite: Thiele, L.-B., Lose, G., Verworn, A., and Wallner, M.: Efficient drainage network control: from hydrodynamic modelling to AI-supported decision-making, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-19634, https://doi.org/10.5194/egusphere-egu25-19634, 2025.

Posters virtual: Tue, 29 Apr, 14:00–15:45 | vPoster spot A

The posters scheduled for virtual presentation are visible in Gather.Town. Attendees are asked to meet the authors during the scheduled attendance time for live video chats. If authors uploaded their presentation files, these files are also linked from the abstracts below. The button to access Gather.Town appears just before the time block starts. Onsite attendees can also visit the virtual poster sessions at the vPoster spots (equal to PICO spots).
Display time: Tue, 29 Apr, 08:30–18:00
Chairperson: Louise Slater

EGU25-18342 | Posters virtual | VPS9

Application of Machine Learning in Predicting the Water Temperature Released from Reservoirs 

Chen Junguang
Tue, 29 Apr, 14:00–15:45 (CEST) | vPA.11

The release of low-temperature water from a reservoir can have negative impacts on downstream fish spawning and on crop growth in irrigation areas. Predicting the discharged water temperature accurately and swiftly is therefore crucial. This study focused on the Pubugou Hydropower Station, a major project situated on the Dadu River in the upper reaches of the Yangtze River, and evaluated the impacts of meteorological factors and reservoir operational parameters on the released water temperature using Spearman correlation coefficients (R). To predict the discharged water temperature of Pubugou Reservoir, five models were optimized with genetic algorithms: random forest, support vector regression, convolutional neural network, long short-term memory network, and lightweight gradient boosting machine. The results showed that: (1) The dew point temperature exhibited the highest correlation with discharged water temperature (R = 0.89), whereas the correlations of wind speed, cloud cover, solar radiation, and dam-front water level with discharged water temperature did not exceed 0.4. (2) All five genetic-algorithm-optimized models performed well on the training set, especially the random forest model (R² = 0.997); the worst-performing model was the long short-term memory network (R² = 0.985). (3) In the prediction of discharged water temperature, all models fitted well, with R² greater than 0.93, mean absolute error no greater than 0.662 °C, and mean square error no greater than 0.852 °C. The random forest and lightweight gradient boosting machine models performed well on most of the sample data, with a small residual range, while the support vector regression and convolutional neural network models had smaller maximum residuals. This research indicates that machine learning methods can effectively predict the water temperature released from reservoirs, providing more reliable decision support for measures to alleviate the impact of reservoir discharge water temperature.
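
The predictor screening step can be sketched in Python as follows (an illustration only; the variable names and synthetic data are hypothetical and do not reproduce the study's values), computing Spearman correlations between candidate drivers and the discharged water temperature.

import numpy as np
import pandas as pd
from scipy.stats import spearmanr

rng = np.random.default_rng(1)
n = 365
df = pd.DataFrame({
    "dew_point":       rng.normal(15, 5, n),
    "wind_speed":      rng.normal(2, 0.5, n),
    "cloud_cover":     rng.uniform(0, 1, n),
    "reservoir_level": rng.normal(840, 3, n),    # hypothetical dam-front water level (m)
})
# Synthetic target loosely driven by dew point, mimicking the reported strong correlation
df["release_temp"] = 0.6 * df["dew_point"] + rng.normal(0, 1, n)

for col in df.columns.drop("release_temp"):
    r, p = spearmanr(df[col], df["release_temp"])
    print(f"{col:16s} R = {r:+.2f} (p = {p:.3f})")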

How to cite: Junguang, C.: Application of Machine Learning in Predicting the Water Temperature Released from Reservoirs, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-18342, https://doi.org/10.5194/egusphere-egu25-18342, 2025.

EGU25-2662 | ECS | Posters virtual | VPS9

 A Diversity Driven Deep Convolutional Network for Enhanced Coastal Urban Flood Risk Assessment 

Bowei Zeng, Guoru Huang, and Ge Yang
Tue, 29 Apr, 14:00–15:45 (CEST) | vPA.29

Climate change and urbanization intensify urban pluvial flooding, posing significant threats to human lives and infrastructure. This situation underscores the critical need for efficient and accurate predictive systems for disaster prevention and mitigation. Traditional flood simulation models, while precise, are often limited by their data-intensive requirements and substantial computational complexity. In contrast, deep learning (DL) models offer high efficiency and a powerful capability for processing large-scale non-linear data, making them highly appropriate for modeling complex flood dynamics. Consequently, integrating DL with conventional urban flood models has emerged as a promising strategy to enhance the accuracy and efficiency of flood prediction systems. However, existing research predominantly focuses on inland flooding, with limited attention to the role of tidal levels in coastal cities, which can significantly affect the accuracy of urban flood simulations.
To bridge this gap, this study proposes an innovative hybrid DL approach that explores spatial and temporal data to improve the accuracy and efficiency of urban flood simulations, particularly in coastal areas. Simulation results from physics-based urban flood models are utilized to construct a comprehensive database for the DL model. Afterwards, patch-size and random sampling methods are employed to construct the sample dataset for training DL models. The convolutional neural network (CNN)-based data-driven urban pluvial flood model can simulate floods using topographic, rainfall, and tidal data, enabling the simulation of large urban areas within seconds. Incorporating diverse input data and advanced network architectures enhances model robustness and generalization across various scales and rainfall events. Fusion models that combine the strengths of DL and traditional hydrological models demonstrate improved prediction accuracy and computational efficiency by integrating tidal data and other environmental factors. Consequently, these hybrid models hold significant potential for integration into early warning systems and for supporting decision-making processes in urban flood risk management.
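
A minimal sketch of the patch-based idea, assuming a simple fully convolutional architecture and synthetic inputs (not the authors' network), is given below: a CNN maps stacked terrain, rainfall, and tidal channels for a sampled patch to a per-cell flood depth.

import torch
import torch.nn as nn

class FloodCNN(nn.Module):
    def __init__(self, in_channels: int = 3):    # e.g. DEM, rainfall depth, tidal level
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 1),                 # per-cell flood depth
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).relu()                # depths are non-negative

patches = torch.randn(8, 3, 64, 64)              # 8 randomly sampled 64x64 patches
model = FloodCNN()
depth = model(patches)
print(depth.shape)                               # torch.Size([8, 1, 64, 64])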

How to cite: Zeng, B., Huang, G., and Yang, G.:  A Diversity Driven Deep Convolutional Network for Enhanced Coastal Urban Flood Risk Assessment, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-2662, https://doi.org/10.5194/egusphere-egu25-2662, 2025.

EGU25-19160 | Posters virtual | VPS9

Meta-modeling of a physically-based pesticide runoff model with a Long-Short term Memory approach 

Guillaume Métayer, Cécile Dagès, Marc Voltz, and Jean-Stéphane Bailly
Tue, 29 Apr, 14:00–15:45 (CEST) | vPA.31

Surface water contamination by pesticides is widespread across the European Union (European Environment Agency, 2024). A primary pathway for pesticide transfer from agricultural fields to surface waters is surface runoff (Wauchope et al., 1995; Louchart et al., 2001 [https://doi.org/10.2134/jeq2001.303982x]; Reichenberger et al., 2007 [https://doi.org/10.1016/j.scitotenv.2007.04.046]). This process is influenced by various spatial and temporal factors, including compound properties, topography, application date and methods, climate, soil properties, and agricultural practices (Shipitalo and Owens, 2003 [https://doi.org/10.1021/es020870b]). Richards-based models are valuable for predicting the temporal variability of pesticide runoff (Métayer et al., 2024 [https://doi.org/10.1016/j.scitotenv.2023.167357]), especially in regions with high rainfall intensity variability, such as the Mediterranean. However, their operational application is constrained by substantial computational demands and extensive data requirements. Meta-modeling approaches provide a means to reduce the computational time of an initial physically-based model. Among these, the Long Short-Term Memory (LSTM; Hochreiter and Schmidhuber, 1997 [https://doi.org/10.1162/neco.1997.9.8.1735]) model has demonstrated high efficiency in replicating hydrological (Kratzert et al., 2018 [https://doi.org/10.5194/hess-22-6005-2018]) and hydrochemical time series (Pyo et al., 2023 [https://doi.org/10.1016/j.wroa.2023.100207]), making it a promising meta-modeling strategy for pesticide runoff models. This study aimed to develop and evaluate a meta-modeling approach using LSTM models for a Richards-based model to simulate hourly variations in water and pesticide runoff over an entire year while minimizing computation times. The proposed approach was applied to a field-scale pesticide runoff model implemented in the fully spatially distributed hydrological model MHYDAS-Pesticide 1.0 (Crevoisier et al., 2021 [https://hal.inrae.fr/hal-04090048v1]), which integrates Richards and convection-dispersion equations, the uniform mixing cell concept, and an overland flow routine. This represents a challenge for at least three reasons: i) the time series contains mainly zero values of runoff discharge, ii) the prediction of pesticide runoff requires an efficient prediction of water runoff, and iii) the targeted non-zero values of runoff concentration vary by several orders of magnitude. The LSTM meta-model was trained and validated using 560,560 annual time series simulations generated by the initial physically-based model. The training dataset comprised 70% of the simulations, with the remaining 30% reserved for validation. The resulting meta-model accounted for meteorological conditions, compound properties, and pesticide application date and rate. It demonstrated high accuracy in simulating hourly runoff and pesticide concentrations, achieving significant reductions in computation time. However, challenges remain, such as improving the precision of runoff occurrence simulation and enhancing the meta-model's generalizability by incorporating additional static parameters as inputs.

The poster will focus on the methodology for the meta-model's development and the results of its evaluation. The meta-model has been implemented within the fully spatially distributed, physically-based hydrological model MHYDAS-Pesticide to form a hybrid version.
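
As a rough sketch of such an LSTM meta-model (an assumed configuration, not the MHYDAS-Pesticide meta-model itself), the Python example below maps hourly dynamic forcings plus static compound and application descriptors to the two hourly targets described in the abstract, water runoff and pesticide concentration.

import torch
import torch.nn as nn

class RunoffPesticideLSTM(nn.Module):
    def __init__(self, n_dynamic: int, n_static: int, hidden: int = 128):
        super().__init__()
        self.lstm = nn.LSTM(n_dynamic + n_static, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)    # [water runoff, pesticide concentration]

    def forward(self, dynamic: torch.Tensor, static: torch.Tensor) -> torch.Tensor:
        # dynamic: (batch, 8760, n_dynamic) hourly forcings over one year
        # static:  (batch, n_static) compound properties, application date/rate
        static_rep = static[:, None, :].expand(-1, dynamic.shape[1], -1)
        out, _ = self.lstm(torch.cat([dynamic, static_rep], dim=-1))
        return self.head(out).relu()        # both targets are non-negative

model = RunoffPesticideLSTM(n_dynamic=4, n_static=6)
dyn = torch.randn(2, 8760, 4)               # synthetic hourly forcings
stat = torch.randn(2, 6)                    # synthetic static descriptors
print(model(dyn, stat).shape)               # torch.Size([2, 8760, 2])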

How to cite: Métayer, G., Dagès, C., Voltz, M., and Bailly, J.-S.: Meta-modeling of a physically-based pesticide runoff model with a Long-Short term Memory approach, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-19160, https://doi.org/10.5194/egusphere-egu25-19160, 2025.