Machine learning (ML) is now widely used across the Earth Sciences, and its subfield deep learning (DL) in particular has recently attracted increased attention in Hydrology. The goal of this session is to highlight the continued integration of ML, and DL in particular, into traditional and emerging Hydrology-related workflows. Abstracts are solicited related to novel theory development, novel methodology, or practical applications of ML and DL in Hydrology. These might include, but are not limited to, the following:

(1) Identifying novel ways for DL in hydrological modelling.
(2) Testing and examining the usability of DL based approaches in hydrology.
(3) Improving understanding of the (internal) states/representations of DL models.
(4) Integrating DL with traditional hydrological models.
(5) Creating an improved understanding of the conditions for which DL provides reliable simulations, including quantifying uncertainty in DL models.
(6) Clustering and/or classifying hydrologic systems, events and regimes.
(7) Using DL for detecting, quantifying or coping with nonstationarity in hydrological systems and modeling.
(8) Deriving scaling relationships or process-related insights directly from DL.
(9) Using DL to model or anticipate human behavior or human impacts on hydrological systems.
(10) DL based hazard analysis, detection/mitigation, event detection, etc.
(11) Natural Language Processing to analyze, interpret, or condense hydrologically-relevant peer-reviewed literature or social media data or to assess trends within the discipline.

Co-organized by ESSI2/NP4
Convener: Frederik Kratzert | Co-conveners: Claire Brenner, Daniel Klotz, Grey Nearing
| Attendance Tue, 05 May, 14:00–15:45 (CEST)

Chat time: Tuesday, 5 May 2020, 14:00–15:45

D153 |
| Highlight
Asher Metzger, Zach Moshe, Guy Shalev, Ofir Reich, Zvika Ben-Haim, Vova Anisimov, Efrat Morin, Ran Elyaniv, Gal Elidan, and Sella Nevo

Flooding is one of the major natural disasters: it causes thousands of fatalities, affects the lives of hundreds of millions, and results in huge economic damage annually. Google’s Flood Forecasting Initiative aims at providing high-resolution flood forecasts and timely warnings around the globe, while focusing first on developing countries where most of the fatalities occur. The high-level structure of Google’s flood forecasting framework follows the natural hydrologic-hydraulic coupling, where the hydrologic model predicts discharge (or other proxies for discharge) based on rainfall-runoff relationships, and the hydraulic model produces high-resolution inundation maps based on those discharge predictions. Within this general partition, both the hydraulic and hydrologic modules benefit from the use of advanced machine learning techniques that allow for precision and global scale.

Classical conceptual hydrologic models such as the Sacramento Soil Moisture Accounting Model explicitly model the dynamics of water volumes based on explicit measurements and estimates of the variables (parameters) involved. These models are, however, inherently challenged by the lack of accurate estimates of model parameters and by inaccurate/incomplete description of the complex non-linear rules that govern the underlying dynamics. In contrast, machine learning models, driven by data alone, are potentially capable of describing complex functional dynamics without explicit modelling.  Both the hydrologic and hydraulic models employed by Google rely on data-driven machine learning technologies to achieve superior and scalable performance. In this presentation we focus on describing one of the deep neural hydrologic models proposed by Google. 

As shown in recent work by Kratzert et al. (2018, 2019)[1], deep recurrent models such as long short-term memory networks (LSTMs) can produce high-performance hydrologic forecasts. Moreover, Shalev et al. (2019)[2] showed that a single globally shared LSTM can achieve state-of-the-art performance by utilizing a data-driven learned embedding without the need for geographical-specific attributes. While the need for explicit rules in pure conceptual modeling is likely to impede the creation of scalable and accurate hydrologic models, an agnostic approach that ignores reliable and available physical properties of water networks is also likely to be sub-optimal. HydroNet is one of Google’s hydrologic models that leverages the known water network structure as well as deep neural technology to create a scalable and reliable hydrologic model. HydroNet builds a globally shared model together with regional adaptation sub-models at each site by utilizing the tree structure of the river flow network, and is shown to achieve state-of-the-art scalable hydrologic modeling in several large basins in India and the USA.


[1] Kratzert, Frederik, Daniel Klotz, Guy Shalev, Günter Klambauer, Sepp Hochreiter, and Grey Nearing. "Benchmarking a catchment-aware Long Short-Term Memory Network (LSTM) for large-scale hydrological modeling." arXiv preprint arXiv:1907.08456 (2019).

[2] Shalev, Guy, Ran El-Yaniv, Daniel Klotz, Frederik Kratzert, Asher Metzger, and Sella Nevo. "Accurate Hydrologic Modeling Using Less Information." arXiv preprint arXiv:1911.09427 (2019).

How to cite: Metzger, A., Moshe, Z., Shalev, G., Reich, O., Ben-Haim, Z., Anisimov, V., Morin, E., Elyaniv, R., Elidan, G., and Nevo, S.: How Google's Flood Forecasting Initiative Leverages Deep Learning Hydrologic Models, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-4134, https://doi.org/10.5194/egusphere-egu2020-4134, 2020.

D154 |
Catharine Brown, Helen Smith, Simon Waller, Lizzie Weller, and David Wood

National-scale flood hazard maps are an essential tool for the re/insurance industry to assess property risk and financial impacts of flooding. The creation of worst-case scenario river flood maps, assuming defence failure, and additional separate datasets indicating areas protected by defences enables the industry to best assess risk. However, there is a global shortage of information on defence locations and maintenance. For example, in the United States it is estimated that there are around 160,000 kilometres (100,000 miles) of defence levees, but the location of many of these is not mapped in large-scale defence datasets. We present a new approach to large-scale defence identification using deep learning techniques.

In the generation of flood hazard maps, the elevation depicted in the Digital Elevation Model (DEM) used in the hydraulic modelling is fundamental to determining the routing of water flow across the terrain and thus determining where flooding occurs. The full or partial representation of raised river defences in DEMs affects this routing and subsequently causes difficulty when developing both undefended and defended flood maps. To generate undefended river flood maps these raised defences need to be entirely removed, which requires knowledge of their locations. Without comprehensive defence datasets, an alternative method to identify river defences on a large scale is required.

The use of deep learning techniques to recognise objects in images is developing fast. DEMs and other related datasets can be represented in a raster format similar to images. JBA has developed a successful methodology which involves training a U-Net Convolutional Neural Network, originally designed for image segmentation, to identify raised river defences in DEMs. Using this defence dataset, we have been able to generate true undefended river flood maps for a selection of countries including Italy, Germany, Austria and the US. We present details of the methodology developed, the model training and the challenges faced when applying the model to different geographical regions.

How to cite: Brown, C., Smith, H., Waller, S., Weller, L., and Wood, D.: Using image-based deep learning to identify river defences from elevation data for large-scale flood modelling, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-8522, https://doi.org/10.5194/egusphere-egu2020-8522, 2020.

D155 |
Basil Kraft, Martin Jung, Marco Körner, and Markus Reichstein

Deep (recurrent) neural networks have proven very useful to model multivariate sequential data streams of complex dynamic natural systems and have already been successfully applied to model hydrological processes. Compared to physically based models, however, the internal representation of a neural network is not directly interpretable and model predictions often lack physical consistency. Hybrid modeling is a promising approach that synergizes the advantage of process-based modeling (interpretability, theoretical foundations) and deep learning (data adaptivity, less prior knowledge required): By combining these two approaches, flexible and partially interpretable models can be created that have the potential to advance the understanding and predictability of environmental systems.

Here, we implement such a hybrid hydrological model on a global scale. The model consists of three main blocks: 1) A Long Short-Term Memory (LSTM) model, which extracts temporal features from the meteorological forcing time series. 2) A multi-branch neural network consisting of independent, fully connected layers, taking the LSTM state as input and yielding a set of latent, interpretable variables (e.g. soil moisture recharge). 3) A conceptual model block that implements hydrological balance equations, driven by the above interpretable variables. The model is trained simultaneously on global observation-based products of total water storage, snow water equivalent, evapotranspiration and runoff. To combine the different loss terms, we use self-paced task uncertainty weighting, as done in state-of-the-art multi-task learning.
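The abstract does not spell out the weighting formula, so the following is only a minimal sketch of one common form of task uncertainty weighting, assumed here for illustration: each task loss L_i is scaled by exp(-s_i), where s_i is a learnable log-variance, and s_i itself is added as a penalty. The function names and the loss values are hypothetical, and in a real model the s_i would be trained by gradient descent together with the network weights.

```python
import math


def uncertainty_weighted_loss(task_losses, log_variances):
    """Combine per-task losses L_i using learnable log-variances s_i.

    total = sum_i( exp(-s_i) * L_i + s_i )
    A large s_i down-weights task i, while the "+ s_i" term penalises
    inflating the uncertainty without bound.
    """
    if len(task_losses) != len(log_variances):
        raise ValueError("need one log-variance per task loss")
    return sum(
        math.exp(-s) * loss + s for loss, s in zip(task_losses, log_variances)
    )


def optimal_log_variance(task_loss):
    """For a fixed loss L > 0, exp(-s) * L + s is minimised at s = ln(L)."""
    return math.log(task_loss)


# Hypothetical loss values for the four training targets named in the abstract.
task_losses = {
    "total_water_storage": 0.8,
    "snow_water_equivalent": 0.2,
    "evapotranspiration": 0.5,
    "runoff": 1.5,
}
losses = list(task_losses.values())

# With all s_i = 0 the combination reduces to a plain sum of the losses.
equal_total = uncertainty_weighted_loss(losses, [0.0] * len(losses))

# With each s_i at its optimum, every task contributes 1 + ln(L_i).
tuned_total = uncertainty_weighted_loss(
    losses, [optimal_log_variance(loss) for loss in losses]
)
print(equal_total, tuned_total)
```

The design point is that no task weight has to be tuned by hand: tasks with persistently large losses are down-weighted automatically as their s_i grows.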

Preliminary results suggest that the hybrid modeling approach captures global patterns of the hydrological cycle’s variability that are consistent with observations and our process understanding. The approach opens doors to novel data-driven simulations, attribution and diagnostic assessments of water cycle variations globally. The presented approach is—to our knowledge—the first application of the hybrid approach to model environmental systems.

How to cite: Kraft, B., Jung, M., Körner, M., and Reichstein, M.: Towards global hybrid hydrological modeling by fusing deep learning and a conceptual model, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-16210, https://doi.org/10.5194/egusphere-egu2020-16210, 2020.

D156 |
Lennart Schmidt, Elona Gusho, Walter de Back, Kira Vinogradova, Rohini Kumar, Oldrich Rakovec, Sabine Attinger, and Jan Bumberger

The prediction of streamflow from precipitation data is one of the traditional disciplines of hydrological modelling and has major societal implications such as flood forecasting, efficient use of hydro-power and urban and regional planning. Recently, data-driven approaches have been applied successfully for rainfall-runoff modelling, often outperforming equivalent physical modeling approaches. However, these studies have almost exclusively focused on temporal data and have neglected data on the spatial distribution of the inputs.

To close this gap, we trained convolutional long short-term memory (ConvLSTM) models on daily temperature and precipitation maps of the catchment area to predict the streamflow of the Elbe river. This supervised deep learning method combines convolutional and recurrent neural networks to extract useful features in the spatio-temporal input maps to predict the river’s streamflow. We embedded the model into a Bayesian framework to deliver estimates of prediction uncertainty along with the predictions. Moreover, we derived saliency maps that highlight the most relevant patterns in precipitation and temperature for the Elbe’s major flood events.

Comparison with physical simulations shows that our Bayesian ConvLSTM approach (1) performs on par with results from physical modeling while requiring only input data on temperature and precipitation, (2) provides useful uncertainty estimates, and (3) is able to generate interpretable saliency maps of flooding events.

In conclusion, this study showcases the applicability of deep learning methods for rainfall-runoff modelling as well as the methods' potential to gain spatial insight into the hydrological system.

How to cite: Schmidt, L., Gusho, E., de Back, W., Vinogradova, K., Kumar, R., Rakovec, O., Attinger, S., and Bumberger, J.: Spatially-distributed Deep Learning for rainfall-runoff modelling and system understanding, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-20736, https://doi.org/10.5194/egusphere-egu2020-20736, 2020.

D157 |
Thomas Lees, Gabriel Tseng, Steven Reece, and Simon Dadson

Tools from the field of deep learning are being used more widely in hydrological science. The potential of these methods lies in the ability to generate interpretable and physically realistic forecasts directly from data, by utilising specific neural network architectures. 

This approach offers two advantages which complement physically-based models. First, the interpretations can be checked against our physical understanding to ensure that where deep learning models produce accurate forecasts they do so for physically-defensible reasons. Second, in domains where our physical understanding is limited, data-driven methods offer an opportunity to direct attention towards physical explanations that are consistent with data. Both are important in demonstrating the utility of deep learning as a tool in hydrological science.

This work uses an Entity Aware LSTM (EALSTM; cf. Kratzert et al., 2019) to predict a satellite-derived vegetation health metric, the Vegetation Condition Index (VCI). We use a variety of data sources including reanalysis data (ERA-5), satellite products (NOAA Vegetation Condition Index) and blended products (CHIRPS precipitation). The fundamental approach is to determine how well we can forecast vegetation health from hydro-meteorological variables. 

In order to demonstrate the value of this method we undertook a series of experiments using observed data from Kenya to evaluate model performance. Kenya has experienced a number of devastating droughts in recent decades. Since the 1970s there have been more than 10 drought events in Kenya, including droughts in 2010-2011 and 2016 (Haile et al., 2019). The National Drought Management Authority (NDMA) uses satellite-derived vegetation health to determine the drought status of regions in Kenya.

First, we compared our results to other statistical methods and a persistence-based baseline. Using RMSE and R-squared we demonstrate that the EALSTM is able to predict vegetation health with improved accuracy compared with other approaches. We have also assessed the ability of the EALSTM to predict poor vegetation health conditions. While better than the persistence baseline, the performance on the tails of the distribution requires further attention.

Second, we test the ability of our model to generalise results. We do this by training only with subsets of the data. This tests our model’s ability to make accurate forecasts when the model has not seen examples of the conditions we are predicting. Finally, we explore how we can use the EALSTM to better understand the physical realism of relations between hydro-climatic variables embedded within the trained neural network. 



Gebremeskel, G., Tang, Q., Sun, S., Huang, Z., Zhang, X., & Liu, X. (2019). Droughts in East Africa: Causes, impacts and resilience. Earth-Science Reviews. https://doi.org/10.1016/j.earscirev.2019.04.015

Klisch, A., & Atzberger, C. (2016). Operational drought monitoring in Kenya using MODIS NDVI time series. Remote Sensing, 8(4). https://doi.org/10.3390/rs8040267

Kratzert, F., Klotz, D., Shalev, G., Klambauer, G., Hochreiter, S., & Nearing, G. (2019). Towards learning universal, regional, and local hydrological behaviors via machine learning applied to large-sample datasets. Hydrology and Earth System Sciences, 23(12), 5089–5110. https://doi.org/10.5194/hess-23-5089-2019

Github Repository: https://github.com/esowc/ml_drought

How to cite: Lees, T., Tseng, G., Reece, S., and Dadson, S.: Deep Learning for Drought and Vegetation Health Modelling: Demonstrating the utility of an Entity-Aware LSTM, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-8173, https://doi.org/10.5194/egusphere-egu2020-8173, 2020.

D158 |
Joseph Hamman and Andrew Bennett

Early work in the field of Machine Learning (ML) for hydrologic prediction is showing significant potential. Indeed, it has provided important and measurable advances toward prediction in ungauged basins (PUB). At the same time, it has motivated new research targeting important ML topics such as uncertainty attribution and physical constraints. It has also raised the question of how best to harness the wide variety of climatic and hydrologic data available today. In this work, we present initial results employing transfer learning to combine information about meteorology, streamflow, surface fluxes (FluxNet), and snow (SNOTEL) into a state-of-the-art ML-based hydrologic model. Specifically, we will present early work demonstrating how relatively simple implementations of transfer learning can be used to enhance predictions of streamflow by transferring learning from flux and snow station observations to the watershed scale. Our work extends recently published results from Kratzert et al. (2018) using the CAMELS data set (Newman et al., 2014) for streamflow prediction in North America.

  • Kratzert, F., Klotz, D., Brenner, C., Schulz, K., and Herrnegger, M.: Rainfall–runoff modelling using Long Short-Term Memory (LSTM) networks, Hydrol. Earth Syst. Sci., 22, 6005-6022, https://doi.org/10.5194/hess-22-6005-2018, 2018a.
  • Newman; K. Sampson; M. P. Clark; A. Bock; R. J. Viger; D. Blodgett, 2014. A large-sample watershed-scale hydrometeorological dataset for the contiguous USA. Boulder, CO: UCAR/NCAR. https://dx.doi.org/10.5065/D6MW2F4D

How to cite: Hamman, J. and Bennett, A.: Transfer learning applications in hydrologic modeling, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-11332, https://doi.org/10.5194/egusphere-egu2020-11332, 2020.

D159 |
David Lambl, Dan Katz, Eliza Hale, and Alden Sampson

Providing accurate seasonal (1-6 months) forecasts of streamflow is critical for applications ranging from optimizing water management to hydropower generation. In this study we evaluate the performance of stacked Long Short-Term Memory (LSTM) neural networks, which maintain an internal set of states and are therefore well-suited to modeling dynamical processes.

Existing LSTM models applied to hydrological modeling use all available historical information to forecast contemporaneous output. This modeling approach breaks down for long-term forecasts because some of the observations used as input are not available in the future (e.g., from remote sensing and in situ sensors). To solve this deficiency we train a stacked LSTM model where the first network encodes the historical information in its hidden states and cells. These states and cells are then used to initialize the second LSTM which uses meteorological forecasts to create streamflow forecasts at various horizons. This method allows the model to learn general hydrological relationships in the temporal domain across different catchment types and project them into the future up to 6 months ahead.
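The state hand-off described above can be illustrated with a deliberately small sketch. It is not the authors' model: a scalar tanh recurrent cell stands in for each LSTM, all weights are hypothetical and untrained, and the feature choices (three historical inputs, two forecast inputs) are assumptions made only to show that the second network can run on fewer inputs than the first.

```python
import math


def rnn_step(state, inputs, w_state, w_in):
    """One step of a minimal scalar tanh recurrent cell (stand-in for an LSTM)."""
    return math.tanh(w_state * state + sum(w * x for w, x in zip(w_in, inputs)))


def encode(history, w_state, w_in):
    """Run the first network over the historical inputs; return its final state."""
    state = 0.0
    for inputs in history:
        state = rnn_step(state, inputs, w_state, w_in)
    return state


def decode(initial_state, forecasts, w_state, w_in, w_out):
    """Run the second network over forecast inputs, starting from the encoder state."""
    state = initial_state
    outputs = []
    for inputs in forecasts:
        state = rnn_step(state, inputs, w_state, w_in)
        outputs.append(w_out * state)
    return outputs


# Hypothetical, untrained weights. The encoder sees three features per step
# (e.g. precipitation, temperature, snow cover); the decoder sees only the two
# features that are available as meteorological forecasts.
ENC_W_STATE, ENC_W_IN = 0.5, (0.1, 0.05, 0.5)
DEC_W_STATE, DEC_W_IN, DEC_W_OUT = 0.8, (0.1, -0.05), 10.0

history_wet = [(5.0, 2.0, 0.8), (3.0, 1.0, 0.9)]
history_dry = [(0.0, 10.0, 0.0), (0.0, 12.0, 0.0)]
forecasts = [(1.0, 3.0), (0.0, 4.0), (2.0, 5.0)]

flow_after_wet = decode(
    encode(history_wet, ENC_W_STATE, ENC_W_IN),
    forecasts, DEC_W_STATE, DEC_W_IN, DEC_W_OUT,
)
flow_after_dry = decode(
    encode(history_dry, ENC_W_STATE, ENC_W_IN),
    forecasts, DEC_W_STATE, DEC_W_IN, DEC_W_OUT,
)
print(flow_after_wet)
print(flow_after_dry)
```

Both runs use the same forecast inputs, so any difference between the two output series comes only from the state handed over by the encoder.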

Using meteorological time series from NOAA’s Climate Forecast System (CFS), remote sensing data including snow cover, vegetation and surface temperature from NASA’s MODIS sensors, SNOTEL sensor data, static catchment attributes, and streamflow data from USGS we train a stacked LSTM model on 100 basins, and evaluate predictions on out-of-sample periods from these same basins. We perform sensitivity analysis on the effects of remote sensing data, in-situ sensors, and static catchment attributes to understand the informational content of these various inputs under various model architectures. Finally, we benchmark our model to forecasts derived from simple climatological averages and to forecasts created by a single LSTM that excludes all inputs without forecasts.


How to cite: Lambl, D., Katz, D., Hale, E., and Sampson, A.: Forecasting Seasonal Streamflow Using a Stacked Recurrent Neural Network, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-11393, https://doi.org/10.5194/egusphere-egu2020-11393, 2020.

D160 |
Edouard Patault, Valentin Landemaine, Jérôme Ledun, Arnaud Soulignac, Matthieu Fournier, Jean-François Ouvry, Olivier Cerdan, and Benoit Laignel

Sediment Discharge (SD) at karstic springs is commonly treated as a black box because of the non-linearity of the processes generating SD and the lack of an accurate physical description of karstic environments. Recent research in hydrology has emphasized the use of data-driven techniques such as Deep Learning (DL) for black-box models, valuing their good predictive power rather than their explanatory abilities. Indeed, their integration into traditional hydrology-related workflows can be particularly promising. In this study, a deep neural network was built and coupled to an erosion-runoff GIS model (WATERSED, Landemaine et al., 2015) to predict SD at a karstic spring. The study site is located in the Radicatel catchment (88 km² in Normandy, France) where spring water is extracted to a Water Treatment Plant (WTP). SD was predicted for several Designed Storm Projects (DSP0.5-2-10-50-100) under different land-use scenarios by 2050 (baseline, ploughing up 33% of grassland, eco-engineering (181 fascines + 13 ha of grass strips), best farming practices (+20% infiltration)). Rainfall time series retrieved from the French SAFRAN database and WATERSED modelling outputs extracted at connected sinkholes were used as input data for the DL model. The model structure was found by a classical trial-and-error procedure, and the model was trained on two significant hydrologic years (n_events = 731). Evaluation on a test set suggested good performance of the model (NSE = 0.82). Additional evaluation was performed by comparing against the ‘Generalized Extreme Value’ (GEV) distribution for the five DSP under the baseline scenario. The SD predicted by the DL model was in very close agreement with the GEV distribution (R² = 0.99). Application of the model to the other scenarios suggests that ploughing up 33% of grasslands will increase SD at the WTP by an average of 5%. Eco-engineering and best farming practices will reduce SD in the range of 10-44% and 63-80% respectively.
This novel approach offers good opportunities for SD prediction at karstic springs or WTP under multiple land-use scenarios. It also provides robust decision-making tools for land-use planning and drinking water suppliers.
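For readers unfamiliar with the NSE score quoted above, the Nash-Sutcliffe efficiency compares the squared model error with the variance of the observations: 1 means a perfect fit, and 0 means the model is no better than always predicting the observed mean. The sketch below uses the standard definition; the function name and the sample values are hypothetical and are not taken from the study.

```python
def nash_sutcliffe_efficiency(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    if not observed or len(observed) != len(simulated):
        raise ValueError("observed and simulated must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance_sum = sum((o - mean_obs) ** 2 for o in observed)
    if variance_sum == 0:
        raise ValueError("NSE is undefined when the observations are constant")
    return 1.0 - squared_error / variance_sum


# Hypothetical sediment discharge values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
perfect = nash_sutcliffe_efficiency(observed, observed)
mean_only = nash_sutcliffe_efficiency(observed, [2.5, 2.5, 2.5, 2.5])
print(perfect, mean_only)
```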

How to cite: Patault, E., Landemaine, V., Ledun, J., Soulignac, A., Fournier, M., Ouvry, J.-F., Cerdan, O., and Laignel, B.: Integrating Deep Learning to GIS Modelling: An Efficient Approach to Predict Sediment Discharge at Karstic Springs Under Different Land-Use Scenarios, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-467, https://doi.org/10.5194/egusphere-egu2020-467, 2020.

D161 |
Rim Cherif and Emna Gargouri Ellouze

Regional frequency approaches are frequently proposed in order to estimate runoff quantiles for non-gauged catchments. Partitioning methods such as cluster analysis are often applied in order to regionalize catchments.

This study presents an investigation based on the hierarchical clustering method applied to watershed hydro-geomorphic descriptors and aims to compare the types of distance measures used in the clustering approach.

The delineation of pooling groups (regions) is based on distances calculated between sites in a multidimensional space of hydrological, physiographical and geomorphological characteristics.

The resulting clusters are then checked for homogeneity using the silhouette index.

A data set from nineteen (19) catchments situated in the Tunisian ridge, monitored since 1992, is used for this comparison.

Latitudes vary from 35°N to 37°N and longitudes from 8°E to 11°E; areas range between 1 km² and 10 km². These catchments are located in a semi-arid zone, with annual average rainfall fluctuating between 280 mm and 500 mm. The relief is moderately high to high for the majority of the basins, which favours rapid runoff. These catchments are poorly permeable to impermeable. The rain gauge network consists of 20 gauges.

The delineation of regions in multidimensional space involves hydrological signatures and physiographical and geomorphological catchment characteristics. The latter are: area, perimeter, maximum altitude, minimum altitude, specific height, global slope index, equivalent rectangle length, equivalent rectangle width, Gravelius index, the percentage of pasture land, the percentage of forest cover, the percentage of cereal culture area, the percentage of arboriculture area and the percentage of area affected by anti-erosive practices. Hydrological signatures are: specific maximum discharge, runoff volume, time to peak, base time, infiltration index and runoff coefficient.

Hierarchical clustering is applied with several distances calculated from these signatures and characteristics. Two clusters are considered for basin regions. Nine distances are compared (Euclidean, Spearman, Chebyshev, cityblock, correlation, cosine, Hamming, Jaccard, Minkowski).

Silhouette values are calculated for each cluster based on the distances calculated. All distances give satisfactory results, and the correlation and cosine distances give the best silhouette values.
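The comparison described above can be sketched as follows. This is an illustrative outline, not the authors' implementation: it assumes average linkage (the abstract does not name the linkage rule), covers only three of the nine distances, and uses made-up two-dimensional descriptor values for six hypothetical catchments.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cityblock(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms


def average_linkage_labels(points, dist, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[i] for i in range(len(points))]

    def linkage(c1, c2):
        total = sum(dist(points[i], points[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        _, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def mean_silhouette(points, labels, dist):
    """Mean of s = (b - a) / max(a, b) over all points."""
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(own) / len(own)
        b = min(
            sum(dist(p, q) for j, q in enumerate(points) if labels[j] == other)
            / labels.count(other)
            for other in set(labels) if other != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical standardised descriptors for six catchments (two obvious groups).
catchments = [(1.0, 0.1), (0.9, 0.2), (1.1, 0.15),
              (0.1, 1.0), (0.2, 0.9), (0.15, 1.1)]

results = {}
for name, dist in [("euclidean", euclidean), ("cityblock", cityblock), ("cosine", cosine)]:
    labels = average_linkage_labels(catchments, dist, n_clusters=2)
    results[name] = (labels, mean_silhouette(catchments, labels, dist))
    print(name, results[name])
```

Because each silhouette value is computed with the same distance that produced the clusters, the scores indicate how compact the regions are under that distance, which is the kind of comparison the abstract reports.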


How to cite: Cherif, R. and Gargouri Ellouze, E.: Flow signatures and basin parameters for Hierarchical tunisian Catchments clustering and similarity assessement., EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-492, https://doi.org/10.5194/egusphere-egu2020-492, 2020.

D162 |
Carolina Natel de Moura, Jan Seibert, Miriam Rita Moro Mine, and Ricardo Carvalho de Almeida

The advancement of big data and increased computational power have contributed to an increased use of Machine Learning (ML) approaches in hydrological modelling. These approaches are powerful tools for modeling non-linear systems. However, the applicability of ML in non-stationary conditions needs to be studied further. As climate change will change hydrological patterns, testing ML approaches for non-stationary conditions is essential. Here, we used the Differential Split-Sample Test (DSST) to test the climate transposability of ML approaches (e.g., calibrating in a wet period and validating in a dry one, and vice versa). We applied five ML approaches using daily precipitation and temperature as input for the prediction of the daily discharge in six snow-dominated Swiss catchments. Lower and upper benchmarks were used to evaluate performances through a relative performance measure. The lower benchmark is the average of the bucket-type HBV model runs from 1000 random parameter sets. The upper benchmark is the automatically calibrated HBV model. In comparison with the stationary condition, the models performed slightly worse in the non-stationary condition. The performance of simple ML approaches was poor for non-stationary conditions, with an underestimation of peak flows as well as a poor representation of the snow-melting period. On the other hand, a more complex ML approach (deep learning), the Long Short-Term Memory (LSTM) network, showed good performance when compared with the lower and upper benchmarks. This might be explained by the fact that the so-called memory cell allows storage effects to be simulated.
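A minimal sketch of the two ingredients mentioned above follows. Both parts rest on assumptions: years are split into a dry and a wet half by annual precipitation (the abstract does not state the exact splitting rule), and the relative performance measure is taken to be a linear rescaling between the lower and upper benchmarks (the abstract does not give the formula). All names and numbers are hypothetical.

```python
def differential_split(annual_precipitation):
    """Split years into a dry half and a wet half by annual precipitation."""
    years = sorted(annual_precipitation, key=annual_precipitation.get)
    half = len(years) // 2
    return {"dry": years[:half], "wet": years[half:]}


def relative_performance(score, lower_benchmark, upper_benchmark):
    """Rescale a score so that 0 matches the lower and 1 the upper benchmark.

    Assumed form; values above 1 would mean the model beats the upper benchmark.
    """
    return (score - lower_benchmark) / (upper_benchmark - lower_benchmark)


# Hypothetical annual precipitation totals in mm.
annual_precipitation = {
    2001: 1450.0, 2002: 980.0, 2003: 760.0,
    2004: 1210.0, 2005: 1100.0, 2006: 1530.0,
}
periods = differential_split(annual_precipitation)

# Calibrate on one climate, validate on the contrasting one, in both directions.
for calibration, validation in [("dry", "wet"), ("wet", "dry")]:
    print("calibrate on", periods[calibration], "validate on", periods[validation])

# Hypothetical efficiency scores for an ML model and the two HBV benchmarks.
print(relative_performance(score=0.7, lower_benchmark=0.5, upper_benchmark=0.9))
```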

How to cite: Natel de Moura, C., Seibert, J., Moro Mine, M. R., and Carvalho de Almeida, R.: Are Machine Learning methods robust enough for hydrological modeling under changing conditions?, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-690, https://doi.org/10.5194/egusphere-egu2020-690, 2020.

D163 |
Tomy-Minh Trùòng, Márk Rudolf Somogyvári, Martin Sauter, Reinhard Hinkelmann, and Irina Engelhardt

Groundwater resources are expected to be affected by climate change and population growth and thus sophisticated water resources management strategies are of importance especially in arid and semi-arid regions. A better understanding of groundwater recharge and infiltration processes will allow us to consider not only water availability but also the sustainable yield of karst aquifers.

Because of the thin or frequently absent soil cover and thick vadose zones the assessment of groundwater recharge in fractured rock aquifers is highly complex. Furthermore, in (semi)-arid regions, precipitation is highly variable in space and time and frequently characterized by data scarcity. Therefore, classical methods are often not directly applicable.

This is especially the case for karstic aquifers, where i) the surface is characterized by depressions and dry valleys, ii) the vadose zone by complex infiltration processes, and iii) the saturated zone by high hydraulic conductivity and low storage capacity. Furthermore, epikarst systems display their own hydraulic dynamics affecting spatial and temporal distribution of infiltration rates. The superposition of all these hydraulic effects and characteristics of all compartments generates a complex groundwater recharge input signal.

Artificial neural networks (ANN) have the advantage that they do not require knowledge about the underlying physical processes or the structure of the system, nor do they need prior hydrogeological information and therefore model parameters, which are usually difficult to obtain. Groundwater recharge shows a high dependency on precipitation history, and therefore the ANN to be chosen should be capable of reproducing memory effects. This is addressed by a standard multilayer perceptron (MLP) ANN, which uses a time frame as an input signal, as well as by a recurrent ANN. For both, large data sets are desirable. Because of the delay between input (precipitation, temperature, pumping) and output (spring discharge) signals, the data have to be analyzed in a geostatistical framework to determine the time lag between the input and the corresponding output as well as the input time frame for the MLP.
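The two preprocessing steps mentioned above, finding the input-output delay and building the MLP's input time frame, can be sketched as follows. This is an illustration under assumptions: the lag is estimated by maximising the Pearson cross-correlation (one simple option; the abstract only speaks of a geostatistical analysis), and the series are synthetic, with discharge constructed as precipitation delayed by two steps.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def best_lag(inputs, outputs, max_lag):
    """Lag (in time steps) at which inputs[t - lag] correlates best with outputs[t]."""
    def correlation_at(lag):
        return pearson(inputs[:len(inputs) - lag], outputs[lag:])
    return max(range(max_lag + 1), key=correlation_at)


def make_windows(inputs, outputs, window, lag):
    """Pair outputs[t] with the `window` inputs ending at time t - lag."""
    samples = []
    for t in range(lag + window - 1, len(outputs)):
        end = t - lag + 1
        samples.append((inputs[end - window:end], outputs[t]))
    return samples


# Synthetic series: spring discharge is precipitation delayed by two steps.
precipitation = [0, 0, 5, 0, 0, 2, 0, 0, 0, 7, 0, 0, 1, 0, 0, 0]
discharge = [0, 0] + precipitation[:-2]

lag = best_lag(precipitation, discharge, max_lag=5)
samples = make_windows(precipitation, discharge, window=3, lag=lag)
print(lag, samples[0])
```

Each sample is a fixed-length input vector and a single target, which is the form a standard MLP expects; a recurrent network would instead consume the series step by step.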

Two models are set up, one for the Lez catchment, located in the South of France, and one for the catchment of the Gallusquelle spring, located in South-West Germany. The aquifers of the two catchments are characterized by different degrees of karstification. While flow in the Lez catchment is dominated by the conduit network, the Gallusquelle aquifer shows a lower degree of karstification with a stronger influence of the aquifer matrix. Additionally, the two climates differ, with the Lez catchment displaying a Mediterranean type of climate while the Gallusquelle catchment is characterized by oceanic to continental climatic conditions.

Our goal is to find neural network architecture(s) capable of reproducing the general system behaviour of the two karst aquifers possibly transferable to other karst systems. Therefore, the networks will be trained for the two different locations and compared to analyze similarities and differences.

How to cite: Trùòng, T.-M., Somogyvári, M. R., Sauter, M., Hinkelmann, R., and Engelhardt, I.: Development of a neural network to calculate groundwater recharge in karstified aquifers , EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-3552, https://doi.org/10.5194/egusphere-egu2020-3552, 2020.

D164 |
Pierre Jacquier, Azzedine Abdedou, and Azzeddine Soulaïmani

Key Words: Uncertainty Quantification, Deep Learning, Space-Time POD, Flood Modeling

While Deep Learning has achieved impressive results and breakthroughs in well-known fields such as computer vision, language modeling, and content generation [1], its impact on other, older fields is still largely unexplored. In computational fluid dynamics, and especially in Flood Modeling, many phenomena are very high-dimensional, and predictions require finite element or finite volume methods, which, while robust and well tested, are computationally heavy and may not be usable for real-time predictions. This has led to various attempts at developing Reduced-Order Modeling techniques, both intrusive and non-intrusive. One recent relevant addition is the combination of Proper Orthogonal Decomposition with Deep Neural Networks (POD-NN) [2]. Yet, to our knowledge, in this example and more generally in the field, little work has been conducted on quantifying uncertainties through the surrogate model.
In this work, we aim to compare different novel methods addressing uncertainty quantification in reduced-order models, pushing forward the POD-NN concept with ensembles, latent-variable models, as well as encoder-decoder models. These are tested on benchmark problems and then applied to a real-life application: flood predictions in the Mille-Iles river in Laval, QC, Canada.
For the flood modeling application, our setup involves a set of input parameters resulting from onsite measurements. High-fidelity solutions are then generated using our own finite-volume code CuteFlow, which solves the highly nonlinear Shallow Water Equations. The goal is then to build a non-intrusive surrogate model that is able to know what it knows and, more importantly, to know when it doesn’t, which is still an open research area as far as neural networks are concerned [3].
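
The Proper Orthogonal Decomposition step mentioned above can be illustrated with a minimal sketch. It assumes that the snapshots are stored column-wise in a matrix and that modes are kept until a chosen share of the squared singular values is reached; the function name `pod_basis`, the `energy` threshold, and the synthetic data are illustrative assumptions, not taken from the abstract or from CuteFlow. The neural network that would map input parameters to the reduced coefficients, and any uncertainty estimate on top of it, are deliberately left out.

```python
import numpy as np

def pod_basis(snapshots, energy=0.999):
    """Return a truncated POD basis (n_dof x n_modes) for a snapshot matrix (n_dof x n_snapshots)."""
    # Thin SVD: the columns of U are the POD modes, s holds the singular values.
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    # Keep the smallest number of modes whose cumulative squared singular values reach `energy`.
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    n_modes = int(np.searchsorted(cumulative, energy) + 1)
    return U[:, :n_modes]

rng = np.random.default_rng(0)
# Synthetic rank-3 snapshot matrix standing in for high-fidelity solver output (assumption).
snapshots = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 40))

V = pod_basis(snapshots)
coeffs = V.T @ snapshots      # reduced coordinates; a network would learn parameters -> coeffs
reconstruction = V @ coeffs   # map reduced coordinates back to the full space
rel_error = np.linalg.norm(snapshots - reconstruction) / np.linalg.norm(snapshots)
```

In a POD-NN style surrogate, a regression network would then be trained to predict `coeffs` from the input parameters, so that only the small reduced vector has to be predicted at inference time.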

[1] C. Szegedy, S. Ioffe, V. Vanhoucke, and A. A. Alemi, “Inception-v4, inception-resnet and the impact of residual connections on learning”, in Thirty-First AAAI Conference on Artificial Intelligence, 2017.
[2] Q. Wang, J. S. Hesthaven, and D. Ray, “Non-intrusive reduced order modeling of unsteady flows using artificial neural networks with application to a combustion problem”, Journal of Computational Physics, vol. 384, pp. 289–307, May 2019.
[3] B. Lakshminarayanan, A. Pritzel, and C. Blundell, “Simple and scalable predictive uncertainty estimation using deep ensembles”, in Advances in Neural Information Processing Systems, 2017, pp. 6402–6413.

How to cite: Jacquier, P., Abdedou, A., and Soulaïmani, A.: Reduced-Order Flood Modeling Using Uncertainty-Aware Deep Neural Networks, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-3726, https://doi.org/10.5194/egusphere-egu2020-3726, 2020.

D165 |
Zach Moshe, Asher Metzger, Frederik Kratzert, Efrat Morin, Sella Nevo, Gal Elidan, and Ran Elyaniv

Accurate and scalable hydrologic models are essential building blocks of several important applications, from water resource management to timely flood warnings. In this work we present a novel family of hydrologic models, called HydroNets, that leverages river network connectivity structure within deep neural architectures. The injection of this connectivity structure prior knowledge allows for scalable and accurate hydrologic modeling.

Prior knowledge plays an important role in machine learning and AI. On one extreme of the prior knowledge spectrum there are expert systems, which exclusively rely on domain expertise encoded into a model. On the other extreme there are general purpose agnostic machine learning methods, which are exclusively data-driven, without intentional utilization of inductive bias for the problem at hand. In the context of hydrologic modeling, conceptual models such as the Sacramento Soil Moisture Accounting Model (SAC-SMA) are closer to expert systems. Such models require explicit functional modeling of water volume flow in terms of their input variables and model parameters (e.g., precipitation, hydraulic conductivity, etc.) which could be calibrated using data. Instances of agnostic methods for stream flow hydrologic modelling, which for the most part do not utilize problem specific bias, have recently been presented by Kratzert et al. (2018, 2019) and by Shalev et al. (2019). These works showed that general purpose deep recurrent neural networks, such as long short-term memory networks (LSTMs), can achieve state-of-the-art hydrologic forecasts at scale with less information.

One of the fundamental reasons for the success of deep neural architectures in most application domains is the incorporation of prior knowledge into the architecture itself. This is, for example, the case in machine vision where convolutional layers and max pooling manifest essential invariances of visual perception. In this work we present HydroNets, a family of neural network models for hydrologic forecasting. HydroNets leverage the inherent (graph-theoretic) tree structure of river water flow, existing in any multi-site hydrologic basin. The network architecture itself reflects river network connectivity and catchment structures such that each sub-basin is represented as a tree node, and edges represent water flow from sub-basins to their containing basin. HydroNets are constructed such that all nodes utilize a shared global model component, as well as site-specific sub-models for local modulations. HydroNets thus combine two signals: site specific rainfall-runoff and upstream network dynamics, which can lead to improved predictions at longer horizons. Moreover, the proposed architecture, with its shared global model, tends to reduce sample complexity, increases scalability, and allows for transferability to sub-basins that suffer from scarce historical data. We present several simulation results over multiple basins in both India and the USA that convincingly support the proposed model and its advantages.
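
The tree-structured data flow described above can be made concrete with a toy sketch. This is not the HydroNets architecture: the neural components are replaced by a single shared coefficient and a per-node gain, both hypothetical, and the names `SubBasin`, `SHARED_RUNOFF_COEFF`, and `discharge` are invented for illustration. The sketch only shows how each node can combine a shared component, a site-specific component, and the contributions of its upstream nodes.

```python
from dataclasses import dataclass, field

@dataclass
class SubBasin:
    name: str
    rainfall: float                # local forcing, reduced to a scalar for this toy example
    local_gain: float = 1.0        # hypothetical site-specific parameter
    upstream: list = field(default_factory=list)  # sub-basins draining into this one

SHARED_RUNOFF_COEFF = 0.5  # hypothetical parameter shared by all nodes

def discharge(node):
    """Toy outlet discharge: locally generated runoff plus everything arriving from upstream."""
    local = SHARED_RUNOFF_COEFF * node.local_gain * node.rainfall
    return local + sum(discharge(child) for child in node.upstream)

headwater_a = SubBasin("A", rainfall=10.0)
headwater_b = SubBasin("B", rainfall=4.0, local_gain=1.5)
outlet = SubBasin("C", rainfall=2.0, upstream=[headwater_a, headwater_b])

q_outlet = discharge(outlet)  # 5.0 (A) + 3.0 (B) + 1.0 (C local) = 9.0
```

Because the shared coefficient appears in every node, data from any sub-basin would constrain it, which is the intuition behind the reduced sample complexity claimed for a shared global model.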

How to cite: Moshe, Z., Metzger, A., Kratzert, F., Morin, E., Nevo, S., Elidan, G., and Elyaniv, R.: HydroNets: Leveraging River Network Structure and Deep Neural Networks for Hydrologic Modeling, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-4135, https://doi.org/10.5194/egusphere-egu2020-4135, 2020.

D166 |
Ather Abbas, Sangsoo Baek, Minjeong Kim, Mayzonee Ligaray, Olivier Ribolzi, Norbert Silvera, Joong-Hyuk Min, Laurie Boithias, and Kyung Hwa Cho

The recent intensification of climate change has increased the occurrence of hydrologic extreme events, which demands a better understanding of flow patterns in catchments. Modeling surface and sub-surface flow at high temporal resolution helps to understand catchment dynamics. In this study, we simulated surface and sub-surface flow in a Laotian catchment at 6-minute resolution. We used one physically based model, the Hydrological Simulation Program-FORTRAN (HSPF), and developed two deep learning-based models. One deep learning model consisted of a single long short-term memory (LSTM) network, whereas the other simulated processes in each hydrologic response unit (HRU) by defining a separate LSTM for each HRU. The models consider environmental data as well as changing land use in the catchment and predict surface and sub-surface flows. Our results show that the simple LSTM model outperformed the other models for surface runoff prediction, whereas the HRU-based LSTM model better predicted patterns and slopes in sub-surface flow.

How to cite: Abbas, A., Baek, S., Kim, M., Ligaray, M., Ribolzi, O., Silvera, N., Min, J.-H., Boithias, L., and Cho, K. H.: Application of deep recurrent neural networks for modeling surface and sub-surface flow at high temporal resolution, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-6216, https://doi.org/10.5194/egusphere-egu2020-6216, 2020.

D167 |
Jihane Elyahyioui, Valentijn Pauwels, Edoardo Daly, Francois Petitjean, and Mahesh Prakash

Flooding is one of the most common and costly natural hazards at the global scale. Flood models are important in supporting flood management, but running them is computationally expensive due to the high nonlinearity of the equations involved and the complexity of the surface topography. New modelling approaches based on deep learning algorithms have recently emerged for multiple applications.

This study aims to investigate the capacity of machine learning to achieve spatio-temporal flood modelling. The combination of spatial and temporal input data to obtain dynamic results of water levels and flows from a machine learning model on multiple domains for applications in flood risk assessments has not been achieved yet. Here, we develop increasingly complex architectures aimed at interpreting the raw input data of precipitation and terrain to generate essential spatio-temporal variables (water level and velocity fields) and derived products (flood maps) by training these based on hydrodynamic simulations.

An extensive training dataset is generated by solving the 2D shallow water equations on simplified topographies using Lisflood-FP.

As a first task, the machine learning model is trained to reproduce the maximum water depth, using as inputs the precipitation time series and the topographic grid. The models combine the spatial and temporal information through a combination of 1D and 2D convolutional layers, pooling, merging and upscaling. Multiple variations of this generic architecture are trained to determine the best one(s). Overall, the trained models return good results regarding performance indices (mean squared error, mean absolute error and classification accuracy) but fail at predicting the maximum water depths with sufficient precision for practical applications.

A major limitation of this approach is the availability of training examples. As a second task, models will be trained to bring the state of the system (spatially distributed water depth and velocity) from one time step to the next, based on the same inputs as previously, generating the full solution equivalent to that of a hydrodynamic solver. The training database becomes much larger as each pair of consecutive time steps constitutes one training example.
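
The growth of the training set in the second task can be sketched as follows, assuming the simulated states are stored as an array of shape (time, height, width, channels). The function name `make_step_pairs`, the array layout, and the random data are assumptions for illustration; the precipitation and topography inputs that would accompany each pair are omitted.

```python
import numpy as np

def make_step_pairs(states):
    """Split one simulated sequence (T, H, W, C) into (state at t, state at t+1) training pairs."""
    return states[:-1], states[1:]

# Hypothetical simulation output: 50 time steps on a 16 x 16 grid,
# with channels for water depth and two velocity components.
states = np.random.default_rng(0).random((50, 16, 16, 3))

x, y = make_step_pairs(states)
# One simulation yields 49 training examples here,
# instead of a single maximum-depth map as in the first task.
```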

Assuming that a reliable model can be built and trained, such methodology could be applied to build models that are faster and less computationally demanding than hydrodynamic models. Indeed, in the synthetic cases shown here, the simulation times of the machine learning models (< seconds) are far shorter than those of the hydrodynamic model (a few minutes at least). These data-driven models could be used for interpolation and forecasting. The potential for extrapolation beyond the range of training datasets will also be investigated (different topography and high intensity precipitation events).

How to cite: Elyahyioui, J., Pauwels, V., Daly, E., Petitjean, F., and Prakash, M.: Efficient simulation of flood events using machine learning, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-6254, https://doi.org/10.5194/egusphere-egu2020-6254, 2020.

D168 |
Daniel Klotz, Frederik Kratzert, Alden K. Sampson, Günter Klambauer, Sepp Hochreiter, and Grey Nearing

Accurate streamflow forecasts are important for many operational purposes, like hydropower operation or flood risk management. For data-driven models, the best prediction performance would be expected when recent streamflow observations are used as an additional model input. There is therefore a strong incentive to use forecasting models that take discharge signals as input whenever these are available.


Forecasting models are, however, not well suited when continuous measurement of discharge cannot be guaranteed or for applications in ungauged settings. Regarding the former, missing data can have long-lasting repercussions on data-driven models if large data windows are used for the input. Regarding the latter, data-driven forecast models are not applicable at all. Additionally, we would like to point out that data-driven simulation models need to represent the underlying hydrological processes more closely, since the setup explicitly reflects the rainfall-runoff relationship. To conclude, in many contexts it is more appropriate to use process or simulation models, which do not use discharge as input.


Despite the above-mentioned difficulties of forecasting models, it would nevertheless be beneficial to integrate past runoff information, whenever available, into simulation models in order to improve their accuracy. To this end, multiple potential approaches and strategies are available. In the context of conceptual or physically based rainfall-runoff models, recent runoff information is usually exploited by data assimilation/updating approaches (e.g. input-, state-, parameter- or output-updating). In this contribution we concentrate on input-updating approaches, since they allow the system to be adjusted for a forecasting period even if no explicit process can be attached to the system states.


We propose and examine different input-updating techniques for DL-based runoff models that can serve as baselines for future studies on data-assimilation tasks and that can be used with any differentiable model. To test the proposed approaches, we perform a series of experiments on a large set of basins throughout the continental United States. The results show that even simple updating techniques can strongly improve the forecasting accuracy.


How to cite: Klotz, D., Kratzert, F., Sampson, A. K., Klambauer, G., Hochreiter, S., and Nearing, G.: Learning from mistakes: Online updating for deep learning models, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-8853, https://doi.org/10.5194/egusphere-egu2020-8853, 2020.

D169 |
Frederik Kratzert, Daniel Klotz, Günter Klambauer, Grey Nearing, and Sepp Hochreiter

Simulation accuracy among traditional hydrological models usually degrades significantly when going from single basin to regional scale. Hydrological models perform best when calibrated for specific basins, and do worse when a regional calibration scheme is used. 

One reason for this is that these models do not (have to) learn hydrological processes from data. Rather, they have a predefined model structure and only a handful of parameters adapt to specific basins. This often yields less-than-optimal parameter values when the loss is not determined by a single basin, but by many through regional calibration.

The opposite is true for data driven approaches where models tend to get better with more and diverse training data. We examine whether this holds true when modeling rainfall-runoff processes with deep learning, or if, like their process-based counterparts, data-driven hydrological models degrade when going from basin to regional scale.

Recently, Kratzert et al. (2018) showed that the Long Short-Term Memory network (LSTM), a special type of recurrent neural network, achieves comparable performance to the SAC-SMA at basin scale. In follow-up work, Kratzert et al. (2019a) trained a single LSTM for hundreds of basins in the continental US, which significantly outperformed a set of hydrological models, including basin-calibrated ones. On average, a single LSTM is even better in out-of-sample predictions (ungauged) than the SAC-SMA in-sample (gauged) or the US National Water Model (Kratzert et al. 2019b).

LSTM-based approaches usually involve tuning a large number of hyperparameters, such as the number of neurons, number of layers, and learning rate, that are critical for the predictive performance. Therefore, large-scale hyperparameter search has to be performed to obtain a proficient LSTM network.  

However, in the abovementioned studies, hyperparameter optimization was not conducted at large scale; in Kratzert et al. (2018), for example, the same network hyperparameters were used in all basins instead of being tuned for each basin separately. It is not yet clear whether LSTMs, like traditional hydrological models, lose performance when going from basin to regional scale.

In the current study, we performed a computationally expensive, basin-specific hyperparameter search to explore how site-specific LSTMs differ in performance compared to regionally calibrated LSTMs. We compared our results to the mHM and VIC models, once calibrated per basin and once using an MPR regionalization scheme. These benchmark models were calibrated by individual research groups, to eliminate bias in our study. We analyse whether differences in basin-specific vs regional model performance can be linked to basin attributes or data set characteristics.


Kratzert, F., Klotz, D., Brenner, C., Schulz, K., and Herrnegger, M.: Rainfall–runoff modelling using Long Short-Term Memory (LSTM) networks, Hydrol. Earth Syst. Sci., 22, 6005–6022, https://doi.org/10.5194/hess-22-6005-2018, 2018. 

Kratzert, F., Klotz, D., Shalev, G., Klambauer, G., Hochreiter, S., and Nearing, G.: Towards learning universal, regional, and local hydrological behaviors via machine learning applied to large-sample datasets, Hydrol. Earth Syst. Sci., 23, 5089–5110, https://doi.org/10.5194/hess-23-5089-2019, 2019a. 

Kratzert, F., Klotz, D., Herrnegger, M., Sampson, A. K., Hochreiter, S., & Nearing, G. S.: Toward improved predictions in ungauged basins: Exploiting the power of machine learning. Water Resources Research, 55. https://doi.org/10.1029/2019WR026065, 2019b.

How to cite: Kratzert, F., Klotz, D., Klambauer, G., Nearing, G., and Hochreiter, S.: The performance of LSTM models from basin to continental scales, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-8855, https://doi.org/10.5194/egusphere-egu2020-8855, 2020.

D170 |
Hyunjun Ahn, Sunghun Kim, Joohyung Lee, and Jun-Haeng Heo

In the hydrology of extremes, it is essential to find the probability distribution model that is most appropriate for the sample data in order to estimate reasonable quantiles. Depending on the assumed probability distribution model, the estimated quantiles can differ considerably. The probability plot correlation coefficient (PPCC) test is one of the goodness-of-fit tests for finding suitable probability distributions for a given sample. The PPCC test determines whether an assumed probability distribution is acceptable for the sample data using the correlation coefficient between the sample data and the theoretical quantiles of the assumed distribution. For a three-parameter probability distribution, the critical values for this decision are presented as a two-dimensional table that depends on the sample size and the shape parameter of the model. In this study, the applicability and utility of machine learning in the hydrology field were examined. To make the PPCC test easier to use, a regression equation was derived using a machine learning algorithm with two variables: sample size and shape parameter.
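
The PPCC statistic itself can be sketched in a few lines. The example below is a simplification under stated assumptions: it uses the normal distribution rather than a three-parameter distribution, and it uses Blom plotting positions, which are one common choice and are not specified in the abstract. The function name `ppcc_normal` is illustrative. Critical values and the regression equation from the study are not reproduced here.

```python
import math
from statistics import NormalDist

def ppcc_normal(sample):
    """Correlation between the ordered sample and standard normal quantiles at Blom plotting positions."""
    n = len(sample)
    ordered = sorted(sample)
    # Blom plotting positions (assumption; other plotting position formulas are in use).
    quantiles = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    mean_x = sum(ordered) / n
    mean_q = sum(quantiles) / n
    cov = sum((x - mean_x) * (q - mean_q) for x, q in zip(ordered, quantiles))
    var_x = sum((x - mean_x) ** 2 for x in ordered)
    var_q = sum((q - mean_q) ** 2 for q in quantiles)
    return cov / math.sqrt(var_x * var_q)

# A strongly skewed sample correlates less well with normal quantiles
# than a roughly symmetric one.
skewed = [1, 2, 4, 8, 16, 32, 64, 128]
symmetric = [-3.1, -1.9, -1.2, -0.4, 0.5, 1.1, 2.0, 2.9]
r_skewed = ppcc_normal(skewed)
r_symmetric = ppcc_normal(symmetric)
```

In the test, the computed coefficient would be compared with a critical value for the given sample size (and, for three-parameter distributions, shape parameter); values below the critical value lead to rejection of the assumed distribution.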

How to cite: Ahn, H., Kim, S., Lee, J., and Heo, J.-H.: Regression equations of probability plot correlation coefficient test statistics using machine learning, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-12315, https://doi.org/10.5194/egusphere-egu2020-12315, 2020.

D171 |
Mohamed Chafik Bakey and Mathieu Serrurier

Precipitation nowcasting is the prediction of the future precipitation rate in a given geographical region with an anticipation time of a few hours at most. It is of great importance for weather forecast users, for activities ranging from outdoor activities and sports competitions to airport traffic management. In contrast to long-term precipitation forecasts, which are traditionally obtained from numerical weather prediction models, precipitation nowcasts need to be produced very quickly. This time constraint makes them more challenging to obtain. Recently, many machine learning based methods have been proposed. In this work, we develop an original deep learning approach. We formulate precipitation nowcasting as a video prediction problem where both input and prediction target are image sequences. The proposed model combines a Long Short-Term Memory network (LSTM) with a convolutional encoder-decoder network (U-net). Experiments show that our method captures spatiotemporal correlations and yields meaningful forecasts.
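
The video prediction framing can be illustrated by how training samples might be cut from a sequence of precipitation images. This is a minimal sketch under assumptions: the frames are stored as an array of shape (time, height, width), and the function name `make_windows` and the window lengths `n_in` and `n_out` are hypothetical. The LSTM and U-net model itself is not shown.

```python
import numpy as np

def make_windows(frames, n_in, n_out):
    """Slice a frame sequence (T, H, W) into pairs of n_in input frames and the n_out frames that follow."""
    inputs, targets = [], []
    for start in range(len(frames) - n_in - n_out + 1):
        inputs.append(frames[start:start + n_in])
        targets.append(frames[start + n_in:start + n_in + n_out])
    return np.stack(inputs), np.stack(targets)

# Hypothetical sequence of 20 precipitation images on a 32 x 32 grid.
frames = np.random.default_rng(0).random((20, 32, 32))

x, y = make_windows(frames, n_in=6, n_out=3)
# x holds 12 input sequences of 6 frames, y the 3 frames that follow each of them.
```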

How to cite: Bakey, M. C. and Serrurier, M.: Precipitation Nowcasting using Deep Neural Network, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-21631, https://doi.org/10.5194/egusphere-egu2020-21631, 2020.