Large-sample hydrology: characterizing and understanding hydrologic diversity and catchment organization

Large-sample studies lead to generalizable insights about hydrologic similarity, understanding of dominant hydrologic processes and modelling capabilities. Studies that investigate the organization and response of single catchments (e.g. well-monitored experimental catchments, innovative process models) can provide a testing ground for hydrologic theories that can be scaled up to larger samples. Combining detailed local knowledge with large data samples can provide insights unavailable to either approach alone, about e.g. hydrologic organization across large spatial scales or across varied hydroclimatic conditions.

This session provides the opportunity for researchers to highlight recent data and model-based efforts on catchment organization, diversity and response. We specifically encourage studies that seek to advance understanding of the following topics:

1. Data mobilization for hydrologic similarity and regionalization:
Can currently available global datasets of hydrologically relevant information (e.g. soil properties, land use, soil moisture estimates, meteorological re-analysis) effectively be used to define hydrologic similarity and thus improve the prediction in ungauged or scarcely gauged basins?

2. Testing of hydrologic theories:
To what extent can hydrologic theory developed in well-monitored experimental catchments be transferred to larger samples of relatively data-scarce catchments?

3. Modelling capabilities:
What can large sample hydrology reveal about the strengths and weaknesses of current modelling capabilities and how can large sample approaches be used to improve and constrain modelling efforts?

4. Explaining water use dynamics:
How can we use large sample hydrology to better understand water resource use, allocation and future availability, and inform sustainable management of these resources?

5. Development and improvement of large-sample data sets:
How can we overcome current challenges on unequal geographical representation of catchments, quantification of uncertainty, catchment heterogeneity and inclusion of human interaction with the global water cycle?

A splinter meeting is planned to discuss development and improvement of large-sample data sets, titled “Large sample hydrology: facilitating the production and exchange of data sets worldwide”. See the final program for location and timing.

The session and splinter meeting are organized as part of the Panta Rhei Working Group on large-sample hydrology.

Convener: Wouter Knoben (ECS) | Co-conveners: Daniele Ganora, Nans Addor (ECS), Stacey Archfield, Sara Lindersson (ECS), Sandra Pool (ECS), Nicolas Vasquez (ECS)
vPICO presentations | Wed, 28 Apr, 09:00–10:30 (CEST)

vPICO presentations: Wed, 28 Apr

Chairpersons: Wouter Knoben, Daniele Ganora
Hydrologic theory and catchment understanding
Sophie Comer-Warner, Nicolai Brekenfeld, Paul Romeijn, Sami Ullah, Daren Gooddy, Nicholas Kettridge, Benjamin Marchant, David Hannah, Feng Mao, and Stefan Krause

Climate change during the Anthropocene has caused many disturbances to Earth’s system, including altering patterns of precipitation and temperature. This has led to hydrological extremes with increases in both floods and droughts globally. We know and recognise this large effect on the global water cycle, but the consequent influence on biogeochemistry and greenhouse gas production has received less attention. Changes in greenhouse gas emissions due to increases in hydrological extremes may be an unrecognised climate feedback, having large implications for future climate and in turn, catchment hydrology. Here we present a synthesis from field studies and a review of the literature to investigate the effects of hydrological extremes on greenhouse gas production and emissions. We focus on variations in greenhouse gas emissions as a result of changes in both discharge and temperature, which are affected by hydrological extremes.

How to cite: Comer-Warner, S., Brekenfeld, N., Romeijn, P., Ullah, S., Gooddy, D., Kettridge, N., Marchant, B., Hannah, D., Mao, F., and Krause, S.: Closing the climate-hydrology feedback loop: Variations in greenhouse gas fluxes resulting from changes in catchment hydrology due to human-induced climate change, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-1617, https://doi.org/10.5194/egusphere-egu21-1617, 2021.

David G. Litwin, Ciaran J. Harman, Gregory E. Tucker, and Katherine R. Barnhart

Geomorphic properties of watersheds influence runoff generation, which drives landscape evolution over long timescales. Despite this strong process feedback, our understanding of how runoff generation affects long-term catchment evolution remains rudimentary. In most humid landscapes, storm runoff arises from shallow subsurface flow and from precipitation on saturated areas. Catchment geomorphology drives these runoff mechanisms, as landscape relief generates hydraulic gradients from hillslopes to streams, and regolith thickness and permeability affect flow partitioning and water storage capacity. However, there are few studies of how runoff coupled to dynamic shallow groundwater affects landscape form. In this study, we present a new groundwater-landscape evolution model and introduce a nondimensional framework to explore how subsurface-mediated runoff generation affects long-term catchment evolution. The model solves hydraulic groundwater equations to predict the water table location given prescribed recharge. Water in excess of the subsurface capacity for transport becomes overland flow, which may detach and transport sediment, affecting the landscape form that in turn affects runoff generation. We show that (1) two input parameters fully describe the possible steady state landscapes that coevolve under steady recharge, (2) subsurface flow capacity exerts a fundamental control on hillslope length and relief of these landscapes, and (3) three topographic metrics derived from the governing equations, steepness index, Laplacian curvature, and topographic wetness index, form a natural basis for evaluating the resulting coevolved landscapes. We derive a theoretical relationship using these metrics that allows us to recover the key model input parameters (including subsurface transmissivity) from topographic analysis of the landscape. 
These results open possibilities for topographic analysis of humid upland landscapes that could inform quantitative understanding of hydrological processes at the landscape scale.

How to cite: Litwin, D. G., Harman, C. J., Tucker, G. E., and Barnhart, K. R.: A hydrogeomorphic perspective on emergent topographic properties at landscape equilibrium, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-5863, https://doi.org/10.5194/egusphere-egu21-5863, 2021.

Nicolas Cornette, Clément Roques, Alexandre Boisson, Josette Launay, Guillaume Pajot, and Jean-Raynald de Dreuzy

Surface/subsurface interactions and geological heterogeneity have important effects on streamflow dynamics. Surface/subsurface interactions speed up transfers through the development of seepage zones, which reduce the response time of the aquifer and increase the proportion of rapid infiltration excess overland flow. In addition, geological heterogeneity spatially modulates the extent of the seepage zones as well as the intensity of drainage of the underlying aquifer.

We investigated the combined effect of surface/subsurface interactions and geological heterogeneity in a crystalline basement region under temperate climate (Brittany, France), where the limited aquifer capacities, the hydraulic conductivity enhanced by weathering and fracturing and the significant recharge rate promote surface/subsurface interactions. We analysed 40 years of discharge data monitored on two catchments (Arguenon, 104 km², and Aber Plabennec, 27.4 km²) using 1D hillslope models (hs1D). The hs1D hillslope model resolves the vertically integrated Boussinesq subsurface flows with a spatially and temporally varying saturation-limited boundary condition on equivalent 1D hillslope structures. We specifically analysed whether accounting for heterogeneity improves the discharge predictions, by representing each catchment with two equivalent hillslopes with different hydraulic properties. This heterogeneity was defined based on the presence of two main geological lithologies in the catchments. Calibration was performed by a systematic parameter space exploration.

The calibrated models display significant differences between the two catchments. In the Aber Plabennec catchment, the homogeneous and heterogeneous hillslope models performed very similarly, indicating an effective geological homogenization of the hydraulic conductivity and porosity. In the Arguenon catchment, the heterogeneous model outperformed the homogeneous model with a 46% increase in the Nash-log criterion, indicating persistent and significant differences in hydraulic conductivity and porosity. Successful calibration in both cases, demonstrated by Nash-log values larger than 0.75-0.8, showed the overall relevance of the hillslope approach and its capacity to check for the presence of hydraulic heterogeneity at the catchment scale. Differences between catchments hint at the potential identification of hydrogeological properties at the regional scale through the combined use of the geological map and stream discharges.
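For readers unfamiliar with the Nash-log criterion, the following is a minimal sketch, assuming it denotes the Nash-Sutcliffe efficiency computed on log-transformed discharge. The function name `nash_log` and the small offset `eps` (used to avoid taking the log of zero flow) are illustrative assumptions and are not taken from the abstract.

```python
import numpy as np

def nash_log(q_obs, q_sim, eps=1e-6):
    """Nash-Sutcliffe efficiency on log-transformed discharge.

    Hypothetical helper: `eps` is an assumed offset that keeps the
    logarithm defined when a flow value is zero.
    """
    log_obs = np.log(np.asarray(q_obs, dtype=float) + eps)
    log_sim = np.log(np.asarray(q_sim, dtype=float) + eps)
    # Sum of squared errors between observed and simulated log flows.
    residual = np.sum((log_obs - log_sim) ** 2)
    # Spread of the observed log flows around their mean.
    variance = np.sum((log_obs - log_obs.mean()) ** 2)
    return 1.0 - residual / variance
```

A value of 1 indicates a perfect fit, while a value near 0 means the simulation is no better than the mean of the observed log flows. The log transform gives more weight to low flows than the standard criterion does.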

How to cite: Cornette, N., Roques, C., Boisson, A., Launay, J., Pajot, G., and de Dreuzy, J.-R.: Hydrogeological controls on stream discharge dynamics in bedrock catchments: exploring the combined effects of seepage development and heterogeneity, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-7360, https://doi.org/10.5194/egusphere-egu21-7360, 2021.

Melike Kiraz, Thorsten Wagener, and Gemma Coxon

Studying large samples of catchments has been an effective means for comparative hydrology as it provides a wide range of hydrological conditions which can be used to learn similarities and differences between places. Such analyses typically include an attempt to organize catchments along some gradient (e.g. climate) or in clusters (e.g. geology) using catchment descriptors (e.g. an aridity index). Various past studies have pointed to the problem that available catchment descriptors are often not sufficient to capture hydrologically relevant catchment behaviours. It is further widely acknowledged that the water balance of many catchments is not closed. Several hypotheses for the causes of this lack of closed water balance are stated in literature.

If we assume that the dominant control on water balance is climate, then catchments’ water balances should change smoothly in space (since the climate varies smoothly). If they do not, then something else must be controlling this behaviour. We expect that size, location and geology might play an important role in the water balances of UK catchments. We aim to study the differences in water balance between catchments to understand the role of catchment location. We test different hypotheses while considering the local neighborhood of 669 UK catchments from the CAMELS-GB dataset.

How to cite: Kiraz, M., Wagener, T., and Coxon, G.: Location, location, location – Considering local neighborhood when analyzing large samples of UK catchments, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-9468, https://doi.org/10.5194/egusphere-egu21-9468, 2021.

Hydrologic similarity and regionalization
Joseph Janssen and Ali Ameli

Expanding the scientific understanding of global hydrological processes is a key research area for hydrologists. Research in this area can allow hydrologists to make better predictions in ungauged basins and catchments under climate change scenarios. Though hydrological processes are largely understood at a laboratory-scale, catchment-scale processes are much more complex and unknown. Previous studies at the catchment-scale have shown catchment geology is largely irrelevant in determining components of streamflow. Laboratory-scale experiments, however, have revealed that this is unlikely. This contradiction indicates the current techniques for determining hydrological variable importance in the literature are insufficient. In this paper, we quantify the influence of the interaction amongst climatic, geological, and topographical features on a large set of hydrological signatures in snow-dominated regions across North America, using Stable Extrapolative Marginal Contribution Feature Importance. The preliminary results show that when we consider interaction effects among climatic and geophysical features, and remove the influence of correlation, geological features show considerable importance at the catchment scale. We contend that this study contributes to the scientific understanding of catchment-scale hydrological processes, especially in cold, ungauged basins.

How to cite: Janssen, J. and Ameli, A.: The importance of geology when estimating catchment-scale streamflow characteristics: Application of a new technique for hydrologic similarity and regionalization, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-13979, https://doi.org/10.5194/egusphere-egu21-13979, 2021.

Yaqian Yang and Jintao Liu

In mountainous basins with little anthropogenic influence, hydrological function is mainly controlled by climate and landscape, which makes it possible to measure hydrological similarity indirectly through geographical features. Because the mechanisms of runoff generation can vary geographically, this study proposes a simple stepwise clustering scheme to explore the role of geographical features at different levels of the spatial hierarchy in indicating hydrological response. The methods were as follows: (1) stepwise regression was used to quantify the correlation between 35 geographical features and 35 flow features and to identify the important explanatory variables for hydrological response; (2) 64 basins were grouped by the stepwise clustering scheme, and the overall ability of the scheme to capture hydrological similarity was tested by comparison with the optimal parameters; (3) the hydrological similarity of basin groups was measured by leave-one-out cross validation of hydrological model parameters. The results showed that: (1) Rainfall features, elevation, slope and soil bulk density are the main explanatory variables. (2) The NSE of the basin groups based on stepwise clustering is 0.64, which is 80% of the NSE obtained with the optimal parameter sets (NSE=0.80). The NSE is greater than 0.5 in 90% of the basins, greater than 0.6 in 80%, and greater than 0.7 in 49%. (3) In humid areas, the hydrological responses of basins with more uniform monthly rainfall and more abundant summer rainfall are more similar, e.g., the NSE of Class 4 is 0.77. Under similar rainfall patterns, the hydrological responses of basins with higher average altitude, greater slope, more convergent shape and richer vegetation are more similar, e.g., the NSE of Class 3-2 is 0.72 and that of Class 1-2 is 0.70. Where rainfall patterns and landforms are similar, the hydrological responses of basins with smaller soil bulk density are more similar, e.g., the NSE of Class 3-2-2 is 0.80.
In conclusion, the stepwise clustering enhances the interpretability of basin classification, and the effect of different geographical features on hydrological response can show the applicability of hydrological simulation in ungauged basins.

How to cite: Yang, Y. and Liu, J.: Understanding the role of different geographical features in the hydrological response of humid mountainous areas through a stepwise clustering scheme, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-3830, https://doi.org/10.5194/egusphere-egu21-3830, 2021.

Yuka Muto, Takeyoshi Chibana, and Masafumi Yamada

To manage each catchment appropriately, it is important to understand how differences in geological conditions affect the relationship between precipitation and flow regimes.

Considering the differences in geological characteristics of catchments, this study aims 1) to identify the precipitation accumulation period that is most influential for each of several levels of daily flow and 2) to quantify the contribution of the change in total precipitation over this ‘most influential period’ to the change in flow.

In this study, 63 mountainous catchments (dam catchments) within the Japanese Archipelago were selected as target areas. First, the 63 catchments were divided into 4 groups according to their geological characteristics. Second, from 26 years of observed daily flow data (1993 to 2018), 6 types of daily flow representing different flow magnitudes within a year (the 1, 10, 25, 50, 75 and 95 percentiles of daily flow within a year) were extracted. In each geological classification, correlation coefficients between each of the 6 types of flow and total precipitation over various periods (from 2 days to 365 days) were calculated. Finally, for each geological classification and each type of flow, simple regression analyses were conducted, with the rate of change in flow as the objective variable and the rate of change in total precipitation over the appropriate period as the explanatory variable.

In the analysis of correlation coefficients, significant differences among geological classifications were seen for the lower flow types but not for the higher flow types. For catchments of volcanic rocks from the Quaternary period, total precipitation over the 365 days before the flow occurrence had the highest correlation coefficient with the lower flow types. On the other hand, for catchments of sedimentary rocks from the Mesozoic or Paleozoic era, the most influential period was approximately 45 days, which was the shortest.

Also, increasing trends in flow (i.e. the rate of change in flow > 1.0) during the target period were seen regardless of the geological classification or the type of flow. However, in the simple regression analysis, a significant effect of the change in precipitation on the change in flow was seen only for the annual maximum flow of catchments of sedimentary rocks from the Mesozoic or Paleozoic era. Except for this specific combination of geological characteristic and flow type, it is possible that other catchment conditions (e.g. change in land use) have a larger effect on the change in flow than the change in precipitation.

The analyses mentioned above do not consider the effect of snowfall. Therefore, the presentation additionally compares snow-covered regions with other regions.
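The search for the ‘most influential period’ described above can be illustrated with a minimal sketch. It assumes that each accumulation window ends on, and includes, the day on which the flow value occurred, and that the correlation is a Pearson coefficient; the abstract states neither. The function name `most_influential_window` and its arguments are hypothetical.

```python
import numpy as np

def most_influential_window(precip, flow_days, flow_values, windows=range(2, 366)):
    """Return (window_length, correlation) with the highest Pearson correlation.

    precip      : daily precipitation series
    flow_days   : integer indices into `precip` of the days the flows occurred
    flow_values : the flow observed on each of those days
    Assumption: a window of w days ends on, and includes, the flow day.
    """
    precip = np.asarray(precip, dtype=float)
    flow_days = np.asarray(flow_days, dtype=int)
    flow_values = np.asarray(flow_values, dtype=float)
    # Prefix sums turn every window total into a simple difference.
    cumulative = np.concatenate(([0.0], np.cumsum(precip)))
    best_window, best_corr = None, -np.inf
    for window in windows:
        # Keep only flow days preceded by a full window of precipitation data.
        usable = flow_days + 1 >= window
        if usable.sum() < 3:
            continue
        end = flow_days[usable] + 1
        totals = cumulative[end] - cumulative[end - window]
        # A constant series has no defined correlation, so skip it.
        if totals.std() == 0 or flow_values[usable].std() == 0:
            continue
        corr = np.corrcoef(totals, flow_values[usable])[0, 1]
        if corr > best_corr:
            best_window, best_corr = window, corr
    return best_window, best_corr
```

Run once per flow type and geological group, a search of this kind would yield one window length per combination, which is the form in which the abstract reports its results (for example, approximately 45 days versus 365 days).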

How to cite: Muto, Y., Chibana, T., and Yamada, M.: The Effect of Total Precipitation of Various Periods to Flow Regimes in Mountainous Catchments in Japan : Considering the Geological Characteristics of Catchments, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-14051, https://doi.org/10.5194/egusphere-egu21-14051, 2021.

Sara A. Goeking and David G. Tarboton

Forested watersheds across the western US have experienced recent widespread disturbance and tree mortality due to a combination of heat, drought, and epidemic insect and disease outbreaks. Hydrologic response has included both increases and decreases in the fraction of annual precipitation that is partitioned to streamflow versus evapotranspiration (ET). We used a large-sample hydrology approach to address two questions: First, how have water budget components changed during this period of high forest disturbance, and second, does streamflow response vary with disturbance severity, incoming solar radiation, and/or aridity? From previous studies, streamflow and runoff ratio are expected to increase with forest disturbance due to reduced ET, and conversely increases in forest density are expected to reduce streamflow. We statistically evaluated whether these expectations were met, and where and why contradictory responses occurred, using trend and regression analysis. We constructed annual water budgets for 211 watersheds in the western US from daily observations in the CAMELS dataset, which includes streamflow and climate data as well as watershed characteristics such as mean incoming solar radiation and aridity (i.e., ratio of mean annual potential ET to mean annual precipitation, or PET/P). Forest disturbance was quantified as percentage change in live tree volume and mean annual rate of tree mortality, using data collected by the US Forest Service’s Forest Inventory and Analysis program. While most water budget components and forcing variables did not exhibit consistent trends, many watersheds experienced significant increases in temperature and PET. Unexpected trends in runoff ratio occurred in two scenarios: First, runoff ratio decreased following forest disturbance in many water-limited watersheds (i.e., PET/P>1) of the southwestern US; and second, both runoff ratios and forest densities increased in some energy-limited watersheds of the Pacific Northwest. 
Water-limited watersheds and those with high solar radiation experienced more forest disturbance than energy-limited watersheds. We used hydrologic time trend analysis to quantify the magnitude of streamflow change. A linear regression model including precipitation and temperature as inputs was calibrated and validated using the pre-disturbance time period (1980-2006, odd years and even years, respectively; validation r² = 0.954), and then applied to the post-disturbance time period (2007-2019), where model residuals are assumed to represent change in streamflow due to factors not included in the model, i.e., forest change. Among the 65 watersheds with significant streamflow change, the magnitude of change was moderately correlated with both disturbance severity and solar radiation. Decreased post-disturbance streamflow occurred mainly in watersheds with low to moderate tree mortality and high incoming solar radiation. We used multiple linear regression to identify important predictors of streamflow change. Pre-disturbance streamflow, change in precipitation and PET, solar radiation, and the interaction of solar radiation and tree mortality were all highly significant predictors (p

How to cite: Goeking, S. A. and Tarboton, D. G.: Assessing annual streamflow response to forest disturbance in the western US: A large-sample hydrology approach, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-8070, https://doi.org/10.5194/egusphere-egu21-8070, 2021.

Bailey Anderson, Louise Slater, Simon Dadson, and Annalise Blum

There is still limited quantitative understanding of the effects of tree cover and urbanisation on streamflow at large scales, making it difficult to generalize these relationships. We use the globally consistent European Space Agency (ESA) Climate Change Initiative (CCI) Global Land Cover dataset to estimate the relationships between streamflow, calculated as high (Q0.99), median (Q0.50), and low (Q0.01) flow quantiles, and urbanization or tree cover changes in 2865 catchments from 1992 through 2018. We apply three statistical modelling approaches and examine the consistencies and inconsistencies between them. First, we use distributional regression models -- generalized additive models for location, scale, and shape (GAMLSS) -- at each site and assess goodness-of-fit. Model fits suggested a strong association between land cover, especially urban area, and low and median flows at sites with statistically significant trends in streamflow. We then examine the sign of the distributional regression model coefficients to determine whether the inclusion of a land cover variable in the regression models results in a relative increase or decrease in flow, regardless of the overall direction of trends in streamflow. Finally, we use fixed effects panel regression models to estimate the average effect across all sites. Panel regression results suggested that a 1% increase in urban area corresponds to an increase in streamflow of between < 1% and 2.1% for all quantiles. Results for the tree cover panel regression models were not significant. We highlight the value of statistical approaches for large-sample attribution of hydrological change, while cautioning that considerable variability exists across catchments and modelling approaches.
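As an illustration of the fixed effects panel regression idea, the sketch below implements the textbook within estimator for a single explanatory variable: each site's own mean is removed from both variables before a common slope is fitted, so persistent differences between sites do not influence the estimate. It is a generic sketch, not the authors' model specification; the function name `within_estimator`, the single-predictor form and any variable transformations are assumptions.

```python
import numpy as np

def within_estimator(site_ids, x, y):
    """Common slope of y on x after removing site-specific means.

    Hypothetical helper for a one-predictor fixed effects model
    y[i, t] = alpha[i] + beta * x[i, t] + error; returns beta.
    """
    site_ids = np.asarray(site_ids)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_demeaned = np.empty_like(x)
    y_demeaned = np.empty_like(y)
    for site in np.unique(site_ids):
        mask = site_ids == site
        # Subtracting the site mean absorbs the site intercept alpha[i].
        x_demeaned[mask] = x[mask] - x[mask].mean()
        y_demeaned[mask] = y[mask] - y[mask].mean()
    # Ordinary least squares slope on the demeaned data.
    return np.sum(x_demeaned * y_demeaned) / np.sum(x_demeaned ** 2)
```

In a setting like the one described, `x` might be the urban fraction of a catchment in a given year and `y` a (possibly log-transformed) flow quantile, with one site identifier per gauge.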

How to cite: Anderson, B., Slater, L., Dadson, S., and Blum, A.: Influence of tree cover and urban area on streamflow in the United States using multiple statistical attribution techniques, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-2282, https://doi.org/10.5194/egusphere-egu21-2282, 2021.

Mattia Neri, Paulin Coulibaly, and Elena Toth

Catchment classification is one of the essential steps for transferring information between similar watersheds, through the identification of the dominant hydrological processes and their main characteristics. The delineation of similar groups of basins is required for several regionalisation applications, and how hydrological similarity is assessed generally depends on the specific purpose of the study and on the features to be regionalised. In some analyses, for example the regionalisation of rainfall-runoff models, the similarity should reflect the interaction between meteorological forcings and river streamflow time series, in particular at fine temporal scale, in order to reproduce the catchment behaviour in the rainfall-runoff transformation processes. Previous hydrological research has identified basins with similar meteorological forcings (i.e. similarity of climate) or with similar streamflow time-series (i.e. similarity of runoff response), but no studies have so far considered the interaction between the entire time-series of forcing data (e.g. precipitation) and streamflow, quantifying it through measures to be used as similarity metrics.

One approach that may be applied for this purpose is to use concepts from information theory, which are based on the notion of entropy, i.e. the information content of a signal (such as a time-series) or, in the multivariate case, the information shared between multiple variables. The present study proposes the use of a multivariate entropy-based measure, the so-called transfer entropy, a time-asymmetric quantity which analyses the interaction between different signals.

In this study, the concept of transfer entropy is applied for identifying the dominant hydrological processes occurring in a catchment, measuring the transfer of information from different meteorological forcings over the catchment (such as rainfall, snowmelt and evapotranspiration) to the corresponding streamflow time-series at the basin outlet. The resulting similarity measure is then used for grouping catchments with similar dynamics.

In a first step, the different amounts of information transferred from the meteorological forcing variables to observed runoff are estimated through the computation of the transfer entropy. The transfer entropy values are then used as signatures to characterise catchment dynamics, and a classification of the basins inside a study region is obtained assuming that similar values of transfer entropy for the considered forcing variables identify similar basins.
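To make the quantity concrete, the following is a minimal sketch of a plug-in transfer entropy estimate from one series to another. It assumes both series are discretized into equal-frequency bins and that one past value of each series is used as history; the abstract does not describe the estimator actually used, and the names `discretize` and `transfer_entropy` are hypothetical.

```python
from collections import Counter
import numpy as np

def discretize(series, n_bins=4):
    """Map a series to integer bin labels using equal-frequency (quantile) bins."""
    series = np.asarray(series, dtype=float)
    edges = np.quantile(series, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(series, edges).tolist()

def transfer_entropy(source, target, n_bins=4, lag=1):
    """Plug-in estimate (in bits) of the information transferred source -> target.

    Assumed form: sum over states of
    p(y_next, y_past, x_past) * log2[p(y_next | y_past, x_past) / p(y_next | y_past)].
    """
    x = discretize(source, n_bins)
    y = discretize(target, n_bins)
    y_next, y_past, x_past = y[lag:], y[:-lag], x[:-lag]
    n = len(y_next)
    # Count how often each combination of states occurs.
    joint = Counter(zip(y_next, y_past, x_past))
    past_pair = Counter(zip(y_past, x_past))
    self_pair = Counter(zip(y_next, y_past))
    past_single = Counter(y_past)
    te = 0.0
    for (yn, yp, xp), count in joint.items():
        p_joint = count / n
        cond_full = count / past_pair[(yp, xp)]               # p(y_next | y_past, x_past)
        cond_self = self_pair[(yn, yp)] / past_single[yp]     # p(y_next | y_past)
        te += p_joint * np.log2(cond_full / cond_self)
    return te
```

Because the measure is time-asymmetric, `transfer_entropy(rainfall, streamflow)` and `transfer_entropy(streamflow, rainfall)` generally differ. Values computed for several forcing variables could then serve as the per-catchment signatures on which the classification is based.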

The methodology is tested for two study regions: the first is Austria, where a very densely-gauged set of catchments is available; the second is the conterminous US (CAMELS dataset), characterised by sparser gauging stations and a much higher hydroclimatic variability.

The outcomes of the approach are evaluated against a set of “traditional” catchment signatures, demonstrating the potential of transfer entropy as an additional promising instrument for assessing hydrological similarity and for quantifying the connection between different governing processes.

How to cite: Neri, M., Coulibaly, P., and Toth, E.: Exploring the potential of transfer entropy for identifying similarity of catchment dynamics, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-10152, https://doi.org/10.5194/egusphere-egu21-10152, 2021.

Impact of the number of donor catchments and the efficiency threshold on regionalization performance of hydrological models
Wen-yan Qi, Chen Jie, Lu Li, Chong-Yu Xu, Yi-heng Xiang, Shao-bo Zhang, and Hui-Min Wang

Modelling capabilities
Ilias Pechlivanidis, Louise Crochemore, and Marc Girons Lopez

The scientific community has made significant progress towards improving the skill of hydrological forecasts; however, most investigations have been conducted in single catchments or in a limited number of catchments. Such an approach is valuable for detailed process investigation and therefore for understanding the local conditions that affect forecast skill, but it is limited when it comes to scaling up the underlying hydrometeorological hypotheses. To advance knowledge on the drivers that control the quality and skill of hydrological forecasts, much can be gained from comparative analyses and from the availability of statistically significant samples. Large-scale modelling (at national, continental or global scales) can complement the in-depth knowledge from single catchment modelling by encompassing many river systems that represent a breadth of physiographic and climatic conditions. In addition to large sample sizes which cover a gradient in terms of climatology, scale and hydrological regime, the use of machine learning techniques can contribute to the identification of emerging spatiotemporal patterns, allowing forecast skill to be attributed to different regional physiographic characteristics.

Here, we draw on two seasonal hydrological forecast skill investigations that were conducted at the national and continental scales, providing results for more than 36,000 basins in Sweden and Europe. Due to the large generated samples, we are capable of demonstrating that the quality of seasonal streamflow forecasts can be clustered and regionalized, based on a priori knowledge of the local hydroclimatic conditions. We show that the quality of seasonal streamflow forecasts is linked to physiographic and hydroclimatic descriptors, and that the relative importance of these descriptors varies with initialization month and lead time. In our samples, hydrological similarity, temperature, precipitation, evaporative index, and precipitation forecast biases are strongly linked to the quality of streamflow forecasts. This way, while seasonal river flow can generally be well predicted in river systems with slow hydrological responses, predictability tends to be poor in cold and semiarid climates in which river systems respond immediately to precipitation signals.

How to cite: Pechlivanidis, I., Crochemore, L., and Girons Lopez, M.: Why is large sample hydrology important in hydrological forecasting? , EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-4871, https://doi.org/10.5194/egusphere-egu21-4871, 2021.

Paul C. Astagneau, François Bourgin, Vazken Andréassian, and Charles Perrin

To improve the predictive capability of a model, one must identify situations where it fails to provide satisfactory results. We tried to identify the deficiencies of a lumped rainfall-runoff model used for flood simulation (the hourly GR5H-I model) by evaluating it over a large set of 229 French catchments and 11,054 flood events. Evaluating model simulations separately for individual flood events allowed us to identify a seasonal trend: while the model yielded good performance in terms of aggregated statistics, grouping results by season showed clear underestimations of most of the floods occurring in summer. The largest underestimations of flood volumes were identified when high-intensity precipitation events occurred and when the precipitation field was highly spatially variable. Low antecedent soil moisture conditions were also found to be strongly correlated with model bias. Overall, this study pinpoints the need to better account for short-duration processes to improve the GR5H-I model for flood simulation.

How to cite: Astagneau, P. C., Bourgin, F., Andréassian, V., and Perrin, C.: Flood simulation errors show an unexpected seasonal trend: results obtained on a set of 229 catchments and 11,054 flood events, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-1045, https://doi.org/10.5194/egusphere-egu21-1045, 2021.

Jerom Aerts, Albrecht Weerts, Willem van Verseveld, Pieter Hazenberg, Niels Drost, Rolf Hut, and Nick van de Giesen

In this study, we investigate the effect of spatial resolution discretization at 3 km, 1 km, and 200 m by evaluating the streamflow estimation of the model. A hypothesis-driven approach is used to investigate why changes in states and fluxes take place at different spatial resolutions and how they relate to model performance. These changes are evaluated in the context of landscape and climate characteristics as well as hydrological signatures. In doing so, we address the research question: can landscape, climate and hydrological characteristics dictate the appropriate spatial modelling resolution a priori?

We use a spatially distributed wflow_sbm model (Imhoff et al., 2020, code: https://zenodo.org/record/4291730) together with the CAMELS dataset (Addor et al., 2017), covering the Continental United States. The wflow_sbm model is chosen for its flexibility in the spatial resolution of the watershed discretization while maintaining run time performance suitable for large-sample studies. The flexibility in spatial resolution is achieved by the use of point-scale (pedo)transfer functions (PTFs) with upscaling rules to global datasets to ensure flux matching across scales (Imhoff et al., 2020; Samaniego et al., 2010, 2017). The model relies on open datasets for parameter estimation and requires minimal calibration effort, as it is most sensitive to two model parameters: rooting depth and horizontal conductivity.

This study is carried out within the eWaterCycle framework, allowing for a FAIR-by-design research setup that is scalable in terms of case study areas and hydrological models.

How to cite: Aerts, J., Weerts, A., van Verseveld, W., Hazenberg, P., Drost, N., Hut, R., and van de Giesen, N.: Large-sample based evaluation of the spatial resolution discretization of the wflow_sbm model for the CAMELS dataset, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-10680, https://doi.org/10.5194/egusphere-egu21-10680, 2021.

Development and improvement of large-sample data sets
Daniel Power, Rafael Rosolem, Miguel Rico-Ramirez, Darin Desilets, and Sharon Desilets

Despite its importance in many hydrological and environmental applications, directly estimating soil moisture at the field scale is still challenging. The spatial gap between point-scale sensors and satellite-derived products is becoming increasingly important to consider in the push for hyper-resolution, (sub-)kilometre hydrometeorological models. Cosmic-Ray Neutron Sensors (CRNS) can help to bridge this spatial gap. CRNS provide estimates of field-scale (sub-kilometre), root-zone integrated soil moisture, typically at hourly intervals. They achieve this by counting fast neutrons, which are produced in the atmosphere by incoming cosmic rays. Fast neutrons are moderated primarily by hydrogen atoms, and it is this relationship that allows us to estimate field-averaged soil moisture. National networks of CRNS are available in the USA, Australia, the UK, and Germany, along with individual sites across the globe. As these networks have expanded, so has our knowledge of best practices for calibration and correction of the sensor measurements. However, there continues to be a divergence and lack of harmonization in some data processing methods, leading to additional uncertainty when comparing sensors in different networks. This can undermine efforts to apply large-sample hydrological analysis to CRNS across a wide range of climates and biomes. To provide an easily accessible platform for multi-site comparison worldwide, we developed the Cosmic Ray Sensor Python tool (crspy). Crspy is an open-source Python package designed to process CRNS data from global networks in a uniform and harmonized way (https://www.github.com/danpower101/crspy). Additionally, crspy has been developed for multi-site ‘big-data’ analysis in hydrology.
Our crspy tool produces detailed information in the form of metadata for each site, using both site-specific data and global data products to give information on soil properties (SoilGridsv2), land cover/aboveground biomass (ESA CCI) and climate data (ERA5-Land). Our preliminary analysis and tool development were carried out using public-domain data from more than 100 sites globally. We will present an analysis of this large sample of data, utilising the harmonized soil moisture readings along with detailed metadata for each site. We aim to increase our understanding of the dominant mechanisms controlling soil moisture dynamics, which should be useful in multiple areas of research such as catchment classification, agriculture and irrigation, and hydrological model development.
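
The relationship between neutron counts and soil moisture mentioned above can be sketched with a calibration function that is widely used in the CRNS literature (Desilets et al., 2010). This is an illustration under stated assumptions, not a description of the crspy processing chain: the function name is hypothetical, the counts are assumed to be already corrected (e.g. for pressure, humidity and incoming intensity), and `n0` is a site-specific calibration count rate over dry soil.

```python
# Shape parameters of the commonly used Desilets et al. (2010) function.
A0, A1, A2 = 0.0808, 0.372, 0.115


def soil_moisture_from_counts(corrected_counts, n0):
    """Estimate gravimetric soil moisture (g/g) from corrected neutron counts.

    `corrected_counts` is the corrected neutron count rate and `n0` the
    site-specific count rate over dry soil (both hypothetical inputs here).
    """
    ratio = corrected_counts / n0
    if ratio <= A1:
        raise ValueError("count ratio is outside the valid range of the function")
    return A0 / (ratio - A1) - A2


# Fewer counted neutrons imply more hydrogen, hence wetter soil.
wet_estimate = soil_moisture_from_counts(600.0, 1000.0)
dry_estimate = soil_moisture_from_counts(900.0, 1000.0)
```

Multiplying the result by the dry soil bulk density would give a volumetric estimate; that step, and any correction for lattice water or organic matter, is left out of this sketch.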

How to cite: Power, D., Rosolem, R., Rico-Ramirez, M., Desilets, D., and Desilets, S.: Understanding soil moisture dynamics through cosmic rays: a global analysis, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-13250, https://doi.org/10.5194/egusphere-egu21-13250, 2021.

Jamie Brown, Rafael Rosolem, Ross Woods, Humberto Rocha, and Debora Roberti

In the past decade, the scientific community has seen an increase in the number of global hydrometeorological products. This has been made possible by efforts to push global hydrological and land surface modelling towards hyper-resolution applications. As the resolution of these datasets increases, so does the need to compare their estimates against local in-situ measurements. This is particularly important for Brazil, whose continental-scale domain spans a wide range of climates and biomes. In this study, high-resolution (0.1-0.25 deg) global and regional meteorological datasets are compared against flux tower observations at 11 sites across Brazil (for periods between 1999 and 2010), covering Brazil’s main land cover types (tropical rainforest, woodland savanna, various croplands, and tropical dry forests), to assess the quality of four global reanalysis products [ERA5-Land, GLDAS2.0, GLDAS2.1, and MSWEPv2.2] and one regional gridded dataset developed from local interpolation of meteorological variables across the country [Brazilian National Meteorological Database (referred to here as Xavier)]. Whilst the only variable assessed for MSWEP was precipitation, all other gridded datasets also included surface meteorological variables such as air temperature, wind speed, pressure, downward shortwave and longwave radiation, and specific humidity. Data products were evaluated for their ability to reproduce the daily and monthly meteorological observations at flux towers. A ranking system for data products was developed based on the mean squared error (MSE). To identify the possible causes of these errors, further analysis was undertaken to determine the contributions of correlation, bias, and variation to the MSE. Results show that, for precipitation, MSWEP outperforms the other datasets at the daily scale, but at the monthly scale Xavier performs best.
For all other variables, ERA5-Land achieved the best-ranking (smallest) errors at the daily scale and averaged the best rank for all variables at the monthly scale. GLDAS2.0 performed least well at both temporal scales; however, the newer version (GLDAS2.1) improved on the older version for almost every variable. Xavier wind speed and GLDAS2.0 solar radiation outperformed the other datasets at the monthly scale. The largest contribution to the MSE at the daily scale for all datasets and variables was the correlation contribution, whilst at the monthly scale it was the bias contribution. ERA5-Land is recommended when using multiple hydro-meteorological variables to force land-surface models within Brazil.
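
The abstract does not state which formula was used to split the MSE into correlation, bias, and variation contributions. One standard decomposition, shown here only as a plausible sketch, writes the MSE as the sum of a squared mean difference, a squared difference of standard deviations, and a term that grows as the correlation weakens. The function and variable names are hypothetical.

```python
from math import sqrt


def mse_decomposition(simulated, observed):
    """Split the MSE into bias, variation and correlation contributions.

    Uses population statistics (divide by n), so that
    mse == bias + variation + correlation up to rounding.
    """
    n = len(observed)
    mean_s = sum(simulated) / n
    mean_o = sum(observed) / n
    var_s = sum((s - mean_s) ** 2 for s in simulated) / n
    var_o = sum((o - mean_o) ** 2 for o in observed) / n
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(simulated, observed)) / n
    sd_s, sd_o = sqrt(var_s), sqrt(var_o)
    return {
        "mse": sum((s - o) ** 2 for s, o in zip(simulated, observed)) / n,
        "bias": (mean_s - mean_o) ** 2,
        "variation": (sd_s - sd_o) ** 2,
        # Equivalent to 2 * sd_s * sd_o * (1 - r), written without dividing by sd.
        "correlation": 2.0 * (sd_s * sd_o - cov),
    }


# Hypothetical daily values from a gridded product and a flux tower.
parts = mse_decomposition([2.0, 2.0, 4.0, 6.0], [1.0, 2.0, 3.0, 4.0])
```

Dividing each contribution by the total MSE gives the shares that can then be compared across products, variables, and temporal scales.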

How to cite: Brown, J., Rosolem, R., Woods, R., Rocha, H., and Roberti, D.: Evaluation of high-resolution meteorological global data products using flux tower observations across Brazil, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-15387, https://doi.org/10.5194/egusphere-egu21-15387, 2021.

Keirnan Fowler, Suwash Chandra Acharya, Nans Addor, Chihchung Chou, and Murray Peel

Large samples of catchments are becoming increasingly important to gain generalisable insights from hydrological research. Such insights are facilitated by freely available large sample hydrology datasets, with one example being the CAMELS (Catchment Attributes and Meteorology for Large-sample Studies) series of datasets. Here we present CAMELS-AUS, the Australian edition of CAMELS. CAMELS-AUS comprises data for 222 unregulated catchments, combining hydrometeorological timeseries (streamflow and 18 climatic variables) with 134 attributes related to geology, soil, topography, land cover, anthropogenic influence, and hydroclimatology. The CAMELS-AUS catchments have been monitored for decades (more than 85 % have streamflow records longer than 40 years) and are relatively free of large-scale changes, such as significant changes in land use. Rating curve uncertainty estimates are provided for most (75 %) of the catchments, and multiple atmospheric datasets are included, offering insights into forcing uncertainty. This dataset, the first of its kind in Australia, allows users globally to freely access catchment data drawn from Australia's unique hydroclimatology, particularly notable for its large interannual variability. Combined with arid catchment data from the CAMELS datasets for the USA and Chile, CAMELS-AUS constitutes an unprecedented resource for the study of arid-zone hydrology. CAMELS-AUS is freely downloadable, and the corresponding paper is available at https://essd.copernicus.org/preprints/essd-2020-228/.

How to cite: Fowler, K., Chandra Acharya, S., Addor, N., Chou, C., and Peel, M.: CAMELS-AUS: Hydrometeorological time series and landscape attributes for 222 catchments in Australia, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-10428, https://doi.org/10.5194/egusphere-egu21-10428, 2021.

Vazken Andréassian, Olivier Delaigue, Charles Perrin, Bruno Janet, and Nans Addor

Over the last decades, the development of large sample hydrology has allowed a generalization of sound model evaluation and testing practices (Andréassian et al., 2006; Gupta et al., 2014), based on various types of split-sample tests. This presentation aims to illustrate some of these tests, while at the same time introducing a French dataset that we have been working with for many years. This dataset has been assembled at INRAE (HYCAR research unit), based on an automated assembly of national data products (Delaigue et al., 2020). CAMELS-FR will provide daily hydro-meteorological time series (streamflow, solid and liquid precipitation, potential evapotranspiration, temperature, etc.) covering the 1958-2020 period. Catchment characteristics such as land cover and topography (i.e. elevation and slope distributions, drainage density, topographic index, etc.) will be provided, with information about possible regulations upstream, and with some a priori information on data quality. Graphical summary sheets for each catchment are already available.

This approach is part of the CAMELS international initiative (Addor et al., 2017), whose purpose is to facilitate reproducible hydrological research by the use of large sample catchment datasets, and the CAMELS-FR dataset will be made available to scientific users in partnership with data owners.
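
The split-sample tests mentioned in this abstract can be illustrated with a minimal sketch: the record is divided into two periods, the model is calibrated on one and evaluated on the other, and the roles are then swapped. Everything specific below is an assumption made for illustration, including the one-parameter runoff-coefficient model, the least-squares calibration, the use of the Nash-Sutcliffe efficiency, and the data.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency (1 is a perfect fit)."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread


def calibrate_runoff_coefficient(precip, flow):
    """Least-squares estimate of c in the toy model flow = c * precip."""
    return sum(p * q for p, q in zip(precip, flow)) / sum(p * p for p in precip)


def simulate(precip, coefficient):
    """Run the toy model for a precipitation series."""
    return [coefficient * p for p in precip]


def split_sample_test(precip, flow):
    """Calibrate on each half of the record and evaluate on the other half."""
    half = len(flow) // 2
    periods = {"P1": (precip[:half], flow[:half]),
               "P2": (precip[half:], flow[half:])}
    results = {}
    for calibration, validation in (("P1", "P2"), ("P2", "P1")):
        cal_precip, cal_flow = periods[calibration]
        val_precip, val_flow = periods[validation]
        coefficient = calibrate_runoff_coefficient(cal_precip, cal_flow)
        results[(calibration, validation)] = {
            "calibration_nse": nse(cal_flow, simulate(cal_precip, coefficient)),
            "validation_nse": nse(val_flow, simulate(val_precip, coefficient)),
        }
    return results


# Hypothetical precipitation and streamflow series (mm/day).
precip = [10.0, 0.0, 20.0, 5.0, 15.0, 0.0, 30.0, 10.0]
flow = [4.0, 0.5, 9.0, 2.0, 9.0, 1.0, 17.0, 6.0]
results = split_sample_test(precip, flow)
```

Repeating such a test over hundreds of catchments, rather than one, is what turns the drop from calibration to validation performance into a general statement about a model.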


Addor, N., Newman, A. J., Mizukami, N., Clark, M. P. (2017). The CAMELS data set: catchment attributes and meteorology for large-sample studies, Hydrol. Earth Syst. Sci., 21, 5293–5313, https://doi.org/10.5194/hess-21-5293-2017

Andréassian, V., Hall, A., Chahinian, N., Schaake, J. (2006). Introduction and Synthesis: Why should hydrologists work on a large number of basin data sets? IAHS Publication, 307, 1-5, https://hal.inrae.fr/hal-02588687.

Delaigue, O., Génot, B., Lebecherel, L., Brigode, P., Bourgin, P.Y. (2020). Database of watershed-scale hydroclimatic observations in France. Université Paris-Saclay, INRAE, HYCAR Research Unit, Hydrology group, Antony, https://webgr.inrae.fr/base-de-donnees.

Gupta, H.V., Perrin, C., Blöschl, G., Montanari, A., Kumar, R., Clark, M., Andréassian, V. (2014). Large-sample hydrology: A need to balance depth with breadth. Hydrol. Earth Syst. Sci., 18(2), 463–477, https://doi.org/10.5194/hess-18-463-2014.

How to cite: Andréassian, V., Delaigue, O., Perrin, C., Janet, B., and Addor, N.: CAMELS-FR: A large sample, hydroclimatic dataset for France, to support model testing and evaluation, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-13349, https://doi.org/10.5194/egusphere-egu21-13349, 2021.

Christoph Klingler, Mathew Herrnegger, Frederik Kratzert, and Karsten Schulz

Open large-sample datasets are important for various reasons: i) they enable large-sample analyses, ii) they democratize access to data, iii) they enable large-sample comparative studies and foster reproducibility, and iv) they are a key driver for recent developments of machine-learning based modelling approaches.

Recently, various large-sample datasets have been released (e.g. different country-specific CAMELS datasets); however, all of them contain only data for individual catchments distributed across entire countries, not for connected river networks.

Here, we present LamaH, a new dataset covering all of Austria and the foreign upstream areas of the Danube, spanning a total of 170,000 km² in 9 different countries with discharge observations for 882 gauges. The dataset also includes 15 different meteorological time series, derived from ERA5-Land, for two different basin delineations: the first corresponds to the entire upstream area of a particular gauge, and the second corresponds only to the area between a particular gauge and its upstream gauges. The time series data, both meteorological and discharge, are provided at hourly and daily resolution and cover a period of over 35 years (with some exceptions in discharge data for a couple of gauges).

Closely following the CAMELS datasets, LamaH also contains more than 60 catchment attributes, derived for both types of basin delineations. The attributes include climatic, hydrological and vegetation indices, land cover information, as well as soil, geological and topographical properties. Additionally, the runoff gauges are classified by over 20 different attributes, including information about human impact and indicators for data quality and completeness. Lastly, LamaH also contains attributes for the river network itself, such as gauge topology, stream length and the slope between two sequential gauges.
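
The two basin delineations and the gauge topology described above are related in a simple way, which the following sketch illustrates. It is not taken from LamaH: the gauge identifiers, areas, and data layout are hypothetical. The area between a gauge and its upstream gauges is obtained by subtracting the total upstream areas of the directly upstream gauges from the total upstream area of the gauge itself.

```python
def intermediate_areas(total_upstream_area, next_downstream):
    """Area between each gauge and its directly upstream gauges.

    `total_upstream_area` maps gauge id -> full upstream area (km2).
    `next_downstream` maps gauge id -> id of the next gauge downstream,
    or None for the most downstream gauge. Both layouts are hypothetical.
    """
    contributed = {gauge: 0.0 for gauge in total_upstream_area}
    for gauge, downstream in next_downstream.items():
        if downstream is not None:
            contributed[downstream] += total_upstream_area[gauge]
    return {gauge: total_upstream_area[gauge] - contributed[gauge]
            for gauge in total_upstream_area}


# Hypothetical network: A and B drain to C, which drains to D.
areas = {"A": 100.0, "B": 50.0, "C": 400.0, "D": 1000.0}
topology = {"A": "C", "B": "C", "C": "D", "D": None}
between_gauges = intermediate_areas(areas, topology)
```

A useful consistency check is that the intermediate areas of all gauges in a connected network sum to the total upstream area of the outlet gauge.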

Given the scope of LamaH, we hope that this dataset will serve as a solid basis for further investigations across a range of hydrological tasks. The extent of the data, combined with the interconnected river network and the high temporal resolution of the time series, might reveal deeper insights into water transfer and storage when used with appropriate modelling methods.

How to cite: Klingler, C., Herrnegger, M., Kratzert, F., and Schulz, K.: LamaH: Large-sample Data for Hydrology in Central Europe, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-14335, https://doi.org/10.5194/egusphere-egu21-14335, 2021.

Nathalie Rouché, Yves Tramblay, Jean-Emmanuel Paturel, Gil Mahé, Jean-François Boyer, Ernest Amoussou, Ansoumana Bodian, Honoré Dacosta, Hamouda Dakhlaoui, Alain Dezetter, Denis Hughes, Lahoucine Hanich, Christophe Peugeot, Raphael Tshimanga, and Patrick Lachassagne

The African continent probably has the lowest density of hydrometric stations currently measuring river discharge, even though the number of stations was quite large until the 1970s. In addition, there is a major issue of data availability, since the different existing datasets are scattered across vast regions, heterogeneous, and often have a large amount of missing data in the time series. The aim of the African Database of Hydrometric Indices (ADHI) is to provide a set of hydrometric indices computed from an unprecedentedly large set of daily discharge data in Africa. The ADHI database is based on a new streamflow dataset of 1466 gauging stations with an average record length of 33 years; for over 100 stations, complete records are available over 50 years. ADHI compiles carefully checked data from different sources, based on the historical databases of ORSTOM / IRD and the GRDC, and also including contributions from different countries and basin agencies. The criterion for a station to be included in ADHI is to have a minimum of 10 full years of daily discharge data between 1950 and 2018 with less than 5% missing data. Some time series originating from different sources were concatenated, after making sure the rating curves applied to the different time periods to compute river discharge were similar. Data records were scrutinized to identify suspicious discharge records and time periods where gap-filling methods had been applied to the original records. The selected stations are spread across the whole African continent, with the highest density in Western and Southern Africa and the lowest density in Eastern Africa. They are representative of most of the climate zones of Africa according to the Köppen-Geiger climate classification. From this dataset, a large range of hydrological indices and flow signatures have been computed and made available to the scientific community (https://doi.org/10.23708/LXGXQ9).
They represent mean flow characteristics and extremes (low flows and floods) as well as catchment characteristics, making it possible to study the long-term evolution of hydrology in Africa and to support modelling efforts that aim to reduce the vulnerability of African countries to hydro-climatic variability.
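
The station inclusion criterion stated in this abstract can be expressed as a short check. The sketch below reflects one possible reading, in which a calendar year counts as "full" when less than 5% of its days are missing; the abstract does not spell out this detail, and the function names and data layout are hypothetical.

```python
from datetime import date, timedelta


def days_in_year(year):
    """Number of calendar days in a year (365 or 366)."""
    return (date(year + 1, 1, 1) - date(year, 1, 1)).days


def full_years(daily_flow, start_year=1950, end_year=2018, max_missing=0.05):
    """Years whose share of missing daily values is below `max_missing`.

    `daily_flow` maps datetime.date -> discharge; a value of None or an
    absent date counts as missing (assumed layout).
    """
    valid_days = {}
    for day, value in daily_flow.items():
        if value is not None and start_year <= day.year <= end_year:
            valid_days[day.year] = valid_days.get(day.year, 0) + 1
    return sorted(year for year, count in valid_days.items()
                  if 1.0 - count / days_in_year(year) < max_missing)


def meets_criterion(daily_flow, min_years=10, **kwargs):
    """True if the station has at least `min_years` full years of data."""
    return len(full_years(daily_flow, **kwargs)) >= min_years


def make_series(first_year, last_year, value=1.0):
    """Build a complete synthetic daily series for testing."""
    series, day = {}, date(first_year, 1, 1)
    while day.year <= last_year:
        series[day] = value
        day += timedelta(days=1)
    return series


# Hypothetical station: 1960-1970, with a 30-day gap in 1965.
station = make_series(1960, 1970)
for offset in range(30):
    station[date(1965, 6, 1) + timedelta(days=offset)] = None
```

With this synthetic record, 1965 is rejected (about 8% missing) and the ten remaining years are enough for the station to qualify.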

How to cite: Rouché, N., Tramblay, Y., Paturel, J.-E., Mahé, G., Boyer, J.-F., Amoussou, E., Bodian, A., Dacosta, H., Dakhlaoui, H., Dezetter, A., Hughes, D., Hanich, L., Peugeot, C., Tshimanga, R., and Lachassagne, P.: ADHI: The African Database of Hydrometric Indices (1950-2018), EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-9283, https://doi.org/10.5194/egusphere-egu21-9283, 2021.

Holger Virro, Giuseppe Amatulli, Alexander Kmoch, Longzhu Shen, and Evelyn Uuemaa

Recent advances in implementing machine learning (ML) methods in hydrology have given rise to a new, data-driven approach to hydrological modeling. Comparison of physically based and ML approaches has shown that ML methods can achieve a similar accuracy to the physically based ones and outperform them when describing nonlinear relationships. Global ML models have already been successfully applied to modeling hydrological phenomena such as discharge.

However, a major problem for large-scale water quality modeling has been the lack of available observation data with good spatiotemporal coverage. This has affected the reproducibility of previous studies and the potential improvement of existing models. In addition to the observation data itself, insufficient or poor-quality metadata has also discouraged researchers from integrating the already available datasets. Therefore, improving both the availability and quality of open water quality data would increase the potential to implement predictive modeling on a global scale.

We aim to address the aforementioned issues by presenting the new Global River Water Quality Archive (GRQA), which integrates data from five existing global and regional sources:

  • Canadian Environmental Sustainability Indicators program (CESI)
  • Global Freshwater Quality Database (GEMStat)
  • GLObal RIver Chemistry database (GLORICH)
  • European Environment Agency (Waterbase)
  • USGS Water Quality Portal (WQP)

The resulting dataset contains a total of over 14 million observations for 41 different forms of some of the most important water quality parameters, focusing on nutrients, carbon, oxygen and sediments. Supplementary metadata and statistics are provided with the observation time series to improve the usability of the dataset. We report on developing a harmonized schema and reproducible workflow that can be adapted to integrate and harmonize further data sources. We conclude our study with a call for action to extend this dataset and hope that the provided reproducible method of data integration and metadata provenance will serve as an example.
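
The idea of a harmonized schema can be made concrete with a minimal sketch. It does not reproduce the GRQA schema or workflow: the source names, field names, and unit table below are all hypothetical and only show the general pattern of mapping source-specific records onto one common record type with consistent units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """Common record type shared by all sources (hypothetical schema)."""
    source: str
    site_id: str
    date: str
    parameter: str
    value: float
    unit: str


# Hypothetical mapping from harmonized field names to source column names.
FIELD_MAPS = {
    "source_a": {"site_id": "station", "date": "sample_date",
                 "parameter": "param", "value": "result", "unit": "units"},
    "source_b": {"site_id": "SiteCode", "date": "Date",
                 "parameter": "Determinand", "value": "Value", "unit": "Unit"},
}

# Hypothetical unit table: raw unit -> (target unit, multiplication factor).
UNIT_FACTORS = {"mg/l": ("mg/l", 1.0), "ug/l": ("mg/l", 0.001)}


def harmonize(source, record):
    """Convert one source-specific record into the common schema."""
    fields = FIELD_MAPS[source]
    raw_unit = record[fields["unit"]].strip().lower()
    target_unit, factor = UNIT_FACTORS[raw_unit]
    return Observation(
        source=source,
        site_id=str(record[fields["site_id"]]),
        date=record[fields["date"]],
        parameter=record[fields["parameter"]].strip().upper(),
        value=float(record[fields["value"]]) * factor,
        unit=target_unit,
    )


# Two made-up records describing the same kind of measurement.
record_a = {"station": 17, "sample_date": "2001-05-03",
            "param": "tp", "result": "250", "units": "ug/L"}
record_b = {"SiteCode": "X9", "Date": "2001-05-04",
            "Determinand": "TP", "Value": 0.3, "Unit": "mg/l"}
harmonized = [harmonize("source_a", record_a), harmonize("source_b", record_b)]
```

Keeping the per-source knowledge in lookup tables means that adding a further data source mostly requires a new table entry rather than new code.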

How to cite: Virro, H., Amatulli, G., Kmoch, A., Shen, L., and Uuemaa, E.: GRQA: Global River Water Quality Archive, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-3865, https://doi.org/10.5194/egusphere-egu21-3865, 2021.

Bernhard Lehner, Achim Roth, Martin Huber, Mira Anand, Günther Grill, Nicole Osterkamp, Raphael Tubbesing, Leena Warmedinger, and Michele Thieme

Since its introduction in 2008, the HydroSHEDS database (www.hydrosheds.org) has transformed large-scale hydro-ecological research and applications worldwide by offering standardized spatial units for hydrological assessments. At its core, HydroSHEDS provides digital hydrographic information that can be applied in Geographic Information Systems (GIS) or hydrological models to delineate river networks and catchment boundaries at multiple scales, from local to global. Its various data layers form the basis for applications in a wide range of disciplines including environmental, conservation, socioeconomic, human health, and sustainability studies.

Version 1 of HydroSHEDS was derived from the digital elevation model of the Shuttle Radar Topography Mission (SRTM) at a pixel resolution of 3 arc-seconds (~90 meters at the equator). It was created using customized processing and optimization algorithms and a high degree of manual quality control. Results are available at varying resolutions, ranging from 3 arc-seconds (~90 m) to 5 arc-minutes (~10 km), and in nested sub-basin structures, making the data uniquely suitable for applications at multiple scales. A suite of related data collections and value-added information, foremost the HydroATLAS compilation of over 50 hydro-environmental attributes for every river reach and sub-basin, continuously enhances the versatility of the HydroSHEDS family of products. Yet version 1 of HydroSHEDS shows some important limitations. In particular, coverage above 60° northern latitude (i.e., largely the Arctic) is missing for the 3 arc-second product and is of low quality for coarser products because no SRTM elevation data are available for this region. Also, some areas are affected by inherent data gaps or other errors that could not be fully resolved at the time of creating version 1 of HydroSHEDS.
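
The approximate ground distances quoted for the angular resolutions above follow from simple spherical arithmetic, sketched below. The function name is hypothetical, and the calculation treats the Earth as a sphere with the WGS84 equatorial radius, which is an approximation. The optional latitude argument shows how the east-west width of a cell shrinks towards the poles.

```python
from math import cos, pi, radians

EARTH_EQUATORIAL_RADIUS_M = 6378137.0  # WGS84 equatorial radius


def cell_width_m(arcseconds, latitude_deg=0.0):
    """East-west ground width of a grid cell of the given angular size.

    Spherical approximation; at the equator the north-south extent is
    roughly the same as the east-west width.
    """
    angle_rad = arcseconds / 3600.0 * pi / 180.0
    return angle_rad * EARTH_EQUATORIAL_RADIUS_M * cos(radians(latitude_deg))


width_3_arcsec = cell_width_m(3)              # roughly 90 m at the equator
width_5_arcmin = cell_width_m(5 * 60)         # roughly 10 km at the equator
width_3_arcsec_60n = cell_width_m(3, 60.0)    # about half the equatorial width
```

The halving of the cell width at 60° latitude is one reason why nominal arc-second resolutions are usually quoted "at the equator".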

Today, the TanDEM-X dataset (TerraSAR-X add-on for Digital Elevation Measurement), created in partnership between the German Aerospace Center (DLR) and Airbus, offers a new digital elevation model covering the entire global land surface including northern latitudes. In a collaborative project, this dataset is used to extract HydroSHEDS v2.0, following the same basic specifications as version 1. DLR is processing the original 12 m resolution TanDEM-X data to create a hydrologically pre-conditioned version at 3 arc-second resolution. In this step, corrections with high-resolution vegetation and settlement maps are applied to reduce distortions caused by vegetation cover and built-up areas. Following this preprocessing, refined hydrological optimization and correction algorithms are used to derive the drainage pathways, including improved ‘stream-burning’ techniques that incorporate recent data products such as high-resolution terrestrial open water masks and improved tracing of drainage pathways as center lines in global lake and river maps. The resulting HydroSHEDS v2.0 database will provide river networks and catchment boundaries at full global coverage. Release of the data under a free license is scheduled for 2022, with regions above 60° northern latitude being completed first in 2021.

How to cite: Lehner, B., Roth, A., Huber, M., Anand, M., Grill, G., Osterkamp, N., Tubbesing, R., Warmedinger, L., and Thieme, M.: HydroSHEDS v2.0 – Refined global river network and catchment delineations from TanDEM-X elevation data, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-9277, https://doi.org/10.5194/egusphere-egu21-9277, 2021.