EGU21-14335, updated on 04 Mar 2021
https://doi.org/10.5194/egusphere-egu21-14335
EGU General Assembly 2021
© Author(s) 2021. This work is distributed under
the Creative Commons Attribution 4.0 License.

LamaH: Large-sample Data for Hydrology in Central Europe

Christoph Klingler¹, Mathew Herrnegger¹, Frederik Kratzert², and Karsten Schulz¹
  • ¹ Institute for Hydrology and Water Management (HyWa), University of Natural Resources and Life Sciences, Vienna, Austria (christoph.klingler@boku.ac.at)
  • ² ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University, Linz, Austria

Open large-sample datasets are important for several reasons: i) they enable large-sample analyses, ii) they democratize access to data, iii) they enable large-sample comparative studies and foster reproducibility, and iv) they are a key driver of recent developments in machine-learning-based modelling approaches.

Various large-sample datasets have been released recently (e.g. several country-specific CAMELS datasets); however, all of them contain only data for individual catchments distributed across entire countries, not for connected river networks.

Here, we present LamaH, a new dataset covering all of Austria and the foreign upstream areas of the Danube, spanning a total of 170,000 km² in nine countries, with discharge observations for 882 gauges. The dataset also includes 15 meteorological time series, derived from ERA5-Land, for two different basin delineations: the first corresponds to the entire upstream area of a particular gauge, and the second corresponds only to the area between a particular gauge and its upstream gauges. The time series, both meteorological and discharge, are provided at hourly and daily resolution and cover a period of over 35 years (with some exceptions in the discharge data for a few gauges).
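The relation between the two basin delineations can be illustrated with a minimal sketch. It assumes a toy network in which each gauge points to its next downstream gauge and the upstream basins are nested, so that the area between a gauge and its directly upstream gauges equals its total upstream area minus the total upstream areas of those gauges. The gauge IDs, areas, and function names are hypothetical and are not taken from LamaH.

```python
from collections import defaultdict

# Hypothetical toy network: gauge id -> next downstream gauge id (None = outlet).
downstream = {"A": "C", "B": "C", "C": "D", "D": None}

# Hypothetical total upstream areas in km² (first delineation: entire upstream area).
total_area = {"A": 120.0, "B": 80.0, "C": 350.0, "D": 500.0}


def direct_upstream(downstream):
    """Invert the downstream mapping: gauge -> list of directly upstream gauges."""
    upstream = defaultdict(list)
    for gauge, nxt in downstream.items():
        if nxt is not None:
            upstream[nxt].append(gauge)
    return upstream


def intermediate_areas(downstream, total_area):
    """Area between each gauge and its directly upstream gauges (second delineation).

    Assumes nested, non-overlapping upstream basins.
    """
    upstream = direct_upstream(downstream)
    return {
        gauge: area - sum(total_area[u] for u in upstream.get(gauge, []))
        for gauge, area in total_area.items()
    }


inter = intermediate_areas(downstream, total_area)
print(inter)
```

In this toy case the intermediate areas add up to the total upstream area of the outlet gauge, which is a simple consistency check for any such partition of a connected river network.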

Following the CAMELS datasets closely, LamaH also contains more than 60 catchment attributes, derived for both types of basin delineation. The attributes include climatic, hydrological and vegetation indices, land cover information, as well as soil, geological and topographical properties. Additionally, the runoff gauges are characterized by over 20 attributes, including information about human impact and indicators of data quality and completeness. Lastly, LamaH contains attributes for the river network itself, such as gauge topology, stream length and the slope between two sequential gauges.
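As an illustration of a river-network attribute, the slope between two sequential gauges can be sketched as the elevation difference divided by the stream length between them. This is an assumed, common definition of mean channel slope; the abstract does not state how LamaH computes it, and the function name and values below are hypothetical.

```python
def mean_slope(elev_up_m, elev_down_m, stream_length_m):
    """Mean channel slope (m/m) between a gauge and the next downstream gauge.

    Assumed definition: elevation drop divided by stream length.
    """
    if stream_length_m <= 0:
        raise ValueError("stream length must be positive")
    return (elev_up_m - elev_down_m) / stream_length_m


# Hypothetical gauge elevations (m) and stream length (m).
slope = mean_slope(elev_up_m=620.0, elev_down_m=540.0, stream_length_m=16_000.0)
print(f"{slope:.4f} m/m")
```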

Given the scope of LamaH, we hope that this dataset will serve as a solid database for further investigations across a range of hydrological tasks. Combined with appropriate modelling methods, the extent of the data, the interconnected river network and the high temporal resolution of the time series might yield deeper insights into water transfer and storage.

How to cite: Klingler, C., Herrnegger, M., Kratzert, F., and Schulz, K.: LamaH: Large-sample Data for Hydrology in Central Europe, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-14335, https://doi.org/10.5194/egusphere-egu21-14335, 2021.