EGU23-14515
https://doi.org/10.5194/egusphere-egu23-14515
EGU General Assembly 2023
© Author(s) 2023. This work is distributed under
the Creative Commons Attribution 4.0 License.

Intaking DKRZ ESM data collections

Fabian Wachsmann
  • Deutsches Klimarechenzentrum, Datenmanagement, Hamburg, Germany (wachsmann@dkrz.de)

This showcase presents how Intake and its plugin Intake-ESM are used at DKRZ to provide highly FAIR data collections from different projects, stored on different types of storage and in different formats.

The Intake plugin Intake-ESM allows users not only to find the data of interest but also to load them as Xarray datasets that are close to analysis-ready. We use this tool to give users access to many of the data collections available at our institution through a single access point, the main DKRZ Intake catalog at www.dkrz.de/s/intake. The package works independently of data standards and formats and therefore enables fully metadata-driven data access, including data processing. Intake-ESM catalogs increase the FAIRness of the data collections in all aspects, but especially in terms of Accessibility and Interoperability.
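The find-and-load workflow described above can be sketched as follows. This is a minimal illustration, not DKRZ's documented recipe: it assumes the third-party packages intake, intake-esm and xarray are installed, that the access point is reachable over `https://`, and that the chosen collection uses CMIP6-style column names such as `variable_id`. The helper names `build_query` and `load_datasets` and the facet values are hypothetical.

```python
# Sketch only: needs the third-party packages intake, intake-esm and xarray,
# plus network access to DKRZ. Nothing here touches the network on import.

# Access point named in the abstract; the https:// scheme is an assumption.
DKRZ_CATALOG_URL = "https://www.dkrz.de/s/intake"


def build_query(**facets):
    """Keep only the facets that were actually set (value is not None)."""
    return {name: value for name, value in facets.items() if value is not None}


def load_datasets(entry_name, query):
    """Open the main catalog, search one collection, and load the matches."""
    import intake  # imported lazily so build_query works without intake

    main_catalog = intake.open_catalog(DKRZ_CATALOG_URL)
    collection = main_catalog[entry_name]   # an Intake-ESM collection
    matches = collection.search(**query)    # filter by metadata columns
    return matches.to_dataset_dict()        # dict of xarray.Dataset objects


# Hypothetical CMIP6-style facet values; column names depend on the collection.
example_query = build_query(
    variable_id="tas",
    experiment_id="historical",
    table_id=None,  # unset facets are dropped from the query
)
```

Calling `load_datasets` with the name of a collection listed in the main catalog and `example_query` would then return the matching data as Xarray datasets, provided the collection really exposes these columns.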

Having started with a collection for DKRZ’s CMIP6 Data Pool, DKRZ now hosts catalogs for more than 10 PB of data on different local storage systems. The Intake-ESM package is well integrated into ESM data provisioning workflows:

  • Early sharing and accessibility: The co-developed in-house ICON model generates an Intake-ESM catalog on each run.
  • Uptake by other technologies: Intake-ESM catalogs serve, for example, as templates for the more advanced DKRZ STAC catalogs.
  • Accessibility across all storage types: The tools used for writing data to the local institutional cloud allow users to create Intake-ESM catalogs for the written data.
  • Data archiving: Catalogs for projects in the archive can be created from the archive's metadata database.
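The catalog generation mentioned in the list above can be illustrated with a small standard-library sketch. It is not the DKRZ tooling: the facet names, the file-name pattern `project_experiment_variable.nc` and the helper names are hypothetical, and the descriptor keys are assumed to follow the ESM catalog specification used by Intake-ESM (a JSON descriptor pointing to a CSV table of assets) and should be checked against that specification.

```python
import csv
import json
import tempfile
from pathlib import Path

# Hypothetical facets, assumed to be encoded in each file name.
FACETS = ["project", "experiment", "variable"]


def parse_facets(path):
    """Derive one catalog row from a name like project_experiment_variable.nc."""
    parts = Path(path).stem.split("_")
    if len(parts) != len(FACETS):
        raise ValueError(f"unexpected file name: {path}")
    row = dict(zip(FACETS, parts))
    row["path"] = str(path)
    return row


def write_catalog(paths, out_dir, catalog_id):
    """Write <catalog_id>.csv and <catalog_id>.json; return the JSON path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    csv_path = out_dir / f"{catalog_id}.csv"
    json_path = out_dir / f"{catalog_id}.json"

    # One CSV row per asset: facet columns plus the location of the file.
    with csv_path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FACETS + ["path"])
        writer.writeheader()
        writer.writerows(parse_facets(p) for p in paths)

    # Descriptor keys assumed from the ESM catalog specification.
    descriptor = {
        "esmcat_version": "0.1.0",
        "id": catalog_id,
        "description": "Illustrative catalog built from file names",
        "catalog_file": csv_path.name,
        "attributes": [{"column_name": name} for name in FACETS],
        "assets": {"column_name": "path", "format": "netcdf"},
    }
    json_path.write_text(json.dumps(descriptor, indent=2))
    return json_path


def demo():
    """Build a two-entry catalog in a temporary directory and read it back."""
    paths = ["/data/icon_run1_tas.nc", "/data/icon_run1_pr.nc"]  # made-up paths
    with tempfile.TemporaryDirectory() as tmp:
        json_path = write_catalog(paths, tmp, "demo")
        descriptor = json.loads(json_path.read_text())
        with (Path(tmp) / descriptor["catalog_file"]).open(newline="") as handle:
            rows = list(csv.DictReader(handle))
    return descriptor, rows
```

Deriving the rows from a metadata database instead of file names, as in the archiving case, would only change `parse_facets`; the descriptor and CSV layout stay the same.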

In the future, we plan to make use of new functionality such as support for kerchunked data and the derived variable registry.

The DKRZ data management team develops and maintains local services around intake-esm for a positive user experience. In this showcase, we will present excerpts of seminars, workflows and integrations.

How to cite: Wachsmann, F.: Intaking DKRZ ESM data collections, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-14515, https://doi.org/10.5194/egusphere-egu23-14515, 2023.
