EGU23-15367
https://doi.org/10.5194/egusphere-egu23-15367
EGU General Assembly 2023
© Author(s) 2023. This work is distributed under
the Creative Commons Attribution 4.0 License.

CAT4KIT: A cross-institutional data catalog framework for the FAIRification of environmental research data

Christof Lorenz1, Mostafa Hadizadeh1, Sabine Barthlott2, Romy Fösig3, Uğur Çayoğlu4, Robert Ulrich5, and Felix Bach6
  • 1Karlsruhe Institute of Technology (KIT), Institute of Meteorology and Climate Research - Atmospheric Environmental Research (IMK-IFU), Garmisch-Partenkirchen, Germany
  • 2Karlsruhe Institute of Technology (KIT), Institute of Meteorology and Climate Research - Atmospheric Trace Gases and Remote Sensing (IMK-ASF), Karlsruhe, Germany
  • 3Karlsruhe Institute of Technology (KIT), Institute of Meteorology and Climate Research - Atmospheric Aerosol Research (IMK-AAF), Karlsruhe, Germany
  • 4Karlsruhe Institute of Technology (KIT), Steinbuch Centre for Computing (SCC), Karlsruhe, Germany
  • 5Karlsruhe Institute of Technology (KIT), KIT Library (BIB), Karlsruhe, Germany
  • 6Leibniz Institute for Information Infrastructure (FIZ), Karlsruhe, Germany

A contemporary and flexible Research Data Management (RDM) framework is required to make environmental research data Findable, Accessible, Interoperable, and Reusable (FAIR) and thus provide the foundation for open and reproducible Earth system sciences. While datasets that accompany scientific articles are typically published via large data repositories like Pangaea or Zenodo, intermediate, day-to-day, or actively used data (e.g., data from research projects or prototypical data) are still exchanged via simple cloud storage services and email. And although the FAIR principles require data to be openly findable and accessible, such data are often available only within closed and restricted infrastructures and on local file systems.

Our research project Cat4KIT therefore aims to develop a cross-institutional catalog and RDM framework for the FAIRification of such day-to-day research data. The framework comprises four modules (services) for

  • providing access to data on storage systems through well-defined and standardized interfaces;

  • harvesting and transforming (meta)data into standardized formats;

  • making (meta)data accessible to the public using well-defined and standardized catalog services and interfaces; and

  • enabling users to search, filter, and explore data from decentralized research data infrastructures.
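The last of these services, searching and filtering, can be illustrated with a minimal Python sketch. It is not the Cat4KIT implementation: the record layout (`id`, `bbox`, `start`, `end`), the function names, and the two sample records are assumptions made purely for illustration. It only shows the kind of spatial and temporal filtering a catalog search typically performs.

```python
from datetime import datetime, timezone


def bbox_intersects(a, b):
    """Return True if two [west, south, east, north] boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(records, bbox=None, start=None, end=None):
    """Filter catalog records by bounding box and time interval.

    Each record is a dict with the hypothetical keys
    'id', 'bbox', 'start', and 'end'. Filters set to None are skipped.
    """
    hits = []
    for rec in records:
        if bbox is not None and not bbox_intersects(rec["bbox"], bbox):
            continue
        # Keep records whose time span overlaps the requested interval.
        if start is not None and rec["end"] < start:
            continue
        if end is not None and rec["start"] > end:
            continue
        hits.append(rec)
    return hits


UTC = timezone.utc

# Made-up example records: one model output, one in-situ station.
records = [
    {
        "id": "model-run-a",
        "bbox": [5.0, 45.0, 15.0, 55.0],
        "start": datetime(2000, 1, 1, tzinfo=UTC),
        "end": datetime(2010, 12, 31, tzinfo=UTC),
    },
    {
        "id": "station-b",
        "bbox": [11.0, 47.4, 11.2, 47.6],
        "start": datetime(2022, 1, 1, tzinfo=UTC),
        "end": datetime(2022, 12, 31, tzinfo=UTC),
    },
]

hits = search(
    records,
    bbox=[11.0, 47.0, 12.0, 48.0],
    start=datetime(2022, 6, 1, tzinfo=UTC),
)
print([rec["id"] for rec in hits])
```

Both sample records overlap the requested area, but only the station record overlaps the requested time window, so it is the only hit.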

We develop, implement, and evaluate each of these four modules within an inter-institutional consortium consisting of scientists, software developers, and potential end users. This allows us to include a wide range of research data, from multi-dimensional climate model outputs to high-frequency in-situ measurements. We emphasize the application of existing open-source solutions and community standards for data interfaces (THREDDS, STA, S3), (meta)data schemes, and catalog services (SpatioTemporal Asset Catalog, STAC) in order to ensure an easy integration of research data into the Cat4KIT framework and a straightforward extension to further research data infrastructures.
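To make the role of STAC more concrete, the following Python sketch builds a minimal STAC Item from a few harvested metadata values. It is an illustration under stated assumptions, not the project's harvester: the function name `to_stac_item`, the example URL, the identifier, and the media type string are hypothetical, and only the top-level fields of a STAC 1.0.0 Item are filled in.

```python
import json
from datetime import datetime, timezone


def to_stac_item(item_id, bbox, when, href, media_type):
    """Build a minimal STAC 1.0.0 Item as a plain dict.

    bbox is [west, south, east, north] in degrees; 'when' must be a
    timezone-aware datetime. All argument names are illustrative.
    """
    west, south, east, north = bbox
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "bbox": [west, south, east, north],
        # Footprint as a closed GeoJSON polygon ring derived from the bbox.
        "geometry": {
            "type": "Polygon",
            "coordinates": [[
                [west, south],
                [east, south],
                [east, north],
                [west, north],
                [west, south],
            ]],
        },
        "properties": {
            # STAC expects an RFC 3339 timestamp; UTC is used here.
            "datetime": when.astimezone(timezone.utc).strftime(
                "%Y-%m-%dT%H:%M:%SZ"
            ),
        },
        "links": [],
        "assets": {"data": {"href": href, "type": media_type}},
    }


item = to_stac_item(
    item_id="demo-t2m",  # hypothetical identifier
    bbox=[5.0, 45.0, 15.0, 55.0],
    when=datetime(2023, 4, 24, tzinfo=timezone.utc),
    href="https://example.org/thredds/fileServer/demo/t2m.nc",  # placeholder URL
    media_type="application/x-netcdf",  # assumed media type for NetCDF
)
print(json.dumps(item, indent=2))
```

Because the result is plain JSON, an item like this could be served by any STAC-compliant catalog service; a real harvester would additionally populate links, collection membership, and domain-specific properties.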

In our presentation, we demonstrate the current status of the Cat4KIT framework as an inter-institutional research data management and catalog platform for the FAIRification of day-to-day research data.

How to cite: Lorenz, C., Hadizadeh, M., Barthlott, S., Fösig, R., Çayoğlu, U., Ulrich, R., and Bach, F.: CAT4KIT: A cross-institutional data catalog framework for the FAIRification of environmental research data, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-15367, https://doi.org/10.5194/egusphere-egu23-15367, 2023.
