EGU23-5047
https://doi.org/10.5194/egusphere-egu23-5047
EGU General Assembly 2023
© Author(s) 2023. This work is distributed under
the Creative Commons Attribution 4.0 License.

WDCC - Improvement of FAIRness of an established repository

Eileen Hertwig, Andrea Lammert, Heinke Höck, Andrej Fast, and Hannes Thiemann
  • Deutsches Klimarechenzentrum (DKRZ), Data Management, Germany (hertwig@dkrz.de)

The World Data Center for Climate (WDCC) provides access to and offers long-term archiving for datasets relevant for climate and Earth System research in a highly standardized manner following the FAIR principles. The focus is on climate simulation data. The WDCC services are aimed at both scientists who produce data (e.g. to fulfill the guidelines of good scientific practice) and scientists who re-use published data for new research.

The WDCC is hosted by the German Climate Computing Center (DKRZ) in Hamburg, Germany. The repository has been an accredited regular member of the World Data System (WDS) since 2003. WDCC is certified as a Trustworthy Data Repository by CoreTrustSeal (https://www.coretrustseal.org).

The WDCC was actively involved in the development of mechanisms to publish scientific datasets as citable entities. The first DataCite DOI ever assigned to a dataset was for a WDCC dataset in 2004 (http://dx.doi.org/10.1594/WDCC/EH4_OPYC_SRES_A2). Since then, it has been possible to publish dataset collections in WDCC with a DOI. In 2022, in compliance with the FAIR principles, the WDCC also implemented the assignment of persistent identifiers (PIDs) for individual datasets. A PID is a long-lasting reference to a dataset (or other digital object) that is designed to always provide access to the object or to a representation of it, even if the actual URLs of the objects change over time.
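
The separation between an identifier and the current location of an object can be illustrated with a minimal Python sketch. It assumes only that DOIs are resolved through the public https://doi.org/ resolver; the function name doi_to_url and the list of accepted prefixes are hypothetical choices made for this illustration and are not part of any WDCC service.

```python
from urllib.parse import quote

# Public DOI resolver; the landing page it redirects to may change,
# while the DOI itself stays the same.
DOI_RESOLVER = "https://doi.org/"

# Hypothetical list of spellings under which a DOI is commonly written.
KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi: str) -> str:
    """Normalize a DOI (bare, 'doi:'-prefixed, or URL form) to a resolver URL."""
    doi = doi.strip()
    for prefix in KNOWN_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    # Keep '/' unescaped because it separates the DOI prefix and suffix.
    return DOI_RESOLVER + quote(doi, safe="/")


first_wdcc_doi = doi_to_url("http://dx.doi.org/10.1594/WDCC/EH4_OPYC_SRES_A2")
```

A citation that stores only the DOI remains valid if the repository reorganizes its landing pages, because only the resolver entry has to be updated.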

To meet users' needs, it is essential to ensure high data quality, which means making sure that datasets in the repository are really Findable, Accessible, Interoperable, and Reusable (FAIR). The FAIRness of the WDCC has been systematically assessed by Peters-von Gehlen et al. (2022). Furthermore, to monitor the development of FAIRness in WDCC, an F-UJI test is performed for all new dataset collections that are assigned a DOI.

Datasets are easier for users to find when the corresponding metadata are machine-readable and use a standardized vocabulary. The WDCC has implemented the schema.org standard: machine-actionable metadata in JSON-LD format are embedded in the landing pages of WDCC data publications. These embedded structured metadata enhance interoperability across data catalogs and make the data more discoverable.
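
How a harvester might read such embedded metadata can be sketched in Python using only the standard library. The landing page below is a made-up stand-in, and its field values (dataset name, placeholder identifier) are illustrative assumptions rather than actual WDCC records; the class name JsonLdExtractor is likewise hypothetical. The sketch relies only on the common convention of placing JSON-LD in a script element of type application/ld+json.

```python
import json
from html.parser import HTMLParser

# Illustrative schema.org record; all values are placeholders.
example_record = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example climate simulation dataset",
    "identifier": "https://doi.org/10.xxxx/example",
    "license": "https://creativecommons.org/licenses/by/4.0/",
}

# Stand-in landing page with the record embedded as JSON-LD.
landing_page = (
    "<html><head>"
    '<script type="application/ld+json">'
    + json.dumps(example_record)
    + "</script></head><body><h1>Example dataset</h1></body></html>"
)


class JsonLdExtractor(HTMLParser):
    """Collect every JSON-LD block found in an HTML document."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.records = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.records.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


extractor = JsonLdExtractor()
extractor.feed(landing_page)
datasets = [r for r in extractor.records if r.get("@type") == "Dataset"]
```

Because the record uses the shared schema.org vocabulary, a catalog can interpret fields such as name, identifier, and license without repository-specific parsing rules.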

WDCC actively participated in the AtMoDat project (https://www.atmodat.de/) and has started to publish datasets following the ATMODAT standard and with the EASYDAB label. The ATMODAT standard specifies requirements for rich metadata with controlled vocabularies, structured landing pages (human- and machine-readable), and the format and structure of the data files.

References:

Peters-von Gehlen, K., Höck, H., Fast, A., Heydebreck, D., Lammert, A. and Thiemann, H., 2022. Recommendations for Discipline-Specific FAIRness Evaluation Derived from Applying an Ensemble of Evaluation Tools. Data Science Journal, 21(1), p.7. DOI: http://doi.org/10.5334/dsj-2022-007

How to cite: Hertwig, E., Lammert, A., Höck, H., Fast, A., and Thiemann, H.: WDCC - Improvement of FAIRness of an established repository, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-5047, https://doi.org/10.5194/egusphere-egu23-5047, 2023.