EGU23-13223, updated on 03 Jan 2024
https://doi.org/10.5194/egusphere-egu23-13223
EGU General Assembly 2023
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Exploiting Curated, Domain-Specific Repositories to Facilitate Globally Interoperable Databases: the GEOROC Use-Case for Global Geochemical Data

Marthe Klöcking1, Adrian Sturm2, Bärbel Sarbas1, Leander Kallas1, Stefan Möller-McNett1, Jens Nieschulze3, Kerstin Lehnert4, Kirsten Elger5, Wolfram Horstmann2, Daniel Kurzawe2, Matthias Willbold1, and Gerhard Wörner1
  • 1Geoscience Centre Göttingen, University of Göttingen, Göttingen, Germany (marthe.kloecking@uni-goettingen.de)
  • 2Göttingen State and University Library, Göttingen, Germany
  • 3eResearch Alliance, University of Göttingen, Göttingen, Germany
  • 4Lamont-Doherty Earth Observatory, Columbia University, Palisades, USA
  • 5GFZ German Research Centre for Geosciences, Potsdam, Germany

The GEOROC database is a leading open-access source of geochemical and isotopic datasets of igneous and metamorphic rocks and minerals. It was established 24 years ago and currently provides access to curated compilations of rock and mineral compositions from >20,600 publications (>32 million single data values). The Digital Geochemical Data Infrastructure (DIGIS) initiative for GEOROC 2.0 is now building a connected platform capable of supporting the diverse demands of digital, data-based geochemical research, including modern solutions for data submission, discovery and access.

One of the challenges of maintaining a high-quality, up-to-date database such as GEOROC is consistent data entry. Historically, data were compiled manually from the academic literature by trained curators. This manual data entry process is slow, resource-intensive and prone to errors. The quality and completeness of data and metadata compiled in this way are highly variable, a problem exacerbated by the lack of best practices or standards for analytical geochemical data reporting. Domain-specific repositories offer a possible solution to this challenge: driven in part by demands of some funders and publishers to make all research data publicly available, data producers increasingly publish their research datasets, affording repositories a unique opportunity to impose consistent standards and quality. Following these developments, DIGIS established a domain repository with DOI minting capabilities in 2021 to support independent data submission by authors. In principle, these data submissions may comprise new analytical results as well as compilations of previously published data (“expert datasets”). DIGIS also uses its repository to version the GEOROC data compilations and to provide distinct, citable objects to researchers who use GEOROC compilations in their work: the so-called “precompiled files”, a collection of pre-formatted results of the most popular search queries to the GEOROC database, are regularly updated and re-published. However, whilst all data submissions by authors are required to fall within the scope of the GEOROC database, new analytical data need to meet additional quality requirements: the repository enforces a strict template to ensure consistent reporting of all relevant sample and method/analysis metadata.
Completed templates can then be harvested automatically from the repository directly into the GEOROC database, with the added guarantee that new data entries a) are approved by the owners of the datasets and b) follow a consistent data reporting and quality standard.
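
As a rough illustration of how a strict template can gate automated harvesting, the following minimal Python sketch checks each submitted row for required metadata before it is passed on. The field names in REQUIRED_FIELDS and the function validate_submission are hypothetical and chosen only for this example; the actual DIGIS template and harvesting workflow are not specified in this abstract.

```python
from dataclasses import dataclass, field

# Hypothetical required metadata columns, for illustration only; the real
# DIGIS template is not described in this abstract.
REQUIRED_FIELDS = ("sample_name", "material", "method", "laboratory")


@dataclass
class ValidationResult:
    accepted: list = field(default_factory=list)   # rows ready for harvesting
    rejected: list = field(default_factory=list)   # (row_index, missing_fields)


def validate_submission(rows):
    """Split template rows into complete rows and rows with missing metadata."""
    result = ValidationResult()
    for index, row in enumerate(rows):
        missing = []
        for name in REQUIRED_FIELDS:
            value = row.get(name)
            # Treat absent, None, and whitespace-only values as missing.
            if value is None or not str(value).strip():
                missing.append(name)
        if missing:
            result.rejected.append((index, missing))
        else:
            result.accepted.append(row)
    return result
```

In such a scheme, only the accepted rows would be harvested, while the rejected rows, together with the names of their missing fields, could be returned to the submitting author for completion.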

To encourage user uptake of both the repository and the compilations available in the GEOROC database, DIGIS is working closely with IEDA2 and EarthChem towards developing a common infrastructure for geochemical data. One goal of this collaboration is a single repository submission platform that applies the same data and metadata quality requirements to all submitted datasets. In addition, DIGIS has partnered with GFZ Data Services as their trusted domain repository. Finally, through the OneGeochemistry initiative, all three partners are working towards global, community-endorsed best practices for geochemical data publication. Ultimately, these efforts will facilitate greater interoperability between globally distributed geochemical data systems, enabling more user-friendly delivery of data publication and compilation services to the research community.

How to cite: Klöcking, M., Sturm, A., Sarbas, B., Kallas, L., Möller-McNett, S., Nieschulze, J., Lehnert, K., Elger, K., Horstmann, W., Kurzawe, D., Willbold, M., and Wörner, G.: Exploiting Curated, Domain-Specific Repositories to Facilitate Globally Interoperable Databases: the GEOROC Use-Case for Global Geochemical Data, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-13223, https://doi.org/10.5194/egusphere-egu23-13223, 2023.