- ¹GFZ Helmholtz Centre for Geosciences, Potsdam, Germany (kirsten.elger@gfz.de)
- ²University of Applied Sciences Potsdam, Potsdam, Germany
Reusable data and their associated metadata are increasingly in demand to address global societal challenges. Research data repositories and databases are the primary access points for geoscience data, and domain repositories in particular are known to publish well-documented and reusable data. This is due to thorough data and metadata curation by repository staff, who usually include domain scientists. Overall, the documented publication of a complex dataset via a domain repository often takes time and additional preparation by the scientists, but the results clearly show a significant increase in metadata and data quality, including the provision of cross-references to other publications, datasets, code and originating physical samples.
The largest challenge for domain repositories is to offer researchers incentives that reduce their workload and at the same time ensure high-quality metadata and data documentation at an early stage of a planned data publication. This challenge is especially pronounced in repositories that focus on the highly variable and usually small data from so-called “long-tail communities”. GFZ Data Services is a domain repository for DOI-referenced geoscience data and scientific software, hosted at the GFZ Helmholtz Centre for Geosciences. The repository both focuses on the curation of long-tail data and offers data publication services for international projects and services in the geosciences. To support researchers in providing descriptive metadata and to receive structured data documentation, GFZ Data Services has developed an online metadata editor and data description templates. This presentation will focus on these support tools and demonstrate how both help researchers and at the same time reduce the data curation workload.
A major focus will be on our new metadata editor, which is currently being developed jointly by the University of Applied Sciences Potsdam and GFZ Data Services. The new metadata editor will give users better support during data entry, so that the manual curation effort at GFZ Data Services is reduced and the metadata quality is improved at the same time. Technically, it has a responsive design and offers a dark mode. New features include the ability to retrieve specific information, e.g., affiliations from the ROR API, via a dropdown menu. Keywords are made uniquely identifiable by automatically storing the scheme names and uniform resource identifiers of the specific terms. All integrated thesauri can be updated via API calls. Real-time validation of the input fields prevents the submission of incomplete or incorrect entries, so that significantly less work is required in data curation. The integrated help guide assists users in filling in the input fields.
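The keyword handling and input validation described above can be illustrated with a minimal Python sketch. It is not the editor's actual implementation: the `Keyword` class, its field names, the `validate_record` function and the `example.org` URIs are all hypothetical and chosen only for illustration. The sketch assumes that each keyword is stored together with its scheme name, a scheme URI and a term URI, and that a record is rejected while required fields are missing or malformed.

```python
from dataclasses import dataclass
from typing import List
from urllib.parse import urlparse


@dataclass(frozen=True)
class Keyword:
    """Hypothetical keyword record: a term plus the information that identifies it."""
    label: str       # human-readable term shown to the user
    scheme: str      # name of the thesaurus the term comes from
    scheme_uri: str  # URI of the thesaurus itself
    value_uri: str   # URI of the specific term


def is_http_uri(value: str) -> bool:
    """Return True if value looks like an absolute http(s) URI."""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def validate_record(title: str, keywords: List[Keyword]) -> List[str]:
    """Collect validation errors; an empty list means the record may be submitted."""
    errors = []
    if not title.strip():
        errors.append("title is required")
    if not keywords:
        errors.append("at least one keyword is required")
    for kw in keywords:
        if not kw.label.strip():
            errors.append("keyword label is empty")
        if not is_http_uri(kw.scheme_uri):
            errors.append(f"keyword '{kw.label}': scheme URI is not a valid http(s) URI")
        if not is_http_uri(kw.value_uri):
            errors.append(f"keyword '{kw.label}': term URI is not a valid http(s) URI")
    return errors


# Example with a placeholder thesaurus (example.org is not a real vocabulary).
keyword = Keyword(
    label="sea ice",
    scheme="Example Thesaurus",
    scheme_uri="https://example.org/thesaurus",
    value_uri="https://example.org/thesaurus/sea-ice",
)
print(validate_record("My dataset", [keyword]))
```

Returning a list of messages rather than raising on the first problem mirrors the idea of field-level feedback: every incomplete or incorrect entry can be reported to the user before submission.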
The data description templates collect additional technical descriptions in a structured form and are essential for data reuse. They are available in “commented” and “usable” versions. They ensure that the descriptions meet our requirements (for many researchers, data documentation is new), offer clear instructions, and also reduce the workload of the curators, because the submitted descriptions already have a high level of detail.
How to cite: Meistring, M., Ehrmann, H., Franz, J., Frenzel, S., Mohammed, A., and Elger, K.: How domain repositories support reusable data: metadata tools from GFZ Data Services, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-20132, https://doi.org/10.5194/egusphere-egu25-20132, 2025.