EGU23-13473
https://doi.org/10.5194/egusphere-egu23-13473
EGU General Assembly 2023
© Author(s) 2023. This work is distributed under
the Creative Commons Attribution 4.0 License.

Connecting the Long Tail: sharing and describing heterogeneous data via common metadata standards

Otto Lange1, Laurens Samshuijzen2, Kirsten Elger3, Simone Frenzel3, Ronald Pijnenburg2, Richard Wessels2, Geertje ter Maat2, and Martyn Drury2
  • 1University Library, Utrecht University, Utrecht, Netherlands
  • 2Faculty of Geosciences, Utrecht University, Utrecht, Netherlands
  • 3Library and Information Services, GFZ German Research Centre for Geosciences, Potsdam, Germany

The EPOS Multi-scale Laboratories (MSL) community includes a wide range of world-class solid Earth science laboratory infrastructures and, as such, provides a multidisciplinary and coherent platform for both virtual access to data and physical access to sophisticated research equipment. The MSL laboratories provide facilities for highly specialized experimental research, which yields the experimental and analytical data underlying publications about phenomena ranging from the molecular to the continental scale.

From the perspective of the intended FAIRness of these laboratory data, the challenge for the MSL community has been to develop a data management paradigm that, on the one hand, acknowledges the uniqueness of many of the data collections involved and, on the other hand, maximizes their findability by disseminating metadata via common standards into larger cross-disciplinary communities. Furthermore, besides provenance information about the data themselves, harmonized information about research groups and experimental assets is increasingly important for feeding the network relations that may help in making sense of scientific impact.

As part of the MSL Data Publication Chain, the MSL community has developed a standardized workflow that allows easy metadata exchange based on common formats (e.g., flavors of DCAT-AP, DataCite 4.x, and ISO 19115), while also integrating dedicated ontologies that give access to the rich specialized terminology of the MSL subdomains (e.g., analogue modelling, paleomagnetism, rock physics, geochemistry). Community-developed controlled vocabularies act as the binding agent between data, equipment, and the experiment itself, while tools such as a user-friendly metadata editor and a CKAN-based MSL data publication portal provide the building blocks for the chain towards sustainable cross-disciplinary dissemination.
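
As a rough illustration of how controlled vocabularies might bind free-text laboratory keywords to a common exchange format, the sketch below maps a small laboratory record onto a DataCite-style structure. It is not the MSL implementation: the vocabulary entries, the example.org URIs, the scheme name "example-msl-vocab", the input record layout, and the function name to_datacite_like are all assumptions made for this example. Only the output field names (titles, creators, publicationYear, types, subjects with subjectScheme and valueURI) follow the DataCite 4.x metadata schema.

```python
import json

# Hypothetical controlled vocabulary: normalized keyword -> (preferred label, URI).
# The URIs are placeholders, not real MSL vocabulary identifiers.
VOCAB = {
    "paleomagnetism": ("Paleomagnetism", "https://example.org/msl-vocab/paleomagnetism"),
    "rock physics": ("Rock physics", "https://example.org/msl-vocab/rock-physics"),
    "analogue modelling": ("Analogue modelling", "https://example.org/msl-vocab/analogue-modelling"),
}


def to_datacite_like(record):
    """Map a simple lab record (assumed layout) to a DataCite-style dict.

    Returns the mapped document and the list of keywords that had no
    match in the vocabulary, so they can be reviewed by a curator.
    """
    subjects = []
    unmatched = []
    for keyword in record.get("keywords", []):
        key = keyword.strip().lower()
        if key in VOCAB:
            label, uri = VOCAB[key]
            subjects.append(
                {
                    "subject": label,
                    "subjectScheme": "example-msl-vocab",  # assumed scheme name
                    "valueURI": uri,
                }
            )
        else:
            unmatched.append(keyword)

    document = {
        "titles": [{"title": record["title"]}],
        "creators": [{"name": name} for name in record["creators"]],
        "publicationYear": record["year"],
        "types": {"resourceTypeGeneral": "Dataset"},
        "subjects": subjects,
    }
    return document, unmatched


if __name__ == "__main__":
    # Invented sample record, for demonstration only.
    sample = {
        "title": "Example triaxial deformation dataset",
        "creators": ["Doe, Jane"],
        "year": 2023,
        "keywords": ["Rock Physics", "paleomagnetism ", "sandstone"],
    }
    mapped, leftovers = to_datacite_like(sample)
    print(json.dumps(mapped, indent=2))
    print("Keywords without a vocabulary match:", leftovers)
```

Keeping unmatched keywords separate, rather than silently dropping them, is one possible design choice: it lets a curator decide whether a term should be added to the vocabulary or corrected in the record.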

We will demonstrate how the MSL data management paradigm exploits both the strength of controlled terminology and the availability of good domain-agnostic common standards in an approach for managing heterogeneous data from long-tail communities.

How to cite: Lange, O., Samshuijzen, L., Elger, K., Frenzel, S., Pijnenburg, R., Wessels, R., ter Maat, G., and Drury, M.: Connecting the Long Tail: sharing and describing heterogeneous data via common metadata standards, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-13473, https://doi.org/10.5194/egusphere-egu23-13473, 2023.