EGU2020-22533
https://doi.org/10.5194/egusphere-egu2020-22533
EGU General Assembly 2020
© Author(s) 2023. This work is distributed under
the Creative Commons Attribution 4.0 License.

Best Practices: The Value and Dilemma of Domain Repositories

Kerstin Lehnert, Lucia Profeta, Annika Johansson, and Lulin Song
  • Lamont-Doherty Earth Observatory of Columbia University, US

Modern scientific research requires open and efficient access to well-documented data to ensure transparency and reproducibility, and to build on existing resources to solve the scientific questions of the future. Open access to the results of scientific research (publications, data, samples, and code) is now broadly advocated and implemented in the policies of funding agencies and publishers because it helps build trust in science, galvanizes the scientific enterprise, and accelerates the pace of discovery and the creation of new knowledge. Domain-specific data facilities offer specialized data curation services that are tailored to the needs of scientists in a given domain. These services ensure rich, relevant, and consistent metadata for meaningful discovery and reuse of data, as well as data formats and encodings that facilitate data access, integration, and analysis for disciplinary and interdisciplinary applications. Domain-specific data facilities are uniquely poised to implement best practices that ensure not only the Findability and Accessibility of the data under their stewardship, but also their Interoperability and Reusability, which require detailed, data-type-specific documentation of methods, including data acquisition and processing steps, uncertainties, and other data quality measures.

The dilemma for domain repositories is that rigorous implementation of such best practices requires substantial effort and expertise, which becomes a challenge when usage of the repository outgrows its resources. Rigorous implementation of best practices can also frustrate users, who are asked to revise and improve their data submissions, and may lead them to deposit their data in other, often general-purpose, repositories that do not perform such rigorous review and therefore minimize the burden of data deposition.

We will report on recent experiences of EarthChem, a domain-specific data facility for the geochemical and petrological science community. EarthChem is recommended by publishers as a trusted repository for the preservation and open sharing of geochemical data. Over the past year, multiple journals that publish geochemical and petrological research have implemented the FAIR Data principles. As a result, the number, volume, and diversity of data submitted to the EarthChem Library have grown dramatically, challenging existing procedures and resources that do not scale to the new level of usage. Curators are challenged to meet users' expectations for immediate data publication and DOI assignment, and to process submissions that include new data types, are poorly documented, or contain code, images, and other digital content that is outside the scope of the repository. We will discuss possible solutions, ranging from tiered data curation support and collaboration with other data repositories to engagement with publishers and editors to enhance the guidance and education of authors.

How to cite: Lehnert, K., Profeta, L., Johansson, A., and Song, L.: Best Practices: The Value and Dilemma of Domain Repositories, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-22533, https://doi.org/10.5194/egusphere-egu2020-22533, 2020.
