MAL16-ESSI | Ian McHarg Medal Lecture by Lesley Wyborn and ESSI Division Outstanding ECS Award Lecture by Marthe Klöcking
Tue, 19:00
Convener: Jens Klump
Orals
| Tue, 29 Apr, 19:00–20:00 (CEST)
 
Room G1

Orals: Tue, 29 Apr | Room G1

The oral presentations are given in a hybrid format supported by a Zoom meeting featuring on-site and virtual presentations.
Chairperson: Jens Klump
19:00–19:05
19:05–19:25 | EGU25-9936 | ECS | solicited | On-site presentation
Marthe Klöcking

Data are a fundamental building block of science. Ever-increasing volumes and diversity of data are allowing us to solve ever more complicated research questions; yet they are also creating new challenges around efficient data management and storage. This talk focuses on geochemical data, which are relatively low in volume compared to other Earth System Science disciplines, but are highly diverse due to the large range of materials analysed and analytical techniques employed. Modern geochemical research increasingly draws on large compilations of data previously collected by multiple authors using multiple analytical methods, over years and decades. Harmonising data from such diverse sources, and ensuring consistency and comparable data quality, is a non-trivial task that requires significant investment of time and resources. As a consequence, data compilations are increasingly published in high-ranking journals. Yet often they are singular, one-time efforts for specific projects by individual authors that quickly become outdated and lose relevance. In contrast, curated synthesis databases, such as the GEOROC database for igneous geochemical rock and mineral compositions, are continuously being updated and can offer long-term consistent curation over decades. By providing free access to, and customisable search of, their comprehensive data and metadata collections, they enable the compilation of a diverse range of smaller, targeted datasets that can form the basis of many different research projects across multiple (sub)disciplines. Long-term synthesis databases are an invaluable resource for the geochemical and broader scientific community. However, despite their broad relevance and usage, many such community databases struggle to secure the required resources for database maintenance and continuous technical developments to cater to changing scientific demands.
This burden can be partly alleviated through integration of databases with curated domain data repositories. Data harmonisation is greatly aided by adherence to best practices and standards during data publication. Repositories that publish curated, discipline-specific datasets, therefore, play an important role in ensuring new analyses are sufficiently well documented to allow quality assessment and reuse by third parties. They also support data rescue and the alignment of legacy data with modern data requirements. These standards and best practices should in turn be developed based on community expertise and consensus, which requires international collaboration. In geochemistry, data providers and services from three different continents formed the OneGeochemistry initiative. OneGeochemistry promotes exchange and agreement on minimum common variables between researchers from all geochemical sub-disciplines and the more than 15 international societies, associations and science unions that govern different types of geochemical data. As a participant in the WorldFAIR project, OneGeochemistry aims to reconcile cross-domain solutions for data interoperability with domain-specific geochemical requirements. The implementation of geochemical data standards in repositories, and their broad adoption by the geochemical community, will enhance the value of data and services provided by synthesis databases, which will lead to better access to comprehensive data compilations and, ultimately, better science.

How to cite: Klöcking, M.: The importance of curated domain repositories and synthesis databases as evolving community resources for modern Earth System Science research, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-9936, https://doi.org/10.5194/egusphere-egu25-9936, 2025.

19:25–19:55 | EGU25-16237 | solicited | On-site presentation
Lesley Wyborn

Earth System Science datasets have been acquired for centuries across five broad spheres: geosphere, cryosphere, hydrosphere, biosphere and atmosphere. They vary from human observations to sensor-derived measurements ranging from nanoscale laboratory data to large-volume petascale datasets collected remotely by satellites, drones, etc. Across all spheres most datasets have their roots in three core disciplines: Geology, Geophysics and Geochemistry. Today we are generating unprecedented volumes of data and when combined with computer capacity, now at exascale, our capability to integrate and analyse data should be unparalleled.

Digital data repositories emerged around 1980 and the internet soon after. Initially, data was shared by shipping it on hard media. The internet soon enabled global data sharing, including by web services (e.g., OneGeology in 2008). Multiple global data sharing networks were envisioned, but few moved beyond those that proposed them. Machine-to-machine data sharing is still a challenge. Many spheres cannot utilise the existing capacity of computers, including the full potential of AI applications, because these cannot read the volumes of available data.

History has repeatedly shown that revolutionary infrastructures can take decades to realise their full potential and change from being a new way of doing things to multiple ways of doing new things. 

The FAIR principles were specifically designed to increase machine-to-machine interoperability of data: they are the blueprint of WHAT needs to be done, but the HOW will involve rethinking three key steps.

Firstly, shift the onus of aggregating data from the consumer to repositories capable of implementing discipline-centric FAIR (meta)data standards.

Secondly, as recommended by the WorldFAIR Second Policy Brief to the European Open Science Cloud (EOSC), change from a bibliographic approach to data stewardship to one of data engineering, where richer and more comprehensive standardised (meta)data at the datum level enables machine-to-machine access to specific variables of interest across multiple disciplinary datasets. Take a more holistic approach to standards development (e.g., the Observations, Measurements and Samples standard (ISO 19156:2023)) and identify common universals across disciplines (e.g., time, place, units of measure). Initiatives like OneGeology and OneGeochemistry, and hopefully soon OneGeophysics, can support higher-level discipline-centric (meta)data standards. Standards coordination groups (e.g., CODATA, Research Data Alliance) are critical. PIDs at the object level will be essential.

Thirdly, prioritise which datasets are made fully FAIR compliant and fund their curation in repositories that offer discipline-based curation. The 2019 Beijing Declaration on Research Data notes that ‘publicly funded research data should be interoperable, and preferably without further manipulation or conversion, to facilitate their broad reuse in scientific research’. The myriad of data products generated from these primary data sources can go to generalist and institutional repositories.
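
As a rough illustration of the datum-level approach described in the second step, the following Python sketch shows how a variable of interest might be pulled from several datasets once each value carries its own standardised metadata (identifier, unit, time, place). The `Datum` record, the `select_variable` function, the field names and the identifier strings are all hypothetical and are not taken from any existing standard or repository; the unit table only covers mass fractions of the same chemical species.

```python
from dataclasses import dataclass
from datetime import date

# Assumed conversion factors to a common unit (ppm by mass).
# Valid only when the reported species is unchanged (e.g. Sr to Sr).
TO_PPM = {"ppm": 1.0, "wt%": 10_000.0}


@dataclass(frozen=True)
class Datum:
    """One measured value with the metadata needed to reuse it (hypothetical schema)."""
    sample_pid: str   # persistent identifier of the sample (made-up format)
    variable: str     # name of the measured variable, e.g. "Sr"
    value: float
    unit: str         # must be a key of TO_PPM
    observed: date    # common universal: time
    latitude: float   # common universal: place
    longitude: float


def select_variable(datasets, variable):
    """Collect one variable across datasets, harmonised to ppm."""
    harmonised = []
    for dataset in datasets:
        for datum in dataset:
            if datum.variable != variable:
                continue
            # Raises KeyError for an unknown unit rather than guessing.
            factor = TO_PPM[datum.unit]
            harmonised.append((datum.sample_pid, datum.value * factor))
    return harmonised


if __name__ == "__main__":
    dataset_a = [
        Datum("pid:a1", "Sr", 0.5, "wt%", date(2001, 5, 1), -35.3, 149.1),
        Datum("pid:a1", "Rb", 40.0, "ppm", date(2001, 5, 1), -35.3, 149.1),
    ]
    dataset_b = [
        Datum("pid:b7", "Sr", 320.0, "ppm", date(2019, 8, 12), 48.2, 16.4),
    ]
    for pid, ppm in select_variable([dataset_a, dataset_b], "Sr"):
        print(pid, ppm)
```

Because the unit, time and place travel with each value, a consumer can combine datasets without manual conversion; an unrecognised unit fails loudly instead of silently producing incomparable numbers.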

Revolutionary infrastructures do take time to realise their full potential. It is nearly 25 years since the early experiments using the internet to globally network data repositories. The WorldFAIR Second EOSC Policy Brief emphasises that the change to machine-actionable FAIR data ‘is one of a magnitude which will necessitate considerable resourcing, investment, and upskilling; but it will also achieve significant benefits, including creating a digitally integrated Earth to support sustainable development of our planet’.

How to cite: Wyborn, L.: Rethinking HOW We Create Global Networks of Earth and Environmental Datasets to Maximise Their Potential to Underpin Integrated Research for a Sustainable Planet, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-16237, https://doi.org/10.5194/egusphere-egu25-16237, 2025.

19:55–20:00