ESSI2.9 | Open Interoperability Frameworks Built by Scientists for Scientists to Meet Global Societal Challenges
Co-sponsored by AGU
Convener: Angeliki Adamaki | Co-conveners: Anca Hienola, Kirsten Elger, Lesley Wyborn, Jacco Konijn
Orals | Wed, 26 Apr, 10:45–12:30 (CEST) | Room 0.51
Posters on site | Attendance Thu, 27 Apr, 16:15–18:00 (CEST) | Hall X4
Posters virtual | Attendance Thu, 27 Apr, 16:15–18:00 (CEST) | vHall ESSI/GI/NP
In the big-data era, science attempts to address global societal problems (such as climate change, pandemics, environmentally sustainable exploitation of our resources) often with computationally expensive advanced methodologies that require machine-to-machine (m2m) interaction and interoperable data and services. At the same time, data science and technologies are rapidly evolving and offer new tools to address scientific demands, revealing yet more technological challenges as the complexity increases at all stages of the data and service development lifecycle.

To serve the complex, heterogeneous and diverse demands of their end-users, data providers try to work on federated solutions using FAIR enabling resources. Scientists and developers investigate the requirements for integrating systems that operate not necessarily using the same technologies but rather adopting technical solutions that facilitate the interoperability among them. Starting at the very low levels of data (e.g. near real-time, minimally processed, etc) and reaching higher-level derivative data collections, products and services, many issues are still open: metadata models, authentication and authorisation systems, ontologies, machine actionable licenses and PIDs that support the findability, accessibility and sustainable future of data (enabling proper citation and attribution to both creators and funders, usage tracking, etc.).

Groups facilitating global data sharing, networks and services include the Federation of Digital Seismograph Networks (FDSN), OneGeology, OneGeochemistry, WorldFAIR, the Earth System Grid Federation (ESGF), OGC, W3C, GEO, the CODATA/DDI Cross-Domain Data Initiative, the European Open Science Cloud (EOSC) and the pan-European cluster of environmental research infrastructures (ENVRI).

This session brings together diverse disciplines and a variety of experts (data centre architects, data stewards, developers, ontologists, scientists). We seek contributions that demonstrate scientific use cases (in the field of earth/environmental sciences), discuss ICT and data challenges and recommend best practices based on experience from interoperability frameworks. We welcome abstracts ranging from small-scale scientific methodologies developed by (multi)disciplinary groups aiming at multi/inter-disciplinary science to larger-scale integrated platforms that offer interoperable data and services by multiple discipline-focused providers and/or cross-domain communities of providers.

Orals: Wed, 26 Apr | Room 0.51

Chairpersons: Anca Hienola, Angeliki Adamaki, Jacco Konijn
10:45–10:50
10:50–11:00
|
EGU23-11193
|
On-site presentation
Ari Asmi

Interoperable tools for science are a great goal. The vision of seamless, shared and transparent scientific workflows, which can easily implement and automate processes across scientific fields and geographical domains, gives hope for solving key scientific and societal challenges.

However, we are not there yet. Many advances have been made, but there is no global agreement or set of standards that enables such interoperability. Regional and national initiatives push towards disciplinary or regional open science environments, sharing some of the features of such true global interoperability, but they often work in isolation, which risks even more divergence of solutions.

Science is global, and such discussions should be global. Scientific processes cover a wide range of disciplines, practices and (science) cultural borders, and thus such solutions should also be shared across these boundaries. Modern science needs many kinds of experts, from traditional scientists to data stewards, software engineers and even policy developers and legal experts, and thus interoperability needs to be considered across these expert groups as well.

In this presentation I will present some of the tools currently available for such international development, specifically concentrating on the development and expertise from the (now a decade-old!) Research Data Alliance. I will also present several other initiatives of regional (particularly European) and global importance, and discuss the ways a researcher or a research infrastructure developer could interact with them.

How to cite: Asmi, A.: Ensuring international and interdicipilinary interoperability of research data working groups, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-11193, https://doi.org/10.5194/egusphere-egu23-11193, 2023.

11:00–11:10
|
EGU23-10537
|
On-site presentation
Werner Leo Kutsch, Alex Vermeulen, and Margareta Hellström

The Integrated Carbon Observation System (ICOS) is a distributed European Research Infrastructure which provides high-precision and highly standardised observations from more than 170 stations in three domains: Atmosphere, Ecosystem and Ocean. ICOS currently covers 16 European countries. All ICOS data is made available by the ICOS Carbon Portal, first in near-real time (within 24h when possible), and after further quality control as domain specific annual releases. The data flow follows the Findable, Accessible, Interoperable, Reusable (FAIR) principles.

ICOS has been cooperating with other European Environmental Research Infrastructures (the ENVRI Community) for more than a decade. ICOS used the ENVRI Reference Model to set up its data life cycle, developed common metadata and data citation strategies and most recently contributed to the ENVRI-hub within the European Open Science Cloud (EOSC). The ENVRI-hub will be a central gateway to environmental data and services offered by the European environmental research infrastructures. The data offered through the hub will be interoperable across the Earth system disciplines and therefore easy to use for interdisciplinary environmental research. Data will be open and free to use by anyone. Users of the ENVRI-hub will also be able to use the Virtual Research Environments and do their science computing directly inside the hub.

The synergies achieved through cooperation within the ENVRI community have saved a lot of resources for ICOS when implementing its data life cycle. Furthermore, the cooperation has created common services for scientists working on complex Earth System questions such as climate-biodiversity or climate-air-quality feedbacks.

How to cite: Kutsch, W. L., Vermeulen, A., and Hellström, M.: The benefits of Big Data Cooperation from a single research infrastructure viewpoint – ICOS and the ENVRI-hub in EOSC, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10537, https://doi.org/10.5194/egusphere-egu23-10537, 2023.

11:10–11:20
|
EGU23-7708
|
On-site presentation
Andreas Petzold, Ulrich Bundke, Chris Schleiermacher, Ana Rita Gomes, Katrin Seemeyer, Angeliki Adamaki, Alex Vermeulen, Zhiming Zhao, Damien Boulanger, Thierry Carval, and Anca Hienola

European Environmental Research Infrastructures (ENVRIs) on the ESFRI level are core facilities for providing data, research products and services from the four subdomains of Earth system science – Atmosphere, Marine, Solid Earth, and Biodiversity/Terrestrial Ecosystems. The ENVRI Cluster represents the core component of the European environmental research infrastructure landscape, with the ENVRI community as their common forum for collaboration and co-creation. The topics covered by the ENVRIs span the entire range of scientific objectives relevant for Earth system monitoring.

The community has developed the ENVRI-Hub as a central platform for accessing interdisciplinary FAIRified environmental research assets, serving as the ENVRI community's essential interface to the European Open Science Cloud (EOSC). Through the ENVRI-Hub, the ENVRI community shares their FAIRness experience, technologies, and training as well as research products and services. The architecture and functionalities of the ENVRI-Hub are driven by scientific applications, use cases and user needs. Its three main pillars are the ENVRI Knowledge Base as the human interface to the ENVRI ecosystem, the ENVRI Catalogue of Services as the machine-actionable interface to the ENVRI ecosystem, and finally, subdomain and cross-domain scientific use cases as demonstrators for the capabilities of service provision among ENVRIs and across Science Clusters.

The Science Demonstrators are being developed by several RIs in parallel. They are the key product to express the ENVRI-Hub’s potential regarding easy access to metadata and services, data discovery, as well as the promotion of interoperability in science across sub-domains. Science Demonstrators are built with Jupyter Notebooks - an open-source web application that allows one to create and share documents that contain live code, equations, visualizations, and narrative text. Uses include cross domain data access, data cleaning and transformation, numerical simulation, statistical modelling, data visualization, machine learning, and much more. The Jupyter Notebook environment forms the nucleus of the future ENVRI Virtual Research Environment.

The ENVRI Science Demonstrators and Science Projects in the Horizon 2020 project EOSC Future aim at demonstrating how joint projects can address major challenges for Europe’s societies and how research infrastructures can support Horizon Europe’s missions within the EOSC. Presented Science Demonstrators cover one ENV domain wide service on the collocation of sampling sites, and two science cases from atmospheric and marine research, respectively.

Acknowledgement: ENVRI-FAIR has received funding from the EU Horizon 2020 research and innovation programme under grant agreement No 824068. Part of the work is funded by the EU Horizon 2020 project EOSC Future under grant agreement No 101017536. This work is only possible with the collaboration of the ENVRI-FAIR partners and thanks to the joint efforts of the whole ENVRI-Hub team.

How to cite: Petzold, A., Bundke, U., Schleiermacher, C., Gomes, A. R., Seemeyer, K., Adamaki, A., Vermeulen, A., Zhao, Z., Boulanger, D., Carval, T., and Hienola, A.: The ENVRI-Hub as a service for accelerating FAIRification of the Environment Domain Research Infrastructures, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-7708, https://doi.org/10.5194/egusphere-egu23-7708, 2023.

11:20–11:30
|
EGU23-14107
|
ECS
|
Virtual presentation
Advancing FAIRness of soil water content (meta)data with the help of Semantic Web technologies
(withdrawn)
Xeni Kechagioglou, Giuseppe Turrisi, Francesca De Pascalis, Claudio D’Onofrio, Luke Marsden, Christian Pichot, Christoph Wohner, Nicola Fiore, Dario Papale, Alberto Basset, Giovanni L’Abate, and André Chanzy
11:30–11:40
|
EGU23-11716
|
ECS
|
On-site presentation
Yifang Shi, Spiros Koulouzis, Riccardo Bianchi, Joris Timmermans, W. Daniel Kissling, and Zhiming Zhao

Quantifying ecosystem structure is of great importance for forest management, ecology, biodiversity monitoring, and climate change modeling. Advances in remote sensing, specifically Light Detection And Ranging (LiDAR), have enabled the mapping of vegetation structure with unprecedented detail. However, considerable effort and advanced technical skills are required for researchers to process massive amounts of LiDAR data, given the challenges in handling big data and high computational costs. Different requirements from end users also indicate that the FAIRness (i.e. Findability, Accessibility, Interoperability, and Reusability) of the processing workflow is needed for a broad user community. In this context, we developed a virtual research environment (VRE) solution for the Jupyter environment named Notebook-as-a-VRE (NaaVRE), which allows users to search research assets (e.g. data, algorithms), compose workflows, manage the lifecycle of an experiment, and share the results among the user community. Functional components, including the component containerizer, the experiment manager, the VRE knowledge base, and the semantic search engine, were deployed as Jupyter extensions on the user environment. In this way, users can encapsulate and containerize selected cells from Jupyter Notebook as standardized RESTful API services, use them for their customized workflows and publish the containerized cells or workflows as reusable components via community repositories. A high-throughput workflow called ‘Laserfarm’ was implemented in the NaaVRE for deriving geospatial data products of ecosystem structure at high resolution across the Netherlands. Geospatial data products containing 25 LiDAR-derived metrics were generated at 10 m resolution covering the whole Netherlands, representing open data on ecosystem height, ecosystem cover, and ecosystem structural complexity.
The demonstrated NaaVRE solution can be flexibly expanded to other use cases in ecology, biodiversity, and the Earth science domain, with potential contributions to newly emerging national and regional biodiversity observation networks.
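
As a rough illustration of what deriving height metrics on a 10 m grid can involve, the following minimal Python sketch bins points into square cells and computes a few per-cell statistics. It is not the Laserfarm workflow or any NaaVRE component: the function name `grid_height_metrics`, the metric names and the sample points are assumptions made for this example only.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (x, y, z) in metres; z is assumed to be height above ground.
Point = Tuple[float, float, float]
CellKey = Tuple[int, int]


def grid_height_metrics(
    points: Iterable[Point], cell_size: float = 10.0
) -> Dict[CellKey, Dict[str, float]]:
    """Group points into square cells and compute simple height metrics per cell."""
    cells: Dict[CellKey, List[float]] = defaultdict(list)
    for x, y, z in points:
        # Integer cell index derived from the planar coordinates.
        key = (int(x // cell_size), int(y // cell_size))
        cells[key].append(z)

    metrics: Dict[CellKey, Dict[str, float]] = {}
    for key, heights in cells.items():
        metrics[key] = {
            "max_height": max(heights),
            "mean_height": sum(heights) / len(heights),
            "n_points": float(len(heights)),
        }
    return metrics


# Hypothetical sample: two points fall in cell (0, 0), one in cell (1, 0).
sample = [(1.0, 2.0, 0.5), (3.0, 4.0, 12.0), (15.0, 2.0, 7.0)]
result = grid_height_metrics(sample)
```

A real workflow would read tiled point clouds and write rasters; the sketch only shows the per-cell aggregation step.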

How to cite: Shi, Y., Koulouzis, S., Bianchi, R., Timmermans, J., Kissling, W. D., and Zhao, Z.: Generating geospatial data products of ecosystem structure from LiDAR using Notebook-as-a-VRE (NaaVRE), EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-11716, https://doi.org/10.5194/egusphere-egu23-11716, 2023.

11:40–11:50
|
EGU23-11400
|
On-site presentation
Tjerk Krijger, Peter Thijsse, Dick Schaap, and Robin Kooyman

In the EOSC-Future project, ENVRI-FAIR partners are involved in developing two Science Projects (SPs), one about Invasive Species, and one about a Dashboard of the State of the Environment. The Dashboard should give users an easy means to determine the state of the environment and follow trends of our Earth system for a selected number of parameters within the Earth components of Atmosphere, Ocean, and Biodiversity.

MARIS leads the development of the Ocean component in cooperation with IFREMER, OGS and NOC-BODC. It consists of a Map Viewer that displays in-situ measurements of selected Essential Ocean Variables (EOVs), namely Temperature, Oxygen, Nutrients and pH. These measurements are retrieved from a datalake of selected Blue Data Infrastructures (BDIs) such as Euro-Argo and SeaDataNet CDI using tailor-made APIs for fast sub-setting at data level.

Performance is a major challenge: users should not have to wait long for data matching their selection criteria to be retrieved and displayed in the Map Viewer, yet the original data is organised in millions of observation files, which makes fast responses hard to achieve. MARIS has been developing a system called Beacon API, with a unique indexing system that can extract, on the fly, the specific data requested from millions of observation data files that contain multiple parameters in diverse units. We show the possibilities of this API by applying it to the SeaDataNet CDI database. The system is built so that it is not tied to the SeaDataNet CDI architecture and can be applied to other BDIs in the future as well.

The user interface of the Map Viewer is designed for (citizen) scientists and allows them to interact with the large data collections, retrieving parameter values from observation data by geographical area and using sliders for date, time and depth. The in-situ values are co-located with product layers from Copernicus Marine, based upon modelling and satellite data. These in-situ data sets are also used in algorithms to generate aggregated values as dynamic trend indicators for sea regions. These are displayed on the Environmental Indicators dashboard and provide ocean trend indicators for the selected EOVs for designated areas. For more information, users can click on such an indicator, which takes them to the Map Viewer to browse deeper into the data and the details behind the trends.
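
The selection by parameter, geographical area, date and depth described above can be pictured with a small in-memory Python sketch. It does not use or reproduce the Beacon API; the `Observation` fields, the `subset` and `mean_value` helpers and the sample values are all hypothetical, and a real system would rely on an index rather than a linear scan.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Observation:
    lon: float
    lat: float
    depth_m: float
    time: datetime
    parameter: str
    value: float


def subset(
    observations: Sequence[Observation],
    parameter: str,
    bbox: Tuple[float, float, float, float],  # (west, south, east, north)
    time_range: Tuple[datetime, datetime],
    depth_range: Tuple[float, float],
) -> List[Observation]:
    """Return the observations matching parameter, area, time and depth."""
    west, south, east, north = bbox
    t0, t1 = time_range
    d0, d1 = depth_range
    return [
        o
        for o in observations
        if o.parameter == parameter
        and west <= o.lon <= east
        and south <= o.lat <= north
        and t0 <= o.time <= t1
        and d0 <= o.depth_m <= d1
    ]


def mean_value(observations: Sequence[Observation]) -> Optional[float]:
    """Aggregate a selection into a single value; None if nothing matched."""
    if not observations:
        return None
    return sum(o.value for o in observations) / len(observations)


# Hypothetical sample data, for illustration only.
data = [
    Observation(3.0, 54.0, 5.0, datetime(2020, 6, 1), "temperature", 14.0),
    Observation(3.5, 54.5, 50.0, datetime(2020, 7, 1), "temperature", 10.0),
    Observation(3.2, 54.2, 5.0, datetime(2020, 6, 15), "oxygen", 250.0),
    Observation(20.0, 35.0, 5.0, datetime(2020, 6, 1), "temperature", 22.0),
]
selected = subset(
    data,
    "temperature",
    bbox=(2.0, 53.0, 5.0, 56.0),
    time_range=(datetime(2020, 1, 1), datetime(2020, 12, 31)),
    depth_range=(0.0, 100.0),
)
regional_mean = mean_value(selected)
```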

How to cite: Krijger, T., Thijsse, P., Schaap, D., and Kooyman, R.: EOSC-FUTURE – ENVRI-FAIR Science Project Environmental Indicators – Ocean Component – Beacon API, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-11400, https://doi.org/10.5194/egusphere-egu23-11400, 2023.

11:50–12:00
|
EGU23-15298
|
Virtual presentation
Accelerating the research on Biodiversity and Ecosystems: Best Practice on Climate Change vs Non-Indigenous Invasive Species (NIS)
(withdrawn)
Cristina Huertas-Olivares, Alberto Basset, Katrina Exter, Giorgios Kotoulas, Nicola Fiore, Marc Portier, Rory Meyer, Dick Schaap, Ioulia Santi, Matthias Obst, Alex Vermeulen, Peter Thijsse, Nicolas Pade, Juan Miguel González-Aranda, Joaquin López, Nikos Minadakis, Lucia Vaira, Christina Pavloudi, and Christos Arvanitidis
12:00–12:10
|
EGU23-14628
|
ECS
|
On-site presentation
Charis Chatzikyriakou, Zdeněk Šustr, Enol Fernández, Björn Backeberg, Sebastian Luna-Valero, Magdalena Brus, Xavier Salazar Forn, Christian Briese, and Diego Scardaci

The EC H2020 C-SCALE (Copernicus - eoSC AnaLytics Engine, https://c-scale.eu) project implements a European open source Big (Copernicus) Data Analytics platform by federating best-of-breed tools, competences and services, building collaboratively on the experience of pan-European e-Infrastructures and existing project initiatives.

The vision of the project is to empower European researchers, institutions and initiatives to easily discover, access, process, analyse and share Copernicus data, tools, resources and services through the EOSC Portal. To this end, C-SCALE delivers a federated compute and data infrastructure offering Copernicus and Earth Observation (EO) data, including a seamless user experience where the complexity of resource provisioning and orchestration is abstracted away from the end-user. The service offer of C-SCALE includes four main services:

  • The Federated Earth System Simulation and Data Processing Platform (FedEarthData) service brings together the providers of data and processing capacity, so that EO products held in distributed archives across the federation can be easily discovered and seamlessly accessed and processed on batch as well as on interactive analytic platforms deployed on distributed computing resources anywhere across the federation.
  • The Metadata Query Service (MQS) makes Copernicus data distributed across partners within the federation discoverable and searchable. It is a STAC-compliant API that redistributes incoming queries among the federated sites and provides a consolidated response containing the list of aggregated results. The MQS exposes all STAC collections available within the federation on a single endpoint and provides a search interface that accepts the core parameters of the STAC API Item Search specification.
  • The openEO platform service provides intuitive programming libraries alongside a large EO data repository to simplify processing and data management. This large-scale data access and computation is performed on multiple infrastructures, allowing use cases from exploratory research to large-scale production of EO-derived maps and information in an accelerated way.
  • The Workflow Solutions are easily deployable workflows supporting monitoring, modelling and forecasting of the Earth system. They provide adaptable templates and examples, in the form of Jupyter Notebooks, of Copernicus and EO data and analysis workflows enabling users to more easily arrange a processing pipeline to create results on the C-SCALE federation.
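
To make the MQS description above more concrete, the following Python sketch assembles a request body from core parameters of the STAC API Item Search specification (`collections`, `bbox`, `datetime`, `limit`). It makes no network call, and the collection identifier is a placeholder: the actual MQS endpoint and its collection names are not given here, so both are left as assumptions.

```python
import json
from datetime import datetime, timezone
from typing import Dict, List, Sequence


def _rfc3339(moment: datetime) -> str:
    """Format a timezone-aware datetime as an RFC 3339 UTC timestamp."""
    if moment.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_item_search(
    collections: Sequence[str],
    bbox: Sequence[float],  # (west, south, east, north) in degrees
    start: datetime,
    end: datetime,
    limit: int = 10,
) -> Dict[str, object]:
    """Build a STAC Item Search body using core search parameters only."""
    if len(bbox) != 4:
        raise ValueError("bbox must have four values: west, south, east, north")
    if bbox[1] > bbox[3]:
        raise ValueError("south must not exceed north")
    if limit < 1:
        raise ValueError("limit must be positive")
    body: Dict[str, object] = {
        "collections": list(collections),
        "bbox": list(bbox),
        # A closed interval is written as "start/end".
        "datetime": f"{_rfc3339(start)}/{_rfc3339(end)}",
        "limit": limit,
    }
    return body


# "example-collection" is a hypothetical identifier, not a real MQS collection.
payload = build_item_search(
    collections=["example-collection"],
    bbox=[4.0, 52.0, 5.0, 53.0],
    start=datetime(2023, 4, 1, tzinfo=timezone.utc),
    end=datetime(2023, 4, 30, 23, 59, 59, tzinfo=timezone.utc),
    limit=5,
)
encoded = json.dumps(payload)
```

The encoded body could then be sent with an HTTP POST to the `/search` path of a STAC API; because the service redistributes the query among federated sites, a client would only need this single request.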

In addition to these services, the project delivers the C-SCALE community for engagement with existing and new stakeholders, including both researchers and service providers in Earth Observation, as well as documentation and training material.

Through these services, the project aims to scale up the EOSC Portal with a continuously growing catalogue of services and resources supporting the whole research life cycle and enabling more scientific communities to access state-of-the-art services for their research activities. In addition, C-SCALE facilitates synergies between pan-European e-infrastructure operators, leading to harmonised services, improved use of resources and economies of scale.

How to cite: Chatzikyriakou, C., Šustr, Z., Fernández, E., Backeberg, B., Luna-Valero, S., Brus, M., Salazar Forn, X., Briese, C., and Scardaci, D.: Implementing a European Big Copernicus Data Analytics platform: The C-SCALE service offer in a nutshell, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14628, https://doi.org/10.5194/egusphere-egu23-14628, 2023.

12:10–12:20
|
EGU23-10908
|
On-site presentation
Tim Rawling, Beryl Morris, Andre Zerger, and Rebecca Farrington

Earth systems including the geosphere, biosphere, cryosphere, hydrosphere and atmosphere interact in complex ways to create a planet on which life and humanity have thrived. Throughout history we have studied and observed all these systems, but too often there has been too little strategic integration of activities supporting observing and monitoring, storage and access to data, computing and modelling, and predicting future states.

This has reduced our ability to quickly advance our understanding of the interdependence of these complex systems, particularly as data has often been collected in isolation, or to address a very specific research problem. The challenges to achieving such an integrated perspective are scientific, technical, social, political and organisational.

To address this issue, the Earth and environment related research infrastructure facilities in Australia have self-organised to create the National Earth and Environmental Sciences Facilities Forum (NEESFF). NEESFF is intended to harness the capacity of its environmentally-focused capabilities to collectively create solutions and deliver the information needed for sustainable development and use of environmental resources. NEESFF’s vision is an effective and coordinated response to global environmental conditions in a uniquely Australian context.

Many NEESFF organisations are funded through the National Collaborative Research Infrastructure Strategy (NCRIS), which supports research activity across many STEM and HASS disciplines. The Australian Government recently developed a Roadmap outlining the challenges, opportunities for system enhancement and potential step-changes that the next phase of research infrastructure investment will drive. These too are guided by national and global science and societal challenges and rely on underpinning robust and integrated FAIR open data systems across the RI network.

Here we will outline the national approach to these challenges, the current thinking on the data challenges that we face in an Australian context and the first steps we are taking as a community to develop a framework for the delivery of integrated Earth data.

Members are: Atlas of Living Australia (ALA); AuScope; Australian Urban Research Infrastructure Network (AURIN); Australian Research Data Commons; Australian Plant Phenomics Facility (APPF); BioPlatforms Australia; Geoscience Australia (GA); Integrated Marine Observing System (IMOS); Australian Terrestrial Ecosystem Research Network (TERN); Marine National Facility; Bureau of Meteorology; E2SIP (CSIRO); National Computational Infrastructure; and AARNet.

How to cite: Rawling, T., Morris, B., Zerger, A., and Farrington, R.: Cross disciplinary collaboration for societal benefit across Australian Research Infrastructure Networks, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-10908, https://doi.org/10.5194/egusphere-egu23-10908, 2023.

12:20–12:30
|
EGU23-4809
|
On-site presentation
Martina Stockhause and Sasha Ames

Data citation has become a core service for data infrastructures and data repositories, integrated in author guidelines and project proposals. It provides an essential piece in Open Science, supporting reuse and enabling credit by integration into the established scientific credit system of scientific references (Scholix) and the envisioned FAIR Digital Object framework. Both humans and machines need support in providing and accessing this information.

Modern data infrastructures, especially those of international collaborations, are federated and lack central funding. Providing and maintaining state-of-the-art data services does not fit within current funding structures, even though these infrastructures provide core services to a broad community, like the data infrastructure of the Coupled Model Intercomparison Project (CMIP).

For the citation service of CMIP6, currently a single person effort, the newly established CMIP Task Team (TT) Data Citation is exploring a set of options for a sustainable data citation service for CMIP7 and further WCRP activities in a cost-benefit approach. Due to the different experiences of its members, the TT brings infrastructure aspects together with the needs of researchers. A governance structure for sustainability, standard agreements and the coordination of further developments will be framed. The background to the TT and first experiences are shared.

How to cite: Stockhause, M. and Ames, S.: The CMIP TT Data Citation for a sustainable continuation of a requested and established service, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4809, https://doi.org/10.5194/egusphere-egu23-4809, 2023.

Posters on site: Thu, 27 Apr, 16:15–18:00 | Hall X4

Chairpersons: Jacco Konijn, Anca Hienola, Kirsten Elger
X4.152
|
EGU23-8784
|
ECS
|
Angeliki Adamaki, Alex Vermeulen, Dick Schaap, Peter Thijsse, Tjerk Krijger, Raul Bardaji, Andreu Fornos, and Ivan Rodero

The European Open Science Cloud is a pan-European initiative that aims at developing a federated infrastructure that supports scientific research and Open Science. With the FAIR principles at the core of Open Science practices, the Environmental Research Infrastructures (ENVRIs) that operate in Europe collaborate at cluster level (the ENVRI cluster), continuously improving the FAIRness of their services and working towards becoming more interoperable with each other and with other clusters. To showcase the benefits of an integration platform that supports scientific workflows, the ENVRIs are developing the “Dashboard for the State of the Environment” as a cross-discipline service that addresses environmental concerns with scientific facts. The project brings together three scientific domains (Atmosphere, Biodiversity, Ocean), each of which has set up analytical workflows to provide environmental indicators in real time, allowing users to visualise the “State of the Environment” by interacting with the service interface.

The Dashboard is designed to be completely user configurable, so that users can select from a list the indicators to be shown and their order. Providers can add, remove and edit indicators through a standard REST API that allows transferring all parameters, including the configuration of the indicators and how to provision data values and thumbnail interaction. The Dashboard is implemented and operated using engineering best practices, including YAML for the indicators’ descriptions and a robust and flexible container-based deployment. It builds on EOSC services like AAI, cloud services, and data storage, and the workflows that provide the indicators will also build on the EOSC and Research Infrastructure (RI) computing integration. As a proof of concept, a limited list of indicators is available, and we foresee that the participating RIs will provide many more indicator options in the near future. In addition, through the extension API it will be possible for new RIs to start providing indicators.
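
The provider and user roles described above can be sketched as a small in-memory registry in Python. This is an illustration under stated assumptions, not the Dashboard's REST API or its YAML schema: the `Indicator` fields, the `IndicatorRegistry` methods and the `example.org` URLs are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class Indicator:
    indicator_id: str
    title: str
    domain: str      # e.g. "Atmosphere", "Biodiversity" or "Ocean"
    value_url: str   # where the current value would be fetched from


class IndicatorRegistry:
    """Provider side: add, edit and remove indicator descriptions."""

    def __init__(self) -> None:
        self._items: Dict[str, Indicator] = {}

    def add(self, indicator: Indicator) -> None:
        if indicator.indicator_id in self._items:
            raise KeyError(f"indicator already exists: {indicator.indicator_id}")
        self._items[indicator.indicator_id] = indicator

    def edit(self, indicator_id: str, **changes: str) -> Indicator:
        # dataclasses.replace returns a copy with the given fields changed.
        updated = replace(self._items[indicator_id], **changes)
        self._items[indicator_id] = updated
        return updated

    def remove(self, indicator_id: str) -> None:
        del self._items[indicator_id]

    def layout(self, selected_ids: Sequence[str]) -> List[Indicator]:
        """User side: indicators in the user's chosen order, skipping unknown ids."""
        return [self._items[i] for i in selected_ids if i in self._items]


registry = IndicatorRegistry()
registry.add(Indicator("atm-co2", "CO2 trend", "Atmosphere", "https://example.org/atm-co2"))
registry.add(Indicator("ocean-temp", "Sea temperature", "Ocean", "https://example.org/ocean-temp"))
registry.edit("ocean-temp", title="Sea surface temperature")

# A user picks which indicators to show and in which order.
view = registry.layout(["ocean-temp", "atm-co2", "not-registered"])
titles = [item.title for item in view]

registry.remove("atm-co2")
remaining = registry.layout(["ocean-temp", "atm-co2"])
```

In a deployed service, each registry method would sit behind an HTTP endpoint and the descriptions would be persisted rather than held in memory.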

The Dashboard service is completely open source, and, as the whole concept, it is designed to be flexible and expansible. Therefore, we encourage other clusters that are part of EOSC to use the service as another basis for disseminating their relevant indicators to a wider audience.

How to cite: Adamaki, A., Vermeulen, A., Schaap, D., Thijsse, P., Krijger, T., Bardaji, R., Fornos, A., and Rodero, I.: A European Dashboard showcasing the State of the Environment, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8784, https://doi.org/10.5194/egusphere-egu23-8784, 2023.

X4.153
|
EGU23-2047
Hilde Orten and Bodil Agasøster

The goal of Science Project 9 of EOSC Future WP6.3 is to demonstrate that relevant environmental data and data on citizens' values, attitudes, behavior and involvement can be combined for social, political and scientific analysis.

This project includes an upgrade of an application developed internally at Sikt, with the aim to support integration of climate data with the ESS data and the regional contextual data already supported by the tool. Data will also be available via the EOSC Platform.

The upcoming work product of the DDI Alliance, the DDI-Cross Domain Integration metadata standard will be used to bridge between data from the different research domains and of different structures. We will also explore how standards may be used together.

This presentation will give an introduction to the project, report on work performed so far with focus on methods and challenges, and will provide ideas for the future.

How to cite: Orten, H. and Agasøster, B.: Cross Domain Interoperability in Science Project 9 of EOSC Future WP6.3, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-2047, https://doi.org/10.5194/egusphere-egu23-2047, 2023.

X4.154
|
EGU23-14294
Hannes Thiemann, Heinrich Widmann, Stephan Kindermann, Fanny Adloff, and Justus von Brandt

Climate change is one of the most pressing global challenges, and one on which researchers from around the world and from a wide range of disciplines are working together. This requires infrastructures that enable both local and cross-border cooperation.

Within IS-ENES, the Infrastructure for Earth System Modelling, European partners from the areas of climate modelling, computational science, data management, climate impacts and climate services are working together to deliver a research infrastructure that provides access to climate model data and tools to boost the understanding of past, present and future climate variability and changes. The core element of the infrastructure is the Earth System Grid Federation (ESGF), which is operated in a global partnership.

In the Horizon Europe project FAIRCORE4EOSC, the German Climate Computing Center (DKRZ) is involved in the design of further services in the European Open Science Cloud that also meet the requirements of the IS-ENES community, and is particularly dedicated to examining and testing the possibility of integrating EOSC and IS-ENES services.

FAIRCORE4EOSC focuses on the development and realization of further core components for the European Open Science Cloud (EOSC). Leveraging existing technologies and services, the project will develop nine new EOSC-Core components aimed at improving the discoverability and interoperability of an increased amount of research outputs. In particular, IS-ENES checks this for selected data collections that have high scientific relevance for both data producers (the climate modellers) and data users from other research disciplines and will be available in the long term, such as those used for the IPCC reports.

These data collections and associated scientific entities (such as scientific projects) will be identified and will receive identifiers using Kernel Information Types as well as entries in the Data Type Registry with its corresponding contents. The different aggregation levels, which are decisive for re-use, are taken into account, and a PID Graph will be utilized to enhance and simplify the re-use of data. Where appropriate, ‘Research Activity identifiers’ (RAiDs) will be assigned to projects and experiments, providing (domain agnostic) users with an aggregated view on the entities (data, software, people involved, etc.) of the scientific project.

Interlinking service metadata with data collection metadata in the PID graph and the EOSC research discovery graph is crucial to enable advanced discovery support for the ENES community. Additionally, the generated provenance records will be extended to include DOI and data citation information, thus improving the reusability of derived data products in interdisciplinary research contexts. In the latter context, IS-ENES will also use the Metadata Schema and Crosswalk Registry (MSCR) to improve interoperable reuse of ENES data by impact communities by providing crosswalks from vocabularies for climate variables (e.g. the CF conventions) to ontologies understandable and interpretable by other communities.

This talk will provide an overview of the current status of the implementation and its long-term benefits for researchers in IS-ENES and beyond and will highlight some challenges related to the integration of various large infrastructures.

 

How to cite: Thiemann, H., Widmann, H., Kindermann, S., Adloff, F., and von Brandt, J.: Challenges of integrating large infrastructures using the example of ENES-CDI and EOSC, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-14294, https://doi.org/10.5194/egusphere-egu23-14294, 2023.

X4.155
|
EGU23-14305
Gathering data from different Research infrastructures:  lessons learnt from the soil water content use case of the ENVRI-Fair project
(withdrawn)
André Chanzy, Giovanni L'Abate, Lucia Vaira, Xeni Kechagioglou, Christian Pichot, Alberto Basset, and Dario Papale
X4.156
|
EGU23-4802
Peter Thijsse, Thierry Carval, Alexandra Kokkinaki, Justin Buck, Gwenaelle Moncoiffe, Erwann Quimbert, Guillaume Alviset, and Dick Schaap

The marine subdomain consists of a diverse data landscape, with several Research Infrastructures (RIs) involved. In the ENVRI-FAIR project the marine domain is represented by Euro-ARGO, ICOS (Marine), EMSO, and LifeWatch (Marine), which are RIs listed on the ESFRI roadmap, and by SeaDataNet as the European marine data management infrastructure. The overarching goal of ENVRI-FAIR is that all participating RIs will improve their level of FAIRness and become ready for connecting their data repositories and services to the European Open Science Cloud (EOSC).

To achieve this goal, the marine domain partners first analysed and assessed the FAIRness level of each participating RI and identified the necessary actions to improve their individual FAIRness. They then created a roadmap and implementation plan that led to the development of an Essential Ocean Variable (EOV) demonstrator, highlighting the FAIRness achievements.

In this presentation we will discuss in more detail the RI FAIR assessment and analysis process and show how it evolved alongside the FAIR assessment tools themselves and the harmonisation required for the results to be meaningful and useful. We will then walk you through the analysis of this assessment, which led to a list of strengths and weaknesses per RI and to solutions for overcoming the weaknesses for each individual FAIR principle. Finally, we will present the outcomes of the required upgrades, adoptions of standards, improvements, developments and services that were delivered as part of the implementation plan and led to the construction of the EOV demonstrator product.

How to cite: Thijsse, P., Carval, T., Kokkinaki, A., Buck, J., Moncoiffe, G., Quimbert, E., Alviset, G., and Schaap, D.: The journey towards FAIR: A story from the marine domain, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-4802, https://doi.org/10.5194/egusphere-egu23-4802, 2023.

X4.157
|
EGU23-8691
Maria Mihalikova and Ingemar Häggström

The present era of rapid technological advances creates a challenge for data providers and scientists to create and maintain FAIR data and services not just for future operations but also for historical data gathered and analysed with technologies that are slowly being phased out. GUISDAP is an open source software package, written in MATLAB, C and Fortran and provided and maintained by EISCAT, for analysis and visualisation of its incoherent scatter radar data as well as data from some other radars in the world. As the new incoherent scatter radar system (EISCAT_3D) is built, the current mainland radar systems will cease their operations. Thus, developing and maintaining the compatibility of GUISDAP with newer computational technologies will become more challenging.

One way to preserve GUISDAP operability and accessibility for the user community is to make it available through a Jupyter notebook Docker deployment on EISCAT resources and in the frame of an EOSC project. This will help to ensure the FAIRness of historical EISCAT data by providing tools for reanalysis and visualisation that will be accessible to any potential EISCAT user. This poster will present our project with GUISDAP in a Jupyter notebook environment.

How to cite: Mihalikova, M. and Häggström, I.: The GUISDAP analysis software in Jupyter notebook as a tool for the FAIRness of the current EISCAT data, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-8691, https://doi.org/10.5194/egusphere-egu23-8691, 2023.

X4.158
|
EGU23-9235
Claudio D'Onofrio, Karolina Pantazatou, Ute Karstens, Margareta Hellström, Ida Storm, and Alex Vermeulen

Computational notebooks (e.g. Jupyter notebook) are a popular choice for interactive scientific computing to convey descriptive information together with executable source code. The user can annotate the scientific development of the work, the methods applied, describe ancillary data or the analysis of results, with text, illustrations, figures, and equations. Such ‘executable’ documents provide a paradigm shift in scientific writing, where not only the science is described, but the actual computation and source code are openly available and can be reproduced and validated.

Therefore, it is of paramount importance to preserve these documents. A unique and persistent identifier (PID) is essential, together with enough information to execute the source code. Generating a PID for a Jupyter notebook is not technically challenging. We can automatically collect system and run-time information and, with a guided workflow for the user, assemble a rich set of metadata. The collected information allows us to recreate the computational environment and run the source code, which in turn (theoretically) should produce the same results as published.
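As an illustration of the automatic collection step described above, the following minimal Python sketch gathers basic system and run-time information into a metadata record. It is not the authors' implementation: the function names and the field names of the record are assumptions chosen for this example, and a real workflow would add user-supplied descriptive metadata and the PID.

```python
import json
import platform
import sys
from datetime import datetime, timezone
from importlib import metadata


def collect_environment_metadata():
    """Return a dictionary describing the current run-time environment.

    The field names are illustrative assumptions, not a published schema.
    """
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata lacks a name
            packages[name] = dist.metadata.get("Version")
    return {
        "collected_at": datetime.now(timezone.utc).isoformat(),
        "python_version": platform.python_version(),
        "python_implementation": platform.python_implementation(),
        "operating_system": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "executable": sys.executable,
        "packages": dict(sorted(packages.items())),
    }


def to_json(record):
    """Serialise the record so it can be stored next to the notebook."""
    return json.dumps(record, indent=2, sort_keys=True)
```

Calling `to_json(collect_environment_metadata())` inside a notebook kernel yields a JSON document that could be stored alongside the notebook as part of its metadata.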

The importance of providing a rich set of metadata for all digital objects in a human-readable and machine-actionable form is well understood and widely accepted as a necessity for reproducibility, traceability, and provenance. This is reflected in the FAIR principles (Wilkinson et al., https://doi.org/10.1038/sdata.2016.18), which are regarded as the gold standard by many scientific communities.

Pimentel et al. (https://doi.org/10.1109/MSR.2019.00077) analysed over 800,000 Jupyter notebooks from GitHub. 24 % executed without errors and only 4 % produced the same results. The likelihood of successfully compiling and running decade-old source code is slim. Long-term support for well-established operating systems ranges from 5 to 10 years, user software support is usually shorter, and for free and open-source repositories often no support, or only best-effort support, is offered.

We present an approach to safely reproduce the computational environment in the future with a focus on long-term availability. Instead of trying to reinstall the computational environment based on the stored metadata, we propose to archive the Docker image, the user space (user-installed packages) and finally the source code. Recreating the system in this way is more like restoring a backup, where the backup is the equivalent of an entire computer system. It does not solve all the problems but removes a great deal of complexity and uncertainty.
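The archiving idea above can be sketched in Python as follows. This is a hedged illustration, not the authors' tooling: it assumes the Docker image has already been exported to a tar file (for example with `docker save -o image.tar <image>`), and the function names, the component labels and the `MANIFEST.json` file name are hypothetical choices made for this example.

```python
import hashlib
import io
import json
import tarfile
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(components):
    """Map each archived file name to its checksum.

    `components` maps a label (e.g. "image", "userspace", "source")
    to a file or directory path; the labels are illustrative.
    """
    manifest = {}
    for label, root in components.items():
        root = Path(root)
        if root.is_file():
            files = [root]
        else:
            files = sorted(p for p in root.rglob("*") if p.is_file())
        for path in files:
            name = f"{label}/{path.relative_to(root.parent).as_posix()}"
            manifest[name] = sha256_of(path)
    return manifest


def archive_environment(components, out_path):
    """Bundle all components and a checksum manifest into one tar file."""
    manifest = build_manifest(components)
    payload = json.dumps(manifest, indent=2, sort_keys=True).encode("utf-8")
    with tarfile.open(out_path, "w") as bundle:
        info = tarfile.TarInfo(name="MANIFEST.json")
        info.size = len(payload)
        bundle.addfile(info, io.BytesIO(payload))
        for label, root in components.items():
            # Directories are added recursively under "<label>/<dirname>/".
            bundle.add(root, arcname=f"{label}/{Path(root).name}")
    return manifest
```

Restoring then amounts to unpacking the bundle, verifying the checksums against the manifest, and loading the image tar file back into Docker, which matches the "restoring a backup" analogy in the text.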

Though there are shortcomings in our approach, we believe our solution will lower the threshold for scientists to provide rich metadata, code and results attached to a publication that can be reproduced in the far future.

How to cite: D'Onofrio, C., Pantazatou, K., Karstens, U., Hellström, M., Storm, I., and Vermeulen, A.: Long-term Reproducibility for Jupyter Notebook, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-9235, https://doi.org/10.5194/egusphere-egu23-9235, 2023.

X4.159
|
EGU23-9906
Beryl Morris, Werner Kutsch, Michael SanClements, Henry Loescher, Melissa Genazzio, Michael Mirtl, Jaana Back, Tommy Bornman, Paula Mabee, Xiubo Yu, Steffan Zacharias, Gregor Feig, Mark Grant, Emmanuel Salmon, and Leiming Zhang

Guided by the Framework Criteria of the Group of Senior Officials (GSO) on Global Research Infrastructures, six major ecosystem research infrastructures (SAEON/South Africa, TERN/Australia, CERN/China, NEON/USA, ICOS/Europe, eLTER/Europe) came together in 2020 under an MOU, establishing the Global Ecosystem Research Infrastructure (GERI). With its goal of providing interoperable data and services based on terrestrial and coastal in-situ observations from a high number of observational sites, organized in a common hierarchical system and standardized to the highest possible degree, GERI provides a unique opportunity to advance our understanding of ecological processes across continents, decades, and disciplinary boundaries.

Aggregating ecological and biogeophysical data is complex. Not only do those data cover a myriad of different types of natural phenomena, and the interactions between them, but the data are generated in different jurisdictions using different standards and approaches. Thus, as a critical first step in understanding the challenges and potential of its multi-institutional, multi-country data landscape, GERI has identified and mapped all the data types from each of its members, grouping the data suites provided by each GERI member into the drivers of changes (causes) and the ecological processes (effects) and then visualizing them in broad (searchable) common ecological categories.

This exercise has allowed evaluation of the potential for a targeted data harmonization effort based on a subset of data products with high relevance to specific use cases (e.g., drought across multiple continents). By using the subset of relevant data as a community test case, GERI can ascertain the efficacy of a specific harmonized data set in advancing a priority area of science. Such a prototype will set the stage for future efforts and ensure that GERI addresses the most pressing global research challenges, i.e. those frontiers of knowledge where a global-critical-mass effort is required to achieve progress.

How to cite: Morris, B., Kutsch, W., SanClements, M., Loescher, H., Genazzio, M., Mirtl, M., Back, J., Bornman, T., Mabee, P., Yu, X., Zacharias, S., Feig, G., Grant, M., Salmon, E., and Zhang, L.: Addressing pressing global societal research challenges through targeted harmonisation of macrosystems ecology data sets, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-9906, https://doi.org/10.5194/egusphere-egu23-9906, 2023.

X4.160
|
EGU23-15749
Rebecca Farrington, Alexander Prent, Lesley Wyborn, Tim Rawling, Marthe Klöcking, Kerstin Lehnert, Kirsten Elger, Geertje ter Maat, Dominik Hezel, and Simon Hodson

‘WorldFAIR: Global cooperation on FAIR data policy and practice’ is a European Commission funded project composed of 11 discipline and cross-discipline case studies drawn together by CODATA, the Committee on Data of the International Science Council, and is supported by the Research Data Alliance. WorldFAIR is a diverse, global community effort that currently has 19 partners located in Africa, Australasia, Europe, and North and South America, representing organisations from research, government and industry. The 11 individual case studies are drawn from Chemistry, Nanomaterials, Geochemistry, Social Surveys, Population Health, Urban Health, Biodiversity, Agriculture, Oceans, Disaster Risk Reduction and Cultural Heritage. The WorldFAIR project aims to focus on the interoperability and reusability of research data products both within and across disciplines by creating a Cross-Domain Interoperability Framework (CDIF).

The foundation of the CDIF will be a series of FAIR Implementation Profiles (FIPs) which will be used as a methodology for individual communities to express their FAIR practices and decisions for each of the 15 individual FAIR guiding principles. 

As an example of how this will work, WorldFAIR’s Geochemistry case study is led by OneGeochemistry, an international network of national geochemical data infrastructure organisations. It was initially an informal network with representatives from AuScope (Australia), GEOROC (Germany), EPOS Multi-scale Laboratories (Europe), EarthChem (US) and AstroMaterials (US). With the advent of WorldFAIR, OneGeochemistry has formalised its governance structure and is now a CODATA Work Group. Over the life of WorldFAIR, OneGeochemistry will work towards developing community prototype FAIR Implementation Profile(s) for individual geochemical techniques, including the minimum defined variables, through workshops and consultations, and will subsequently be responsible for their communication, publication and dissemination. The Geochemistry case study will work closely with the Chemistry case study and leverage relevant chemical standards and vocabularies wherever possible.

Through the development of community-led FAIR Implementation Profile(s) for geochemistry within a global Cross-Domain Interoperability Framework (CDIF), WorldFAIR and OneGeochemistry are both advancing the adoption of the FAIR data principles within Geochemistry and simultaneously enabling interoperability of geochemical research data products across the other ten discipline case studies.

How to cite: Farrington, R., Prent, A., Wyborn, L., Rawling, T., Klöcking, M., Lehnert, K., Elger, K., ter Maat, G., Hezel, D., and Hodson, S.: The WorldFAIR project: enabling global interdisciplinary cooperation on integrating FAIR Data policy and practices in geochemistry with ten other disciplinary groups., EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15749, https://doi.org/10.5194/egusphere-egu23-15749, 2023.

X4.161
|
EGU23-12686
Fabrice Cotton, Angelo Strollo, Helle Pedersen, Helen Crowley, Stefan Wiemer, Florian Haslinger, Marc Urvois, Jean Schmittbuhl, Stefano Lorito, Andrey Babeyko, Daniele Bailo, Jan Michalek, Otto Lange, Javier Quintero, Gaetano Festa, Shane Murphy, Mariusz Majdanski, Iris Christadle, Mateus Prestes, and Stefanie Weege

Modern scientific endeavours already have the capacity to call upon a vast variety of data, often in huge volumes. However, the challenge is not only how to make the most of such a resource, but also how to make it available to the wider scientific community, especially for encouraging curiosity-driven research. Fifty-one institutions from 13 countries are currently working together in the Geo-INQUIRE (Geosphere INfrastructure for QUestions into Integrated REsearch) project.

The main goal of this new project is to enhance, give access to, and make interoperable key datasets of the Geoscience community. This will include "big" data streams and high-performance computing codes which are critical to studying the temporal variation of the solid Earth, forecasting multi-hazards, evaluating georesources and analysing the interface between the solid Earth, the oceans and the atmosphere. About 150 access points, both on-site and virtual, are involved. Transnational Access (TA, both virtual and on-site) will be provided at six test beds across Europe: the Bedretto Laboratory, Switzerland; the Ella-Link Geolab, Portugal; the Liguria-Nice-Monaco submarine infrastructure, Italy/France; the Irpinia Near-Fault Observatory, Italy; the Eastern Sicily facility, Italy; and the Corinth Rift Laboratory, Greece.

Several European Research Infrastructure Consortia take part, namely the European Plate Observing System (EPOS) for solid Earth and geodynamics observations, the European Multidisciplinary Seafloor and Water Column Observatory (EMSO) for deep-sea and coastal observations, and ECCSEL for CO2 capture, utilization, transport, and storage, and geoenergy. This 16 million Euro project started in October 2022, within the Horizon Europe Infrastructure program of the European Union.

The presentation will briefly describe the project and give examples of curiosity-driven research topics which will be made possible through such a multi-disciplinary project. We will finally present the challenges and efforts made to comply with FAIR principles and accompany the dissemination of the data with innovative and cross-disciplinary training activities.

How to cite: Cotton, F., Strollo, A., Pedersen, H., Crowley, H., Wiemer, S., Haslinger, F., Urvois, M., Schmittbuhl, J., Lorito, S., Babeyko, A., Bailo, D., Michalek, J., Lange, O., Quintero, J., Festa, G., Murphy, S., Majdanski, M., Christadle, I., Prestes, M., and Weege, S.: Advancing frontier knowledge of the solid earth by providing access to integrated and customized services: the Geo-INQUIRE project, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-12686, https://doi.org/10.5194/egusphere-egu23-12686, 2023.

X4.162
|
EGU23-15909
Carmela Cornacchia and Ilaria Rosati

The Italian Integrated Environmental Research Infrastructures System (ITINERIS) Project started in November, and it will build the Italian Hub of Research Infrastructures in the environmental scientific domain, providing access to data and services and supporting the country in addressing current and expected environmental challenges. ITINERIS coordinates a network of national nodes from 22 RIs (18 from the environmental domain, 2 from agri-food with strong links to the environment and 2 from the PSE domain, supporting services for the marine domain).

ITINERIS has been designed with a view to synergy with the European RI framework, and it will support the participation of Italian scientists in pan-European initiatives (ENVRI-FAIR, EOSC) and in Horizon Europe (Pillar 1, Missions, Partnerships, Clusters). ITINERIS will have a significant impact on national environmental research, providing scientific support to the design of actionable environmental strategies. ITINERIS adopts a whole-system, cross-disciplinary approach to the Earth System and its changes, allowing users to benefit from the integrated system of RIs and the knowledge it produces. This broad-scale vision of environmental research, sustained by the main Italian environmental scientists involved in European RIs, is truly innovative and will support our country in taking a leading role in European environmental research, designing the framework for the next decades.

A specific objective will focus on access to facilities, FAIR data and related services, connecting the established network of distributed national environmental research infrastructures to as wide a user community as possible. Following a user-centric approach and in accordance with the RI network's technical capability and mission, access services to the national RIs’ facilities and FAIR resources (data, services and other research outputs) will be set up. Most of the 22 RIs participating in ITINERIS already offer data and services from and across different domains of the Earth system (Atmosphere, Hydrosphere, Terrestrial Biosphere and Geosphere) through different systems, protocols and portals, and with different access and FAIRification procedures.

ITINERIS builds upon this current effort to improve access management and FAIRness, developing the conditions for harmonizing standards, metadata and policies amongst the different RIs. The RIs involved are very diverse and at different levels of maturity, but they face similar challenges in their operations with regard to FAIR compliance and access management.

How to cite: Cornacchia, C. and Rosati, I.: Access to facilities, FAIR data and related services: the ITINERIS project, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-15909, https://doi.org/10.5194/egusphere-egu23-15909, 2023.

Posters virtual: Thu, 27 Apr, 16:15–18:00 | vHall ESSI/GI/NP

Chairperson: Jacco Konijn
vEGN.10
|
EGU23-14673
Towards a more pervasive role of PIDs in ENVRI metadata
(withdrawn)
Margareta Hellström