ESSI2.3

ESSI2 EDI
Established and Establishing Disciplinary International Frameworks that will Ultimately Enable Real-Time Interdisciplinary Sharing of Data. 

As we increasingly face global challenges such as climate change, pandemics, and the environmentally sustainable exploitation of our resources, there is a greater urgency to bring together multiple existing data/information infrastructure systems that are distributed around the world to create machine actionable, interoperable, reusable, real-time data sharing frameworks.

The problem is that research can be a ‘competitive’ process, and there is a tendency for this competition to be focused on which is considered to be the best data sharing system or data standard that supposedly is THE one that everyone SHOULD use.

An alternative approach is to build loosely coupled frameworks that allow multiple existing systems to interoperate while still preserving their deeper disciplinary specialization. For this approach to work, there will need to be agreement on 1) the minimum core variables for sharing data content, and 2) the technical standards/technologies required to enable real-time data interoperability.

There are well-established examples of groups facilitating global data sharing (e.g., Federation of Digital Seismograph Networks (FDSN), OneGeology, Earth System Grid Federation (ESGF), OGC, W3C, GEO). Many new groups are starting to form global disciplinary data networks; some are already trying to link frameworks together to enable global interdisciplinary sharing of data (e.g., CODATA/DDI Cross-Domain Data Initiative).

This session seeks contributions from any group that has established or is establishing a data-sharing infrastructure system/framework regardless of scale, as well as those that are attempting global and/or interdisciplinary networking. Topics may range from (meta)data standards, defining minimum core content variables, or be focused on technologies or organizational setups for enabling data sharing. Papers on the social dynamics of building sharing systems/frameworks are also welcome.

Co-sponsored by AGU
Convener: Anca Hienola | Co-conveners: Jacco Konijn, Lesley Wyborn, Florian Haslinger, Kirsten Elger
Presentations | Fri, 27 May, 15:10–16:40 (CEST) | Room 0.31/32


Chairpersons: Anca Hienola, Kirsten Elger, Florian Haslinger
15:10–15:12
15:12–15:22 | EGU22-8474 | solicited | Virtual presentation
John Watkins, Johannes Peterseil, Alessandro Oggioni, and Vladan Minic

One of the major goals of the upcoming European integrated Long-Term Ecosystem, critical zone and socio-ecological Research Infrastructure (eLTER RI) is to provide reliable and quality-controlled long-term environmental data from various disciplines for scientific analysis as well as the assessment of environmental policy impacts. For this purpose, eLTER has been designing and piloting a federated data infrastructure for integration and dissemination of a broad range of in situ observations and related data.
Implementing such a pan-European environmental data infrastructure is a lengthy and complex process driven by user needs, stakeholder requirements and general service and technology best practices. The European LTER community has laid the foundations of this eLTER Information System. For further improvements, user needs have recently been collected by (a) targeted interviews with selected stakeholders to identify requirements, (b) workshops mapping requirements to potential RI services, and (c) analysis work for designing the RI service portfolio. The requirements collections are used to derive functional (i.e. the behaviour of essential features of the system) and non-functional (i.e. the general characteristics of the system) requirements for the IT infrastructure and services. These collected requirements revolve around the development of workflows for the ingestion, curation and publication of data objects, including the creation, harvesting, discovery and visualisation of metadata, as well as providing means to support the analysis of these datasets and the communication of study results.
Considering that downstream analyses of data from both eLTER and other RIs are a key part of the RI's scope, the design includes virtual collaborative environments where different data and analyses can be brought together and results shared with FAIR principles as the default for research practice. The eLTER RI will take advantage of data stored in existing partner data systems, harmonised by a central discovery portal and federated data access components providing a common information management infrastructure for bridging across environmental RIs.
This presentation will provide an overview of the current stage of the eLTER RI developments as well as its major components, provide an outlook for future developments and discuss the technical and scientific challenges of building the eLTER RI for interdisciplinary data sharing.

How to cite: Watkins, J., Peterseil, J., Oggioni, A., and Minic, V.: The construction of the eLTER Pan-European research infrastructure to support multidisciplinary environmental data integration and analysis, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-8474, https://doi.org/10.5194/egusphere-egu22-8474, 2022.

15:22–15:27 | EGU22-2145 | On-site presentation
Martina Stockhause and Michael Lautenschlager

The Data Distribution Centre (DDC) of the Intergovernmental Panel on Climate Change (IPCC) celebrates its 25th anniversary in 2022. DKRZ is the last remaining founding member among the DDC Partners. The contribution looks back on the past 25 years of the DDC at DKRZ from its establishment to the present. It shows which milestones have been reached in the areas of data management and data standardization, e.g.

  • the NetCDF/CF data standard,
  • the DataCite data DOI assignment enabling data citation,  
  • the data preservation and stewardship standards of the World Data System (WDS), 
  • the Earth System Grid Federation (ESGF) as data infrastructure standard, or 
  • the IPCC FAIR Guidelines for the current 6th Assessment Report (AR6). 

In addition to the continuous effort to adopt new standards and curate the data holdings, current challenges, both technical and organizational, and possible future directions are discussed. The most difficult of the challenges remains the long-term strategy for sustainable DDC services as part of an increasingly interoperable data service environment, which is technically described in the FAIR digital object framework and which is guided in terms of content by the UN Sustainable Development Goal 13 on climate action.

(http://ipcc-data.org; http://ipcc.wdc-climate.de)

How to cite: Stockhause, M. and Lautenschlager, M.: 25 years of the IPCC Data Distribution Centre at the German Climate Computing Center (DKRZ), EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-2145, https://doi.org/10.5194/egusphere-egu22-2145, 2022.

15:27–15:32 | EGU22-10897 | ECS | Presentation form not yet defined
Sarah Ramdeen, Kerstin Lehnert, Jens Klump, Matt Buys, Sarala Wimalaratne, and Lesley Wyborn

In October 2021, DataCite and the IGSN e.V. signed an agreement to form a partnership to support the global adoption, implementation, and use of physical sample identifiers. Both DataCite and IGSN currently offer the ability to provide Globally Unique, Persistent, Resolvable Identifiers (GUPRIs) within the overall research ecosystem, and the proposed collaboration will bring together the strengths of each organization.

DataCite is a community-led organisation that has been providing the means to create, find, cite, connect, and use research across 47 countries globally since 2009. DataCite provides persistent identifiers (DOIs) for research data and other research outputs, and supports the efforts of several identifier communities. DataCite also develops services that make it easier for researchers to connect and share their DOIs with the broader research ecosystem. 
IGSN e.V. is an international, non-profit organization with more than 20 members and has a narrower focus than DataCite. The core purpose of IGSN is to enable transparent and traceable connections between samples, instruments, grants, data, publications, people and organizations. Since 2011, IGSN has provided a central registration system that enables researchers to assign a globally unique and persistent identifier to physical samples.

The proposed partnership will enable IGSN to leverage DataCite DOI registration while allowing IGSN to focus on community efforts such as promoting and expanding the global samples ecosystem and supporting new research and best practice in methods of identifying, citing, and locating physical samples. DataCite will provide the IGSN ID registration services and support to ensure the ongoing sustainability of the IGSN PID infrastructure and its integration with the global PID ecosystem.

This partnership is an opportunity for IGSN to reenvision its governance and community engagement and to reassess how the IGSN can best serve the community in today’s open science ecosystem. This talk will focus on the developing changes to the IGSN governance and community efforts.
Different research communities have a wide range of requirements for the metadata and identification of samples. The IGSN plans to develop an international ‘Community of Communities’ which will include members from the global samples community across multiple disciplines. The community will support varying levels of skills with PIDs and metadata. It will enable cohesion around the use of IGSN, thus enabling greater research discovery, innovation and advancement for samples.

The IGSN Samples Community (IGSN SC) aspires to be a collaborative space for community development that promotes the use of samples and their connections to any derived observations, images, and analytical data.

How to cite: Ramdeen, S., Lehnert, K., Klump, J., Buys, M., Wimalaratne, S., and Wyborn, L.: Progressing the global samples community through the new partnership between IGSN and DataCite, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-10897, https://doi.org/10.5194/egusphere-egu22-10897, 2022.

15:32–15:37 | EGU22-5931 | Presentation form not yet defined
Christian Briese, Charis Chatzikyriakou, Diego Scardaci, Zdeněk Šustr, Enol Fernández, Björn Backeberg, and Elonora Testa

Through the provision of massive streams of high-resolution Earth Observation (EO) data, the EU Copernicus programme has established itself globally as the predominant spatial data provider. These data are widely used by research communities to monitor and address global challenges, such as environmental monitoring and climate change, supporting European policy initiatives, such as the Green Deal and others. To date, there is no single European data sharing and processing infrastructure that serves all datasets of interest, and Europe is falling behind international developments in Big Data analytics and computing.

The C-SCALE (Copernicus - eoSC AnaLytics Engine, https://c-scale.eu) project federates European EO infrastructure services, such as ESA’s Sentinel Collaborative Ground Segment, the Copernicus DIASes (Data and Information Access Services under the EC), independent nationally-funded EO service providers, and European Open Science Cloud (EOSC) e-infrastructure providers. It capitalises on EOSC's capacity and capabilities to support Copernicus research and operations with large and easily accessible European computing environments. The project will implement and publish the C-SCALE Federation in the EOSC Portal as a suite of complementary services that can be easily exploited. It will consist of a Data Federation, a service providing access to a large EO data archive, a Compute Federation, and analytics tools.

The C-SCALE Data Federation aims at making EO data providers under EOSC findable, their metadata databases searchable, and their product storage accessible. While a centralised, monolithic, complete Copernicus data archive may not be feasible, some organisations maintain various archives for limited areas of their interest. C-SCALE, therefore, integrates these heterogeneous resources into a “system of systems” that will offer the users an interface that, in most cases, provides similar functionality and quality of service as a centralised, monolithic data archive would. The federation is built on existing technologies, avoiding redundancy and replication of functions and not disrupting existing usage patterns at participating sites, instead only adding a simple layer for improved discovery and seamless access.

At the same time, the C-SCALE Compute Federation provides access to a wide range of computing providers (IaaS VMs, container orchestration platforms, HPC and HTC systems) to enable the analysis of Copernicus and EO data under EOSC. The design of the federation allows users to deploy their applications using federated authentication mechanisms, find their software under a common catalogue, and have access to data using C-SCALE Data Federation tools. The federation relies on existing tools and services already compliant with EOSC, thus facilitating the integration into the larger EOSC ecosystem.

By making such scalable federated services for Big Copernicus Data Analytics available through EOSC and its Portal, and by linking the problems and results with experience from other research disciplines, C-SCALE helps to support the EO sector in its development. By abstracting the set-up of computing and storage resources from the end-users, it enables the deployment of custom workflows to generate meaningful results quickly and easily. Furthermore, the project will deliver a blueprint, setting up an interaction model between service providers to facilitate interoperability between commercial and public cloud infrastructures.

How to cite: Briese, C., Chatzikyriakou, C., Scardaci, D., Šustr, Z., Fernández, E., Backeberg, B., and Testa, E.: C-SCALE: A new Data and Compute Federation for Earth Observation, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-5931, https://doi.org/10.5194/egusphere-egu22-5931, 2022.

15:37–15:42 | EGU22-11968 | Virtual presentation
Anna Miglio, Andras Fabian, Carine Bruyninx, Stefanie De Bodt, Juliette Legrand, Paula Oset Garcia, and Inge Van Nieuwerburgh

Accurate positioning for activities such as navigation, mapping, and surveying relies on permanent stations located all over the world that continuously track Global Navigation Satellite Systems (GNSS, such as Galileo, GPS, GLONASS).
The Royal Observatory of Belgium maintains repositories containing decades of observation data from hundreds of GNSS stations belonging to Belgian and European networks (e.g., the EUREF public repository). 
However, current procedures for accessing GNSS data do not adequately serve user needs. For example, in the case of the EUREF repository, despite the fact that its GNSS data originate from a significant number of data providers and could be handled in different ways, provenance information is lacking and data licenses are not always available.
In order to respond to user demands, GNSS data and the associated metadata need to be standardised, discoverable and interoperable i.e., made FAIR (Findable, Accessible, Interoperable, and Re-usable). Indeed, FAIR data principles serve as guidelines for making scientific data suitable for reuse, by both people and machines, under clearly defined conditions. 
We propose to identify existing metadata standards that cover the needs of the GNSS community to the maximum extent and to extend them and/or to develop an application profile, considering also best practices at other GNSS data repositories. 

Here we present two proposals for metadata to be provided to the users when querying and/or downloading GNSS data from GNSS data repositories. 
We first consider metadata containing station-specific information (e.g., station owner, GNSS equipment) and propose an extension of GeodesyML, an XML implementation of the eGeodesy model aligned with international standards such as ISO 19115-1:2014 and OGC's GML. The proposed extension contains additional classes and properties from domain-specific vocabularies when necessary, and includes extra metadata such as data license and file provenance information to comply with FAIR data principles. All proposed changes to GeodesyML are optional and therefore guarantee full backwards compatibility.

Secondly, we consider metadata related to GNSS observation data, i.e. RINEX data files. We propose an application profile based on the specifications of the Data Catalog Vocabulary (DCAT), an RDF vocabulary that, by design, facilitates the interoperability between data portals (supporting DCAT-based RDF documents) and enables publishing metadata directly on the web by using different formats.
In particular, our proposal (GNSS-DCAT-AP) includes new recommended metadata classes to describe the specific characteristics of GNSS observation data: the type of RINEX file (e.g., compression format, frequency); the RINEX file header and information regarding the GNSS station including the GNSS antenna and receiver; the software used to generate the RINEX file. Additional optional classes allow the inclusion of information regarding the GNSS antenna, receiver and monument associated with the GNSS station and extracted from the IGS site log or GeodesyML files.
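
As an illustration only, the following Python sketch shows what a DCAT-based description of a single RINEX file could look like when serialised as JSON-LD. The dcat: and dct: terms come from the DCAT and Dublin Core vocabularies; the gnss: namespace, its property names, the station identifier and all URLs are hypothetical placeholders and are not taken from the GNSS-DCAT-AP proposal itself.

```python
import json

# Hypothetical namespace for GNSS-specific terms; the real GNSS-DCAT-AP
# class and property names are not given in the abstract.
GNSS_NS = "https://example.org/gnss-dcat-ap#"


def rinex_dataset_record(station_id, day, rinex_version, sampling_s,
                         license_url, download_url):
    """Return a JSON-LD dict describing one daily RINEX file as a dcat:Dataset."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
            "gnss": GNSS_NS,
        },
        "@type": "dcat:Dataset",
        "dct:title": f"GNSS observations, station {station_id}, {day}",
        "dct:license": {"@id": license_url},
        # Illustrative GNSS-specific properties (assumed names).
        "gnss:stationId": station_id,
        "gnss:rinexVersion": rinex_version,
        "gnss:samplingIntervalSeconds": sampling_s,
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                "dcat:downloadURL": {"@id": download_url},
                "dcat:compressFormat": {
                    "@id": "https://www.iana.org/assignments/media-types/application/gzip"
                },
            }
        ],
    }


# Placeholder values, not real station data.
record = rinex_dataset_record(
    station_id="STAT00XXX",
    day="2022-01-01",
    rinex_version="3.04",
    sampling_s=30,
    license_url="https://creativecommons.org/licenses/by/4.0/",
    download_url="https://example.org/rinex/STAT00XXX_20220010000_01D_30S_MO.crx.gz",
)
print(json.dumps(record, indent=2))
```

Keeping the licence and the distribution details in standard DCAT terms is what would let a generic DCAT-aware portal harvest such a record, while the domain-specific properties remain optional extras.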

How to cite: Miglio, A., Fabian, A., Bruyninx, C., De Bodt, S., Legrand, J., Oset Garcia, P., and Van Nieuwerburgh, I.: Proposed metadata standards for FAIR access to GNSS data, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-11968, https://doi.org/10.5194/egusphere-egu22-11968, 2022.

15:42–15:47 | EGU22-6628 | Presentation form not yet defined
Rui Fernandes, Carine Bruyninx, Paul Crocker, Anne Socquet, and Mathilde Vergnolle and the EPOS-GNSS Members

EPOS-GNSS is the Thematic Core Service being implemented in the framework of the European Plate Observing System (EPOS) focused on the management and dissemination of GNSS (Global Navigation Satellite Systems) Data and Products. The European Research Infrastructure Consortium (ERIC) has given EPOS a legal personality and capacity that is recognised in all EU Member States and that allows it to provide open access to a large pool of integrated Solid Earth science data, data products and facilities for researchers.

The GNSS community in Europe is benefiting from EPOS ERIC to create mechanisms and procedures to harmonize, in collaboration with other pan-European infrastructures (particularly EUREF), the access to GNSS data, metadata and derived products (time-series, velocities, and strain rate maps) that are primarily of interest to the Solid Earth community but ultimately benefit many other stakeholders, particularly data providers and other scientific and technical applications.

In this presentation we focus on the three main components that entered the pre-operational phase last year: (a) Governance – with the aim that the entire community, from data providers to end-users, will be represented and their efforts recognized; (b) GLASS – the dedicated in-house software package developed for the dissemination of GNSS data and products with rigorous quality control procedures; (c) Products – internally consistent GNSS solutions of dedicated products (time-series, velocities and strain-rates) created from the available data set using state-of-the-art methodologies, to be used to improve the understanding of the different Solid Earth mechanisms taking place in the European region.

How to cite: Fernandes, R., Bruyninx, C., Crocker, P., Socquet, A., and Vergnolle, M. and the EPOS-GNSS Members: EPOS-GNSS – Current status of service implementation for European GNSS data and products, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-6628, https://doi.org/10.5194/egusphere-egu22-6628, 2022.

15:47–15:52 | EGU22-10071 | On-site presentation
Florian Haslinger, Lars Ottemöller, Carlo Cauzzi, Susana Custodio, Rémy Bossu, Alberto Michelini, Fabrice Cotton, Helen Crowley, Laurentiu Danciu, Irene Molinari, and Stefano Parolai

The European Plate Observing System EPOS is the single coordinated framework for solid Earth science data, products and services on a European level. As one of the science domain structures within EPOS, EPOS Seismology brings together the three large European infrastructures in seismology: ORFEUS for seismic waveform data & related products, EMSC for parametric earthquake information, and EFEHR for seismic hazard and risk information. Across these three pillars, EPOS Seismology provides services to store, discover and access seismological data and products from raw waveforms to elaborated hazard and risk assessment.

ORFEUS, EMSC and EFEHR are community initiatives / infrastructures that each have their own history, structure, membership, governance and established mode of work (including data sharing and distribution practices), developed in part over decades. While many institutions and individuals are engaged in more than one of these initiatives, overall the active membership is quite distinct. Also, each of the initiatives has different connections to and interactions with other international organisations. Common to all is the adoption and promotion of recognized international standards for data, products and services originating from wider community organisations (e.g. FDSN, IASPEI, GEM), and the active participation in developing those further or creating new ones together with the community.

In this presentation we will briefly review the history and development of the three initiatives and discuss how we set up EPOS Seismology as a joint coordination framework within EPOS. We will highlight issues encountered on the way and those that we are still trying to solve in our attempt to create and operate a coordinated research infrastructure that appropriately serves the needs of today’s scientific community. Among those issues is also the ‘timeliness’ of data and products: while a number of services offer almost-real-time access to newly available information at least in theory, this comes with various downstream implications that are currently actively discussed. We also cover the envisaged role of EPOS Seismology in supporting international multi-disciplinary activities that require and benefit from harmonized, open, and interoperable data, products, services and facilities from the waveform, catalogue and hazard / risk domains of seismology.

How to cite: Haslinger, F., Ottemöller, L., Cauzzi, C., Custodio, S., Bossu, R., Michelini, A., Cotton, F., Crowley, H., Danciu, L., Molinari, I., and Parolai, S.: Facilitating Multi-Disciplinary Research via Integrated Access to the Seismological Data & Product Services of EPOS Seismology, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-10071, https://doi.org/10.5194/egusphere-egu22-10071, 2022.

15:52–15:57 | EGU22-5478 | On-site presentation
Jan Michalek, Kuvvet Atakan, Christian Rønnevik, Sara Kverme, Lars Ottemøller, Øyvind Natvik, Tor Langeland, Ove Daae Lampe, Gro Fonnes, Jeremy Cook, Jon Magnus Christensen, Ulf Baadshaug, Halfdan Pascal Kierulf, Bjørn-Ove Grøtan, Odleiv Olesen, John Dehls, and Valerie Maupin

The European Plate Observing System (EPOS) is a European project to build a pan-European infrastructure for accessing solid Earth science data, now governed by EPOS ERIC (European Research Infrastructure Consortium). The EPOS-Norway project (EPOS-N; RCN-Infrastructure Programme - Project no. 245763) is a Norwegian project funded by the National Research Council. The aim of the Norwegian EPOS e‑infrastructure is to integrate data from the seismological and geodetic networks, as well as the data from the geological and geophysical data repositories. Among the six EPOS-N project partners, four institutions are providing data: University of Bergen (UIB), Norwegian Mapping Authority (NMA), Geological Survey of Norway (NGU) and NORSAR.

In this contribution, we present the EPOS-Norway Portal as an online, open access, interactive tool, allowing visual analysis of multidimensional data. It supports maps and 2D plots with linked visualizations. Currently access is provided to more than 300 datasets (18 web services, 288 map layers and 14 static datasets) from four subdomains of Earth science in Norway. New datasets are planned to be integrated in the future. The EPOS-N Portal can access remote datasets via web services such as FDSNWS for seismological data and OGC services for geological and geophysical data (e.g. WMS). Standalone datasets are available through preloaded data files. Users can also simply add another WMS server or upload their own dataset for visualization and comparison with other datasets. This portal provides a unique way (the first of its kind in Norway) to explore various geoscientific datasets in one common interface. One of the key aspects is the quick, simultaneous visual inspection of data from various disciplines and the testing of scientific or geohazard-related hypotheses. One such example is the spatio-temporal correlation of earthquakes (1980 until now) with existing critical infrastructures (e.g. pipelines), geological structures, submarine landslides or unstable slopes.
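
To make the two kinds of remote access mentioned above more concrete, here is a minimal Python sketch that builds request URLs for an FDSN event web service and for an OGC WMS GetMap call. The query parameter names follow the public FDSN web service and WMS 1.3.0 specifications; the base URLs, the layer name and the bounding box are placeholders, and the sketch does not show how the EPOS-N Portal itself issues its requests.

```python
from urllib.parse import urlencode


def fdsn_event_url(base_url, start, end, min_lat, max_lat, min_lon, max_lon,
                   min_mag=None):
    """Build an fdsnws-event query URL for a time window and bounding box."""
    params = {
        "starttime": start,
        "endtime": end,
        "minlatitude": min_lat,
        "maxlatitude": max_lat,
        "minlongitude": min_lon,
        "maxlongitude": max_lon,
        "format": "text",
    }
    if min_mag is not None:
        params["minmagnitude"] = min_mag
    return f"{base_url}/fdsnws/event/1/query?{urlencode(params)}"


def wms_getmap_url(base_url, layer, bbox, width=800, height=600):
    """Build a WMS 1.3.0 GetMap URL.

    With CRS EPSG:4326, WMS 1.3.0 expects BBOX in latitude/longitude order:
    (min_lat, min_lon, max_lat, max_lon).
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",
        "CRS": "EPSG:4326",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return f"{base_url}?{urlencode(params)}"


# Placeholder hosts and layer name; a rough bounding box around Norway.
event_url = fdsn_event_url("https://example.org", "1980-01-01", "2022-01-01",
                           57.0, 72.0, 3.0, 32.0, min_mag=3.0)
map_url = wms_getmap_url("https://example.org/wms", "hypothetical_layer",
                         (57.0, 3.0, 72.0, 32.0))
print(event_url)
print(map_url)
```

Because both services are addressed through standard query parameters, the same two helper functions would work against any compliant server, which is the property that allows a portal to add further WMS servers without custom code.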

The EPOS-N Portal is implemented by adapting Enlighten-web, a server-client program developed by NORCE. Enlighten-web facilitates interactive visual analysis of large multidimensional data sets, and supports interactive mapping of millions of points. The Enlighten-web client runs inside a web browser. An important element in the Enlighten-web functionality is brushing and linking, which is useful for exploring complex data sets to discover correlations and interesting properties hidden in the data. The views are linked to each other, so that highlighting a subset in one view automatically leads to the corresponding subsets being highlighted in all other linked views.
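
The brushing-and-linking idea described above can be sketched in a few lines of Python. This is a generic illustration of the technique with made-up sample events; the class names are hypothetical and it is not the Enlighten-web implementation.

```python
class LinkedSelection:
    """Shared selection state that notifies every registered view."""

    def __init__(self):
        self._views = []
        self.selected = set()

    def register(self, view):
        self._views.append(view)

    def brush(self, indices):
        """Select a subset of record indices and propagate it to all views."""
        self.selected = set(indices)
        for view in self._views:
            view.on_selection(self.selected)


class View:
    """One visualisation (map, scatter plot, ...) over the same records."""

    def __init__(self, name, records, selection):
        self.name = name
        self.records = records
        self.highlighted = []
        selection.register(self)

    def on_selection(self, selected):
        # Highlight exactly the records whose indices were brushed.
        self.highlighted = [self.records[i] for i in sorted(selected)]


# Made-up sample events, for illustration only.
events = [
    {"mag": 2.1, "depth_km": 10, "lat": 60.4, "lon": 5.3},
    {"mag": 3.4, "depth_km": 15, "lat": 61.0, "lon": 4.1},
    {"mag": 4.0, "depth_km": 8, "lat": 66.3, "lon": 13.2},
]

selection = LinkedSelection()
map_view = View("map", [(e["lat"], e["lon"]) for e in events], selection)
plot_view = View("magnitude-depth", [(e["mag"], e["depth_km"]) for e in events], selection)

# Brushing events with magnitude >= 3 in the plot also highlights them on the map.
selection.brush(i for i, e in enumerate(events) if e["mag"] >= 3.0)
print(map_view.highlighted)
print(plot_view.highlighted)
```

The key design point is that views never talk to each other directly: they share one selection object keyed by record index, so any number of additional linked views can be registered without changing existing ones.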

How to cite: Michalek, J., Atakan, K., Rønnevik, C., Kverme, S., Ottemøller, L., Natvik, Ø., Langeland, T., Lampe, O. D., Fonnes, G., Cook, J., Christensen, J. M., Baadshaug, U., Kierulf, H. P., Grøtan, B.-O., Olesen, O., Dehls, J., and Maupin, V.: EPOS-Norway Portal, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-5478, https://doi.org/10.5194/egusphere-egu22-5478, 2022.

15:57–16:02 | EGU22-4265 | On-site presentation
Daniele Bailo, Jan Michalek, Keith G Jeffery, Kuvvet Atakan, and Rossana Paciello and the EPOS IT Team

The European Plate Observing System (EPOS) addresses the problem of homogeneous access to heterogeneous digital assets in geoscience of the European tectonic plate. Such access opens new research opportunities. Previous attempts have been limited in scope and required much human intervention. EPOS adopts an advanced Information and Communication Technologies (ICT) architecture driven by a catalogue of rich metadata. The architecture of the EPOS system is presented together with the challenges encountered and the solutions adopted. The EPOS Data Portal introduces a new way of doing cross-disciplinary research. Multidisciplinary research places new demands on both students and teachers. The EPOS portal can be used either to explore the available datasets or to facilitate the research itself. It can also be very instructive in teaching, by demonstrating scientific use cases.

EPOS ERIC was established in 2018 as a European Research Infrastructure Consortium for building a pan-European infrastructure and accessing solid Earth science data. The sustainability phase of EPOS (EPOS-SP – EU Horizon 2020 – InfraDev Programme – Project no. 871121; 2020-2022) focuses on finding solutions for the long-term sustainability of EPOS developments. The ambitious plan of geoscientific data integration started as early as 2002 with a Conception Phase and continued with EPOS-PP (Preparatory Phase, 2010-2014), in which about 20 partners joined the project. The completed EPOS-IP project (EPOS-IP – EU Horizon 2020 – InfraDev Programme – Project no. 676564; 2015-2019) included 47 partners plus 6 associate partners from 25 countries from all over Europe and several international organizations.

The EPOS Data Portal provides access to data and data products from ten different geoscientific areas: Seismology, Near Fault Observatories, GNSS Data and Products, Volcano Observations, Satellite Data, Geomagnetic Observations, Anthropogenic Hazards, Geological Information and Modelling, Multi-scale laboratories and Tsunami Research. The Data Portal Graphical User Interface (GUI) provides search functionalities that let users filter data by several criteria (e.g. spatio-temporal extents, keywords, data/service providers, free text). It also lets users preview data in Map, Tabular or Graph formats. Finally, the GUI provides details about the selected data (e.g., name, description, license, DOI) and lets users further refine the search to reach a finer level of data granularity.

The presentation shows the achievements of the EPOS community, with a focus on the EPOS Data Portal, which provides information about the datasets available from the TCS and access to them. We demonstrate not only the features of the graphical user interface but also the underlying architecture of the whole system.

How to cite: Bailo, D., Michalek, J., Jeffery, K. G., Atakan, K., and Paciello, R. and the EPOS IT Team: EPOS Data portal for cross-disciplinary data access in the Solid Earth Domain, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-4265, https://doi.org/10.5194/egusphere-egu22-4265, 2022.

16:02–16:07 | EGU22-8905 | Presentation form not yet defined
Chad Trabant, Henry Berglund, Jerry Carter, and David Mencin

The Data Services of IRIS and the Geodetic Data Services of UNAVCO have been supporting the seismological and geodetic research communities for many decades. Historically, these two facilities have independently managed data repositories on self-managed systems. As part of merger activities between IRIS and UNAVCO, we have established a project to design, develop and implement a common, cloud-based platform. Goals of this project include operational improvements such as cost-effectiveness, robustness, on-demand scalability, significant growth potential and increased adaptability for new data types. Beyond these operational improvements, we anticipate a number of additional benefits for the research communities we serve.

The new platform will provide services for data queries across the internal repositories.  This will provide researchers with an easier path to discovery, and access to integratable data sets of related geophysical data.

Researchers will be able to conduct their data processing in the same, or data-proximate, cloud as the platform, taking advantage of copious and affordable computation offered by such environments. Following the paradigm of moving the computation to the data, this will avoid the time- and resource-consuming need to transfer the data over the internet. Furthermore, the adoption of cloud-optimized data containers and direct access by researchers will support efficient processing. In cases where transferring large volumes of data is still necessary, the large capacity of cloud storage systems will allow enhanced mechanisms such as Globus for transfer, which we will be exploring.

For many users a transition of the data repositories to a new environment will be nearly seamless.  This will be made possible by implementing many of the same services already supported by the current facilities, such as the suite of FDSN web services.  The project is currently in a prototyping stage, and we anticipate having a complete design by the end of 2022.  We will report on the status of the project, anticipated directions and challenges identified so far.

How to cite: Trabant, C., Berglund, H., Carter, J., and Mencin, D.: Developing a Next Generation Platform for Geodetic, Seismological and Other Geophysical Data Sets and Services, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-8905, https://doi.org/10.5194/egusphere-egu22-8905, 2022.

16:07–16:12
|
EGU22-9421
|
On-site presentation
Angeliki K. Adamaki, Ana Rita Gomes, Alex Vermeulen, Ari Asmi, and Andreas Petzold

As science and technology evolve, interdisciplinary targets are anything but static, introducing additional levels of complexity and challenging further the initiatives to break the barriers to interdisciplinary research. For over a decade the community of the Environmental Research Infrastructures, forming the ENVRI cluster, has been building strong foundations to overcome these challenges and benefit the environmental sciences. One of the overarching goals of the ENVRI cluster is to provide more FAIR (Findable, Accessible, Interoperable and Reusable) data and services which will be open to everyone who wishes to get access to environmental observations, from scientists and research communities of scientifically diverse clusters to curious citizens, data scientists and policy makers.

Starting with domain-specific use cases we further explore potential cross-domain cases, e.g. in the form of environmental science stories crossing disciplinary boundaries. A set of Jupyter Notebooks developed by the contributing Research Infrastructures (and accessible from a hub of services called the ENVRI-Hub) are promising tools to demonstrate and validate the capabilities of service provision among ENVRIs and across Science Clusters, and act as examples of what a user can achieve through the ENVRI-Hub. In one of the examples we investigate, a user-friendly, well-structured Jupyter Notebook uses the research infrastructures’ application programming interfaces (APIs) to plot on a single map the geographical locations of several Marine and Atmospheric stations (where the stations in this example are defined as measurement points actively collecting data). The FAIR principles provide a firm foundation defining the layer that supports the ENVRI-Hub structure and the preliminary results are promising. Considering that the APIs can become discoverable via a common ENVRI catalogue, the ENVRI-Hub aims to make full use of the machine-actionability of such a catalogue in the future to facilitate this kind of use case execution in the Hub itself.

Acknowledgement: ENVRI-FAIR has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 824068. This work is only possible with the collaboration of the ENVRI-FAIR partners and thanks to the joint efforts of the whole ENVRI team.

How to cite: Adamaki, A. K., Gomes, A. R., Vermeulen, A., Asmi, A., and Petzold, A.: Breaking the barriers to interdisciplinarity: Contributions from the Environmental Research Infrastructures, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-9421, https://doi.org/10.5194/egusphere-egu22-9421, 2022.

16:12–16:17
|
EGU22-8862
|
ECS
|
Presentation form not yet defined
Ana Rita Gomes, Angeliki Adamaki, Alex Vermeulen, Ulrich Bundke, and Andreas Petzold

The ENVRI-FAIR project brings together the ESFRI environmental research infrastructures (ENVRI) that provide environmental data and services, with the aim of making their resources compliant with the FAIR principles. To achieve this goal, the required work is mostly technical, with the ENVRIs working towards not only improving the FAIRness of their own data and services, but also reflecting their efforts at a higher level by becoming FAIR as a cluster. The approach to this task cannot be linear as it requires harmonization of efforts along different dimensions. To build on common ground, the most crucial technical gaps have been prioritized, and the ENVRIs identify common requirements and design patterns, and collaborate on making good use of existing technical solutions that improve their FAIRness.

One of the highest-ranked priorities, and among the biggest challenges, is the design of a machine-actionable ENVRI Catalogue of Services that also supports integration into the EOSC. Through this catalogue the service providers will be able to make their assets findable and accessible by mapping their resources into common and rich metadata standards, while a web application enables human interaction with the FAIR services. The design of this application, named the ENVRI-Hub, is discussed here. Other aspects related to the ENVRI services, e.g. the use of PIDs, the use of relevant vocabularies, tracking license information and provenance etc. are also investigated.

As a web application, the ENVRI-Hub can act as an integrator by bringing together already existing ENVRI services and interoperable services across research infrastructure boundaries. Exploring the potential of the ENVRI-Hub already from the design phase, the ingestion of metadata from ENVRI assets such as the ENVRI Knowledge Base, the ENVRI Catalogue of Services and the ENVRI Training Catalogue is investigated, aiming to provide the users with functionalities that are relevant to e.g. the discovery of environmental observations, services, tutorials and other available resources. The chosen architectural pattern for the development of the ENVRI-Hub can be compared to a classical n-tier architecture, comprising 1) a data tier, 2) a logic tier and 3) a presentation tier. To integrate the different ENVRI platforms while preserving the application’s independence, the ENVRI-Hub demonstrator aims to replicate an instance of the Knowledge Base and Catalogue of Services. Following a centralised architectural approach, the ENVRI-Hub serves as a harvester entity, collecting data and metadata from the ENVRI Knowledge Base and the ENVRI Catalogue of Services, therefore bringing together these ENVRI platforms into one single portal.
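The three-tier harvester pattern described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the ENVRI-Hub implementation: all class names, field names and the two in-memory "sources" are hypothetical stand-ins for the Knowledge Base and the Catalogue of Services.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    source: str       # which platform the record was harvested from
    identifier: str   # identifier within that platform
    title: str


@dataclass
class DataTier:
    """Data tier: a local replica of the harvested metadata."""
    records: dict = field(default_factory=dict)

    def upsert(self, record):
        # Keying on (source, identifier) makes repeated harvests idempotent.
        self.records[(record.source, record.identifier)] = record

    def all(self):
        return list(self.records.values())


class LogicTier:
    """Logic tier: harvests from the sources and answers queries."""

    def __init__(self, store, sources):
        self.store = store
        self.sources = sources  # name -> callable returning an iterable of dicts

    def harvest(self):
        count = 0
        for name, fetch in self.sources.items():
            for item in fetch():
                self.store.upsert(Record(name, item["id"], item["title"]))
                count += 1
        return count

    def search(self, term):
        term = term.lower()
        return [r for r in self.store.all() if term in r.title.lower()]


def present(records):
    """Presentation tier: format records for display."""
    return [f"[{r.source}] {r.title}" for r in records]


# Hypothetical sources; a real harvester would call each platform's API here.
sources = {
    "knowledge-base": lambda: [{"id": "kb-1", "title": "FAIR assessment of data services"}],
    "service-catalogue": lambda: [{"id": "svc-1", "title": "Station metadata API"}],
}
store = DataTier()
logic = LogicTier(store, sources)
harvested = logic.harvest()
lines = present(logic.search("api"))
print(harvested, lines)
```

Keeping a local replica in the data tier is what lets the portal keep serving queries when a source platform is temporarily unavailable, at the cost of having to re-harvest to stay current.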

 

Acknowledgement: ENVRI-FAIR has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 824068.

This work is only possible with the collaboration of the ENVRI-FAIR partners and thanks to the joint efforts of the whole ENVRI-Hub team.

How to cite: Gomes, A. R., Adamaki, A., Vermeulen, A., Bundke, U., and Petzold, A.: ENVRI-Hub, the open-access platform of the environmental sciences community in Europe: a closer look into the architecture, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-8862, https://doi.org/10.5194/egusphere-egu22-8862, 2022.

16:17–16:22
|
EGU22-3261
|
Presentation form not yet defined
Jens Klump, Ulrich Engelke, Vincent Fazio, Pavel Golodoniuc, Lesley Wyborn, and Tim Rawling

AuScope, founded in 2006, is the provider of research infrastructure to Australia’s Earth and geospatial science community. Its unifying strategic goals include building the Downward Looking Telescope (DLT) (a metaphor for an integrated system of Earth and geospatial instruments, services, data and analytics to enable scientists to understand Earth’s evolution through time) and exploring how Earth resources may support growing human demands. The AuScope Virtual Research Environment (AVRE) program is responsible for enabling the DLT through providing persistent access to required data and tools from a diverse range of Australian research organisations, government geological surveys and the international community.

In 2009 AuScope released a portal to provide online access to evolved data products to specific groups of users. Subsequently, this portal was combined with online tools to create the AVRE platform of specialised Virtual Laboratories that enabled the execution of explicit workflows. By 2021 it was recognised that AVRE should modernise and take advantage of new technologies that could empower researchers to access higher storage capacities and wider varieties of computational processing options. AVRE also needed to leverage notebooks, containerisation and mobile solutions and facilitate a greater emphasis on ML and AI techniques. Increased storage meant researchers could access less processed, rawer forms of data, which they could then prepare for their own specific requirements, whilst the growth in Open Source software meant easy access to tools that could meet or efficiently be adapted to their needs. 

Recognising that AuScope researchers now required new mechanisms to help them find and reuse multiple resources from globally distributed sites and be able to integrate these with their own data types and tools, the AVRE informatics and technology experts began assessing the requirements for modernising the AVRE platform. The technologists reviewed other virtual research environments, research data portals, and e-commerce platforms for examples of well-designed interfaces and services that help users get the best use out of a platform. 

We then undertook a series of interactive consultations across a broad range of AuScope researchers (geophysics, geochemistry, geospatial, geology, etc). We accepted there were multiple requirements, from simple data processing on small volume data sets through to complex data modelling and assimilation at petascale, and openly acknowledged that there were numerous ways of processing: one size would not fit all.

In the consultations, we focussed on the context that AVRE was about enabling researchers to use a diversity of resources to realise the AuScope strategic goal of the DLT. We recognised that this would require an ability to meet the specialised requirements of a broad range of the current individual AuScope geoscience programs, but at the same time, there was a need to allow for future integration with global transdisciplinary challenges that explore how Earth resources may support growing human demands.

In this presentation, we will discuss the outcomes from our consultations with various AuScope Programs and will present initial plans for a co-designed, re-engineered AVRE platform to meet the expressed needs of a diverse range of DLT developers and users.

How to cite: Klump, J., Engelke, U., Fazio, V., Golodoniuc, P., Wyborn, L., and Rawling, T.: Reimagining the AuScope Virtual Research Environment Through Human-Centred Design, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-3261, https://doi.org/10.5194/egusphere-egu22-3261, 2022.

16:22–16:27
|
EGU22-5407
|
Presentation form not yet defined
Simon Jirka, Christian Autermann, and Sebastian Drost

In the past, many projects have evaluated and demonstrated the use of the Sensor Web Enablement (SWE) Standards of the Open Geospatial Consortium (OGC) in order to publish sensor data. Advantages of these standards included the provision of a domain-independent approach for ensuring interoperability of interfaces, data, and metadata. However, in most cases, the developed infrastructures were limited to pull-based data retrieval patterns. This means that data consumers regularly query servers for data updates, which may result either in high server loads due to a high frequency of update requests or in increased latencies until a consumer receives new sensor data.
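The trade-off between server load and latency in pull-based retrieval can be made concrete with a small calculation. The sketch below assumes that new observations arrive at times unrelated to the polling schedule, so that the expected wait until the next poll is half the polling interval; the function name and the example numbers are illustrative only.

```python
def polling_cost(interval_s, consumers):
    """Return (requests per day, mean delay in seconds) for periodic polling."""
    requests_per_day = consumers * 86400 / interval_s
    # Assumption: observation arrival times are independent of the polling
    # schedule, so on average a new value waits half an interval.
    mean_delay_s = interval_s / 2
    return requests_per_day, mean_delay_s


# 100 consumers polling every 10 s versus every 10 min.
fast = polling_cost(10, 100)
slow = polling_cost(600, 100)
print(fast, slow)
```

Shortening the interval lowers the delay only by multiplying the number of requests, most of which return nothing new; this is the cost that push-based delivery avoids.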


Although there were relevant specifications such as the OGC Publish/Subscribe standard as well as discussion papers, the OGC SWE framework never included a widely accepted solution to handle an active push-based delivery of observation data. With the adoption of the SensorThings API standard of the OGC in conjunction with mainstream Internet of Things protocols such as the Message Queuing Telemetry Transport (MQTT) protocol, this has changed in recent years.


In 2020 we presented at the EGU General Assembly an approach for using these technologies to enable the efficient collection of sensor observation data in hydrological applications by bridging between sensors and data management servers (Drost et al., 2020).


As part of this contribution, we will discuss the applicability of these technologies, OGC SensorThings API as well as MQTT, to also cover the delivery of data to consumers in addition to the previously described data transmission from sensor devices to a data sink. We will put special emphasis on experiences gathered from the deployment in marine environments (e.g., live underway data and event metadata streams of research vessels), as part of the EMODnet Ingestion II project. Special consideration will be given to a discussion of potential advantages of push-based communication patterns as well as identified challenges for future work (e.g., metadata about push-based data streams, standardization of payloads, access control, best practices on how to structure provided data streams).
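The consumer side of such a push-based pattern can be sketched as follows. The example assumes the SensorThings convention of using an entity collection's resource path as the MQTT topic; the version prefix, the Datastream id, the sample payload and the function names are assumptions made for illustration. No MQTT client is included: `on_message` is a plain function standing in for the callback that a client library would invoke when the broker delivers a message.

```python
import json


def observations_topic(datastream_id, version="v1.1"):
    """Topic on which new Observations of one Datastream are published."""
    return f"{version}/Datastreams({datastream_id})/Observations"


def parse_observation(payload):
    """Extract time and result from a SensorThings Observation in JSON."""
    obs = json.loads(payload)
    return obs["phenomenonTime"], obs["result"]


def on_message(topic, payload, sink):
    """Handle one pushed message: decode it and append it to the sink."""
    time, result = parse_observation(payload)
    sink.append((topic, time, result))


# Simulated delivery of a single message (hypothetical values).
received = []
sample = '{"phenomenonTime": "2022-05-23T16:22:00Z", "result": 12.4}'
on_message(observations_topic(42), sample, received)
print(received)
```

In this pattern the consumer subscribes once and is notified as observations arrive, instead of repeatedly asking the server whether anything has changed.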


Furthermore, we will address the development of data visualization tools for such interoperable real-time data streams and will discuss the opportunities to transfer these technologies to further application domains such as hydrology.


References

Drost, S., Speckamp, J., Hollmann, C., Malewski, C., Rieke, J., & Jirka, S. (2020). Internet of Things Technologies for the Efficient Collection of Hydrological Measurement Data. EGU General Assembly 2020, Online. https://doi.org/10.5194/egusphere-egu2020-10452

How to cite: Jirka, S., Autermann, C., and Drost, S.: Real-time Delivery of Sensor Data Streams using IoT and OGC Standards, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-5407, https://doi.org/10.5194/egusphere-egu22-5407, 2022.

16:27–16:32
|
EGU22-8537
|
Virtual presentation
Dmitry Khvorostyanov, Victor Champonnois, Alain Laupin-Vinatier, Jacqueline Boutin, Gilles Reverdin, Nathalie Lefèvre, Antonio Lourenco, Alban Lazar, Jean-Benoit Charrassin, and Frédéric Vivier

The LOCEAN laboratory of the Pierre-Simon Laplace Institute (IPSL) is in charge of a number of scientific projects and measurement campaigns that result in a large flow of heterogeneous oceanographic data managed at LOCEAN. The data are of various origins and include in situ data from buoys, ships, moorings and marine mammals, as well as data from satellite missions for salinity, altimetry, ocean color and temperature. LOCEAN also has an instrumental development team that designs and deploys buoys in various parts of the global ocean, with a need to receive and track the data in near-real time. The data PIs can be involved in different research groups and projects, and while focusing on providing their data, they might need to collaborate with other teams providing complementary datasets.

To address these needs, the INSITUDE platform is being developed at LOCEAN with the following goals: (1) receive, manage, track in near-real time, and explore diverse data; (2) assist scientific experts in data quality control; (3) facilitate cross-use of in situ and satellite data available at LOCEAN.

The software consists of four components: (1) Django application for the meta-data management; (2) Data processing software (Python); (3) Flask application for server-side interactions with the database; (4) Interactive data exploration/validation front-end.

The basic workflow involves the following steps:

(1) The user specifies the relevant meta-data using the web interface of the Django application; the meta-data database is thus updated;

(2) The processing core is launched automatically at regular times during the day: it reads the meta-data from the database, queries the mailboxes and/or external web services for the requested data, receives, decodes and processes the data, and fills the measurements database. It also generates ASCII data files for selected datasets, which can be downloaded via dedicated web pages or processed with external user programs (e.g. MATLAB or Python scripts);

(3) The data stored in the measurements database can be interactively explored using DataViewer applications, allowing zoomable views of time series, vertical profiles, and trajectories shown on the virtual globe. Data from different campaigns and for different variables can be viewed together. The quality control assistant allows experts to seamlessly validate the data by assigning quality flags to selected data points or regions, optionally after computing relevant statistics. The validated data can then be visualized and saved based on desired quality flag values.
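The quality-control step in (3) can be illustrated with a minimal sketch: flags are assigned to all points in a selected time region, and the validated subset is then extracted by flag value. The flag codes, class and function names, and the sample series are hypothetical and do not reflect the actual INSITUDE data model.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical flag codes for this example only.
UNCHECKED, GOOD, BAD = 0, 1, 4


@dataclass
class Point:
    time: float
    value: float
    flag: int = UNCHECKED


def flag_region(points, t_start, t_end, flag):
    """Assign a quality flag to every point in [t_start, t_end]; return the count."""
    n = 0
    for p in points:
        if t_start <= p.time <= t_end:
            p.flag = flag
            n += 1
    return n


def select(points, accepted_flags):
    """Keep only the points whose flag is in the accepted set."""
    return [p for p in points if p.flag in accepted_flags]


# A short salinity-like series with one obvious spike at time 2.
series = [Point(0, 35.1), Point(1, 35.2), Point(2, 99.9), Point(3, 35.0)]
flag_region(series, 0, 3, GOOD)   # expert accepts the whole region...
flag_region(series, 2, 2, BAD)    # ...then rejects the spike
validated = select(series, {GOOD})
mean_value = mean(p.value for p in validated)
print([p.time for p in validated], mean_value)
```

Storing the flag alongside each measurement, rather than deleting rejected points, keeps the raw data intact and lets different users export the series with different acceptance criteria.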

The INSITUDE platform facilitates data sharing across multiple teams and collaborations between data providers and data experts, researchers and engineers, enabling research projects focused on cross-exploration of various datasets, studies of processes involving both in situ and satellite data, and interpretation of in situ data in a larger-scale context owing to the satellite data. The system offers centralized, intuitive acquisition control and access to the received data, along with the related meta-data (projects, campaigns, buoys, people, etc.), and facilitates data quality control and validation.

The INSITUDE platform is currently used at LOCEAN and can be deployed in data centers of national data infrastructures, such as the French ODATIS/DATA TERRA.

How to cite: Khvorostyanov, D., Champonnois, V., Laupin-Vinatier, A., Boutin, J., Reverdin, G., Lefèvre, N., Lourenco, A., Lazar, A., Charrassin, J.-B., and Vivier, F.: INSITUDE: web-based collaborative platform for centralized oceanographic data reception, management, exploration, and analysis, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-8537, https://doi.org/10.5194/egusphere-egu22-8537, 2022.

16:32–16:40