ESSI2.1 | Advancing Joint Research on Grand Challenges: International Scientific Infrastructures, FAIR data, and Open Science
Convener: Vasco Mantas (ECS) | Co-conveners: Lesley Wyborn, Danie Kinkade, Helen Glaves, Jens Klump
Orals | Tue, 16 Apr, 16:15–18:00 (CEST) | Room -2.16
Posters on site | Attendance Wed, 17 Apr, 10:45–12:30 (CEST) | Display Wed, 17 Apr, 08:30–12:30 | Hall X4
Posters virtual | Attendance Wed, 17 Apr, 14:00–15:45 (CEST) | Display Wed, 17 Apr, 08:30–18:00 | vHall X4
The world is witnessing a transformation of long-held paradigms in the face of unprecedented grand environmental and social challenges. These complex, interconnected issues demand collaborative, innovative, and data-driven approaches. International scientific infrastructures play a pivotal role in advancing research on these challenges by facilitating data sharing, promoting FAIR (Findable, Accessible, Interoperable, and Reusable) data principles, and upholding CARE (Collective Benefit, Authority to Control, Responsibility, and Ethics) principles. This session invites abstracts from scientists, developers, and decision-makers to explore how international scientific infrastructures are shaping the future of research and decision-making in the geosciences and beyond.
We invite research and insights into the role and progress of AI/ML, open science, FAIR principles, governance, collaborative research, and ethical data sharing as applied to climate research and modeling, dynamic satellite mapping of the Earth's surface, 3D/4D mapping of the subsurface, early warning systems, water security, capacity building, and the evaluation of impact.

Orals: Tue, 16 Apr | Room -2.16

Chairpersons: Lesley Wyborn, Danie Kinkade, Vasco Mantas
16:15–16:25 | EGU24-14360 | On-site presentation
Alice-Agnes Gabriel

Geohazards and risks are rapidly increasing worldwide due to continuing urbanization, climate change, and high-risk critical distributed infrastructure. The longest modern instrumental records of earthquakes cover less than 100 years, while recurrence intervals of large earthquakes are hundreds of years or more. Increasingly dense observations and physics-based simulations empowered by supercomputing provide pathways for overcoming the lack of data and elucidating spatiotemporal patterns that extend our knowledge beyond sporadic case studies and average statistical laws. However, observations and simulations are typically challenging to integrate.

Digital Twins are emerging in Solid Earth Science, allowing curiosity-driven science to test scientific hypotheses against observations over ranges of space-time scales not accessible for laboratory and field observations. The results can clarify processes leading to large earthquakes, improve our forecasting ability, and enhance the general understanding of earthquakes and faults.

In this presentation, I will highlight Geo-INQUIRE (www.geo-inquire.eu), DT-Geo (www.dt-geo.eu) and ChEESE-2P (www.cheese2.eu), European projects that aim to overcome cross-domain barriers and will exploit innovative data management techniques, modeling and simulation methods, developments in AI and big data, and extend existing data infrastructures to disseminate these resources to the wider scientific community. Specifically, we will provide and enhance access to selected key data, products, and services, enabling the dynamic processes within the geosphere to be monitored and modeled at new levels of spatial and temporal detail and precision.

How to cite: Gabriel, A.-A.: Enabling curiosity-driven science and digital twins for earthquake physics in the exascale era, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-14360, https://doi.org/10.5194/egusphere-egu24-14360, 2024.

16:25–16:35 | EGU24-12644 | Virtual presentation
Helle Pedersen, Elizabetta D'Anastasio, Jerry Carter, Rob Casey, Jonathan Hanson, Florian Haslinger, Javier Quinteros, and Lesley Wyborn

Over the past years, awareness of the importance and usefulness of Globally Unique Persistent Resolvable identifiers (GUPRIs), appropriate licenses, further standardisation of metadata, and general adherence to the FAIR principles has increased significantly in the international seismological community. One important milestone was the introduction of seismological network identifiers as an FDSN recommendation in 2014, which was updated at the end of 2023. Further advances were made, or are in development, in connection with the establishment of more formalised research infrastructures such as EPOS in Europe, AuScope in Australia, and the reorganisation of IRIS and UNAVCO into EarthScope in the U.S., as well as national and international initiatives such as COPDESS and RDA. In Europe, the developments in seismology have taken place within, or with close links to, projects such as Geo-INQUIRE, ChEESE, and Digital Twin of GEOphysical extremes (DT-GEO), building on achievements and tools from earlier projects (e.g. FAIRsFAIR), and in general as part of the European Research Infrastructure environment.

In this contribution we reflect on the current state of the use of identifiers, application of licenses and other improvements in the FAIRness of seismological data, products and services, focusing on FDSN and ORFEUS/EIDA, EarthScope, AuScope, and the GFZ/Geofon and RESIF data centers.

Experience gained with DOIs as seismological network identifiers is conclusive in terms of the acceptability of associating a DOI at network level: more than 70% of FDSN registered networks now have a DOI. In contrast, correct citation based on the DOI is only gradually gaining traction in scientific publications, due to a combination of slow uptake by researchers and the difficulty scientific journals have in setting up automatic or semi-automatic checking procedures. Additional challenges remain, e.g. when trying to implement identifiers for data collections and/or downstream products that properly support reproducibility of scientific workflows. A simple collection of DOIs would not be enough to describe a user-defined dataset, which is characterized by a much finer granularity. Alternatives may therefore need to be evaluated, such as associating a time-stamped query with a new DOI that describes such a dataset. This could be used for small datasets with data from different sources, or even for ML/AI training sets, defined as a collection of networks.
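
As an illustration of the time-stamped query idea, the following minimal Python sketch records the query parameters, the time the query was executed, and the DOIs of the contributing networks, and derives a reproducible fingerprint from them. It is an assumption-laden sketch, not a description of any data centre's implementation: the function name `describe_query_dataset`, the record layout, and the placeholder DOI are hypothetical; only the query keys (`network`, `station`, `channel`, `starttime`, `endtime`) follow the parameter names of the FDSN dataselect web service.

```python
import hashlib
import json


def describe_query_dataset(query: dict, executed_at: str, network_dois: list[str]) -> dict:
    """Build a descriptive record for a user-defined dataset (hypothetical layout).

    The record could serve as the metadata behind a new DOI; minting the DOI
    itself is out of scope for this sketch.
    """
    record = {
        "query": dict(sorted(query.items())),   # the selection the user made
        "executed_at": executed_at,             # UTC time stamp of the query (ISO 8601)
        "network_dois": sorted(network_dois),   # DOIs of the contributing networks
    }
    # Canonical JSON so that identical inputs always yield the same fingerprint.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    record["fingerprint"] = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return record


example = describe_query_dataset(
    query={
        "network": "XX",                        # placeholder network code
        "station": "*",
        "channel": "BHZ",
        "starttime": "2023-01-01T00:00:00",
        "endtime": "2023-01-02T00:00:00",
    },
    executed_at="2024-04-16T14:25:00Z",
    network_dois=["10.0000/placeholder-network-doi"],  # not a real DOI
)
print(example["fingerprint"])
```

Because the time stamp is part of the fingerprint, re-running the same selection later produces a different record, which is the property that lets a citation distinguish between archive states.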

Some standardisation and best practices have emerged with regard to licensing of seismological data and products, in particular the use of attribution licenses such as CC-BY. A common and harmonised understanding of legal implications, intellectual property, and consequences of specific licenses, however, still seems some way off.

Implementing FAIRness, and then measuring it or even reporting the level of FAIRness to funding agencies, has met with some success, at least in specific initiatives or through specific projects. One noteworthy development is the introduction of FAIR Implementation Profiles (FIPs) that allow a quantitative assessment of the achieved FAIRness.

How to cite: Pedersen, H., D'Anastasio, E., Carter, J., Casey, R., Hanson, J., Haslinger, F., Quinteros, J., and Wyborn, L.: FAIR and open data: state of affairs for seismological networks and infrastructures globally, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-12644, https://doi.org/10.5194/egusphere-egu24-12644, 2024.

16:35–16:45 | EGU24-21810 | On-site presentation
Alexander Prent and Rebecca Farrington

Interdisciplinary science missions rely on the ability to combine data from across many research domains. Convergence of data can be achieved through adoption of the FAIR principles for data assets, making them Findable, Accessible, Interoperable and Reusable. In order to make data FAIR beyond a limited number of researchers, a broader research community has to declare which schemas, data standards, protocols and other resources are used for metadata and data. These resources, when published, are FAIR Enabling Resources (FERs). Listing which FERs are used to make a dataset FAIR helps the community towards interoperability between datasets. FAIR Implementation Profiles (FIPs) list FERs for each FAIR principle through a systematic question and answer based form and can be the basis for comparing FERs used in different data assets.

Through comparison of different communities’ FIPs, mappings and crosswalks can be developed between datasets, resulting in interoperability between datasets. Employing a FIP comparison strategy enables a group to grow the FAIR data asset size. Comparing FIPs with regard to a specific community can help grow it in both size and complexity, adding additional community members and their related interoperable datasets. FAIRness here evolves both in data asset size and in community complexity.
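
A FIP comparison of this kind can be sketched in a few lines of Python. The sketch assumes, purely for illustration, that a FIP is represented as a mapping from FAIR principle identifiers to the set of FERs declared for that principle; the function name `compare_fips` and the example entries are hypothetical and are not taken from any published FIP.

```python
def compare_fips(fip_a: dict[str, set[str]], fip_b: dict[str, set[str]]) -> dict[str, dict[str, set[str]]]:
    """Compare two FIPs principle by principle.

    For each FAIR principle, report the FERs both communities share and the
    FERs declared by only one of them (candidates for a mapping or crosswalk).
    """
    report = {}
    for principle in sorted(set(fip_a) | set(fip_b)):
        a = fip_a.get(principle, set())
        b = fip_b.get(principle, set())
        report[principle] = {"shared": a & b, "only_a": a - b, "only_b": b - a}
    return report


# Illustrative FIPs for two hypothetical communities.
community_a = {"F1": {"DOI"}, "I1": {"JSON-LD", "CSV"}, "R1.1": {"CC-BY 4.0"}}
community_b = {"F1": {"DOI", "IGSN"}, "I1": {"CSV"}, "R1.1": {"CC0"}}

for principle, result in compare_fips(community_a, community_b).items():
    print(principle, result)
```

Principles with an empty `shared` set are where agreement on a common FER, or a crosswalk between the two declared FERs, would be needed before the datasets become interoperable.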

Elaborating on this, intercommunity agreement on FER usage, or the development of mappings and crosswalks between FERs, increases the community's FAIRness, growing its complexity and size. Growth of FAIR data assets can be achieved when multiple datasets use the same FERs and become a FAIR data collection. Additionally, complexity of the FAIR community goes hand in hand with growth of the FAIR data asset, as multiple groups are generally involved in the collation of multiple datasets. FAIRness also increases if FERs are aligned for data types from different instruments, resulting in their various methodologies also becoming interoperable. As FAIRness increases between methodologies, community complexity generally increases, as it does when datasets are combined.

Here we will present key outcomes from the WorldFAIR Geochemistry Work Package on how FAIRness of a community and its constituent data assets can evolve along three pathways. FAIRness can be increased for the community (complexity), for data assets (size) and between methodologies or (sub)disciplines. With FIPs as a means to document, in a structured manner, the FERs used for community, data or methodologies, the comparative FIPs approach can form the basis for convergence and FAIR evolution on any of the three pathways.

How to cite: Prent, A. and Farrington, R.: FAIR Convergence using FAIR Implementation Profiles and the FAIR evolution pathways concept: lessons learned from the WorldFAIR Geochemistry Work Package, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-21810, https://doi.org/10.5194/egusphere-egu24-21810, 2024.

16:45–16:55 | EGU24-1701 | Virtual presentation
Mark Rattenbury

International standards are important for communication of geoscience information across borders and between countries, and in particular for addressing multinational and global issues such as climate change, resilience to natural hazards and sustainable resource extraction.

The Commission for the Management and Application of Geoscience Information (CGI) is the International Union of Geological Sciences’ (IUGS) commission for developing, managing and governing geoscience data models and vocabularies, amongst other standards. CGI undertakes its activities through working groups arranged around different types of information standards and through its governing council. With members drawn from most continents and regions, their collaboration results in standards that are internationally applicable; GeoSciML and EarthResourceML are examples of data models developed with multinational cooperation and applied in global and regional applications such as OneGeology and Minerals4EU. The data models are supported by geoscience vocabularies developed and published by CGI.

CGI as an IUGS commission has both a unique position and an opportunity around governance of global geoscience information standards. Through its enduring status, CGI is not bound by finite and funding-constrained projects. The contributing projects can be very influential for standards development but sustaining standards after project cessation can be difficult.

To date, CGI has tended to manage and govern standards it has developed or co-developed. The opportunity for CGI going forward is to take more of a leadership role across IUGS and internationally-focussed societies and agencies to host, manage and/or promote their standards. With growing expectations of FAIR Principles adherence across the global geoscience community, CGI, as a commission of the IUGS, can help enable their implementation through providing enduring, authorised, geoscience information standards and services.

How to cite: Rattenbury, M.: International Geoscience Information Standards, Management and Governance, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-1701, https://doi.org/10.5194/egusphere-egu24-1701, 2024.

16:55–17:05 | EGU24-17356 | On-site presentation
Martina Stockhause, Matthew Mizielinski, Anna Pirani, Lina Sitz, Alessandro Spinuso, Mudathir Abdallah, Jesbin Baidya, Paul Durack, and Daniel Ellis

The Intergovernmental Panel on Climate Change (IPCC) regularly assesses a wide range of research results related to climate change, ranging from physical sciences to economic and social sciences, to provide policy makers with options for combatting the challenges of climate change. The IPCC authors analyze data across multiple domains and from multiple sources.

The IPCC data guidelines enhance the transparency of IPCC outcomes by ensuring that figure creation is traceable, that input data are cited, and that data and software are preserved for the long term. The related data and metadata requested from the authors include detailed information on datasets used in every figure, for which standardized machine-accessible and -readable information needs to be supported by the input data providers.

An example of an important input data provider is the Coupled Model Intercomparison Project (CMIP), which has continuously improved its standards and data infrastructure to keep pace with the significant increase in the scale of the project over recent phases. CMIP defines a set of standards including vocabularies for controlled metadata fields, e.g. variable and experiment names, along with the data itself and its structure. A set of infrastructure services provides access to data, through the Earth System Grid Federation (ESGF); descriptions of the climate models used and known errata, through ES-DOC services; and data citation information, including data usage in derivative data sets and published papers where known.

The contribution will discuss the diverse data workflows of the IPCC authors and the ways that the CMIP infrastructure supports them. Authors access data from the primary data portals of the ESGF, but also from secondary data portals (Copernicus, Pangeo, Climate4Impact) or local data pools hosted by national institutions. The IPCC authors have faced a number of challenges, including accessing data citation and model description information together with the data, and identifying new dataset versions with significant changes. With the IPCC’s plan to utilize provenance records in AR7 to gather all information requested by the IPCC data guidelines, machine-readable information accessible through the file headers becomes essential.

The entire IPCC AR7 data workflow needs to be supported by tools: the figure creation by the IPCC authors, the report editing process by the TSU, and the curation of the CMIP7 input data subset used and the intermediate and final datasets created by various DDC Partners, including bi-directional references between outputs. Virtual workspaces such as those that CEDA and DKRZ provided for the authors in AR6, which give access to their data pools and common software packages like ESMValTool, can support the authors in the preparation of their figures and the provision of the requested documentation and provenance information. A dedicated Figure Manager will play a central role in managing the report figures and supporting the overall data workflow, ultimately lowering the burden for the authors, the TSU staff and the DDC Partners. This information, gathered in a timely manner, can then be analyzed and used to harmonize dataset version usage across the chapters and reports.

How to cite: Stockhause, M., Mizielinski, M., Pirani, A., Sitz, L., Spinuso, A., Abdallah, M., Baidya, J., Durack, P., and Ellis, D.: How do the CMIP7 infrastructure plans support the implementation of the IPCC data guidelines?, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-17356, https://doi.org/10.5194/egusphere-egu24-17356, 2024.

17:05–17:15 | EGU24-12690 | On-site presentation
Sara Polanco, Dan Sandiford, Xiaodong Qin, Andres Rodríguez Corcho, Lauren Ilano, Christopher Alfonso, Julian Giordani, Ben Mather, Nigel Rees, and Rebecca Farrington

The field of numerical modeling of Earth’s systems is rapidly growing and is instrumental for addressing the current environmental crisis. Such models often require specialized computational resources (HPC), can take days to weeks to run, and produce large volumes of heterogeneous output data. The lack of curation of these numerical models and of community standards hinders our ability to access, interpret and build on published numerical models. Here, we present a first-of-its-kind open science framework that aims to establish a community practice to increase the usefulness of numerical modeling outputs and leverage computational resources. M@TE provides a digital platform that encapsulates the entire model development process: from setup, to model output, and analysis. This supports discovery, data preservation, reproducibility, and reuse, with flexibility for users with different levels of expertise. M@TE has a human-browsable, machine-searchable, user-friendly front end (https://mate.science/), and a back-end GitHub organization (https://github.com/ModelAtlasofTheEarth) and model output repository targeted to expert users. Contributions to M@TE are handled by GitHub automation workflows that guide contributors through the process of documenting their models, ensuring that they meet community standards, validating metadata and creating DOIs. M@TE provides a platform for a much wider appreciation of Earth processes and numerical modeling, particularly to industry stakeholders, professional geoscientists and educators. Furthermore, M@TE is creating a single platform that will advance the interoperability of digital twins required to address the current environmental crisis.

How to cite: Polanco, S., Sandiford, D., Qin, X., Rodríguez Corcho, A., Ilano, L., Alfonso, C., Giordani, J., Mather, B., Rees, N., and Farrington, R.: Model Atlas of the Earth (M@TE): advancing the interoperability of digital twins, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-12690, https://doi.org/10.5194/egusphere-egu24-12690, 2024.

17:15–17:25 | EGU24-5509 | ECS | Virtual presentation
Glorie Metsa WOWO, Pierre C. Sibiry Traore, Vijaya Joshi, Paul Cohen, Janet Mumo Mutuku, Mihai Surdeanu, Maria Alexeeva Zupon, and Keith Alcock

Forests and arable land in Ghana face a significant threat due to the mechanisation and increase of illegal gold mining (galamsey). These directly affect local people's diets and nutrition, impacting communities that rely on forest resources and small-scale farming for sustenance. In regional systems, accurately predicting future outcomes is a crucial task, with applications ranging from environmental management to agriculture. Recent advances in participatory science and modelling have highlighted the potential of building collective models based on the knowledge and beliefs of local populations who interact with the system. Such approaches have provided more accurate estimations of future outcomes compared to traditional expert-driven methods. Under the HEURISTICS project, we are exploring the causal efficacy of local knowledge, beliefs, and attitudes in local communities' decisions on transition to agriculture, forest and galamsey. Using unsupervised and supervised classification, different land uses and land covers (LULC) are classified for 2017–2023, including: Forest, Croplands, Settlements, Water, and Galamsey/Mining Sites. In addition to EO data, open-source data from OpenStreetMap are extracted, providing valuable information on roads, rivers, water streams, and administrative boundaries. To further enrich the data, machine reading models are employed to extract beliefs from articles, ranking them based on relevance to topics such as galamsey, mining, cities, and settlements. Additionally, we leverage text data to map public sentiment towards mining activities. By analysing the origin and sentiment of sentences, we gain insights into how people perceive different areas and how these perceptions are connected to land use. Further analysis examines the factors influencing sentiment scores, including mining proximity, boundary effects, and authority influence.
Grid-level sentiment maps reveal nuanced spatial patterns and highlight areas potentially impacted by mining. We also predict future mining trajectories using machine learning models trained on historical (2017–2022) text and mining data, which allows us to make a prediction for 2023 and to identify key factors correlated with mining activities. An R-squared value of 0.94 was obtained, indicating that our approach explains 94% of the variance in mining proportion.
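
For readers less familiar with the metric, the usual definition of the coefficient of determination is given below. The abstract does not state which variant was computed, so this is the standard textbook form, with $y_i$ assumed to be the observed mining proportion in grid cell $i$, $\hat{y}_i$ the model prediction, and $\bar{y}$ the mean of the observations:

```latex
R^2 = 1 - \frac{\sum_i \left(y_i - \hat{y}_i\right)^2}{\sum_i \left(y_i - \bar{y}\right)^2}
```

Under this definition, $R^2 = 0.94$ means that the residual sum of squares is 6% of the total sum of squares around the mean.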

Keywords: Land use prediction, mining activity, Galamsey, sentiment analysis, remote sensing, Ghana.

How to cite: Metsa WOWO, G., Sibiry Traore, P. C., Joshi, V., Cohen, P., Mumo Mutuku, J., Surdeanu, M., Alexeeva Zupon, M., and Alcock, K.: Predicting Mining activities dynamics in Ghana: A Fusion of Social beliefs and Remote Sensing, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-5509, https://doi.org/10.5194/egusphere-egu24-5509, 2024.

17:25–17:35 | EGU24-8323 | On-site presentation
Peter Baumann

Data about the Earth are too difficult to access, making EO exploitation inaccessible for non-experts and tedious for experts. Reasons are manifold, from intrinsic complexity to technically overloaded handling where data are presented in a generator-centric rather than a user-centric manner. Under the headline of Analysis-Ready Data (ARD), significant research is going on to find ways of stripping unnecessary burden from services.

While progress is being made on improving metadata, such as by CEOS, the data perspective is still underrepresented. With our research we aim to contribute to closing this gap, in particular for gridded data, i.e. rasters and datacubes. The starting point is the data and processing model that the ISO/OGC coverage standards offer; in these ecosystems, three use cases are inspected: determination of service quality parameters, automated data fusion, and ML. We find that some ARD aspects are covered, but we also spot several issues that deserve investigation and standardization effort. Broadly, these fall into the following categories:

  • conceptual clarification, ex: pixel-in-center / pixel-in-corner;
  • enhancing existing (and otherwise proven) standards, ex: establish a framework for units of measure amenable to automatic conversion, similar to coordinate reference systems;
  • improved standards governance, ex: avoiding competing standards known to be not interoperable.
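
The first category can be made concrete with a small Python sketch of how the two pixel conventions shift coordinates by half a pixel. It is a simplified one-dimensional illustration under stated assumptions, not code from the coverage standards: the function name `pixel_center_coord` and its parameters are hypothetical, and `origin` is assumed to be the coordinate that the grid's georeferencing attaches to pixel 0.

```python
def pixel_center_coord(index: int, origin: float, resolution: float, convention: str) -> float:
    """Return the coordinate of the center of pixel `index` along one axis.

    convention="center": `origin` is the center of pixel 0.
    convention="corner": `origin` is the outer edge (corner) of pixel 0,
                         so every pixel center lies half a pixel further along.
    """
    if convention == "center":
        return origin + index * resolution
    if convention == "corner":
        return origin + (index + 0.5) * resolution
    raise ValueError(f"unknown convention: {convention!r}")


# The same grid metadata read under both conventions (illustrative numbers).
for name in ("center", "corner"):
    print(name, pixel_center_coord(index=3, origin=10.0, resolution=2.0, convention=name))
```

If a producer writes data under one convention and a consumer reads it under the other, every value is misplaced by half the grid resolution, which is why the distinction matters for automated data fusion.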

In our talk we present results achieved from work in OGC Testbed-19 and EU FAIRiCUBE. We discuss gaps found and present suggestions for improvement towards easier and more reliable consumption of EO data by humans and machines.

 

How to cite: Baumann, P.: Analysis-Ready EO Data: A Standards-Centric Perspective, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-8323, https://doi.org/10.5194/egusphere-egu24-8323, 2024.

17:35–17:45 | EGU24-14071 | ECS | On-site presentation
Yuhan "Douglas" Rao, Rob Redmon, and Eric Khin

As artificial intelligence (AI) and machine learning (ML) gain broad interest in the Earth and space science community, there is growing demand for AI-ready data that can support the development of responsible AI/ML applications with open environmental data. Through a broad community collaboration under Earth Science Information Partners, we have developed an AI-readiness checklist as a community guideline for the development of AI-ready open environmental data. The checklist was initially based on an early draft of an AI-ready matrix developed by the OSTP Open Science Sub-committee but has been modified notably based on feedback from data users and AI/ML practitioners. The current version of the AI-readiness checklist can be used to holistically assess the documentation, quality, access, and pre-processing of a given dataset. The AI-readiness assessment result can then be summarized into a data card that provides human-readable metrics to assist users in determining whether the dataset meets their needs for AI/ML development. The next milestone of this community-driven effort is to develop a community convention that builds on existing data conventions and standards to fill the data management gap and support AI-ready data management. In this presentation, we will also showcase a collection of AI-ready climate datasets that apply the AI-readiness checklist and data card concept to support AI/ML applications in climate sciences. The AI-readiness development process requires active community engagement with data repositories, domain scientists, and AI/ML practitioners to establish a flexible framework that ensures the rapid evolution of AI/ML technologies can be addressed in modern data management.
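
The step from checklist to data card can be pictured with the following minimal Python sketch. Only the four assessment areas (documentation, quality, access, pre-processing) come from the abstract; the function name `build_data_card`, the yes/no representation of checklist answers, the individual check names, and the card layout are all hypothetical and do not reproduce the actual checklist.

```python
CATEGORIES = ("documentation", "quality", "access", "preprocessing")


def build_data_card(dataset_name: str, checklist: dict[str, dict[str, bool]]) -> dict:
    """Summarize yes/no checklist answers into per-category counts."""
    card = {"dataset": dataset_name, "scores": {}}
    for category in CATEGORIES:
        answers = checklist.get(category, {})
        card["scores"][category] = {
            "met": sum(answers.values()),   # number of checks answered "yes"
            "total": len(answers),          # number of checks assessed
        }
    return card


# Hypothetical assessment of a hypothetical dataset.
assessment = {
    "documentation": {"has_metadata": True, "has_data_dictionary": False},
    "quality": {"missing_values_flagged": True},
    "access": {"open_license": True, "machine_readable_format": True},
    "preprocessing": {"gridded_to_common_resolution": False},
}
card = build_data_card("example-climate-dataset", assessment)
for category, score in card["scores"].items():
    print(f"{category}: {score['met']}/{score['total']}")
```

A per-category count is only one possible summary; a real data card may weight checks differently or report qualitative levels instead.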

How to cite: Rao, Y., Redmon, R., and Khin, E.: Community-Driven Development of Tools to Improve AI-Readiness of the Open Environmental Data, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-14071, https://doi.org/10.5194/egusphere-egu24-14071, 2024.

17:45–17:55 | EGU24-14110 | ECS | On-site presentation
Kristina Vrouwenvelder, Natalie Raia, and Shelley Stall

In an era when research is increasingly interdisciplinary and when the use of AI and ML methods across the Earth, space, and environmental sciences continues to grow, the importance of FAIR data is clear. The community has worked to elevate the importance of outputs beyond the research article; publishers, including the American Geophysical Union, now require authors to share data and code alongside their research articles in an effort to increase the transparency and reproducibility of science and enhance data and software reuse. Yet significant challenges remain before we can fully realize the potential of FAIR data and software policies to advance the scientific enterprise. These include the need for education and efficient workflows to ease the burden on the researcher and increase their uptake of open, FAIR practices at the point of data and/or software publication as well as at article publication. Here, we will share a progress report from AGU on the effect of publisher policies on data and software sharing and discuss work by AGU and the broader community to break down barriers for the researcher and to ensure that data and software creators receive appropriate attribution for their work.

How to cite: Vrouwenvelder, K., Raia, N., and Stall, S.: Overcoming challenges to data and software attribution throughout the research workflow: a publisher perspective, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-14110, https://doi.org/10.5194/egusphere-egu24-14110, 2024.

17:55–18:00

Posters on site: Wed, 17 Apr, 10:45–12:30 | Hall X4

Display time: Wed, 17 Apr, 08:30–12:30
Chairpersons: Helen Glaves, Danie Kinkade, Jens Klump
FAIR Solutions
X4.164 | EGU24-5336 | The Importance of "FAIR by design" data in marine robotics (withdrawn)
Roberta Ferretti, Simona Aracri, Corrado Motta, Marco Bibuli, Gabriele Bruzzone, Massimo Caccia, and Angelo Odetti
X4.165 | EGU24-7376
Anusuriya Devaraju, Woodman Stuart, Sam Bradley, Vincent Fazio, Neda Taherifar, Benyamin Motevalli, Jens Klump, Lesley Wyborn, and Rebecca Farrington

AuScope Australia (https://www.auscope.org.au/) focuses on delivering data, services, and tools to support the future research of the Australian geoscience research community. As a component of the “Data Lens” of the AuScope Downward-Looking Telescope, the AuScope Discovery Portal (https://portal.auscope.org.au/) harvests metadata from affiliated data catalogues to support more comprehensive access. Over time, it has become apparent that the data repositories offered by the AuScope partners and universities need to be improved for curating data from the AuScope projects. Many do not provide structured metadata suitable for harvesting into the Discovery Portal and offer limited data discovery and retention support. Consequently, most institutional data repositories do not support AuScope’s strategy toward making all data from AuScope projects or data collected with AuScope-funded instruments compliant with the FAIR (Findable, Accessible, Interoperable, and Reusable) Guiding Principles for both humans and machines (Wilkinson et al., 2016). 

The goal of the AuScope Data Repository is to preserve and offer continued access to data from its communities (e.g., NCRIS-funded data projects and Australian Geoscience research communities) working on fundamental geoscience questions and grand challenges, including climate change, natural resources security and natural hazards. Datasets submitted to the repository will be made openly available where appropriate, with attributions to promote open science. The repository is essential for geoscience research innovation in support of the AuScope 5-Year Investment Plan and Australian Academy of Science Decadal plan for Australian Geoscience: Our Planet, Australia's Future. 

This presentation will cover the repository’s scope and design, including non-technical aspects (e.g., practices, governance, and engagement) and practical aspects (e.g., persistent identification, data discovery, interoperability, workflows, and security and architectural requirements). We discuss the key considerations when setting up a data repository for scientific communities. The first release of the repository is now available online to gather early feedback from the selected AuScope data providers and project affiliates. We will conclude the presentation with the next steps of the development process, including engagement activities and documented data practices.

How to cite: Devaraju, A., Stuart, W., Bradley, S., Fazio, V., Taherifar, N., Motevalli, B., Klump, J., Wyborn, L., and Farrington, R.: Building A Trustworthy Data Repository for Australian Geoscience Research Communities, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-7376, https://doi.org/10.5194/egusphere-egu24-7376, 2024.

X4.166 | EGU24-11538
Jan Michalek, Daniele Bailo, Javier Quinteros, Otto Lange, Rossana Paciello, Valerio Vinciarelli, and Kety Giuliacci and the Geo-INQUIRE project team

Building and establishing a fully interoperable Research Infrastructure (RI) allowing smooth data exchange across various scientific domains is a dream of many researchers and data managers around the world. Indeed, there are various pathways to achieve it and many attempts are currently being made. Unfortunately, there is no single approach to fit all RIs since there is natural heterogeneity that encompasses the different phases of the data lifecycle, starting from data collection until data interpretation spreading across various domains. 

In the current contribution we present an example methodology for building an interoperable RI which is being applied within the Geosphere INfrastructures for QUestions into Integrated REsearch project (Geo-INQUIRE, https://www.geo-inquire.eu/). Geo-INQUIRE was launched in October 2022 and comprises a unique consortium of 51 partners, including national research institutes, universities, national geological surveys, and European consortia. A portfolio of 150 Virtual Access (VA) and Transnational Access (TA, both virtual and on-site) installations will be offered to the scientific community across domain barriers, especially the land-sea-atmosphere environments, including EPOS, EMSO, ARISE, ECCSEL, and ChEESE RIs. The great challenge is to find common ground across the domains and define principles general enough for all participating RIs, yet detailed enough to allow useful interoperability. The example methodology we are presenting here was developed during the implementation of the European Plate Observing System (EPOS; a European Research Infrastructure Consortium since 2018) and relies on data access provision through web-services (APIs). To achieve true interoperability, robust and efficient web-services are not enough: rich metadata descriptions, provided by adequate metadata models compliant with the FAIR principles, are also critical. The methodology therefore envisages data integration through an approach that includes web-services, rich metadata and semantics. These three key elements are monitored and evaluated through a set of criteria distributed into three levels, which were put together into an Implementation Level Matrix (ILM) to understand the landscape of service provision. This ILM serves as a tool for capturing the changing maturity of installations/services and tracking their readiness for interoperable integration into RIs.

Interoperability starts at the data provider level, so domain-specific coordination and common development efforts are important. A good example is the work done in seismology. After a long process of joint work with experts in data FAIRness, a group of data centres submitted a proposal to the International Federation of Digital Seismograph Networks (FDSN) to update its community guidelines on DOIs for seismic networks. The proposal was iterated, discussed, and finally adopted as a community standard in December 2023. The guidelines cover not only topics exclusive to seismology but also aim to be as FAIR as possible from a multi-disciplinary perspective. Extrapolating this to the RIs and data providers from the other disciplines taking part in the project, and adapting the guidelines to their reality, is an ongoing effort within the project.

How to cite: Michalek, J., Bailo, D., Quinteros, J., Lange, O., Paciello, R., Vinciarelli, V., and Giuliacci, K. and the Geo-INQUIRE project team: Methodology for building interoperable Research Infrastructures: Example from Geo-INQUIRE project, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-11538, https://doi.org/10.5194/egusphere-egu24-11538, 2024.

X4.167
|
EGU24-17059
Paolo Lino Manganello, Giuseppe Castorina, Maria Pia Congi, Anna Maria Blumetti, and Luca Guerrieri

The Earthquake Environmental Effects Catalogue (EEE Catalogue) is a global database of environmental and geological effects induced by recent, historical and paleoearthquakes. These coseismic effects include, among others, surface faulting, ground subsidence, ground cracks, slope movements, liquefaction, tsunamis, and hydrogeological anomalies. The observed relevance of earthquake environmental effects as a major source of damage, in addition to vibratory ground motion, confirms that knowledge of these effects can be essential for seismic hazard and intensity assessment. The EEE Catalogue can therefore be considered a helpful tool for land planning, particularly in areas of high seismicity.
The first version of the EEE Catalogue was launched in 2011 and was built in PHP. The structure of this catalogue is based on collecting data at three levels of increasing detail, corresponding to three tables: Earthquake, Locality, Site. The Earthquake features provide general information on the seismic event. The Locality features describe a specific locality where some coseismic effects have occurred. The Site features provide information at the site of each earthquake environmental effect, including detailed characteristics on the type of earthquake. The structure of the catalogue is completed by Country features, consisting of point geometry.
The focus of this work is the new release of the EEE Catalogue to populate the Research Infrastructure of GeoSciences IR. GeoSciences IR aims to create a research infrastructure for the Italian Network of Geological Surveys (RISG), coordinated by the Geological Survey of Italy (ISPRA). In this framework, it is mandatory for GeoSciences IR to comply with the FAIR (Findability, Accessibility, Interoperability, Reuse) principles to improve data sharing. Data that become part of the infrastructure must be harmonized in accordance with the INSPIRE Directive and the FAIR principles.
A GeoPackage implementation of the EEE Catalogue has been carried out. GeoPackage is an open format for geospatial information developed by the Open Geospatial Consortium (OGC). This format has several advantages, including platform independence, versatile data support, large storage capacity, and maintenance of relations between database tables.
The EEE Catalogue structure has been modified and improved. Furthermore, because of the complexity of the dataset and the need to enter new records into the database, new code lists were implemented, whereby a specific database field is filled from a list of predefined attributes. The main challenge of the new release is semantic standardization on the basis of standard vocabularies (e.g., INSPIRE, GeoSciML).
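The three-level structure and the code-list idea described above can be sketched with plain SQLite, the container format on which GeoPackage is built. This is a minimal illustration only: the table and column names, the sample code-list entries, and the helper functions are hypothetical and are not the actual EEE Catalogue schema. The database created here is also not a valid GeoPackage, since the metadata tables and geometry columns required by the OGC standard are omitted.

```python
import sqlite3

# Hypothetical, simplified schema: Earthquake -> Locality -> Site,
# plus a code list that constrains the values of one Site field.
SCHEMA = """
CREATE TABLE effect_type_codelist (
    code  TEXT PRIMARY KEY,
    label TEXT NOT NULL
);
CREATE TABLE earthquake (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE locality (
    id            INTEGER PRIMARY KEY,
    earthquake_id INTEGER NOT NULL REFERENCES earthquake(id),
    name          TEXT NOT NULL
);
CREATE TABLE site (
    id          INTEGER PRIMARY KEY,
    locality_id INTEGER NOT NULL REFERENCES locality(id),
    effect_type TEXT NOT NULL REFERENCES effect_type_codelist(code)
);
"""


def build_catalogue():
    """Create an in-memory database with one sample event and locality."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO effect_type_codelist (code, label) VALUES (?, ?)",
        [("surface_faulting", "Surface faulting"),
         ("liquefaction", "Liquefaction")],
    )
    conn.execute("INSERT INTO earthquake (id, name) VALUES (1, 'Example event')")
    conn.execute(
        "INSERT INTO locality (id, earthquake_id, name) "
        "VALUES (1, 1, 'Example locality')"
    )
    return conn


def add_site(conn, locality_id, effect_type):
    """Insert a site; return False if a value is not in the code list."""
    try:
        conn.execute(
            "INSERT INTO site (locality_id, effect_type) VALUES (?, ?)",
            (locality_id, effect_type),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

Expressing the code list as a referenced table, rather than as free text, lets the database itself reject values outside the predefined list, which is one way to keep new records consistent with a controlled vocabulary.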

How to cite: Manganello, P. L., Castorina, G., Congi, M. P., Blumetti, A. M., and Guerrieri, L.: A GeoPackage implementation for the new version of Earthquake Environmental Effects Catalogue (EEE Catalogue) in the context of GeoSciencesIR project, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-17059, https://doi.org/10.5194/egusphere-egu24-17059, 2024.

X4.168
|
EGU24-15541
Gerhard Wörner, Marthe Klöcking, Adrian Sturm, Bärbel Sarbas, Leander Kallas, Stefan Möller-McNett, Kirsten Elger, Daniel Kurzawe, and Matthias Willbold

The GEOROC database is a leading, open-access source of geochemical and isotopic datasets that provides access to curated compilations of igneous and metamorphic rock and mineral compositions from >20,600 publications. It is a data resource that supports and facilitates hundreds of new research publications each year across multiple geoscientific and related disciplines.

This presentation aims to “advertise” this data product to the geochemical community, along with our ongoing efforts to improve the service by providing FAIR (findable, accessible, interoperable and reusable) geochemical data. We will also describe some recently published research in which authors used large geochemical data compilations such as GEOROC and PetDB for innovative approaches in digital geochemistry.

To further support such research in the future, the Digital Geochemical Data Infrastructure (DIGIS) initiative is developing a new IT and data infrastructure for GEOROC 2.0 to enable modern solutions to data submission, discovery and access. GEOROC data compilations are made accessible via a web search interface and an API. In addition, DIGIS maintains a direct data pipeline between the data compiled in GEOROC and the EarthChem Portal. Hence, GEOROC is one of six geochemical databases that can be queried and accessed synchronously within the EarthChem Portal. The DIGIS infrastructure further partners with GFZ Data Services, a domain repository for geosciences data hosted at GFZ, which offers data publication services with assigned digital object identifiers (DOIs). Individual researchers can submit their geochemical datasets directly to the repository (using the EarthChem Data Templates), where they are archived for the long term. Regular thematic snapshots of the GEOROC synthesis database are archived in the GRO.data repository of the University of Göttingen.

Part of this cooperation is the development of standardised vocabularies and data reporting to enhance the interoperability of geo- and cosmochemical data systems. Harmonized data entry for the GEOROC, PetDB and Astromat synthesis databases will avoid duplication and ensure consistent data and metadata. With these efforts, and as a participant in the OneGeochemistry initiative (Klöcking et al., 2023; Prent et al., 2023), DIGIS is working towards the goal of globally harmonised geochemical data to enable interdisciplinary, data-driven research.

References

Klöcking, M. et al. (2023). Community recommendations for geochemical data, services and analytical capabilities in the 21st century. In Geochimica et Cosmochimica Acta (Vol. 351, pp. 192–205).

Prent, A. et al. (2023) Innovating and Networking Global Geochemical Data Resources Through OneGeochemistry. Elements 19, Issue 3, pp. 136–137.

How to cite: Wörner, G., Klöcking, M., Sturm, A., Sarbas, B., Kallas, L., Möller-McNett, S., Elger, K., Kurzawe, D., and Willbold, M.: GEOROC 2.0: A Globally Connected Geochemical Database to Facilitate Interdisciplinary, Data-Driven Research, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-15541, https://doi.org/10.5194/egusphere-egu24-15541, 2024.

X4.169
|
EGU24-17594
Advancing Transparency and Accessibility: Implementing FAIR Data Principles in IPCC AR6 WGI Report
(withdrawn)
Lina Elisabet Sitz, Anna Pirani, Jose Manuel Gutierrez, Charlotte Pascoe, Martina Stockhause, David Huard, Diego Cammarano, Molly MacRae, and Ellie Fisher
X4.170
|
EGU24-13755
Kelsey Druken, Aidan Heerdegen, Romain Beucher, Roger Edberg, Natalia Bateman, Victoria Allen, Claire Carouge, Martin Dix, Heidi Nettelbeck, and Andy Hogg

ACCESS-NRI is a national research infrastructure (NRI) established to support the Australian Community Climate and Earth System Simulator, or ACCESS. The ACCESS suite of software and data outputs are essential tools used to simulate past and future climate, weather and Earth systems and to support research and decision making within Australia. ACCESS-NRI's mission is to build an open collaborative infrastructure that will accelerate research in Earth system, climate and weather modelling as well as enable new research not currently possible. The facility brings together skills in software development, high-performance computing, data management and analysis to enhance the ACCESS modelling framework, making it easier to use and lowering the barrier for innovation.   

To improve the usability and uptake of this complex modelling framework, the software, data and training program comprises three teams that focus on providing open and transparent processes for the development, release, and user training of the ACCESS models, tools and data. This presentation will provide an overview of the program’s establishment over its first 18 months as a new facility. This includes enabling reproducible build and deployment workflows, supporting tools to analyse and evaluate model output, data management, and the development of training tools and materials. Core to all capabilities across the organisation is openness and community engagement in our open development and decision-making processes. This presentation will also discuss the critical infrastructure foundations we have built to support this engagement and advance the impact of ACCESS.

ACCESS became a National Research Infrastructure (NRI) facility through funding from the Australian Government’s National Collaborative Research Infrastructure Strategy (NCRIS) Program and officially launched in June 2022. This facility is a major collaborative undertaking between the Bureau of Meteorology, CSIRO and five Australian universities, in collaboration with national and international partners. 

How to cite: Druken, K., Heerdegen, A., Beucher, R., Edberg, R., Bateman, N., Allen, V., Carouge, C., Dix, M., Nettelbeck, H., and Hogg, A.: Building the software, data and training foundations to support Australia’s climate simulator (ACCESS-NRI), EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-13755, https://doi.org/10.5194/egusphere-egu24-13755, 2024.

FAIR Communities and Open Science
X4.171
|
EGU24-13944
Shelley Stall and Kristina Vrouwenvelder

Open Science is a paradigm shift for science; open practices can remove barriers to sharing science and increase its reproducibility and transparency. As society faces global, interdisciplinary challenges like climate change, open scientific research – including open data, software, workflows, samples – is more important than ever. AI and ML methods are increasingly used in the Earth, space, and environmental sciences to investigate these large challenges, and analysis-ready data for use in these methods is predicated on open, FAIR principles for data sharing. However, maximizing FAIR-ness and ensuring research is ‘as open as possible’ across the many Earth, space, and environmental science disciplines involves a range of challenges, including a lack of infrastructure, incentives, resources, and guidance for all participants in the research ecosystem. We believe that societies, including AGU, EGU, JpGU, communities like ESIP, and beyond, have a significant role to play in catalyzing collaboration to overcome these challenges. Here, we share progress on collaborative efforts involving AGU, partners, and the broader community to solve these challenges, including developing discipline-specific guidance for researchers on data and software sharing, training for researchers in leading open science practices, and guidance and partnerships for publishers interested in implementing FAIR and CARE in the publishing workflow. We look forward to partnering on global collaborations to advocate for our researchers and the open future of science.

How to cite: Stall, S. and Vrouwenvelder, K.: Open collaboration for open science: a global perspective on disciplinary challenges in the Earth, space, and environmental sciences, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-13944, https://doi.org/10.5194/egusphere-egu24-13944, 2024.

X4.172
|
EGU24-13221
Katrina Virts, Pontus Olofsson, Sean Gregory, Jeanne' le Roux, and Rahul Ramachandran

Every two years, the National Aeronautics and Space Administration (NASA) leads an assessment of U.S. Federal civilian agency Earth observation needs submitted through the Satellite Needs Working Group (SNWG) survey.  In four survey cycles beginning in 2016, nearly 400 high-priority satellite needs have been identified, spanning Earth Science and representing a wide variety of potential applications for Earth observation data.

During each assessment cycle, new data products and services (i.e., solutions) that meet the needs of multiple agencies are identified and proposed for funding.  The majority of solutions being developed or currently operational are global in scope, including harmonized land surface reflectance data from Landsat and Sentinel-2; composites of cloud properties derived from MODIS, VIIRS, and five geostationary satellites; dynamic surface water extent and land surface disturbance products derived from multiple optical and radar missions; a suite of low-latency products from the ICESat-2 mission; and a soil moisture product derived from the upcoming NISAR mission.

The SNWG Management Office, within the Earth Action element of NASA’s Earth Science Division, manages both the biennial SNWG survey assessment and the development of solutions, starting at full capacity with the 2020 cycle. Each solution project is required to align with NASA’s open science policy, including developing source code in an open code repository, having an open-source software license, and making all data freely available via NASA’s Earthdata website. The presentation will give an overview of the SNWG process and its emphasis on open science, and will highlight several operational solutions freely available to the global research and applications communities.

How to cite: Virts, K., Olofsson, P., Gregory, S., le Roux, J., and Ramachandran, R.: NASA’s Satellite Needs Working Group Management Office: Developing Solutions in an Agile, Open Science Environment, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-13221, https://doi.org/10.5194/egusphere-egu24-13221, 2024.

X4.173
|
EGU24-19102
|
ECS
Xiaoyan Hu, Ziming Zou, Jizhou Tong, and Qi Xu

In the age of big science, many of today's important scientific problems and grand human challenges call for major breakthroughs through interdisciplinary joint research. Space science research and innovative applications face exactly this situation. Open science and artificial intelligence enable a new era in space science research and application, offering new opportunities as well as challenges, such as the absence of data governance theories and standards, data quality and interoperability that still need to be improved, and an insufficient supply of data- and intelligence-driven analysis models and tools.

To make good use of large-scale space science research data and effectively support cross-domain joint research, the Chinese National Space Science Data Center (NSSDC), in conjunction with several universities and research institutions, has carried out a series of practices on FAIR data implementation and AI for space science, contributing to the development of a new generation of open research infrastructure.

On the data governance and stewardship side, NSSDC actively promotes the FAIR principles in China's space science satellite missions and large-scale ground-based observation network projects, and is developing a theoretical model of scientific data governance and a set of data standards. For intelligent data application, NSSDC is exploring the development of AI-ready space science big data along with the development of intelligent analysis tools and models for automatic target identification, feature extraction, correlation and causal analysis, and event evolution prediction. In this process, we found that many of these AI-ready demands coincide with the FAIR principles. Achieving AI-readiness and achieving FAIRness are two closely related goals. Both need to deal with the scale disaster and dimensional disaster of domain data, to enhance the openness of scientific resources including data, models, software and scientific workflows, to adapt to the significantly increasing participation of machines in the scientific research process, and to address the complexity puzzle of frontier scientific problems. Through the fusion, integration and efficient interconnection of these scientific resources, an open scientific infrastructure that supports cross-domain and cross-platform data discovery, access, analysis and mining has been established, effectively supporting joint innovation on major scientific issues. Currently, NSSDC is also exploring connections and interactions with scientific infrastructures in other related disciplines, such as astronomy, high-energy physics and Earth system science.

How to cite: Hu, X., Zou, Z., Tong, J., and Xu, Q.: NSSDC's Practices toward FAIR Data and AI for Space Science, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-19102, https://doi.org/10.5194/egusphere-egu24-19102, 2024.

X4.174
|
EGU24-19771
Hannes Thiemann, Ivonne Anders, and Martin Schupfner

Governmental organizations collect and manage diverse data types at various levels to fulfill their official duties. This includes geographic, environmental, meteorological, population, health, traffic, transport, financial, and economic data. Traditionally, access to such data was restricted, but over the past decade, there has been a global shift towards more open data policies, influenced in part by directives like GeoIDG, the PSI directive, and INSPIRE. In Germany, federal states and their offices have also embraced open data policies, with some data being made publicly available (Open Government Data) through portals like Destatis or GDI-DE. This data serves multiple purposes, such as identifying locations, analyzing environmental trends, traffic planning, health service planning, and more. Public authorities' data is increasingly utilized for scientific investigations, yet the full potential remains untapped, particularly for large datasets. Despite the high quality of governmental data, further alignment with FAIR principles (Findable, Accessible, Interoperable, and Reusable) is necessary to enhance its efficiency for reuse in research. Privacy regulations and legal frameworks may impose limitations, necessitating data anonymization or adherence to modern data standards. Nevertheless, governmental data remains a valuable resource contributing significantly to expanding knowledge across scientific disciplines.

In a pilot project funded by NFDI4Earth, the German national meteorological service (DWD) and the German Climate Computing Centre (DKRZ) collaborated to facilitate access to data from public authorities, to increase the visibility of these data and the number of users from different disciplines, and to make the data available in standardised and FAIR formats for easy use in research as well as in other public applications. The COSMO-REA6 reanalysis dataset from DWD (Kaspar et al. 2020), which is crucial for climate modeling, analyses, and energy applications in Europe, was selected as an example. The standardization process involved mapping public authority standards to domain-specific standards in climate research, requiring close collaboration between DWD and DKRZ. After detailed curation and quality checks, the dataset was made accessible through the ESGF infrastructure and archived for the long term in the WDCC, with licensing and authorship considerations addressed.

The project's insights and lessons learned were incorporated into a blueprint, providing guidance on making data from other authorities accessible and usable for both research and the public. Overall, the entire process can be divided into 5 sub-steps: (1) determination and classification of the need, (2) survey of the feasibility, (3) implementation, (4) feedback and follow-up, (5) dissemination. This blueprint outlines generalizable steps and aspects applicable across domains and collaborators, offering a framework for optimizing the use of governmental data in diverse fields.


References: 

Kaspar, F., et al., 2020: Regional atmospheric reanalysis activities at Deutscher Wetterdienst: review of evaluation results and application examples with a focus on renewable energy, Adv. Sci. Res., 17, 115–128, https://doi.org/10.5194/asr-17-115-2020, 2020. 



How to cite: Thiemann, H., Anders, I., and Schupfner, M.: Facilitate the reuse of data from public authorities in research, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-19771, https://doi.org/10.5194/egusphere-egu24-19771, 2024.

X4.175
|
EGU24-20915
Is this Community in Denial of Reality?
(withdrawn)
Hans Pfeiffenberger

Posters virtual: Wed, 17 Apr, 14:00–15:45 | vHall X4

Display time: Wed, 17 Apr, 08:30–Wed, 17 Apr, 18:00
Chairpersons: Jens Klump, Vasco Mantas, Helen Glaves
FAIR Communities
vX4.28
|
EGU24-20634
Hazem Mahmoud, Renee Key, Crystal Gummo, Brennan Bunch, Pauline Detweiler, Susan Kizer, John Kusterer, Matthew Tisdale, and Jeff Walter

In the dynamic landscape of Earth Science research, the promotion of open science principles is paramount for advancing knowledge and collaboration. The Earthdata Forum is an actively maintained and operational user forum for all participating National Aeronautics and Space Administration (NASA) Earth Observing System Data and Information System (EOSDIS) Distributed Active Archive Centers (DAACs) and the Global Change Master Directory (GCMD). The Forum serves as a cross-DAAC platform from which user communities can obtain authoritative information relating to NASA Earth Science. This abstract explores the role of the Earthdata Forum (forum.earthdata.nasa.gov) as a pivotal platform in fostering open science within the Earth Science community. The platform serves as a hub for researchers to actively engage in discussions, share datasets, and collaboratively tackle challenges in the field.

Key aspects discussed include the platform's contribution to data accessibility, collaboration, and knowledge sharing. Forum.earthdata.nasa.gov provides a space where researchers transparently ask questions, discuss methodologies, share insights, and seek advice from a vibrant community. The resulting collaborative environment not only facilitates the exchange of ideas but also bolsters the collective knowledge base.

The abstract also delves into the significance of community engagement within the platform, emphasizing how active participation contributes to the ethos of open science. Furthermore, discussions on open-source tools, policy considerations, and the sharing of educational resources underscore the multifaceted role of forum.earthdata.nasa.gov in advancing open science principles.

As we navigate the evolving landscape of Earth Science research, understanding the impact of platforms like the Earthdata Forum on promoting transparency, accessibility, and collaboration becomes crucial. This abstract provides insights into the platform's role in nurturing an open science culture and its implications for the broader scientific community.

How to cite: Mahmoud, H., Key, R., Gummo, C., Bunch, B., Detweiler, P., Kizer, S., Kusterer, J., Tisdale, M., and Walter, J.: Fostering Open Science in Earth Data Science Research: Insights from Earthdata Forum, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-20634, https://doi.org/10.5194/egusphere-egu24-20634, 2024.