ESSI3.1 | Collaborations, Tools and Services Towards an Integrated Research Data Infrastructure
Co-organized by ERE1/GI2, co-sponsored by AGU and JpGU
Convener: Martina Stockhause | Co-conveners: Danie Kinkade, Yasuhiro Murayama, Alba Brobia (ECS), Bruce Crevensten
Orals
| Wed, 30 Apr, 14:00–18:00 (CEST)
 
Room -2.92
Posters on site
| Attendance Tue, 29 Apr, 10:45–12:30 (CEST) | Display Tue, 29 Apr, 08:30–12:30
 
Hall X4
Posters virtual
| Attendance Tue, 29 Apr, 14:00–15:45 (CEST) | Display Tue, 29 Apr, 14:00–18:00
 
vPoster spot 4
Addressing global environmental and socio-technical challenges requires interdisciplinary, data-driven approaches. Today’s research produces unprecedented volumes and complexity of value-added research data and an increasing number of interactive data services, putting traditional information management systems to the test. Collaborative infrastructures are challenged by their dual role of advancing research and scientific assessments while facilitating transparent data and software sharing.

Since datacubes emerged as a key contributor to Analysis-Ready Data, a series of implementations and services have been announced. However, these are often described only in publications, without publicly accessible deployments to evaluate.

We invite abstracts from all data stakeholders that highlight innovative platforms, frameworks, datacube tools, services, systems, and initiatives designed to enhance access and usability of data for research on topics such as climate change, natural hazards, sustainable development, etc. We welcome presentations describing collaborations across national and disciplinary boundaries as well as live demos of datacube tools and services that contribute to building trustworthy and interoperable data networks, guided by UNESCO’s Open Science recommendations and the FAIR and CARE data principles. The expected outcome for attendees is a realistic overview of the datacube tool and service landscape and of the ongoing collaborations that enable researchers worldwide to address pressing global problems through data.

The session is organized in two time blocks, the first focusing on collaboration and the second on tool aspects of Open Science.

Orals: Wed, 30 Apr | Room -2.92

The oral presentations are given in a hybrid format supported by a Zoom meeting featuring on-site and virtual presentations. The button to access the Zoom meeting appears just before the time block starts.
Chairpersons: Martina Stockhause, Yasuhiro Murayama, Alba Brobia
14:00–14:05
14:05–14:15
|
EGU25-7551
|
Highlight
|
On-site presentation
Reyna Jenkyns

The World Data System (WDS), a member of the International Science Council, serves a membership of over 150 trusted data repositories and related organizations. It builds on the data sharing legacy of World Data Centers that were initiated seven decades ago. Governed by a Scientific Committee, the WDS consists of an International Program Office (WDS-IPO) based in Oak Ridge, Tennessee, USA, and an International Technology Office (WDS-ITO) based in Victoria, BC, Canada. The WDS mission is to enhance the capabilities, impact, and sustainability of our member data repositories and data services. In this presentation, we outline the 2025 to 2027 Action Plan objectives, highlighting activities and collaborations that are underway or planned to progress open science, integrated data infrastructures and FAIR/CARE/TRUST Principles. 

How to cite: Jenkyns, R.: Advancing Open Science through Trusted Data Repository Intersections at the World Data System, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-7551, https://doi.org/10.5194/egusphere-egu25-7551, 2025.

14:15–14:25
|
EGU25-4337
|
On-site presentation
Robin Kooyman, Peter Thijsse, Dick Schaap, and Tjerk Krijger

Achieving fast access to analysis-ready data from a large number of multidisciplinary data resources is key to addressing many of today's societal and scientific challenges via Digital Twins of the Ocean or virtual research environments. However, achieving this kind of performance is a major challenge, as original data is often organised in millions of (observation) files, which makes it hard to achieve fast responses. In addition, data from different domains are stored in a large variety of data infrastructures, each with their own data-access mechanisms, which causes researchers to spend much time trying to access relevant data. Ideally, users should be able to retrieve analysis-ready data in a uniform way from different data infrastructures following their selection criteria, including for example spatial or temporal boundaries, parameter types, depth ranges and other filters.

Therefore, as part of several European projects, MARIS has developed a software system called BEACON with a unique indexing and dynamic chunking system that can, on the fly and with high performance, extract specific data based on the user’s request from millions of (observational) data files containing multiple parameters in diverse units. The system returns one single harmonised file as output, regardless of whether the input contains many different data types or dimensions. In January 2025, BEACON 1.0.0 was made publicly available as open-source software, allowing everyone to set up their own BEACON node to enhance access to their data, or to use existing BEACON nodes from well-known data infrastructures such as Euro-Argo or the World Ocean Database for fast and easy access to harmonized data subsets. More technical details, example applications and general information on BEACON can be found on the website https://beacon.maris.nl/.

The presentation will focus on one of BEACON's core features, “Relative Optimized Chunking” (ROC), a dynamic chunking technology developed specifically to make data retrieval as fast as possible. This optimized chunking reduces the number of chunks BEACON has to search through when a data request is made. It applies variable-sized chunking on multiple levels at the same time, such as geo-location, depth and time, so that data that are close to each other are chunked together. This enhances speed because it allows BEACON to traverse millions of datasets using its index with much more precision, finding not only the relevant datasets but also the exact data blocks containing the relevant data.

The demonstration will involve the use of an existing BEACON node in the field of marine science to access data subsets via its REST API and demonstrate its performance. This will be done in a Jupyter Notebook by querying data via a JSON request to the BEACON system. Walking through the Notebook, we will explain how developers can access and use the BEACON system, including the most recent developments.
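
To give a flavour of the workflow, the sketch below poses a JSON request to a BEACON node from Python, as one would in the Notebook. The endpoint path and request schema are illustrative assumptions, not BEACON's documented interface; see https://beacon.maris.nl/ for the actual API.

```python
# A minimal sketch of querying a BEACON node's REST API from a notebook.
# The endpoint path and JSON request schema below are hypothetical --
# consult https://beacon.maris.nl/ for the actual interface.
import requests

BEACON_NODE = "https://beacon.example.org"  # hypothetical node URL

query = {
    # illustrative selection criteria: bounding box, time window, parameter
    "filters": {
        "longitude": {"min": -10.0, "max": 5.0},
        "latitude": {"min": 45.0, "max": 55.0},
        "time": {"min": "2020-01-01", "max": "2020-12-31"},
        "parameter": "TEMPERATURE",
    },
    "output": {"format": "netcdf"},
}

resp = requests.post(f"{BEACON_NODE}/api/query", json=query, timeout=300)
resp.raise_for_status()

# BEACON returns a single harmonised file regardless of input heterogeneity
with open("subset.nc", "wb") as f:
    f.write(resp.content)
```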

How to cite: Kooyman, R., Thijsse, P., Schaap, D., and Krijger, T.: BEACON - Accelerating access to multidisciplinary data with Relative Optimized Chunking technology, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-4337, https://doi.org/10.5194/egusphere-egu25-4337, 2025.

14:25–14:35
|
EGU25-18594
|
On-site presentation
Benjamin Louisot, Roland Koppe, Robin Heß, Ulrich Loup, Jürgen Sorg, Marc Adolf, Claas Faber, Andreas Lehmann, Nils Brickmann, Marc Hanisch, David Schäfer, Linda Baldewein, Ulrike Kleeberg, Marie Ryan, Sabine Barthlott, Christof Lorenz, Florian Obersteiner, and Hylke van der Schaaf

In environmental sciences, observational data remains indispensable for monitoring and understanding natural processes, validating Earth system models and remote sensing products, and training data-driven methods. However, unified standards and interfaces for ensuring that such data is consistently available, usable, and compliant with FAIR and Open Science principles are still lacking.

The DataHub initiative of the Helmholtz Research Field Earth & Environment, involving seven large environmental research Centers across Germany, addresses this gap by collaboratively developing a cohesive and unified research data space, including consistent data formats, metadata standards, tools, interfaces and services.

Since the beginning of the DataHub, we have focused particularly on unifying time-series data from environmental sensor systems operated across all participating Centers. In this context, we have developed a digital ecosystem that enhances and links existing and established research data infrastructures with well-defined interfaces and metadata standards. This ecosystem now covers the full processing chain, from the integration of new sensor systems and their metadata, through automatic and manual quality checks and flagging schemes, to visualization via dashboards and data portals and usage in data analysis frameworks. In particular, our framework consists of multiple independent tools and services, such as the Sensor Management System (Brinckmann et al., 2024) as a dedicated system for managing sensor metadata, the System for Automated Quality Control (SaQC, Schäfer et al. 2024) as a common framework for QA/QC, a tailored metadata profile which adapts the SensorThings API (STA) from the Open Geospatial Consortium to common requirements of the environmental sciences (Lorenz et al. 2024), and the Earth Data Portal (https://earth-data.de) as the overarching data portal and visualization suite, as well as tools and services that link all these building blocks.
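
As an illustration of the STA building block, the following minimal sketch queries an OGC SensorThings API v1.1 endpoint for a station's datastreams and their latest observations. The base URL and Thing name are placeholders; the entity paths and query options ($filter, $expand, $orderby, $top) are standard STA.

```python
# A sketch of reading sensor time series via the OGC SensorThings API (STA).
# The base URL and station name are placeholders; the paths and query
# options follow the OGC STA v1.1 standard.
import requests

STA_BASE = "https://sta.example.org/v1.1"  # placeholder STA endpoint

# Find the Datastreams of a named Thing (hypothetical station name)
things = requests.get(
    f"{STA_BASE}/Things",
    params={
        "$filter": "name eq 'soil-moisture-station-01'",
        "$expand": "Datastreams",
    },
    timeout=60,
).json()

# Pull the ten most recent Observations of each Datastream
for ds in things["value"][0]["Datastreams"]:
    obs = requests.get(
        f"{STA_BASE}/Datastreams({ds['@iot.id']})/Observations",
        params={"$orderby": "phenomenonTime desc", "$top": 10},
        timeout=60,
    ).json()
    print(ds["name"], [o["result"] for o in obs["value"]])
```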

While the first concepts of this ecosystem were based on temporary tools and interfaces, we have now reached a level of maturity that allows us to confidently scale our solutions to new communities and user groups. In this presentation, we will give a brief overview of our ecosystem and the integrated tools and services. The main focus will be a hands-on demonstration of the full workflow: deploying a new sensor system, integrating it into the contributing services, providing the (meta)data via STA, and integrating it into downstream systems such as the Earth Data Portal for data visualization.

With this, we want to promote the potential of a decentralized research data infrastructure that has been developed and adopted across multiple research Centers, and to reach out to new communities and user groups to ultimately create a FAIR and inter-institutional open data space for the environmental sciences.

How to cite: Louisot, B., Koppe, R., Heß, R., Loup, U., Sorg, J., Adolf, M., Faber, C., Lehmann, A., Brickmann, N., Hanisch, M., Schäfer, D., Baldewein, L., Kleeberg, U., Ryan, M., Barthlott, S., Lorenz, C., Obersteiner, F., and van der Schaaf, H.: FAIRification of sensor-based time-series data – a demonstration of the Helmholtz DataHub digital ecosystem, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-18594, https://doi.org/10.5194/egusphere-egu25-18594, 2025.

14:35–14:45
|
EGU25-21367
|
On-site presentation
Rebecca Farrington, Lesley Wyborn, Jo Croucher, Anusuriya Devaraju, Alex Hunt, Hannes Hollmann, Jens Klump, Angus Nixon, Alexander Prent, Sara Polanco, Nigel Rees, and Tim Rawling

Addressing global environmental and societal challenges requires robust, interdisciplinary data ecosystems that support collaboration across geographic, cultural, and disciplinary boundaries. AuScope, Australia’s National Research Infrastructure (NRI) provider for the geoscience community, collaboratively tackles interdisciplinary grand challenges such as climate change, natural resource security, and natural hazards. AuScope is funded by the Australian Government’s National Collaborative Research Infrastructure Strategy (NCRIS) and integrates tools, data, analytics, and services across Australian research and government agencies, in particular, partnering with organisations at the forefront of research data systems and infrastructure.

Through collaborations with CSIRO, Geoscience Australia, state and territory geological surveys, universities, and other NCRIS facilities, including the National Computational Infrastructure (NCI), the Terrestrial Ecosystem Research Network (TERN), and the Australian Research Data Commons (ARDC), AuScope is addressing the complexities of modern FAIR data management at scales ranging from small-scale local installations to co-located High Performance Compute and Data (HPCD) Platforms. Key AuScope initiatives such as Geophysics 2030 Collections (https://ardc.edu.au/project/2030-geophysics-collections/), AusGeochem (https://ausgeochem.auscope.org.au/), the Modelling Atlas of The Earth (M@TE; https://mate.science), and the AuScope Data Repository (https://repository.data.auscope.org.au/) exemplify how the FAIR principles can be operationalised to support impactful research both within and beyond the geosciences and at multiple scales.

Nationally, AuScope collaborates with other Earth and Environmental Research Infrastructure providers, working to transform Australia’s research capabilities through, for example, Coastal Research Infrastructure (CoastRI) and implementing the National Digital Research Infrastructure Strategy (NDRI). Globally, AuScope contributes to initiatives like OneGeochemistry, the CODATA-led WorldFAIR Plus project, EarthScope (US), EPOS, Geo-INQUIRE, and ChEESE (Europe), ensuring compatibility with international research infrastructures, data standards, and best practices while at the same time, aligning with Australia’s geoscience priorities. 

This presentation will highlight how AuScope is progressively operationalising the FAIR and TRUST principles across its investments by focusing on place-based research to foster interoperability, strategic collaboration, and Open Science practices. By aligning with the CARE principles as well as advancing collaborative data infrastructure, AuScope creates trusted, interoperable data ecosystems that empower researchers to effectively and efficiently address pressing interdisciplinary societal challenges at both a national and international scale.

How to cite: Farrington, R., Wyborn, L., Croucher, J., Devaraju, A., Hunt, A., Hollmann, H., Klump, J., Nixon, A., Prent, A., Polanco, S., Rees, N., and Rawling, T.: AuScope’s Research Data Systems: Operationalising FAIR place-based research through collaboration, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-21367, https://doi.org/10.5194/egusphere-egu25-21367, 2025.

14:45–14:55
|
EGU25-19278
|
On-site presentation
Melanie Lorenz, Kirsten Elger, Inke Achterberg, and Malte Semmler

The Specialized Information Service for Geosciences (FID GEO) is a German Research Foundation (DFG)-funded initiative that has been serving the geoscience community in Germany for almost a decade. FID GEO provides essential publication services through its partner domain repositories GFZ Data Services (for research data and software) and GEO-LEOe-docs (for text publications). Beyond these repositories, FID GEO actively supports the digital transformation and helps researchers adopt Open Science practices, mainly through workshops, publications, conference contributions and active participation in topic-specific meetings.

Collaboration is a cornerstone of FID GEO’s work. It engages closely with geoscientific societies, national infrastructures and initiatives such as the German National Research Data Infrastructure (NFDI), while also contributing to policy-making processes such as the planned German Research Data Act. Recognizing the inherently global nature of the geosciences, FID GEO further aligns its activities with international developments, striving to synchronize national progress with global standards and best practices for data management and distribution. FID GEO acts as an interface between scientists, libraries, repositories and the world of digital data management, and thus supports the transformation of the publication culture in the geosciences at the national and international levels.

For many years, FID GEO has received feedback from researchers expressing a strong desire for a ‘single source’ platform to manage and share their increasingly large datasets, publications, and projects. At the same time, researchers often feel overwhelmed by the complexity and number of existing infrastructures. However, not only does a one-size-fits-all solution appear technically out of reach, it also faces issues of scalability and sustainable maintenance. A viable way forward is the widespread implementation of machine-readable (meta)data standards that also enable connections between distributed data systems. Additional metadata properties enable persistent digital links between different research outputs and the unique identification of authors and institutions through persistent identifiers. Another significant challenge within the research landscape is the often competing nature of infrastructures, driven by limited funding opportunities and overlapping goals. Through its extensive network and active collaborations, FID GEO addresses these challenges by guiding researchers through this complex landscape and by demonstrating practical ways to make their scientific outputs visible, reusable, and aligned with the FAIR and Open Science principles.

This presentation will share best practices, lessons learned, and future directions for fostering a collaborative and open research environment. FID GEO envisions a geoscience community empowered by shared data and cooperative infrastructures, better equipped to address pressing global challenges.

How to cite: Lorenz, M., Elger, K., Achterberg, I., and Semmler, M.: One platform will not solve everything: How FID GEO strengthens Germany’s Open Science Landscape for the geosciences., EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-19278, https://doi.org/10.5194/egusphere-egu25-19278, 2025.

14:55–15:05
|
EGU25-10583
|
On-site presentation
Richard Conway, James Hinton, Chandra Taposeea, Claudio Iacopino, Salvatore Pinto, and Simon Hunter

The ‘Exploitation Platform’ concept derives from the need to access and process an ever-growing volume of data. Many web-based platforms have emerged - offering access to a wealth of satellite Earth Observation (EO) data, increasingly collocated with cloud computing resources and applications for exploiting the data. Rather than downloads, the exploitation platform offers a cloud environment with EO data access and associated compute and tools facilitating the analysis and processing of large data volumes. Users benefit from the scalability & performance of the cloud infrastructure, the added-value services offered by the platform – and avoid the need to maintain their own hardware. Data hosted in the cloud infrastructure reaches a wider audience and Infrastructure Providers gain an increased cloud user base.

Users are beginning to appreciate the advantages of exploitation platforms. However, the market now offers a plethora of platforms with various added-value services and data access capabilities. This ever-increasing offer is intimidating and confusing for most users, who often face challenges such as inconsistent interfaces, proprietary software and limited interoperability. To fully exploit the potential of these complementary platform resources, interoperation amongst the platforms is needed, such that users of one platform may consume the services of another directly, platform-to-platform.

EOEPCA (EO Exploitation Platform Common Architecture) is a European Space Agency (ESA) funded project with the goal of defining and agreeing on a re-usable exploitation platform architecture using standard interfaces to encourage interoperation and federation between operational exploitation platforms - facilitating easier access and more efficient exploitation of the rapidly growing body of EO and other data. Interoperability through open standards is a key guiding force for the Common Architecture. EOEPCA adheres to standards from organisations such as the Open Geospatial Consortium (OGC) and follows best practices in data management, including implementation of OGC Web Services and emerging OGC API specifications for features, coverages and processes. Platform developers are more likely to invest their efforts in standard implementations that have wide usage; off-the-shelf clients and software are more likely to be found for standards-based solutions.

The EOEPCA system architecture is designed to meet defined use cases for various user levels (from expert application developers to data analysts and end users). The architecture is defined as a set of Building Blocks (BBs) exposing well-defined open-standard interfaces. These include Identity and Access Management, Resource Discovery, Data Access, Processing Workflows, Data Cube Access, Machine Learning Operations, and more. Each of these BBs is containerized for Kubernetes deployment, providing an infrastructure-agnostic deployment target.

The exploitation platform is conceived as a ‘virtual work environment’, with users accessing data, developing algorithms, conducting analysis and sharing value-adding outcomes. The EOEPCA architecture facilitates this through a Workspace BB, providing collaboration environments for groups of users, including dedicated storage and services for analysis, processing and publishing of added-value data and applications. This is supported by an Application Hub BB, providing interactive web-tooling for analysis, algorithm development and data exploitation, and providing a web dashboard capability where added-value outcomes are showcased.

Our presentation will highlight the generalised architecture, standards, best practice and open source software components available.

How to cite: Conway, R., Hinton, J., Taposeea, C., Iacopino, C., Pinto, S., and Hunter, S.: EOEPCA+: a method for an open-sourced EO Exploitation Platform Common Architecture, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-10583, https://doi.org/10.5194/egusphere-egu25-10583, 2025.

15:05–15:15
|
EGU25-19453
|
On-site presentation
Rhys Evans, David Poulter, Philip Kershaw, Ian Foster, Rachana Ananthakrishnan, Forrest Hoffman, Aparna Radhakrishnan, Stephan Kinderman, Sasha Ames, and Daniel Westwood

The Earth System Grid Federation (ESGF) is the international partnership responsible for the distribution, cataloging and archiving of both the Coupled Model Intercomparison Project (CMIP) and the Coordinated Regional Climate Downscaling Experiment (CORDEX). In operation since 2009, it was the first decentralised climate data repository of its kind, storing and serving many petabytes of data across tens of global and regional data centre partners.

Over the last five years, the system has been fully rearchitected, introducing a cloud-ready deployment architecture and a new system for distributed search, fundamental to ESGF’s federated model for data access. This has involved innovations, translating successful experience with the STAC (SpatioTemporal Asset Catalog) specification from the EO world and developing a profile for its use with global climate projections data. Providing a STAC interface to ESGF archives has allowed us to explore alternate access methods for cloud-accessible, analysis-ready data formats through tools such as Kerchunk, a lightweight non-conversion approach for referencing existing data, which works with open-source Python packages like fsspec and Xarray. Use of STAC also provides the potential for greater integration between the EO and climate modelling domains, essential for the validation of model outputs.
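
As a sketch of the Kerchunk-based access pattern mentioned above: a JSON reference file indexes byte ranges in the original archive files, and fsspec presents them to Xarray as a Zarr-like store. The reference file name and remote protocol here are assumptions.

```python
import fsspec
import xarray as xr

# Open the original (unconverted) netCDF files through a Kerchunk
# reference file; the file name below is a hypothetical placeholder.
fs = fsspec.filesystem(
    "reference",
    fo="cmip6_dataset_kerchunk.json",  # hypothetical reference file
    remote_protocol="https",           # protocol of the referenced archive
)

# The reference filesystem behaves like a Zarr store for Xarray
ds = xr.open_dataset(
    fs.get_mapper(""),
    engine="zarr",
    backend_kwargs={"consolidated": False},
)
print(ds)  # lazy: data are fetched only for the slices actually read
```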

ESGF has traditionally used a distributed model for search services which, though powerful, has led to challenges around consistency of search content. Over the last twelve months, in preparation for CMIP7, a further fundamental innovation has been made in the architecture to address these issues. The new system adopts a centralised model, with two search nodes, one in the US and one in Europe, each hosted on public cloud. These two nodes are synchronised using a new event-driven architecture. This approach, driven by a shared messaging framework between the nodes, ensures eventual consistency across the nodes, reducing or eliminating errors caused by individual node downtime and simplifying processes such as the replication and retraction of data from the archives distributed at sites across the federation.

The move to a message-based, event-driven architecture has been integrated with STAC records and services. In ESGF-NG, data is shared between nodes as messages in the form of STAC Item records, ensuring a consistent, publicly documented archive distributed across many nodes. The ESGF team have contributed several changes to the STAC project to facilitate this change. Looking forward, we see potential in this new event-driven architecture for search systems as a means to integrate across federations - in the European context this could include the ESA Climate Change Initiative open data portal, work with the Copernicus Climate Data Store and DestinE.
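
A short sketch of what the STAC interface enables for users, using the open-source pystac-client package; the endpoint, collection id, and facet name are placeholders, since ESGF's STAC profile defines the actual conventions.

```python
# A sketch of searching a STAC API for climate projection data with
# pystac-client. Endpoint, collection id, and query facet are assumptions.
from pystac_client import Client

catalog = Client.open("https://stac.example-esgf.org")  # hypothetical node

search = catalog.search(
    collections=["cmip6"],                  # hypothetical collection id
    query={"variable_id": {"eq": "tas"}},   # hypothetical facet name
    max_items=10,
)

for item in search.items():
    # each STAC Item record carries assets pointing at the actual files
    print(item.id, list(item.assets))
```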

How to cite: Evans, R., Poulter, D., Kershaw, P., Foster, I., Ananthakrishnan, R., Hoffman, F., Radhakrishnan, A., Kinderman, S., Ames, S., and Westwood, D.: ESGF Next Generation and preparations for CMIP7, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-19453, https://doi.org/10.5194/egusphere-egu25-19453, 2025.

15:15–15:25
|
EGU25-7959
|
On-site presentation
Clare Richards, Kelsey Druken, Romain Beucher, and Felicity Chun

Developing climate models often requires the ability to access and share extremely large datasets (spanning tens to hundreds of terabytes) that are discoverable and optimised for high-performance computing (HPC) applications. This is a major challenge, as researchers frequently lack the storage resources and specialised support needed to ensure efficient data management and sharing practices across the full data life-cycle. The challenges of sharing data are evident even when dealing with curated datasets that are prepared for broad access, citation, and reuse. However, these challenges are amplified during the rapid and iterative stages of model development and prototyping. At this point, multiple versions of datasets must be shared and evaluated by a wide range of experts before the data is finalised and curated for public use. This iterative process requires robust infrastructure and coordination to avoid bottlenecks that can hinder progress.

To help overcome these barriers, Australia’s Climate Simulator (ACCESS-NRI) provides a dedicated merit allocation scheme for compute and storage resources. This includes 3 PB of storage for datasets that community members use to undertake scientific research, support model development and/or share for reuse. Experience has shown that if the usage of these storage resources is not managed, the data can quickly go from being an asset to a liability. Therefore, to maximise the value of both the data and the investment in storage, ACCESS-NRI has developed an approach to sharing datasets that is designed to support science and innovation while enhancing current practices for making data more accessible and usable in accordance with the FAIR and CARE principles.

We will present the motivating use cases and show how this approach supports the model development cycle while making data and the science it underpins more transparent, open and accessible. This approach encourages data generators to transition their datasets from unmanaged, undocumented spaces into managed environments where curation and oversight are aligned with the data’s intended purpose and use. It acknowledges that supporting FAIR principles does not always require full curation to the standards of a long-term publication. Instead, it focuses on reducing barriers to data sharing by promoting active data management practices. These practices enhance discoverability, trust, and reliability, ensuring that shared data is fit for purpose without imposing unnecessary burdens.

ACCESS-NRI is a national research infrastructure (NRI) established to support the Australian Community Climate and Earth System Simulator, or ACCESS. The ACCESS suite of software and data outputs are essential tools used to simulate past and future climate, weather and Earth systems and to support research and decision making within Australia.

ACCESS-NRI's mission is to build an open collaborative infrastructure that will accelerate research in Earth system, climate and weather modelling as well as enable new research not currently possible. The facility brings together skills in software development, high-performance computing, data management and analysis to enhance the ACCESS modelling framework, making it easier to use and lowering the barrier for scientific innovation.

How to cite: Richards, C., Druken, K., Beucher, R., and Chun, F.: Breaking down data sharing barriers and uplifting FAIR for climate data at scale, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-7959, https://doi.org/10.5194/egusphere-egu25-7959, 2025.

15:25–15:35
|
EGU25-15095
|
On-site presentation
Rorie Edmunds, Jens Klump, Kirsten Elger, Lesley Wyborn, Kerstin Lehnert, Lindsay Powers, and Fabian Kohlmann

Research to address global environmental and societal challenges increasingly depends on the availability of large-scale, multidisciplinary datasets, making the need for robust systems that ensure data discoverability, accessibility, and interoperability ever more critical. However, having the data is not enough; one also needs to know about—and understand the connections among—related outputs and entities that support the veracity and reproducibility of the research.

The International Generic Sample Number (IGSN) is a persistent identifier (PID) for material samples arising from any research discipline. Originally developed in the Earth Sciences, the IGSN provides a vital component in solving the abovementioned challenges, enabling seamless integration of sample data across diverse platforms, disciplines, and organizational and geographic boundaries. By uniquely and permanently linking samples to their descriptions (provided as structured metadata), analytical results, and associated publications, IGSNs facilitate transparency, traceability, and reusability of material samples in line with the FAIR and CARE Principles. This is underpinned by the proven interoperability of the IGSN with the scientific communication infrastructure, which also enables citations of samples in the literature to be automatically captured.
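
Because IGSNs minted under the IGSN-DataCite partnership are registered as DataCite DOIs, a sample's structured metadata can be retrieved through the public DataCite REST API. The sketch below illustrates this; the sample identifier is a made-up placeholder.

```python
# A sketch of resolving an IGSN's metadata through the DataCite REST API.
# The identifier below is a hypothetical placeholder, not a registered IGSN.
import requests

igsn_doi = "10.0000/EXAMPLE-SAMPLE-123"  # hypothetical IGSN DOI

resp = requests.get(f"https://api.datacite.org/dois/{igsn_doi}", timeout=60)
resp.raise_for_status()
attrs = resp.json()["data"]["attributes"]

# The structured metadata links the physical sample to related outputs
print(attrs["titles"], attrs.get("relatedIdentifiers", []))
```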

This presentation will showcase the collaborative efforts of the IGSN Organization (IGSN e.V.) and DataCite to establish a resilient, cross-disciplinary, globally harmonized PID system for material samples. Use cases will illustrate how IGSNs enhance research workflows, enabling researchers to be more effective and attributed. We will also discuss governance, technical standards, and best practices that promote trust in the IGSN-DataCite partnership and scalability of sample PID adoption, aligning with UNESCO’s Open Science recommendations.

How to cite: Edmunds, R., Klump, J., Elger, K., Wyborn, L., Lehnert, K., Powers, L., and Kohlmann, F.: Advancing Open, FAIR, and Responsible Science through the International Generic Sample Number, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-15095, https://doi.org/10.5194/egusphere-egu25-15095, 2025.

15:35–15:45
|
EGU25-20550
|
On-site presentation
Hans Pfeiffenberger and David Carlson

As founders and former chief editors of Earth System Science Data (ESSD), the authors are concerned about the reproducibility and availability of important scientific sources and findings, and about timely access to scientific data and data-related services. We are discussing (1) incidents with the availability of DOIed datasets and their metadata and (2) a recent outage of an important data infrastructure.

Both observations are considered sufficiently serious that the authors wonder why the underlying facts and realities are not discussed widely in this community.

1) The most cited dataset published through ESSD is the series of yearly reports on the Global Carbon Budget, e.g. the latest, https://doi.org/10.5194/essd-2024-519. These articles are cited in scientific publications hundreds of times and routinely inform the United Nations climate change conferences (COPs). The first datasets of the series were held by, and received DOIs from, the Carbon Dioxide Information Analysis Center (CDIAC), which was hosted by the Oak Ridge National Laboratory. When CDIAC was shut down in 2017, the datasets were transferred to a repository at another US National Lab, losing most of the metadata in the process, most notably authorship. Thankfully, hosting of post-2017 additions to the dataset series has been taken over by the Integrated Carbon Observing System (ICOS), and DOIs to all elements of the series still resolve (albeit in a sloppy manner for pre-2018 data). One could argue that the most reliable holder of metainformation about this – not just scientifically – important data is not the repositories but ESSD, operated by a commercial publisher, Copernicus.

2) When tropical storm Helene hit North Carolina in September 2024, power and internet connectivity went out at the Asheville headquarters site of NOAA’s NCEI, an aggregator, archive and service provider for environmental data. Although NCEI is hosted at four geographically dispersed sites, NCEI data ingest and services came to a halt for several weeks. It appears that most data from the period during and after Helene have been collected retroactively, and services are fully available again. While NOAA’s real-time weather services, important for dealing with the emergency, seem to have been available during Helene, one is tempted to ask whether they could become interrupted under similar circumstances.

Both these and some other observations – which will be discussed at EGU2025 – create the uncomfortable impression that this community's huge efforts with respect to the FAIRness of data and the creation of a multitude of publicly funded infrastructure elements do not meet today’s needs, and possibly may not meet them tomorrow. If government labs and agencies of a rich nation cannot achieve this – who can?

(Part of this work has been presented before, at a pre-conference workshop to RDA20, Gothenburg, 2023)

How to cite: Pfeiffenberger, H. and Carlson, D.: Are Publicly Funded Data-Infrastructures Reliable?, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-20550, https://doi.org/10.5194/egusphere-egu25-20550, 2025.

Coffee break
Chairpersons: Martina Stockhause, Danie Kinkade, Alba Brobia
16:15–16:20
16:20–16:30
|
EGU25-9104
|
On-site presentation
Colin Price, Aviv Shay, and Peter Baumann

Lightning is a hazard for many sectors and industries, including the power utility sector, wind turbines, forest management, and civil aviation. Commercial aircraft are struck by lightning approximately once every year, but most airlines try to avoid thunderstorms where possible by rerouting around these turbulent and electrified storms. However, such diversions can delay flights and increase fuel costs, while increasing the airline's greenhouse gas emissions. In this study, using datacubes, we have combined lightning data from the World Wide Lightning Location Network (WWLLN) with civil aviation flight data from FlightRadar24 to better understand the risks of lightning to civil aviation. Combining historic lightning and aviation data, we can address questions about risks to aircraft from thunderstorms, the frequency of close encounters with thunderstorms, and the frequency of rerouted flights due to thunderstorm activity. The emerging concept of Analysis-Ready Data (ARD) seeks concepts and methods for services operating on homogenized data. For spatio-temporal data, datacubes are an accepted cornerstone of ARD, making Big Geo Data easier for users and applications to work with: ready for analysis, visualization, fusion, etc. As part of the Cube4EnvSec NATO Science for Peace and Security (SPS) project, we will present live demos of our datacube tools and services related to lightning risks for civil aviation over Europe. Derived analytics from the datacube will also be presented.

How to cite: Price, C., Shay, A., and Baumann, P.: Using Data Cubes to Investigate Links Between Lightning and Civil Aviation, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-9104, https://doi.org/10.5194/egusphere-egu25-9104, 2025.

16:30–16:40
|
EGU25-6020
|
On-site presentation
Dimitar Mishev and Peter Baumann


Datacubes are an accepted cornerstone for Analysis-Ready Data (ARD). One analysis technique of skyrocketing importance today is AI, and this raises the question: how can the evaluation of pre-trained models on datacubes be generalized?

From a theoretical viewpoint, the connection is immediate: datacubes mathematically resemble tensors, and EO models evaluate tensors. In practice, though, the situation is less straightforward, as our experiments with different models have shown. A main issue is the variety and lack of standardized interfaces of ML models: they process different input data, require model-specific preprocessing, and more. In our research towards offering ML-on-datacubes as a commodity in a federated datacube infrastructure, we have collected challenges and methods for presentation.

In our demo, we present AI-Cubes as an enabling concept uniting AI and datacubes. The demos will approach the theme from two sides:

- AI support for datacube query writing: We have trained a chatbot to explain and assist with datacube queries in the OGC/ISO/INSPIRE WCPS standard. This can act as a productivity-enhancing tool for both expert and non-expert users. We demonstrate live how specific questions get answered, such as phrasing NDVI on Sentinel-2 data (see the sketch after this list).

- AI model evaluation on datacubes: particularly attractive is that datacubes allow simple navigation to any area, any time, and even from federated services. This we demonstrate live.
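
As context for the demo, here is a minimal sketch of the kind of WCPS query the chatbot helps phrase: an NDVI computation on a Sentinel-2 datacube, submitted to a rasdaman endpoint. The service URL, coverage name, and axis labels are assumptions.

```python
import requests

WCPS_ENDPOINT = "https://datacube.example.org/rasdaman/ows"  # placeholder

# NDVI = (NIR - red) / (NIR + red) over a small spatio-temporal subset;
# coverage id S2_L2A and axis names Lat/Long/ansi are hypothetical.
wcps_query = """
for $c in (S2_L2A)
return encode(
  (($c.B08 - $c.B04) / ($c.B08 + $c.B04))
    [Lat(46.0:46.5), Long(11.0:11.5), ansi("2024-06-01")],
  "image/tiff")
"""

resp = requests.post(
    WCPS_ENDPOINT,
    data={
        "service": "WCS",
        "version": "2.0.1",
        "request": "ProcessCoverages",
        "query": wcps_query,
    },
    timeout=300,
)
resp.raise_for_status()
with open("ndvi.tif", "wb") as f:
    f.write(resp.content)
```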

We also highlight challenges that come with this simple data access: models do not deliver the same performance anywhere, anytime. This has led to new research on "model fencing", i.e., attempting to restrict model application to situations where models exhibit sufficient accuracy. We present first ideas from this research.

Altogether, we cast light on the combination of datacubes and AI from a service and infrastructure perspective. 

How to cite: Mishev, D. and Baumann, P.: AI and Datacubes - a Happy Marriage?, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-6020, https://doi.org/10.5194/egusphere-egu25-6020, 2025.

16:40–16:50
|
EGU25-10578
|
On-site presentation
Arno de Kock, Timm Waldau, Pedro Batista, Peter Baumann, Thorsten Behrens, Peter Fiener, Jens Foeller, Markus Moeller, Ingrid Noehles, Karsten Schmidt, and Burkhard Golla

The DynAWI Extreme Weather Toolbox represents an innovative approach to addressing climate-related challenges in agriculture. This publicly accessible web application offers three primary functions: a historical agricultural weather indicator atlas, a dynamic configurator for calculating user-specified weather indexes, and a forecast model for predicting reduced yields or complete crop failure due to weather extremes. The web application can perform real-time analyses based on multi-dimensional spatio-temporal data.

The technical implementation is based on a client-server architecture, utilizing a scalable geodata infrastructure and the rasdaman array database management system, enabling efficient processing of multidimensional geodata. The system allows real-time analysis of extreme weather events, such as droughts, heatwaves, and heavy rainfall, dating back to 1995. The toolbox aims to provide stakeholders—from farmers to policymakers—with a comprehensive platform for weather-related risk assessment and decision support in agriculture.

In a live demonstration, we will showcase the platform's key features, emphasizing its interactive capabilities and extensive parameter customization options.

How to cite: de Kock, A., Waldau, T., Batista, P., Baumann, P., Behrens, T., Fiener, P., Foeller, J., Moeller, M., Noehles, I., Schmidt, K., and Golla, B.: DynAWI Extreme Weather Toolbox: an online platform for agricultural risk assessment and decision support, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-10578, https://doi.org/10.5194/egusphere-egu25-10578, 2025.

16:50–17:00
|
EGU25-2826
|
On-site presentation
Giuliano Langella, Piero Manna, and Florindo Antonio Mileti

Land take, a significant driver of land degradation, poses challenges for sustainable land management, particularly in regions under high anthropogenic pressure. Addressing these challenges necessitates robust, data-driven approaches to monitor, quantify, and mitigate land take. This contribution explores the integration of datacube technology within the LandSupport Regions platform, leveraging advances from the European LandSupport project and its extension under the Italian GeoSciences-IR project.

Raster datacubes, structured as multidimensional arrays, enable efficient management and analysis of large-scale spatio-temporal datasets, overcoming traditional file-based limitations. The LandSupport Regions platform utilizes a datacube-based Spatial Decision Support System (S-DSS) to enhance the monitoring of land consumption, land cover, and land use at multiple scales—from municipal to national levels. The system integrates heterogeneous datasets, including satellite imagery (e.g., Copernicus), regional land use maps, and environmental indicators (such as high resolution and multi-temporal imperviousness maps), within a common infrastructure, adhering to the FAIR principles.

A key focus is on land take quantification, supported by high-resolution datacubes capable of tracking land-use changes over time. The platform offers decision-makers a suite of tools for generating actionable indicators, assessing compliance with land-use policies, and proposing mitigation strategies aligned with zero net land take objectives. Moreover, the system’s interoperability and open-access characteristics allow integration of user-defined data and models, fostering innovation and scalability.

The platform’s capabilities are demonstrated through use cases in Italy, where local administrations leverage datacube analytics to refine urban and regional planning. These use cases underscore the role of datacubes in delivering accurate, timely insights for sustainable land management. By aligning regional initiatives with European Green Deal objectives, the LandSupport Regions platform – produced under the GeoSciences-IR project – exemplifies the potential of datacube-enabled S-DSSs to advance environmental governance.

How to cite: Langella, G., Manna, P., and Mileti, F. A.:  Datacubes as enablers for land take quantification in LandSupport Regions, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-2826, https://doi.org/10.5194/egusphere-egu25-2826, 2025.

17:00–17:10
|
EGU25-3362
|
On-site presentation
Xiaogang Ma, Jiyin Zhang, and Jolyon Ralph

Over the past three years, we have successfully launched an open data service for Mindat, one of the largest databases focused on mineral species and their global distributions. Our achievements include: 1) a comprehensive review of existing data records, covering the list of data subjects, their characteristics, and inherent biases, 2) the establishment of an open data API (application programming interface) alongside Python and R packages that integrate the API into workflow platforms, and 3) fostering community collaboration on data standards and best practices for open data, such as mineral nomenclature, rock classification, and technical frameworks for applying the FAIR (findable, accessible, interoperable, and reusable) principles. Mindat is both crowd-sourced and expert-curated, and over the past decades this has proven to be an effective approach for engaging both data contributors and users. Mindat has been popular amongst geoscience professionals and the public alike. Through our open data initiatives and community engagement, we have also gathered valuable insights to guide future developments of Mindat open data. In this presentation, we will highlight the current open data capabilities, provide an overview of the review of Mindat's data records, and share our vision for leveraging advanced artificial intelligence technologies to expand and enhance Mindat in the future.
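
A brief sketch of the open data API in use follows; the endpoint path, query parameters, and response fields below are assumptions made for illustration (the authoritative reference is the Mindat API documentation), and a personal API token is required.

```python
# A sketch of calling the Mindat open data API directly with requests.
# Endpoint path, parameters, and field names are assumptions; see the
# Mindat API docs for the actual interface. A token is required.
import requests

API = "https://api.mindat.org"
headers = {"Authorization": "Token YOUR_API_KEY"}  # placeholder token

# e.g. look up mineral species matching a name fragment (assumed params)
resp = requests.get(
    f"{API}/geomaterials/",            # assumed endpoint
    params={"name": "quartz"},
    headers=headers,
    timeout=60,
)
resp.raise_for_status()
for entry in resp.json().get("results", []):
    print(entry.get("name"), entry.get("mindat_formula"))
```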

How to cite: Ma, X., Zhang, J., and Ralph, J.: Mindat: A crowd-sourced and expert-curated open data ecosystem for mineralogy, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-3362, https://doi.org/10.5194/egusphere-egu25-3362, 2025.

17:10–17:20
|
EGU25-2206
|
On-site presentation
Elisabetta D'Anastasio, Jonathan B. Hanson, Steve Sherburn, Joshua Groom, and Mark Rattenbury

The GeoNet programme at GNS Science Te Pū Ao (GNS) is the primary agency responsible for collecting, managing, and delivering geohazard data in Aotearoa New Zealand, enabling the monitoring of volcanoes, earthquakes, landslides, and tsunamis. The programme oversees a multi-parametric sensor network along with a diverse array of instrumentation and methodologies to provide both raw and analysis-ready data to its end users. Since its inception in 2002, an "open by default" policy has been the guiding principle of this research and monitoring data infrastructure. 

To enhance the interoperability, accessibility, and usability of GeoNet's data, and in alignment with FAIR data principles, we developed an in-house interdisciplinary solution (Tilde) for storing and accessing time-series datasets managed by the programme. Operating successfully for two years, Tilde has improved the interoperability, usability, and FAIR-ness of GeoNet data. In this presentation, we will outline how Tilde has achieved these improvements, discuss challenges and unresolved questions within the geophysical community, and explore potential future directions for leveraging this open data platform to address CARE principles and indigenous data governance in Aotearoa New Zealand. 

How to cite: D'Anastasio, E., Hanson, J. B., Sherburn, S., Groom, J., and Rattenbury, M.: Integrated and Open geohazard monitoring data in Aotearoa New Zealand: developing an interoperable data service for GeoNet’s time series datasets.  , EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-2206, https://doi.org/10.5194/egusphere-egu25-2206, 2025.

17:20–17:30
|
EGU25-16708
|
ECS
|
Virtual presentation
Chen-Yu Hao, Jo-Yu Chang, I-Liang Shih, and Ya-Chu Change

This study introduces a scalable and integrated datacube framework for efficient geospatial data processing and analysis. Leveraging the advanced cloud infrastructure of the National Center for High-Performance Computing (NCHC), the framework combines the openEO API and OGC services to address challenges in managing multidimensional datasets. By ensuring interoperability, security, and high-performance computing, the framework provides a reliable solution for researchers and practitioners to tackle complex geospatial challenges.

Framework Architecture

The framework architecture integrates advanced tools and services, focusing primarily on the openEO API and OGC standard services (e.g., Web Coverage Service and Web Coverage Processing Service). The openEO API provides a unified interface supporting multiple programming languages, allowing users to design and execute customized workflows and enabling batch processing.

openEO integration
The openEO API plays a central role in the framework, performing the following functions:

  • Unified Data Access and Processing Interface: openEO offers a standardized access and processing layer for Earth observation data, abstracting underlying complexities and enabling users to uniformly access multidimensional data from various sources, such as satellite imagery and terrain datasets.
  • Process Graphs and User-Defined Processes: openEO supports User-Defined Processes and Process Graphs, enabling users to create tailored data processing pipelines based on specific analytical requirements. This is particularly valuable for advanced analyses like temporal change detection or spatial statistics.
  • Seamless Integration with OGC Services: openEO works seamlessly with OGC services (e.g., WCS and WCPS) in the framework, enhancing its ability to handle multi-source data. While openEO provides high-level data access and analytical capabilities, OGC services ensure interoperability and standardization at the data layer.

API Proxy Architecture Design

The API proxy is a critical component of the framework, bridging the openEO API and the backend infrastructure to ensure efficient, secure, and stable interactions between users and the system. Its main functions include authentication, authorization management, traffic control, and caching. With the API proxy, openEO can provide a simplified user experience while ensuring optimal utilization of backend data and resources.

Application Scenarios

1. Terrain Analysis
By transforming digital terrain models (DTMs) into multidimensional structures, the framework significantly improves the processing speed and accuracy of large-scale datasets. openEO’s role includes providing a unified interface for data access, enabling users to quickly retrieve and process data for custom slope calculations, visibility analyses, and more. Simultaneously, API proxy security layers ensure strict management of data access and usage.

2. Temporal Analysis Using Landsat Imagery
Temporal analysis of Landsat imagery involves handling large volumes of time-series data. Here, openEO acts as the analytical hub, allowing users to submit analysis requests (e.g., calculating the Normalized Difference Water Index (NDWI)) via the API. The framework then automatically invokes OGC services for data processing and result generation.
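
The NDWI request in scenario 2 might look as follows with the openEO Python client; the backend URL, collection id, and band names are assumptions, while the client calls themselves are standard openEO.

```python
# A sketch of the NDWI workflow via the openEO Python client. The backend
# URL, collection id, and band names are hypothetical placeholders.
import openeo

conn = openeo.connect("https://openeo.example.org")  # placeholder backend
conn.authenticate_oidc()  # or another auth method the backend supports

cube = conn.load_collection(
    "LANDSAT8_L2",                       # hypothetical collection id
    spatial_extent={"west": 120.0, "south": 23.0,
                    "east": 121.0, "north": 24.0},
    temporal_extent=["2023-01-01", "2023-12-31"],
    bands=["B3", "B5"],                  # green and NIR for Landsat 8
)

# NDWI = (green - NIR) / (green + NIR), built as a server-side process graph
green, nir = cube.band("B3"), cube.band("B5")
ndwi = (green - nir) / (green + nir)

ndwi.download("ndwi.nc")  # executed on the backend, returned as netCDF
```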

Conclusion

The proposed datacube framework successfully integrates openEO API and OGC services, offering a scalable, interoperable, and high-performance solution. As a unified data access and analytical interface, openEO provides flexible and robust tools that significantly simplify complex data processing workflows. By lowering technical barriers and enhancing analytical accessibility, the framework delivers unprecedented convenience for geospatial data analysis, making it a key tool in research and decision-making processes.

How to cite: Hao, C.-Y., Chang, J.-Y., Shih, I.-L., and Change, Y.-C.: Scalable and Interoperable Datacube Framework for Advanced Geospatial Data Analysis, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-16708, https://doi.org/10.5194/egusphere-egu25-16708, 2025.

17:30–17:40
|
EGU25-18601
|
ECS
|
On-site presentation
Jerimar Vasquez Rojas, Juan Carbone, Alfredo Izquierdo González, Javier Benavente González, Jesús Gómez Enri, Tomás Fernández -Montblanc, Flavio Martins, William Cabos Narvaez, Carlos Yagüe, Carlos Román-Cascón, Oscar Álvarez, Caio Fonteles, Bruno Marques, and Francisco Campuzano

The main objective of the OceanUCA project is the modernization of the technological infrastructure of the University of Cadiz in relation to atmospheric and hydrodynamic numerical modeling specifically configured to simulate physical processes on the coast of Andalusia (Spain).
The initiative focuses on the improvement of modeling systems (oceanographic and atmospheric) and the modernization of servers, mainly THREDDS and ERDDAP. THREDDS facilitates connectivity between scientific data providers and end users, while ERDDAP simplifies the sharing and visualization of time series data through common formats, graphics and maps. The project aims to optimize access, organization and storage of data, create a complete data bank and standardize protocols.
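
For illustration, ERDDAP's tabledap interface encodes variable selections and constraints directly in a URL and returns common formats such as CSV; the server address, dataset id, and variable names below are placeholders.

```python
# A sketch of an ERDDAP tabledap request; server, dataset id, and
# variable names are hypothetical placeholders.
import pandas as pd

SERVER = "https://erddap.example.org/erddap"   # placeholder ERDDAP server
DATASET = "coastal_model_output"               # hypothetical dataset id

url = (
    f"{SERVER}/tabledap/{DATASET}.csv"
    "?time,latitude,longitude,sea_water_temperature"
    "&time>=2024-06-01T00:00:00Z&time<=2024-06-02T00:00:00Z"
)

df = pd.read_csv(url, skiprows=[1])  # row 1 holds units in ERDDAP CSV output
print(df.head())
```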
For the storage of data from numerical models, a file server has been acquired that allows the custody of large volumes of information related to simulated physical processes, with a particular focus on the Andalusian coasts. In the future, this server will also facilitate the storage of data from other sources for further calculation, processing and sampling. This acquisition helps centralize the files, currently distributed across different storage sources, and improves communication across the THREDDS/ERDDAP servers.
The project includes a web application that presents the models in a user-friendly and interpretable format, especially for the scientific community, through the visualization of images.
The technological infrastructure will allow significant advances by facilitating the download of numerical data and taking advantage of graphical processing and high-performance computing to process large data sets. This approach improves the scalability and resolution of forecasts, making them more accessible to the public. By adopting an open-source framework, the project promotes collaboration and knowledge sharing at national and international scales, empowering both the scientific community and the public to use coastal and atmospheric data for informed decision-making and sustainable resource management.

How to cite: Vasquez Rojas, J., Carbone, J., Izquierdo González, A., Benavente González, J., Gómez Enri, J., Fernández -Montblanc, T., Martins, F., Cabos Narvaez, W., Yagüe, C., Román-Cascón, C., Álvarez, O., Fonteles, C., Marques, B., and Campuzano, F.: OceanUCA: Technological innovation for the management and communication of coastal data in Andalusia through numerical modelling and open source technology., EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-18601, https://doi.org/10.5194/egusphere-egu25-18601, 2025.

17:40–17:50
|
EGU25-10638
|
On-site presentation
Ivonne Anders, Beate Krüss, Marco Kulüke, Karsten Peters-von Gehlen, Hannes Thiemann, and Heinrich Widmann

In recent years, the concept of data spaces has gained prominence, particularly in industry, as a framework for organizing and sharing data across business ecosystems and institutional and disciplinary boundaries. While the term itself is not yet widely adopted in the scientific community, it can be directly applied to research. Data spaces provide a structured environment for integrating data sets from diverse disciplines, methods or fields and making them accessible for collaboration and analysis. Climate and climate impact research, which relies on data from different fields such as meteorology, hydrology or socio-economics, is in a unique position to benefit from the application of this approach.

In line with the principles of open science, researchers are increasingly adopting frameworks that promote transparency, accessibility and reproducibility. FAIR Digital Objects (FDOs) offer effective means of achieving these goals while also enabling interactions between different data spaces. As standardized, interoperable, and machine-readable entities, FDOs link data, metadata and software, simplifying integration and promoting reuse across disciplines.

Using an example from climate research, we demonstrate how climate model data from an institutional data space, observational data from field campaigns, and satellite data (e.g., from the Destination Earth Data Lake) can be combined. By employing STAC (SpatioTemporal Asset Catalog) catalogs defined as FAIR Digital Objects via the European Open Science Cloud (EOSC) Data Type Registry, we address a specific interdisciplinary research question. This approach not only illustrates the practical application of FDOs but also highlights how they can provide a robust framework for tackling larger and more complex scientific challenges by streamlining workflows and enabling collaboration across disciplinary and institutional boundaries.

How to cite: Anders, I., Krüss, B., Kulüke, M., Peters-von Gehlen, K., Thiemann, H., and Widmann, H.: Climate Science Meets Data Spaces: FAIR Digital Objects as a Gateway to Interdisciplinary Science, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-10638, https://doi.org/10.5194/egusphere-egu25-10638, 2025.

17:50–18:00
|
EGU25-20533
|
Virtual presentation
Jon Vandegriff, Robert Weigel, Jeremy Faden, and Alexander Antunes

We describe a simple interface for accessing time series numeric data: the Heliophysics Application Programmer's Interface (HAPI). Although it started in NASA's Heliophysics domain, no Heliophysics idioms are present in the standard, and HAPI can be used to serve any tabular, numeric data that is indexed by time. HAPI was the result of a community push to standardize similar access methods at multiple data centers, and it is now in use at 12 data centers around the world, with over 12,000 datasets available in a standard way. HAPI offers a more conceptual view of the data, independent of the storage arrangements at a server. It is also not intended to replace an existing server's API, but to sit alongside that API. The project is mature, with a reference server available, as well as clients in multiple programming languages. We will present an overview of the API and its compliance with the FAIR principles. We will also describe some of the visualization and analysis tools being developed now that standardized access is becoming a reality. We invite discussion with time series data providers in other domains.
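
For example, with the reference Python client (hapiclient), fetching a parameter over a time range takes a single call; the server and dataset below follow the documented client examples, and any participating data center's HAPI endpoint works the same way.

```python
# A sketch of fetching a time series via HAPI with the Python client
# (pip install hapiclient). Server and dataset follow the documented
# hapiclient examples; substitute any public HAPI endpoint.
from hapiclient import hapi

server = "https://cdaweb.gsfc.nasa.gov/hapi"  # a public HAPI endpoint
dataset = "OMNI2_H0_MRG1HR"                   # example dataset id
parameters = "DST1800"                        # one parameter of the table
start = "2003-09-01T00:00:00"
stop = "2003-10-01T00:00:00"

# data is a NumPy structured array; meta echoes the HAPI info response
data, meta = hapi(server, dataset, parameters, start, stop)
print(meta["parameters"], data[:5])
```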

How to cite: Vandegriff, J., Weigel, R., Faden, J., and Antunes, A.: A Grass-roots Standard for Time Series Data in any Domain: HAPI, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-20533, https://doi.org/10.5194/egusphere-egu25-20533, 2025.

Posters on site: Tue, 29 Apr, 10:45–12:30 | Hall X4

The posters scheduled for on-site presentation are only visible in the poster hall in Vienna. If authors uploaded their presentation files, these files are linked from the abstracts below.
Display time: Tue, 29 Apr, 08:30–12:30
Chairpersons: Martina Stockhause, Alba Brobia
X4.1
|
EGU25-7038
Emanuel Soeding, Dorothee Kottmeier, Andrea Poersch, Stanislav Malinovschii, and Sören Lorenz

At the Helmholtz Association, we strive to establish a well-structured, harmonized data space that seamlessly connects information across distributed data infrastructures. Achieving this goal requires the standardization of dataset descriptions through consistent metadata practices, such as leveraging persistent identifier (PID) metadata, to ensure interoperability and machine actionability.

While developing concepts to harmonize PID metadata is a crucial first step in creating a unified data space, it is not sufficient on its own. The practical application of PIDs to facilitate the compilation of rich, relevant metadata for datasets necessitates knowledge, training, support, and cooperation among diverse stakeholder groups, each responsible for different aspects of the information lifecycle.

For example, ORCID is a PID system designed to identify individuals contributing to research. Traditionally, this has primarily applied to scientists publishing journal articles. However, in the context of research data, other stakeholders also play vital roles. These include technicians operating instrumentation, data management personnel curating research data and repositories, and administrative staff maintaining institutional data relevant to research. Currently, these stakeholders are often unaware of their potential roles in data management, and the information they collect is typically not harmonized. To address this, workflows must be implemented to manage, structure, and connect the information they produce to research data where appropriate. In the case of ORCID, these workflows should begin at the earliest stages of the research process, such as during employee onboarding.

PIDInst, a PID system introduced by an RDA working group, provides a simple metadata schema that collects essential information about instruments and registers them with unique IDs. These IDs are invaluable for identifying measurements conducted with the same or similar devices. We therefore strongly recommend the adoption of PIDInst within the Helmholtz Association. Successful implementation would involve integrating the workflow into existing processes, starting with the acquisition of an instrument or sensor at the research center. Relevant information would then be passed to technicians responsible for maintaining up-to-date records. For researchers, PIDInst provides reliable identification of devices used in scientific processes.
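
As a hedged illustration of what such an instrument record could look like, the Python sketch below assembles a minimal description loosely following the published PIDInst schema properties; the identifier, landing page, and owner values are invented placeholders, and a real registration would go through an agency such as DataCite or ePIC.

import json

# Hedged sketch: field names mirror PIDInst schema properties, but this is
# an illustration, not a validated registration payload.
instrument = {
    "Identifier": {"identifierValue": "10.21384/example-instr-001",  # placeholder DOI
                   "identifierType": "DOI"},
    "SchemaVersion": "1.0",
    "LandingPage": "https://instruments.example.org/ctd-042",  # placeholder URL
    "Name": "CTD probe SN-042",
    "Owners": [{"ownerName": "Example Helmholtz Centre"}],
    "Manufacturers": [{"manufacturerName": "Sea-Bird Scientific"}],
    "Model": {"modelName": "SBE 911plus"},
    "MeasuredVariables": [{"measuredVariable": "sea water temperature"},
                          {"measuredVariable": "conductivity"}],
}

print(json.dumps(instrument, indent=2))  # serialize for inspection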

In this presentation, we highlight critical positions within the centers where minor adjustments to established workflows can significantly support the registration of specific PIDs and the engagement of stakeholder groups. We also explore strategies for implementing these changes across the Helmholtz Association. Furthermore, we assign clear responsibilities for metadata maintenance to appropriate stakeholders. The conclusions drawn from this process aim to redefine roles and responsibilities within our organization, fostering a more integrated and effective approach to data management.

How to cite: Soeding, E., Kottmeier, D., Poersch, A., Malinovschii, S., and Lorenz, S.: Establishing Institutional Workflows to Engage Stakeholder Groups in PID Metadata Maintenance, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-7038, https://doi.org/10.5194/egusphere-egu25-7038, 2025.

X4.2
|
EGU25-21042
Eric E. Palmer, Kristina Lopez, and Mike Drum

The Planetary Data System (PDS) provides key structural support for Open Science by meeting the tenet of “free, unrestricted access” [1]. Here we will discuss the need to expand our offerings by improving support for the Open Science tenet of “ease of use.”

Analysis-ready data (ARD) provides data in formats that, while different from what was provided by the mission team, are orders of magnitude more useful to scientific researchers.

For NASA Planetary Science missions, the data are provided to us in stable, long-term formats that are well documented. However, the data formats for each mission are typically different, and many processing steps, such as ortho-rectification, geospatial positioning, or co-alignment with digital terrain models, are not performed by the science team for the archived products. Additionally, there is little consensus within Planetary Science on a standard format for almost any data type; images, for example, can be in FITS, VICAR, custom IMG formats, or sometimes JPEG.

PDS nodes have begun to host such ARD either as part of the official archive or outside of it using the new PDS annexes [2]. We have several initiatives to support ARD, including the Small Bodies Image Browser and digital terrain models in both ISIS and GeoTIFF formats. While generating data in these formats initially requires additional effort, once created the products continuously provide value to the data user community.
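
As a brief illustration of the “ease of use” gain, a terrain model delivered as GeoTIFF opens directly in standard geospatial tooling; in this hypothetical Python sketch using rasterio, the filename is a placeholder for a DTM obtained from a PDS node or annex.

import rasterio

# Hypothetical local file standing in for a downloaded ARD terrain model
with rasterio.open("small_body_dtm.tif") as dtm:
    elevation = dtm.read(1)      # first band as a NumPy array
    print(dtm.crs)               # coordinate reference system, if defined
    print(dtm.bounds)            # spatial extent
    print(elevation.min(), elevation.max())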

Analysis-ready data can significantly increase “ease of use” in many different ways. ARD products are typically preprocessed, saving data users the significant effort they would otherwise spend learning how to process the data themselves; this preprocessing also lowers the technical barriers to working with complex data sets. In addition, datasets can be provided in standardized, commonly used data formats that are more usable and accessible than many of the current formats. Streamlining ARD would greatly ease both researchers' and the public's ability to use data spanning many different missions in ways that are not currently possible. Focusing on providing the most interoperable and usable data to the community also enables more interdisciplinary collaboration and increases reproducibility, all key goals of Open Science.

Analysis-ready data in the PDS will be essential to create more open and usable data. As the complexity of planetary mission data increases, ARD can allow the PDS to maximize the scientific return of these valuable datasets.

References:
[1] NASA Science Mission Directorate. (2023). Open-Source Science Guidance, Version 2.1.
[2] Mouginis-Mark, P., Williams, D., Bleacher, J., et al. (2023). Analysis Ready Data (ARD) within the Planetary Data Ecosystem: Benefits for the Science Community. 54th Lunar and Planetary Science Conference.

How to cite: Palmer, E. E., Lopez, K., and Drum, M.: Efforts of the Small Bodies Node in Providing “Analysis Ready Data” to Support Open Science, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-21042, https://doi.org/10.5194/egusphere-egu25-21042, 2025.

X4.3
|
EGU25-4843
Peter Baumann, Colin Price, Vlad Merticariu, Bang Pham Huu, and Dimitar Misev

Air traffic today is immense, with large numbers of people and goods transported routinely, alongside flights for search and rescue missions, military operations, hobby piloting, and more. Still, aviation today is safer than it has ever been, thanks to advanced technology and procedures that are continuously revisited and, where necessary, improved. Of central importance for planning and conducting flights is the atmospheric condition the aircraft is flying in, represented by various relevant weather parameters; hence, these are continuously monitored.

In the Cube4EnvSec project, a federated datacube demonstrator has been established that illustrates the ad-hoc assessment of atmospheric conditions relevant for aircraft. Data stem from two sources: DWD and WWLLN.

From the German Weather Service (Deutscher Wetterdienst, DWD), WAWFOR (World Aviation Weather Forecast) data are obtained: a digital aviation meteorology dataset in support of air traffic management, based on the ICON6_Nest model and geostationary weather satellites. Components currently used are wind speed, icing parameters, Cumulonimbus tops, temperature, tropopause, turbulence, lightning, precipitation radar, volcanic ash, and dust. Updates are provided every 6 hours; the temporal resolution is 1 hour, with a forecast window of currently 78 hours. The update batches are harvested from DWD and merged into the respective datacubes, extending them by a further 6 hours into the future. The 6 hours not overwritten by the new forecast are retained, creating a growing "long tail" of historical weather data, currently about 17,000 time slices. Some datacubes are 3D (x/y/t); most, however, are 4D (x/y/h/t) with a spatial resolution of 0.0625° x 0.0625° (approximately 6.5 km x 6.5 km) and altitudes between ground level and 18,000 feet (FL180).

Lightning data are obtained from the World Wide Lightning Location Network (WWLLN) by the Colin Price research group at Tel Aviv University and aggregated into a 3D x/y/t time series of observed lightning strikes. Spatial resolution is 0.1°; temporal resolution is 1 hour.

Altogether, the datacubes currently have a footprint of about 20 TB. APIs offered by the Aviation Safety service include the adopted standards WMS, WMTS, WCS, and WCPS, as well as the OGC drafts OAPI-Coverages and GeoDataCube. Any client conforming to these APIs can be utilized; in the demonstration, the rasdaman dashboard will be used, which is configurable for manifold datacube interaction techniques (see the figure at the bottom).
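
For illustration, a WCPS query can be issued over plain HTTP; in the hedged Python sketch below, the service path, coverage name, and axis labels are assumptions, since the actual names are discoverable via the service's GetCapabilities document.

import requests

ENDPOINT = "https://cube4envsec.org/rasdaman/ows"   # assumed service path

# Extract a wind-speed time series at one location and altitude from a
# hypothetical 4D x/y/h/t datacube (coverage and axis names are assumptions).
query = """
for $c in (WAWFOR_WindSpeed)
return encode(
  $c[Long(8.5), Lat(50.0), Height(10000),
     ansi("2025-04-01T00:00:00Z":"2025-04-03T00:00:00Z")],
  "json")
"""
resp = requests.get(ENDPOINT, params={"service": "WCS", "version": "2.0.1",
                                      "request": "ProcessCoverages",
                                      "query": query})
print(resp.status_code, resp.text[:200])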

The demonstration presented includes the following steps:

  • general overview of the datacubes offered by the service;
  • visualization of the combined forecast/history weather datacubes;
  • information relevant for pilot flight planning: weather hazards overview; severe weather conditions along historic routes;
  • same for ad-hoc chosen flight paths, with 4D corridor cutout;
  • various analytical queries related to flight weather conditions.

Most parts of this demo are publicly accessible at https://cube4envsec.org/aviation-dashboard. Any standard web browser can access it without installing plugins or other software.

Acknowledgement
Cube4EnvSec has received funding from the NATO Science for Peace and Security (SPS) program.

Fig.: Aviation Safety datacube dashboard
How to cite: Baumann, P., Price, C., Merticariu, V., Pham Huu, B., and Misev, D.: Aviation Safety Datacubes, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-4843, https://doi.org/10.5194/egusphere-egu25-4843, 2025.

X4.4
|
EGU25-240
|
ECS
Lien Rodríguez-López, Lisandra Bravo Alvarez, Iongel Duran Llacer, David Ruíz-Guirola, Samuel Montejo-Sánchez, Rebeca Martínez-Retureta, Luc Bourel, Frederic Frappart, and Roberto Urrutia

This study examines the dynamics of limnological parameters of a South American lake in southern Chile, with the objective of predicting chlorophyll-a levels, a key indicator of algal biomass and water quality, by integrating remote sensing and machine learning techniques. Employing four advanced machine learning models, the research focuses on the estimation of chlorophyll-a concentrations at three sampling stations within Lake Ranco. The data span from 1987 to 2020 and are used in three different cases: only in situ data (Case 1); in situ and meteorological data (Case 2); and in situ, meteorological, and satellite data from the Landsat and Sentinel missions (Case 3). In all cases, each machine learning model shows robust performance, with promising results in predicting chlorophyll-a concentrations. Among these models, LSTM stands out as the most effective, performing best in Case 1 with R² = 0.89, an RMSE of 0.32 μg/L, an MAE of 1.25 μg/L, and an MSE of 0.25 (μg/L)², consistently outperforming the others according to the statistical metrics used for validation. This finding underscores the effectiveness of LSTM in capturing the complex temporal relationships inherent in the dataset. However, with the larger dataset of Case 3, TCNs perform best (R² = 0.96; MSE = 0.33 (μg/L)²; RMSE = 0.13 μg/L; MAE = 0.06 μg/L). The successful application of machine learning algorithms emphasizes their potential to elucidate the dynamics of algal biomass in Lake Ranco, located in the southern region of Chile. These results not only contribute to a deeper understanding of the lake ecosystem but also highlight the utility of advanced computational techniques in environmental research and management.
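
For readers interested in the model class, a minimal LSTM regressor sketch of the kind described is shown below (Python/TensorFlow); all shapes, hyperparameters, and the synthetic data are illustrative assumptions, not the study's configuration.

import numpy as np
import tensorflow as tf

# Assumed dimensions: 500 windows of 12 time steps with 8 predictor features
n_samples, window, n_features = 500, 12, 8
X = np.random.rand(n_samples, window, n_features).astype("float32")
y = np.random.rand(n_samples, 1).astype("float32")  # scaled chlorophyll-a

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(64, input_shape=(window, n_features)),
    tf.keras.layers.Dense(1),   # single regression output
])
model.compile(optimizer="adam", loss="mse", metrics=["mae"])
model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2, verbose=0)
print(model.evaluate(X, y, verbose=0))  # [MSE, MAE] on the synthetic data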

How to cite: Rodríguez-López, L., Bravo Alvarez, L., Duran Llacer, I., Ruíz-Guirola, D., Montejo-Sánchez, S., Martínez-Retureta, R., Bourel, L., Frappart, F., and Urrutia, R.: Leveraging Machine Learning and Remote Sensing for Water Quality Analysis in Lake Ranco, Southern Chile, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-240, https://doi.org/10.5194/egusphere-egu25-240, 2025.

X4.5
|
EGU25-16668
María Helga Guðmundsdóttir, Kjartan Birgisson, Hrafnkell Hannesson, Kristján Jónasson, Anette Th. Meier, Birgir Vilhelm Óskarsson, and Björn Darri Sigurðsson

The Drill Core Library (DCL) of the Natural Science Institute of Iceland is Iceland’s national repository for drill cores and cuttings. As such, the DCL is responsible for preserving these geological samples and making them available to the scientific community. The library comprises an estimated 100 km of core and a 470 km equivalent of cuttings from over 4,000 boreholes, as well as a growing database of analytical results. The collection spans Iceland’s range of diverse geological environments and houses core from significant research projects including the SUSTAIN drilling project in Surtsey, sponsored in part by the International Continental Scientific Drilling Program, and the Iceland Research Drilling Project. The DCL’s drill cores and cuttings are available for study and sampling for research purposes, and DCL staff are available for consultation and assistance in identifying and collecting suitable samples. The DCL’s on-site facilities are maintained in collaboration with the University of Iceland’s Research Centre in Breiðdalsvík, East Iceland.

An emphasis has been placed on developing digital infrastructure to improve access to the collections for the scientific community. To facilitate sample identification, an online map-based interface and WFS service have been created where the collection can be examined and contextualized with geological data. The database of the DCL has also been partly integrated into the European Plate Observing System (EPOS), a collaborative initiative enabling FAIR (Findable, Accessible, Interoperable, and Reusable) and open access to geoscientific data from across Europe.
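
As an illustration of how such a WFS can be consumed programmatically, the following Python sketch uses OWSLib; the endpoint URL and feature type name are placeholders, since the abstract does not specify them.

from owslib.wfs import WebFeatureService

# Placeholder endpoint; the real DCL service URL is published by the institute
wfs = WebFeatureService(url="https://gis.example.is/wfs", version="2.0.0")
print(list(wfs.contents))   # available feature types

# Fetch borehole features for a bounding box over East Iceland
# (the feature type name "dcl:boreholes" is a hypothetical placeholder)
resp = wfs.getfeature(typename=["dcl:boreholes"],
                      bbox=(-15.0, 64.5, -13.5, 65.5))
with open("boreholes.gml", "wb") as f:
    f.write(resp.read())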

The latest advance in digital access is the ongoing population of the DCL database with core photographs. These are linked directly to the WFS and map viewer, forming a “visual library” that enables direct examination of the library collections, thereby facilitating the identification of sampling targets by researchers around the world. At present, 16% of the drill core collection has been photographed, with 50% targeted by the end of 2025. Further development of the interface will be carried out in consultation with users of the DCL collections, and cores of interest to researchers are prioritized for photography.

How to cite: Guðmundsdóttir, M. H., Birgisson, K., Hannesson, H., Jónasson, K., Meier, A. Th., Óskarsson, B. V., and Sigurðsson, B. D.: The Visual Drill Core Library: A Tool for Improving Access to Samples from the Natural Science Institute of Iceland, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-16668, https://doi.org/10.5194/egusphere-egu25-16668, 2025.

Posters virtual: Tue, 29 Apr, 14:00–15:45 | vPoster spot 4

The posters scheduled for virtual presentation are visible in Gather.Town. Attendees are asked to meet the authors during the scheduled attendance time for live video chats. If authors uploaded their presentation files, these files are also linked from the abstracts below. The button to access Gather.Town appears just before the time block starts. Onsite attendees can also visit the virtual poster sessions at the vPoster spots (equal to PICO spots).
Display time: Tue, 29 Apr, 08:30–18:00
Chairpersons: Filippo Accomando, Andrea Vitale

EGU25-15216 | Posters virtual | VPS19

A relevant accessible and interoperable geotechnical data tool to support the landslide risk management 

Graziella Emanuela Scarcella, Luigi Aceto, and Giovanni Gullà
Tue, 29 Apr, 14:00–15:45 (CEST) | vP4.18

The rising frequency and severity of landslides, exacerbated by the effects of climate change and human development in unstable areas, call for effective risk management strategies. In this context, the systematic collection of all available geotechnical data, in particular geomaterial parameters and test results, plays a crucial role, providing a decisive contribution to defining strategies for sustainable landslide risk management.

In this work, we present the adaptation of a geotechnical database to the aims of the Tech4You Innovation Ecosystem project (Goal 1, Pilot Project 1), useful for identifying typical landslide scenarios and for assembling sufficient knowledge to define the geotechnical model and to type geomaterials in similar geo-environmental contexts. The database contains the results of laboratory tests carried out in the past by researchers at CNR IRPI in Rende, relating to 11 sites in Calabria, 10 in the Province of Catanzaro and 1 in the Province of Vibo Valentia. For each site, geotechnical characterisation data of the geomaterials, which represent a key cognitive element, were grouped by type of laboratory test (grain size, indices, Atterberg limits, oedometric, direct shear, and triaxial tests). We uploaded these data to validate a tool named GeoDataTech vers. 2.0, an update of a previous version. In particular, we tested its correct functioning (display, query, and data extraction) with a significant sample of data. GeoDataTech vers. 2.0 manages 2399 laboratory tests to date: 61 oedometric tests, 636 grain size, 537 indices, 78 Atterberg limits, 454 specific gravity, 512 direct shear tests, and 121 triaxial tests.

This tool will be available to a wide range of stakeholders (researchers, professionals, territorial administrations, public bodies, and citizens), allowing them to acquire, query, and export data and to upload their own files for integration into the tool's database, performing advanced analyses with reference to the typification of geomaterials. By enabling the sharing of such data between researchers, practitioners, and public institutions, the geotechnical tool will contribute significantly to improving disaster prevention strategies, in particular with regard to the reduction of landslide risks, thereby responding to the growing demand for accessible and interoperable data networks that strengthen synergistic interdisciplinary research on topics such as landslide hazard.

This work was funded by the Next Generation EU—Italian NRRP, Mission 4, Component 2, Investment 1.5, call for the creation and strengthening of ‘Innovation Ecosystems’, building ‘Territorial R&D Leaders’ (Directorial Decree n. 2021/3277)—project Tech4You—Technologies for climate change adaptation and quality of life improvement, n. ECS0000009. This work reflects only the authors' views and opinions; neither the Ministry for University and Research nor the European Commission can be considered responsible for them.

How to cite: Scarcella, G. E., Aceto, L., and Gullà, G.: A relevant accessible and interoperable geotechnical data tool to support the landslide risk management, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-15216, https://doi.org/10.5194/egusphere-egu25-15216, 2025.