- 1 Faculty of Geo-information Science and Earth Observation, University of Twente, Enschede, The Netherlands (s.girgin@utwente.nl)
- 2 Netherlands eScience Center, Amsterdam, The Netherlands
- 3 SURF, Utrecht, The Netherlands
- 4 Royal Netherlands Meteorological Institute, Utrecht, The Netherlands
Data accessibility is crucial in modern research across the Natural and Engineering Sciences (NES), including the Geosciences, and is central to the push toward Open Science. Yet accessing and efficiently processing rapidly growing datasets, such as Earth-related spatiotemporal data, remains challenging as sources diversify and collection frequency increases. Most of these datasets are hosted in the cloud, and cloud-native data access and processing are becoming essential digital competences. Cloud-based processing is especially beneficial because bringing computation close to the data improves efficiency and reduces analysis time. Despite this, many researchers still rely on the inefficient approach of downloading data for local analysis. Sometimes this is unavoidable because the data is not provided in cloud-friendly formats, but often it reflects a lack of skills in cloud-based access and processing. A similar problem occurs in data publishing, where research datasets are frequently shared in formats that impede efficient cloud access and interoperability, even though cloud-optimized formats could be used at no additional cost.
The CLOUD-NES project, funded by the Dutch Research Council (NWO) via the Thematic Digital Competence Centre NES (TDCC-NES), aims to advance cloud-native tools and workflows for publishing, accessing, and processing research data in the Netherlands. The project demonstrates the benefits of cloud-native approaches through reproducible performance benchmarks and equips researchers with practical training to strengthen their digital competences. A public cloud-native data repository with co-located analysis capabilities is being developed, featuring scalable object-based storage and a STAC-compliant data catalog, and hosting selected datasets from large-scale geospatial data providers in the Netherlands, such as PDOK and KNMI, transformed into cloud-optimized formats. Through iterative benchmarking, we are assessing the performance of cloud-native storage formats, access patterns, and analysis workflows, generating reproducible evidence of efficiency gains to support community adoption. All infrastructure, ingestion pipelines, and benchmarking code will be open-source, accompanied by detailed guidelines and documentation. To further accelerate adoption, domain-specific open training materials will be developed, and hands-on workshops for researchers and data providers will be organized. The training will cover cloud-native data access, workflow design, dataset publishing, and infrastructure deployment, using common domain-specific workflows as case studies. Community events and mini-symposia will foster community building and knowledge exchange, while lessons learned and best practices will be disseminated nationally and internationally.
By combining demonstrable benchmarks, practical training, and clear guidance for data providers, CLOUD-NES aims to accelerate the adoption of cloud-native research practices across the Dutch research community and beyond, improving efficiency, reproducibility, and accessibility of large, complex datasets. This presentation provides an overview of the CLOUD-NES project, covering the design and operation of its reproducible cloud-native benchmarking framework and the structure of its open training materials. Planned project activities, including community-building events and mini-symposia on effective cloud-native practices, will also be highlighted.
How to cite: Girgin, S., Nattino, F., Brandt, M., and Plieger, M.: Advancing cloud-native data access and processing for Natural and Engineering Sciences: CLOUD-NES, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-21049, https://doi.org/10.5194/egusphere-egu26-21049, 2026.