EGU26-19451, updated on 14 Mar 2026
https://doi.org/10.5194/egusphere-egu26-19451
EGU General Assembly 2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
Poster | Monday, 04 May, 14:00–15:45 (CEST), Display time Monday, 04 May, 14:00–18:00
Hall X4, X4.117
Making Kilometer-Scale Earth System Model (ESM) simulations usable: A workflow approach from European Eddy RIch ESMs (EERIE) project.
Chathurika Wickramage1, Fabian Wachsmann1, Jürgen Kröger2, Rohith Ghosh3, and Matthias Aengenheyster4
  • 1German Climate Computing Center, Data management, Hamburg, Germany (wickramage@dkrz.de)
  • 2Max Planck Institute for Meteorology, Hamburg, Germany
  • 3Alfred Wegener Institute Helmholtz Center for Polar and Marine Research, Bremerhaven, Germany
  • 4European Centre for Medium-Range Weather Forecasts, Reading, UK, Bonn, Germany

Kilometer-scale global climate simulations now generate petabytes of output so quickly that data production is outpacing data standardization. Central ESM infrastructures have traditionally followed a “data warehouse” approach: extensive preprocessing, quality control, and formatting are performed before users receive self-describing, FAIR-aligned files. While this delivers highly standardized and interoperable products, it also creates a growing computational and organizational bottleneck, so that routine actions such as checking variables, extracting a region and time slice, or comparing experiments can become slow and hard to reproduce in practice. The EERIE project (https://eerie-project.eu/about/) is a clear example: its eddy-rich Earth System Models generate detailed and valuable output, but at a scale and pace that overwhelms traditional file-by-file workflows and delays usable access.

At DKRZ, we address this with an end-to-end workflow that transforms raw EERIE model output into analysis-ready datasets (ARD) that are easy to discover, subset, and analyze without requiring users to copy or download terabytes of files. The central element of this workflow is the creation of virtual Zarr datasets from the raw model output received from the modeling groups: chunk information is extracted and stored in the kerchunk format with VirtualiZarr (https://virtualizarr.readthedocs.io/en/stable/index.html). These native-grid virtual datasets are published through both an intake catalog (https://github.com/eerie-project/intake_catalogues) and a STAC (SpatioTemporal Asset Catalog; https://discover.dkrz.de/external/stac2.cloud.dkrz.de/fastapi/collections/eerie?.language=en) interface, enabling users to examine variables, time periods, regions, etc., and to retrieve only the subset they need while the bulk of the data remains in place. Alongside the native model-grid resolution, the data are also provided on a common ¼-degree regular grid to facilitate inter-model comparison. Finally, we employ widely used standards and publish standardized products through established climate-data services (ESGF; https://esgf-metagrid.cloud.dkrz.de/search and WDCC; https://www.wdc-climate.de/ui/project?acronym=EERIE). We also aim to publish the processing scripts used throughout the pipeline, enabling others to build on the lessons learned from the EERIE approach.
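
The idea behind such virtual datasets can be illustrated with a minimal, standard-library-only Python sketch. It is not the EERIE pipeline and does not use the VirtualiZarr API. It only mimics the principle that a small reference file maps Zarr chunk keys to byte ranges in an untouched raw file, so that a reader fetches just the chunk it needs. The file layout, the variable name "tos", the chunk size, and all function names are assumptions made for illustration; the reference layout follows the kerchunk version 1 convention of [path, offset, length] entries.

```python
import json
import os
import struct
import tempfile

CHUNK_VALUES = 4  # values per chunk (hypothetical choice for this sketch)


def write_raw_file(path, header, chunks):
    """Write a stand-in 'model output' file: a header followed by raw chunks.

    Returns one (offset, length) byte range per chunk.
    """
    ranges = []
    with open(path, "wb") as f:
        f.write(header)
        for values in chunks:
            payload = struct.pack(f"<{len(values)}d", *values)  # little-endian float64
            ranges.append((f.tell(), len(payload)))
            f.write(payload)
    return ranges


def build_references(path, var, ranges):
    """Build a kerchunk-style reference mapping without copying any data."""
    zarray = {
        "zarr_format": 2,
        "shape": [len(ranges) * CHUNK_VALUES],
        "chunks": [CHUNK_VALUES],
        "dtype": "<f8",
        "compressor": None,
        "filters": None,
        "fill_value": None,
        "order": "C",
    }
    refs = {
        ".zgroup": json.dumps({"zarr_format": 2}),
        f"{var}/.zarray": json.dumps(zarray),
    }
    # Each chunk key points at a byte range inside the original file.
    for i, (offset, length) in enumerate(ranges):
        refs[f"{var}/{i}"] = [path, offset, length]
    return {"version": 1, "refs": refs}


def read_chunk(references, key):
    """Read only the bytes of one chunk; the rest of the file stays untouched."""
    path, offset, length = references["refs"][key]
    with open(path, "rb") as f:
        f.seek(offset)
        raw = f.read(length)
    return list(struct.unpack(f"<{length // 8}d", raw))


def demo():
    """Create a small raw file, index it, and fetch the second chunk."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "raw_output.bin")
        chunks = [[float(CHUNK_VALUES * i + j) for j in range(CHUNK_VALUES)]
                  for i in range(3)]
        ranges = write_raw_file(path, b"HEADER--", chunks)
        references = build_references(path, "tos", ranges)
        return ranges, read_chunk(references, "tos/1")


if __name__ == "__main__":
    print(demo())
```

In a real deployment the referenced files would typically be NetCDF or GRIB output on shared or object storage, the chunks would usually be compressed, and the references would be generated by a tool such as VirtualiZarr rather than written by hand. The sketch only shows why a subset request does not require moving the bulk of the data.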

How to cite: Wickramage, C., Wachsmann, F., Kröger, J., Ghosh, R., and Aengenheyster, M.: Making Kilometer-Scale Earth System Model (ESM) simulations usable: A workflow approach from European Eddy RIch ESMs (EERIE) project., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-19451, https://doi.org/10.5194/egusphere-egu26-19451, 2026.