EGU24-17007, updated on 11 Mar 2024
https://doi.org/10.5194/egusphere-egu24-17007
EGU General Assembly 2024
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Building a useful dataset for ICON output

Lukas Kluft and Tobias Kölling
  • Max Planck Institute for Meteorology, Climate Physics, Hamburg, Germany (lukas.kluft@mpimet.mpg.de)

Global kilometer-scale climate models produce vast amounts of output, which makes it challenging to use the data efficiently. For ICON, we addressed this by creating a consolidated, analysis-ready dataset in the Zarr format, replacing the previous cumbersome directory structure. The new dataset provides a comprehensive overview of variables and time steps at a glance.

To ensure swift and ergonomic access to the dataset, we employ two key concepts: output hierarchies and multidimensional chunking. We remapped all output onto the HEALPix grid, which supports a hierarchy of spatial resolutions, and pre-computed temporal aggregations such as daily and monthly averages. Users can thus switch seamlessly between resolutions, which reduces the computational burden of post-processing.
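The idea of a spatial hierarchy on the HEALPix grid can be illustrated with a minimal sketch. It assumes the data are stored in HEALPix nested ordering (the abstract does not state the ordering), in which the four children of a coarse cell p occupy indices 4p to 4p+3 at the next finer level. The function names and the synthetic field are hypothetical, and plain Python lists stand in for the array libraries one would use in practice.

```python
from statistics import fmean


def npix(zoom: int) -> int:
    """Number of HEALPix cells at a zoom level (nside = 2**zoom)."""
    return 12 * 4**zoom


def coarsen(values: list[float]) -> list[float]:
    """Average each group of four sibling cells into their parent cell.

    Assumes nested ordering. HEALPix cells are equal-area, so an
    unweighted mean is already an area-weighted mean.
    """
    if len(values) % 4:
        raise ValueError("length must be a multiple of 4")
    return [fmean(values[i:i + 4]) for i in range(0, len(values), 4)]


def build_hierarchy(values: list[float], zoom: int) -> dict[int, list[float]]:
    """Pre-compute every coarser level from the finest one down to zoom 0."""
    levels = {zoom: values}
    for z in range(zoom - 1, -1, -1):
        levels[z] = coarsen(levels[z + 1])
    return levels


# Synthetic field at zoom 2 (192 cells); real output would be far larger.
field = [float(i) for i in range(npix(2))]
levels = build_hierarchy(field, 2)

for z in sorted(levels):
    print(z, len(levels[z]))
```

Because each level is computed once and stored, a user who only needs a coarse overview can read the small low-zoom array instead of aggregating the full-resolution data on the fly.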

Spatial chunking of the high-resolution data further allows regional subsets to be extracted efficiently, which significantly speeds up common climate science analyses such as time series and vertical cross-sections. Although our approach primarily combines established strategies, their synergy has had a profound impact on the post-processing efficiency of our global kilometer-scale output.
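Why spatial chunking helps regional extraction can be shown with a small sketch. It again assumes nested ordering, so that all fine cells inside one coarse cell are contiguous in index; the zoom levels, the chunk size, and the helper name are hypothetical values chosen for illustration, not the settings of the actual dataset.

```python
def chunks_for_cells(cell_indices, chunk_size: int) -> list[int]:
    """Return the sorted ids of the spatial chunks containing the given cells."""
    return sorted({i // chunk_size for i in cell_indices})


zoom = 5
ncells = 12 * 4**zoom          # 12288 cells at zoom 5
chunk_size = 4**3              # 64 cells per spatial chunk (assumed)
total_chunks = ncells // chunk_size

# Region of interest: every zoom-5 cell inside coarse cell 7 at zoom 3.
# In nested ordering these descendants form one contiguous index range.
parent, parent_zoom = 7, 3
factor = 4 ** (zoom - parent_zoom)
region = range(parent * factor, (parent + 1) * factor)

needed = chunks_for_cells(region, chunk_size)
print(f"read {len(needed)} of {total_chunks} spatial chunks: {needed}")
```

In this sketch a regional time series touches a single spatial chunk per time chunk rather than the whole globe, which is the effect the chunking strategy relies on.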

In summary, our approach (creating a single analysis-ready dataset, pre-computing hierarchies, and employing spatial chunking) addresses the challenges of managing increasingly large model output and extracting meaningful insights from it. We successfully tested the new analysis-ready datasets during well-attended hackathons, which revealed significant usability and performance improvements across a wide range of real-life applications.

How to cite: Kluft, L. and Kölling, T.: Building a useful dataset for ICON output, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-17007, https://doi.org/10.5194/egusphere-egu24-17007, 2024.