EGU26-15196, updated on 14 Mar 2026
https://doi.org/10.5194/egusphere-egu26-15196
EGU General Assembly 2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
Poster | Monday, 04 May, 14:00–15:45 (CEST), Display time Monday, 04 May, 14:00–18:00 | Hall X4, X4.95
Zarr at scale: virtualization, sharding, and performance optimizations for Earth science data
Max Jones1, Joe Hamman2, Davis Bennett, Kyle Barron1, and Justus Magin3
  • 1Development Seed, Washington, D.C., United States of America (max@developmentseed.org)
  • 2Earthmover
  • 3LOPS - Laboratoire d'Oceanographie Physique et Spatiale, UMR 6523 CNRS-IFREMER-IRD-Univ.Brest-IUEM

As geoscientific datasets continue to grow in size and complexity, the Zarr community has developed a modern, open-source solution for storage and I/O of multi-dimensional arrays and metadata. Zarr offers a high-performance, highly scalable, cloud-native container for scientific data, which allows scientists to transcend the constraints of individual files and think in terms of coherent datasets. Zarr’s potential has led to widespread adoption across government, industry, and academia. In this presentation, we offer practical guidance on using the most recent features in the Zarr ecosystem, including:

  • Sharding to reduce the number of files, benefiting HPC users in particular
  • Virtualization via VirtualiZarr and Icechunk to enable high-performance access to data spread across NetCDF4/HDF5, GRIB, or GeoTIFF files
  • Custom data types, compression schemes, and variable chunk grids
  • Client-side (i.e., in-browser) rendering of large multidimensional geospatial datasets
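
The file-count benefit of sharding in the first bullet can be illustrated with simple arithmetic. With sharding, many inner chunks are packed into one stored object (a shard), so the number of files or objects depends on the shard shape rather than the chunk shape. The sketch below is a minimal illustration using only the Python standard library; the array shape, chunk shape, and shard shape are hypothetical values chosen for the example, not figures from the poster, and `count_objects` is an illustrative helper, not part of any Zarr library.

```python
import math


def count_objects(shape, chunk_shape, shard_shape=None):
    """Count stored objects for a regular chunk grid.

    Without sharding, every chunk is its own object. With sharding,
    every shard is one object that holds several inner chunks.
    """
    if shard_shape is not None:
        # Each shard dimension must be a whole multiple of the inner chunk.
        for s, c in zip(shard_shape, chunk_shape):
            if s % c != 0:
                raise ValueError("shard shape must be divisible by chunk shape")
    unit = shard_shape if shard_shape is not None else chunk_shape
    # Ceiling division per dimension, then multiply across dimensions.
    return math.prod(-(-n // u) for n, u in zip(shape, unit))


# Hypothetical array: one year of hourly data on a 721 x 1440 grid.
shape = (8760, 721, 1440)
chunks = (24, 100, 100)      # assumed inner chunk shape
shards = (240, 800, 1500)    # assumed shard shape (10 x 8 x 15 chunks each)

unsharded = count_objects(shape, chunks)
sharded = count_objects(shape, chunks, shards)
print(unsharded, sharded)  # 43800 37
```

Under these assumptions the same array drops from 43,800 stored objects to 37, while the inner chunk shape, and therefore the smallest readable unit, stays the same. This matters on HPC file systems, where very large file counts can strain metadata servers and inode quotas.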

Through concrete examples and best practices, we demonstrate how the Zarr ecosystem enables researchers to work with multi-terabyte datasets as seamlessly as small files.

How to cite: Jones, M., Hamman, J., Bennett, D., Barron, K., and Magin, J.: Zarr at scale: virtualization, sharding, and performance optimizations for Earth science data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-15196, https://doi.org/10.5194/egusphere-egu26-15196, 2026.