EGU24-13651, updated on 09 Mar 2024
https://doi.org/10.5194/egusphere-egu24-13651
EGU General Assembly 2024
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Why and How to Increase Dataset Compression in RDIs and MIPs like CMIP7

Charles Zender
  • Earth System Science Department, University of California, Irvine, United States of America (zender@uci.edu)

Research data infrastructures (RDIs) like the Coupled Model Intercomparison Project (CMIP) exemplify geoscientific dataset archive organization and applied informatics. The CMIP metadata and data policies have continuously co-evolved with mature and FAIR technologies (e.g., CF, OPeNDAP, ESGF) that are, in turn, often adopted by other RDIs. Improved lossy and lossless compression support in the standard netCDF/HDF5 scientific software stack merits consideration for adoption in upcoming MIPs and RDIs like CMIP7. We have proposed a three-point plan to CMIP7 that uses modern lossy and lossless compression to reduce its storage and power requirements (and associated greenhouse gas emissions). The plan will boost the compression ratio of CMIP-like datasets by a factor of about three relative to CMIP6, preserve all scientifically meaningful data, and retain CF-compliance. We will present the plan and discuss why and how to implement it in CMIP7 and other MIPs and RDIs.

How to cite: Zender, C.: Why and How to Increase Dataset Compression in RDIs and MIPs like CMIP7, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-13651, https://doi.org/10.5194/egusphere-egu24-13651, 2024.