EGU22-9948
https://doi.org/10.5194/egusphere-egu22-9948
EGU General Assembly 2022
© Author(s) 2022. This work is distributed under
the Creative Commons Attribution 4.0 License.

Exploring Lossy Compressibility through Statistical Correlations of Geophysical Datasets

Julie Bessac1, David Krasowksa2, Robert Underwood1, Sheng Di1, Jon Calhoun2, and Franck Cappello1
  • 1Mathematics and Computer Science Division, Argonne National Laboratory, Lemont, USA
  • 2Holcombe Department of Electrical and Computer Engineering, Clemson University, Clemson, USA

Lossy compression plays a growing role in geophysical and other computer-based simulations, whose output data can reach terabytes or even petabytes on large-scale systems. Error-bounded lossy compression reduces the storage required for each simulation; however, no upper bound on the lossy compressibility of a given dataset is known. The correlation structures in the data, the choice of compressor, and the error bound are factors that allow larger compression ratios and improved quality metrics. Analyzing these three factors provides one direction towards quantifying the limits of lossy compressibility. As a first step, we explore statistical methods to characterize the correlation structures present in several climate simulations and to relate them, through functional regression models, to compression ratios. In particular, we show results for climate simulations from the Community Earth System Model (CESM) and for hurricane simulations from Hurricane-ISABEL, from the IEEE Visualization 2004 contest. Compression ratios of SZ, ZFP, and MGARD, three widely used lossy compressors for scientific data, exhibit a logarithmic dependence on the global and local correlation ranges when combined with information on the variability of the considered fields, through the variance or gradient magnitude. Future work will focus on providing a unified characterization of these relationships across compressors and error bounds. This work constitutes a first step towards evaluating the theoretical limits of lossy compressibility, which could eventually be used to predict compression performance and to adapt compressors to the correlation structures present in the data.
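The relationship described above can be illustrated with a minimal Python sketch, under strong assumptions that are not taken from the study itself. The fields are synthetic Gaussian random fields rather than CESM or Hurricane-ISABEL data. The correlation range is taken as the first lag at which the empirical autocorrelation drops below 0.5, which is only one of several possible definitions. The compressor is a toy stand-in (uniform quantization, a delta predictor, and zlib), not SZ, ZFP, or MGARD. All function names are hypothetical. Each field is rescaled to unit variance, so the variability term is held fixed and only the correlation range changes.

```python
import zlib
import numpy as np


def smooth_field(n, length_scale, seed):
    """Synthetic 2D field: white noise filtered by a periodic Gaussian kernel."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((n, n))
    k = np.fft.fftfreq(n)  # frequencies in cycles per grid cell
    kx, ky = np.meshgrid(k, k, indexing="ij")
    # Fourier transform of a Gaussian kernel with standard deviation length_scale
    filt = np.exp(-2.0 * (np.pi * length_scale) ** 2 * (kx**2 + ky**2))
    field = np.fft.ifft2(np.fft.fft2(noise) * filt).real
    # Rescale to zero mean and unit variance so only the correlation range varies
    return (field - field.mean()) / field.std()


def correlation_range(field, threshold=0.5):
    """First lag (grid cells, along axis 0) where autocorrelation < threshold."""
    f = field - field.mean()
    var = f.var()
    max_lag = field.shape[0] // 2
    for lag in range(1, max_lag):
        rho = np.mean(f[:-lag, :] * f[lag:, :]) / var
        if rho < threshold:
            return lag
    return max_lag


def toy_compression_ratio(field, error_bound):
    """Toy error-bounded compressor (NOT SZ/ZFP/MGARD): quantize, delta, zlib.

    Returns (compression ratio, maximum absolute reconstruction error).
    """
    step = 2.0 * error_bound
    codes = np.round(field / step).astype(np.int32)
    max_err = float(np.max(np.abs(codes * step - field)))
    # Previous-neighbor predictor along axis 1; invertible with a cumulative sum
    deltas = np.diff(codes, axis=1, prepend=0).astype(np.int32)
    compressed = zlib.compress(deltas.tobytes(), 9)
    return field.nbytes / len(compressed), max_err


def fit_log_dependence(ranges, ratios):
    """Least-squares fit of ratio = intercept + slope * log(range)."""
    slope, intercept = np.polyfit(np.log(ranges), ratios, deg=1)
    return slope, intercept


if __name__ == "__main__":
    scales = (2.0, 4.0, 8.0)  # illustrative smoothing scales, in grid cells
    fields = [smooth_field(256, s, seed=0) for s in scales]
    ranges = [correlation_range(f) for f in fields]
    ratios = [toy_compression_ratio(f, error_bound=0.01)[0] for f in fields]
    slope, intercept = fit_log_dependence(ranges, ratios)
    for s, r, c in zip(scales, ranges, ratios):
        print(f"scale={s:4.1f}  range={r:3d}  ratio={c:6.2f}")
    print(f"fit: ratio = {intercept:.2f} + {slope:.2f} * log(range)")
```

With real data, the toy compressor would be replaced by an actual error-bounded compressor, and the variance or gradient magnitude of each field would enter the regression as an additional predictor instead of being normalized away.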

How to cite: Bessac, J., Krasowksa, D., Underwood, R., Di, S., Calhoun, J., and Cappello, F.: Exploring Lossy Compressibility through Statistical Correlations of Geophysical Datasets, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-9948, https://doi.org/10.5194/egusphere-egu22-9948, 2022.