- 1University of Zurich, Switzerland
- 2Fulda University of Applied Sciences. Department of Applied Computer Science, Germany
- 3Universidad de Buenos Aires, Facultad de Ciencias Exactas y Naturales, Departamento de Ciencias de la Atmósfera y los Océanos. Buenos Aires, Argentina
- 4CONICET – Universidad de Buenos Aires. Centro de Investigaciones del Mar y la Atmósfera (CIMA). Buenos Aires, Argentina
- 5University of Southern Denmark. Odense, Denmark
A major challenge with modern weather and climate simulations is the amount of resources required to store, analyze, and visualize the generated data. This storage problem forces scientists to compromise on data dimensionality, for example by discarding physical variables or reducing the temporal resolution of the output.
Tensor decomposition and approximation (TA) methods have recently seen a revival in the context of neural networks, where they are used to reduce the number of network parameters. However, TA methods also exhibit interesting properties favorable for the lossy compression of volumetric data. For example, for turbulence volumes created by simulations, compression ratios higher than 300 can be reached while preserving high precision. This allows for more efficient storage of large multi-dimensional data grids. Furthermore, tensor decompositions allow for partial reconstruction as well as the application of linear functions in the compressed domain, making these representations especially suitable for a variety of downstream tasks analyzing the data, such as statistical analysis. However, as with all lossy compression techniques, one open question is how the loss influences the quality of these tasks.
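As a minimal illustration of how a Tucker-style decomposition can reach such compression ratios (a NumPy sketch of the truncated higher-order SVD on a toy separable field; this is not the specific TA scheme or data used in this work):

```python
import numpy as np

def hosvd(tensor, ranks):
    """Truncated higher-order SVD (Tucker decomposition) of a dense tensor."""
    factors = []
    for mode, r in enumerate(ranks):
        # mode-m unfolding: move mode m to the front, flatten the remaining axes
        unfolding = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
        U, _, _ = np.linalg.svd(unfolding, full_matrices=False)
        factors.append(U[:, :r])  # keep the r leading left singular vectors
    core = tensor
    for mode, U in enumerate(factors):
        # project each mode onto its leading subspace (contract with U^T)
        core = np.moveaxis(np.tensordot(U.T, np.moveaxis(core, mode, 0), axes=1), 0, mode)
    return core, factors

def reconstruct(core, factors):
    """Expand the Tucker core back to a full tensor."""
    full = core
    for mode, U in enumerate(factors):
        full = np.moveaxis(np.tensordot(U, np.moveaxis(full, mode, 0), axes=1), 0, mode)
    return full

# Toy 64^3 volume: a smooth separable field has low multilinear rank.
x, y, z = np.meshgrid(*[np.linspace(0, 1, 64)] * 3, indexing="ij")
field = np.sin(3 * x) * np.cos(2 * y) * np.exp(-z)

core, factors = hosvd(field, ranks=(4, 4, 4))
approx = reconstruct(core, factors)

stored = core.size + sum(U.size for U in factors)
ratio = field.size / stored  # original elements / stored elements
rel_err = np.linalg.norm(field - approx) / np.linalg.norm(field)
```

For this deliberately smooth field the 4x4x4 core plus three 64x4 factor matrices store only 832 numbers instead of 262,144, i.e. a ratio above 300, with negligible reconstruction error; real weather fields are less compressible, which is exactly what the parameter search described below has to navigate.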
For the operationalization of TA methods, another challenge is their parametrization. Various decomposition techniques exist, and selecting the most appropriate one is non-trivial. Further, the data likely needs to be divided into smaller pieces, e.g., chunks, to achieve the best results, i.e., high compression ratios while introducing as little error as possible. The division of the data in this context can mean both omitting dimensions (and hence reducing the dimensionality of the tensor) and splitting the data within dimensions. Finally, different tensor decomposition methods allow for different setups, further widening the compression parameter space to explore.
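To make the chunking idea concrete, a small NumPy sketch that tiles a 3-D field into fixed-size blocks, each of which could then be decomposed independently (an illustrative partitioning scheme, not the specific strategies evaluated in this work):

```python
import numpy as np
from itertools import product

def iter_chunks(shape, chunk):
    """Yield slice tuples that tile an array of `shape` into blocks of `chunk`.

    Trailing blocks are clipped, so the tiles cover the array exactly
    without overlap.
    """
    starts = [range(0, s, c) for s, c in zip(shape, chunk)]
    for origin in product(*starts):
        yield tuple(slice(o, min(o + c, s))
                    for o, c, s in zip(origin, chunk, shape))

# Example: a 90 x 64 x 64 field split into 32^3 chunks (the last slab
# along the first axis is only 26 levels thick).
volume = np.random.default_rng(0).normal(size=(90, 64, 64))
blocks = list(iter_chunks(volume.shape, (32, 32, 32)))
# Each volume[b] for b in blocks can now be compressed separately with
# the decomposition of choice, trading locality against per-chunk overhead.
```

Smaller chunks adapt better to local structure but each carries its own factor matrices, so the chunk size itself becomes one more axis of the parameter space mentioned above.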
In this work we present an experimental setup that verifies compression performance both in terms of error metrics computed directly on the data and in terms of the impact of the compression losses on downstream visualization tasks. We use an offline TA-based compression scheme in which the data is reconstructed, i.e., decompressed, and saved again in a standard format, so that it can easily be fed into downstream visualization applications such as Met.3D. Using this example, we discuss how numerical error metrics, such as the relative error or the RMSE, are not always representative of errors in the visualization of the data in downstream tasks, especially for variables derived from the data. Further, we present different strategies for partitioning the data into chunks and motivate the effectiveness of tensor decomposition methods in the domain of numerical weather forecast data.
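The gap between global error metrics and errors in derived variables can be illustrated with a one-dimensional toy example (a NumPy sketch with a synthetic compression artifact, not data from this work): a tiny high-frequency artifact barely registers in the RMSE of the field itself, yet dominates the error of its gradient, a typical derived quantity.

```python
import numpy as np

n = 2048
x = np.linspace(0, 2 * np.pi, n)
signal = np.sin(x)
# Hypothetical lossy reconstruction: a small-amplitude, high-frequency artifact
recon = signal + 1e-3 * np.sin(100 * x)

# Field-level RMSE looks excellent (on the order of the artifact amplitude)
rmse = np.sqrt(np.mean((recon - signal) ** 2))

# But the derivative amplifies the artifact by its frequency (factor ~100)
grad_true = np.gradient(signal, x)
grad_recon = np.gradient(recon, x)
grad_rmse = np.sqrt(np.mean((grad_recon - grad_true) ** 2))
```

Here `grad_rmse` exceeds `rmse` by roughly two orders of magnitude, which is why visualizations of derived variables (e.g., vorticity or gradients of compressed wind fields) can degrade even when field-level metrics look harmless.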
How to cite: Croci, J. A., Rautenhaus, M., Hartmann, C., Gacitua Gutierrez, J., Ruiz, J. J., Salio, P., Diehl, A., and Pajarola, R.: Evaluating Tensor Decomposition and Approximation as Lossy Compression for Weather Data Visualization Tasks, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20880, https://doi.org/10.5194/egusphere-egu26-20880, 2026.