- 1 Laboratoire d’Océanographie Physique et Spatiale, Univ. Brest, CNRS, Ifremer, IRD, Brest, France
- 2 Laboratoire de Physique de l’École normale supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université Paris Cité, Paris, France
A significant challenge for data integration and machine learning (ML) on cloud infrastructures is the accurate estimation of correlated statistics. The first step is to align all data on a consistent pixel grid, which motivates the use of Discrete Global Grid Systems (DGGS). In geophysical studies, data reside on a sphere, and approximating the sphere with tangent planes can distort results. Our solution is to adopt the HEALPix pixelization as the DGGS framework, so that all data are standardized on a common grid for consistent statistical analysis. Two properties of HEALPix, its iso-latitude layout and its equal-area pixels, enable the use of spin-weighted spherical harmonics for handling vector fields. This allows correlation statistics, for example between velocity and scalar fields on the sphere, to be computed accurately while minimizing the biases introduced by planar approximations. Building on the HEALPix framework, which is well established in cosmology, and using TensorFlow or PyTorch as backends, we created the HEALML library. The library provides gradients of all derived statistics for AI optimization and has been validated on the Pangeo-EOSC platform. It parallelizes the computation of localized spherical harmonics and includes features such as scattering covariance calculations, which extract nonlinear statistics that go beyond the power spectrum. We compare these results with traditional 2D planar methods and demonstrate the advantages of sphere-based statistics on platforms like Pangeo-EOSC. Furthermore, we demonstrate HEALML's ability to perform emulation from a substantially smaller dataset. This demonstration highlights how incorporating spherical statistical methods into Pangeo-EOSC fosters innovative and efficient statistical analysis in geophysical research.
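The equal-area and iso-latitude properties mentioned above can be illustrated with a few lines of arithmetic. The following is a minimal sketch and is not part of HEALML: the function name `healpix_grid_summary`, the power-of-two restriction on `nside` (the convention required by HEALPix nested ordering), and the mean Earth radius of 6371 km are assumptions made for illustration. It relies only on the standard HEALPix relations: a grid of resolution `nside` has 12 × nside² pixels of identical area, arranged on 4 × nside − 1 rings of constant latitude.

```python
import math


def healpix_grid_summary(nside: int, radius_km: float = 6371.0) -> dict:
    """Summarize a HEALPix grid of resolution `nside`.

    `radius_km` defaults to an assumed mean Earth radius; it is only used
    to convert the pixel solid angle into a surface area in km^2.
    """
    # Assumption: restrict to powers of two, as required by nested ordering.
    if nside < 1 or nside & (nside - 1):
        raise ValueError("nside must be a positive power of two")

    npix = 12 * nside**2                 # total number of pixels
    area_sr = 4.0 * math.pi / npix       # every pixel covers the same solid angle
    n_rings = 4 * nside - 1              # pixel centers lie on iso-latitude rings

    return {
        "nside": nside,
        "npix": npix,
        "n_rings": n_rings,
        "pixel_area_sr": area_sr,
        "pixel_area_km2": area_sr * radius_km**2,
        # Side of a square with the same area, a common proxy for resolution.
        "approx_resolution_km": math.sqrt(area_sr) * radius_km,
    }


if __name__ == "__main__":
    for nside in (64, 256, 1024):
        s = healpix_grid_summary(nside)
        print(
            f"nside={s['nside']:5d}  npix={s['npix']:9d}  "
            f"rings={s['n_rings']:5d}  ~{s['approx_resolution_km']:.1f} km"
        )
```

Because every pixel has the same area, sums over pixels approximate integrals over the sphere with uniform weights, and the ring structure is what makes fast spherical harmonic transforms practical on this grid.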
How to cite: Delouis, J.-M., Allys, E., Mangin, J., Mousset, L., and Odaka, T.: Advancing Geophysical Data Analysis: HEALML for Efficient Sphere-Based Statistics on Pangeo-EOSC, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-6798, https://doi.org/10.5194/egusphere-egu25-6798, 2025.