EGU25-9632, updated on 14 Mar 2025
https://doi.org/10.5194/egusphere-egu25-9632
EGU General Assembly 2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Oral | Monday, 28 Apr, 09:40–09:50 (CEST) | Room -2.92
Developing Global Embeddings from Sentinel-1 and Sentinel-2 Data to Enhance Earth Observation Analysis
Marcin Kluczek (1), Mikolaj Czerkawski (2), and Jędrzej S. Bojanowski (1)
  • (1) CloudFerro S.A., Warsaw, Poland
  • (2) Φ-lab, European Space Agency (ESA), Frascati, Italy

The rapid growth of Earth Observation (EO) data from the Copernicus programme presents new opportunities for applying artificial intelligence (AI) and machine learning (ML) techniques. This work introduces a global embedding framework designed to improve the analysis of large EO datasets built from Sentinel-1 and Sentinel-2 imagery. Following the Major TOM standard, we process over 8 million images, comprising 9.368 trillion pixels (62 terabytes) of raw satellite data, to generate more than 170 million embeddings.

To this end, a set of widely used vision models from both the general and the remote sensing domains (such as SigLIP, DINOv2, SSL4EO, DeCUR and MMEarth) is employed to derive efficient embedding representations of the input data. These embeddings support various applications, including text-to-image and image-to-image retrieval as well as zero-shot classification, allowing for more effective integration of EO data into AI pipelines and providing valuable insights into global phenomena.
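
As an illustration of how such embeddings can serve retrieval and zero-shot classification, the following is a minimal sketch based on cosine similarity with NumPy. It is not the pipeline used in this work: the function names (l2_normalize, top_k_similar, zero_shot_classify) are hypothetical, and the embeddings are assumed to be already computed and stored as arrays with one row per item. For text-to-image retrieval and zero-shot classification, the text and image embeddings are assumed to come from a model with a shared text-image space (as SigLIP provides).

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so that dot products equal cosine similarity."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def top_k_similar(query, index, k=5):
    """Return indices and cosine scores of the k rows of `index` closest to `query`.

    query: (d,) embedding of an image (image-to-image) or a text prompt (text-to-image).
    index: (n, d) matrix of precomputed image embeddings (assumed layout).
    """
    q = l2_normalize(np.asarray(query, dtype=np.float32))
    db = l2_normalize(np.asarray(index, dtype=np.float32))
    scores = db @ q                      # cosine similarity to every indexed item
    k = min(k, scores.shape[0])
    order = np.argsort(-scores)[:k]      # highest similarity first
    return order, scores[order]


def zero_shot_classify(image_embs, text_embs):
    """Assign each image to the class whose text-prompt embedding is most similar.

    image_embs: (n, d) image embeddings; text_embs: (c, d) one embedding per class prompt.
    Both are assumed to live in the same embedding space.
    """
    img = l2_normalize(np.asarray(image_embs, dtype=np.float32))
    txt = l2_normalize(np.asarray(text_embs, dtype=np.float32))
    return np.argmax(img @ txt.T, axis=1)


# Synthetic usage example (random vectors stand in for real embeddings).
rng = np.random.default_rng(0)
index = rng.normal(size=(1000, 64))
query = index[42] + 0.01 * rng.normal(size=64)   # near-duplicate of item 42
ids, scores = top_k_similar(query, index, k=3)
print(ids, scores)
```

A brute-force matrix product like this is adequate for small subsets; at the scale of hundreds of millions of embeddings an approximate nearest-neighbour index would likely be needed, which is outside the scope of this sketch.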

The current approach, built on the CloudFerro cloud platform, processes large-scale data efficiently, and experiments demonstrate its usefulness in Earth Observation analysis. The results highlight the system’s reliability across different applications, emphasizing its potential to support data-driven decision-making on a global scale. This study also discusses key strategies for scalable cloud computing, GPU optimization, and multithreaded CPU processing to handle large volumes of EO data efficiently.
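
One common way to combine multithreaded CPU processing with batched model inference is sketched below. This is an assumption about the general pattern, not a description of the implementation used here: load_and_preprocess and embed_batch are hypothetical placeholders (the first simulates reading a tile, the second replaces the vision model with simple per-channel statistics), and only the Python standard library and NumPy are used.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def load_and_preprocess(tile_id, size=8):
    """Hypothetical stand-in for reading and normalising one image tile.

    A real pipeline would read a raster from storage; here a deterministic
    random array is generated so the example runs without any data.
    """
    rng = np.random.default_rng(tile_id)
    return rng.random((size, size, 3), dtype=np.float32)


def embed_batch(batch):
    """Placeholder for a model forward pass on a batch of shape (b, h, w, c).

    Returns per-channel mean and standard deviation, i.e. a (b, 2*c) array.
    A real pipeline would run a vision model on the GPU at this point.
    """
    return np.concatenate(
        [batch.mean(axis=(1, 2)), batch.std(axis=(1, 2))], axis=1
    )


def batched(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_tiles(tile_ids, batch_size=4, workers=4):
    """Load tiles with a thread pool, then embed them batch by batch."""
    tile_ids = list(tile_ids)
    if not tile_ids:
        return np.empty((0, 6), dtype=np.float32)
    outputs = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for ids in batched(tile_ids, batch_size):
            # pool.map preserves input order, so embeddings stay aligned with ids.
            tiles = list(pool.map(load_and_preprocess, ids))
            outputs.append(embed_batch(np.stack(tiles)))
    return np.concatenate(outputs, axis=0)


embeddings = embed_tiles(range(10), batch_size=4, workers=3)
print(embeddings.shape)
```

Threads suit the loading step when it is dominated by I/O; for CPU-heavy decoding, a process pool may be the better choice. The batch size would normally be tuned to the memory of the accelerator in use.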

How to cite: Kluczek, M., Czerkawski, M., and Bojanowski, J. S.: Developing Global Embeddings from Sentinel-1 and Sentinel-2 Data to Enhance Earth Observation Analysis, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-9632, https://doi.org/10.5194/egusphere-egu25-9632, 2025.