EGU25-8460, updated on 14 Mar 2025
https://doi.org/10.5194/egusphere-egu25-8460
EGU General Assembly 2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Oral | Monday, 28 Apr, 09:50–10:00 (CEST)
Room -2.92
SeCo-Eco: Global multiband seasonal pre-training dataset and self-supervised model for ecological applications
Elena Plekhanova¹, Damien Robert², Johannes Dollinger², Philipp Brun¹, Jan Dirk Wegner², and Niklaus E. Zimmermann¹
  • ¹Land Change Science, Swiss Federal Research Institute WSL, Birmensdorf, Switzerland
  • ²EcoVision Lab, DM3L, University of Zurich, Zurich, Switzerland

With the biodiversity crisis and land use intensification, macroecological questions related to biodiversity assessment and conservation are becoming increasingly pressing. Answering such questions requires global datasets such as satellite imagery. Traditional methods for analysing satellite data rely heavily on supervised learning and annotated datasets, which are limited and difficult to generalize across geographical scales. In recent years, self-supervised learning (SSL) has made it possible to learn expressive representations from massive datasets without annotations, thus revolutionizing the analysis of remote sensing imagery. However, currently available datasets for pre-training such models have a skewed geographical distribution: they focus on cities and agricultural areas and fail to adequately represent regions of high ecological interest, such as rainforests or polar latitudes.

We propose a new Sentinel 2A (10 m resolution) multiband dataset, globally distributed on a regular grid across the landmass (250k locations). At each location, the dataset captures four different seasons, determined from the local EVI curve, and includes the NDVI, an index widely used in ecological applications. Our temporal sampling is specifically designed to align with plant phenology rather than ad hoc calendar dates. We use this data to pre-train Momentum Contrast and Seasonal Contrast SSL models, which show similar performance on commonly used benchmarks and improved performance on macroecological downstream tasks, such as species distribution modelling. We anticipate that the dataset and model will be valuable for macroecological applications, such as deep species distribution modelling or large-scale biodiversity assessments.
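
As a rough illustration of the two vegetation indices mentioned above, the following Python sketch computes NDVI and EVI from surface reflectance and then picks four phenology-aligned time steps from an annual EVI curve. The index formulas are the standard ones; on Sentinel-2 the NIR, red and blue inputs would typically come from bands B8, B4 and B2. The selection rule in `pick_season_indices` (trough, green-up midpoint, peak, senescence midpoint) is only an assumption made for this example, and the function name is hypothetical; the abstract does not specify how the four seasons are derived from the EVI curve.

```python
from typing import List, Sequence


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    denom = nir + red
    # Return 0.0 where both bands are zero (e.g. no-data pixels).
    return 0.0 if denom == 0 else (nir - red) / denom


def evi(nir: float, red: float, blue: float) -> float:
    """Enhanced Vegetation Index with the standard coefficients
    (G = 2.5, C1 = 6, C2 = 7.5, L = 1); inputs are reflectances in [0, 1]."""
    denom = nir + 6.0 * red - 7.5 * blue + 1.0
    return 0.0 if denom == 0 else 2.5 * (nir - red) / denom


def pick_season_indices(evi_curve: Sequence[float]) -> List[int]:
    """Hypothetical rule (an assumption, not the authors' method): return the
    indices of trough, green-up, peak and senescence in a cyclic annual curve.

    Green-up and senescence are the time steps whose EVI is closest to the
    midpoint between the annual minimum and maximum, searched on the rising
    and falling limb respectively.
    """
    n = len(evi_curve)
    if n < 4:
        raise ValueError("need at least four time steps")

    trough = min(range(n), key=lambda i: evi_curve[i])
    peak = max(range(n), key=lambda i: evi_curve[i])

    # Flat curve (no seasonality): fall back to evenly spaced time steps.
    if evi_curve[peak] == evi_curve[trough]:
        return [0, n // 4, n // 2, (3 * n) // 4]

    mid = (evi_curve[peak] + evi_curve[trough]) / 2.0

    def closest_to_mid(start: int, stop: int) -> int:
        # Walk forward cyclically over the steps strictly between start and stop.
        steps = (stop - start) % n
        candidates = [(start + k) % n for k in range(1, steps)]
        if not candidates:
            return start  # adjacent extremes: nothing in between
        return min(candidates, key=lambda i: abs(evi_curve[i] - mid))

    green_up = closest_to_mid(trough, peak)
    senescence = closest_to_mid(peak, trough)
    return [trough, green_up, peak, senescence]


if __name__ == "__main__":
    # Synthetic monthly EVI values for a single-season, northern-style curve.
    monthly_evi = [0.10, 0.12, 0.20, 0.35, 0.50, 0.60,
                   0.65, 0.60, 0.40, 0.30, 0.15, 0.11]
    print(pick_season_indices(monthly_evi))
    print(round(ndvi(nir=0.5, red=0.1), 3))
```

Selecting dates relative to the local curve rather than fixed months means that the same rule yields comparable phenological stages in both hemispheres; locations with several growing seasons per year or with no clear seasonality would need a more careful treatment than this sketch provides.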

How to cite: Plekhanova, E., Robert, D., Dollinger, J., Brun, P., Wegner, J. D., and Zimmermann, N. E.: SeCo-Eco: Global multiband seasonal pre-training dataset and self-supervised model for ecological applications, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-8460, https://doi.org/10.5194/egusphere-egu25-8460, 2025.