- ¹University of Edinburgh, School of Engineering, Institute for Imaging, Data and Communications, United Kingdom (miguel.espinosa@ed.ac.uk)
- ²European Space Agency (ESA), ESRIN, Rome, Italy
- ³Asterisk Labs
The work presented herein showcases early results of COP-GEN, a general-purpose diffusion model supporting flexible zero-shot translation between a number of popular data modalities related to the Copernicus programme: Sentinel-2 (both L1C and L2A), Sentinel-1 RTC, Copernicus DEM-30, Land Use Land Cover Maps, Cloud Masks, geospatial coordinates, and timestamps.
COP-GEN is designed as a diffusion model with a transformer backbone, which offers two concrete advantages. Firstly, the diffusion formulation respects the stochastic nature of cross-modal translation tasks, in which nearly every conditional generation query can be satisfied by a diverse range of plausible outputs rather than a single deterministic sample. Secondly, the sequence-based architecture facilitates the integration of diverse data modalities by flattening their latent representations, along with modality-specific diffusion timesteps, into a single sequence of tokens. Consequently, COP-GEN is capable of synthesising missing data from any subset of modalities in a zero-shot manner.
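The token-sequence idea can be illustrated with a minimal NumPy sketch. It is not the COP-GEN implementation: the function names, latent shapes, token width, the sinusoidal timestep embedding, and the convention that conditioning modalities carry timestep 0 while missing modalities start from pure noise at the maximum timestep are all assumptions made for illustration.

```python
import numpy as np

def timestep_embedding(t, dim):
    """Sinusoidal embedding of a scalar diffusion timestep (assumed scheme)."""
    half = dim // 2
    freqs = np.exp(-np.log(10000.0) * np.arange(half) / half)
    angles = t * freqs
    return np.concatenate([np.sin(angles), np.cos(angles)])

def build_token_sequence(latents, timesteps, dim=16):
    """Flatten each modality's latent grid into tokens and prepend one
    timestep token per modality, then join everything into one sequence."""
    tokens, owners = [], []
    for name, latent in latents.items():
        flat = latent.reshape(-1, dim)                       # (num_tokens, dim)
        t_token = timestep_embedding(timesteps[name], dim)[None, :]
        tokens.append(np.concatenate([t_token, flat], axis=0))
        owners += [name] * (flat.shape[0] + 1)               # track token origin
    return np.concatenate(tokens, axis=0), owners

rng = np.random.default_rng(0)

# Hypothetical query: Sentinel-2 L2A is observed, the DEM is missing.
latents = {
    "S2L2A": rng.normal(size=(4, 4, 16)),  # stand-in for an encoded observation
    "DEM": rng.normal(size=(2, 2, 16)),    # pure noise: modality to be generated
}
timesteps = {"S2L2A": 0, "DEM": 1000}      # clean condition vs. fully noised target

sequence, owners = build_token_sequence(latents, timesteps)
print(sequence.shape)  # (22, 16): (1 + 16) S2L2A tokens + (1 + 4) DEM tokens
```

Under these assumptions, choosing which modalities receive timestep 0 is what selects the conditioning subset, so a single transformer operating on the joint sequence could serve any translation direction; different noise draws for the missing modality would yield different plausible outputs.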
The model is pre-trained at global scale on MajorTOM, using over one million paired, geographically distributed samples spanning diverse climate zones, land-cover types, and acquisition conditions. By training jointly on matched data modalities, COP-GEN can, for example, estimate Land Use Land Cover, cloud coverage, atmospheric correction, and the spatiotemporal context of the available observations.
The first set of results indicates strong generative capability and high output diversity across modalities. The work concludes by discussing the available open-source implementation along with potential use cases.
How to cite: Espinosa, M., Gmelich Meijling, E., Marsocci, V., Crowley, E. J., and Czerkawski, M.: COP-GEN: Stochastic Generative Modelling of Copernicus Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-10800, https://doi.org/10.5194/egusphere-egu26-10800, 2026.