EGU25-9315, updated on 14 Mar 2025
https://doi.org/10.5194/egusphere-egu25-9315
EGU General Assembly 2025
© Author(s) 2025. This work is distributed under the Creative Commons Attribution 4.0 License.
Oral | Thursday, 01 May, 08:35–08:55 (CEST) | Room L2
Ice Floe Data Augmentation Using Diffusion Models
Justin Bunker1, Martin S. J. Rogers3, Louisa van Zeeland2, Jeremy Wilkinson3, and Mark Girolami1,2
  • 1University of Cambridge
  • 2The Alan Turing Institute
  • 3British Antarctic Survey

Monitoring ice floes is essential for mapping marine ecosystems, ensuring safe ship navigation, and forecasting ice hazards. Satellite imagery, such as Synthetic Aperture Radar (SAR), is a prime candidate for capturing information about ice floes, because sea ice conditions can be discerned in this imagery even under cloud cover or poor lighting. SAR imagery can then be passed to image processing algorithms to extract quantities of interest such as the floe size distribution (FSD). Whilst considerable research has applied fully supervised machine learning models in this domain, such models require a large amount of annotated data for training. Annotation is time-consuming, subjective, and costly, which limits the amount of data available for training and, in turn, the performance of the trained model. To alleviate this problem, we turn to the burgeoning field of generative modeling to create synthetic labeled data.
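To make the FSD concrete, the following is a minimal sketch of how floe sizes might be read off a binary segmentation mask. It is an illustration under stated assumptions, not the pipeline used in this work: the abstract does not say how the FSD is defined, so floe size is taken here to be pixel area, floes are taken to be 4-connected regions of 1s, and the distribution is summarized as the fraction of floes at or above each area. The names `floe_areas`, `cumulative_fsd`, and `toy_mask` are hypothetical.

```python
from collections import deque


def floe_areas(mask, pixel_area=1.0):
    """Return the sorted areas of all 4-connected floes in a binary mask.

    `mask` is a list of rows of 0/1 values (1 = ice floe, 0 = water).
    `pixel_area` converts a pixel count to an area (assumed value: 1.0).
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill over a single floe.
            queue = deque([(r, c)])
            seen[r][c] = True
            count = 0
            while queue:
                y, x = queue.popleft()
                count += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(count * pixel_area)
    return sorted(areas)


def cumulative_fsd(areas):
    """Map each distinct area a to the fraction of floes with area >= a."""
    n = len(areas)
    return {a: sum(1 for b in areas if b >= a) / n for a in sorted(set(areas))}


# Toy mask with three floes of 4, 2, and 1 pixels.
toy_mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0],
]
areas = floe_areas(toy_mask)
fsd = cumulative_fsd(areas)
```

Applying the same summary to masks from real and synthetic samples would give two distributions that can be compared directly. The comparison statistic is left open here, since the abstract does not specify one.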

An important class of generative models, known as diffusion models, has been shown to be particularly effective. Over the years, a wide range of techniques and architectures has been developed to enable diffusion models to produce realistic samples from an approximation of the training data distribution. Such models can also be conditioned on additional information, such as text or images, which offers flexibility to explore and enhance the sampling process. More pertinently, diffusion models have been used to generate synthetic images of semi-natural areas captured by drones, as well as satellite imagery of rural and urban scenes. However, their application to SAR imagery of the cryosphere remains unexplored to date.

In this work, we describe a process in which we use a diffusion model, namely a Denoising Diffusion Probabilistic Model (DDPM), to model the joint distribution over SAR images and their corresponding labels. In addition to standard error metrics, we use the FSD to demonstrate that the synthetic SAR data are consistent with the real data. Furthermore, we show that training on a dataset composed of both real and synthetic data improves segmentation performance. Additional experiments show performance as a function of the amount of real and synthetic data.
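One way to picture joint modeling of images and labels is to stack each SAR image and its label mask as channels of a single sample and apply the standard forward (noising) process of a Denoising Diffusion Probabilistic Model to the stack. The sketch below illustrates only that step, using the closed form x_t = sqrt(ᾱ_t) x_0 + sqrt(1 − ᾱ_t) ε. Everything specific is an assumption rather than a detail from this work: the two-channel stacking, the linear noise schedule (1e-4 to 0.02 over 1000 steps, as in the original DDPM formulation), the scaling of pixels and labels to [-1, 1], and the hypothetical names `linear_betas`, `cumulative_alphas`, and `noise_joint_sample`.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule beta_1..beta_T (assumed, not from the abstract)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Return alpha_bar_t = prod_{s<=t} (1 - beta_s) for every step t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def noise_joint_sample(image, label, t, alpha_bars, rng):
    """Draw x_t ~ q(x_t | x_0), where x_0 stacks image and label as channels.

    `image` and `label` are flat lists of pixel values scaled to [-1, 1].
    Returns the noised stack x_t and the Gaussian noise eps that was added.
    """
    x0 = [image, label]  # channel 0: SAR backscatter, channel 1: label mask
    a_bar = alpha_bars[t]
    eps = [[rng.gauss(0.0, 1.0) for _ in channel] for channel in x0]
    xt = [
        [math.sqrt(a_bar) * v + math.sqrt(1.0 - a_bar) * e
         for v, e in zip(channel, eps_channel)]
        for channel, eps_channel in zip(x0, eps)
    ]
    return xt, eps


# Toy 4-pixel image and its binary label, both scaled to [-1, 1].
image = [0.2, -0.5, 0.9, -1.0]
label = [1.0, -1.0, 1.0, -1.0]

alpha_bars = cumulative_alphas(linear_betas(1000))
rng = random.Random(0)  # fixed seed so the sketch is repeatable
t = 500
xt, eps = noise_joint_sample(image, label, t, alpha_bars, rng)
```

In a DDPM, a network is then trained to predict `eps` from `xt` and `t`, and sampling runs the learned process in reverse from pure noise. Under the stacking assumption, each generated sample would yield an image channel and a label channel together, with the label channel thresholded back to a binary mask.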

How to cite: Bunker, J., Rogers, M. S. J., van Zeeland, L., Wilkinson, J., and Girolami, M.: Ice Floe Data Augmentation Using Diffusion Models, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-9315, https://doi.org/10.5194/egusphere-egu25-9315, 2025.