Exploring Transfer Learning Using Segment Anything Model in Optical Remote Sensing

Mohanad Albughdadi; Vasileios Baousis; Tolga Kaprol; Armagan Karatosun; Claudio Pisa

doi:https://doi.org/10.5194/egusphere-egu24-5769

[Back] [Session ESSI1.1]

EGU24-5769, updated on 14 Jan 2025

https://doi.org/10.5194/egusphere-egu24-5769

EGU General Assembly 2024

© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.

Exploring Transfer Learning Using Segment Anything Model in Optical Remote Sensing

Mohanad Albughdadi

¹, Vasileios Baousis

², Tolga Kaprol¹, Armagan Karatosun

¹, and Claudio Pisa

¹

Mohanad Albughdadi et al.

¹European Centre for Medium-Range Weather Forecasts, Bonn, Germany (firstname.lastname@ecmwf.int)
²European Centre for Medium-Range Weather Forecasts, Reading, UK (vasileios.baousis@ecmwf.int)

In the realm of remote sensing, where labeled datasets are scarce, leveraging pre-trained models via transfer learning offers a compelling solution. This study investigates the efficacy of the Segment Anything Model (SAM), a foundational computer vision model, in the domain of optical remote sensing tasks, specifically focusing on image classification and semantic segmentation.

The scarcity of labeled data in remote sensing poses a significant challenge for machine learning development. Transfer learning, a technique utilizing pre-trained models like SAM, circumvents this challenge by leveraging existing data from related domains. SAM, developed and trained by Meta AI, serves as a foundational model for prompt-based image segmentation. It employs over 1 billion masks on 11 million images, facilitating robust zero-shot and few-shot capabilities. SAM's architecture comprises an image encoder, prompt encoder, and mask decoder components, all geared towards swift and accurate segmentation for various prompts, ensuring real-time interactivity and handling ambiguity.

Two distinct use cases leveraging SAM-based models in the domain of optical remote sensing are presented, representing two critical tasks: image classification and semantic segmentation. Through comprehensive analysis and comparative assessments, various model architectures, including linear and convolutional classifiers, SAM-based adaptations, and UNet for semantic segmentation, are examined. Experiments encompass contrasting model performances across different dataset splits and varying training data sizes. The SAM-based models include using a linear, a convolutional or a ViT decoder classifiers on top of the SAM encoder.

Use Case 1: Image Classification with EuroSAT Dataset

The EuroSAT dataset, comprising 27,000 labeled image patches from Sentinel-2 satellite images across ten distinct land cover classes, serves as the testing ground for image classification tasks. SAM-ViT models consistently demonstrate high accuracy, ranging between 89% and 93% on various sizes of training datasets. These models outperform baseline approaches, exhibiting resilience even with limited training data. This use case highlights SAM-ViT's effectiveness in accurately categorizing land cover classes despite data limitations.

Use Case 2: Semantic Segmentation with Road Dataset

In the semantic segmentation domain, the study focuses on the Road dataset, evaluating SAM-based models, particularly SAM-CONV, against the benchmark UNet model. SAM-CONV showcases remarkable superiority, achieving F1-scores and Dice coefficients exceeding 0.84 and 0.82, respectively. Its exceptional performance in pixel-level labeling emphasizes its robustness in delineating roads from surrounding environments, surpassing established benchmarks and demonstrating its applicability in fine-grained analysis.

In conclusion, SAM-driven transfer learning methods hold promise for robust remote sensing analysis. SAM-ViT excels in image classification, while SAM-CONV demonstrates superiority in semantic segmentation, paving the way for their practical use in real-world remote sensing applications despite limited labeled data availability.

How to cite: Albughdadi, M., Baousis, V., Kaprol, T., Karatosun, A., and Pisa, C.: Exploring Transfer Learning Using Segment Anything Model in Optical Remote Sensing, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-5769, https://doi.org/10.5194/egusphere-egu24-5769, 2024.