EGU24-5769, updated on 08 Mar 2024
https://doi.org/10.5194/egusphere-egu24-5769
EGU General Assembly 2024
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Exploring Transfer Learning Using Segment Anything Model in Optical Remote Sensing

Mohanad Albughdadi¹, Vasileios Baousis², Tolga Kaprol¹, Armagan Karatosun¹, and Claudio Pisa¹
  • ¹European Centre for Medium-Range Weather Forecasts, Bonn, Germany (firstname.lastname@ecmwf.int)
  • ²European Centre for Medium-Range Weather Forecasts, Reading, UK (vasileios.baousis@ecmwf.int)

In the realm of remote sensing, where labeled datasets are scarce, leveraging pre-trained models via transfer learning offers a compelling solution. This study investigates the efficacy of the Segment Anything Model (SAM), a foundational computer vision model, in the domain of optical remote sensing tasks, specifically focusing on image classification and semantic segmentation.

The scarcity of labeled data in remote sensing poses a significant challenge for machine learning development. Transfer learning, a technique that builds on pre-trained models such as SAM, circumvents this challenge by leveraging knowledge from related domains. SAM, developed and trained by Meta AI, is a foundation model for prompt-based image segmentation. It was trained on more than 1 billion masks across 11 million images, giving it robust zero-shot and few-shot capabilities. SAM's architecture comprises an image encoder, a prompt encoder, and a mask decoder, designed for fast, accurate segmentation from a variety of prompts, with real-time interactivity and the ability to handle ambiguous inputs.
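
As a rough illustration of how SAM's components can be reused for transfer learning, the sketch below loads a pre-trained SAM checkpoint and extracts image embeddings from its encoder; the ViT-B variant and checkpoint filename are assumptions for illustration, not necessarily the exact configuration used in this study.

```python
# Minimal sketch: extracting image embeddings with SAM's image encoder.
# Assumes the `segment-anything` package is installed and a ViT-B checkpoint
# has been downloaded (filename below is illustrative).
import torch
from segment_anything import sam_model_registry

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
sam.eval()

# The image encoder expects 1024x1024 RGB input and returns a
# (batch, 256, 64, 64) embedding that downstream heads can consume.
with torch.no_grad():
    dummy = torch.zeros(1, 3, 1024, 1024)
    embeddings = sam.image_encoder(dummy)

print(embeddings.shape)  # torch.Size([1, 256, 64, 64])
```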

Two use cases leveraging SAM-based models in optical remote sensing are presented, covering two critical tasks: image classification and semantic segmentation. Through comprehensive analysis and comparative assessments, various model architectures are examined, including linear and convolutional classifiers, SAM-based adaptations, and UNet for semantic segmentation. Experiments contrast model performance across different dataset splits and training data sizes. The SAM-based models place a linear classifier, a convolutional classifier, or a ViT decoder on top of the SAM image encoder, as sketched below.
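
A minimal sketch of such transfer-learning heads is given below, with the SAM encoder frozen and only a lightweight classifier trained on top; the class names, layer sizes, and pooling strategy are illustrative assumptions rather than the exact architectures evaluated in the study.

```python
# Hedged sketch of SAM-based classification heads: a linear or convolutional
# classifier on top of a frozen SAM image encoder.
import torch.nn as nn


class SamLinearClassifier(nn.Module):
    def __init__(self, sam_image_encoder, num_classes: int):
        super().__init__()
        self.encoder = sam_image_encoder
        for p in self.encoder.parameters():      # freeze pre-trained weights
            p.requires_grad = False
        # SAM ViT encoders emit (B, 256, 64, 64) embeddings.
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Linear(256, num_classes)

    def forward(self, x):
        feats = self.encoder(x)                  # (B, 256, 64, 64)
        pooled = self.pool(feats).flatten(1)     # (B, 256)
        return self.head(pooled)


class SamConvClassifier(nn.Module):
    def __init__(self, sam_image_encoder, num_classes: int):
        super().__init__()
        self.encoder = sam_image_encoder
        for p in self.encoder.parameters():
            p.requires_grad = False
        self.head = nn.Sequential(
            nn.Conv2d(256, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x):
        return self.head(self.encoder(x))
```

Freezing the encoder keeps the number of trainable parameters small, which is what makes this setup attractive when labeled remote sensing data are scarce.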

Use Case 1: Image Classification with EuroSAT Dataset

The EuroSAT dataset, comprising 27,000 labeled image patches from Sentinel-2 satellite images across ten land cover classes, serves as the testing ground for the image classification task. SAM-ViT models consistently achieve high accuracy, ranging between 89% and 93% across various training set sizes. These models outperform baseline approaches and remain resilient even with limited training data. This use case highlights SAM-ViT's effectiveness in accurately categorizing land cover classes despite data limitations.
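
To reproduce a comparable low-label setting, a hedged sketch of loading the EuroSAT RGB patches via torchvision and carving out a small training split is shown below; the resize to SAM's 1024x1024 input and the 10% split are assumptions, not the study's exact protocol.

```python
# Illustrative sketch: preparing EuroSAT for a SAM-based classifier under a
# limited-label regime (split size is an assumption).
import torch
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

tfm = transforms.Compose([
    transforms.Resize((1024, 1024)),   # match SAM's expected input size
    transforms.ToTensor(),
])
eurosat = datasets.EuroSAT(root="data", transform=tfm, download=True)

# e.g. keep only 10% of the patches for training to mimic label scarcity
n_train = int(0.1 * len(eurosat))
train_set, val_set = random_split(eurosat, [n_train, len(eurosat) - n_train])

train_loader = DataLoader(train_set, batch_size=16, shuffle=True)
val_loader = DataLoader(val_set, batch_size=16)
```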

Use Case 2: Semantic Segmentation with Road Dataset

In the semantic segmentation domain, the study focuses on the Road dataset, evaluating SAM-based models, particularly SAM-CONV, against the benchmark UNet model. SAM-CONV showcases remarkable superiority, achieving F1-scores and Dice coefficients exceeding 0.84 and 0.82, respectively. Its exceptional performance in pixel-level labeling emphasizes its robustness in delineating roads from surrounding environments, surpassing established benchmarks and demonstrating its applicability in fine-grained analysis.
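
For reference, the two metrics quoted above can be computed on binary road masks as in the sketch below; the 0.5 binarization threshold and the smoothing term are illustrative choices.

```python
# Sketch of the evaluation metrics quoted above (pixel-wise F1-score and Dice
# coefficient) for binary road segmentation masks.
import torch


def dice_coefficient(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-7) -> float:
    """Dice = 2*|P ∩ T| / (|P| + |T|) over binarized masks."""
    pred = (pred > 0.5).float()
    target = (target > 0.5).float()
    inter = (pred * target).sum()
    return ((2 * inter + eps) / (pred.sum() + target.sum() + eps)).item()


def f1_score(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-7) -> float:
    """Pixel-wise F1 = 2 * precision * recall / (precision + recall)."""
    pred = (pred > 0.5).float()
    target = (target > 0.5).float()
    tp = (pred * target).sum()
    precision = tp / (pred.sum() + eps)
    recall = tp / (target.sum() + eps)
    return (2 * precision * recall / (precision + recall + eps)).item()
```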

In conclusion, SAM-driven transfer learning methods hold promise for robust remote sensing analysis. SAM-ViT excels in image classification, while SAM-CONV demonstrates superiority in semantic segmentation, paving the way for their practical use in real-world remote sensing applications despite limited labeled data availability.

How to cite: Albughdadi, M., Baousis, V., Kaprol, T., Karatosun, A., and Pisa, C.: Exploring Transfer Learning Using Segment Anything Model in Optical Remote Sensing, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-5769, https://doi.org/10.5194/egusphere-egu24-5769, 2024.