EGU24-4772, updated on 08 Mar 2024
https://doi.org/10.5194/egusphere-egu24-4772
EGU General Assembly 2024
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

A Comparison of SimCLR and SwAV Contrastive Self-Supervised Learning Models For Landslide Detection

Hejar Shahabi1, Omid Ghorbanzadeh2, Saeid Homayouni1, and Pedram Ghamisi3,4
  • 1Centre Eau Terre Environnement, Institut national de la recherche scientifique, Quebec, Canada (hejar.shahabi@inrs.ca; saeid.homayouni@inrs.ca)
  • 2Institute of Geomatics, University of Natural Resources and Life Sciences (BOKU), Vienna, Austria (omid.ghorbanzadeh@boku.ac.at)
  • 3Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Dresden, Germany (p.ghamisi@hzdr.de)
  • 4Institute of Advanced Research in Artificial Intelligence (IARAI), Vienna, Austria (pedram.ghamisi@iarai.ac.at)

Deep Learning (DL) algorithms have outperformed traditional Machine Learning (ML) methods in landslide detection from Remote Sensing (RS) imagery. However, their performance depends heavily on the quantity of manual annotations available for training. This study investigates two Self-Supervised Learning (SSL) models, the Simple Framework for Contrastive Learning of Visual Representations (SimCLR) and Swapping Assignments between multiple Views (SwAV), which were adapted and enhanced for the downstream task of landslide detection. The SSL models were trained on the Landslide4Sense competition dataset, which consists of 3799 training patches, 245 validation patches, and 800 testing patches generated from Sentinel-2 images acquired over diverse regions worldwide. Only the training patches were used to train SimCLR and SwAV, with a series of data augmentations applied to the input as required by each model's architecture. Both models used ResNet-50 as the encoder.
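To make the contrastive objective concrete, the following is a minimal sketch of the normalized temperature-scaled cross-entropy (NT-Xent) loss commonly associated with SimCLR, written in plain Python. It is an illustration only: the abstract does not describe the authors' implementation, and the function names, the toy two-dimensional embeddings, and the temperature value of 0.5 are all assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(views_a, views_b, temperature=0.5):
    """NT-Xent loss over a batch of paired embeddings.

    views_a[i] and views_b[i] are assumed to be encoder outputs for two
    augmentations of the same patch i. The temperature is an assumed value.
    """
    embeddings = list(views_a) + list(views_b)
    n = len(views_a)
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of the same patch
        # Similarities to every other sample in the batch (self excluded).
        logits = [
            cosine_similarity(embeddings[i], embeddings[k]) / temperature
            for k in range(2 * n)
            if k != i
        ]
        # Numerically stable log-sum-exp for the softmax denominator.
        max_logit = max(logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(value - max_logit) for value in logits)
        )
        positive_logit = (
            cosine_similarity(embeddings[i], embeddings[positive]) / temperature
        )
        total += log_denominator - positive_logit
    return total / (2 * n)


# Toy check: matching views should give a lower loss than mismatched ones.
aligned = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
misaligned = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy setting the loss is small when the two views of each patch map to similar embeddings and large when they do not, which is the behaviour the pretraining stage relies on. SwAV replaces this pairwise comparison with cluster-assignment prediction and is not covered by this sketch.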

For the downstream task of landslide detection, a custom U-Net model was developed. The SSL-pretrained ResNet-50 served as its encoder; during fine-tuning, the encoder remained frozen and only the decoder was trained. For fine-tuning, subsets comprising 1% and 10% of the labeled training data were randomly selected, and predictions were made exclusively on the testing data. A conventional supervised ResU-Net model trained on the full labeled training set attained an F1 score of 72%. By comparison, with 1% labeled data SimCLR and SwAV achieved F1 scores of 64% and 71%, respectively, and with 10% labeled data they achieved 68% and 76%. Comparisons with all supervised reference models in the Landslide4Sense competition further showed that SwAV with 10% labeled data outperformed all of them, surpassing the top model by 4%. This study underscores the potential of SSL techniques for the segmentation and classification of RS images in natural hazard mapping, particularly where labeled data are unavailable or limited.
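The evaluation protocol above can be sketched in a few lines of Python: drawing a random labeled subset of the 3799 training patches and scoring a binary prediction with F1. This is a hedged illustration, not the authors' code. The helper names, the fixed seed, the rounding rule for subset size, and the assumption that F1 is computed pixel-wise on a binary landslide mask are all hypothetical.

```python
import random


def sample_labeled_subset(patch_ids, fraction, seed=0):
    """Randomly pick a fraction of patch IDs to use as labeled data.

    The seed and the rounding rule are assumptions made for reproducibility.
    """
    rng = random.Random(seed)
    count = max(1, round(len(patch_ids) * fraction))
    return rng.sample(patch_ids, count)


def f1_score(predicted, reference):
    """Binary F1 over flat sequences of 0/1 labels (1 = landslide)."""
    pairs = list(zip(predicted, reference))
    true_pos = sum(1 for p, r in pairs if p == 1 and r == 1)
    false_pos = sum(1 for p, r in pairs if p == 1 and r == 0)
    false_neg = sum(1 for p, r in pairs if p == 0 and r == 1)
    if true_pos == 0:
        return 0.0  # no correct positives: precision or recall is zero
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# The dataset size comes from the abstract; the IDs themselves are placeholders.
training_patch_ids = list(range(3799))
subset_1_percent = sample_labeled_subset(training_patch_ids, 0.01)
subset_10_percent = sample_labeled_subset(training_patch_ids, 0.10)

# Toy masks: one true positive, one false positive, one false negative.
toy_f1 = f1_score([1, 1, 0, 0], [1, 0, 1, 0])
```

Under these assumptions the 1% and 10% subsets contain 38 and 380 patches, respectively, which gives a sense of how little labeled data the fine-tuned decoders would see.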

How to cite: Shahabi, H., Ghorbanzadeh, O., Homayouni, S., and Ghamisi, P.: A Comparison of SimCLR and SwAV Contrastive Self-Supervised Learning Models For Landslide Detection, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-4772, https://doi.org/10.5194/egusphere-egu24-4772, 2024.