Detecting Floating Macroplastic Litter with Semi-Supervised Deep Learning

Tianlong Jia; Rinze de Vries; Zoran Kapelan; Riccardo Taormina

doi:https://doi.org/10.5194/egusphere-egu24-9691

Session ITS3.24/HS12.9

EGU24-9691, updated on 08 Mar 2024

EGU General Assembly 2024

© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Tianlong Jia¹, Rinze de Vries², Zoran Kapelan¹, and Riccardo Taormina¹

¹Faculty of Civil Engineering and Geosciences, Department of Water Management, Delft University of Technology, Stevinweg 1, 2628 CN Delft, The Netherlands
²Noria Sustainable Innovators, Schieweg 13, 2627 AN Delft, The Netherlands

Researchers are increasingly using deep learning methods for computer vision to identify and quantify floating macroplastic litter. While these methods can provide precise assessments of plastic pollution by automatically processing images and videos, they often rely on large amounts of annotated data for supervised learning (SL), and manual labeling is expensive and time-consuming. This dependence makes it difficult to achieve high model generalization capability, which is essential for developing robust computer vision systems for structural monitoring.

To overcome this challenge, we propose a two-stage semi-supervised learning (SSL) method for detecting floating macroplastic litter based on the SwAV (Swapping Assignments between multiple Views of the same image) approach. SwAV is a self-supervised learning method that learns feature representations of data (such as images with macroplastic litter) without manual annotations. In the first stage of the SSL method, we use SwAV to pre-train a ResNet50 (Residual Neural Network with 50 layers) backbone on more than 100K unlabeled images. In the second stage, we add layers to the pre-trained ResNet50 to create a Faster R-CNN (Faster Region-based Convolutional Neural Network) architecture, and fine-tune it for object detection using a limited amount of labeled data (<13K images with 2.6K annotations).
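To make the first stage more concrete, the following is a toy sketch of the swapped-prediction idea behind SwAV, written in plain Python. It is not the authors' training code: the prototype scores are hand-written, hypothetical numbers standing in for the similarities between ResNet50 features and learnable prototypes, and the function names and the default values of `eps`, `n_iters` and `temperature` are illustrative assumptions. Each view of an image gets a soft cluster assignment (a "code") via Sinkhorn-Knopp normalization, and the loss asks one view to predict the code of the other.

```python
import math


def softmax(scores, temperature):
    """Turn one image's prototype scores into a probability distribution."""
    scaled = [s / temperature for s in scores]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sinkhorn(scores, eps=0.05, n_iters=3):
    """Compute soft codes for a batch (B images x K prototypes).

    Alternating column/row normalization pushes the batch to use all
    prototypes roughly equally; each returned row sums to 1.
    """
    batch, n_protos = len(scores), len(scores[0])
    top = max(max(row) for row in scores)
    q = [[math.exp((s - top) / eps) for s in row] for row in scores]
    for _ in range(n_iters):
        # Each prototype (column) receives total mass 1 / K.
        for k in range(n_protos):
            col = sum(q[b][k] for b in range(batch))
            for b in range(batch):
                q[b][k] /= col * n_protos
        # Each image (row) receives total mass 1 / B.
        for b in range(batch):
            row = sum(q[b])
            q[b] = [v / (row * batch) for v in q[b]]
    return [[v * batch for v in row] for row in q]


def swapped_prediction_loss(scores_a, scores_b, temperature=0.1):
    """Cross-entropy between each view's prediction and the other view's code."""
    codes_a, codes_b = sinkhorn(scores_a), sinkhorn(scores_b)
    batch = len(scores_a)
    loss = 0.0
    for b in range(batch):
        pred_a = softmax(scores_a[b], temperature)
        pred_b = softmax(scores_b[b], temperature)
        # View A predicts the code of view B, and vice versa.
        loss -= sum(q * math.log(p) for q, p in zip(codes_b[b], pred_a))
        loss -= sum(q * math.log(p) for q, p in zip(codes_a[b], pred_b))
    return loss / (2 * batch)


# Hypothetical scores: 2 images, 2 prototypes, two augmented views each.
views_a = [[0.9, 0.1], [0.1, 0.9]]
views_b_consistent = [[0.8, 0.2], [0.2, 0.8]]  # both views agree
views_b_swapped = [[0.2, 0.8], [0.8, 0.2]]     # views disagree

print("consistent views:", swapped_prediction_loss(views_a, views_b_consistent))
print("inconsistent views:", swapped_prediction_loss(views_a, views_b_swapped))
```

The loss is low when two views of the same image fall into the same cluster and high when they do not, which is the signal that lets a backbone learn from unlabeled images. In an actual pipeline the scores would come from the network and the loss would be minimized by gradient descent in a deep learning framework.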

We demonstrate the effectiveness and robustness of our methodology on images collected in canals and waterways of the Netherlands and Southeast Asia. We conduct a thorough comparison with the conventional SL method using the same Faster R-CNN architecture and ImageNet pre-trained weights. The results suggest that our method improves both in-domain and out-of-domain generalization performance over the SL method. Our findings also demonstrate that feature representations learned by SwAV pre-training on context-related images outperform those learned from much larger, but unrelated, datasets (e.g., ImageNet).

Based on these results, we suggest that stakeholders (e.g., researchers, consultants and governmental organizations) consider SSL methods to develop more robust systems for targeted long-term monitoring of floating macroplastics. Future work should focus on scaling up computations by using much larger (e.g., over 1 million images), yet relatively inexpensive, unlabeled datasets to fully exploit SSL.

How to cite: Jia, T., de Vries, R., Kapelan, Z., and Taormina, R.: Detecting Floating Macroplastic Litter with Semi-Supervised Deep Learning, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-9691, https://doi.org/10.5194/egusphere-egu24-9691, 2024.