- ¹Sapienza Università di Roma, Dipartimento di Ingegneria Civile, Edile e Ambientale (DICEA), Rome, Italy
- ²Università degli Studi di Palermo, Dipartimento di Scienze della Terra e del Mare (DISTEM), Palermo, Italy
- ³Istituto Nazionale di Geofisica e Vulcanologia, Rome, Italy
Mud volcanoes are highly dynamic geohazard environments in which surface conditions can change over very short timescales due to episodic mud extrusion, flow, drying, cracking and oxidation. The resulting landscapes are spatially heterogeneous and typically include mixtures of fresh and weathered mud, crusted deposits, bare soil and dense or sparse vegetation. Given the opportunities that deep learning offers for environmental monitoring, a consistent categorization of these surfaces is essential to quantify spatial patterns through time and to assess the evolution of active areas. However, progress is often limited by the lack of high-quality, domain-specific labelled datasets. This gap slows the adoption of deep learning models in specialized environmental settings such as mud volcanoes, because the most readily available training datasets are largely drawn from urban and human-centered contexts. Manual annotation can partially compensate for limited training data, but it is labor-intensive and difficult to standardize across operators, especially where class transitions are gradual and boundaries are diffuse rather than sharp.
This study investigates how multispectral orthophotos can support the separation of key mud volcano surface features and thereby accelerate mask creation for dataset generation. We present a case study at the Aragona mud volcano field (Sicily, Italy), known as the Maccalube, using imagery acquired with a DJI Mavic 3 Multispectral and processed into an orthomosaic with Agisoft Metashape. We first evaluated common soil- and vegetation-oriented spectral indices as separability baselines. In this setting, however, baseline indices can be ambiguous because wet clay-rich substrates and thin surface water films may yield intermediate responses that overlap with those of low-cover vegetation. We additionally tested common rapid segmentation methods on the RGB orthomosaic, including K-means, Simple Linear Iterative Clustering (SLIC) and Segment Anything. These algorithms performed poorly, often merging distinct classes and fragmenting individual ones, so that their outputs require substantial manual correction.
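As an illustration of the kind of baseline index referred to above, the following is a minimal sketch of two widely used normalized-difference indices, NDVI and NDRE, computed with NumPy. The abstract does not state which indices were evaluated, so the choice of these two is an assumption, and the reflectance values are invented for illustration and are not measurements from the Aragona survey.

```python
import numpy as np

def normalized_difference(a, b, eps=1e-6):
    """Return (a - b) / (a + b); eps guards against division by zero."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return (a - b) / (a + b + eps)

def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return normalized_difference(nir, red)

def ndre(nir, red_edge):
    """Normalized Difference Red Edge index: (NIR - RE) / (NIR + RE)."""
    return normalized_difference(nir, red_edge)

# Invented reflectances (assumption, for illustration only):
# pixel 0 = dense vegetation, pixel 1 = wet mud, pixel 2 = sparse vegetation
nir = np.array([0.50, 0.12, 0.22])
red = np.array([0.05, 0.10, 0.12])
red_edge = np.array([0.20, 0.11, 0.17])

ndvi_values = ndvi(nir, red)
ndre_values = ndre(nir, red_edge)
print("NDVI:", ndvi_values)
print("NDRE:", ndre_values)
```

In this toy example the sparse-vegetation pixel falls between dense vegetation and wet mud on both indices, which is the kind of intermediate response that makes a single threshold hard to place.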
We therefore introduce a practical band combination that integrates information from the visible channels with the red-edge and near-infrared bands to improve discrimination between vegetation, wet mud and drier or more weathered mud areas. The calculation is constructed in two steps. First, the visible channels are combined into a neutrality term that increases when RGB responses are similar (low color contrast). Second, this term is multiplied by an inverted red-edge contrast component derived from the near-infrared and red-edge bands, which reduces the output where a strong red-edge rise is present. The result is a pre-labelling layer that can be thresholded to generate candidate masks with improved vegetation suppression. Remaining ambiguities are mainly confined to non-vegetated materials with a similar dark appearance, such as very fresh dark mud versus other bare substrates. Overall, the workflow offers a practical way to accelerate mask creation in domains where labelled data are limited. It supports the rapid development of domain-specific training datasets for deep learning applications, with a view to future automated monitoring of these environments.
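The two-step construction described above can be sketched as follows. The abstract does not give the exact formula, so this is only one possible realization under stated assumptions: the neutrality term is taken here as one minus the relative RGB spread, the red-edge contrast as a clipped normalized difference of the near-infrared and red-edge bands, and the threshold of 0.5 is arbitrary. All function names and pixel values are hypothetical.

```python
import numpy as np

def neutrality(red, green, blue, eps=1e-6):
    """Assumed neutrality term: close to 1 when R, G and B are similar."""
    rgb = np.stack([red, green, blue]).astype(np.float64)
    brightest = rgb.max(axis=0)
    spread = brightest - rgb.min(axis=0)
    return 1.0 - spread / (brightest + eps)

def inverted_red_edge_contrast(nir, red_edge, eps=1e-6):
    """Assumed contrast term: close to 0 where NIR rises strongly above red edge."""
    nir = np.asarray(nir, dtype=np.float64)
    red_edge = np.asarray(red_edge, dtype=np.float64)
    contrast = (nir - red_edge) / (nir + red_edge + eps)
    return 1.0 - np.clip(contrast, 0.0, 1.0)

def prelabel_layer(red, green, blue, red_edge, nir):
    """Product of the two terms: high for neutral, non-vegetated pixels."""
    return neutrality(red, green, blue) * inverted_red_edge_contrast(nir, red_edge)

def candidate_mask(layer, threshold=0.5):
    """Boolean mask of candidate mud pixels (threshold is an arbitrary assumption)."""
    return layer >= threshold

# Invented reflectances: pixel 0 = vegetation, pixel 1 = wet mud
red = np.array([0.05, 0.10])
green = np.array([0.10, 0.10])
blue = np.array([0.04, 0.09])
red_edge = np.array([0.20, 0.11])
nir = np.array([0.50, 0.12])

layer = prelabel_layer(red, green, blue, red_edge, nir)
mask = candidate_mask(layer)
print("layer:", layer)
print("mask:", mask)
```

In this sketch the vegetation pixel is suppressed by both terms (colored in the visible range and showing a strong red-edge rise), while the neutral, spectrally flat mud pixel keeps a high value. In practice the threshold would need to be tuned per orthomosaic and the resulting mask checked by an operator.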
How to cite: Guastella, M., Martorana, R., Pisciotta, A., and D'Alessandro, A.: Multispectral pre-labelling workflow for mud volcano training datasets: a case study at the Maccalube of Aragona, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-12274, https://doi.org/10.5194/egusphere-egu26-12274, 2026.