EGU24-15571, updated on 09 Mar 2024
https://doi.org/10.5194/egusphere-egu24-15571
EGU General Assembly 2024
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Self-Supervised Learning Strategies for Clustering Continuous Seismic Data

Joachim Rimpot1, Clément Hibert1,2, Jean-Philippe Malet1,2, Germain Forestier3, and Jonathan Weber3
  • 1Institut Terre et Environnement de Strasbourg, CNRS UMR 7063, Université de Strasbourg, Strasbourg, France
  • 2Ecole et Observatoire des Sciences de la Terre, CNRS UAR 830, Université de Strasbourg, Strasbourg, France
  • 3Institut de Recherche en Informatique, Mathématiques, Automatique et Signal, UR 7499, Université de Haute-Alsace, Mulhouse, France

Continuous seismological datasets offer insights into the dynamics of many geological structures (such as landslides, ice glaciers, and volcanoes) in relation to various forcing factors (meteorological, climatic, tectonic, anthropogenic). The recent emergence of dense seismic station networks has provided opportunities to document these phenomena, but it has also introduced challenges for seismologists: the vast amount of data generated requires more sophisticated and automated analysis techniques. Supervised machine learning shows promising performance on this task; however, it requires training catalogs, whose creation is both time-consuming and subject to biases, including pre-detection of events and subjectivity in labeling. To address these biases, manage large data volumes and discover hidden signals in the datasets, we introduce a Self-Supervised Learning (SSL) approach for the unsupervised clustering of continuous seismic data. The method uses siamese deep neural networks to learn from the original, unlabeled data. The SSL model is trained to increase the similarity between pairs of images that correspond to different representations (seismic traces, spectrograms) of the seismic data. The images are embedded in a 512-dimensional space in which potentially similar events lie close together. We then identify groups of events using clustering algorithms, either centroid-based or density-based.
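
The two steps described above (pulling paired representations together in the embedding space, then clustering the embeddings) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the negative cosine similarity objective, the k-means routine with farthest-point seeding, the function names, and the synthetic embeddings standing in for encoder outputs are all assumptions made for illustration. Only the 512-dimensional embedding size and the use of a centroid-based clustering algorithm come from the abstract; the density-based alternative is not shown.

```python
import numpy as np

EMBED_DIM = 512  # embedding size stated in the abstract


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def pair_similarity_loss(z_a, z_b):
    """Negative mean cosine similarity between row-aligned embedding pairs.

    Assumed objective for illustration: a lower value means the two
    representations of each sample are embedded closer together.
    """
    cos = np.sum(l2_normalize(z_a) * l2_normalize(z_b), axis=1)
    return -float(np.mean(cos))


def farthest_point_init(x, k):
    """Deterministic seeding (assumption): start from x[0], then repeatedly
    add the point farthest from the centroids chosen so far."""
    centroids = [x[0]]
    for _ in range(1, k):
        dist = np.min([np.linalg.norm(x - c, axis=1) for c in centroids], axis=0)
        centroids.append(x[np.argmax(dist)])
    return np.array(centroids)


def assign(x, centroids):
    """Index of the nearest centroid for every row of x."""
    dist = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
    return dist.argmin(axis=1)


def kmeans(x, k, n_iter=50):
    """Plain k-means, used here as a generic centroid-based clustering."""
    centroids = farthest_point_init(x, k)
    for _ in range(n_iter):
        labels = assign(x, centroids)
        updated = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return assign(x, centroids), centroids


# Synthetic stand-ins for encoder outputs: two hypothetical source types,
# each sample embedded twice (e.g. one view per representation).
rng = np.random.default_rng(0)
n_per_group = 100
centers = 5.0 * rng.normal(size=(2, EMBED_DIM))
true_group = np.repeat([0, 1], n_per_group)
base = centers[true_group]
z_a = base + rng.normal(size=base.shape)  # hypothetical embeddings of view A
z_b = base + rng.normal(size=base.shape)  # hypothetical embeddings of view B

matched = pair_similarity_loss(z_a, z_b)           # pairs from the same sample
mismatched = pair_similarity_loss(z_a, z_b[::-1])  # pairs from different groups
labels, _ = kmeans(z_a, k=2)

print("loss, matched pairs:   ", matched)
print("loss, mismatched pairs:", mismatched)
print("cluster sizes:", np.bincount(labels))
```

In a real siamese setup the loss would be minimized by gradient descent on the encoder weights; here it is only evaluated, to show that aligned pairs score lower than misaligned ones and that well-separated groups in the embedding space are recovered by the clustering step.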

The processing technique is applied to continuous seismological datasets acquired by two dense arrays at the Marie-sur-Tinée landslide and the Pas-de-Chauvet rock glacier, both located in the southern French Alps. Both datasets include over a month of continuous data from more than 50 stations. The technique is applied to the continuous data streams from either a single station or the whole station network. The clustering results contain a large number of distinct clusters that may correspond to different types of sources. These include the main types of seismicity expected in these contexts (earthquakes, rockfalls, natural and anthropogenic noise) as well as potentially unknown sources. Our SSL-based clustering approach streamlines the exploration of large datasets, allowing more time for detailed analysis of the mechanisms and processes active in these geological structures.

How to cite: Rimpot, J., Hibert, C., Malet, J.-P., Forestier, G., and Weber, J.: Self-Supervised Learning Strategies for Clustering Continuous Seismic Data, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-15571, https://doi.org/10.5194/egusphere-egu24-15571, 2024.