EGU25-7718, updated on 14 Mar 2025
https://doi.org/10.5194/egusphere-egu25-7718
EGU General Assembly 2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Oral | Thursday, 01 May, 15:05–15:15 (CEST)
 
Room -2.31
Dataset preparation for Resolving Apparent Source Time Functions (ASTF) and Evaluations Using Basic ML-models
Runcheng Pang1, Hongyu Yu1, Ge Li2, Haoran Meng3, Zaiwang Liu3, Cheng Su1, Deli Zha1, and Wanli Tian1
  • 1School of Earth Science, Zhejiang University, Hangzhou, China (hongyu.yu@zju.edu.cn)
  • 2Mila-Quebec AI Institute, Montreal, QC, Canada (ge.li@mila.quebec)
  • 3Department of Earth and Space Sciences, Southern University of Science and Technology, Shenzhen, China (menghr@sustech.edu.cn)

Monitoring induced seismic hazards during fluid injection has been a significant challenge for geo-energy development. Conventional approaches, such as the (adaptive) traffic light protocol and prediction methods based on statistical and machine learning regressions, often yield limited accuracy because geological conditions differ across regions. A promising direction lies in developing precursors to monitor fault reactivation more effectively.

Recent seismological studies have shown that aseismic slip loading is more prevalent than previously thought during fault activation induced by fluid injection (Yu et al., 2021a, b; Eyre et al., 2019, 2022). Slow earthquake signals, such as Earthquakes characterized by Hybrid-frequency Waveforms (EHW), occur during the transition from fault creep to brittle rupture induced by fluid injection (Guglielmi et al., 2015). These signals are potential indicators of aseismic slip loading and fault reactivation. However, their rupture durations are longer than those of typical induced earthquakes, which renders classic source analysis methods ineffective for real-time monitoring.

To address this limitation, we propose ASTF-Net, a machine learning (ML) model designed to predict the Apparent Source Time Function (ASTF) by deconvolving the Empirical Green's Function (EGF) waveform from the target waveform in the time domain. This approach provides reliable real-time estimates of source duration, which can be used to identify slow earthquake signals, specifically EHW, and offers a valuable tool for monitoring fault activation. The model's performance depends critically on a robust and well-sampled dataset.
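The inverse problem that ASTF-Net targets can be illustrated with a conventional baseline. The sketch below is not the authors' model: it assumes the target waveform equals the EGF convolved with the ASTF and recovers the ASTF by damped least squares in the time domain. The function names, the toy EGF (a decaying cosine), the toy ASTF (a Hann pulse), and the `damping` parameter are all illustrative assumptions.

```python
import numpy as np

def convolution_matrix(egf, n_astf):
    """Build G such that G @ astf equals np.convolve(egf, astf)."""
    n_out = len(egf) + n_astf - 1
    G = np.zeros((n_out, n_astf))
    for j in range(n_astf):
        # Column j holds the EGF delayed by j samples.
        G[j:j + len(egf), j] = egf
    return G

def deconvolve_time_domain(target, egf, n_astf, damping=0.0):
    """Damped least-squares ASTF estimate (illustrative baseline only)."""
    G = convolution_matrix(egf, n_astf)
    # Tikhonov damping rows stabilise the inversion for noisy data.
    A = np.vstack([G, damping * np.eye(n_astf)])
    b = np.concatenate([target, np.zeros(n_astf)])
    astf, *_ = np.linalg.lstsq(A, b, rcond=None)
    return astf

# Toy, noise-free example with assumed waveforms.
dt = 0.01                                                  # sampling interval in s
t = np.arange(100) * dt
egf = np.exp(-t / 0.05) * np.cos(2.0 * np.pi * 10.0 * t)   # stand-in EGF
true_astf = np.hanning(30)                                 # stand-in ASTF
target = np.convolve(egf, true_astf)                       # forward model
estimate = deconvolve_time_domain(target, egf, n_astf=len(true_astf))
```

With real, noisy records a nonzero `damping` (or a positivity constraint on the ASTF) would typically be needed; a learned model is one way to avoid tuning such choices for each event.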

In this study, we present a dataset designed for developing a single-channel version of ASTF-Net and evaluate its effectiveness using basic ML models. The dataset consists of three parts: ASTFs, EGFs, and target seismic waveforms. Synthetic ASTFs are generated using kinematic forward modeling with an elliptical rupture model to simulate earthquakes with magnitudes ranging from Mw 3.0 to 4.5 and stress drops between 5 and 20 MPa. These randomized ASTFs are calculated for various ray paths. We collect EGFs from hydraulic fracturing-induced earthquakes (M1.5-2.5) in the Southern Montney Play, western Canada, recorded by a network of 40 nodal/broadband seismic stations between 2017 and 2020. Synthetic target waveforms are then created by convolving the ASTFs with the corresponding EGFs. The dataset’s inputs are single-channel synthetic seismic waveforms and their corresponding EGFs, and its outputs are the ASTFs. To assess the generalization and robustness of the model, records from different stations are divided into training, validation, and four test sets of varying difficulty, based on geographical location and event count. Notably, Test Level 4 consists of human analysis results reported by Roth et al. (2022).
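The construction of one training sample by convolution can be sketched as follows. This is a minimal illustration under stated assumptions: the Hann-shaped pulse stands in for the ASTF from the elliptical rupture model, the decaying oscillation with weak noise stands in for a recorded EGF, and the names `toy_astf`, `toy_egf`, and `make_sample` as well as the sampling interval are hypothetical.

```python
import numpy as np

def toy_astf(duration_s, dt):
    """Stand-in ASTF: Hann-shaped pulse normalised to unit area (assumption)."""
    n = max(int(round(duration_s / dt)), 3)
    pulse = np.hanning(n)
    return pulse / (pulse.sum() * dt)

def toy_egf(n_samples, dt, freq_hz=8.0, decay_s=0.3, seed=0):
    """Stand-in EGF: decaying oscillation plus weak noise (assumption)."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_samples) * dt
    wave = np.exp(-t / decay_s) * np.sin(2.0 * np.pi * freq_hz * t)
    return wave + 0.01 * rng.standard_normal(n_samples)

def make_sample(duration_s, dt=0.01, n_samples=512):
    """Return (target, egf, astf): the inputs are target and egf, the label is astf."""
    astf = toy_astf(duration_s, dt)
    egf = toy_egf(n_samples, dt)
    # Discrete approximation of the convolution integral, hence the factor dt;
    # the result is truncated to the EGF record length.
    target = np.convolve(egf, astf)[:n_samples] * dt
    return target, egf, astf

target, egf, astf = make_sample(duration_s=0.5)
```

In the dataset described above, the EGFs are recorded waveforms rather than synthetic ones, so only the ASTF and the convolution step are simulated.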

We evaluate model performance using basic ML architectures, including MLP, CNN, VGG, and U-Net. Performance metrics include the Correlation Coefficient (CC) between predicted and labeled ASTFs, and the relative error in apparent source duration. CNN emerges as the most promising candidate for further optimization. The percentages of predictions with CC > 0.9 are 87.9% (Level 1), 85.9% (Level 2), and 80.3% (Level 3). The percentages of predictions with relative duration error below 10% are 59.3%, 59.3%, and 51.2%, respectively, for the three levels.
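The two metrics can be sketched as below. The abstract does not state how apparent source duration is measured, so the definition used here (the time span over which the ASTF stays at or above 10% of its peak) is an assumption, as are the function names and the toy traces.

```python
import numpy as np

def correlation_coefficient(pred, label):
    """Pearson correlation coefficient between two equal-length traces."""
    return float(np.corrcoef(pred, label)[0, 1])

def apparent_duration(astf, dt, threshold=0.1):
    """Span where the ASTF is >= threshold * peak (assumed definition)."""
    above = np.flatnonzero(astf >= threshold * astf.max())
    return (above[-1] - above[0] + 1) * dt

def summarize(preds, labels, dt, cc_min=0.9, err_max=0.10):
    """Percentages of samples with CC > cc_min and relative duration error < err_max."""
    ccs, errs = [], []
    for pred, label in zip(preds, labels):
        ccs.append(correlation_coefficient(pred, label))
        d_label = apparent_duration(label, dt)
        errs.append(abs(apparent_duration(pred, dt) - d_label) / d_label)
    pct_cc = 100.0 * float(np.mean(np.array(ccs) > cc_min))
    pct_err = 100.0 * float(np.mean(np.array(errs) < err_max))
    return pct_cc, pct_err

# Toy traces: one perfect prediction and one with a clearly wrong duration.
dt = 0.01
label = np.pad(np.hanning(50), (0, 50))
good_pred = label.copy()
bad_pred = np.pad(np.hanning(80), (0, 20))
pct_cc, pct_err = summarize([good_pred, bad_pred], [label, label], dt)
```

Reporting the share of samples that pass each threshold, rather than a mean score, matches the way the results above are stated per test level.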

How to cite: Pang, R., Yu, H., Li, G., Meng, H., Liu, Z., Su, C., Zha, D., and Tian, W.: Dataset preparation for Resolving Apparent Source Time Functions (ASTF) and Evaluations Using Basic ML-models, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-7718, https://doi.org/10.5194/egusphere-egu25-7718, 2025.