Smart data selection – Using machine learning for an automated controlled-source Radio-Magnetotelluric data processing
- 1Helmholtz Centre Potsdam, GFZ, German Research Centre for Geosciences, Department 2: Geophysics, Potsdam, Germany (aplatz@gfz-potsdam.de)
- 2University of Potsdam, Institute of Geosciences, Potsdam, Germany
- 3Geological Survey of Finland, Espoo, Finland
The Radio-Magnetotelluric (RMT) method is a geophysical near-surface imaging technique with a broad range of possible applications. In 2020, GFZ Potsdam acquired a newly developed horizontal magnetic dipole transmitter that allows the RMT method to be applied even in regions with insufficient coverage by the radio transmitters that normally serve as the signal source. The first controlled-source RMT measurements were conducted at three locations in Chile in 2020, and further measurements were recently conducted in Ireland. As we are able to store the raw time series, we have full control over the subsequent data processing. The processing tools at GFZ include the modular processing suite EMERALD, which was originally designed for MT processing but has recently been adapted to process RMT data. One main difference is that in RMT the transmitter data are treated as signal, whereas in natural-source MT they would be regarded as electromagnetic noise that needs to be removed using automated robust statistical approaches. However, processing the entire time series in an automated manner has a major drawback: the frequencies are transmitted in a sweep, so only a small fraction of the time series contains the signal required for a particular target frequency, which leads to an unfavourable signal-to-noise ratio. Since it is technically impossible to synchronise the time bases of the data logger and the transmitter to within a few nanoseconds, an automated detection scheme is required to find the time segments that contain the transmitter signal. Field measurements usually yield several gigabytes of raw time series, which makes manual editing and supervision of the time series virtually impossible. A careful selection of appropriate time segments is nevertheless essential for successful data processing. Machine learning algorithms have high potential to solve both problems: the lack of a common time base and a data volume that rules out manual selection.
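To illustrate the segment-selection problem described above, here is a minimal sketch on synthetic data. It is not the EMERALD implementation and not a machine learning method, only a simple threshold baseline. A sinusoid at one target frequency is added to a few segments of a white-noise record, as would happen during a sweep, and each segment is scored by the fraction of its spectral power near that frequency. The sampling rate, segment length, target frequency, threshold and the helper name `band_power_fraction` are all assumptions chosen for illustration.

```python
import numpy as np

def band_power_fraction(x, fs, f_target, seg_len, rel_bw=0.05):
    """Per non-overlapping segment: share of spectral power within
    f_target * (1 +/- rel_bw). Hypothetical helper, not part of EMERALD."""
    n_seg = len(x) // seg_len
    segs = x[: n_seg * seg_len].reshape(n_seg, seg_len)
    # Hann taper limits spectral leakage of the (off-bin) target frequency.
    spec = np.abs(np.fft.rfft(segs * np.hanning(seg_len), axis=1)) ** 2
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / fs)
    band = np.abs(freqs - f_target) <= rel_bw * f_target
    return spec[:, band].sum(axis=1) / spec.sum(axis=1)

# Synthetic record: unit-variance white noise, 40 segments (assumed values).
rng = np.random.default_rng(0)
fs, seg_len, n_seg = 100_000.0, 1024, 40
f_target = 10_000.0
x = rng.normal(0.0, 1.0, n_seg * seg_len)
t = np.arange(n_seg * seg_len) / fs

# The "transmitter" emits the target frequency only during segments 10..14.
on = slice(10 * seg_len, 15 * seg_len)
x[on] += 2.0 * np.sin(2.0 * np.pi * f_target * t[on])

fraction = band_power_fraction(x, fs, f_target, seg_len)
threshold = 0.5  # assumed; real data would need a data-driven choice
flagged = np.flatnonzero(fraction > threshold)
print("segments flagged as containing the target frequency:", flagged)
```

In this toy record only 5 of 40 segments carry the target frequency, so a transfer-function estimate over the whole record would be dominated by noise-only segments. Real RMT data contain structured noise and varying signal strength, which is where a fixed threshold like this one is expected to fall short.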
Initial experience with identifying suitable time segments was gained using a recurrent neural network approach (Patzer & Weckmann, EMTF 2021 – conference contribution and personal communication). However, many questions remained open, e.g. whether other machine learning algorithms can perform better, which machine learning algorithms are in principle suited to the characteristics and properties of RMT time series, and which parameters should be used as input variables (features) for the algorithms. A large number of machine learning algorithms exist, which can be divided into groups according to their operating principle and their fields of application. We will test unsupervised methods, especially for clustering the data, to identify a set of suitable input variables. Subsequently, we will use these features to train supervised algorithms such as logistic regression, support vector machines and different kinds of neural networks to find the best-performing algorithm. We will mainly use the RMT data from Chile in the training process. Furthermore, we will test whether the trained algorithm is applicable to new data sets measured at different locations (e.g. Ireland) and/or with different equipment.
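The supervised step and the transfer test outlined above can be sketched as follows. This is a hedged toy example, not the study's workflow: it trains a plain gradient-descent logistic regression on synthetic per-segment features from one "site" and evaluates it on a second synthetic "site" with a weaker, noisier signal. The two features (a band-power fraction and a spectral peak sharpness), all numerical values and the function names are hypothetical.

```python
import numpy as np

def make_site(n, ratio_on, spread, seed):
    """Synthetic per-segment features; label 1 = transmitter signal present.
    All values are hypothetical and only mimic two differently noisy sites."""
    rng = np.random.default_rng(seed)
    y = (rng.random(n) < 0.2).astype(float)  # roughly 20 % signal segments
    ratio = np.where(y == 1, ratio_on, 0.05) + rng.normal(0.0, spread, n)
    sharpness = np.where(y == 1, 2.0, 0.5) + rng.normal(0.0, 0.5, n)
    return np.column_stack([ratio, sharpness]), y

def fit_logistic(X, y, lr=0.5, n_iter=2000):
    """Logistic regression fitted by batch gradient descent on the log-loss."""
    Xb = np.column_stack([np.ones(len(X)), X])  # prepend bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Xb @ w)))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict(w, X):
    """Label a segment 1 when the linear score is positive (p > 0.5)."""
    return (np.column_stack([np.ones(len(X)), X]) @ w > 0).astype(float)

# "Site A" is used for training, "site B" only for the transfer test.
X_a, y_a = make_site(2000, ratio_on=0.60, spread=0.05, seed=1)
X_b, y_b = make_site(1000, ratio_on=0.45, spread=0.08, seed=2)

# Standardise with training-site statistics only, as for truly unseen data.
mu, sigma = X_a.mean(axis=0), X_a.std(axis=0)
w = fit_logistic((X_a - mu) / sigma, y_a)

acc_a = np.mean(predict(w, (X_a - mu) / sigma) == y_a)
acc_b = np.mean(predict(w, (X_b - mu) / sigma) == y_b)
print(f"accuracy on training site: {acc_a:.3f}, on unseen site: {acc_b:.3f}")
```

Because signal segments are a minority, plain accuracy can look good even when many signal segments are missed, so precision and recall on the signal class would be more informative in practice. A support vector machine or a neural network could be substituted for `fit_logistic` behind the same train-on-one-site, test-on-another split.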
How to cite: Platz, A., Weckmann, U., and Patzer, C.: Smart data selection – Using machine learning for an automated controlled-source Radio-Magnetotelluric data processing, EGU General Assembly 2023, Vienna, Austria, 23–28 Apr 2023, EGU23-917, https://doi.org/10.5194/egusphere-egu23-917, 2023.