EGU26-17364, updated on 14 Mar 2026
https://doi.org/10.5194/egusphere-egu26-17364
EGU General Assembly 2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
PICO | Monday, 04 May, 16:54–16:56 (CEST)
 
PICO spot 2, PICO2.14
Filling the Gaps: Machine Learning Prediction of Sparse Mineral Phase Data
Julia Schmitz1,2, Joyce Schmatz1, Mingze Jiang1, Eva Wellmann1, Mara Weiler2, Friedrich Hawemann2, and Virginia Toy2
  • 1MaP - Microstructure and Pores GmbH, Aachen, Germany (julia@m-a-p.expert)
  • 2Institut für Geowissenschaften, Johannes Gutenberg-Universität Mainz, Mainz, Germany

Mineral phase information derived from scanning electron microscopy (SEM) combined with energy-dispersive spectroscopy (EDS) is commonly restricted to selected imaged areas, while large parts of a sample remain unmapped. The main challenge is to extrapolate mineral phase information from the locally measured EDS regions to the full sample surface, relying on backscattered electron (BSE) imaging, which can cover the entire sample because of its short acquisition times. In this study, we analyze three distinct lithologies, namely granite, marl (Muschelkalk), and sandstone (Buntsandstein), using the MaPro software (Jiang et al., 2022). MaPro applies a physics-informed decision tree to analyze EDS data in conjunction with high-resolution BSE data for each lithology. After thresholding, mineral phases are segmented from the EDS maps, generating pixel-based phase maps that serve as ground truth for subsequent predictions. Unlike the original EDS data, these phase maps allow pixel-wise phase analysis, which is essential for subsequent data processing. A random forest–based machine learning (ML) model was trained on the MaPro phase maps to predict phases across broader sample areas. The predicted phase distributions show very good agreement with the MaPro ground truth. Prediction accuracy is higher for relatively homogeneous lithologies such as sandstone and granite, and lower for a more heterogeneous sample such as the marl. Fine-grained domains produce the largest errors in the MaPro analysis and, consequently, in the ML predictions. In these areas, mineral phases with similar compositions are more difficult for the ML classifier to distinguish and therefore require more ground-truth data than compositionally distinct phases. The results enable a reliable assessment of mineral phases across the entire sandstone sample and across large areas of the granite and marl samples, achieving extensive coverage with short analytical times.
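The workflow described above (train on the EDS-mapped subset, predict over the full BSE coverage) can be illustrated with a minimal sketch. It is not the authors' implementation and does not use MaPro. The feature set (grey value, local mean, local standard deviation), the window size, the `-1` marker for unmapped pixels, the forest settings, and the synthetic three-phase image are all assumptions chosen for illustration, and the helper names `bse_features` and `predict_full_phase_map` are hypothetical. The sketch uses scikit-learn's `RandomForestClassifier` with NumPy and SciPy.

```python
import numpy as np
from scipy.ndimage import uniform_filter
from sklearn.ensemble import RandomForestClassifier


def bse_features(bse, window=5):
    """Stack per-pixel features: grey value, local mean, local std (assumed feature set)."""
    bse = bse.astype(np.float64)
    local_mean = uniform_filter(bse, size=window)
    local_sq_mean = uniform_filter(bse * bse, size=window)
    # Clip tiny negative values caused by floating-point round-off.
    local_std = np.sqrt(np.clip(local_sq_mean - local_mean**2, 0.0, None))
    return np.stack([bse, local_mean, local_std], axis=-1).reshape(-1, 3)


def predict_full_phase_map(bse, phase_labels, unmapped=-1, seed=0):
    """Train on pixels that carry a phase label, then predict a phase for every pixel."""
    X = bse_features(bse)
    y = phase_labels.ravel()
    mapped = y != unmapped
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X[mapped], y[mapped])
    return model.predict(X).reshape(bse.shape), model


# Synthetic stand-in for a BSE mosaic: three phases with distinct grey levels.
rng = np.random.default_rng(0)
true_phases = np.zeros((60, 90), dtype=int)
true_phases[:, 30:60] = 1
true_phases[:, 60:] = 2
grey_levels = np.array([60.0, 120.0, 200.0])
bse = grey_levels[true_phases] + rng.normal(0.0, 8.0, size=true_phases.shape)

# Only a narrow strip is treated as "EDS-mapped"; all other pixels are unmapped (-1).
phase_labels = np.full(true_phases.shape, -1)
phase_labels[20:30, :] = true_phases[20:30, :]

predicted, model = predict_full_phase_map(bse, phase_labels)

# Pixel-wise agreement with the known phases outside the mapped strip.
unmapped = phase_labels == -1
accuracy = (predicted[unmapped] == true_phases[unmapped]).mean()
print(f"pixel-wise accuracy on unmapped area: {accuracy:.3f}")
```

In this toy setup the phases differ strongly in grey level, so the classifier separates them easily. Phases with similar BSE response, as in the fine-grained domains mentioned above, would overlap in feature space and would be expected to need more labelled pixels.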

How to cite: Schmitz, J., Schmatz, J., Jiang, M., Wellmann, E., Weiler, M., Hawemann, F., and Toy, V.: Filling the Gaps: Machine Learning Prediction of Sparse Mineral Phase Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-17364, https://doi.org/10.5194/egusphere-egu26-17364, 2026.