EGU24-820, updated on 13 Mar 2024
https://doi.org/10.5194/egusphere-egu24-820
EGU General Assembly 2024
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Machine Learning-based Mineral Prospectivity Mapping: Exploring the Role of Negative Training Labels to Enhance Predictive Models

Nyah Bay1, Kyubo Noh1, Mohammad Parsasadr2, and Andrei Swidinsky1
  • 1Department of Earth Sciences, University of Toronto, Toronto, Canada
  • 2Geological Survey of Canada, Ottawa, Canada

Mineral Prospectivity Mapping (MPM) is an important tool for identifying areas with significant potential to host mineral deposits. Recent advances in computational science, especially the advent of Machine Learning (ML), have enhanced MPM's capabilities. ML techniques enable a higher degree of data integration and extraction than traditional statistical methods such as Weights of Evidence, improving the accuracy and efficiency of identifying mineral exploration zones. When ML techniques are used for MPM, the influence of negative training labels (i.e., barren areas with no mineralization) remains a neglected research area. This study investigates how the selection of such labels affects predictive models, with the aim of optimizing them for Canadian critical mineral exploration.

Previous approaches to ML-based MPM often assigned negative training labels at random wherever positive training labels were absent. This study aims to refine that practice with a more systematic approach to negative label selection. The evolution of MPM from traditional statistical methods to modern ML algorithms signifies a shift towards greater accuracy and efficiency. Prior research underscores the importance of balanced representation between mineralized and non-mineralized labels in ML models. Previous studies have highlighted techniques such as the Synthetic Minority Over-sampling Technique (SMOTE) and Positive and Unlabelled Learning (PUL), emphasizing the need to handle negative training labels effectively to prevent bias and enhance model performance. Whereas SMOTE balances a dataset by synthetically oversampling the minority class and PUL trains on only positive and unlabelled instances, this study leverages public exploration data to identify real negative training labels, providing a more authentic representation of non-mineralized areas without synthetic augmentation.
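The idea of drawing negative labels from explored but barren locations, rather than from anywhere a positive label is absent, can be illustrated with a minimal sketch. This is not the selection procedure used in the study: the function name `select_negatives`, the buffer distance `min_dist_km`, and all coordinates are hypothetical, and the only assumption made is that candidate sites come from exploration records reporting no mineralization.

```python
import math
import random


def select_negatives(barren_sites, positives, min_dist_km, n, seed=0):
    """Pick n negative labels from explored, barren candidate sites.

    barren_sites: (x_km, y_km) locations with exploration data and no
                  reported mineralization (hypothetical input).
    positives:    (x_km, y_km) locations of known occurrences.
    min_dist_km:  assumed buffer; candidates closer than this to any
                  known occurrence are discarded.
    """
    # Keep only candidates outside the buffer around every positive label.
    eligible = [
        site for site in barren_sites
        if all(math.dist(site, pos) >= min_dist_km for pos in positives)
    ]
    if len(eligible) < n:
        raise ValueError("not enough eligible barren sites")
    # Seeded sampling so the selection is reproducible.
    return random.Random(seed).sample(eligible, n)


# Made-up coordinates in km, for illustration only.
positives = [(0.0, 0.0), (10.0, 10.0)]
barren_sites = [(1.0, 1.0), (5.0, 5.0), (20.0, 0.0), (0.0, 20.0), (10.0, 12.0)]

negatives = select_negatives(barren_sites, positives, min_dist_km=3.0, n=2)
print(negatives)
```

In this toy example the sites at (1, 1) and (10, 12) lie within 3 km of a known occurrence and are never selected; the negatives are drawn only from the three remaining barren sites.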

Using datasets compiled by the Geological Survey of Canada containing discoveries and occurrences of magmatic Ni (±Cu ±Co ±PGE), this research incorporates geological, geochemical, and geophysical data from established sources. Public exploration data will be used to identify areas devoid of magmatic Ni (±Cu ±Co ±PGE) mineralization, and these locations will serve as the negative training labels for this study. Our ML model of choice is a Gradient Boosting Machine (GBM), and validation uses evaluation techniques such as confusion matrices and receiver operating characteristic (ROC) curves to assess model performance.
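For readers unfamiliar with the two evaluation tools named above, the following sketch computes a confusion matrix at a fixed threshold and the area under the ROC curve from predicted prospectivity scores. The labels and scores are invented stand-ins for the output of a classifier such as a GBM; they are not results from this study, and the 0.5 threshold is an assumption.

```python
def confusion_matrix(y_true, y_score, threshold=0.5):
    """Count true/false positives/negatives at a given score threshold."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if predicted == 1:
            counts["tp" if label == 1 else "fp"] += 1
        else:
            counts["tn" if label == 0 else "fn"] += 1
    return counts


def roc_auc(y_true, y_score):
    """Area under the ROC curve via the pairwise ranking identity.

    AUC equals the probability that a randomly chosen positive scores
    higher than a randomly chosen negative (ties count as one half).
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = mineralized, 0 = barren) and model scores.
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

cm = confusion_matrix(y_true, y_score)
auc = roc_auc(y_true, y_score)
print(cm)   # {'tp': 2, 'fp': 1, 'tn': 2, 'fn': 1}
print(auc)  # 8/9, approximately 0.889
```

The confusion matrix depends on the chosen threshold, whereas the ROC area summarizes ranking quality across all thresholds, which is why the two are typically reported together.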

How to cite: Bay, N., Noh, K., Parsasadr, M., and Swidinsky, A.: Machine Learning-based Mineral Prospectivity Mapping: Exploring the Role of Negative Training Labels to Enhance Predictive Models, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-820, https://doi.org/10.5194/egusphere-egu24-820, 2024.