EGU General Assembly 2023
© Author(s) 2023. This work is distributed under
the Creative Commons Attribution 4.0 License.

Unsupervised Clustering applied on Raman spectra of dispersed carbonaceous material: a case history from the Paris Basin (France) 

Andrea Schito1, Natalia Amanda Vergara Sassarini2,3, Marta Gasparrini4, Pauline Michel3, and Sveva Corrado2
  • 1Department of Geology and Geophysics, University of Aberdeen, School of Geosciences, Aberdeen, United Kingdom
  • 2Department of Sciences, University of Roma Tre, Largo San Leonardo Murialdo 1, 00146, Rome, Italy
  • 3IFP Energies nouvelles, 1-4 Avenue du Bois-Préau, 92852 Rueil-Malmaison, France
  • 4Earth Sciences Department, University of Milan, via Mangiagalli 34, 20133 Milan, Italy

In recent decades, Raman spectroscopy of dispersed carbonaceous material in rocks has become a promising tool for geothermometry and thermal maturity assessment. In the diagenetic realm, the main difficulty is linked to organofacies composition and the need for time-consuming optical classification (Henry et al., 2019; Sanders et al., 2022). In this work, three clustering methods were tested on Raman spectra as a potential approach to recognizing the three main organofacies (amorphous organic matter, translucid phytoclasts, and opaque phytoclasts) that characterize a set of 27 organic-rich samples from the Lower Toarcian source rock interval (Schistes Carton) of the Paris Basin (France).

Raman analyses were performed on concentrated organic matter obtained by acid attack, with around 60 counts for each sample. Principal Component Analysis (PCA) was applied to reduce the dimensionality of each dataset to a 2-D score plot. Unsupervised clustering was then performed with three algorithms: k-means, Gaussian Mixture Models (GMM), and Density-Based Spatial Clustering of Applications with Noise (DBSCAN). The main task of these algorithms is to correctly identify the number of clusters and their size, orientation, and distribution in the score plot, which relate to heterogeneities in organofacies composition.
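
The workflow described above can be sketched as follows. This is a minimal illustration only, not the authors' implementation: it assumes scikit-learn, about 60 spectra per sample, synthetic two-band spectra in place of measured data, selection of the number of GMM components by the Bayesian Information Criterion (BIC), and arbitrary DBSCAN parameters. The function names make_synthetic_spectra and cluster_scores are hypothetical.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler


def make_synthetic_spectra(n_per_group=20, n_channels=200, seed=0):
    """Hypothetical stand-in for measured spectra: three groups whose two
    bands (near typical D and G positions) differ in relative height."""
    rng = np.random.default_rng(seed)
    shift = np.linspace(1000.0, 1800.0, n_channels)  # Raman shift, cm^-1
    groups = []
    for d_height in (0.4, 0.7, 1.0):  # illustrative values, not measured
        d_band = d_height * np.exp(-((shift - 1350.0) / 60.0) ** 2)
        g_band = np.exp(-((shift - 1600.0) / 40.0) ** 2)
        noise = rng.normal(0.0, 0.02, size=(n_per_group, n_channels))
        groups.append(d_band + g_band + noise)
    return np.vstack(groups)


def cluster_scores(spectra, max_components=5, seed=0):
    """Project spectra onto a 2-D PCA score plot and cluster the scores."""
    scores = PCA(n_components=2, random_state=seed).fit_transform(spectra)

    # GMM: fit 1..max_components mixtures and keep the lowest-BIC model
    # (BIC is an assumed selection criterion, not stated in the abstract).
    models = [
        GaussianMixture(n_components=k, covariance_type="full",
                        n_init=5, random_state=seed).fit(scores)
        for k in range(1, max_components + 1)
    ]
    best = min(models, key=lambda m: m.bic(scores))
    gmm_labels = best.predict(scores)

    # k-means needs the number of clusters up front; reuse the GMM choice.
    km_labels = KMeans(n_clusters=best.n_components, n_init=10,
                       random_state=seed).fit_predict(scores)

    # DBSCAN on standardized scores; eps and min_samples are illustrative.
    # A label of -1 marks points treated as noise.
    scaled = StandardScaler().fit_transform(scores)
    db_labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(scaled)

    return scores, gmm_labels, km_labels, db_labels, best.n_components


if __name__ == "__main__":
    spectra = make_synthetic_spectra()
    scores, gmm_labels, km_labels, db_labels, k = cluster_scores(spectra)
    print("GMM components selected by BIC:", k)
```

Of the three, only the GMM with full covariance matrices can represent elongated or rotated clusters in the score plot, whereas k-means favours compact, roughly equal-sized groups and DBSCAN depends strongly on its density parameters.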

Results show that the best performance is achieved with GMM clustering, which can successfully determine cluster geometry and the optimal number of clusters, with an accuracy mostly higher than 80% for the translucid phytoclasts group, which is the target for thermal maturity assessment. This preliminary attempt shows promising applications for unsupervised learning techniques coupled with Raman spectroscopy, which could be used in routine industrial organic matter characterization or in the analysis of large datasets in both Earth and planetary sciences.
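
The abstract does not state how the accuracy was computed. One plausible way to score a clustering against an optical classification is sketched below, under the assumption that each spectrum also carries a reference organofacies label. Each cluster is assigned the majority reference label of its members, and the accuracy for a target group is the fraction of its spectra that land in clusters assigned to that group. The function name group_accuracy and the example labels are hypothetical.

```python
from collections import Counter


def group_accuracy(cluster_labels, reference_labels, target):
    """Fraction of reference spectra of class `target` that fall in clusters
    whose majority reference label is `target` (an assumed metric)."""
    # Collect the reference labels of the members of each cluster.
    members = {}
    for cluster, ref in zip(cluster_labels, reference_labels):
        members.setdefault(cluster, []).append(ref)

    # Assign each cluster the most common reference label among its members.
    majority = {c: Counter(refs).most_common(1)[0][0]
                for c, refs in members.items()}

    total = sum(1 for ref in reference_labels if ref == target)
    hits = sum(1 for cluster, ref in zip(cluster_labels, reference_labels)
               if ref == target and majority[cluster] == target)
    return hits / total if total else float("nan")


# Made-up labels for ten spectra, for illustration only.
clusters = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
reference = ["AOM", "AOM", "AOM", "translucid", "translucid",
             "translucid", "translucid", "opaque", "opaque", "translucid"]
print(group_accuracy(clusters, reference, "translucid"))  # 3 of 5 -> 0.6
```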


Henry, D. G., Jarvis, I., Gillmore, G., & Stephenson, M., 2019. Raman spectroscopy as a tool to determine the thermal maturity of organic matter: Application to sedimentary, metamorphic and structural geology. Earth-Science Reviews 198, 102936.

Sanders, M. M., Jubb, A. M., Hackley, P. C., & Peters, K. E., 2022. Molecular mechanisms of solid bitumen and vitrinite reflectance suppression explored using hydrous pyrolysis of artificial source rock. Organic Geochemistry 165, 104371.

How to cite: Schito, A., Vergara Sassarini, N. A., Gasparrini, M., Michel, P., and Corrado, S.: Unsupervised Clustering applied on Raman spectra of dispersed carbonaceous material: a case history from the Paris Basin (France), EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-14683, 2023.