EGU24-11107, updated on 08 Mar 2024
https://doi.org/10.5194/egusphere-egu24-11107
EGU General Assembly 2024
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

A new machine-learning model to partition soil organic carbon into its centennially stable and active fractions based on Rock-Eval® thermal analysis

Marija Stojanova1, Pierre Barré1, Hugues Clivot2, Lauric Cécillon3, François Baudin4, Thomas Kätterer5, Bent T. Christensen6, Claire Chenu7, Ines Merbach8, Adrián Andriulo9, Sabine Houot10, and Fabien Ferchaud10
  • 1ENS PSL, Geology Department, France
  • 2Université de Reims Champagne-Ardenne, INRAE, FARE, UMR A 614, Reims, France
  • 3Research and innovation office, Ministry of agriculture and food sovereignty, Paris, France
  • 4Sorbonne Université, Institut des sciences de la Terre de Paris (ISTeP)
  • 5Swedish University of Agricultural Sciences
  • 6Aarhus University, Department of Agroecology - Soil Fertility
  • 7INRAE French National Institute for Agricultural Research and AgroParisTech
  • 8Helmholtz-Centre for Environmental Research
  • 9Instituto Nacional de Tecnología Agropecuaria, Buenos Aires
  • 10INRAE French National Institute for Agricultural Research

Quantifying the biogeochemical stability of soil organic carbon (SOC) is important for assessing soil health and the capacity of soils to store carbon. Models that simulate SOC stock evolution divide SOC into kinetic pools with contrasting residence times, and the initialization of these pool sizes is a major source of uncertainty in SOC simulations. In a previous study, Cécillon et al. (2021) developed a machine-learning model (PARTYsoc v2), trained on samples from long-term bare fallow sites, that uses Rock-Eval® thermal analysis results as input variables to quantify the proportions of the centennially stable and active SOC fractions. The outputs of PARTYsoc v2 have been shown to be particularly effective for initializing the AMG model, enabling very accurate simulations of SOC stock evolution at a dozen French sites (Kanari et al., 2022). The objective of the present work is to build a new version of PARTYsoc, validated on a larger sample set, and to extend the usefulness of the AMG model initialized with PARTYsoc to other parts of the world.

To do so, we first identified sites with known crop yields and SOC stock evolutions, and with archived samples available for Rock-Eval® characterization. We then determined, for each site, the stable SOC stock value that gives the most accurate simulation of SOC stock evolution with the AMG model. This optimal stable SOC stock allowed us to quantify the stable SOC proportion for all samples from the selected sites. Finally, we developed PARTYsoc v3, which uses Rock-Eval® measurements as input variables to predict stable SOC proportions in the sense of the AMG model.
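
The per-site optimization described above can be illustrated with a minimal sketch. It assumes a simplified two-pool model (an inert stable pool plus an active pool with first-order decay and constant carbon input), which is only in the spirit of AMG and is not the AMG implementation used in this work. The function names, the grid search, and all numerical values are hypothetical.

```python
import math

def simulate_soc(initial_soc, stable_fraction, annual_input, k, years):
    """Total SOC stock over time for a simplified two-pool model.

    Assumption: the stable pool is inert; the active pool decays at rate k
    (per year) and receives a constant humified carbon input each year.
    """
    stable = initial_soc * stable_fraction
    active = initial_soc - stable
    steady = annual_input / k  # equilibrium size of the active pool
    return [stable + steady + (active - steady) * math.exp(-k * t) for t in years]

def rmse(observed, simulated):
    """Root-mean-square error between two equally long sequences."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)

def best_stable_fraction(observed, years, annual_input, k, step=0.01):
    """Grid-search the stable fraction in [0, 1] that minimizes the RMSE.

    Assumption: years[0] == 0, so observed[0] is the initial SOC stock.
    """
    candidates = [i * step for i in range(int(round(1 / step)) + 1)]
    return min(
        candidates,
        key=lambda f: rmse(observed, simulate_soc(observed[0], f, annual_input, k, years)),
    )

# Hypothetical site: synthetic "observations" generated with a stable fraction of 0.6.
years = [0, 10, 20, 30, 40]
observed = simulate_soc(50.0, 0.6, annual_input=1.0, k=0.1, years=years)
print(best_stable_fraction(observed, years, annual_input=1.0, k=0.1))
```

On real data the observations are noisy, so the recovered fraction would be the best compromise over all measurement dates rather than an exact match.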

PARTYsoc v3 differs significantly from PARTYsoc v2. In v3, the target variable, i.e., the centennially stable SOC proportion, is the value found to be optimal for the AMG model, whereas in v2 it was calculated from SOC declines at bare fallow sites. Moreover, the current v3 model uses Support Vector Machine (SVM) regression coupled with a Beta Regression instead of Random Forest. This combination of machine-learning models allows for a non-linear relationship between the target and the features, and its predictions are always bounded in the [0, 1] interval. The data set has also been extended to a larger number of sites (6 sites in v2, 12 sites in v3), including both bare fallows and other types of long-term experiments.
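
The bounding property can be illustrated with a minimal sketch. Beta regression is commonly formulated with a logit link, so that any unbounded score maps back into the unit interval. The sketch below assumes that link and, for brevity, replaces the maximum-likelihood beta fit with ordinary least squares on logit-transformed proportions for a single feature. It does not reproduce the SVM coupling used in PARTYsoc v3, and all names and data are hypothetical.

```python
import math

def logit(p, eps=1e-6):
    """Map a proportion in [0, 1] to the real line, clipping exact 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))

def inv_logit(z):
    """Numerically stable inverse logit; the result always lies in [0, 1]."""
    if z >= 0:
        return 1 / (1 + math.exp(-z))
    e = math.exp(z)
    return e / (1 + e)

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one feature."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_bounded(xs, proportions):
    """Fit on the logit scale and return a predictor bounded in [0, 1]."""
    slope, intercept = fit_line(xs, [logit(p) for p in proportions])
    return lambda x: inv_logit(slope * x + intercept)

# Hypothetical feature values and stable SOC proportions.
xs = [0.1, 0.3, 0.5, 0.7, 0.9]
proportions = [0.35, 0.45, 0.55, 0.70, 0.80]
predict = fit_bounded(xs, proportions)

# Even far outside the training range, predictions stay inside [0, 1].
for x in (-1000.0, 0.5, 1000.0):
    print(x, predict(x))
```

A plain linear or SVM regressor fitted directly on the proportions could return values below 0 or above 1 when extrapolating, which is what the link function prevents.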

The Rock-Eval® features are selected by first removing highly correlated features (Spearman correlation > 0.9) and then ranking the remaining ones by their predictive importance when randomly permuted. This procedure reduces overfitting to the training data. The final model uses 7 Rock-Eval® features (18 in v2). We obtain satisfactory performance in both internal validation (R² = 0.82, RMSE = 0.07) and Leave-One-Site-Out (LOSO) validation (R² = 0.76, RMSE = 0.09).
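
Two of these steps, the correlation filter and the LOSO split, can be sketched as follows. This is a minimal illustration under stated assumptions: the filter is greedy and order-dependent, it compares the absolute Spearman correlation with the threshold, and the feature names, site labels, and values are hypothetical. The permutation-importance ranking is not shown.

```python
import math

def ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def pearson(a, b):
    """Pearson correlation; assumes neither input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def spearman(a, b):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))

def drop_correlated(features, threshold=0.9):
    """Keep a feature only if |rho| <= threshold with every feature kept so far."""
    kept = []
    for name, values in features.items():
        if all(abs(spearman(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

def leave_one_site_out(sites):
    """Yield (held-out site, training indices, test indices) for each site."""
    for held_out in sorted(set(sites)):
        train = [i for i, s in enumerate(sites) if s != held_out]
        test = [i for i, s in enumerate(sites) if s == held_out]
        yield held_out, train, test

# Hypothetical features: f2 is a monotonic function of f1 and is dropped.
features = {"f1": [1, 2, 3, 4, 5], "f2": [2, 4, 6, 8, 11], "f3": [5, 1, 4, 2, 3]}
print(drop_correlated(features))

for site, train, test in leave_one_site_out(["A", "A", "B", "C", "B"]):
    print(site, train, test)
```

Holding out whole sites, rather than random samples, keeps samples from the same experiment out of both the training and test sets at once, which is why the LOSO scores are the more conservative estimate of performance at a new site.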

The proposed model builds upon and significantly improves the work laid out by PARTYsoc v2. We are currently working on further extending the data set and on stabilizing the feature selection process and the model parameters.

How to cite: Stojanova, M., Barré, P., Clivot, H., Cécillon, L., Baudin, F., Kätterer, T., Christensen, B. T., Chenu, C., Merbach, I., Andriulo, A., Houot, S., and Ferchaud, F.: A new machine-learning model to partition soil organic carbon into its centennially stable and active fractions based on Rock-Eval® thermal analysis, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-11107, https://doi.org/10.5194/egusphere-egu24-11107, 2024.