EGU23-1500, updated on 22 Feb 2023
https://doi.org/10.5194/egusphere-egu23-1500
EGU General Assembly 2023
© Author(s) 2023. This work is distributed under
the Creative Commons Attribution 4.0 License.

Ensemble machine learning improves pedotransfer functions for predicting soil mineral associated organic carbon

Yi Xiao (1,2), Jie Xue (1), Xianglin Zhang (1), Nan Wang (1), Emanuele Lugato (3), Dominique Arrouays (4), Zhou Shi (1), and Songchao Chen (1,2)
  • (1) Institute of Applied Remote Sensing and Information Technology, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou 310058, China
  • (2) ZJU-Hangzhou Global Scientific and Technological Innovation Center, Hangzhou 311200, China
  • (3) European Commission, Joint Research Centre (JRC), Ispra, Italy
  • (4) INRAE, Unité InfoSol, Orléans 45075, France

Soil organic carbon (SOC) sequestration is a promising natural climate solution for capturing atmospheric CO2, and it also provides crucial co-benefits by improving soil functions and services. Because SOC is not a single, uniform entity, further knowledge of SOC fractions with distinctive properties, such as particulate organic carbon (POC) and mineral associated organic carbon (MAOC), is needed to fully understand how SOC responds to environmental change. Despite their importance, POC and MAOC data remain scarce in soil databases, especially at large scales. The pedotransfer function (PTF) is a useful method for estimating missing soil parameters, but its application to SOC fractions has received little attention. We assessed the potential of predicting MAOC with PTFs built from three machine learning models (random forest (RF), Cubist, and gradient boosted machine (GBM)) combined with two predictor selection methods (recursive feature elimination (RFE) and forward recursive feature selection (FRFS)), using 352 representative mineral topsoil samples (0-20 cm) from across Europe. Validation repeated 100 times showed that machine learning-based PTFs can predict MAOC accurately. RFE reduced the number of predictors from 21 to 12 while performing comparably to models using all predictors. With only 6 predictors (SOC, silt + clay, nitrogen, nitrogen deposition, soil erosion, and sand), the proposed FRFS algorithm outperformed RFE and gave the most parsimonious model. Of the three machine learning models, Cubist performed best when combined with FRFS. Our results also showed that, compared with a single machine learning model, five model ensemble approaches can increase model accuracy and robustness. This study offers a valuable reference for coupling PTFs with legacy soil databases in order to improve the spatial coverage and effectiveness of machine learning-based predictions of SOC fractions.
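The workflow described above (predictor selection, several machine learning models, repeated validation, and model ensembling) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: the data are synthetic (generated with scikit-learn's `make_regression` to mimic 352 samples and 21 predictors), only RF and GBM are fitted because scikit-learn has no Cubist model, RFE stands in for predictor selection (FRFS is not reproduced), and the ensemble is a plain average of predictions, which may differ from the five ensemble approaches evaluated in the study. The function name `repeated_validation`, the 70/30 split, and all hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.feature_selection import RFE
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split


def repeated_validation(n_repeats=3, n_keep=12, seed=0):
    """Return the mean test R^2 of RF, GBM, and their average over repeated splits."""
    # Synthetic stand-in for the soil data set (assumption: NOT real MAOC data).
    X, y = make_regression(n_samples=352, n_features=21, n_informative=6,
                           noise=10.0, random_state=seed)
    scores = {"rf": [], "gbm": [], "ensemble": []}

    for rep in range(n_repeats):
        # A fresh random split per repetition (the 70/30 ratio is an assumption).
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.3, random_state=seed + rep)

        # Recursive feature elimination, fitted on the training data only
        # so that the held-out samples do not influence predictor selection.
        selector = RFE(RandomForestRegressor(n_estimators=50, random_state=rep),
                       n_features_to_select=n_keep, step=3)
        selector.fit(X_tr, y_tr)
        X_tr_sel = selector.transform(X_tr)
        X_te_sel = selector.transform(X_te)

        # Two base learners trained on the selected predictors.
        rf = RandomForestRegressor(n_estimators=100, random_state=rep)
        gbm = GradientBoostingRegressor(random_state=rep)
        pred_rf = rf.fit(X_tr_sel, y_tr).predict(X_te_sel)
        pred_gbm = gbm.fit(X_tr_sel, y_tr).predict(X_te_sel)

        # Simplest possible ensemble: unweighted mean of the two predictions.
        pred_ens = (pred_rf + pred_gbm) / 2.0

        scores["rf"].append(r2_score(y_te, pred_rf))
        scores["gbm"].append(r2_score(y_te, pred_gbm))
        scores["ensemble"].append(r2_score(y_te, pred_ens))

    return {name: float(np.mean(vals)) for name, vals in scores.items()}


if __name__ == "__main__":
    for name, r2 in repeated_validation().items():
        print(f"{name}: mean R^2 = {r2:.3f}")
```

Because squared error is convex, the averaged prediction can never score a lower R^2 on a given test set than the weaker of the two base models. This gives some intuition for why ensembling tends to improve robustness, although the size of any gain depends on the data and the models combined.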

How to cite: Xiao, Y., Xue, J., Zhang, X., Wang, N., Lugato, E., Arrouays, D., Shi, Z., and Chen, S.: Ensemble machine learning improves pedotransfer functions for predicting soil mineral associated organic carbon, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-1500, https://doi.org/10.5194/egusphere-egu23-1500, 2023.
