- 1 University of Cambridge, Civil Engineering, United Kingdom
- 2 University of Cambridge, Computer Science, United Kingdom
Robust carbon monitoring is fundamental to the credibility of climate mitigation strategies, including carbon markets, nature-based solutions, and ecosystem restoration initiatives. Soil organic carbon (SOC), as a major and dynamic component of the carbon cycle, is traditionally quantified through soil sampling and laboratory analyses. Although accurate at local scales, these methods are costly, time-consuming, and spatially sparse, which limits their suitability for large-scale monitoring and underscores the need for scalable and robust alternatives.
Recent advances in machine learning (ML), and particularly deep learning (DL), offer substantial potential to integrate heterogeneous data streams and reinforce the scientific basis of carbon accounting. However, the application of DL to soil carbon studies remains limited, with most existing work confined to small spatial domains and relatively modest datasets. This limitation reflects the intrinsic complexity of environmental systems, the scarcity of high-quality reference observations, and persistent challenges in multimodal data integration and model interpretability.
Using the pan-European Land Use/Cover Area Frame Survey (LUCAS) soil dataset, this study presents a multimodal deep learning framework for large-scale prediction of SOC stocks. In addition to SOC, the framework estimates texture-related proxies and ancillary soil attributes relevant to carbon stock assessment. The approach integrates a comprehensive suite of data sources, including multispectral Sentinel-2 imagery, climate time series variables, and land-cover information, to jointly exploit spectral and spatio-temporal dependencies.
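To illustrate why ancillary attributes matter for stock assessment, the sketch below applies the commonly used fixed-depth convention for converting an SOC concentration into a stock per unit area. The abstract does not state which formulation the authors use, so the function name, the units, and the example values are illustrative assumptions only.

```python
def soc_stock_t_per_ha(oc_g_per_kg, bulk_density_g_cm3, depth_cm, coarse_fraction=0.0):
    """Fixed-depth SOC stock in tonnes of carbon per hectare.

    Assumed inputs (hypothetical naming, not taken from the abstract):
      oc_g_per_kg        organic carbon content of the fine earth, g C per kg soil
      bulk_density_g_cm3 dry bulk density, g per cm^3
      depth_cm           thickness of the sampled layer, cm
      coarse_fraction    volumetric share of coarse fragments, 0 <= value < 1
    """
    if not 0.0 <= coarse_fraction < 1.0:
        raise ValueError("coarse_fraction must lie in [0, 1)")
    # g/kg -> percent is a division by 10; percent * g/cm^3 * cm yields t/ha directly.
    return oc_g_per_kg * bulk_density_g_cm3 * depth_cm * (1.0 - coarse_fraction) / 10.0


# Made-up topsoil sample: 20 g/kg carbon, 1.3 g/cm^3, 20 cm layer, 10 % coarse fragments.
example_stock = soc_stock_t_per_ha(20.0, 1.3, 20.0, 0.1)
print(f"SOC stock: {example_stock:.1f} t C/ha")
```

Under this convention, an error in bulk density or coarse-fragment content propagates proportionally into the stock estimate, which is one reason to predict such attributes alongside SOC itself.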
The proposed architecture integrates modality-specific components tailored to each data type, enabling a coherent spatio-temporal representation of SOC dynamics. Convolutional neural networks (CNNs) are used to extract spatial patterns and vegetation–soil spectral signatures from multispectral imagery, while recurrent architectures, including long short-term memory (LSTM) networks, encode seasonal to interannual variability driven by climatic conditions. Multiple deep learning encoders are systematically compared, ranging from conventional CNN–LSTM architectures to state-of-the-art transformer and vision transformer models, in order to assess their ability to capture long-range dependencies, cross-modal interactions, and complex non-linear relationships underlying SOC distribution.
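The modality-specific design described above can be sketched as a late-fusion forward pass. The NumPy code below is a minimal, untrained illustration and not the authors' architecture: the layer sizes, the single convolution layer, the one-hot land-cover encoding, the concatenation-based fusion, and all variable names are assumptions chosen for brevity.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cnn_encode(patch, kernels):
    """One 3x3 convolution + ReLU + global average pooling.
    patch: (bands, H, W); kernels: (filters, bands, 3, 3) -> (filters,)"""
    windows = sliding_window_view(patch, (3, 3), axis=(1, 2))  # (bands, H-2, W-2, 3, 3)
    feature_maps = np.einsum("chwij,kcij->khw", windows, kernels)
    return np.maximum(feature_maps, 0.0).mean(axis=(1, 2))

def lstm_encode(series, weights, bias):
    """Single-layer LSTM over a time series; returns the final hidden state.
    series: (T, D); weights: (4 * hidden, D + hidden); bias: (4 * hidden,)"""
    hidden = bias.size // 4
    h, c = np.zeros(hidden), np.zeros(hidden)
    for x_t in series:
        z = weights @ np.concatenate([x_t, h]) + bias
        i, f, g, o = np.split(z, 4)  # input, forget, candidate, output gates
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

def one_hot(index, n_classes):
    vec = np.zeros(n_classes)
    vec[index] = 1.0
    return vec

# Hypothetical sizes: 10 spectral bands, 16x16 pixel patch, 24 monthly steps of 4 climate variables.
N_BANDS, PATCH, N_FILTERS = 10, 16, 8
N_STEPS, N_CLIMATE, HIDDEN, N_LANDCOVER = 24, 4, 6, 5

# Random, untrained parameters: the output is a shape check, not a real prediction.
kernels = rng.normal(scale=0.1, size=(N_FILTERS, N_BANDS, 3, 3))
lstm_w = rng.normal(scale=0.1, size=(4 * HIDDEN, N_CLIMATE + HIDDEN))
lstm_b = np.zeros(4 * HIDDEN)
head_w = rng.normal(scale=0.1, size=N_FILTERS + HIDDEN + N_LANDCOVER)

# Synthetic inputs standing in for one sampling location.
patch = rng.random((N_BANDS, PATCH, PATCH))
climate = rng.normal(size=(N_STEPS, N_CLIMATE))
landcover = one_hot(2, N_LANDCOVER)

image_features = cnn_encode(patch, kernels)
climate_features = lstm_encode(climate, lstm_w, lstm_b)
fused = np.concatenate([image_features, climate_features, landcover])
soc_pred = float(head_w @ fused)  # linear regression head on the fused representation
print(fused.shape, soc_pred)
```

Concatenation is only the simplest fusion choice. The transformer-based encoders mentioned above would typically replace it with attention across modalities, at the cost of more parameters and more training data.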
A comparative analysis further benchmarks the proposed deep learning framework against widely used machine learning methods in soil science, including Random Forest (RF), Extreme Gradient Boosting (XGB), and Multiple Linear Regression (MLR). Model performance is assessed not only in terms of predictive accuracy, but also with respect to implementation complexity and interpretability, highlighting practical trade-offs for operational deployment.
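As a rough illustration of how one baseline could be scored, the sketch below fits a multiple linear regression by ordinary least squares and reports RMSE and R² on spatially held-out blocks. The data are synthetic, and the block size, the hold-out rule, and the choice of metrics are assumptions rather than the evaluation protocol of the study. RF or XGB models would be scored through the same split and the same metric functions.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def spatial_block_mask(lon, lat, block_deg=5.0, test_every=5):
    """Group samples into lon/lat blocks and mark every n-th block as test data."""
    bx = np.floor(lon / block_deg).astype(int)
    by = np.floor(lat / block_deg).astype(int)
    _, block_index = np.unique(bx * 10_000 + by, return_inverse=True)
    return block_index % test_every == 0

# Synthetic stand-in for a point dataset: coordinates, three covariates, noisy linear target.
rng = np.random.default_rng(0)
n = 400
lon = rng.uniform(-10.0, 30.0, n)
lat = rng.uniform(35.0, 70.0, n)
X = rng.normal(size=(n, 3))
y = 30.0 + 5.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=1.0, size=n)

test_mask = spatial_block_mask(lon, lat)
coef = fit_mlr(X[~test_mask], y[~test_mask])
y_hat = predict_mlr(coef, X[test_mask])

test_rmse = rmse(y[test_mask], y_hat)
test_r2 = r2(y[test_mask], y_hat)
print(f"MLR on held-out blocks: RMSE={test_rmse:.2f}, R2={test_r2:.2f}")
```

Holding out whole blocks rather than random points gives a more conservative estimate when nearby samples are spatially autocorrelated, which bears directly on the question of spatial transferability.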
By integrating heterogeneous data sources, this work demonstrates how artificial intelligence can bridge the gap between point-based field measurements and policy-relevant carbon assessments, while supporting state-of-the-art monitoring, reporting, and verification (MRV) frameworks. This analysis contributes to ongoing efforts to develop transparent, scalable, and evidence-based carbon monitoring tools, while explicitly highlighting persistent challenges related to data bias, spatial transferability, and model interpretability.
How to cite: Mercier, M., Marinoni, A., and Selvakumaran, S.: Multimodal Machine and Deep Learning Frameworks for Soil Organic Carbon Monitoring, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-13805, https://doi.org/10.5194/egusphere-egu26-13805, 2026.