- (1) School of Agriculture and Biotechnology, Sun Yat-sen University, China (zhouw85@sysu.edu.cn)
- (2) Agroecosystem Sustainability Center, Institute for Sustainability, Energy, and Environment, University of Illinois at Urbana-Champaign, USA
- (3) Department of Bioproducts and Biosystems Engineering, University of Minnesota, USA
- (4) Department of Natural Resources and Environmental Sciences, College of Agricultural, Consumer and Environmental Sciences, University of Illinois at Urbana-Champaign, USA
- (5) Department of Crop Sciences, College of Agricultural, Consumer and Environmental Sciences, University of Illinois at Urbana-Champaign, USA
- (6) Land-CRAFT, Department of Agroecology, Aarhus University, Denmark
Quantifying carbon outcomes from agroecosystems plays an important role in mitigating global warming and ensuring food security through sustainable production. However, accurate carbon budgets and crop yields at high spatiotemporal resolution (e.g., ~100 m, daily) are extremely challenging to quantify in agroecosystems because of the complexity of the processes involved and the large variations in environmental and management drivers. Traditional process-based modeling approaches are computationally expensive at field-scale resolution and carry large uncertainty because model structure and parameters are underdetermined. Knowledge-guided machine learning (KGML) is a hybrid modeling approach that combines recent advances in machine learning with known physical principles and relationships to improve both training and application. This helps open the "black box" of conventional ML models and enables predictions that better capture variability in both time and space. Here we propose a data-efficient KGML framework that predicts daily variations in agricultural CO2 emissions, crop yields, and soil carbon storage at field scale, demonstrated for the US Midwest. Multi-source data and pretraining with outputs from a well-validated agroecosystem model were incorporated into a hierarchically structured deep neural network that outperformed both process-based and pure machine learning models, especially in data-limited cases. This work demonstrates the advantages of integrating domain knowledge with state-of-the-art artificial intelligence in agroecosystem modeling and points toward broader use of KGML in geoscience.
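The pretrain-then-fine-tune idea described above can be illustrated with a toy sketch. This is not the framework or network from the abstract: it assumes a one-variable linear model, a hypothetical biased `process_model` standing in for the agroecosystem model, and a hypothetical `truth` function standing in for field observations. All coefficients, sample sizes, and learning rates are made up for illustration. The sketch only shows why starting from process-model knowledge may help when observations are scarce and the training budget is small.

```python
import random

def fit(xs, ys, w=0.0, b=0.0, lr=0.1, epochs=30):
    """Fit y ~ w*x + b by full-batch gradient descent on mean squared error."""
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def mse(w, b, xs, ys):
    """Mean squared error of the linear model on (xs, ys)."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def truth(x):
    # Hypothetical "real" response of a flux to a driver (illustration only).
    return 2.0 * x + 1.0

def process_model(x):
    # Hypothetical process-based model: right shape, biased coefficients.
    return 1.8 * x + 1.3

random.seed(0)

# Stage 1: pretrain on abundant synthetic data from the process model.
x_syn = [random.uniform(0.0, 1.0) for _ in range(200)]
y_syn = [process_model(x) for x in x_syn]
w_pre, b_pre = fit(x_syn, y_syn, epochs=2000)

# Stage 2: fine-tune on three noisy observations with a small budget.
x_obs = [0.1, 0.5, 0.9]
y_obs = [truth(x) + random.gauss(0.0, 0.02) for x in x_obs]
w_ft, b_ft = fit(x_obs, y_obs, w=w_pre, b=b_pre, epochs=30)

# Baseline: same observations and budget, but starting from scratch.
w_sc, b_sc = fit(x_obs, y_obs, epochs=30)

# Compare both models against the noise-free truth on a held-out grid.
x_test = [i / 20 for i in range(21)]
y_test = [truth(x) for x in x_test]
mse_ft = mse(w_ft, b_ft, x_test, y_test)
mse_sc = mse(w_sc, b_sc, x_test, y_test)
print(f"fine-tuned MSE: {mse_ft:.4f}")
print(f"from-scratch MSE: {mse_sc:.4f}")
```

In this toy setting the pretrained start is already close to the truth, so a short fine-tuning run only has to correct the process model's bias, whereas the from-scratch baseline has to learn the whole relationship from three points in the same number of steps.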
How to cite: Zhou, W., Liu, L., Guan, K., Jin, Z., Peng, B., and Wang, S.: Scalable quantification of agroecosystem carbon budget and crop yield based on knowledge-guided machine learning, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-7574, https://doi.org/10.5194/egusphere-egu25-7574, 2025.