Scalable quantification of agroecosystem carbon budget and crop yield based on knowledge-guided machine learning

Wang Zhou; Licheng Liu; Kaiyu Guan; Zhenong Jin; Bin Peng; Sheng Wang

doi:https://doi.org/10.5194/egusphere-egu25-7574

EGU25-7574, updated on 14 Mar 2025

EGU General Assembly 2025

© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.

Oral | Tuesday, 29 Apr, 11:10–11:20 (CEST)

Room 2.17

Wang Zhou¹,², Licheng Liu³, Kaiyu Guan²,⁴, Zhenong Jin³, Bin Peng²,⁵, and Sheng Wang²,⁶

¹School of Agriculture and Biotechnology, Sun Yat-sen University, China (zhouw85@sysu.edu.cn)
²Agroecosystem Sustainability Center, Institute for Sustainability, Energy, and Environment, University of Illinois at Urbana-Champaign, USA
³Department of Bioproducts and Biosystems Engineering, University of Minnesota, USA
⁴Department of Natural Resources and Environmental Sciences, College of Agricultural, Consumer and Environmental Sciences, University of Illinois at Urbana-Champaign, USA
⁵Department of Crop Sciences, College of Agricultural, Consumer and Environmental Sciences, University of Illinois at Urbana-Champaign, USA
⁶Land-CRAFT, Department of Agroecology, Aarhus University, Denmark

Quantifying carbon outcomes from agroecosystems is important for mitigating global warming and for ensuring food security through sustainable production. However, accurate carbon budgets and crop yields at high spatial and temporal resolution (e.g., ~100 m, daily) are extremely challenging to quantify because of the complexity of the processes involved and the large variations in environmental and management drivers. Traditional process-based models are computationally expensive to run at field-scale resolution and carry large uncertainty because their structure and parameters are underdetermined. Knowledge-guided machine learning (KGML) is a hybrid modeling approach that combines recent advances in machine learning with known physical principles and relationships to improve both training and application. This helps open the “black box” of conventional ML models and enables better predictions that capture variability in both time and space. Here we propose a data-efficient KGML framework that predicts daily variations in agricultural CO₂ emissions, crop yields, and soil carbon storage at field scale, as demonstrated for the US Midwest. Multi-source data and pretraining with outputs from a well-validated agroecosystem model were incorporated into a hierarchically structured deep neural network that greatly outperformed both process-based and pure machine learning models, especially in data-limited cases. This work demonstrates the advantages of integrating domain knowledge with state-of-the-art artificial intelligence in agroecosystem modeling and points toward broader use of KGML in geoscience.
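
The abstract does not specify how physical knowledge enters the training objective. As an illustration only, one common way to guide a model with a known relationship is to add a penalty for violating it. The minimal sketch below assumes a model that predicts gross primary production (GPP), ecosystem respiration (Reco), and net ecosystem exchange (NEE), and penalises departures from the widely used balance NEE = Reco − GPP. The function name `kgml_loss`, the weight `lam`, and the array layout are hypothetical and are not taken from the authors' framework.

```python
import numpy as np


def kgml_loss(pred, obs, obs_mask, lam=1.0):
    """Illustrative knowledge-guided loss (an assumption, not the authors' loss).

    pred     : (n, 3) array with columns [GPP, Reco, NEE]
    obs      : (n, 3) array of observations; may hold NaN where unobserved
    obs_mask : (n, 3) boolean array, True where an observation exists
    lam      : hypothetical weight on the carbon-balance penalty
    """
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    obs_mask = np.asarray(obs_mask, dtype=bool)

    # Data term: mean squared error over observed entries only,
    # so sparse measurements can still be used.
    resid = np.where(obs_mask, pred - obs, 0.0)
    n_obs = max(int(obs_mask.sum()), 1)
    data_term = float((resid ** 2).sum() / n_obs)

    # Knowledge term: mean squared violation of NEE = Reco - GPP,
    # evaluated on every sample whether or not it was observed.
    gpp, reco, nee = pred[:, 0], pred[:, 1], pred[:, 2]
    balance_term = float(np.mean((nee - (reco - gpp)) ** 2))

    return data_term + lam * balance_term


# Usage: one sample where only NEE was measured.
pred = np.array([[10.0, 4.0, -5.0]])           # predicted NEE breaks the balance by 1
obs = np.array([[np.nan, np.nan, -6.0]])       # observed NEE
mask = np.array([[False, False, True]])
print(kgml_loss(pred, obs, mask, lam=1.0))     # prints 2.0
```

Because the balance penalty applies to every sample, it constrains predictions even where no observation is available, which is one reason such terms are often discussed for data-limited settings.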

How to cite: Zhou, W., Liu, L., Guan, K., Jin, Z., Peng, B., and Wang, S.: Scalable quantification of agroecosystem carbon budget and crop yield based on knowledge-guided machine learning, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-7574, https://doi.org/10.5194/egusphere-egu25-7574, 2025.