Modelling Maize Yield and Agronomic Efficiency Using Machine Learning Models: A Comparative Analysis

Eric Asamoah; Gerard Heuvelink; Ikram Chairi; Prem Bindraban; Vincent Logah

doi:https://doi.org/10.5194/egusphere-egu25-9987

Session ITS1.4/CL0.10

EGU25-9987, updated on 15 Mar 2025

EGU General Assembly 2025

© Author(s) 2025. This work is distributed under the Creative Commons Attribution 4.0 License.

Poster | Wednesday, 30 Apr, 16:15–18:00 (CEST), Display time Wednesday, 30 Apr, 14:00–18:00

Hall X5, X5.133

Eric Asamoah¹, Gerard Heuvelink¹, Ikram Chairi², Prem Bindraban³, and Vincent Logah⁴

¹Wageningen University and Research, Environmental Science, Soil Geography and Landscape, Wageningen, Netherlands (eric.asamoah@wur.nl)
²College of Computing, Mohammed VI Polytechnic University, Lot 660, Hay Moulay Rachid, Benguerir 43150, Morocco
³International Fertilizer Development Center, Muscle Shoals, AL, 35662, USA
⁴Department of Crop and Soil Sciences, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana

Background: Agriculture is increasingly leveraging machine learning (ML) to enhance yield predictions and optimize agronomic practices. Maize, a staple crop in Ghana, offers a valuable case study for evaluating the effectiveness of diverse ML models in yield prediction and resource management.

Objective: This study evaluates the predictive performance of four ML models, namely Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbours (KNN), and Extreme Gradient Boosting (XGBoost), for predicting maize yield and agronomic efficiency. It also compares variable importance across these models to identify the key explanatory variables.

Methods: The study used 4,496 georeferenced maize trial datasets from various agroecological zones in Ghana. The thirty-five explanatory variables covered soil properties, climate, topography, crop management practices, and fertilizer application. Model performance was evaluated with leave-one-out, leave-site-out, and leave-agroecological-zone-out cross-validation. Mean Error (ME), Root Mean Squared Error (RMSE), and Model Efficiency Coefficient (MEC) were used to compare model accuracy, and a permutation-based approach was used to assess variable importance.
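The evaluation scheme described in the Methods can be illustrated with a minimal sketch. It is not the authors' code. It assumes ME is defined as the mean of predicted minus observed values and MEC as one minus the ratio of the squared-error sum to the total sum of squares around the observed mean (the abstract does not give the formulas). The helper `leave_group_out`, the toy yields, the site labels, and the mean-of-training-data predictor are all hypothetical stand-ins for the real data and models.

```python
import math
from collections import defaultdict


def mean_error(obs, pred):
    """ME: mean of (predicted - observed). Assumed sign convention."""
    return sum(p - o for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """RMSE: square root of the mean squared prediction error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mec(obs, pred):
    """MEC: 1 - SSE / SST (assumed definition); 1 is perfect,
    0 is no better than predicting the observed mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def leave_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices), holding out
    one whole group (a site or an agroecological zone) per fold."""
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    for g, test_idx in by_group.items():
        train_idx = [i for i in range(len(groups)) if groups[i] != g]
        yield g, train_idx, test_idx


# Hypothetical toy data: yields in kg/ha and the site of each trial.
yields = [2100.0, 2500.0, 3200.0, 3000.0, 1800.0, 2000.0]
sites = ["A", "A", "B", "B", "C", "C"]

obs, pred = [], []
for site, train_idx, test_idx in leave_group_out(sites):
    # Stand-in "model": predict the mean yield of the training sites.
    train_mean = sum(yields[i] for i in train_idx) / len(train_idx)
    for i in test_idx:
        obs.append(yields[i])
        pred.append(train_mean)

print("ME:", mean_error(obs, pred))
print("RMSE:", rmse(obs, pred))
print("MEC:", mec(obs, pred))
```

Holding out whole sites or zones, rather than single observations, tests how well a model extrapolates to locations it has not seen, which is typically a harder test than leave-one-out.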

Results: XGBoost emerged as the most accurate model, achieving the lowest RMSE for yield (639.5 kg ha⁻¹) and agronomic efficiency (11.6 kg kg⁻¹), particularly for nitrogen (AE-N). RF demonstrated competitive performance, while KNN and SVM yielded inconsistent results under rigorous cross-validation conditions. Key explanatory variables identified across models included nitrogen fertilizer, rainfall, and crop genotype, underscoring their critical role in yield and agronomic efficiency outcomes.
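The permutation-based ranking behind the key explanatory variables can be sketched as follows. This is an illustrative implementation only, assuming importance is measured as the average increase in RMSE after one variable's values are shuffled. The abstract does not state which implementation or score was used. The function `toy_model`, the variable names, and the data are hypothetical.

```python
import math
import random


def rmse(obs, pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Return, per column of X, the mean increase in RMSE after that
    column is randomly shuffled (larger means more important)."""
    rng = random.Random(seed)
    baseline = rmse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = rmse(y, [predict(row) for row in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical fitted model: yield depends on the nitrogen rate
    (column 0) and ignores the second variable (column 1)."""
    return 1500.0 + 20.0 * row[0]


# Hypothetical data: [nitrogen rate in kg/ha, an irrelevant variable].
X = [[0.0, 5.0], [30.0, 1.0], [60.0, 4.0], [90.0, 2.0], [120.0, 6.0], [150.0, 3.0]]
y = [toy_model(row) for row in X]

for name, score in zip(["nitrogen_rate", "irrelevant"], permutation_importance(toy_model, X, y)):
    print(name, score)
```

Because the procedure only needs a prediction function, the same routine can be applied unchanged to RF, SVM, KNN, and XGBoost, which is what makes importance scores comparable across model types.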

Conclusion: XGBoost was the most robust and accurate model for maize yield and agronomic efficiency predictions, offering a reliable tool for data-driven agricultural planning in diverse agroecological settings. The findings underscore the transformative role of advanced ML techniques in modern agriculture, particularly in optimizing staple crop production in sub-Saharan Africa.

How to cite: Asamoah, E., Heuvelink, G., Chairi, I., Bindraban, P., and Logah, V.: Modelling Maize Yield and Agronomic Efficiency Using Machine Learning Models: A Comparative Analysis, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-9987, https://doi.org/10.5194/egusphere-egu25-9987, 2025.