- ¹Wageningen University and Research, Environmental Science, Soil Geography and Landscape, Wageningen, Netherlands (eric.asamoah@wur.nl)
- ²College of Computing, Mohammed VI Polytechnic University, Lot 660, Hay Moulay Rachid, Benguerir 43150, Morocco
- ³International Fertilizer Development Center, Muscle Shoals, AL, 35662, USA
- ⁴Department of Crop and Soil Sciences, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana
Background: Agriculture is increasingly leveraging machine learning (ML) to enhance yield predictions and optimize agronomic practices. Maize, a staple crop in Ghana, offers a valuable case study for evaluating the effectiveness of diverse ML models in yield prediction and resource management.
Objective: This study evaluates the predictive performance of four ML models, namely Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbours (KNN), and Extreme Gradient Boosting (XGBoost), for predicting maize yield and agronomic efficiency. It also compares variable importance across these models to identify the key explanatory variables.
Methods: The study used 4,496 georeferenced maize trial datasets from various agroecological zones in Ghana. The thirty-five explanatory variables covered soil properties, climate, topography, crop management practices, and fertilizer application. Model performance was evaluated using leave-one-out, leave-site-out, and leave-agroecological-zone-out cross-validation. Model accuracy was compared using the Mean Error (ME), Root Mean Squared Error (RMSE), and Model Efficiency Coefficient (MEC), and variable importance was assessed with a permutation-based approach.
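As an illustration of the evaluation described in the Methods, the sketch below computes ME, RMSE, and MEC and builds leave-group-out folds, which cover both leave-site-out and leave-agroecological-zone-out cross-validation depending on the grouping label. It is a minimal sketch, not the study's code: the function names, the toy yields, and the site labels are hypothetical, ME is assumed to be predicted minus observed, and MEC is assumed to follow the common definition of one minus the ratio of the residual sum of squares to the total sum of squares.

```python
import math
from collections import defaultdict


def mean_error(obs, pred):
    # Assumed sign convention: predicted minus observed (bias).
    return sum(p - o for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    # Root mean squared error, in the units of the target variable.
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mec(obs, pred):
    # Assumed definition: 1 - SS_residual / SS_total (1 is a perfect fit).
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def leave_group_out(groups):
    # Yield (held-out group, train indices, test indices), one fold per group.
    # With site labels this is leave-site-out; with zone labels it is
    # leave-agroecological-zone-out.
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    for g, test_idx in by_group.items():
        train_idx = [i for i in range(len(groups)) if groups[i] != g]
        yield g, train_idx, test_idx


# Hypothetical toy data: observed and predicted yields (kg/ha) and site labels.
observed = [2000.0, 3000.0, 4000.0, 5000.0]
predicted = [2100.0, 2900.0, 4200.0, 4800.0]
sites = ["A", "A", "B", "C"]

print(mean_error(observed, predicted), rmse(observed, predicted), mec(observed, predicted))
for held_out, train_idx, test_idx in leave_group_out(sites):
    print(held_out, train_idx, test_idx)
```

Holding out whole sites or zones keeps all observations from the held-out group out of the training data, which makes it a stricter test than leave-one-out when observations within a site are similar.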
Results: XGBoost was the most accurate model, achieving the lowest RMSE for yield (639.5 kg ha⁻¹) and agronomic efficiency (11.6 kg kg⁻¹), particularly for nitrogen (AE-N). RF performed competitively, while KNN and SVM gave inconsistent results under rigorous cross-validation conditions. The key explanatory variables identified across models included nitrogen fertilizer, rainfall, and crop genotype, underscoring their critical role in yield and agronomic efficiency outcomes.
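The permutation-based variable importance behind these rankings can be sketched as follows: shuffle one explanatory variable at a time and record how much the prediction error increases. This is a generic sketch of the technique, not the study's implementation; the function `permutation_importance`, the stand-in predictor `toy_predict`, and the toy data are all hypothetical, and RMSE is assumed as the error measure.

```python
import math
import random


def rmse(obs, pred):
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    # Importance of column j = mean increase in RMSE after shuffling column j.
    rng = random.Random(seed)
    baseline = rmse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted_error = rmse(y, [predict(row) for row in X_perm])
            increases.append(permuted_error - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_predict(row):
    # Hypothetical stand-in for a fitted model: yield depends on the first
    # column (a nitrogen rate) and ignores the second column.
    return 1000.0 + 20.0 * row[0]


# Hypothetical toy data: [nitrogen rate, unrelated variable] per observation.
X = [[0.0, 5.0], [30.0, 1.0], [60.0, 4.0], [90.0, 2.0], [120.0, 3.0]]
y = [toy_predict(row) for row in X]

importances = permutation_importance(toy_predict, X, y)
print(importances)
```

In this toy case the second variable receives an importance of zero because the stand-in predictor never uses it, while shuffling the nitrogen rate raises the error.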
Conclusion: XGBoost was the most robust and accurate model for maize yield and agronomic efficiency predictions, offering a reliable tool for data-driven agricultural planning in diverse agroecological settings. The findings underscore the transformative role of advanced ML techniques in modern agriculture, particularly in optimizing staple crop production in sub-Saharan Africa.
How to cite: Asamoah, E., Heuvelink, G., Chairi, I., Bindraban, P., and Logah, V.: Modelling Maize Yield and Agronomic Efficiency Using Machine Learning Models: A Comparative Analysis, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-9987, https://doi.org/10.5194/egusphere-egu25-9987, 2025.