- Ghent University, Faculty of Bioscience Engineering, Department of Plants and Crops, Belgium (eline.eeckhout@ugent.be)
Rapid advancements in technology, particularly the rise of artificial intelligence (AI) and the integration of uncrewed aerial vehicles (UAVs) equipped with RGB, multi- and hyperspectral sensors, have boosted agricultural research on crop disease detection, leading to a surge of studies exploring high-technology approaches. While numerous studies have demonstrated high accuracy in detecting specific diseases or pests in crops, concerns arise regarding their reproducibility and generalisability.
We conducted a meta-analysis of over 100 research papers to examine how models are trained and validated, with a focus on how datasets for training, validation and testing were handled. In principle, a model can only be considered robust and widely applicable if it performs well on an entirely new dataset, i.e., a dataset it wasn’t specifically trained on. Otherwise, AI models risk overfitting to specific datasets or fields, potentially detecting signals that are not universal or not related to the targeted pest or disease. This issue arises when datasets are randomly split into training, validation and test subsets.
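The difference between a random split and an independent-field split can be sketched in a few lines of Python. This is a minimal illustration with a hypothetical toy dataset of UAV image patches labelled by field; the field names and data structure are assumptions for illustration, not from any study reviewed here.

```python
import random

def random_split(samples, test_fraction=0.25, seed=0):
    """Naive random split: patches from the same field can end up in both
    train and test, so field-specific signals may leak into evaluation."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def field_level_split(samples, held_out_fields):
    """Independent-field split: every patch from a held-out field goes to
    the test set, so evaluation mimics deployment on a new, unseen field."""
    train = [s for s in samples if s["field"] not in held_out_fields]
    test = [s for s in samples if s["field"] in held_out_fields]
    return train, test

# Hypothetical toy dataset: three fields, four image patches each.
samples = [{"field": f, "patch": i} for f in ("A", "B", "C") for i in range(4)]

train, test = field_level_split(samples, held_out_fields={"C"})
# No field appears on both sides of the split.
assert not {s["field"] for s in train} & {s["field"] for s in test}
```

A random split of the same data would almost certainly place patches from every field in both subsets, which is exactly the leakage the abstract warns about.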
Our analysis revealed significant limitations in current practices. Nearly half of the reviewed papers relied on a single dataset (one field, one flight) for both model training and validation. About one-quarter of the studies used data from a single field with repeated flights during the same growing season. Only another quarter utilised datasets from multiple fields; however, the majority of these studies still used a random split for training and testing, meaning their models were not evaluated on independent datasets. In addition, a handful of studies using RGB data applied transfer learning, with models pretrained on public (non-UAV) datasets and then applied to UAV datasets.
Overall, only 10% of the reviewed papers validated their models on fully independent datasets, i.e., either through transfer learning or by testing on a separate, independent field not used in training. We found that models constructed with multispectral or hyperspectral data in particular did not use independent datasets. On top of that, none of the studies explicitly tested whether their models were pest- or disease-specific, i.e., whether the models were sensitive only to the pest or disease they were trained to detect.
These findings highlight a critical limitation in the robustness and scalability of current AI approaches to crop disease detection with UAVs. To address this, we call on researchers to include independent test datasets in their studies, and urge journals and reviewers to prioritize this criterion during evaluations. Additionally, we advocate for the public sharing of datasets to enable the development of robust and generalisable methods.
How to cite: Eeckhout, E., Spanoghe, P., and Maes, W.: UAV-based disease and pest detection using AI: Time to reconsider our approach?, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-6149, https://doi.org/10.5194/egusphere-egu25-6149, 2025.