- 1University of Twente, Faculty of Geo-information Science and Earth Observation (ITC), Enschede, The Netherlands
- 2Jülich Supercomputing Centre, Forschungszentrum Jülich, Jülich, Germany
- 3Faculty of Electrical and Computer Engineering, University of Iceland, Reykjavík, Iceland
- 4European Space Agency (ESA), Φ-lab, Frascati, Italy
Recent advances in Earth Observation (EO) data and multimodal Geo-Foundation Models have sharply improved the ability to generate accurate crop-type maps by leveraging rich spatio-temporal representations. These models are inherently scalable across heterogeneous agricultural landscapes and thus exhibit strong generalisation. However, timely and high-quality reference data remain a major bottleneck for reliable agricultural mapping and monitoring. Agricultural landscapes are highly dynamic, with frequent crop rotations that require seasonal or annual updates. In addition, European agriculture is increasingly affected by weather extremes (e.g., droughts, hail, and storms), which are expected to intensify in both magnitude and frequency.
Traditional approaches rely on time-consuming and costly manual annotations or field surveys, which are difficult to sustain continuously and at large spatial scales (e.g., continental monitoring). In this context, geo- and time-tagged field photos represent a promising complementary data source. Each field photo can be linked to the satellite image time series acquired over the same location up to the photo's acquisition date. Compared to conventional in-situ surveys based on manual annotations, the combined use of satellite image time series and field photos provides a richer semantic representation of agricultural areas. While satellite image time series capture the temporal dynamics of crop development, field photos offer high-resolution, ground-level information on crop condition, phenological stage, and management practices.
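The linkage described above is, in essence, a spatio-temporal join. The minimal Python sketch below illustrates one plausible form of it; the table layout, column names, and values are illustrative assumptions, not the study's actual data model.

```python
# Minimal sketch: link one geo- and time-tagged field photo to the satellite
# observations acquired over the same location up to its acquisition date.
# The DataFrame layout and column names are illustrative assumptions.
import pandas as pd

# Assumed table: one row per satellite observation per sampled location.
ts = pd.DataFrame({
    "location_id": [42, 42, 42, 7],
    "date": pd.to_datetime(["2025-04-01", "2025-05-01", "2025-07-01", "2025-05-01"]),
    "ndvi": [0.21, 0.55, 0.80, 0.33],
})

photo = {"location_id": 42, "acq_date": pd.Timestamp("2025-06-15")}

# Time series over the photo's location, truncated at the acquisition date.
linked = ts[(ts["location_id"] == photo["location_id"])
            & (ts["date"] <= photo["acq_date"])].sort_values("date")
print(linked)
```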
Despite their potential, the operational use of field photos in agricultural monitoring remains limited, in part due to the challenge of translating heterogeneous images into structured information. Recent advances in Vision–Language Models (VLMs) have enabled substantial progress in automatically interpreting raw field photos and extracting semantic information from them. By aligning visual features with semantic concepts expressed in natural language, VLMs provide a powerful mechanism for mapping unstructured field photos to standardised crop-type labels, as illustrated by the sketch below.
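As a concrete illustration of this mapping, the following sketch performs zero-shot crop-type classification of a single field photo with CLIP, using the Hugging Face transformers implementation. The checkpoint name, the prompt template, and the image path are illustrative assumptions rather than the exact configuration used in this study.

```python
# Minimal sketch of zero-shot crop-type classification with CLIP.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

CROPS = ["maize", "wheat", "rapeseed", "sugar beet", "oat", "barley", "sunflower"]
PROMPTS = [f"a photo of a {crop} field" for crop in CROPS]  # simple prompts

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("field_photo.jpg")  # a geo- and time-tagged field photo
inputs = processor(text=PROMPTS, images=image, return_tensors="pt", padding=True)

with torch.no_grad():
    outputs = model(**inputs)

# Image-text similarity scores, normalised into probabilities over the labels.
probs = outputs.logits_per_image.softmax(dim=-1).squeeze(0)
pred = CROPS[int(probs.argmax())]
print(f"predicted crop: {pred} (p = {probs.max().item():.2f})")
```

Prompt wording matters in zero-shot CLIP classification; the template above is one plausible instance of the "simple prompts" the abstract refers to.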
This study investigates the potential of combining satellite image time series and geo-tagged field photos to expand, update, and complement existing reference datasets in support of continuous large-scale agricultural monitoring. Preliminary results of mapping seven crop types (i.e., maize, wheat, rapeseed, sugar beet, oat, barley, and sunflower) in Europe indicate that, even in a zero-shot setting and with simple prompts, the CLIP VLM can correctly identify crop types from field photos when a distinct phenological stage is visible. Incorporating phenological information derived from the temporal patterns of satellite image time series is therefore crucial, as it allows for the filtering of irrelevant images (e.g., post-harvest fields) and the selection of samples for which reliable classification is feasible. Furthermore, when the consistency of label predictions obtained independently from field photos (using CLIP) and from Sentinel-1 and Sentinel-2 time series (using a simple Random Forest classifier) is used as a reliability criterion (see the sketch after this paragraph), highly accurate classification performance can be obtained across all considered crop types. Overall, these findings highlight the strong potential of jointly exploiting satellite image time series and geo-tagged field photos for the efficient and reliable preparation of crop-type reference datasets.
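To make the reliability strategy concrete, the sketch below combines a phenology-based filter with the CLIP/Random Forest consistency check. All arrays, the NDVI threshold, and the placeholder reference set are assumptions introduced purely for illustration; the actual study's features, labels, and thresholds are not specified here.

```python
# Minimal sketch of the consistency-based reliability strategy: keep only
# samples where the photo-based (CLIP) and time-series-based (Random Forest)
# predictions agree, after filtering out phenologically irrelevant photos.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 500  # one entry per geo-tagged photo location (placeholder size)

X_ts = rng.normal(size=(n, 60))            # assumed S1/S2 time-series features
ndvi_at_t = rng.uniform(0.0, 0.9, size=n)  # assumed NDVI at photo acquisition date
photo_labels = rng.integers(0, 7, size=n)  # assumed zero-shot CLIP predictions

# 1) Phenology filter: drop photos taken over bare-soil / post-harvest fields,
#    where no distinct phenological stage is visible (threshold is assumed).
active = ndvi_at_t > 0.3

# 2) Independent prediction from the satellite time series with a simple
#    Random Forest, trained here on a placeholder reference subset.
X_ref = rng.normal(size=(100, 60))
y_ref = rng.integers(0, 7, size=100)
rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(X_ref, y_ref)
ts_labels = rf.predict(X_ts)

# 3) Retain only samples where the two independent predictions agree.
reliable = active & (ts_labels == photo_labels)
print(f"retained {reliable.sum()} of {n} candidate samples")
```

Only samples passing both the phenology filter and the agreement check would be promoted into the crop-type reference dataset.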
How to cite: Paris, C., Celik, M. F., Maurogiovanni, S., Sedona, R., Cavallaro, G., Cartuyvels, R., and Marsocci, V.: From Satellite Data and Geo-tagged Field Photos to Reliable Agricultural Reference Data, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16567, https://doi.org/10.5194/egusphere-egu26-16567, 2026.