- 1 Universidad Politécnica de Madrid, CEIGRAM, Madrid, Spain (ernesto.sanz@upm.es)
- 2 Grupo de Sistemas Complejos, Universidad Politécnica de Madrid, 28040 Madrid, Spain
- 3 Centro Tecnológico de la Energía y el Medio Ambiente (CETENMA), P.I. Cabezo Beaza, C/Sofia 6-13, 30353 Cartagena, Spain
Soil health is a critical factor influencing ecosystem functions, agricultural productivity, and environmental sustainability. However, the spatial variability of soil properties across Europe poses significant challenges to understanding and managing soil health at regional and continental scales. This study utilizes clustering techniques to analyze and classify soil health across Europe using the LUCAS (Land Use and Coverage Area Frame Survey) soil dataset, one of the most comprehensive databases of soil properties in Europe.
The LUCAS dataset includes key physical, chemical, and biological soil indicators such as soil organic carbon (SOC), pH, texture, and bulk density, providing a robust foundation for clustering. Data preprocessing involved standardizing soil attributes and addressing missing values through imputation. Clustering algorithms were applied to group soils with similar health profiles, capturing spatial patterns and interrelations among soil properties. The resulting clusters were mapped and analyzed to identify dominant soil health characteristics and their distribution across Europe.
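The workflow described above (imputation, standardization, clustering, and per-cluster profiling) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the abstract does not name the clustering algorithm or the imputation strategy, so k-means and median imputation are stand-ins, and the column names and synthetic data are hypothetical rather than actual LUCAS fields.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical column names; the real LUCAS field names may differ.
INDICATORS = ["soc", "ph", "clay", "bulk_density"]


def make_synthetic_samples(seed: int = 0) -> pd.DataFrame:
    """Build a toy table with two well-separated soil groups and a few gaps."""
    rng = np.random.default_rng(seed)
    low = rng.normal([10.0, 5.0, 10.0, 1.5], [2.0, 0.3, 2.0, 0.05], size=(50, 4))
    high = rng.normal([60.0, 8.0, 40.0, 1.0], [5.0, 0.3, 4.0, 0.05], size=(50, 4))
    samples = pd.DataFrame(np.vstack([low, high]), columns=INDICATORS)
    # Knock out a few values to mimic missing measurements.
    samples.loc[[3, 60], "soc"] = np.nan
    samples.loc[[10, 75], "ph"] = np.nan
    return samples


def cluster_soils(samples: pd.DataFrame, n_clusters: int = 3, seed: int = 0) -> pd.Series:
    """Impute, standardize, and cluster soil samples; return one label per row."""
    pipeline = make_pipeline(
        SimpleImputer(strategy="median"),  # assumed imputation strategy
        StandardScaler(),                  # zero mean, unit variance per indicator
        KMeans(n_clusters=n_clusters, n_init=10, random_state=seed),  # assumed algorithm
    )
    labels = pipeline.fit_predict(samples[INDICATORS])
    return pd.Series(labels, index=samples.index, name="cluster")


def cluster_profiles(samples: pd.DataFrame, labels: pd.Series) -> pd.DataFrame:
    """Mean of each indicator per cluster, in original units (NaNs are skipped)."""
    return samples[INDICATORS].groupby(labels).mean()


if __name__ == "__main__":
    data = make_synthetic_samples()
    cluster_labels = cluster_soils(data, n_clusters=2)
    print(cluster_profiles(data, cluster_labels))
```

Standardizing before clustering keeps indicators with large numeric ranges (for example SOC) from dominating distance calculations over narrow-range ones (for example bulk density). To map the clusters, the returned labels could be joined back to the sampling-point coordinates.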
Preliminary results reveal distinct clusters reflecting gradients in soil fertility, organic matter content, and degradation levels. These clusters align with known ecological and climatic gradients, which supports the methodology and provides insights into the spatial variability of soil health. The clustering approach also highlights regions that require targeted soil management interventions, contributing to data-driven decision-making for sustainable land use and agricultural practices.
This research demonstrates the potential of unsupervised learning to leverage large-scale datasets for spatial soil health analysis, offering a scalable framework for soil health monitoring and management at regional and continental scales. Future work will incorporate temporal data to assess changes in soil health over time, further enhancing the utility of this approach in dynamic soil monitoring systems.
Keywords: soil health, soil indicators, random forest, agriculture, soil monitoring
Acknowledgements: The iCOSHELLs project is funded by the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Executive Agency (REA). Neither the European Union nor the granting authority can be held responsible for them.
References: Sanz, E., Sotoca, J. J. M., Saa-Requejo, A., Díaz-Ambrona, C. H., Ruiz-Ramos, M., Rodríguez, A., & Tarquis, A. M. (2022). Clustering arid rangelands based on NDVI annual patterns and their persistence. Remote Sensing, 14(19), 4949.
Boluwade, A. (2019). Regionalization and partitioning of soil health indicators for Nigeria using spatially contiguous clustering for economic and social-cultural developments. ISPRS International Journal of Geo-Information, 8(10), 458.
Suchithra, M. S., & Pai, M. L. (2020). Data mining based geospatial clustering for suitable recommendation system. In 2020 International Conference on Inventive Computation Technologies (ICICT). IEEE.
How to cite: Sanz, E., Almeida-Ñauñay, A. F., Soriano Disla, J. M., Soriano, B., Bardají, I., and Tarquis, A. M.: Clustering Soil Health Across Europe Using LUCAS Soil Dataset and Unsupervised Learning Techniques, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-9337, https://doi.org/10.5194/egusphere-egu25-9337, 2025.