Comparative examination of methods for the prediction of subsoil texture composition.

Konstantinos Soulis; Evangelos Nikitakis; Stelios Gerontidis; Alexandros Stavropoulos; Dionissios Kalivas

doi:https://doi.org/10.5194/egusphere-egu26-6486

EGU26-6486, updated on 13 Mar 2026

EGU General Assembly 2026

© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.

Agricultural University of Athens, Dep. Natural Resources Management and Agricultural Engineering, Lab. of Soil Science and Agricultural Chemistry, GIS Research Unit, Athens, Greece (soco@aua.gr)

While topsoil layers (0–30 cm) are routinely sampled in most soil surveys, subsoil data (30–60 cm) are collected less consistently, and deeper soil layers (>60 cm) are only rarely investigated, typically in detailed or site-specific studies. This is to be expected: deeper sampling increases costs, and soil surveys are usually performed to inform short-term agronomic decisions, for which soil characteristics below the plough layer are largely irrelevant. Consequently, large-scale soil databases exhibit a pronounced vertical data gap, with dense information for topsoil layers but sparse or missing measurements at depth. This depth bias limits applications that depend on the full soil profile, such as hydrological modeling, groundwater recharge estimation, and nutrient leaching assessment, where subsoil and deeper soil properties play a critical role. To address this limitation, we examine machine-learning and geostatistical frameworks for predicting subsoil textural composition across the heterogeneous landscape of Greece. More specifically, we examine (i) prediction based on raw compositional data versus isometric log-ratio (ilr)-transformed coordinates, (ii) the integration of spatial information within machine-learning frameworks, (iii) univariate per-component regression versus multivariate regression, and (iv) the inclusion versus exclusion of predictor variables. Additionally, all machine-learning models are benchmarked against equivalent ordinary least squares (OLS) regressions, which serve as baseline models. This comparison makes it possible to assess how the performance of simple, interpretable regression models relates to that of more complex, traditionally less interpretable machine-learning approaches.
Preliminary results are encouraging: the most robust model, a univariate, spatially aware Random Forest trained on raw compositional data, achieves R² values of 0.77, 0.75, and 0.61 for subsoil clay, sand, and silt contents, respectively. Previous studies suggest that predictive performance may be further improved through compositional data pre-processing with the ilr transformation and through multivariate Random Forest modeling.
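
As an illustration of the ilr pre-processing mentioned above, the following minimal sketch maps a three-part texture composition (sand, silt, clay) to two unconstrained ilr coordinates and back. The abstract does not state which ilr basis is used; the basis below (sand against silt and clay, then silt against clay) and the function names `closure`, `ilr`, and `ilr_inverse` are assumptions chosen for illustration only.

```python
import math


def closure(parts, total=100.0):
    """Rescale positive parts so that they sum to `total` (e.g. 100 %)."""
    s = sum(parts)
    return [total * p / s for p in parts]


def ilr(sand, silt, clay):
    """Two ilr coordinates for a 3-part composition.

    Assumed basis (not taken from the abstract):
      z1 contrasts sand with the geometric mean of silt and clay,
      z2 contrasts silt with clay.
    """
    if min(sand, silt, clay) <= 0:
        raise ValueError("ilr requires strictly positive parts")
    z1 = math.sqrt(2.0 / 3.0) * math.log(sand / math.sqrt(silt * clay))
    z2 = math.sqrt(0.5) * math.log(silt / clay)
    return z1, z2


def ilr_inverse(z1, z2, total=100.0):
    """Back-transform ilr coordinates to (sand, silt, clay) summing to `total`."""
    # Centred log-ratio values implied by the same orthonormal basis.
    clr_sand = math.sqrt(2.0 / 3.0) * z1
    clr_silt = -z1 / math.sqrt(6.0) + z2 / math.sqrt(2.0)
    clr_clay = -z1 / math.sqrt(6.0) - z2 / math.sqrt(2.0)
    return closure([math.exp(c) for c in (clr_sand, clr_silt, clr_clay)], total)


# Round trip for a hypothetical sample with 40 % sand, 35 % silt, 25 % clay.
z1, z2 = ilr(40.0, 35.0, 25.0)
sand, silt, clay = ilr_inverse(z1, z2)
```

A regression model fitted to `z1` and `z2` and back-transformed in this way always returns positive fractions that sum to 100 %, which per-component predictions on raw percentages do not guarantee.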

How to cite: Soulis, K., Nikitakis, E., Gerontidis, S., Stavropoulos, A., and Kalivas, D.: Comparative examination of methods for the prediction of subsoil texture composition., EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6486, https://doi.org/10.5194/egusphere-egu26-6486, 2026.