EGU23-5543
https://doi.org/10.5194/egusphere-egu23-5543
EGU General Assembly 2023
© Author(s) 2023. This work is distributed under
the Creative Commons Attribution 4.0 License.

Smoothed predicted distributions in digital soil mapping – a comprehensive comparative study to predict soil texture for irrigation

Madlene Nussbaum, Stefan Vogel, Stefan Oechslin, Simon Tanner, and Stéphane Burgos
  • Bern University of Applied Sciences (BFH), School for Agricultural, Forest and Food Sciences (HAFL), Zollikofen, Switzerland (madlene.nussbaum@bfh.ch)

Spatial predictions of soil properties are often prone to smoothing of the distribution tails: small values are overestimated and large values are underestimated. For many applications this may do no harm, but it is critical for map uses in which the interpretation of small or large soil property values has a substantial effect. For example, soil texture maps are used to implement irrigation strategies or to adjust soil moisture probes for irrigation. Textures at the margins of the distribution have a much larger impact on probe adjustment than intermediate textures.

To investigate the effect of different statistical approaches on the smoothing of soil texture predictions, we analyzed four Swiss datasets originating from different surveys with different strengths of response-covariate relationships (weak: arable land north of Berne, n = 1650; weak to medium: arable land of the Canton of Zurich, n = 3920; medium: strongly cultivated Gleysols and Histosols in Seeland/Grosses Moos, n = 2510; strong: cultivated Histosols in the Rhine Valley, n = 2590). We evaluated the behavior around the lower and upper tails of the predicted distributions for commonly used methods fitted to clay, silt and sand and to additive log-ratio transformed responses: random forest, gradient boosted trees, support vector machines, Cubist regression, k-nearest neighbor, robust external-drift kriging and group lasso. In addition, we applied approaches that are supposed to alleviate the problem of smoothed predicted distributions: 1) post-processing transformation to match the original variance, 2) the SMOTER algorithm for imbalanced regression, 3) constrained kriging, 4) random forest with resampling weights inverse to the histogram, 5) univariate distributional random forest with a distribution loss criterion, and 6) a multivariate-response variant of distributional random forest.
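Two of the ingredients named above, the additive log-ratio transform of the texture fractions and resampling weights inverse to the histogram of the response, can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used in the study: the function names, the choice of sand as the reference part, and the equal-width binning with 10 bins are all hypothetical.

```python
import math


def alr(clay, silt, sand):
    """Additive log-ratio transform; sand as reference part is an assumption."""
    return math.log(clay / sand), math.log(silt / sand)


def alr_inverse(z1, z2):
    """Back-transform two log-ratios to clay, silt, sand in percent (sum = 100)."""
    e1, e2 = math.exp(z1), math.exp(z2)
    total = e1 + e2 + 1.0
    return 100.0 * e1 / total, 100.0 * e2 / total, 100.0 / total


def inverse_histogram_weights(values, n_bins=10):
    """Weights proportional to 1 / (count of the bin a value falls in).

    Equal-width bins are an assumption; weights are normalised to sum to 1,
    so observations in sparsely populated tails are resampled more often.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal
    bins = [min(int((v - lo) / width), n_bins - 1) for v in values]
    counts = [0] * n_bins
    for b in bins:
        counts[b] += 1
    raw = [1.0 / counts[b] for b in bins]
    total = sum(raw)
    return [w / total for w in raw]


# Round trip for a made-up sample with 20 % clay, 30 % silt, 50 % sand
clay, silt, sand = alr_inverse(*alr(20.0, 30.0, 50.0))

# Made-up clay contents: the single large value gets the largest weight
weights = inverse_histogram_weights([10.0, 11.0, 12.0, 13.0, 50.0], n_bins=4)
```

Such weights could be passed as sampling probabilities to a bootstrap step; whether and how a given random forest implementation accepts them depends on the library used.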

Validation relied on a separately surveyed design-based dataset for the Rhine Valley and for Berne, and on data splitting for the other two datasets. Besides computing validation statistics of mean model performance, we evaluated the goodness-of-fit of the univariate and multivariate predicted distributions. Further, we judged the multivariate accuracy with respect to the HYPRES and USDA texture classes often used in irrigation applications.
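One simple way to make tail smoothing visible in validation data is to compare lower and upper quantiles of the predictions with those of the observations. The sketch below is a hypothetical diagnostic, not one of the goodness-of-fit measures used in the study; the function name `tail_bias`, the 5 % and 95 % quantiles, and the synthetic numbers are assumptions.

```python
import statistics


def tail_bias(observed, predicted):
    """Difference (predicted minus observed) at the 5 % and 95 % quantiles.

    A positive lower value together with a negative upper value indicates
    that the predicted distribution is narrower than the observed one.
    """
    obs_q = statistics.quantiles(observed, n=20)   # 19 cut points: 5 %, ..., 95 %
    pred_q = statistics.quantiles(predicted, n=20)
    return pred_q[0] - obs_q[0], pred_q[-1] - obs_q[-1]


# Synthetic illustration: predictions shrunk halfway towards the mean
observed = [float(v) for v in range(0, 101, 5)]
mean_obs = statistics.mean(observed)
predicted = [mean_obs + 0.5 * (v - mean_obs) for v in observed]

lower, upper = tail_bias(observed, predicted)
```

For the shrunken predictions, `lower` is positive (small values overestimated) and `upper` is negative (large values underestimated), which is the pattern described in the first paragraph.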

The comparison showed that no approach outperformed the others on all datasets in combining good mean overall accuracy with satisfactory prediction of the tails. In some cases random forest with inverse histogram weights gave slightly better predictions at the tails, but an unaltered random forest followed closely. Hence, producing proper predictions along the full distribution of a response remains a challenge.

How to cite: Nussbaum, M., Vogel, S., Oechslin, S., Tanner, S., and Burgos, S.: Smoothed predicted distributions in digital soil mapping – a comprehensive comparative study to predict soil texture for irrigation, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-5543, https://doi.org/10.5194/egusphere-egu23-5543, 2023.