- ¹Institute of Geography, Friedrich Schiller University, Jena, Germany
- ²ELLIS Unit, Jena, Germany
- ³Max Planck Institute for Biogeochemistry, Jena, Germany
Decisions concerning the management of natural resources are often based on binary criteria that determine whether a specific environmental target is met or exceeded. A common example is the designation of “polluted” areas, where mitigation measures must be implemented once concentrations surpass a regulatory threshold. In practice, maps of such exceedances are commonly derived from regionalized concentration estimates. However, most conventional spatial interpolation and prediction procedures introduce systematic bias in the estimated extent of polluted areas.
To overcome this issue, we apply a bias-corrected mapping procedure that is compatible with any geostatistical or machine learning method capable of providing valid probability estimates. For the case study, we mainly focus on a trans-Gaussian regression-kriging (TRGK) framework, selected for its interpretability and transparent decomposition of predictions. To assess the potential added value of nonparametric approaches, we additionally compare TRGK with quantile regression forest (QRF) in a sub-region.
The TRGK model follows a structured, non-stationary design: (i) raw concentrations are transformed to the log10 scale; (ii) a nationwide global linear model captures broad-scale relationships; (iii) major hydrogeological districts serve as units for local linear refinements that account for non-stationarity; (iv) residuals are transformed using a Gaussian anamorphosis; and (v) the transformed residuals are interpolated via ordinary kriging, from which probability estimates are derived. This setup improves flexibility while maintaining interpretability and coherent uncertainty quantification.
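Steps (iv) and (v) can be illustrated with a minimal sketch. It assumes an empirical (rank-based) normal-score transform as the Gaussian anamorphosis and a Gaussian predictive distribution defined by the kriging mean and variance; the abstract does not specify either detail. All function names and the toy values are hypothetical.

```python
from bisect import bisect_right
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def normal_scores(residuals):
    """Rank-based Gaussian anamorphosis (assumed form): the residual with
    rank r out of n is mapped to the standard normal quantile at (r + 0.5) / n."""
    n = len(residuals)
    order = sorted(range(n), key=lambda i: residuals[i])
    scores = [0.0] * n
    for rank, i in enumerate(order):
        scores[i] = STD_NORMAL.inv_cdf((rank + 0.5) / n)
    return scores


def to_gaussian(value, sorted_residuals):
    """Map a residual-scale value (e.g. the threshold minus the trend) to the
    Gaussian scale via the empirical CDF of the observed residuals."""
    n = len(sorted_residuals)
    p = bisect_right(sorted_residuals, value) / n
    p = min(max(p, 0.5 / n), 1.0 - 0.5 / n)  # keep p strictly inside (0, 1)
    return STD_NORMAL.inv_cdf(p)


def exceedance_probability(threshold_gauss, krig_mean, krig_var):
    """P(Y > threshold) for a Gaussian predictive distribution with the given
    kriging mean and variance; krig_var must be strictly positive."""
    return 1.0 - NormalDist(krig_mean, krig_var ** 0.5).cdf(threshold_gauss)


# Toy residuals on the log10 scale (hypothetical values, not from the study).
toy_residuals = [0.30, -0.10, 0.05]
toy_scores = normal_scores(toy_residuals)
toy_threshold = to_gaussian(0.05, sorted(toy_residuals))
toy_probability = exceedance_probability(toy_threshold, krig_mean=0.0, krig_var=1.0)
```

In this sketch the exceedance probability depends only on where the transformed threshold sits relative to the kriging prediction in Gaussian space, which is why no back-transformation of the prediction itself is needed to obtain probabilities.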
Bias correction is performed by estimating the total exceedance area implied by the data and determining a calibrated probability threshold that ensures an unbiased delineation of the polluted area. In this study, we jointly evaluate a threshold exceedance criterion and a temporal trend criterion.
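One way to read this calibration step is sketched below. It is an assumption, not the authors' stated algorithm: the total exceedance area is estimated as the sum of per-cell exceedance probabilities times the cell area, and the probability threshold is then chosen so that the delineated area matches that estimate. The function name, the equal-area grid and the toy probabilities are hypothetical.

```python
def calibrated_threshold(probabilities, cell_area=1.0):
    """Return (threshold, expected_area).

    expected_area: sum of exceedance probabilities times the cell area
    (assumed estimator of the total exceedance area).
    threshold: probability cutoff such that the cells with p >= threshold
    cover approximately expected_area (exactly, if there are no ties).
    """
    expected_cells = sum(probabilities)
    expected_area = expected_cells * cell_area
    n_select = min(max(round(expected_cells), 0), len(probabilities))
    if n_select == 0:
        return float("inf"), expected_area  # nothing is delineated
    ranked = sorted(probabilities, reverse=True)
    return ranked[n_select - 1], expected_area


# Toy per-cell exceedance probabilities (hypothetical, equal-area cells).
toy_probs = [0.49, 0.45, 0.40, 0.35, 0.20, 0.11]
toy_cutoff, toy_area = calibrated_threshold(toy_probs)

naive_cells = sum(p >= 0.5 for p in toy_probs)              # fixed 0.5 cutoff
calibrated_cells = sum(p >= toy_cutoff for p in toy_probs)  # calibrated cutoff
```

In the toy data every probability lies below 0.5, so a fixed 0.5 cutoff delineates no cells even though the probabilities sum to about two cells of expected exceedance area. The calibrated cutoff recovers an area consistent with that expectation, which is the kind of underestimation the bias correction is meant to remove.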
Groundwater nitrate mapping at national scale represents a challenging test case due to strong non-normality, spatial heterogeneity, and pronounced non-stationarity. The approach nonetheless performs robustly. Linear model components exhibit R² values between 0.15 and 0.62, while semivariogram practical ranges vary from 0.3 to 22.3 km. In the sub-region comparison, QRF showed a small discrimination advantage over TRGK (AUC 0.86 vs. 0.82) but relied more heavily on calibration (underestimation without calibration 94.9% vs. 5.1%).
Overall, the results demonstrate that the bias-corrected, probability-based framework provides a flexible, robust and, when coupled with geostatistics, transparent solution for large-scale pollution mapping.
How to cite: Frank, J., Suesse, T., Jiang, S., and Brenning, A.: Bias-corrected pollution mapping with non-stationary geostatistics and spatial machine learning for environmental decision making: The case of groundwater nitrate, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1406, https://doi.org/10.5194/egusphere-egu26-1406, 2026.