Bias-corrected pollution mapping with non-stationary geostatistics and spatial machine learning for environmental decision making: The case of groundwater nitrate

Jonathan Frank; Thomas Suesse; Shijie Jiang; Alexander Brenning

doi:https://doi.org/10.5194/egusphere-egu26-1406

Session HS3.4

EGU26-1406, updated on 13 Mar 2026

EGU General Assembly 2026

© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.

Jonathan Frank¹, Thomas Suesse¹,², Shijie Jiang²,³, and Alexander Brenning¹,²

¹Institute of Geography, Friedrich Schiller University, Jena, Germany
²ELLIS Unit, Jena, Germany
³Max Planck Institute for Biogeochemistry, Jena, Germany

Decisions concerning the management of natural resources are often based on binary criteria that determine whether a specific environmental target is met or exceeded. A common example is the designation of “polluted” areas, where mitigation measures must be implemented once concentrations surpass a regulatory threshold. In practice, maps of such exceedances are commonly derived from regionalized concentration estimates. However, most conventional spatial interpolation and prediction procedures introduce systematic bias in the estimated extent of polluted areas.

To overcome this issue, we apply a bias-corrected mapping procedure that is compatible with any geostatistical or machine learning method capable of providing valid probability estimates. For the case study, we mainly focus on a trans-Gaussian regression-kriging (TRGK) framework, selected for its interpretability and transparent decomposition of predictions. To assess the potential added value of nonparametric approaches, we additionally compare TRGK with quantile regression forest (QRF) in a sub-region.

The TRGK model follows a structured, non-stationary design: (i) raw concentrations are transformed to log₁₀ scale; (ii) a nationwide global linear model captures broad-scale relationships; (iii) major hydrogeological districts serve as units for local linear refinements to account for non-stationarity; (iv) residuals are transformed using a Gaussian anamorphosis; and (v) the transformed residuals are interpolated via ordinary kriging, from which probability estimates are derived. This setup improves flexibility while maintaining interpretability and coherent uncertainty quantification.
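
The steps listed above can be illustrated with a minimal sketch on synthetic data. Everything here is an assumption made for illustration, not the authors' implementation: the covariates, the district labels, the choice to fit the local refinements on the residuals of the global model, and the rank-based normal-score transform used as a simple empirical stand-in for the Gaussian anamorphosis. The kriging step itself is omitted; the hypothetical helper `exceedance_probability` only shows how a probability could be derived from a kriged mean and variance on the transformed scale.

```python
import numpy as np
from statistics import NormalDist

def fit_linear(X, y):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Predictions of a model fitted with fit_linear."""
    return np.column_stack([np.ones(len(X)), X]) @ coef

def normal_scores(values):
    """Rank-based normal scores (assumed stand-in for a Gaussian anamorphosis)."""
    ranks = np.argsort(np.argsort(values)) + 1        # ranks 1..n
    p = (ranks - 0.5) / len(values)                   # strictly inside (0, 1)
    nd = NormalDist()
    return np.array([nd.inv_cdf(pi) for pi in p])

def exceedance_probability(krig_mean, krig_var, score_threshold):
    """P(score > threshold) for a Gaussian predictive distribution.

    Assumes krig_var > 0 and a threshold already mapped to the score scale.
    """
    dist = NormalDist(mu=krig_mean, sigma=krig_var ** 0.5)
    return 1.0 - dist.cdf(score_threshold)

# Synthetic data (hypothetical covariates, districts, and concentrations).
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))
district = rng.integers(0, 3, size=n)
noise = rng.normal(scale=0.2, size=n)
conc = 10 ** (1.0 + 0.3 * X[:, 0] + 0.1 * district * X[:, 1] + noise)

y = np.log10(conc)                                    # (i) log10 transform
global_coef = fit_linear(X, y)                        # (ii) global linear model
resid = y - predict_linear(global_coef, X)
for d in np.unique(district):                         # (iii) local refinements
    m = district == d
    local_coef = fit_linear(X[m], resid[m])
    resid[m] -= predict_linear(local_coef, X[m])
scores = normal_scores(resid)                         # (iv) transformed residuals
# (v) ordinary kriging of `scores` would follow here; it is not implemented.
```

In this sketch the prediction decomposes into a global trend, a district-level correction, and a spatially interpolated residual, which is the kind of decomposition the abstract describes as transparent.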

Bias correction is performed by estimating the total exceedance area implied by the data and determining a calibrated probability threshold that ensures an unbiased delineation of the polluted area. In this study, we jointly evaluate a threshold exceedance criterion and a temporal trend criterion.
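
One way to read the calibration step is sketched below, under stated assumptions: grid cells have equal area, the total exceedance area implied by the data is taken as the mean of the per-cell exceedance probabilities, and the calibrated threshold is the probability value at which the flagged area matches that expected area. The function names and the example probabilities are hypothetical, and the authors' actual procedure may differ.

```python
import numpy as np

def calibrated_threshold(probs):
    """Probability cutoff whose flagged area matches the expected exceedance area.

    Assumes equal-area cells, so the expected exceeding fraction is mean(probs).
    """
    probs = np.asarray(probs, dtype=float)
    n_flag = int(round(probs.mean() * probs.size))    # expected number of cells
    n_flag = min(n_flag, probs.size)
    if n_flag == 0:
        return np.inf                                 # nothing gets flagged
    # The n_flag-th largest probability; ties at this value may flag extra cells.
    return np.sort(probs)[::-1][n_flag - 1]

def delineate(probs):
    """Boolean mask of cells classified as polluted."""
    probs = np.asarray(probs, dtype=float)
    return probs >= calibrated_threshold(probs)

# Hypothetical exceedance probabilities for six equal-area cells.
probs = np.array([0.48, 0.46, 0.44, 0.42, 0.10, 0.10])
threshold = calibrated_threshold(probs)
mask = delineate(probs)
naive_mask = probs >= 0.5                             # fixed 0.5 cutoff
```

In this toy example the probabilities imply that about two of six cells exceed the limit, yet a fixed 0.5 cutoff flags none of them; the calibrated cutoff flags the two most likely cells instead.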

Groundwater nitrate mapping at national scale represents a challenging test case due to strong non-normality, spatial heterogeneity, and pronounced non-stationarity. The approach nonetheless performs robustly. Linear model components exhibit R² values between 0.15 and 0.62, while semivariogram practical ranges vary from 0.3 to 22.3 km. In the sub-region comparison, QRF showed a small discrimination advantage over TRGK (AUC 0.86 vs. 0.82) but relied more heavily on calibration (underestimation of the polluted area without calibration: 94.9% vs. 5.1%, respectively).

Overall, the results demonstrate that the bias-corrected, probability-based framework provides a flexible, robust and, when coupled with geostatistics, transparent solution for large-scale pollution mapping.

How to cite: Frank, J., Suesse, T., Jiang, S., and Brenning, A.: Bias-corrected pollution mapping with non-stationary geostatistics and spatial machine learning for environmental decision making: The case of groundwater nitrate, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-1406, https://doi.org/10.5194/egusphere-egu26-1406, 2026.