EGU24-12713, updated on 09 Mar 2024
https://doi.org/10.5194/egusphere-egu24-12713
EGU General Assembly 2024
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Data-driven models for groundwater nitrate contamination prediction: A nation-wide approach for Mexico

Jurgen Mahlknecht1, Juan Antonio Torres-Martínez1, Abrahan Mora1, Manish Kumar1, Dugin Kaown2, and Frank J Loge3
  • 1Escuela de Ingeniería y Ciencias, Tecnologico de Monterrey, Monterrey, Mexico (jurgen@tec.mx)
  • 2School of Earth and Environmental Sciences, Seoul National University, Seoul, South Korea
  • 3Department of Civil and Environmental Engineering, University of California Davis, Davis, United States of America

Nitrate (NO3-N) is one of the most prevalent chemical contaminants in groundwater, with potential repercussions for both the environment and public health. However, monitoring of this parameter on a national scale is notably limited, especially in developing regions. To address this gap, we applied four machine learning (ML) algorithms (Extreme Gradient Boosting, Boosted Regression Trees, Random Forest, and Support Vector Machines) to predict NO3-N concentrations in groundwater. The algorithms were validated through comprehensive application across Mexico. The models initially considered 68 covariates and identified significant predictors of NO3-N concentration spanning climate, geomorphology, soil, hydrogeology, and human factors. We achieved outstanding performance with about 10 times less data than previous large-scale assessments, and thus efficiently countered the challenge of limited data availability and sparse monitoring stations. This success can be attributed mainly to the 'Support Points-based Split Approach' applied during pre-processing, which effectively transformed the limited national groundwater quality database into spatial points suitable for building train/test datasets. Areas with NO3-N concentrations exceeding the drinking water standard (>10 mg/L) were identified, notably in the north-central and northeast regions of the country, and were linked to agricultural and industrial activities. People living in these regions face potential exposure to elevated NO3-N levels in groundwater. These NO3-N hotspots align with reported health implications such as gastric and colorectal cancer. This study not only showcases the potential of ML in data-scarce regions but also provides actionable insights for policy and management strategies.
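
The abstract does not say how the split was implemented. As a rough illustration of the general idea behind a distribution-aware train/test split, the Python sketch below greedily picks test points that minimise the energy distance between the test subset and the full dataset. This is a simplified stand-in, not the authors' Support Points-based Split Approach, and the function names, synthetic well coordinates, and sample sizes are all assumptions made for illustration. The last helper flags concentrations above the 10 mg/L standard quoted above.

```python
import math
import random

NITRATE_LIMIT_MG_L = 10.0  # drinking water standard for NO3-N quoted in the abstract


def greedy_energy_split(points, n_test):
    """Return (train_idx, test_idx).

    Hypothetical helper: test points are added one at a time so that each
    addition minimises the energy distance between the test subset and the
    full dataset (a greedy approximation, not an exact optimum).
    """
    n = len(points)
    if not 0 < n_test < n:
        raise ValueError("n_test must be between 1 and len(points) - 1")
    # Mean distance from every point to the whole dataset (cross term).
    mean_to_all = [sum(math.dist(p, q) for q in points) / n for p in points]
    # Running sum of distances from each candidate to the points chosen so far.
    to_chosen = [0.0] * n
    chosen = []
    cross_sum = 0.0   # sum of mean_to_all over chosen points
    within_sum = 0.0  # sum of distances over ordered pairs of chosen points
    remaining = list(range(n))
    for _ in range(n_test):
        m = len(chosen) + 1

        def score(c):
            # Energy distance up to a constant that does not depend on c.
            cross = cross_sum + mean_to_all[c]
            within = within_sum + 2.0 * to_chosen[c]
            return 2.0 * cross / m - within / (m * m)

        best = min(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
        cross_sum += mean_to_all[best]
        within_sum += 2.0 * to_chosen[best]
        for c in remaining:
            to_chosen[c] += math.dist(points[c], points[best])
    return remaining, chosen


def exceeds_standard(conc_mg_l, limit=NITRATE_LIMIT_MG_L):
    """Flag concentrations strictly above the limit (in mg/L)."""
    return [c > limit for c in conc_mg_l]


# Synthetic demo data: 50 made-up well locations in a unit square.
rng = random.Random(0)
wells = [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(50)]
train_idx, test_idx = greedy_energy_split(wells, n_test=10)
flags = exceeds_standard([2.5, 10.0, 14.2])
```

In a real workflow the points would hold the standardised covariates and the response together, so that the test set resembles the joint distribution and not only the well locations. Any regression model could then be fitted on the rows in `train_idx` and evaluated on those in `test_idx`.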

How to cite: Mahlknecht, J., Torres-Martínez, J. A., Mora, A., Kumar, M., Kaown, D., and Loge, F. J.: Data-driven models for groundwater nitrate contamination prediction: A nation-wide approach for Mexico, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-12713, https://doi.org/10.5194/egusphere-egu24-12713, 2024.