EGU26-7015, updated on 14 Mar 2026
https://doi.org/10.5194/egusphere-egu26-7015
EGU General Assembly 2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
Poster | Wednesday, 06 May, 08:30–10:15 (CEST) | Display time Wednesday, 06 May, 08:30–12:30 | Hall X4, X4.27
Predicting CEC concentration in soil using machine learning algorithms
Mirjana Radulović, Branislav Živaljević, Maria Kireeva, and Gordan Mimić
  • BioSense Institute, University of Novi Sad, Serbia

Contaminants of emerging concern (CECs) have received increasing attention due to their persistence and potential ecological risks in freshwater environments. However, their spatial patterns, source contributions, and transfer from aquatic systems to surrounding soils remain insufficiently understood. Moreover, CECs are often poorly regulated, partly because the interactions between individual contaminants and groups of contaminants are complex, which makes the risks they pose difficult to assess. Given their adverse effects on ecosystems and human health through direct and indirect exposure, this study presents a first attempt to predict the occurrence of CECs in soil in a predominantly agricultural area of Serbia using machine learning techniques.

The investigation was conducted along the Veliki Bački Canal in Vojvodina (Serbia), which is one of the main sources of irrigation water. First, concentrations of CECs in canal water were measured and the spatial distribution was mapped for the most frequently detected substances, including 4-acetamidoantipyrine, acesulfame calcium, estradiol, venlafaxine, and carbamazepine. Based on these results, representative locations for soil sampling on agricultural land were selected, and two soil sampling campaigns were carried out, in which 96 samples were collected.

The analysis revealed that the dominant soil contaminants were primarily of industrial origin, such as tributyl phosphate, dodecyl sulfate, 2,5-di-tert-butylhydroquinone, and triethylene glycol bis(2-ethylhexanoate). Using soil and terrain characteristics as predictor variables, three machine learning algorithms were trained and evaluated: Multiple Linear Regression, Support Vector Machine, and Random Forest. Random Forest models showed strong predictive capability, particularly for industrial contaminants such as tributyl phosphate, with a coefficient of determination (R²) of 0.65 and a mean squared error of 38.45 ng/g. In only one case, the prediction of 2,5-di-tert-butylhydroquinone, did the Multiple Linear Regression model outperform Random Forest. Feature importance analysis indicated that soil sand content and flow accumulation were the most influential factors controlling contaminant distribution in soil.
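
For readers less familiar with the two evaluation metrics named above, the following minimal sketch shows how R² and mean squared error are commonly computed from observed and predicted concentrations. It is an illustration under stated assumptions, not the authors' evaluation code: the function names and the four concentration values are hypothetical and are not study data.

```python
from math import sqrt


def mean_squared_error(observed, predicted):
    """Mean of the squared residuals; units are the square of the target's units."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observations are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical concentrations in ng/g (illustrative only, not study data).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

mse = mean_squared_error(observed, predicted)  # in (ng/g)^2
rmse = sqrt(mse)                               # back in ng/g
r2 = r_squared(observed, predicted)

print(f"MSE = {mse:.2f} (ng/g)^2, RMSE = {rmse:.2f} ng/g, R^2 = {r2:.3f}")
```

With a target measured in ng/g, the mean squared error defined this way carries units of (ng/g)², while its square root (RMSE) is expressed in ng/g and is therefore directly comparable with the measured concentrations.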

Although model performance is constrained by limited soil sampling data, the proposed framework provides a robust foundation for predicting soil contamination patterns and supports improved risk assessment and monitoring strategies in freshwater-influenced agricultural landscapes.

How to cite: Radulović, M., Živaljević, B., Kireeva, M., and Mimić, G.: Predicting CEC concentration in soil using machine learning algorithms, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-7015, https://doi.org/10.5194/egusphere-egu26-7015, 2026.