Combining compositional data analysis and machine learning to recognize where soil geochemistry is influenced by the presence of pyroclastic covers in Campania region (Southern Italy)
- 1Department of Science and Technology, University of Sannio, Benevento, Italy (cidom@unisannio.it)
- 2Department of Earth, Environmental and Resources Sciences, University of Naples Federico II, Naples, Italy (stefano.albanese@unina.it)
- 3Department of Law, Economics, Management and Quantitative Methods, University of Sannio, Benevento, Italy (alucadam@unisannio.it)
In 2015, an environmental monitoring plan (http://www.campaniatrasparente.it) was launched to assess the condition of all environmental compartments (air, water, top and bottom soils, vegetables, biological samples) of the Campania region. A total of 5,333 topsoil samples were collected and analysed to determine the concentrations of 52 chemical elements by aqua regia digestion followed by ICP-MS. The main aim of the prospecting campaign was to establish the natural geochemical background ranges for a few potentially toxic elements (PTEs), to be used as a reference for defining the degree of contamination of anthropized areas.
In the study area (about 13,600 km²), four volcanic areas are present, and their pyroclastic products are spread across the regional territory owing to their common explosive (Plinian) behaviour.
Because soils developed on pyroclastic products are naturally enriched in some PTEs, discriminating anthropic signals from natural ones using geochemical data is not a simple task in Campania. Therefore, as preparatory work, we trained five machine learning algorithms (MLAs) to recognize when soil geochemistry is linked to the presence of volcanic products, so as to precisely identify the areas of the region mantled by “volcanic” soils. All MLAs were applied to centred log-ratio transformed data to reduce the closure and scaling effects that commonly affect geochemical data. In total, 1277 volcanic soils (VS) and 353 non-volcanic soils (NVS) were selected for the training phase. VS samples were selected based on their proximity to the volcanic centres, excluding highly anthropized areas. NVS samples were selected by consulting the available detailed geological maps of areas located far away from the volcanic areas, where pyroclastic covers are completely absent. During the training phase, a cross-validation procedure was applied for parameter optimization. In the test phase, all the MLAs showed an accuracy greater than 98%, and the Random Forest algorithm proved to be the most accurate for predicting the remaining 3903 unlabelled soils. As a result, a total of 1739 samples were classified as NVS and 2164 as VS. A subsequent comparison of the results with existing distribution models of volcanic products showed that samples classified as VS mainly fall in areas characterized by thick pyroclastic fall deposits, normally related to i) eruptions that occurred in the last 10 ky; ii) the Campanian Ignimbrite eruption (ca. 39 ky BP); iii) the Codola eruption (ca. 25 ky BP).
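As an illustration of the centred log-ratio (clr) transformation mentioned above, the following is a minimal sketch in Python using NumPy. It is not the authors' implementation: the function name `clr`, the three-element toy compositions and the choice to reject non-positive values are all assumptions made for the example. It shows the two properties that motivate the transform: each transformed row sums to zero, and rescaling a sample (a change of units or of the closure constant) leaves its clr values unchanged.

```python
import numpy as np


def clr(compositions):
    """Centred log-ratio transform of strictly positive compositional rows.

    Hypothetical helper for illustration; rows are samples, columns are parts.
    """
    x = np.asarray(compositions, dtype=float)
    if np.any(x <= 0):
        # Zeros or censored values must be handled before taking logarithms
        raise ValueError("clr requires strictly positive parts")
    log_x = np.log(x)
    # Subtracting the row mean of the logs equals dividing by the geometric mean
    return log_x - log_x.mean(axis=1, keepdims=True)


# Invented concentrations for three elements in two samples;
# the second row is the first one divided by 10 (same proportions)
samples = np.array([[10.0, 20.0, 40.0],
                    [1.0, 2.0, 4.0]])
transformed = clr(samples)
print(transformed)
```

Because both toy samples have the same proportions, they map to the same clr vector, which is the scale invariance that makes the transformed data suitable for distance-based or tree-based classifiers.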
The MLA results suggested that the most important chemical variables for this classification were Ni, Cs, Ca, Co, Rb, Sc, Mn, U and Na. It is also evident that a first classification could be made using only a few of these elements. Our findings could serve as a valuable tool to better discriminate soil nature and geochemical characteristics, with the aim of a more effective assessment of natural background ranges for those elements sourced by both natural processes and human activities.
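The training, testing and variable-ranking steps summarised above could be sketched as follows for one of the five algorithms, a Random Forest, using scikit-learn. This is only an illustration under stated assumptions: the data are synthetic random numbers standing in for clr-transformed concentrations, the class signal is injected artificially into the first two columns and says nothing about real Campania soils, and the parameter grid, split ratio and label coding (1 = VS, 0 = NVS) are hypothetical choices, not those of the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)
elements = ["Ni", "Cs", "Ca", "Co", "Rb", "Sc"]  # a few names, for labelling only

# Synthetic stand-in for clr-transformed data (assumption, not real measurements)
n_samples = 300
y = rng.integers(0, 2, size=n_samples)  # hypothetical coding: 1 = VS, 0 = NVS
X = rng.normal(size=(n_samples, len(elements)))
X[:, 0] -= 3.0 * y  # artificial class signal in the first column
X[:, 1] += 3.0 * y  # artificial class signal in the second column

# Hold out a test set; stratify so both classes appear in each split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# 5-fold cross-validation over a small, hypothetical parameter grid
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [None, 5]},
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# Accuracy of the refitted best model on the held-out test set
test_accuracy = search.score(X_test, y_test)
print(f"test accuracy: {test_accuracy:.3f}")

# Rank variables by impurity-based importance of the best forest
best_forest = search.best_estimator_
ranking = sorted(
    zip(elements, best_forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```

In this toy setting the two columns carrying the artificial signal rank highest, which mirrors the idea that a first classification could rely on only a few variables. Impurity-based importances are one of several possible ranking measures; the abstract does not state which one was used.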
How to cite: Ambrosino, M., Albanese, S., Lucadamo, A., and Cicchella, D.: Combining compositional data analysis and machine learning to recognize where soil geochemistry is influenced by the presence of pyroclastic covers in Campania region (Southern Italy), EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-8484, https://doi.org/10.5194/egusphere-egu23-8484, 2023.