EGU21-2485, updated on 03 Mar 2021
https://doi.org/10.5194/egusphere-egu21-2485
EGU General Assembly 2021
© Author(s) 2021. This work is distributed under
the Creative Commons Attribution 4.0 License.

Hybrid feature selection and machine learning approaches for assessing the arsenic awareness of local farming communities in Bengal Basin

Debasish Mishra¹, Bhabani S. Das¹, and Manoj Menon²
  • ¹Indian Institute of Technology Kharagpur, WB, 721302, India
  • ²Department of Geography, University of Sheffield, S102TN, United Kingdom

High levels of arsenic in drinking water and food continue to pose a global health challenge. In Bangladesh (BD) and the Indian state of West Bengal (WB) alone, over 127 million people are exposed to elevated levels of arsenic in drinking water. Despite decades of research and outreach, arsenic awareness in communities remains low, and very few studies have reported on arsenic awareness among low-income farming communities. A comprehensive approach to assessing arsenic awareness is therefore a key step in identifying research and development priorities, so that appropriate stakeholder engagement can be designed to tackle the arsenic menace. In this study, we developed a 12-point comprehensive arsenic awareness index (CAAI) and identified key awareness drivers (KADs) associated with the CAAI by applying hybrid feature selection to responses from surveys conducted in arsenic-affected areas of WB and BD. The two questionnaire surveys comprised 73 questions each, covering the health, water and community, and food-related aspects of arsenic contamination. Comparison of CAAIs showed that the BD farmers were generally more arsenic-aware (CAAI = 7.7) than the WB farmers (CAAI = 6.8). Interestingly, the reverse was true for awareness of arsenic in the food chain. Hybrid feature selection identified 15 KADs, which included factors related to stakeholder interventions and cropping practices. Including the Boruta wrapper in the hybrid feature selection helped discard variables that were chi-square (χ²) significant (p < 0.05) but only randomly associated with the CAAI; these included commonly assumed socio-economic factors such as age, gender and income. An inter-comparison of KADs revealed differences in the objectives of, and the importance placed on, various interventions under different government regimes for tackling the arsenic menace.
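The two-stage idea behind the hybrid feature selection (a chi-square filter followed by a Boruta-type wrapper) can be illustrated with a minimal sketch. This is not the authors' pipeline: the real Boruta algorithm compares random forest importances of each variable against shuffled "shadow" copies and applies a statistical test, whereas the sketch below reuses the chi-square statistic as a stand-in importance score so that it runs with the Python standard library alone. The variable names, the toy values, the number of rounds and the 0.9 hit-rate threshold are all assumptions made for illustration; none of them come from the survey.

```python
import random

def chi_square(xs, ys):
    """Pearson chi-square statistic for two categorical sequences."""
    n = len(xs)
    x_counts, y_counts, joint = {}, {}, {}
    for x, y in zip(xs, ys):
        x_counts[x] = x_counts.get(x, 0) + 1
        y_counts[y] = y_counts.get(y, 0) + 1
        joint[(x, y)] = joint.get((x, y), 0) + 1
    stat = 0.0
    for x, cx in x_counts.items():
        for y, cy in y_counts.items():
            expected = cx * cy / n
            stat += (joint.get((x, y), 0) - expected) ** 2 / expected
    return stat

# Critical chi-square values at p = 0.05, keyed by degrees of freedom.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def filter_stage(features, target):
    """Stage 1: keep features that are chi-square significant at p < 0.05."""
    kept = []
    target_levels = len(set(target))
    for name, values in features.items():
        dof = (len(set(values)) - 1) * (target_levels - 1)
        if dof in CRITICAL_05 and chi_square(values, target) > CRITICAL_05[dof]:
            kept.append(name)
    return kept

def shadow_stage(features, target, candidates,
                 n_rounds=50, min_hit_rate=0.9, seed=0):
    """Stage 2 (Boruta-style, simplified): a feature scores a hit in a round
    when its score beats the best score among shuffled shadow copies."""
    rng = random.Random(seed)
    scores = {name: chi_square(features[name], target) for name in candidates}
    hits = {name: 0 for name in candidates}
    for _ in range(n_rounds):
        best_shadow = 0.0
        for name in candidates:
            shadow = list(features[name])
            rng.shuffle(shadow)  # breaks any real link with the target
            best_shadow = max(best_shadow, chi_square(shadow, target))
        for name in candidates:
            if scores[name] > best_shadow:
                hits[name] += 1
    return [n for n in candidates if hits[n] / n_rounds >= min_hit_rate]

# Synthetic toy data (hypothetical variables, NOT survey responses).
target = ["low"] * 20 + ["high"] * 20
features = {
    "attended_training": ["no"] * 20 + ["yes"] * 20,
    "media_exposure": ["yes"] * 6 + ["no"] * 14 + ["yes"] * 14 + ["no"] * 6,
    "age_group": ["young", "old"] * 20,
}

candidates = filter_stage(features, target)
# -> ['attended_training', 'media_exposure']
selected = shadow_stage(features, target, candidates)
print(candidates, selected)
```

In this toy setup the filter drops the variable that is unrelated to the target, and the shadow stage then asks whether each surviving variable consistently outperforms randomised copies. Whether a borderline variable survives the second stage depends on the data, the number of rounds and the threshold, which is why the sketch makes no claim about it.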
Hence, the combination of CAAI and KADs revealed contrasting arsenic awareness between the two farming communities, despite their cultural similarities. To analyse the predictive power of the KADs for the CAAI, both linear and non-linear machine learning (ML) models were deployed. Among the ML algorithms, classification and regression trees and a single C5.0 tree could estimate CAAIs with an average accuracy of 84%. Both communities agreed on policy changes for water testing and clean water supply, whereas both placed less importance on testing food for arsenic. Specifically, our study shows the need to increase awareness of risks through the food chain in BD, whereas in WB awareness campaigns should be strengthened to raise overall awareness, possibly through media channels, which were deemed effective in BD. Overall, this study addresses UN sustainable development goals (SDGs) such as clean water and sanitation (SDG6), zero hunger (SDG2), and good health and well-being (SDG3), and echoes the WHO's comprehensive action plan involving water testing, awareness-building campaigns, and mitigation options to combat arsenic toxicity.
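As a rough illustration of how a tree-based classifier turns categorical drivers into an awareness class and how its accuracy is scored, the sketch below fits a single CART-style split (a decision stump) chosen by Gini impurity and evaluates it on held-out rows. Several things here are assumptions rather than details from the study: the CAAI is treated as a two-class label ("high"/"low"), the variable names and toy rows are hypothetical, and a one-split stump is far simpler than the full CART and C5.0 trees named above (C5.0 in particular uses an information-based criterion, not Gini).

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def partition(rows, labels, feature, value):
    """Split labels by whether row[feature] equals value."""
    left = [y for row, y in zip(rows, labels) if row[feature] == value]
    right = [y for row, y in zip(rows, labels) if row[feature] != value]
    return left, right

def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(rows, labels):
    """Pick the equality test with the lowest weighted Gini impurity."""
    best, best_score = None, float("inf")
    for feature in rows[0]:
        for value in sorted({row[feature] for row in rows}):
            left, right = partition(rows, labels, feature, value)
            if not left or not right:
                continue  # a split must send rows to both sides
            score = (len(left) * gini(left)
                     + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (feature, value), score
    if best is None:
        raise ValueError("no feature separates the rows")
    left, right = partition(rows, labels, *best)
    return best[0], best[1], majority(left), majority(right)

def predict(stump, row):
    feature, value, left_label, right_label = stump
    return left_label if row[feature] == value else right_label

def accuracy(stump, rows, labels):
    """Share of rows whose predicted class matches the true class."""
    correct = sum(predict(stump, row) == y for row, y in zip(rows, labels))
    return correct / len(labels)

# Synthetic toy rows (hypothetical drivers, NOT survey responses).
train_rows = [
    {"water_tested": "yes", "grows_rice": "yes"},
    {"water_tested": "yes", "grows_rice": "no"},
    {"water_tested": "yes", "grows_rice": "yes"},
    {"water_tested": "no", "grows_rice": "yes"},
    {"water_tested": "no", "grows_rice": "no"},
    {"water_tested": "no", "grows_rice": "no"},
]
train_labels = ["high", "high", "high", "low", "low", "high"]

test_rows = [
    {"water_tested": "yes", "grows_rice": "no"},
    {"water_tested": "no", "grows_rice": "yes"},
    {"water_tested": "no", "grows_rice": "yes"},
    {"water_tested": "yes", "grows_rice": "yes"},
]
test_labels = ["high", "low", "high", "high"]

stump = fit_stump(train_rows, train_labels)
print(stump, accuracy(stump, test_rows, test_labels))  # accuracy is 0.75 here
```

Scoring on rows that were not used for fitting, as done with `test_rows` here, is what makes an accuracy figure a statement about predictive power rather than about how well the tree memorised its training data.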

How to cite: Mishra, D., Das, B. S., and Menon, M.: Hybrid feature selection and machine learning approaches for assessing the arsenic awareness of local farming communities in Bengal Basin, EGU General Assembly 2021, online, 19–30 Apr 2021, EGU21-2485, https://doi.org/10.5194/egusphere-egu21-2485, 2021.

Corresponding displays formerly uploaded have been withdrawn.