EGU24-16338, updated on 09 Mar 2024
https://doi.org/10.5194/egusphere-egu24-16338
EGU General Assembly 2024
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Using explainable machine learning to better understand source and process contributions to atmospheric bio-aerosol

Hao Zhang1, Congbo Song2, David Topping1, Ian Crawford1, Martin Gallagher1, Man Nin Chan3, Hing Bun Martin Lee3, Sinan Xing3, Tsin Hung Ng3, and Amos Tai3
  • 1The University of Manchester, Center of Atmospheric Sciences, Department of Earth and Environmental Sciences, Manchester, United Kingdom (hao.zhang-26@postgrad.manchester.ac.uk)
  • 2National Centre for Atmospheric Science (NCAS), The University of Manchester, Manchester, United Kingdom
  • 3Faculty of Science, The Chinese University of Hong Kong, Hong Kong, China

The role of atmospheric bioaerosols as determinants of environmental and human health outcomes is receiving increasing attention. However, a lack of fully evaluated end-to-end detection techniques hinders our ability to identify bioaerosol types and their environmental drivers, particularly in complex environments. In this study we address these challenges by developing a novel machine learning framework that combines unsupervised deep learning with explainable machine learning techniques. The first step combines a bidirectional long short-term memory autoencoder (BiLSTM-AE) with a relatively new, fast hierarchical clustering technique. Our results indicate that this approach outperforms other models, successfully distinguishing between fungal spores, non-biological aerosols, and pollen based solely on fluorescence information, without the need for training data. Subsequently, using automated machine learning and the SHapley Additive exPlanations (SHAP) method, we quantified the environmental drivers of each bioaerosol type. The variation in SHAP values indicated that the elevated pollen concentrations at night could be attributed to changes in air mass composition and origin. More importantly, we find ambient evidence that pollen may break into smaller fragments when relative humidity (RH) exceeds 90%, leading to significant changes in its fluorescence spectrum and a rapid increase in its concentration. Overall, we find that combining unsupervised deep learning and explainable machine learning can provide new insights into type-specific bioaerosol processes.
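
To illustrate the attribution idea behind SHAP, the following minimal sketch computes exact Shapley values for a single sample of a toy concentration model. It is not the authors' framework and does not use the `shap` library or automated machine learning. The function `toy_model`, its coefficients, the feature names (RH, wind speed, temperature), and the baseline and sample values are all hypothetical and chosen only for illustration. Features outside a coalition are replaced by their baseline value, which is one common convention for defining the value function.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one sample x, relative to a baseline sample.

    Features in a coalition take their value from x; all other features
    take their value from the baseline (an illustrative convention).
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (model(with_i) - model(without_i))
    return phi


def toy_model(features):
    """Hypothetical concentration model; all coefficients are invented."""
    rh, wind, temp = features
    humid_boost = 50.0 if rh > 90.0 else 0.0  # hypothetical step at high RH
    return 10.0 + humid_boost + 2.0 * temp - 1.0 * wind


baseline = [60.0, 3.0, 20.0]  # hypothetical reference conditions
sample = [95.0, 2.0, 22.0]    # hypothetical humid, calm, slightly warmer case
phi = shapley_values(toy_model, sample, baseline)

for name, value in zip(["RH", "wind speed", "temperature"], phi):
    print(f"{name}: {value:+.2f}")
```

By construction, the attributions sum to the difference between the model output for the sample and for the baseline. This additivity property is what allows SHAP values to be read as per-feature contributions to a prediction. Exact enumeration scales exponentially with the number of features, so practical tools rely on approximations or model-specific algorithms.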

How to cite: Zhang, H., Song, C., Topping, D., Crawford, I., Gallagher, M., Chan, M. N., Lee, H. B. M., Xing, S., Ng, T. H., and Tai, A.: Using explainable machine learning to better understand source and process contributions to atmospheric bio-aerosol, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-16338, https://doi.org/10.5194/egusphere-egu24-16338, 2024.