EGU24-2504, updated on 08 Mar 2024
https://doi.org/10.5194/egusphere-egu24-2504
EGU General Assembly 2024
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Estimation of daily NO2 with explainable machine learning model in China, 2007-2020

Yanchuan Shao1,2, Wei Zhao1, Riyang Liu1, Jianxun Yang1, Miaomiao Liu1, Wen Fang1, Litiao Hu1, Matthew Adams2, Jun Bi1,3, and Zongwei Ma1,3
  • 1Nanjing University, School of Environments, Environmental planning, China (ycshaoes@gmail.com)
  • 2Department of Geography & Planning, University of Toronto Mississauga, Mississauga, Canada
  • 3Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology (CICAEET), Nanjing University of Information Science & Technology, Nanjing, Jiangsu, China

Surface nitrogen dioxide (NO2) is an effective indicator of anthropogenic combustion and is associated with the regional burden of disease. Although satellite-borne column NO2 is widely used to derive surface concentrations through integration with sophisticated models, long-term, full-coverage estimation is hindered by incomplete satellite retrievals. Moreover, the mechanistic relationship between surface and tropospheric NO2 is often ignored in machine learning (ML) approaches. Here we develop a gap-filling method that obtains full-coverage column NO2 by fusing satellite data from different sources. Surface NO2 in China is then estimated for 2007-2020 using the XGBoost model, with a daily out-of-sample cross-validation (CV) R² of 0.75 and a root-mean-square error (RMSE) of 9.11 µg/m³. The back-extrapolation performance is verified through by-year CV (daily R² = 0.60 and RMSE = 11.46 µg/m³) and external estimation in Taiwan before 2013 (daily R² = 0.69 and RMSE = 8.59 µg/m³). We explore variable impacts in three hotspots of eastern China using SHAP (SHapley Additive exPlanations) values. We find that column NO2 makes a driving contribution to the variation of ground-level pollution during 2007-2020 (average SHAP = 5.09 µg/m³, compared with a baseline concentration of 33.39 µg/m³). The estimated effect is also compared with that of an ordinary least squares (OLS) model to provide a straightforward understanding. The related health burden is further calculated from the annual NO2 concentrations. We demonstrate that employing an explainable ML model is beneficial for comprehending the coupled relationships in surface NO2 change.
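The by-year CV and the SHAP-versus-baseline comparison described above can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes synthetic data, two hypothetical predictors (column_no2, temperature), and an OLS model standing in for XGBoost so that the example runs with NumPy alone. For a linear model with independent features, the SHAP value of a feature has the closed form coefficient × (value − feature mean), and the baseline is the mean prediction.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns (intercept, coefficients)."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta[0], beta[1:]

def r2_rmse(y_true, y_pred):
    """Coefficient of determination and root-mean-square error."""
    resid = y_true - y_pred
    ss_res = float(np.sum(resid ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return 1.0 - ss_res / ss_tot, float(np.sqrt(np.mean(resid ** 2)))

def by_year_cv(X, y, years):
    """Hold out each year in turn, train on the others, pool held-out predictions."""
    pred = np.empty(len(y), dtype=float)
    for yr in np.unique(years):
        test = years == yr
        b0, b = fit_ols(X[~test], y[~test])
        pred[test] = b0 + X[test] @ b
    return r2_rmse(y, pred)

def linear_shap(X, coef):
    """Closed-form SHAP values for a linear model (assumes independent features)."""
    return coef * (X - X.mean(axis=0))

# Synthetic stand-in data (assumption): 100 samples per year, 2007-2020.
rng = np.random.default_rng(0)
years = np.repeat(np.arange(2007, 2021), 100)
n = len(years)
column_no2 = rng.gamma(shape=2.0, scale=3.0, size=n)   # hypothetical predictor
temperature = rng.normal(15.0, 8.0, size=n)            # hypothetical predictor
X = np.column_stack([column_no2, temperature])
surface_no2 = 10.0 + 3.0 * column_no2 - 0.2 * temperature + rng.normal(0.0, 2.0, size=n)

# By-year cross-validation: every prediction comes from a model that never saw that year.
cv_r2, cv_rmse = by_year_cv(X, surface_no2, years)

# SHAP decomposition of the full-data model: prediction = baseline + sum of SHAP values.
intercept, coef = fit_ols(X, surface_no2)
shap_values = linear_shap(X, coef)
baseline = float(np.mean(intercept + X @ coef))
mean_abs_shap = np.abs(shap_values).mean(axis=0)

print(f"by-year CV R2 = {cv_r2:.2f}, RMSE = {cv_rmse:.2f}")
print(f"baseline = {baseline:.2f}, mean |SHAP| = {mean_abs_shap}")
```

Holding out whole years, rather than random samples, is what makes the score a test of back-extrapolation to periods with no ground monitoring. With a tree ensemble such as XGBoost the closed form above no longer applies, and SHAP values would be computed with a tree-specific explainer instead.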

How to cite: Shao, Y., Zhao, W., Liu, R., Yang, J., Liu, M., Fang, W., Hu, L., Adams, M., Bi, J., and Ma, Z.: Estimation of daily NO2 with explainable machine learning model in China, 2007-2020, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-2504, https://doi.org/10.5194/egusphere-egu24-2504, 2024.

Supplementary materials

supplementary materials version 1 – uploaded on 18 Apr 2024, no comments