EGU23-8715, updated on 26 Feb 2023
https://doi.org/10.5194/egusphere-egu23-8715
EGU General Assembly 2023
© Author(s) 2023. This work is distributed under
the Creative Commons Attribution 4.0 License.

Modelling of near-surface NO2 and O3 concentration over Germany using machine learning 

Vigneshkumar Balamurugan1, Jia Chen1, Adrian Wenzel1, and Frank N. Keutsch2,3
  • 1Environmental sensing and modelling, Technical University of Munich, Germany
  • 2School of Engineering and Applied Science, Harvard University, Cambridge, MA, USA.
  • 3Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA

Chemical transport models (CTMs) are commonly used to model air pollutant concentrations. However, CTMs are computationally expensive and can yield biased results that stem from the emission inventories and chemical mechanisms they employ. Machine learning algorithms are used in a wide range of fields, including Earth system science; their popularity stems from their ability to learn complex non-linear relationships. As a follow-up to our previous study [1], we assess the capability of machine learning (ML) to model air pollutant concentrations.

In this study, we employed the Gradient Boosted Tree (GBT) algorithm to model near-surface NO2 and O3 over Germany at 0.1 degree resolution and daily intervals. The GBT model is trained on TROPOMI satellite column NO2, O3, and HCHO data, together with meteorology and road density, the latter serving as a proxy for NOX emission sources. Government air quality (NO2 and O3) observations from urban, suburban, and background stations are used as target variables; 321 stations are used to train the NO2 model and 256 stations to train the O3 model. The GBT model trained for near-surface NO2 explains 68-88% of the variability in observed concentrations, whereas the GBT model for near-surface O3 explains 74-92%.
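To illustrate the idea of gradient boosting described above, the following is a minimal sketch and not the authors' implementation. It fits boosted depth-1 trees (stumps) to squared-error residuals using only the Python standard library. The data are synthetic, and the feature names (`column_no2`, `road_density`, `wind_speed`), the coefficients, and all hyperparameters are assumptions chosen for illustration. The share of variability explained is computed as the coefficient of determination, R2.

```python
import random

random.seed(0)


def fit_stump(X, residuals):
    """Return (feature, threshold, left_mean, right_mean) minimising squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            if not left or not right:
                continue
            lm = sum(left) / len(left)
            rm = sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, lm, rm)
    return best[1:]


def fit_gbt(X, y, n_rounds=40, learning_rate=0.3):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        j, thr, lm, rm = fit_stump(X, residuals)
        stumps.append((j, thr, lm, rm))
        pred = [p + learning_rate * (lm if row[j] <= thr else rm)
                for row, p in zip(X, pred)]
    return base, stumps, learning_rate


def predict(model, X):
    base, stumps, lr = model
    return [base + sum(lr * (lm if row[j] <= thr else rm) for j, thr, lm, rm in stumps)
            for row in X]


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def make_rows(n):
    """Synthetic stand-in for station data (hypothetical features and coefficients)."""
    X, y = [], []
    for _ in range(n):
        column_no2, road_density, wind_speed = (random.random() for _ in range(3))
        X.append([column_no2, road_density, wind_speed])
        y.append(2.0 * column_no2 + 3.0 * road_density - wind_speed
                 + random.gauss(0.0, 0.1))
    return X, y


X_train, y_train = make_rows(80)
X_test, y_test = make_rows(40)  # held-out rows, analogous to unseen stations
model = fit_gbt(X_train, y_train)
r2_train = r_squared(y_train, predict(model, X_train))
r2_test = r_squared(y_test, predict(model, X_test))
print(f"R2 train: {r2_train:.2f}, R2 held-out: {r2_test:.2f}")
```

In practice a library implementation with deeper trees and tuned hyperparameters would be used; the sketch only shows the residual-fitting loop and how a held-out R2 is obtained.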

Road density and TROPOMI NO2 data are the most important features in the fitted model for near-surface NO2. This is because traffic, for which road density serves as a proxy, is the main source of near-surface NOX emissions, and the TROPOMI tropospheric NO2 column is a good representation of near-surface NO2 concentration. Downward UV radiation (DUV) at the surface and temperature are the most important features in the fitted model for near-surface O3. Since O3 is formed following the photolysis of NO2, DUV plays an important role in the fitted model for O3. Temperature drives emissions of biogenic volatile organic compounds (VOCs), which are important precursors of O3.
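The abstract does not state how feature importance was measured. One common model-agnostic option is permutation importance, sketched below as an assumption rather than as the authors' method: each feature column is shuffled in turn, and the resulting increase in mean squared error is taken as that feature's importance. The predictor `toy_model`, the feature order (`road_density`, `column_no2`, `irrelevant`), and the synthetic data are hypothetical.

```python
import random

random.seed(1)


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict_fn, X, y, n_repeats=5):
    """Mean increase in MSE when one feature column is shuffled at a time."""
    baseline = mse(y, [predict_fn(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            random.shuffle(column)
            # Rebuild the rows with only feature j replaced by its shuffled values
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(y, [predict_fn(row) for row in X_perm]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical fitted predictor that ignores the third feature."""
    road_density, column_no2, _irrelevant = row
    return 3.0 * road_density + 2.0 * column_no2


# Synthetic data: the target follows the toy model plus small Gaussian noise
X = [[random.random(), random.random(), random.random()] for _ in range(200)]
y = [toy_model(row) + random.gauss(0.0, 0.1) for row in X]

importances = permutation_importance(toy_model, X, y)
for name, value in zip(["road_density", "column_no2", "irrelevant"], importances):
    print(f"{name}: {value:.3f}")
```

A feature the model never uses receives zero importance, while features with a larger influence on the prediction produce a larger error increase when shuffled.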

In all cases, the GBT model outperforms feed-forward neural networks. Furthermore, the developed GBT model for near-surface O3 is reliably transferable to other locations and countries (R2=0.87-0.94), whereas the developed model for near-surface NO2 is only moderately transferable (R2=0.32-0.68). A possible reason is that road density is not the best proxy for traffic NOX emissions; this could be improved in a future study. Overall, we developed a new machine learning model that cost-effectively models near-surface NO2 and O3 concentrations, which could help to better understand the distribution of air pollution at moderate resolution.

References:

[1] Balamurugan, V., Balamurugan, V., and Chen, J., 2022. Importance of ozone precursors information in modelling urban surface ozone variability using machine learning algorithm. Scientific Reports, 12(1), pp. 1-8.

How to cite: Balamurugan, V., Chen, J., Wenzel, A., and Keutsch, F. N.: Modelling of near-surface NO2 and O3 concentration over Germany using machine learning, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-8715, https://doi.org/10.5194/egusphere-egu23-8715, 2023.