Inferring Surface NO2 over Western Europe: A Machine Learning Approach with Uncertainty Quantification
- 1Royal Belgian Institute for Space Aeronomy (BIRA-IASB), Brussels, Belgium
- 2Université libre de Bruxelles (ULB), Spectroscopy, Quantum Chemistry and Atmospheric Remote Sensing (SQUARES), Brussels, Belgium
- 3Φ-lab, European Space Agency (ESA), Frascati, Italy
Nitrogen oxides (NOx = NO + NO2) are of great concern due to their impact on human health and the environment. Machine learning (ML) techniques are increasingly used to estimate surface NO2, following rapid developments in artificial intelligence, computational power, and big data management. However, the uncertainties inherent in these retrievals, although critical, have rarely been studied amid the rapid expansion of ML applications in atmospheric research.
In this study, we develop a novel ML framework enhanced with uncertainty quantification techniques, named Boosting Ensemble Conformal Quantile Estimator (BEnCQE), to estimate surface NO2 and to assess the corresponding uncertainty arising from the data. Quantifying such data-induced uncertainty is essential because ML models are data-driven. We apply the BEnCQE model to multi-source data to infer daily surface NO2 concentrations over Western Europe at 1 km spatial resolution, from May 2018 to December 2021. Space-based cross-validation against in-situ station measurements shows that the model achieves accurate point estimates (r = 0.8, R2 = 0.64, root mean square error = 8.08 µg/m3) and reliable prediction intervals (coverage probability of 66.4% for PI-66% and 90.4% for PI-90%). The model results are also in good agreement with the Copernicus Atmosphere Monitoring Service (CAMS) model output. Furthermore, the quantile estimation strategy used in our model shows how the importance of the predictors varies across NO2 levels. Integrating uncertainty information can also uncover potential exceedances of the World Health Organization (WHO) 2021 NO2 limits in some locations, an exceedance risk that point estimates alone may fail to fully capture. Because uncertainty quantification attaches an uncertainty to each estimate, it also allows us to assess the robustness of the model at locations without in-situ station measurements. The variations in uncertainty suggest that the model's robustness is related to conflicts between seasonal and spatial NO2 patterns influenced by multi-source data. They also reveal challenges in urban and mountainous areas, where NO2 is highly variable and heterogeneously distributed.
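The abstract does not describe how BEnCQE calibrates its prediction intervals, so the following is only a minimal sketch of one common approach, split conformal calibration of quantile intervals, assumed here for illustration. The function names (`conformal_offset`, `coverage_probability`) and the toy values are hypothetical and are not taken from the study. The sketch shows how raw lower and upper quantile estimates could be widened by a single offset so that the interval reaches a nominal coverage such as 66%, and how the coverage probability reported above could be computed.

```python
import math


def conformal_offset(y_cal, lo_cal, hi_cal, coverage):
    """Offset by which to widen [lo, hi] so that it reaches `coverage`.

    Assumed split-conformal step, not necessarily the BEnCQE procedure.
    """
    # Conformity score: distance by which each observation falls outside
    # its raw interval (negative when the observation lies inside).
    scores = sorted(
        max(lo - y, y - hi) for y, lo, hi in zip(y_cal, lo_cal, hi_cal)
    )
    n = len(scores)
    # Finite-sample rank of the score quantile. It is clamped to n here;
    # strictly, a rank above n corresponds to an unbounded interval.
    k = min(n, math.ceil((n + 1) * coverage))
    return scores[k - 1]


def coverage_probability(y, lo, hi):
    """Fraction of observations that fall inside their interval."""
    inside = sum(1 for yi, l, h in zip(y, lo, hi) if l <= yi <= h)
    return inside / len(y)


# Toy calibration set (hypothetical values, in ug/m3).
y_cal = [10, 12, 15, 20, 30]    # station measurements
lo_cal = [8, 11, 16, 15, 20]    # raw lower-quantile estimates
hi_cal = [12, 13, 18, 18, 25]   # raw upper-quantile estimates

q = conformal_offset(y_cal, lo_cal, hi_cal, coverage=0.66)
lo_adj = [lo - q for lo in lo_cal]
hi_adj = [hi + q for hi in hi_cal]

raw_cov = coverage_probability(y_cal, lo_cal, hi_cal)
adj_cov = coverage_probability(y_cal, lo_adj, hi_adj)
print(f"offset={q}, raw coverage={raw_cov:.2f}, adjusted coverage={adj_cov:.2f}")
```

In practice the offset would be estimated on a calibration subset and the coverage evaluated on separate held-out stations; the toy example reuses one set only to keep the sketch short.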
How to cite: Sun, W., Tack, F., Clarisse, L., Schneider, R., Stavrakou, T., and Van Roozendael, M.: Inferring Surface NO2 over Western Europe: A Machine Learning Approach with Uncertainty Quantification, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-7936, https://doi.org/10.5194/egusphere-egu24-7936, 2024.