EGU General Assembly 2020
© Author(s) 2020. This work is distributed under
the Creative Commons Attribution 4.0 License.

Site-scale estimation of Ozone in Northern Bavaria using Gradient Boosting Machines, Deterministic Regional Air Quality Models and a Hybrid Model

Seyed Omid Nabavi1,2, Anke Nölscher2,3, Leopold Haimberger4, Juan Cuesta5, Christoph Thomas2,6, Andreas Held7, and Cyrus Samimi1,2
  • 1Working Group of Climatology, University of Bayreuth, Bayreuth, Germany
  • 2BayCEER, University of Bayreuth, Bayreuth, Germany
  • 3Working Group of Atmospheric Chemistry
  • 4Department of Meteorology and Geophysics, University of Vienna, Vienna, Austria
  • 5Laboratoire Inter-universitaire des Systèmes Atmosphériques (LISA), UMR7583, Universités Paris-Est Créteil et Paris Diderot, CNRS, Créteil, France
  • 6Working Group of Micrometeorology, University of Bayreuth, Bayreuth, Germany
  • 7Chair of Environmental Chemistry and Air Quality, Department of Environmental Science and Technology, TU Berlin, Germany

This study is part of the Mitigation of Urban Climate and Ozone Risks (MiSKOR) project, which aims to use a collection of tools to mitigate the urban heat island effect and ozone (O3) pollution in and around medium-sized cities in northern Bavaria (NB). We developed modelling tools to estimate (hindcast), classify (O3 >= 120 µg/m3 or O3 < 120 µg/m3), and forecast hourly O3 concentrations at nine unmonitored sites in NB. Three machine learning algorithms (MLAs) are used for O3 modelling: linear- and tree-based eXtreme Gradient Boosting Machines (MLR-XGBM and Tree-XGBM) and logistic regression (LR). The MLAs are trained on hourly observations of O3 and its chemical and meteorological precursors from seven monitored sites in NB. In addition, the daily average of surface O3 observations along 6-hour back trajectories, produced by the HYSPLIT model, is fed into the MLAs to provide a rough estimate of O3 transport at the local scale. The MLAs are compared with two state-of-the-art regional deterministic models (DMs): the ECMWF Copernicus Atmosphere Monitoring Service (CAMS) regional air quality model for Europe (CAMS-EU) and the DLR WRF-POLYPHEMUS air quality system (used only for O3 forecasting). Finally, we created a new hybrid model by combining the O3 estimates from the best MLA with those of the regional air quality model CAMS-EU.
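The classification target described above can be illustrated with a minimal sketch. It only shows how hourly O3 values map to the two classes at the 120 µg/m3 boundary; the function name and the sample values are hypothetical and are not taken from the study.

```python
O3_THRESHOLD = 120.0  # µg/m3, the class boundary stated in the abstract


def classify_o3(concentration):
    """Return 1 for O3 >= 120 µg/m3 and 0 for O3 < 120 µg/m3."""
    return 1 if concentration >= O3_THRESHOLD else 0


# Made-up hourly concentrations (µg/m3), for illustration only.
hourly_o3 = [95.0, 119.9, 120.0, 143.5]
labels = [classify_o3(value) for value in hourly_o3]
```

Labels of this kind would serve as the binary target for a classifier such as the logistic regression mentioned above; how the authors actually encoded the classes is not stated.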

According to averaged metrics from leave-one-site-out cross-validation (LOOCV), MLR-XGBM outperformed the other models in the estimation of O3, yielding a summertime RMSE of 13.6 µg/m3 and a Spearman correlation coefficient (SCC) of 0.91. The hybrid model further improved the accuracy of the O3 estimates: it reduced the summertime RMSE to 11.4 µg/m3 and increased the lowest seasonal SCC to 0.95. MLR-XGBM also performed best in O3 forecasting, ahead of CAMS-EU and WRF-POLYPHEMUS. For O3 classification, LR outperformed the other models. We also found that using remotely sensed lower-tropospheric O3 from IASI/GOME2 improves the classification of extremely high O3 in summertime.
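The evaluation scheme can be sketched as follows, under stated assumptions: each site is withheld in turn, a model is fitted on the remaining sites, and RMSE and SCC are computed at the withheld site and then averaged. A one-predictor least-squares line stands in for the MLAs here purely for brevity; the data layout, function names, and toy values are hypothetical and do not come from the study.

```python
import math


def rmse(obs, est):
    """Root mean square error between observed and estimated values."""
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(obs, est)) / len(obs))


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(obs, est):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    ro, re = average_ranks(obs), average_ranks(est)
    mo, me = sum(ro) / len(ro), sum(re) / len(re)
    cov = sum((a - mo) * (b - me) for a, b in zip(ro, re))
    var_o = sum((a - mo) ** 2 for a in ro)
    var_e = sum((b - me) ** 2 for b in re)
    return cov / math.sqrt(var_o * var_e)


def fit_line(xs, ys):
    """Least-squares line; a stand-in model, NOT the XGBM used in the study."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def leave_one_site_out(data):
    """Return {site: (rmse, scc)} with each site withheld from training once."""
    scores = {}
    for held_out in data:
        train = [row for site, rows in data.items() if site != held_out for row in rows]
        slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])
        obs = [y for _, y in data[held_out]]
        est = [slope * x + intercept for x, _ in data[held_out]]
        scores[held_out] = (rmse(obs, est), spearman(obs, est))
    return scores


# Toy (predictor, O3) pairs per site; values are invented for illustration.
data = {
    "A": [(15.0, 60.0), (22.0, 85.0), (30.0, 125.0)],
    "B": [(12.0, 50.0), (25.0, 100.0), (33.0, 140.0)],
    "C": [(18.0, 70.0), (27.0, 110.0), (35.0, 150.0)],
}
scores = leave_one_site_out(data)
mean_rmse = sum(r for r, _ in scores.values()) / len(scores)
mean_scc = sum(s for _, s in scores.values()) / len(scores)
```

Splitting by site rather than by hour keeps every observation from the withheld site out of training, which is what makes the scores indicative of performance at unmonitored locations.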

How to cite: Nabavi, S. O., Nölscher, A., Haimberger, L., Cuesta, J., Thomas, C., Held, A., and Samimi, C.: Site-scale estimation of Ozone in Northern Bavaria using Gradient Boosting Machines, Deterministic Regional Air Quality Models and a Hybrid Model, EGU General Assembly 2020, Online, 4–8 May 2020, EGU2020-11624, 2020



displays version 1 – uploaded on 06 May 2020, no comments