EMS Annual Meeting Abstracts
Vol. 22, EMS2025-455, 2025, updated on 30 Jun 2025
https://doi.org/10.5194/ems2025-455
EMS Annual Meeting 2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Machine Learning-Based Prediction of Particulate Matter Concentrations in the Brazilian Cerrado
Márcio Teixeira and Marco Franco
  • University of Campinas (UNICAMP), School of Technology (mjt@unicamp.br)

The Brazilian Cerrado, a critical but rapidly degrading biome, faces significant air quality challenges due to land-use changes, agricultural expansion, and recurrent biomass burning. These activities contribute to elevated concentrations of particulate matter (PM2.5 and PM10) and other pollutants, posing severe health and environmental risks. Understanding the interplay between meteorology and air pollution is essential for improving predictive capabilities and mitigation strategies.

In this study, we analyze the seasonal variability of PM2.5, PM10, nitrogen oxides (NOx), carbon monoxide (CO), and ozone (O₃) using ground-based measurements from CETESB (Environmental Company of the State of São Paulo) between 2017 and 2023 in an urbanized region of the Cerrado. Our findings reveal distinct seasonal patterns: PM and NOx concentrations peak during the dry winter months due to increased biomass burning and reduced precipitation, while O₃ peaks in spring, likely influenced by cloud dynamics. Alarmingly, daily WHO air quality guidelines for NOx, PM10, and PM2.5 were exceeded by 15%, 22%, and 35%, respectively, underscoring the region’s air pollution crisis.
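The abstract does not state how the seasonal averages or the exceedance percentages were computed. As a minimal sketch, assuming the percentages refer to the share of days with a daily mean above the guideline, the two summaries could be derived from daily records as follows. The sample values are made up for illustration and the function names are hypothetical; the 15 µg m⁻³ limit is the WHO 2021 24-hour guideline for PM2.5.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

WHO_DAILY_PM25 = 15.0  # µg/m³, WHO 2021 24-hour guideline for PM2.5


def monthly_means(records):
    """Average daily values by calendar month (1-12), pooling all years."""
    by_month = defaultdict(list)
    for day, value in records:
        by_month[day.month].append(value)
    return {month: mean(values) for month, values in sorted(by_month.items())}


def exceedance_share(records, limit):
    """Fraction of days whose daily mean is strictly above the limit."""
    values = [value for _, value in records]
    return sum(v > limit for v in values) / len(values)


# Made-up daily PM2.5 means (µg/m³), for illustration only; not CETESB data.
sample = [
    (date(2020, 1, 10), 8.0),
    (date(2020, 1, 20), 12.0),
    (date(2020, 7, 5), 30.0),
    (date(2020, 7, 15), 20.0),
    (date(2021, 7, 9), 16.0),
]

print(monthly_means(sample))
print(exceedance_share(sample, WHO_DAILY_PM25))
```

Pooling all years by calendar month gives a simple seasonal cycle; a per-year breakdown would need the year added to the grouping key.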

To enhance predictive accuracy, we evaluate two machine learning models, Random Forest (RF) and XGBoost, on meteorological and air pollution variables. The RF model demonstrated superior performance on the test set, achieving R² values of 0.79 (train) and 0.92 (test) for PM10, with RMSEs of 10.7 and 6.5 µg m⁻³, respectively. For PM2.5, RF yielded R² values of 0.74 (train) and 0.91 (test), with RMSEs of 4.3 and 2.6 µg m⁻³. The XGBoost model applied to PM10 yielded RMSEs of 2.67 (train) and 8.01 µg m⁻³ (test), with R² values of 0.98 (train) and 0.85 (test). For PM2.5, XGBoost yielded RMSEs of 0.57 (train) and 4.04 µg m⁻³ (test), with R² values of 0.96 (train) and 0.73 (test).
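The train/test comparison reported above can be illustrated with a short sketch. It assumes scikit-learn's RandomForestRegressor, a random 80/20 split, and synthetic predictors (temperature, relative humidity, wind speed) with a synthetic PM10 target. None of these choices come from the abstract, which does not give the feature set, hyperparameters, or splitting scheme.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500

# Synthetic predictors (assumed for illustration): temperature (°C),
# relative humidity (%), wind speed (m/s).
X = np.column_stack([
    rng.uniform(10, 35, n),
    rng.uniform(20, 100, n),
    rng.uniform(0, 8, n),
])
# Synthetic PM10 target (µg/m³): drier, calmer, warmer days are dirtier.
y = 60 + 0.5 * X[:, 0] - 0.4 * X[:, 1] - 3.0 * X[:, 2] + rng.normal(0, 3, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)


def report(name, y_true, y_pred):
    """Print and return (R², RMSE) for one data split."""
    r2 = float(r2_score(y_true, y_pred))
    rmse = float(np.sqrt(mean_squared_error(y_true, y_pred)))
    print(f"{name}: R2={r2:.2f}, RMSE={rmse:.2f} µg/m³")
    return r2, rmse


train_scores = report("train", y_train, model.predict(X_train))
test_scores = report("test", y_test, model.predict(X_test))
```

Reporting both splits side by side is what makes a gap visible: a much lower training error than test error, as in the XGBoost figures above, may point to overfitting, although the abstract itself does not draw that conclusion.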

This study highlights the potential of machine learning in improving air quality forecasting in tropical biomes, where complex interactions between meteorology and pollution dynamics exist. Future work will expand model validation across multiple Cerrado stations, enabling spatialized PM predictions and the identification of high-emission zones, and will train alternative models based on Artificial Neural Networks (ANNs).

How to cite: Teixeira, M. and Franco, M.: Machine Learning-Based Prediction of Particulate Matter Concentrations in the Brazilian Cerrado, EMS Annual Meeting 2025, Ljubljana, Slovenia, 7–12 Sep 2025, EMS2025-455, https://doi.org/10.5194/ems2025-455, 2025.
