- 1 Indian Institute of Technology, Delhi, Centre for Atmospheric Sciences, New Delhi, India (shreya.srivastava.iitd@gmail.com)
- 2 Indian Institute of Technology, Delhi, Centre for Atmospheric Sciences, New Delhi, India (sagnik@cas.iitd.ac.in)
Aerosol composition information is crucial for determining the detrimental effects of aerosols on climate, air quality, and human health, because different aerosol types have different effects. However, developing south Asian countries lack systematic data on aerosol composition: the available information is limited to a few sites and short time periods. Moreover, the large spatio-temporal coverage of satellite observations remains relatively unexplored for aerosol characterization.
In this study, we use Level 2 version 23 aerosol products from the Multi-angle Imaging SpectroRadiometer (MISR) satellite sensor (spatial resolution = 4.4 km x 4.4 km) to calculate fractional aerosol optical depths (fAODs) for 2015-2016. These fAODs represent the proportion of total AOD attributable to each of the eight aerosol models assumed in MISR's aerosol retrieval algorithm, which are categorized by size, shape, and refractive index. The fractional AOD of aerosol model i is denoted fAOD_i. We use these eight fAODs, together with EDGAR emission data, land use, and meteorological variables, as predictors in a machine learning (ML) model. The model is trained on a quarter-degree grid covering south Asia, with the aerosol species mass fractions simulated by a chemical model as the target variables.
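As an illustration of the fAOD definition above, the following minimal sketch computes fAOD_i as the share of total AOD attributed to each of eight aerosol models. The function name `fractional_aods` and the input values are hypothetical; reading the actual MISR Level 2 files and regridding to a quarter degree are not shown.

```python
def fractional_aods(component_aods):
    """Return fAOD_i = AOD_i / total AOD for each aerosol model i."""
    total = sum(component_aods)
    if total <= 0:
        raise ValueError("total AOD must be positive to define fractions")
    return [aod / total for aod in component_aods]


# Hypothetical AODs attributed to the eight aerosol models at one grid cell.
component_aods = [0.02, 0.10, 0.05, 0.01, 0.03, 0.04, 0.08, 0.07]
faods = fractional_aods(component_aods)

for i, value in enumerate(faods, start=1):
    print(f"fAOD_{i} = {value:.3f}")
```

By construction the eight fractions sum to one, so they describe the relative contribution of each model independently of the total aerosol loading.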
We train models to predict six aerosol species: sulfate, nitrate, ammonium, black carbon (BC), organic carbon (OC) and dust. We employ two models: Random Forest, validated with out-of-bag bootstrap samples, and Support Vector Regression (SVR), validated with 5-fold cross-validation; 80% of the data are used for training and 20% for testing. The SVR model shows a mean cross-validation R² of 0.79, 0.83, 0.72, 0.81, 0.73 and 0.81 for sulfate, nitrate, ammonium, dust, OC and BC, respectively, with corresponding RMSE values of 0.02, 0.03, 0.01, 0.05, 0.04 and 0.01 on the test data. The Random Forest model performs better, with R² values of 0.87, 0.92, 0.85, 0.89, 0.90 and 0.88 for the same aerosol species and RMSE values of 0.02, 0.03, 0.01, 0.001, 0.03 and 0.01 on the test data. Permutation feature importance analysis shows that the MISR-derived fAODs significantly influence the model’s predictions. The predicted aerosol composition is expected to strengthen assessments of the climate and health effects of aerosols in low-income south Asian countries by accounting for the differential effects of aerosol types.
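The validation workflow described above can be sketched with scikit-learn as follows. This is a minimal illustration on synthetic data, not the study's pipeline: the predictors, target, and hyperparameters are all assumptions, and the real inputs (fAODs, EDGAR emissions, land use, meteorology) are replaced by random numbers.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (assumption): 10 predictors, one species mass fraction.
rng = np.random.default_rng(0)
X = rng.random((400, 10))
y = 0.5 * X[:, 0] + 0.3 * X[:, 1] ** 2 + 0.02 * rng.standard_normal(400)

# 80% of the data for training, 20% held out for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Random Forest: each tree is scored on the bootstrap samples it did not see.
rf = RandomForestRegressor(n_estimators=200, oob_score=True, random_state=0)
rf.fit(X_train, y_train)
rf_oob_r2 = rf.oob_score_
rf_pred = rf.predict(X_test)
rf_test_r2 = r2_score(y_test, rf_pred)
rf_test_rmse = float(np.sqrt(mean_squared_error(y_test, rf_pred)))

# SVR: mean R^2 over 5 folds of the training data, then RMSE on the test set.
svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0, epsilon=0.01))
svr_cv_r2 = cross_val_score(svr, X_train, y_train, cv=5, scoring="r2").mean()
svr.fit(X_train, y_train)
svr_test_rmse = float(np.sqrt(mean_squared_error(y_test, svr.predict(X_test))))

# Permutation importance: drop in test R^2 when one predictor is shuffled.
perm = permutation_importance(rf, X_test, y_test, n_repeats=10, random_state=0)
ranking = np.argsort(perm.importances_mean)[::-1]

print(f"RF  OOB R2={rf_oob_r2:.2f}  test R2={rf_test_r2:.2f}  RMSE={rf_test_rmse:.3f}")
print(f"SVR CV  R2={svr_cv_r2:.2f}  test RMSE={svr_test_rmse:.3f}")
print("Predictors ranked by permutation importance:", ranking.tolist())
```

In a multi-species setting, one such model would be fitted per species (or a multi-output variant used), and the permutation ranking would indicate how much the fAOD predictors contribute relative to the emission, land use, and meteorological variables.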
How to cite: Srivastava, S. and Dey, S.: Estimation of surface-based aerosol composition from satellite data-driven machine learning model over south Asia, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-873, https://doi.org/10.5194/egusphere-egu25-873, 2025.