Applications of an advanced clustering tool for EU AQ monitoring network data analysis
- 1Norwegian Institute for Air Research, Kjeller, Norway (jos@nilu.no)
- 2Environment and Climate Change Canada, North York, Canada
Air quality monitoring networks provide invaluable data for studying human health, environmental impacts, and the effects of policy changes. In a European legislative context, the data collected form the basis for reporting air quality status and exceedances under the Ambient Air Quality Directives (AAQD), following specific requirements. Consequently, the network's representativeness and its ability to accurately assess the air pollution situation in European countries become key issues. The combined use of models and measurements is currently understood as the most robust way to map the status of air pollution in an area, as it allows both the spatial and the temporal distribution of pollution to be quantified. This spatio-temporal information can be used, together with hierarchical clustering techniques, to evaluate the representativeness of the monitoring network and to support air quality monitoring design.
The hierarchical clustering methodology applied in this context can be used as a screening tool to analyse the level of similarity or dissimilarity of the air concentration data (time series) within a monitoring network. Hierarchical clustering assumes that the data contain a level of (dis)similarity and groups the station records based on the characteristics of the actual data. The advantage of this type of clustering is that it does not require an a priori assumption about how many clusters there might be, but it can become computationally expensive as the number of time series increases. Three dissimilarity metrics are used to establish the level of similarity (or dissimilarity) of the different air quality measurements across the monitoring network: (1) 1-R, where R is the Pearson linear correlation coefficient; (2) the Euclidean distance (EuD); and (3) the product of metrics (1) and (2). The correlation-based metric assesses dissimilarities in the temporal variation of the concentrations. The EuD-based metric assesses dissimilarities in the magnitude of the concentrations over the period analysed. The product of the two, (1-R) x EuD, accounts for both temporal variation and pollution levels, and it has been demonstrated to be the most useful metric for monitoring network optimization.
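As an illustration only, the three dissimilarity metrics and a bottom-up clustering step can be sketched in plain Python. This is not the MoNET implementation: the function names are hypothetical, the station names and concentrations are synthetic, and average linkage is an assumption, since the linkage criterion is not stated here. Missing values and constant series (for which R is undefined) are not handled.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson linear correlation coefficient of two equal-length series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)  # undefined (division by zero) for constant series


def dissimilarity(x, y, metric="product"):
    """Return 1-R, EuD, or (1-R) x EuD for two time series."""
    if metric == "EuD":
        return math.dist(x, y)
    one_minus_r = 1.0 - pearson_r(x, y)
    if metric == "1-R":
        return one_minus_r
    if metric == "product":
        return one_minus_r * math.dist(x, y)
    raise ValueError(f"unknown metric: {metric}")


def agglomerate(series, metric="product"):
    """Naive agglomerative clustering with average linkage (assumed).

    `series` maps station name -> list of concentrations.
    Returns the merge history as (members_a, members_b, distance) tuples.
    """
    names = list(series)
    # Pairwise dissimilarities between individual stations.
    d = {frozenset(p): dissimilarity(series[p[0]], series[p[1]], metric)
         for p in combinations(names, 2)}

    def avg(a, b):
        # Average linkage: mean dissimilarity over all cross-cluster pairs.
        return sum(d[frozenset((i, j))] for i in a for j in b) / (len(a) * len(b))

    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        # Merge the two closest clusters and record the step.
        a, b = min(combinations(clusters, 2), key=lambda p: avg(*p))
        merges.append((a, b, avg(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Synthetic example data (not real measurements).
stations = {
    "urban_1": [30.0, 42.0, 35.0, 50.0, 38.0, 44.0],
    "urban_2": [31.0, 41.0, 36.0, 49.0, 39.0, 45.0],
    "rural_1": [10.0, 12.0, 9.0, 13.0, 11.0, 12.0],
    "rural_2": [11.0, 12.0, 10.0, 14.0, 11.0, 13.0],
}

for metric in ("1-R", "EuD", "product"):
    print(metric)
    for a, b, dist in agglomerate(stations, metric):
        print(f"  merge {a} + {b} at {dist:.3f}")
```

The merge history plays the role of a dendrogram: cutting it at a chosen dissimilarity yields the clusters without fixing their number in advance. The naive loop re-evaluates all cluster pairs at every step, which reflects the computational cost noted above; for larger networks an optimized library routine such as SciPy's `scipy.cluster.hierarchy.linkage` would be a more practical choice.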
This study presents the MoNET webtool, developed based on the hierarchical clustering methodology. The webtool aims to provide an easy solution for member states to quality-control the reported data as a tier-2 level check and to evaluate the representativeness of the air quality network reporting under the AAQD. Some examples from the ongoing evaluation of the monitoring site classification, carried out as a joint exercise under the Forum for Air Quality Modeling (FAIRMODE) and the National Air Quality Reference Laboratories Network (AQUILA), are available to show the usability of the tool. MoNET should be able to identify outliers, i.e., issues with the data or data series with very specific temporal-magnitude profiles, and to distinguish, for example: pollution regimes within a country, and whether they resemble the air quality zones required by the AAQD and set by the member states; stations monitoring high-emitting sources; and background regimes versus regimes driven by a local pollution source in cities.
How to cite: Soares, J., Stoll, C., Vallejo, I., Lee, C., Makar, P., and Tarrasón, L.: Applications of an advanced clustering tool for EU AQ monitoring network data analysis, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-15087, https://doi.org/10.5194/egusphere-egu23-15087, 2023.