- 1Grumets Research Group, CREAF, Edifici C, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain (e.trypidaki@creaf.uab.cat)
- 2Universitat Autònoma de Barcelona, Bellaterra, Catalunya, Spain
Meteorological, environmental, and geophysical measurements are essential for climate analysis and modelling, including weather forecasting and assessing extreme weather events such as drought and floods (WMO, 2008). Accurate meteorological data are critical, as erroneous data can significantly affect climate analyses and model validity (Llabrés-Brustenga et al., 2019). Harmonization and quality control (QC) are necessary to achieve reliable datasets.
Improved datasets can subsequently be used to calculate drought indicators that rely on meteorological data, such as the Standardized Precipitation Index (SPI) (McKee et al., 1993) and the Standardized Precipitation Evapotranspiration Index (SPEI) (Vicente-Serrano et al., 2010). By improving the inputs to these indices, this work aims to support better strategies for monitoring and mitigating drought impacts.
As a case study, mean monthly temperature (Tmean, °C) and cumulative monthly precipitation (P, mm) data were collected for the Ebro basin, Spain's largest catchment. Data from multiple organizations were compiled for the period 1950–2023 to ensure the highest possible accuracy and maximum station coverage. The QC process involves several steps, including tests for temporal and spatial consistency, outlier detection, duplicate detection, missing data analysis, and cross-validation. Homogenization and outlier detection are the primary procedures for the monthly data series (Szentimrey, 2006; Venema et al., 2012). Merging datasets from multiple providers required reprojection to a common spatial reference system and datum (EPSG:25830).
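As an illustration of the duplicate-detection and missing-data steps listed above, the following is a minimal Python sketch, not the authors' R implementation. The record layout `(provider, station_id, year, month, value)` and the function names `find_duplicates` and `missing_months` are hypothetical, and the sketch assumes that stations shared between providers already carry a common identifier; in practice, matching stations across providers would probably also require comparing coordinates.

```python
from collections import defaultdict


def find_duplicates(records):
    """Group records by (station_id, year, month) and return the groups
    that contain more than one measurement.

    `records` is an iterable of (provider, station_id, year, month, value)
    tuples; this layout is an assumption made for the example.
    """
    seen = defaultdict(list)
    for provider, station_id, year, month, value in records:
        seen[(station_id, year, month)].append((provider, value))
    return {key: items for key, items in seen.items() if len(items) > 1}


def missing_months(observed):
    """Return the (year, month) pairs absent between the first and the
    last observation of a single station's monthly series."""
    if not observed:
        return []
    present = set(observed)
    (year, month), last = min(present), max(present)
    gaps = []
    while (year, month) < last:
        if (year, month) not in present:
            gaps.append((year, month))
        # Advance by one calendar month, rolling over in December.
        year, month = (year, month + 1) if month < 12 else (year + 1, 1)
    return gaps
```

Duplicates returned by `find_duplicates` would still need to be inspected before removal, since two providers may legitimately report slightly different values for the same station and month.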
The workflow included the following steps: (a) exclusion of short series to remove unstable or inaccurate data; (b) retention of stations installed before 2000 with ≥5 years of data and of stations installed after 2000 with ≥1 year of data, with all other stations removed; (c) examination of temporal gaps and of the percentage of missing (NA) values for each station; (d) detection of outliers, where extreme monthly temperature (>10 °C) or precipitation (>500 mm) values were flagged, plotted, and compared with nearby stations. Erroneous values were removed on the basis of expert judgment, following visualisation at each step.
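Steps (b) to (d) above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the published R script: the `Station` class and all function names are hypothetical, "years of data" is interpreted as the number of non-missing months divided by 12, and a station installed in the year 2000 itself is treated as "after 2000". The threshold is passed as a parameter so that the values quoted in the text can be supplied by the caller.

```python
from dataclasses import dataclass, field


@dataclass
class Station:
    """Hypothetical container for one station's monthly series."""
    station_id: str
    install_year: int
    values: list = field(default_factory=list)  # None marks a missing month


def years_of_data(station):
    """Non-missing months expressed in years (an assumed interpretation)."""
    return sum(v is not None for v in station.values) / 12.0


def keep_station(station):
    """Step (b): require >=5 years before 2000, >=1 year otherwise."""
    required = 5.0 if station.install_year < 2000 else 1.0
    return years_of_data(station) >= required


def missing_percentage(station):
    """Step (c): share of missing (NA) months, in percent."""
    if not station.values:
        return 100.0
    missing = sum(v is None for v in station.values)
    return 100.0 * missing / len(station.values)


def flag_outliers(values, threshold):
    """Step (d): indices of non-missing values above `threshold`.

    Flagged values are candidates for visual inspection and comparison
    with nearby stations, not automatic removals.
    """
    return [i for i, v in enumerate(values) if v is not None and v > threshold]
```

For example, `flag_outliers(precipitation, 500.0)` would return the positions of the monthly totals above 500 mm, which could then be plotted alongside neighbouring stations before any value is removed.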
The QC script was developed in R and is openly accessible on GitHub: https://github.com/grumets/QCMeteoData/blob/QCMeteoData/Quality_Control.R, ensuring transparency and reproducibility.
The harmonisation process revealed several challenges, including inconsistent formats across data sources and duplicated stations and measurements, particularly in the AEMET and XEMA datasets. Following QC, 0.35% of precipitation and 0.42% of temperature data from AEMET were removed, while only 0.13% of records from Météo France were affected. Despite providers' assurances of dataset completeness and homogeneity, the inconsistencies found demonstrate the need for a more exhaustive QC procedure.
How to cite: Trypidaki, E., Batlle-Morera, A., Pesquer, L., and Domingo-Marimon, C.: Harmonizing Multi-Source Meteorological Data: A Reproducible Approach for Drought Monitoring, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-8850, https://doi.org/10.5194/egusphere-egu25-8850, 2025.