OSA3.2 | Climate monitoring: data rescue, management, quality and homogenization
Convener: Federico Fierli | Co-conveners: Dan Hollis, John Kennedy
Orals | Fri, 06 Sep, 14:00–17:15 (CEST) | A111 (Aula Joan Maragall)
Posters | Attendance Thu, 05 Sep, 18:00–19:30 (CEST) | Display Thu, 05 Sep, 13:30–Fri, 06 Sep, 16:00 | Poster area 'Galaria Paranimf'
Robust and reliable climate studies, particularly assessments of climate variability and change, depend greatly on the availability and accessibility of high-quality, high-resolution and long-term instrumental climate data. At present, restricted availability of and access to long-term, high-quality climate records and datasets still limits our ability to understand, detect, predict and respond to climate variability and change at spatial scales finer than global. In addition, the provision of reliable and timely climate services relies heavily on the availability and accessibility of high-quality and high-resolution climate data, which in turn requires further research and innovative applications in data rescue techniques and procedures, data management systems, climate monitoring, and climate time-series quality control and homogenisation.
In this session, we welcome contributions (oral and poster) in the following major topics:
• Climate monitoring, including early warning systems and improvements in the quality of the observational meteorological networks
• More efficient transfer of the data rescued into the digital format by means of improving the current state-of-the-art on image enhancement, image segmentation and post-correction techniques, innovating on adaptive Optical Character Recognition and Speech Recognition technologies and their application to transfer data, defining best practices about the operational context for digitisation, improving techniques for inventorying, organising, identifying and validating the data rescued, exploring crowd-sourcing approaches or engaging citizen scientist volunteers, conserving, imaging, inventorying and archiving historical documents containing weather records
• Climate data and metadata processing, including climate data flow management systems, from improved database models to better data extraction, development of relational metadata databases and data exchange platforms and networks interoperability
• Innovative, improved and extended climate data quality controls (QC), including both near-real-time and time-series QC: from gross-error and tolerance checks to temporal and spatial coherence tests, statistical derivation and machine learning of QC rules, and extending tailored QC application to monthly, daily and sub-daily data and to all essential climate variables (a minimal illustrative sketch of such checks follows this topic list)
• Improvements to the current state-of-the-art of climate data homogeneity and homogenisation methods, including methods intercomparison and evaluation, along with other topics such as climate time-series inhomogeneities detection and correction techniques/algorithms, using parallel measurements to study inhomogeneities and extending approaches to detect/adjust monthly and, especially, daily and sub-daily time-series and to homogenise all essential climate variables
• Fostering evaluation of the uncertainty budget in reconstructed time-series, including the influence of the various data processes steps, and analytical work and numerical estimates using realistic benchmarking datasets
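
A minimal Python sketch of the basic QC checks named in the QC topic above (gross-error, tolerance and temporal-coherence tests); all thresholds and climatological statistics are hypothetical placeholders, not recommended values:

```python
import numpy as np
import pandas as pd

def basic_qc(temps: pd.Series,
             phys_limits=(-80.0, 60.0),    # hypothetical gross-error bounds (deg C)
             clim_mean=10.0, clim_std=8.0,  # hypothetical climatological statistics
             tol_sigma=4.0,                 # tolerance check width in standard deviations
             max_step=15.0):                # hypothetical max day-to-day jump (deg C)
    """Flag suspicious daily temperatures with three simple checks."""
    flags = pd.DataFrame(index=temps.index)
    # 1. Gross-error check: values outside physically plausible limits
    flags["gross"] = (temps < phys_limits[0]) | (temps > phys_limits[1])
    # 2. Tolerance check: values far outside the climatological range
    flags["tolerance"] = np.abs(temps - clim_mean) > tol_sigma * clim_std
    # 3. Temporal-coherence check: unrealistic day-to-day steps
    flags["step"] = temps.diff().abs() > max_step
    flags["any"] = flags.any(axis=1)
    return flags

# Example usage with synthetic data
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    series = pd.Series(10 + 8 * rng.standard_normal(365),
                       index=pd.date_range("2023-01-01", periods=365))
    series.iloc[100] = 95.0  # inject an obvious gross error
    print(basic_qc(series).sum())
```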

Orals: Fri, 6 Sep | A111 (Aula Joan Maragall)

Chairpersons: Federico Fierli, Dan Hollis
Data Rescue
14:00–14:15 | EMS2024-342 | Onsite presentation
Kevin Healion, Simon Noone, and Peter Thorne

Although much work has been conducted worldwide to discover, rescue and digitise historical weather observations, there remains a lack of access to weather observations from many regions of the world. One such region is Africa, which has a scarcity of historical observations available from the pre-satellite era, preventing important research on past extreme weather events and the climate of the continent. The need for these data becomes more urgent when we consider that the workforce in African countries is disproportionately employed in climate-exposed sectors, with over 60% employed in the agricultural sector, where crops are completely dependent on rainfall. The ACMAD (African Centre of Meteorological Applications for Development) collection offers an opportunity to improve the temporal and spatial data available across the continent of Africa, with data available in some countries as far back as the late nineteenth century.

An inventory has been created of the collection and shows that it contains a large amount of unique data that has not previously been inventoried by other sources. This includes numerous weather stations that have never been inventoried before. With an estimated four million images within the collection, numerous methods are required to digitise the unique data. The team at ICARUS (Irish Climate Analysis and Research UnitS), Maynooth University, launched a project called CliDAR-Africa, in which we assisted second-year undergraduate geography students to digitise unique data from stations in Madagascar and Guinea. Discussions on the use of AI as a tool to transcribe the data have also taken place. Quality issues with the images within the collection also exist, and the team has been developing a citizen science project in which inferior-quality images can be identified. Once identified, it is hoped that the quality of these images can be improved at a later date.

In my presentation I will discuss the various issues with the ACMAD collection that the ICARUS team is attempting to solve, the results of the inventorying of the collection, efforts to digitise data from Madagascar, proposed projects involving citizen science and AI, and finally the pressing need to rescue African meteorological data in order to improve our knowledge of climate change and extreme weather events in Africa.

How to cite: Healion, K., Noone, S., and Thorne, P.: Rescuing historical data from the ACMAD collection- The importance for climate research in Africa., EMS Annual Meeting 2024, Barcelona, Spain, 1–6 Sep 2024, EMS2024-342, https://doi.org/10.5194/ems2024-342, 2024.

14:15–14:30 | EMS2024-797 | Onsite presentation
Marlies van der Schee, Gerard van der Schrier, Martijn Majoor, Kirien Whan, Peer Hechler, Paul Poli, Maria Antónia Valente, Stefan Brönnimann, Stefan Grab, Rob Allan, and Peter Thorne

A new data rescue portal has been designed to facilitate and coordinate the rescue of weather and climate data from around the world. It is hosted at https://datarescue.climate.copernicus.eu/. The practical information, data rescue (DARE) projects and metadata inventories originate from initiatives of both the World Meteorological Organization (WMO) and the Copernicus Climate Change Service (C3S). All practical information from the WMO I-DARE portal has been merged into this project, and the new portal will replace it. IEDRO and ACRE are collaborators on the content of the website.

The mission of the portal is to:

  • provide a collaborative framework for sharing information, best practices, know-how, guidance and metadata on data rescue projects and activities worldwide,
  • provide a single entry point for accessing information on the status of climate data being digitized or in need of recovery and digitization, and
  • enable collaboration among organizations, development agencies, donors, scientists, NGOs and citizens to work on the recovery and digitization of climate heritage that is at risk of being lost forever.

One of the main features of the website is the opportunity to highlight your data rescue project. By making your efforts known and publicly available, the chance that the same data are rescued twice by different groups is reduced. Additionally, we encourage owners of rescued data to share them in a global repository, so that the valuable data will not be lost again. The data portal and project collection also serve as a starting point for donors to select viable DARE projects in need of support. They offer donors the assurance that DARE is done following internationally agreed procedures and that rescued data are shared with the global community rather than ending up in a poorly accessible local file system, which multiplies the return on investment.

The C3S Data Rescue work package contributes to and monitors efforts on using Artificial Intelligence (AI) and Deep Learning for Optical Character Recognition (OCR) to aid data rescue. The development of a proof of concept for a data rescue image repository is in line with future methods to retrieve valuable meteorological data from scanned paper records with OCR. In the presentation, we will report the latest progress of the C3S DARE work package on the DARE portal, including articles on AI efforts to rescue meteorological data.

How to cite: van der Schee, M., van der Schrier, G., Majoor, M., Whan, K., Hechler, P., Poli, P., Valente, M. A., Brönnimann, S., Grab, S., Allan, R., and Thorne, P.: New integrated Date Rescue Portal  - Facilitating DARE projects, EMS Annual Meeting 2024, Barcelona, Spain, 1–6 Sep 2024, EMS2024-797, https://doi.org/10.5194/ems2024-797, 2024.

Homogenization
14:30–14:45 | EMS2024-72 | Onsite presentation
Xiaolan Wang, Yang Feng, Victor Isaac, Lucie Vincent, and Megan Hartwell

Using hourly surface wind speed data from 155 stations in Canada, this study first developed a homogenized monthly mean wind speed dataset for the period 1953-2023, which was then used to characterize observed changes in surface wind speed in Canada. The hourly data were first quality controlled and adjusted for non-standard anemometer heights before being used to calculate monthly mean wind speed series. To identify artificial discontinuities, the monthly mean wind speed series were subjected to a semi-automated comprehensive data homogenization procedure, which uses a combination of station metadata and multiple statistical tests with and without reference series. The reference series used include the data series of up to four best significantly correlated neighbour stations, the ensemble mean series of monthly wind speed taken from the Twentieth Century Reanalysis version 3 (20CRv3), and monthly mean geostrophic wind speeds derived from homogenized surface pressure data. The results from the automated procedure were then reviewed manually using metadata and visual inspection of the multiphase regression fits with expert judgement. As a result, all 155 data series were identified as having one or more artificial discontinuities, which were diminished by quantile-matching adjustments. Anemometer height changes, station joining, relocations and instrument changes/problems were found to be the main causes of data inhomogeneities. The homogenized dataset for 1953-2023 shows wind stilling in the region from northern British Columbia (BC) to the southern Yukon-Northwest Territories and from the southern Prairies to Quebec-Labrador, which was matched with wind strengthening in the region from southern-central BC to the Rocky Mountains, and in Newfoundland and the high Arctic. The trend pattern of the in-situ wind speed data bears substantial similarity to that of both the ERA5 and 20CRv3 reanalysis wind speed data.
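
A minimal sketch of the kind of quantile-matching adjustment referred to here, assuming a single known breakpoint and using empirical quantiles of the pre- and post-break segments; this is a simplified stand-in, not the authors' actual procedure:

```python
import numpy as np

def quantile_matching_adjust(before, after, n_quantiles=20):
    """Adjust the pre-break segment so its empirical distribution matches the
    post-break segment: a much-simplified stand-in for the quantile-matching
    adjustments mentioned above, assuming a single known breakpoint."""
    probs = np.linspace(0.025, 0.975, n_quantiles)
    q_before = np.quantile(before, probs)
    q_after = np.quantile(after, probs)
    # Adjustment depends on where a value sits in the pre-break distribution
    adjustments = q_after - q_before
    return before + np.interp(before, q_before, adjustments)

# Example: remove a spurious +1.5 m/s offset in the pre-break segment
rng = np.random.default_rng(1)
after = rng.gamma(shape=4.0, scale=1.0, size=240)          # post-break monthly means
before = rng.gamma(shape=4.0, scale=1.0, size=240) + 1.5   # biased pre-break segment
print(before.mean(), quantile_matching_adjust(before, after).mean(), after.mean())
```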

How to cite: Wang, X., Feng, Y., Isaac, V., Vincent, L., and Hartwell, M.: On Monthly Mean Surface Wind Speed Data Homogenization and Trend Assessment, EMS Annual Meeting 2024, Barcelona, Spain, 1–6 Sep 2024, EMS2024-72, https://doi.org/10.5194/ems2024-72, 2024.

14:45–15:00 | EMS2024-231 | Onsite presentation
Peter Domonkos, Marc Prohom, and Jordi Cunillera

The homogeneity of wind speed and wind gust data is very sensitive to changes in the instrumentation or the environment of the sensor. Time series of wind observations over Catalonia have been analysed, and in spite of the relatively short history of automated observations, several significant inhomogeneity biases were detected and removed.

A dense network of observing stations and documentation of station histories (metadata) helped the homogenization. Only time series with observation periods of at least 10 years were used, resulting in the homogenization of 209 time series for both wind speed and wind gust. The data originated from three sources: the primary network of the Catalan Meteorological Service (SMC), the agrometeorological observation network of the SMC, and the network of the Spanish Meteorological Agency (AEMET) in Catalonia.

The data underwent basic quality control before homogenization. Homogenization was performed with ACMANT, the most accurate homogenization method according to currently available method comparison test results. The new version used, ACMANTv5.2, can take advantage of metadata in either automatic or interactive mode. Prior to homogenization, a specific examination was performed in which the residual standard deviations of two inhomogeneity models, the additive and the multiplicative model, were compared. The additive model showed clear advantages for wind gust homogenization, while the suitability of the two models appeared to be similar for the homogenization of mean wind speed data. Metadata indicated several technical changes during the observation periods, which span 24 years on average. Most stations experienced changes in sensors and data loggers multiple times, while a few time series were affected by station relocations or changes in sensor height.

The homogenization process revealed that changes in sensors and data loggers often did not cause perceptible inhomogeneities. On the other hand, two network-wide changes of data loggers occurred, in 2005 in the agrometeorological network and in 2007 in the primary network of Meteocat, when the technical changes impacted the homogeneity of the data significantly and almost synchronously. Since all relative homogenization methods, including ACMANT, presume that inhomogeneities are station-specific, metadata played a crucial role in detecting these problems. The final homogenization was performed in such a way that neighbour series affected by the same type of inhomogeneity bias as the candidate series were excluded from the homogenization.

 

How to cite: Domonkos, P., Prohom, M., and Cunillera, J.: Homogenization of Daily Wind Speed and Wind Gust Time Series in the Era of Automatic Observations in Catalonia, EMS Annual Meeting 2024, Barcelona, Spain, 1–6 Sep 2024, EMS2024-231, https://doi.org/10.5194/ems2024-231, 2024.

15:00–15:15 | EMS2024-406 | Onsite presentation
Beatrix Izsák and Tamás Szentimrey

In essence, the theme of homogenization can be divided into two subgroups: homogenization of monthly data series and homogenization of daily data series. These subjects are of course strongly connected; for example, the monthly results can be used for the homogenization of daily data. The earlier versions of our method MASH (Multiple Analysis of Series for Homogenization; Szentimrey) were developed for homogenization of daily and monthly data series in the mean, i.e. the first-order moment. The MASH software was developed as an interactive automatic, artificial intelligence (AI) system that simulates human intelligence and mimics human analysis on the basis of advanced mathematics. The new version MASHv4.01 is also able to homogenize the standard deviation, i.e. the second-order moment. We remark that if the data are normally distributed (e.g. mean temperature), then homogenization of the mean and standard deviation is sufficient, since for a normal distribution, if the first two moments are homogeneous then the higher-order moments are also homogeneous.
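
The remark about the first two moments can be illustrated with a minimal sketch (not the MASH algorithm itself): rescaling a candidate segment so that its mean and standard deviation match a reference segment, which for normally distributed data also aligns all higher-order moments:

```python
import numpy as np

def adjust_first_two_moments(segment, reference):
    """Rescale a candidate segment so that its mean and standard deviation
    (first two moments) match a reference segment; for normally distributed
    data this also aligns all higher-order moments."""
    z = (segment - segment.mean()) / segment.std(ddof=1)
    return z * reference.std(ddof=1) + reference.mean()

rng = np.random.default_rng(2)
reference = rng.normal(loc=10.0, scale=3.0, size=1000)   # homogeneous part of a series
segment = rng.normal(loc=11.2, scale=2.1, size=1000)     # inhomogeneous part
adjusted = adjust_first_two_moments(segment, reference)
print(adjusted.mean(), adjusted.std(ddof=1))             # ~10.0 and ~3.0
```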

In our presentation, we present the application of the MASHv4.01 software to real climate data. From 1 January 1901 to 31 December 2023, the daily mean temperature data series of 27 stations in Hungary were homogenized in both the first and second moments. We present our results based on verification statistics. The automatic verification procedure developed in MASH makes it possible to test the homogenized time series systematically in order to clear up the uncertainty. The basic concept of the verification procedure is that confidence in the homogenized series may be increased by the joint comparative examination of the original and the homogenized series systems. The comparison is based on adequate questions which can also be formulated mathematically; consequently, a programmed statistical test procedure can be obtained to evaluate the quality of the homogenized series. The questions are related to the estimated inhomogeneity of the series before and after homogenization, the comparison of the measure of modification of the series with the estimated inhomogeneity of the original series, the representativity of the given station network with respect to the homogenization, and the relation between the estimated inhomogeneities and the metadata.

The research presented in the article was carried out within the framework of the Széchenyi Plan Plus program with the support of the RRF 2.3.1 21 2022 00008 project.

How to cite: Izsák, B. and Szentimrey, T.: Homogenization in mean and standard deviation (MASHv4.01), EMS Annual Meeting 2024, Barcelona, Spain, 1–6 Sep 2024, EMS2024-406, https://doi.org/10.5194/ems2024-406, 2024.

15:15–15:30 | EMS2024-410 | Onsite presentation
Olivér Szentes, Mónika Lakatos, and Rita Pongrácz

The longest period of meteorological measurements in Hungary is available for temperature records. A substantial expansion of the number of data series used to homogenize temperature is currently under way, extending back to the second half of the 19th century. The main motivation of the study is that a long-term database based on high-quality measurements is essential to better understand the regional climate and its changes.

Improving the understanding of climate and its changes requires temporally and spatially representative climate databases. However, measurement conditions change frequently: relocations of stations, instrument changes, changes in measurement time and changes in environmental conditions can all cause inhomogeneities in the data series, and therefore homogenization is needed.

For the homogenization of data series, quality control and filling in of missing values, the MASH (Multiple Analysis of Series for Homogenization) procedure (MASHv3.03 software) is used at the Climate Research Department of the HungaroMet Hungarian Meteorological Service. Nowadays, temperature is measured in many more places than, for example, 100-150 years ago, so the homogenization consists of several steps (i.e. 3 or more MASH systems). For example, we currently use temperature data from 34 stations from 1901 and 55 stations from 1951. These station networks are being further extended. Inhomogeneities are estimated using the monthly data series. Monthly, seasonal and annual inhomogeneities are harmonized in all MASH systems. Then, we create temporally representative data series using the MASH homogenization procedure.

However, weather stations are not evenly distributed; the station network consists of both densely and sparsely covered subregions. In order to estimate the values of meteorological variables at points where no measurements are available, a spatial interpolation method must be used. Our gridded climate datasets are generated using the MISH (Meteorological Interpolation based on Surface Homogenized Data Basis) method (MISHv1.03 software). The use of MISH interpolation results in a spatially representative climate database.

In this presentation, the new mean temperature station systems used for homogenization are described together with the most important verification statistics of the homogenization of mean temperature data series, and finally, the gridded spatial means (national averages for Hungary) are analyzed from the mid-19th century to the present.

 

Acknowledgements:

The research presented was carried out within the framework of the Széchenyi Plan Plus program with the support of the RRF 2.3.1 21 2022 00008 project.

How to cite: Szentes, O., Lakatos, M., and Pongrácz, R.: Homogenized and gridded mean temperature data series in Hungary from the mid-19th century, EMS Annual Meeting 2024, Barcelona, Spain, 1–6 Sep 2024, EMS2024-410, https://doi.org/10.5194/ems2024-410, 2024.

Coffee break
Chairpersons: Dan Hollis, Federico Fierli
Quality Control
16:00–16:15 | EMS2024-367 | Online presentation
Yuri Brugnara, Martin Steinbacher, and Lukas Emmenegger

The Global Atmosphere Watch (GAW) Programme of the World Meteorological Organization coordinates a worldwide network of hundreds of ground-based in-situ monitoring stations that provide reliable scientific data on the chemical composition of the atmosphere. In the framework of the GAW Programme, the Quality Assurance/Scientific Activity Centre at Empa has developed an interactive dashboard based on data science to support station operators in timely detecting issues in their in-situ measurements of various trace gases.

The application (GAW-qc), currently in beta testing, makes use of a mixture of purely data-driven and hybrid anomaly detection techniques. It exploits historical measurements made at the target station as well as the archive of gridded numerical forecasts by the Copernicus Atmosphere Monitoring Service (CAMS). The accuracy of the latter for the specific site is improved through machine learning using various predictors, including meteorological parameters and aerosol concentrations.
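
A minimal, hypothetical sketch of this kind of hybrid approach, training a gradient-boosting model on CAMS values and meteorological predictors (all synthetic here) and flagging measurements that deviate strongly from the corrected expectation; it is not the GAW-qc implementation:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical training data: a CAMS forecast value plus meteorological predictors
# at the station, with the station's own historical measurement as the target.
rng = np.random.default_rng(3)
n = 2000
cams = 410 + 5 * rng.standard_normal(n)          # synthetic CAMS mole fraction
wind = rng.gamma(2.0, 2.0, n)                    # synthetic wind speed
temp = 5 + 10 * rng.standard_normal(n)           # synthetic temperature
X = np.column_stack([cams, wind, temp])
y = cams + 2.0 / (1.0 + wind) + 0.05 * temp + rng.standard_normal(n)  # synthetic "truth"

model = GradientBoostingRegressor().fit(X, y)    # site-specific correction model
residual_std = np.std(y - model.predict(X))

def flag_anomalies(x_new, y_new, z_threshold=4.0):
    """Flag measurements that deviate strongly from the corrected expectation."""
    z = np.abs(y_new - model.predict(x_new)) / residual_std
    return z > z_threshold

# Example: a measurement far above what the corrected model expects
print(flag_anomalies(np.array([[412.0, 1.0, 8.0]]), np.array([430.0])))
```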

GAW-qc allows station operators to upload their latest measurements, visualize the data with different temporal aggregations, and easily detect anomalous values using just their internet browser. By combining the information gathered from the dashboard with logbook entries and local expertise, they can effectively flag problematic measurements and even detect instrumental issues that would otherwise remain unnoticed. First case studies indicate that this process can indeed facilitate the detection of malfunctions in the analytical setup and reduce the ingestion of erroneous data into the international data repositories. Moreover, it has the potential to shorten data gaps if applied in a timely manner. Therefore, it may become a game-changer towards reliable, comparable and traceable worldwide datasets in the field of air quality and greenhouse gases. The software is freely available through a GitHub repository and can be adapted to analyze other atmospheric variables.

How to cite: Brugnara, Y., Steinbacher, M., and Emmenegger, L.: GAW-qc: A data science-based dashboard for quality control of atmospheric composition measurements, EMS Annual Meeting 2024, Barcelona, Spain, 1–6 Sep 2024, EMS2024-367, https://doi.org/10.5194/ems2024-367, 2024.

Monitoring
16:15–16:30 | EMS2024-555 | Online presentation
Exploring the use of automatic weather stations for climate monitoring: The Basque Country case study
(withdrawn after no-show)
Roberto Hernandez, Maddalen Iza, Maialen Martija, Santiago Gaztelumendi, and José Antonio Aranda
16:30–16:45 | EMS2024-831 | Onsite presentation
Barbara Chimani and Anna-Maria Tilg

Climate normal periods are a common way to describe the climate and its changes. With the period 1991–2020, the first independent climate normal period after the frequently used reference period 1961–1990 is available. While, according to the WMO recommendation, the period 1961–1990 should still be used as the reference for long-term climate development, the newer period describes the current climate and will be used in many applications across different sectors such as agriculture, infrastructure and energy.

The stations for which climate normal period values for 1991-2020 have been calculated were selected depending on their:

  • inclusion in regular international data exchange
  • use in climate products and services provided by GeoSphere Austria
  • inclusion in former homogenization activities
  • use in former national climate normal period datasets
  • the current state of the measurements of the stations
  • completeness in the period 1991-2020.

The homogenization of the selected stations' data was performed for the daily mean temperature, the daily maximum temperature, the daily minimum temperature and daily precipitation, for as long a period as possible (if possible back to 1961). This was done in order to compare the most recent climate normal period with the 1961-1990 period. For the calculation of both climate normal periods the same completeness criteria were used. The method used for homogenization was ACMANT. Finally, around 170 stations were selected for most of the calculated parameters (significantly fewer stations were taken into account for snow depth, radiation and sunshine duration). The homogenization was evaluated by analysing the breaks and adjustments, by comparison with parallel data, and by its effect on trends and climate normal values. Additionally, the results for the 1991-2020 climate normal period were compared with results from neighbouring countries.
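
As an illustration of a climate-normal calculation with completeness criteria, a minimal sketch with placeholder thresholds (the actual criteria used by GeoSphere Austria may differ):

```python
import numpy as np
import pandas as pd

def monthly_normals(daily, start="1991-01-01", end="2020-12-31",
                    max_missing_days=10, min_years=24):
    """Compute 1991-2020 monthly climate normals from a daily series whose index
    covers every calendar day (missing observations stored as NaN). A month of a
    given year is kept only if it has at most `max_missing_days` missing days;
    a calendar-month normal is kept only if at least `min_years` monthly values
    survive. Both thresholds are illustrative placeholders."""
    period = daily.loc[start:end]
    groups = period.groupby([period.index.year, period.index.month])
    monthly_mean = groups.mean()                            # mean of available days
    n_missing = groups.apply(lambda g: g.isna().sum())      # missing days per month
    monthly_mean[n_missing > max_missing_days] = np.nan     # reject incomplete months
    normals = monthly_mean.groupby(level=1).mean()          # average over the 30 years
    n_years = monthly_mean.groupby(level=1).count()
    return normals.where(n_years >= min_years)

# Example usage (with a hypothetical daily temperature series):
# daily = pd.Series(values, index=pd.date_range("1961-01-01", "2023-12-31"))
# normals_1991_2020 = monthly_normals(daily)
```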

The presentation will cover aspects of the homogenization, its evaluation and the effects of climate change.

How to cite: Chimani, B. and Tilg, A.-M.: Austria data for climate normal period 1991-2020, EMS Annual Meeting 2024, Barcelona, Spain, 1–6 Sep 2024, EMS2024-831, https://doi.org/10.5194/ems2024-831, 2024.

16:45–17:00 | EMS2024-877 | Onsite presentation
John O'Sullivan, Mary Curley, Ciarán Kelly, and Jonathan McGovern

It is essential to have validated and trusted records of past climate extremes.

These records are used by planners and policymakers to help them make informed decisions regarding many different sectors - from construction projects to health budgets, from environmental legislation to infrastructure planning, for example.

They are also used to tune and improve climate models, leading to more reliable future projections in a changing climate. Assessing and improving on the abilities of climate models to reproduce these (by definition) rare events, provides a stronger basis from which better informed mitigation and adaptation measures against such potential future climate extremes can be taken.

In this research, we present the comprehensive re-evaluation process of the Irish national maximum air temperature record and of the monthly records for June, July and August.

For records prior to 1961, we use newly digitised historical climate data from the Met Éireann archives and integrate advanced 20CRv3 sparse-input reanalysis data, station metadata, historical newspaper articles, and contemporaneous references from the examined timeframes. For more recent records, we use climate data from the Met Éireann database and integrate the most recent ECMWF reanalysis products, along with all relevant metadata for our analysis. We also employ time series methods and extreme value theory to help us assess the veracity of the records.
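
One common way extreme value theory is used to gauge how unusual a candidate record is, shown as a minimal sketch with synthetic annual maxima (not the authors' analysis or data):

```python
import numpy as np
from scipy import stats

# Hypothetical annual maximum temperatures (deg C), for illustration only
annual_maxima = 28 + stats.genextreme.rvs(c=-0.1, scale=1.5, size=80, random_state=4)

# Fit a GEV distribution to the annual maxima
shape, loc, scale = stats.genextreme.fit(annual_maxima)

# How unusual is a candidate record value under the fitted distribution?
candidate_record = 33.3   # hypothetical value under scrutiny
exceedance_prob = stats.genextreme.sf(candidate_record, shape, loc=loc, scale=scale)
return_period = 1.0 / exceedance_prob   # in years, assuming one maximum per year
print(f"Estimated return period of {candidate_record} degC: {return_period:.0f} years")
```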

The result of this study will be a list of validated maximum air temperature summer monthly records for Ireland. This process will then be applied to other months and other climate variables in future work.

This research underscores the significance of data rescue efforts in advancing our understanding of past climate extremes, and advocates for continued digitisation and analysis of historical climate data and metadata. By refining national air temperature records through the integration of historical data and advanced reanalysis techniques, the research contributes to a more comprehensive understanding of climate dynamics.

How to cite: O'Sullivan, J., Curley, M., Kelly, C., and McGovern, J.: Reassessing Ireland’s maximum air temperature record value and the monthly maximum air temperature records for June, July and August, EMS Annual Meeting 2024, Barcelona, Spain, 1–6 Sep 2024, EMS2024-877, https://doi.org/10.5194/ems2024-877, 2024.

17:00–17:15 | EMS2024-954 | Onsite presentation
Elke Rustemeier, Peter Finger, Markus Ziese, and Zora Schirmeister

Reliable observational data are essential for robust climate analyses, especially long-term trends.

HOMPRA Europe 2 (HOMogenized PRecipitation Analysis of European in-situ data) is a gridded monthly precipitation dataset based on homogenized time series. The carefully selected database of more than 5500 stations is a subset of the time series collected by the Global Precipitation Climatology Centre (GPCC). The subset is characterized by few missing values (< 20%) and intensive quality control.

Compared to its predecessor, the new HOMPRA Europe 2 product covers a longer period, 1951-2015, and is synchronized with the more recent data of the GPCC Full Data Monthly product, so that the data can be extended to the present without a break, although a homogeneous time series cannot be guaranteed for these more recent data.

The actual homogenization process consists of multiple steps: In the first step, for each station, comparable time series are selected. The decision is based on the correlation and Ward's method of minimum variance performed on the deterministic first derivative. For the artificial breakpoint detection, natural variability and natural trends are temporarily removed using the previously selected neighboring time series. This ensures that only artificial changes can be detected. The actual detection applied to annual values is based on the segmentation algorithm of Caussinus and Mestre (2004). In the final step, the breaks detected in the monthly data are corrected using multiple linear regression (Mestre, 2003).
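
A much-simplified sketch of the correction idea, in which shared natural variability is removed with a composite of neighbouring series and the shift at an already-detected break is estimated from the difference series; the Caussinus-Mestre detection and the regression model of Mestre (2003) are not reproduced here:

```python
import numpy as np

def estimate_break_adjustment(candidate, neighbours, break_index):
    """Estimate and remove the artificial shift at a known break position,
    using a composite of neighbouring series to cancel shared natural signal."""
    reference = neighbours.mean(axis=0)            # composite of neighbouring stations
    diff = candidate - reference                   # natural variability largely cancels
    shift = diff[break_index:].mean() - diff[:break_index].mean()
    corrected = candidate.copy()
    corrected[:break_index] += shift               # bring the earlier segment in line
    return corrected, shift

# Example with a +0.8 artificial shift before index 60
rng = np.random.default_rng(7)
common = rng.normal(0.0, 1.0, 120)                      # shared climate signal
neighbours = common + rng.normal(0.0, 0.3, (4, 120))    # four correlated neighbours
candidate = common + rng.normal(0.0, 0.3, 120)
candidate[:60] += 0.8
corrected, shift = estimate_break_adjustment(candidate, neighbours, 60)
print(round(shift, 2))                                  # close to -0.8
```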

An additional random variation of the neighboring stations shows the robustness of the correction of the individual time series.

Finally, the actual HOMPRA Europe 2 product is created by interpolating the homogenized series onto a 1° grid using the modified spheremap interpolation schemes in operation at the GPCC (Becker et al., 2013 and Schamm et al., 2014).

 

How to cite: Rustemeier, E., Finger, P., Ziese, M., and Schirmeister, Z.: HOMPRA Europe 2 – An update of a gridded precipitation data set from European homogenized time series, EMS Annual Meeting 2024, Barcelona, Spain, 1–6 Sep 2024, EMS2024-954, https://doi.org/10.5194/ems2024-954, 2024.

Posters: Thu, 5 Sep, 18:00–19:30 | Poster area 'Galaria Paranimf'

Display time: Thu, 5 Sep, 13:30–Fri, 6 Sep, 16:00
GP36 | EMS2024-305
Monica Herrero-Anaya, Xavi de Yzaguirre, Marc Prohom, Jordi Cunillera, Toni Barrera, Adrian Ruiz, and Jordi Montserrat

Since the restoration of the Meteorological Service of Catalonia (SMC) in 2002, data rescue, quality control and homogeneity analysis of climate series have been among the main focuses of the Climatology Area of the SMC. Working with continuous series, having wide temporal coverage and ensuring good spatial density are essential for accurately characterizing climate change and variability.

To achieve this goal, significant effort must be invested in data preservation and rescue, identifying meteorological series of interest, and ensuring good digitization, transcription and cataloguing. This work describes the process of data rescue at the SMC, from identifying meteorologically significant collections to entering all data into the historical database of the SMC, while preserving the original documentation.

Collections with meteorological documentation vary widely.

  • Historical newspapers. In recent years many state, provincial and local archives in Catalonia have digitized historic newspapers. The SMC has identified over 175 periodicals containing meteorological observations and metadata.
  • Private archives. Several private archives, both religious and corporate, have been found to contain documentation related to meteorological observations.
  • Observers, or their relatives, who have preserved the original documentation and contact the SMC to ensure its preservation.

When a collection arrives at SMC, there are essentially four possibilities. (1) First, it may involve a temporary transfer of documentation, in which case it is digitized and returned to the owner. (2) Alternatively, if photography of the documentation is not permitted, it may be necessary to visit the location and transcribe the data. (3) A third option is that, once the documentation is digitized, it is restored if necessary and then entered into the National Archive of Catalonia through an agreement with SMC to ensure optimal preservation conditions. (4) Finally, data may be provided directly by the observer (already digitized or not), which can then be entered directly into the historical database.

Once the data is extracted, any existing gaps in the historical database are filled, and coverage periods of the stations are extended. Data from previously unpublished stations are also incorporated, enriching and enhancing our ability to monitor variability and climate change in Catalonia.

How to cite: Herrero-Anaya, M., de Yzaguirre, X., Prohom, M., Cunillera, J., Barrera, T., Ruiz, A., and Montserrat, J.: 20 years of data rescue initiatives at the Meteorological Service of Catalonia, EMS Annual Meeting 2024, Barcelona, Spain, 1–6 Sep 2024, EMS2024-305, https://doi.org/10.5194/ems2024-305, 2024.

GP37 | EMS2024-331
Kevin Healion, Peter Thorne, Simon Noone, Axel Andersson, Gerard van der Schrier, Alastair McKinstry, and Paul Poli
 

Access to climate data is essential if we are to better understand the climate of the past, present and future. Climate scientists require data to reconstruct past climate and extreme weather events, to create seasonal forecasts and to produce climate projections. Various private and public sector actors also require climate data as part of their climate-related decision-making and planning. Historical data can assist the insurance sector by providing information on past extreme weather events. Farmers require data to understand how the future climate will impact their output. The data can also help populations who live along coastlines better understand the changing nature of storm surges. Finally, those concerned about biodiversity can use the data to understand how climate change may impact flora and fauna in future.  

Our proposed poster will provide a visual representation of the various services offered by the Copernicus Climate Change Service for data rescue and surface meteorological data access over land and marine domains. We shall outline key tools available including the data rescue service and the data deposition service. We will also provide an overview of the data available via the C3S data store (CDS). We shall outline progress to date on improved curation of the fundamental data record of surface meteorological holdings. This includes key collections that have been recently secured via Copernicus agreements and new open data policies from various European National Meteorological and Hydrological Services. We will provide information on what data will be available in the next data release and plans for the very final release of the current contract. Finally, we will highlight how you can get involved to help improve the curation and access to the fundamental data record. 

The core mission of the Copernicus Climate Change Service is to “support adaptation and mitigation policies of the European Union by providing consistent and authoritative information about climate change”. Rescuing historical data and making that data freely accessible forms an important part of this core mission and is vital as Europe and the rest of the world prepare for further changes to the global climate. 

How to cite: Healion, K., Thorne, P., Noone, S., Andersson, A., van der Schrier, G., McKinstry, A., and Poli, P.: The Copernicus Climate Change Service’s data rescue and surface meterological data collection effort , EMS Annual Meeting 2024, Barcelona, Spain, 1–6 Sep 2024, EMS2024-331, https://doi.org/10.5194/ems2024-331, 2024.

GP38 | EMS2024-340
Gregor Vertačnik

Homogenisation and interpolation of bright sunshine duration time series in Slovenia have recently been renewed at the Slovenian Environment Agency (ARSO). First, the quality of the measured hourly to daily data at meteorological stations for the period 1960–2022 was checked. The quality-controlled data were then aggregated to monthly sums, yielding time series for 27 stations, separately for three subdaily intervals. The monthly data were normalized relative to the maximum possible sunshine duration and transformed with the arcsine function to improve the homogenisation with the HOMER software tool. Homogenisation was run in several iterations, using metadata and applying an additive correction model with seasonally dependent corrections. The resulting homogenised time series were then back-transformed into the measurement units (hours).

The original subdaily values of sunshine duration were adjusted to match the monthly values of the homogenised data. The first step of the adjustment procedure was the normalization of the subdaily data (i.e. division by the maximum possible sunshine duration for the day of year). The resulting ratio values from 0 to 1 were transformed by the arcsine function and then shifted by such a value that the monthly sum of the back-transformed values matched the homogenised monthly sum. Shifted values below –π/2 were set to –π/2, whereas values above π/2 were set to π/2. This way, all the original ratio values of 0 (no sunshine) and 1 (maximum possible sunshine) remain unchanged. This adjustment procedure takes into account both the sunshine duration frequency distribution and the influence of expected inhomogeneities.
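
A minimal sketch of this adjustment; the exact transform is not given in the abstract, so arcsin(2r - 1), which maps ratios in [0, 1] onto [-π/2, π/2] and is consistent with the clipping described, is assumed here:

```python
import numpy as np
from scipy.optimize import brentq

def adjust_subdaily_ratios(ratios, target_ratio_sum):
    """Shift arcsine-transformed daily sunshine ratios by a constant so that the
    back-transformed monthly sum matches the homogenised monthly sum; clipping at
    +/- pi/2 keeps ratios of exactly 0 and 1 unchanged.
    NOTE: the transform arcsin(2r - 1) is an assumption, not a confirmed detail."""
    t = np.arcsin(2.0 * ratios - 1.0)

    def back_transformed_sum(delta):
        shifted = np.clip(t + delta, -np.pi / 2, np.pi / 2)
        return ((np.sin(shifted) + 1.0) / 2.0).sum()

    # Find the constant shift that reproduces the homogenised monthly sum
    delta = brentq(lambda d: back_transformed_sum(d) - target_ratio_sum, -np.pi, np.pi)
    return (np.sin(np.clip(t + delta, -np.pi / 2, np.pi / 2)) + 1.0) / 2.0

# Example: nudge a month of daily ratios so their sum increases by 5 %
rng = np.random.default_rng(5)
daily_ratios = rng.beta(2, 2, size=30)
adjusted = adjust_subdaily_ratios(daily_ratios, 1.05 * daily_ratios.sum())
print(daily_ratios.sum(), adjusted.sum())
```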

Missing subdaily data was replaced by spatially interpolated values of homogenised daily ratios. The distribution of raw interpolated values was partly adjusted to the distribution of the corresponding homogenised data in order to improve the statistics of extreme values.

The resulting collection of homogenised and interpolated subdaily data of bright sunshine duration was statistically analysed. Time series of the data show a statistically significant positive linear trend on an annual scale throughout the 63-year period. The trend is stronger for morning (around 3.7 % per decade) than for midday (2.0 % per decade) and afternoon values (2.5 % per decade). For autumn the trend is weak and insignificant, whereas for winter morning hours the trend reaches around 8 % per decade in the lowlands of eastern Slovenia.

Homogenised and interpolated time series have been used for the calculation of climate normals for the latest WMO standard reference period (1991–2020) and are planned to be used to improve climate projections for Slovenia.

Keywords: sunshine duration, climate change, homogenisation, spatial interpolation

How to cite: Vertačnik, G.: Homogenisation and interpolation of subdaily bright sunshine duration time series in Slovenia, EMS Annual Meeting 2024, Barcelona, Spain, 1–6 Sep 2024, EMS2024-340, https://doi.org/10.5194/ems2024-340, 2024.

GP39 | EMS2024-384
A high density observational dataset of daily meteorological data for the Extended Alpine Region
(withdrawn)
Giulio Bongiovanni, Michael Matiu, Alice Crespi, Anna Napoli, Bruno Majone, and Dino Zardi
GP40 | EMS2024-654
Thomas Möller, Tina Leiding, Axel Andersson, Janosch michaelis, Akio Hansen, Florian Imbery, and Thomas Junghänel

Historic observational data records are an important contribution to climate reconstructions and the analysis of past weather events. Particularly in remote and data sparse regions, such as the open ocean, newly rescued data can significantly improve the knowledge of weather and climatic conditions in earlier decades and centuries.

Deutscher Wetterdienst (DWD) has several collections of original historical weather records from land stations and ships worldwide. They comprise not only observations from Germany, but also from the world’s oceans and land stations in many parts of the world.

All German state-owned meteorological observations since the founding of the Prussian Meteorological Institute in 1848 are collected in DWD’s main archive in Offenbach.

DWD’s office in Hamburg holds the marine archive, starting with the collections of the German Naval Observatory, 'Deutsche Seewarte', which existed from 1868 to 1945. It includes marine data records from ships and land stations in many parts of the world (e.g. from former German colonies), as well as signal stations situated at the coasts of the North and Baltic Sea.

With increasing computing resources, high temporal resolution data has increasingly become the focus of climate research in recent years. Thus, the processing of such unique historical datasets creates considerable added value. The digitization of recording strips from pluviographs, for example, is currently one focal point of the data rescue activities at DWD.

The documentation, digitization and quality checking of the enormous amount of handwritten journals in the data archives is still ongoing. The digitized data are freely accessible to the scientific community and are also continuously submitted to international data archives, such as ICOADS and ISPD. This also makes the data an important input for regional and global reanalyses.

The presentation will give an overview of the historical archives of Deutscher Wetterdienst. The challenges are discussed, and the latest progress of the digitization efforts and ongoing analyses of the data are shown.

How to cite: Möller, T., Leiding, T., Andersson, A., michaelis, J., Hansen, A., Imbery, F., and Junghänel, T.: Data rescue of national and international meteorological observations at Deutscher Wetterdienst, EMS Annual Meeting 2024, Barcelona, Spain, 1–6 Sep 2024, EMS2024-654, https://doi.org/10.5194/ems2024-654, 2024.

GP41 | EMS2024-768
Ada Barrantes, Cristina Carnerero, and Jan Mateu Armengol

Reliable air quality data are vital for informed decision-making, enabling evidence-based mitigation strategies to improve public health and sustainability. Although monitoring stations are essential for assessing air quality, they have limited spatial representativeness, leaving large areas without appropriate observational data. Numerical air quality systems, on the other hand, provide full spatial coverage. Nevertheless, modelled data are affected by persistent uncertainties, mainly due to inaccuracies in emission inventories and the complexity of the atmospheric processes involved in pollution transport. Data-fusion methods offer bias-corrected air quality maps with full spatial coverage. There is, however, a strong dependence on observational data availability to ensure reliable results from data-fusion methods.

In this study, we quantify the impact of imputing missing observational data in data-fusion methods. We focus on PM2.5 for the region of Catalonia (northeastern Spain) during 2019, for which data availability is strongly limited. We first present straightforward gap-filling methodologies, such as linear interpolation and persistence (repetition of the previous available value). We then compare these techniques with a state-of-the-art artificial-intelligence gap-filling method based on the Gradient Boosting Machine (GBM) algorithm trained with several years of data (2019-2022). To assess the gap-filling methodologies, we generate random gaps of varying characteristics, identifying the optimal technique for each gap size and frequency. Finally, we study how these methods affect the data-fusion process applied to the mesoscale air quality model CALIOPE. The output of this system has a horizontal resolution of 1 km x 1 km on a daily scale. The data-fusion method uses universal kriging, a geostatistical technique based on a regression model and the spatial correlation between the model and observational data.
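
A toy sketch comparing the three gap-filling approaches on an artificial gap in a synthetic daily series; the predictors and data are placeholders and this is not the CALIOPE workflow:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(6)
idx = pd.date_range("2019-01-01", periods=365, freq="D")
# Synthetic PM2.5-like daily series with a seasonal cycle and noise (illustrative only)
truth = 15 + 8 * np.sin(2 * np.pi * np.arange(365) / 365) + 3 * rng.standard_normal(365)
obs = pd.Series(truth, index=idx)

# Inject an artificial gap to evaluate the gap-filling methods against the truth
gap = slice(150, 165)
with_gap = obs.copy()
with_gap.iloc[gap] = np.nan

persistence = with_gap.ffill()                   # repeat the last available value
linear = with_gap.interpolate(method="time")     # linear interpolation in time
# Gradient boosting trained on calendar features of the non-missing days
features = pd.DataFrame({"doy": idx.dayofyear, "month": idx.month}, index=idx)
mask = with_gap.notna()
gbm = GradientBoostingRegressor().fit(features[mask], with_gap[mask])
gbm_filled = with_gap.copy()
gbm_filled[~mask] = gbm.predict(features[~mask])

for name, filled in [("persistence", persistence), ("linear", linear), ("gbm", gbm_filled)]:
    rmse = np.sqrt(np.mean((filled.iloc[gap] - obs.iloc[gap]) ** 2))
    print(f"{name}: RMSE over the artificial gap = {rmse:.2f}")
```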

Data-fusion results show significant improvement when using gap-filled observational data. Notably, the method's effectiveness depends on observation availability, performing better with GBM-filled data.

How to cite: Barrantes, A., Carnerero, C., and Mateu Armengol, J.: The role of gap-filling observational data in air quality data-fusion methods: a case study with CALIOPE, EMS Annual Meeting 2024, Barcelona, Spain, 1–6 Sep 2024, EMS2024-768, https://doi.org/10.5194/ems2024-768, 2024.

GP42 | EMS2024-777
Veronika Lukasová, Svetlana Varšová, Milan Onderka, Dušan Bilčík, Anna Buchholcerová, and Pavol Nejedlík

Mountain weather stations with continuous measurements maintained over decades serve as indispensable data sources for analysing climate dynamics. The transition from manual to automatic measuring systems may introduce inhomogeneities in these climate data series; therefore, it may be necessary to adjust the data, particularly when considering a transition to solely automatic measurements.

In our study, we analysed climate data measured in parallel by an automatic weather station (AWS) and a manual weather station (MWS) at the Skalnaté Pleso Observatory (1778 m a.s.l.) in the High Tatra Mountains. Manual meteorological measurements have been performed at this location since 1943 using the same methods and devices. The AWS Physicus, with parallel recording of meteorological data, was installed at the observatory in 2014. Excluding the first years of trial operation with several data gaps in the AWS data, we processed six years of parallel measurements from 2017 to 2022 to derive corrections applicable to the AWS data, ensuring the continuity and homogeneity of the conventional long-term data series. We utilized monthly regressions (MR) and cumulative distribution functions (generalised extreme value – GEV, and Gaussian distribution – GAUSS) proposed in Lukasová et al. (2023, DOI:10.1127/metz/2023/1200) and applied them to AWS data measured in 2023. Our focus was on two essential climatological parameters: air temperature and atmospheric precipitation.

Comparison of data from the parallel manual and automatic measurements revealed an underestimation of monthly precipitation totals in the AWS data, with a mean bias error (MBE) of -6.8 mm and a root mean squared error (RMSE) of 20 mm. Among the methods considered, correction by MR yielded the lowest errors, reducing them to 1.5 mm and 11.3 mm for MBE and RMSE, respectively. For air temperature, the monthly AWS data were overestimated by 0.07, 0.28 and 0.11 °C for Tmax, Tmin and Tmean, respectively. The lowest errors after correction of Tmax were achieved with the MR and GAUSS methods, with an MBE of 0.0 °C and an RMSE of 0.1 °C for both. For Tmin and Tmean, the MR and GEV methods resulted in an MBE of 0.0 °C and an RMSE of 0.3 °C, and an MBE of 0.0 °C and an RMSE of 0.1 °C, respectively, for both methods. Based on these results, AWS data corrected using MR for atmospheric precipitation and MR/GAUSS for air temperature can be considered suitable for maintaining the continuity of the historical climate data series at the Skalnaté Pleso Observatory.
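
A minimal sketch of a monthly-regression (MR) correction and of the MBE/RMSE scores used above, under the assumption that MR means one linear fit of MWS on AWS values per calendar month; the published method may differ in detail:

```python
import numpy as np
import pandas as pd

def monthly_regression_correction(aws, mws):
    """Fit, per calendar month, a linear regression mapping parallel AWS values
    onto MWS values, and return a function correcting new AWS data (a simplified
    reading of the MR approach, not the published implementation)."""
    pairs = pd.DataFrame({"aws": aws, "mws": mws})
    coeffs = {}
    for month, grp in pairs.groupby(pairs.index.month):
        coeffs[month] = np.polyfit(grp["aws"], grp["mws"], deg=1)  # slope, intercept

    def correct(new_aws):
        out = new_aws.copy()
        for month, (slope, intercept) in coeffs.items():
            sel = new_aws.index.month == month
            out[sel] = slope * new_aws[sel] + intercept
        return out

    return correct

def mbe(pred, ref):
    return float(np.mean(pred - ref))

def rmse(pred, ref):
    return float(np.sqrt(np.mean((pred - ref) ** 2)))

# Example with synthetic parallel monthly means for 2017-2022
idx = pd.date_range("2017-01-01", "2022-12-01", freq="MS")
rng = np.random.default_rng(8)
mws = pd.Series(5 + 10 * np.sin(2 * np.pi * (idx.month - 1) / 12), index=idx) + rng.normal(0, 1, len(idx))
aws = mws + 0.28 + rng.normal(0, 0.2, len(idx))   # AWS reads slightly warm
corrected = monthly_regression_correction(aws, mws)(aws)
print(mbe(aws, mws), mbe(corrected, mws), rmse(corrected, mws))
```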

Despite our findings, we recommend continuing parallel manual and automatic measurements at high-altitude meteorological observatories exposed to extreme weather events. In case of equipment failure, it is often difficult to repair the equipment, especially in harsh weather conditions and with limited access to high mountains. For observatories at unique positions, this may cause data loss that is difficult to compensate for with data from other stations.

How to cite: Lukasová, V., Varšová, S., Onderka, M., Bilčík, D., Buchholcerová, A., and Nejedlík, P.: Adjustment of Monthly Air Temperature and Precipitation Data from Automatic System to Align with Manually Measured Long-Term Data at High-Altitude Observatory, EMS Annual Meeting 2024, Barcelona, Spain, 1–6 Sep 2024, EMS2024-777, https://doi.org/10.5194/ems2024-777, 2024.

GP43 | EMS2024-809
Maria Mercedes Poggi, Maria Laura Bettolli, Cesar Azorin-Molina, and Maria de los Milagros Skansi

Argentina, a country located in the southern part of South America, has a large wind resource. However, renewable wind energy at the national level is under-exploited, with less than 15% of the country's energy coming from renewable sources. Existing studies generally focus on the analysis of winds for particular stations or regions, such as Patagonia, where intense and persistent winds dominate throughout the year. Nevertheless, the study of wind and its socioeconomic and environmental implications is still lacking in Argentina.

This study aims to fill this research gap by quality controlling and homogenizing wind speed measurements from a meteorological network of 117 conventional stations of the National Meteorological Service of Argentina (SMN), using the R package Climatol. Here we also correct the biases introduced by new automatic weather stations, which are planned to progressively replace the conventional ones in the near future. Moreover, metadata about the types of, changes to and calibrations of anemometers (30% of the stations used cup anemometers) are rescued to support the correction of breakpoints detected by the Standard Normal Homogeneity Test (SNHT). Additionally, the quality-controlled and homogenized database is used here to assess for the first time (i) the spatio-temporal climatology and (ii) the long-term trends and multidecadal variability of near-surface mean wind speed across Argentina for 1961-2022.
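
A minimal sketch of the SNHT statistic named here, applied to a candidate-minus-reference series (Climatol implements this test, and the full homogenization workflow, in R):

```python
import numpy as np

def snht_statistic(diff_series):
    """Standard Normal Homogeneity Test statistic for a candidate-minus-reference
    series (a bare-bones sketch of the test, not the Climatol implementation)."""
    z = np.asarray(diff_series, dtype=float)
    z = (z - z.mean()) / z.std(ddof=1)
    n = len(z)
    T = np.empty(n - 1)
    for k in range(1, n):
        # Contrast the mean before and after a tentative breakpoint at position k
        T[k - 1] = k * z[:k].mean() ** 2 + (n - k) * z[k:].mean() ** 2
    return T.max(), int(np.argmax(T)) + 1   # max statistic and its position

# Example: a series with an artificial shift of +1 after position 120
rng = np.random.default_rng(9)
series = rng.normal(0.0, 1.0, 200)
series[120:] += 1.0
print(snht_statistic(series))   # large statistic, break located near position 120
```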

This research is highly important, as there is little evidence of changes in surface winds in the Southern Hemisphere, where both observations and projections show positive trends, denoting an interhemispheric asymmetry with respect to the decline (stilling) of winds observed in the Northern Hemisphere.

How to cite: Poggi, M. M., Bettolli, M. L., Azorin-Molina, C., and Skansi, M. D. L. M.: Homogenization and assessment of observed near-surface wind speed across Argentina, 1961-2022, EMS Annual Meeting 2024, Barcelona, Spain, 1–6 Sep 2024, EMS2024-809, https://doi.org/10.5194/ems2024-809, 2024.

GP44 | EMS2024-811
Mina Petrić, Branislava Lalić, and Cedric Marsboom

Biometeorological models simulating the dynamics and spread of insect species coupled with spatial decision support systems (SDSS) are becoming an important component of human and animal disease risk assessment and mitigation planning. To accurately simulate the dynamics of insect species and the anticipated spread of vector-borne disease (VBD), complex mathematical models are being developed which require precise and continuous micrometeorological forcing representative of the vector habitat, provided in near-real time. For this reason, reanalysis products such as ERA5-Land are not always suitable, and environmental wireless sensor networks (WSN) are often employed.

Regardless of the purpose of the observations, the quality of the recorded data is determined by the accuracy and completeness of the measurements, the spatial and temporal representativeness of the records, and the representativeness with respect to the goal of the observation. With the increasing use of autonomous environmental sensors employing different data-storage, transmission, and communication protocols, particularly in low-power/low-cost applications, the risk of data loss and measurement error grows, emphasising the importance of properly implementing quality planning and assurance.

In this paper we provide an outline of quality control (QC) and gap-filling methods for dealing with spatial and temporal gaps in meteorological measurements for different classes of biometeorological applications with a focus on: (i) mathematical vector population dynamics models; (ii) machine learning vector distribution models; and (iii) mechanistic vector distribution models.

Establishing a comprehensive quality assurance (QA) and gap-filling protocol for micrometeorological data in the field of biometeorological modelling will improve the understanding of the types of errors and limitations expected during analysis, as well as provide an overview of best-practices for a broader interdisciplinary audience.

How to cite: Petrić, M., Lalić, B., and Marsboom, C.: Gap-filling and quality control in biometeorological modelling: overview of best practices, EMS Annual Meeting 2024, Barcelona, Spain, 1–6 Sep 2024, EMS2024-811, https://doi.org/10.5194/ems2024-811, 2024.

GP45 | EMS2024-835
Isabel Knerr, Karsten Friedrich, and Florian Imbery

To ensure consistent time series of climate observations, the German Meteorological Service (DWD) carries out parallel measurements (e.g. wind and precipitation measurements) at climate reference stations. Parallel measurements of subsequent generations of operational measurement systems are carried out over several years at ten selected locations with different environmental conditions in order to evaluate the quality of the data obtained, to analyse systematic differences between instrument types and to ensure the homogeneity of the data series.

A key feature of climate reference observations is that they are monitored consistently over long periods of time to ensure a high degree of comparability of observations. Inconsistent data series are of limited use in climate research. In order to assess the consistency of climate observations, these time series must be described in detail using metadata and, in particular, by reliably specifying measurement uncertainties. The overall objective is to ensure the consistency of measurements, especially before and after sensor changes, and to develop the necessary automated procedures to detect breaks caused by instrument changes and to homogenise measurement series. This will ensure that the data remain consistent over long periods of time.

Precipitation measurements are affected by several factors that can lead to systematic underestimation. One of these factors is evaporation. This error can be significant, especially for automatic precipitation gauges equipped with a heating system. However, the potentially largest error is caused by wind. The shape of the funnel influences the wind field around the gauge, and small turbulent eddies can occur which can blow snow and, in particular, small raindrops over the gauge, leading to an underestimation of precipitation. To minimise this error, primary gauging stations are equipped with a wind shield, while secondary gauging stations are not.

These differences can vary from gauge to gauge, leading to inhomogeneities in long measurement series. In order to analyse and quantify these and other influences, the German Weather Service has established climate reference stations at ten locations with different natural conditions.

How to cite: Knerr, I., Friedrich, K., and Imbery, F.: Intercomparison of Measurements at Climate Reference Stations in Germany, EMS Annual Meeting 2024, Barcelona, Spain, 1–6 Sep 2024, EMS2024-835, https://doi.org/10.5194/ems2024-835, 2024.