EGU26-14791, updated on 14 Mar 2026
https://doi.org/10.5194/egusphere-egu26-14791
EGU General Assembly 2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
Poster | Wednesday, 06 May, 16:15–18:00 (CEST), Display time Wednesday, 06 May, 14:00–18:00
Hall X3, X3.71
Combining human and AI approaches for effective digitization of historical atmospheric electricity records
Hripsime Mkrtchyan, Keri Nicoll, and Giles Harrison
  • University of Reading, Department of Meteorology, Reading, United Kingdom (hrip.mkrtchyan@gmail.com)

Long-term records of the atmospheric electric field, measured as the potential gradient (PG), were obtained at Lerwick Observatory (Shetland Isles), UK, from 1925 to 1984. They provide a unique resource for studying links between atmospheric electricity, the global electric circuit (GEC), and climate variability. Most of these historical observations were originally recorded in handwritten or printed form, limiting their accessibility for modern analysis. In this project, we have undertaken a comprehensive digitization of the Lerwick PG dataset, at hourly resolution, by combining contributions from a citizen science platform with several AI tools.

The earliest handwritten records, made from 1927 to 1956, were digitized through the Zooniverse citizen science platform, where volunteers transcribed the data. To digitize the later records, from 1957 to 1984, which are mainly printed and scanned tables, we used AI-based optical character recognition (OCR) tools from several software packages. An essential part of the transcription was the use of multiple validation steps to assess and correct errors introduced by both the AI-based tools and the citizen science activity. With these techniques, we optimized the effectiveness of the digitization to provide the most scientifically useful dataset.
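As an illustration only, one common form such a validation step can take is a consensus check across repeated transcriptions of the same table cell, followed by a plausibility check on the agreed value. The sketch below is not the procedure used in this project: the function names, the agreement threshold, and the PG bounds are all hypothetical placeholders chosen for the example.

```python
from collections import Counter
from typing import Optional

# Hypothetical plausibility bounds for an hourly PG value in V/m.
# These are illustrative placeholders, not values from the Lerwick project.
PG_MIN, PG_MAX = -2000, 2000


def consensus(readings: list[Optional[int]],
              min_agreement: float = 0.6) -> tuple[Optional[int], bool]:
    """Combine repeated transcriptions of one cell.

    Returns (value, needs_review). The value is the most frequent
    reading; the cell is flagged for review when no reading exists or
    when too few transcriptions agree on the most frequent one.
    """
    valid = [r for r in readings if r is not None]
    if not valid:
        return None, True
    value, count = Counter(valid).most_common(1)[0]
    return value, (count / len(valid)) < min_agreement


def out_of_range(value: Optional[int]) -> bool:
    """Flag values outside the assumed plausible PG range."""
    return value is None or not (PG_MIN <= value <= PG_MAX)


if __name__ == "__main__":
    # Three independent transcriptions (volunteers or OCR engines) per cell.
    cells = {
        ("1957-01-01", 0): [120, 120, 125],
        ("1957-01-01", 1): [130, 180, 150],
        ("1957-01-01", 2): [None, None, None],
    }
    for key, readings in cells.items():
        value, review = consensus(readings)
        print(key, value, review or out_of_range(value))
```

Cells flagged in this way would then be checked by hand against the scanned page rather than corrected automatically.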

This work presents a summary of the digitized historical dataset from Lerwick and provides insights into the reliability and limitations of AI-assisted digitization of scientific archives. The new dataset will underpin modern investigations into long-term trends in atmospheric electricity and their connection to climate processes.

How to cite: Mkrtchyan, H., Nicoll, K., and Harrison, G.: Combining human and AI approaches for effective digitization of historical atmospheric electricity records, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-14791, https://doi.org/10.5194/egusphere-egu26-14791, 2026.