EGU25-3384, updated on 14 Mar 2025
https://doi.org/10.5194/egusphere-egu25-3384
EGU General Assembly 2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Poster | Thursday, 01 May, 08:30–10:15 (CEST), Display time Thursday, 01 May, 08:30–12:30 | Hall X5, X5.21
Research on Quality Control Methodology of Automatic Precipitation Phase Observation Data Based on Machine Learning
Xiaofeng Ou¹, Hao Lin¹, and Xiaoyu Huang²
  • ¹Hunan Meteorological Research Institute, Changsha, China (183292195@qq.com)
  • ²National Meteorological Centre, Beijing, China (303551728@qq.com)

With the continued expansion and refinement of China's ground-based automatic meteorological observation network, an effective quality control system for automatic station data has become an urgent task. Quality control techniques for conventional surface variables such as precipitation, temperature, and pressure have been studied extensively in China and abroad, but quality control strategies for precipitation phase observations remain relatively unexplored. This study uses upper-air and manual surface observation datasets covering 2000 to 2014. Meteorological factors with a pronounced influence on precipitation phase are identified and selected, and the random forest algorithm is then applied to build a quality control model for automatic observations of three primary precipitation phases: rain, snow, and sleet. The model is used to assess the automatic precipitation phase observations collected from 2015 to 2023, after manual observations were discontinued. A total of 15,806 station-records are flagged as suspicious or incorrect. The stations with such anomalies are located predominantly in sparsely populated regions where maintenance is difficult, such as the Qinghai-Tibet Plateau, the Tianshan Mountains, and the mountainous areas of northern Heilongjiang. In contrast, Guangdong, Guangxi, Yunnan, Fujian, and Hainan show relatively high data quality, and eastern regions generally outperform western ones (Figure 1).
The flagged data are then verified manually. For example, at 14:00 on January 31, 2019, the quality control model indicated snowfall at the Wuqia, Akto, and Kashgar stations in Xinjiang, whereas the automatic observations registered rain. With ground temperatures at these stations of -10°C, -6°C, and -6°C respectively, rain is meteorologically implausible in Xinjiang in winter, so the automatic precipitation phase observations at these stations are deemed incorrect. Manual verification of a substantial number of the other suspicious and incorrect records shows that the identification accuracy of this quality control method exceeds 98.5%. The method has been incorporated into the operational quality control framework for ground automatic precipitation phase observations.
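
The workflow described above (train a random forest on the manual-observation era, then flag automatic observations that disagree with the model) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation. The two predictors (surface and 850 hPa temperature), the phase thresholds used to generate synthetic training data, the station names, and all function names are assumptions made for illustration; the abstract does not list the selected predictors, and an operational system would more likely rely on an established library implementation.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def best_split(rows, labels, n_try, rng):
    """Lowest-impurity (feature, threshold) among n_try randomly chosen features."""
    best = None
    for f in rng.sample(range(len(rows[0])), n_try):
        for t in sorted({r[f] for r in rows})[:-1]:  # skip the max so both sides are non-empty
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def grow_tree(rows, labels, n_try, rng, depth=0, max_depth=6):
    """Return a leaf label (str) or a node (feature, threshold, left, right)."""
    if depth == max_depth or len(set(labels)) == 1:
        return majority(labels)
    split = best_split(rows, labels, n_try, rng)
    if split is None:  # the sampled features were constant in this node
        return majority(labels)
    _, f, t = split
    lo = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    hi = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            grow_tree([r for r, _ in lo], [y for _, y in lo], n_try, rng, depth + 1, max_depth),
            grow_tree([r for r, _ in hi], [y for _, y in hi], n_try, rng, depth + 1, max_depth))

def predict_tree(node, row):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(rows, labels, n_trees=20, seed=0):
    """Bagged trees: bootstrap sample per tree, random feature subset per split."""
    rng = random.Random(seed)
    n_try = max(1, int(len(rows[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(grow_tree([rows[i] for i in idx], [labels[i] for i in idx], n_try, rng))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    return majority([predict_tree(tree, row) for tree in forest])

def flag_suspicious(forest, records):
    """Keep records whose automatic phase disagrees with the model's phase."""
    return [rec for rec in records
            if predict_forest(forest, rec["features"]) != rec["auto_phase"]]

def synthetic_manual_era(n, rng):
    """ASSUMPTION: synthetic stand-in for manual-era data; thresholds are illustrative."""
    rows, labels = [], []
    for _ in range(n):
        t_sfc = rng.randint(-20, 25)            # hypothetical surface temperature, degC
        t850 = t_sfc - 6 + rng.randint(-2, 2)   # hypothetical 850 hPa temperature, degC
        rows.append((t_sfc, t850))
        labels.append("snow" if t_sfc <= 0 else "sleet" if t_sfc <= 2 else "rain")
    return rows, labels

rows, labels = synthetic_manual_era(300, random.Random(42))
forest = fit_forest(rows, labels)

# Hypothetical automatic-era records (station names and values are invented).
automatic_era = [
    {"station": "station_A", "features": (-10, -16), "auto_phase": "rain"},
    {"station": "station_B", "features": (15, 9), "auto_phase": "rain"},
    {"station": "station_C", "features": (-8, -14), "auto_phase": "snow"},
]
flagged = flag_suspicious(forest, automatic_era)
print([rec["station"] for rec in flagged])
```

In this sketch a record is flagged whenever the model and the automatic sensor disagree; the flagged records would then go to manual verification, as in the Xinjiang example above.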

Figure 1. Frequency of stations flagged as suspicious or incorrect by the precipitation phase quality control, 2015 to 2023.

 

How to cite: Ou, X., Lin, H., and Huang, X.: Research on Quality Control Methodology of Automatic Precipitation Phase Observation Data Based on Machine Learning, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-3384, https://doi.org/10.5194/egusphere-egu25-3384, 2025.