- National Disaster Management Research Institute, Disaster investigation division, Ulsan, Korea, Republic of (ecofriend97@gmail.com)
Winter road icing in South Korea is a critical hazard that causes numerous casualties every year. This study combines traffic accident statistics with text mining of news data to characterize icy road accidents quantitatively and to investigate the underlying social and structural causes and risks that numerical data alone rarely capture. First, approximately 2.15 million traffic accident records from the past decade (2014–2023) were extracted from the Traffic Accident Analysis System (TAAS). Using this dataset, we performed a spatiotemporal analysis of icy road accidents, categorized by time of occurrence, road type, road geometry, and surface conditions. The fatality rate for icy road accidents was 2.3%, approximately 1.35 times the 1.7% observed for general accidents, which underscores the danger of icing. Accidents were concentrated in the morning rush hour (08:00–10:00), which accounted for 20.8% of cases, and municipal roads had the largest share of accidents by road type (33.7%). The fatality rate was highest on national highways (7.9%), primarily because of high vehicle speeds. By road geometry, fatality rates were elevated in tunnels (8.3%) and on bridges (6.4%); we attribute this to the difficulty of evacuating constrained spaces when chain-reaction collisions or fires follow an initial icing-related accident. To identify the specific causes behind these statistical patterns, we analyzed news articles, which offer rapid and extensive contextual information on the problems and causes that arise in the course of an accident. To extract meaningful information systematically from large-scale unstructured news data, we used text mining based on Natural Language Processing (NLP).
This technique applies semantic analysis to identify relationships between key elements through sentence segmentation, tokenization, morphological analysis, and named entity recognition. We applied approximately 200 keywords related to accident causes, such as "delayed response," "unpreparedness," and "negligence," to roughly 37 million news articles from the past five years (2020–2025) in order to identify the specific "causes" behind the "phenomena" shown by the statistical data. The analysis identified 16 latent risk factors in road maintenance and situational awareness. These include drivers' difficulty in perceiving black ice as well as insufficient designation of icing-vulnerable sections, inadequate snow removal measures, and a lack of relevant policies and budget investment. In conclusion, the two analytical methods complement each other and together provide a multidimensional view of icy road accidents. Statistical analysis pinpointed "high-risk locations" (tunnels and bridges) and "vulnerable times" (rush hour), while text mining revealed that recurring accidents are rooted in administrative and human factors. This integrated approach brings to light policy blind spots and the context of driver behavior that numerical statistics can overlook. It thereby provides an evidence base for improving regulations and for establishing tailored safety information delivery systems, going beyond infrastructure improvements alone.
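The keyword-screening step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: plain substring matching stands in for the morphological analysis and named entity recognition mentioned in the abstract, the English keywords are stand-ins for the roughly 200 terms actually used, and the names `CAUSE_KEYWORDS`, `TOPIC_TERMS`, and `count_cause_keywords` are hypothetical.

```python
from collections import Counter

# Hypothetical English stand-ins; the study used about 200 cause-related keywords.
CAUSE_KEYWORDS = ["delayed response", "unpreparedness", "negligence"]
# Hypothetical topic filter terms, not taken from the study.
TOPIC_TERMS = ["icy road", "black ice", "road icing"]


def count_cause_keywords(articles, keywords=CAUSE_KEYWORDS, topic_terms=TOPIC_TERMS):
    """Count, per keyword, how many on-topic articles mention it at least once."""
    counts = Counter()
    for text in articles:
        lowered = text.lower()
        # Skip articles that never mention road icing.
        if not any(term in lowered for term in topic_terms):
            continue
        for keyword in keywords:
            # Substring match: a simplification of real morphological analysis.
            if keyword in lowered:
                counts[keyword] += 1
    return counts


# Invented sample texts, for illustration only.
sample_articles = [
    "Black ice caused a pile-up; residents blamed the delayed response of road crews.",
    "Negligence and delayed response cited after icy road crash in tunnel.",
    "Stock markets rallied despite negligence claims against a bank.",
]

counts = count_cause_keywords(sample_articles)
for keyword, n in counts.most_common():
    print(keyword, n)
```

Counting articles rather than raw occurrences keeps a single long report from dominating the tally. A production version for Korean-language news would presumably replace the substring test with a morphological analyzer so that inflected forms of each keyword are matched.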
How to cite: Choi, S. and Kim, J. E.: Identifying Characteristics and Latent Risks of Icy Road Traffic Accidents through Integrated Analysis of Traffic Statistics and News Big Data-based Text Mining, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-6287, https://doi.org/10.5194/egusphere-egu26-6287, 2026.