- ¹Department of Agricultural Sciences, Clemson University, Clemson, SC, USA
- ²Artificial Intelligence Research Institute for Science and Engineering (AIRISE), School of Computing, Clemson University, Clemson, SC, USA
Severe weather events often develop rapidly and cause extensive damage, resulting in billions of dollars in losses annually. This paper explores the use of Large Language Models (LLMs) to reason effectively about the adversity of weather hazards. To tackle this issue, we gathered National Weather Service (NWS) flood reports covering the period from June 2005 to September 2024. Two pre-trained LLMs, Bidirectional and Auto-Regressive Transformers (BART-large) and Bidirectional Encoder Representations from Transformers (BERT), were employed to classify flood reports according to predefined labels. These models have parameter counts of 406 million and 110 million, respectively. We employed the Low-Rank Adaptation (LoRA) fine-tuning technique to enhance performance and memory efficiency. Both fine-tuning and few-shot learning were evaluated as strategies for adapting pre-trained language models to this domain-specific task. The methodology was applied in Charleston County, South Carolina, USA, a region vulnerable to compound flooding. Extreme events were unevenly distributed across the training period, resulting in imbalanced datasets: the “Cyclonic” category had significantly fewer instances in the report text data, while the “Flood” and “Thunderstorm” categories appeared more frequently. The findings revealed that while few-shot learning significantly reduced computational costs, fine-tuned models delivered more stable and reliable performance. Among the LLMs applied in this research, the BART model achieved higher F1 scores in the “Flood,” “Thunderstorm,” and “Cyclonic” categories while requiring fewer training epochs to reach optimal performance. Furthermore, the BERT model had the shortest overall training time (12 hours 17 minutes), indicating lower computational cost.
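The memory efficiency of LoRA comes from freezing the pre-trained weight matrices and training only a pair of low-rank factors whose product forms the weight update. A minimal NumPy sketch of this idea follows; it is illustrative only, and the dimensions, rank, and scaling shown are generic defaults, not the configuration used in this study.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16, r=4):
    """Forward pass of a LoRA-adapted linear layer.

    W       : frozen pre-trained weight, shape (d_in, d_out)
    A, B    : trainable low-rank factors, shapes (d_in, r) and (r, d_out)
    alpha/r : scaling applied to the low-rank update, as in the LoRA paper
    """
    return x @ W + (alpha / r) * (x @ A @ B)

rng = np.random.default_rng(0)
d_in, d_out, r = 768, 768, 4          # BERT-base hidden size, illustrative rank

W = rng.normal(size=(d_in, d_out))    # frozen during fine-tuning
A = rng.normal(size=(d_in, r)) * 0.01 # trainable, small random init
B = np.zeros((r, d_out))              # zero init: adapter starts as a no-op

x = rng.normal(size=(2, d_in))
# With B = 0 the adapted layer reproduces the frozen model exactly
assert np.allclose(lora_forward(x, W, A, B, r=r), x @ W)

# Trainable parameters per adapted matrix vs. full fine-tuning
print(r * (d_in + d_out), "vs", d_in * d_out)
```

Because only A and B receive gradients, each adapted matrix trains r·(d_in + d_out) parameters instead of d_in·d_out, which is what makes fine-tuning models of this size tractable on modest hardware.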
This comprehensive evaluation of LLMs across diverse NWS flood reports enhanced our understanding of their capabilities in text classification and offered valuable insights into leveraging these advanced techniques for weather disaster assessment.
How to cite: Neupane, A., Zafarmomen, N., and Samadi, V.: Leveraging Large Language Models for Enhancing and Reasoning Adverse Weather Hazard Classification, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-13882, https://doi.org/10.5194/egusphere-egu25-13882, 2025.