EGU25-13666, updated on 15 Mar 2025
https://doi.org/10.5194/egusphere-egu25-13666
EGU General Assembly 2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Poster | Friday, 02 May, 10:45–12:30 (CEST), Display time Friday, 02 May, 08:30–12:30 | Hall X1, X1.91
Small Data for Big Tasks in Seasonal Weather Forecasting: A Balanced Perspective on Interpretability and Predictability of NARMAX and Machine Learning Methods
Yiming Sun (1), Hua-Liang Wei (1), Edward Hanna (2), and Linh Luu (2)
  • (1) University of Sheffield, Sheffield, United Kingdom
  • (2) University of Lincoln, Lincoln, United Kingdom

Recent advances in machine learning (ML) have enabled significant progress in geoscience by capturing complex relationships and enhancing predictive skills. However, the success of many ML algorithms in data-rich settings does not seamlessly transfer to climate and atmospheric applications, where observational datasets are often limited. This underscores the need for methods that deliver high predictive accuracy under data-scarce conditions while retaining interpretability.

Here, we compare several ML approaches with the Nonlinear AutoRegressive Moving Average model with eXogenous inputs (NARMAX) in typical small-data climate applications, such as seasonal weather forecasting and Greenland Blocking Index (GBI) prediction. NARMAX, a transparent, interpretable, parsimonious and simulatable (TIPS) framework, performs robustly and avoids common pitfalls such as overfitting and unstable predictions when data are scarce. Notably, it achieves superior or competitive forecast accuracy under limited-data conditions, which highlights its practical value in operational climate science. Because it adopts a sparse system identification approach, NARMAX yields model structures that readily reveal the key predictors and their relative contributions, providing physical and statistical insights into climate variability.
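
To make the idea of sparse system identification concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a polynomial NARX model (the moving-average noise terms of full NARMAX are omitted for brevity) and uses greedy forward orthogonal selection ranked by the error reduction ratio (ERR), a structure-selection idea commonly used in the NARMAX literature. The data are synthetic, and the function names `build_candidates` and `forward_select`, the lag and degree settings, and the simulated system are all assumptions made for illustration.

```python
import itertools
import numpy as np


def build_candidates(y, u, max_lag=2, degree=2):
    """Build polynomial candidate terms from lagged outputs y and inputs u."""
    rows = np.arange(max_lag, len(y))
    base = {}
    for k in range(1, max_lag + 1):
        base[f"y(t-{k})"] = y[rows - k]
        base[f"u(t-{k})"] = u[rows - k]
    keys = list(base)
    names, cols = [], []
    for d in range(1, degree + 1):
        for combo in itertools.combinations_with_replacement(keys, d):
            names.append("*".join(combo))
            cols.append(np.prod([base[c] for c in combo], axis=0))
    return np.column_stack(cols), names, y[rows]


def forward_select(P, target, n_terms):
    """Greedy forward orthogonal selection ranked by error reduction ratio."""
    selected, errs, basis = [], [], []
    sigma = target @ target
    for _ in range(n_terms):
        best = (-1.0, None, None)
        for j in range(P.shape[1]):
            if j in selected:
                continue
            # Orthogonalise candidate j against already selected terms.
            w = P[:, j].copy()
            for q in basis:
                w -= (q @ P[:, j]) / (q @ q) * q
            ww = w @ w
            if ww < 1e-10:  # skip (near-)collinear candidates
                continue
            err = (w @ target) ** 2 / (ww * sigma)
            if err > best[0]:
                best = (err, j, w)
        if best[1] is None:
            break
        errs.append(best[0])
        selected.append(best[1])
        basis.append(best[2])
    # Refit the selected terms by ordinary least squares.
    theta, *_ = np.linalg.lstsq(P[:, selected], target, rcond=None)
    return selected, errs, theta


# Synthetic example (assumed system, not climate data):
# y(t) = 0.5*y(t-1) + 0.8*u(t-2) - 0.6*u(t-1)*u(t-2) + noise
rng = np.random.default_rng(0)
n = 200
u = rng.uniform(-1.0, 1.0, n)
y = np.zeros(n)
for t in range(2, n):
    y[t] = (0.5 * y[t - 1] + 0.8 * u[t - 2]
            - 0.6 * u[t - 1] * u[t - 2] + 0.05 * rng.standard_normal())

P, names, target = build_candidates(y, u, max_lag=2, degree=2)
selected, errs, theta = forward_select(P, target, n_terms=3)
for j, e, c in zip(selected, errs, theta):
    print(f"{names[j]:>16s}  ERR={e:.3f}  coef={c:+.3f}")
```

In this sketch, each selected term's ERR value is the fraction of the output's energy it explains once earlier selections are accounted for. This is one way a sparse model can expose key predictors and their relative contributions. A real application would also need a stopping criterion and out-of-sample validation, which are left out here.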

Our findings illustrate how NARMAX bridges the gap between purely data-driven modelling (focusing on prediction) and mechanistic modelling (focusing on physical insights), offering a clear pathway for refining modelling strategies and deepening our understanding of climate dynamics. We propose that NARMAX and similar methods can play a strong role in both small- and large-data modelling problems, and can also serve as components that help improve the explainability of ML methods. By showcasing both interpretability and predictive efficacy, this work encourages the adoption of machine learning methods that best meet the needs of specific data modelling tasks in climate science and beyond.

How to cite: Sun, Y., Wei, H.-L., Hanna, E., and Luu, L.: Small Data for Big Tasks in Seasonal Weather Forecasting: A Balanced Perspective on Interpretability and Predictability of NARMAX and Machine Learning Methods, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-13666, https://doi.org/10.5194/egusphere-egu25-13666, 2025.