- ¹University of Tübingen, Tübingen, Germany (jannik.thuemmel@uni-tuebingen.de)
- ²ECMWF, Reading, United Kingdom (jakob.schloer@ecmwf.int)
- ³Department of Data Science, Indian Institute of Science Education and Research, Pune, India (bedartha.goswami@iiserpune.ac.in)
Masked Token Models (MTMs) are a highly efficient paradigm for pre-training large-scale models in video and language domains. Designed to learn representations from inherently sparse or strongly subsampled data, MTMs can be a promising choice for weather and climate prediction over long horizons. Despite these advantageous design properties, the models have not yet found widespread adoption in climate science. We attribute this in part to the prevalent use of masking strategies that are uniform over time, which biases the learned representations toward spatial interpolation rather than predictive dynamics.
By defining a time-aware prior over the masking distribution, we are able to control this bias in a principled manner, raising the forecasting skill of MTMs to be on par with other approaches while retaining their efficiency and their flexibility in adapting to multiple downstream tasks. Furthermore, we show that the choice of prior has a strong effect on the predicted uncertainty and can substantially improve calibration.
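To make the contrast between time-uniform and time-aware masking concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not specify the form of the prior, so the power-law ramp, the function names, and the parameters `mask_ratio` and `sharpness` are all illustrative assumptions. The idea shown is only that later time steps are masked more often, which pushes the model to predict forward in time rather than to interpolate in space.

```python
import numpy as np


def uniform_mask(rng, n_time, n_space, mask_ratio):
    """Mask every token with the same probability, regardless of time step."""
    return rng.random((n_time, n_space)) < mask_ratio


def time_aware_mask(rng, n_time, n_space, mask_ratio, sharpness):
    """Mask tokens with a probability that grows with the time index.

    Hypothetical prior (an assumption, not taken from the abstract):
    per-step probabilities follow a power-law ramp t**sharpness, rescaled
    so that their mean equals mask_ratio before clipping to [0, 1].
    sharpness = 0 recovers time-uniform masking.
    """
    t = (np.arange(n_time) + 1) / n_time           # normalized time in (0, 1]
    weights = t ** sharpness                       # larger weight for later steps
    probs = np.clip(weights * mask_ratio / weights.mean(), 0.0, 1.0)
    # Broadcast the per-step probability across all spatial tokens.
    return rng.random((n_time, n_space)) < probs[:, None]
```

In both functions `True` marks a masked token. Setting `sharpness` to a larger value concentrates masking on the final time steps, which approaches a pure forecasting objective; smaller values move back toward the interpolation-heavy uniform case.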
As an illustrative example, we train MTMs to predict the El Niño–Southern Oscillation (ENSO), a primary driver of inter-seasonal climate variability with extreme weather impacts across the globe. Our approach yields state-of-the-art probabilistic forecasts of the tropical Pacific up to 24 months ahead and produces uncertainty estimates with an almost perfect spread-to-skill ratio over the full horizon. The strong performance on both climate model simulations and observational datasets demonstrates that MTMs can be highly effective for seasonal-to-annual climate prediction.
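For readers unfamiliar with the calibration measure mentioned above, here is a minimal sketch of one common definition of the spread-to-skill ratio: the ensemble spread divided by the root-mean-square error of the ensemble mean. It assumes the probabilistic forecast is available as an ensemble of samples; the abstract does not state how the authors compute the ratio, and the function name and array layout are illustrative assumptions. A value near 1 indicates that the predicted uncertainty matches the actual error, values below 1 indicate overconfidence.

```python
import numpy as np


def spread_skill_ratio(ensemble, target):
    """Ratio of ensemble spread to RMSE of the ensemble mean.

    ensemble: array of shape (n_members, n_samples), assumed layout
    target:   array of shape (n_samples,)
    """
    ens_mean = ensemble.mean(axis=0)
    # Spread: square root of the ensemble variance averaged over samples.
    spread = np.sqrt(ensemble.var(axis=0, ddof=1).mean())
    # Skill: root-mean-square error of the ensemble mean.
    rmse = np.sqrt(((ens_mean - target) ** 2).mean())
    return spread / rmse
```

With a finite ensemble of M members, a perfectly calibrated forecast gives a ratio slightly below 1 under this definition (approximately sqrt(M / (M + 1))), so some studies apply a corresponding correction factor.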
How to cite: Thümmel, J., Ebmeier, F., Schlör, J., Ludwig, N., and Goswami, B.: Masked Token Models as a paradigm for probabilistic forecasts in weather and climate, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-20210, https://doi.org/10.5194/egusphere-egu26-20210, 2026.