Breaking Computational Bottlenecks in Land Surface Modelling with Shifted-Window Transformers

Siddik Barbhuiya; Vivek Gupta

doi:https://doi.org/10.5194/egusphere-egu26-16260

EGU26-16260, updated on 14 Mar 2026

EGU General Assembly 2026

© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.

Siddik Barbhuiya¹ and Vivek Gupta²,³

¹School of Civil and Environmental Engineering, Indian Institute of Technology Mandi (siddikbarbhuiya@gmail.com)
²School of Civil and Environmental Engineering, Indian Institute of Technology Mandi (vivekgupta@iitmandi.ac.in)
³Center for Climate Change and Disaster Management, Indian Institute of Technology Mandi (vivekgupta@iitmandi.ac.in)

The development of hyper-resolution land surface modelling poses significant computational challenges. Detailed water balance assessments, ensemble-based uncertainty quantification, and climate scenario exploration all require running physics-based models such as VIC, Noah-MP, and CLM at continental scales with high spatial resolution, long temporal spans, and multiple parameter configurations, at which point the computational cost becomes prohibitive. Machine learning surrogates have recently emerged as potential solutions; however, existing LSTM and CNN approaches have fundamental architectural limitations. Sequential processing prevents parallel computation, limited receptive fields miss long-range dependencies, and most approaches predict only a single variable, which restricts comprehensive hydrological analysis.

We present a shifted-window transformer framework that simultaneously predicts multiple land surface fluxes (runoff, evapotranspiration, and soil moisture) while maintaining computational efficiency at continental scales. The hierarchical attention mechanism captures both local temporal patterns through windowed self-attention and global temporal context through shifted-window operations. This eliminates recurrent bottlenecks. We adapt vision transformers for hydrological regression by tokenizing meteorological sequences temporally, using relative position biases to encode lag-dependent hydrological relationships, and designing multi-task regression heads that preserve both nonlinear interactions and direct physical drivers.
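The windowed and shifted-window attention described above can be illustrated with a minimal NumPy sketch over a one-dimensional sequence of temporal tokens. This is not the authors' implementation: the function names, the single attention head, the identity query/key/value projections, and the lag-indexed `bias_table` are all assumptions made for illustration. The sketch only shows how a cyclic shift lets information cross window boundaries while attention cost stays local to each window.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def relative_position_bias(window, bias_table):
    """Build a (window, window) bias from a table of 2*window-1 lag values.

    Entry (i, j) depends only on the lag i - j, so the same bias is reused
    for every pair of tokens separated by the same number of timesteps.
    """
    idx = np.arange(window)
    lag = idx[:, None] - idx[None, :] + window - 1  # values in [0, 2*window-2]
    return bias_table[lag]

def windowed_attention(x, window, bias_table, shift=0):
    """Single-head self-attention restricted to temporal windows.

    x: (T, D) token sequence, with T divisible by `window`.
    shift: 0 for plain windows, or 0 < shift < window for shifted windows.
    Assumption: identity Q/K/V projections (a real layer would learn them).
    """
    T, D = x.shape
    assert T % window == 0, "sequence length must be divisible by window size"
    assert 0 <= shift < window
    n_win = T // window
    if shift:
        x = np.roll(x, -shift, axis=0)  # cyclic shift along time
    w = x.reshape(n_win, window, D)
    scores = w @ w.transpose(0, 2, 1) / np.sqrt(D)  # (n_win, window, window)
    scores = scores + relative_position_bias(window, bias_table)
    if shift:
        # The last window now mixes the sequence end with wrapped-around
        # start tokens; mask attention between those two groups.
        group = np.zeros(window, dtype=int)
        group[window - shift:] = 1
        scores[-1] += np.where(group[:, None] == group[None, :], 0.0, -1e9)
    out = (softmax(scores) @ w).reshape(T, D)
    if shift:
        out = np.roll(out, shift, axis=0)  # undo the cyclic shift
    return out

# Usage: 8 timesteps, 4 features, windows of 4 tokens, shift of 2.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(8, 4))
table = np.zeros(2 * 4 - 1)  # hypothetical (untrained) lag biases
local = windowed_attention(tokens, window=4, bias_table=table, shift=0)
shifted = windowed_attention(tokens, window=4, bias_table=table, shift=2)
```

With `shift=0`, tokens 0 to 3 and 4 to 7 never interact. With `shift=2`, tokens 2 to 5 share a window, so alternating the two layers lets context propagate across the whole sequence without any recurrent step.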

We demonstrate the approach by emulating the VIC model across India's 76,390 land grid cells at 6 km resolution, spanning diverse climate regimes. Training uses sparse spatial sampling with only a small fraction of the available locations. This allows us to evaluate how well the surrogate generalizes VIC's process behaviors to unseen regions and parameter configurations. We test multiple variants, including autoregressive formulations that incorporate previous-timestep outputs, and benchmark all variants against LSTM baselines to isolate the contribution of the architecture.
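As a rough illustration of sparse spatial sampling, the sketch below splits the 76,390 grid cells into a small training subset and a held-out evaluation set. The 5% training fraction, the uniform random selection, and the function name `split_cells` are assumptions for illustration only; the abstract does not state the fraction or the sampling scheme actually used.

```python
import numpy as np

N_CELLS = 76_390  # number of land grid cells stated in the abstract

def split_cells(n_cells, train_fraction, seed=0):
    """Randomly assign grid-cell indices to training and held-out sets.

    Assumption: uniform random sampling without replacement; a stratified
    scheme (for example by climate regime) would be an alternative.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_cells)
    n_train = int(round(train_fraction * n_cells))
    return np.sort(perm[:n_train]), np.sort(perm[n_train:])

# Hypothetical 5% training fraction; the remaining cells are never seen
# during training and serve to test spatial generalization.
train_idx, test_idx = split_cells(N_CELLS, train_fraction=0.05)
```

Evaluating only on `test_idx` measures how well the surrogate transfers to locations it was not trained on.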

How to cite: Barbhuiya, S. and Gupta, V.: Breaking Computational Bottlenecks in Land Surface Modelling with Shifted-Window Transformers, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-16260, https://doi.org/10.5194/egusphere-egu26-16260, 2026.
