EGU26-8845, updated on 14 Mar 2026
https://doi.org/10.5194/egusphere-egu26-8845
EGU General Assembly 2026
© Author(s) 2026. This work is distributed under
the Creative Commons Attribution 4.0 License.
Poster | Wednesday, 06 May, 14:00–15:45 (CEST), Display time Wednesday, 06 May, 14:00–18:00 | Hall X5, X5.201
A Coupled Transformer-CNN Network: Advancing Sea Surface Temperature Forecast Accuracy
Tao Zhang1,2, Pengfei Lin1,2, Hailong Liu3, Pengfei Wang1, Ya Wang1, Kai Xu3, Weipeng Zheng1,2, Yiwen Li4, Jinrong Jiang5, Lian Zhao5, and Jian Chen1,6
  • 1Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing, China
  • 2College of Earth and Planetary Sciences, University of Chinese Academy of Sciences, Beijing, China
  • 3Laoshan Laboratory, Qingdao, China
  • 4School of Ocean Sciences, China University of Geosciences, Beijing, China
  • 5Computer Network Information Center, Chinese Academy of Sciences, Beijing, China
  • 6State Key Laboratory of Geo-Information Engineering, Xi’an, China

Sea surface temperature (SST) is central to understanding ocean dynamics and to supporting marine activities, so accurate short-term SST forecasts are of considerable practical value. However, accurately modeling the multiscale variability of SST remains challenging for existing deep learning (DL) models. This study introduces the coupled Transformer–CNN network (CoTCN), a hybrid architecture designed to capture this multiscale variability. The CoTCN combines the strengths of Transformers and convolutional neural networks (CNNs), significantly improving the spatial continuity and predictive accuracy of SST forecasts. Compared with five state-of-the-art DL models based on Transformers or CNNs, namely convolutional long short-term memory (ConvLSTM), ConvGRU, the adaptive Fourier neural operator (AFNO), PredRNN, and SwinLSTM, the CoTCN performs better in SST forecasting both globally and in local areas. At a 1-day lead time, the CoTCN reduces the global average root-mean-square error (RMSE) by over 15%, with forecast errors ranging from 0.20 °C to 0.53 °C across 1–10-day lead times. Moreover, the CoTCN effectively mitigates the checkerboard artifacts inherent to the Vision Transformer (ViT) architecture. These findings highlight the effectiveness of the CoTCN in capturing the multiscale features of SST and underscore the potential of hybrid architectures for future DL models.
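
For readers unfamiliar with the headline metric, the following is a minimal sketch of how a global average RMSE and a relative RMSE reduction could be computed for a single forecast field. It is not the authors' evaluation code: the cosine-of-latitude area weighting, the use of `None` to mark land points, and the function names `global_rmse` and `relative_reduction` are all assumptions made for illustration, and the numbers in the usage lines are invented, not results from the study.

```python
import math


def global_rmse(forecast, truth, lats_deg):
    """Area-weighted RMSE of one 2-D SST field, in the units of the input.

    forecast, truth: lists of rows (one row per latitude, one value per longitude).
    lats_deg: latitude of each row in degrees.
    Grid points where either value is None (assumed here to mark land) are skipped.
    The cos(latitude) weighting is an assumption, not taken from the abstract.
    """
    weighted_sq_err = 0.0
    weight_sum = 0.0
    for f_row, t_row, lat in zip(forecast, truth, lats_deg):
        w = math.cos(math.radians(lat))  # grid cells shrink toward the poles
        for f, t in zip(f_row, t_row):
            if f is None or t is None:
                continue
            weighted_sq_err += w * (f - t) ** 2
            weight_sum += w
    return math.sqrt(weighted_sq_err / weight_sum)


def relative_reduction(baseline_rmse, model_rmse):
    """Fractional RMSE reduction of a model relative to a baseline."""
    return (baseline_rmse - model_rmse) / baseline_rmse


# Illustrative toy values only (hypothetical, not data from the study).
toy_forecast = [[0.5, 0.5], [0.5, None]]
toy_truth = [[0.0, 0.0], [0.0, 0.0]]
toy_rmse = global_rmse(toy_forecast, toy_truth, lats_deg=[0.0, 60.0])
toy_gain = relative_reduction(baseline_rmse=1.0, model_rmse=0.8)
```

Repeating `global_rmse` for each lead time from 1 to 10 days would give an error curve of the kind the abstract summarizes, and `relative_reduction` expresses a statement such as "reduces RMSE by over 15%" as a fraction greater than 0.15.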

How to cite: Zhang, T., Lin, P., Liu, H., Wang, P., Wang, Y., Xu, K., Zheng, W., Li, Y., Jiang, J., Zhao, L., and Chen, J.: A Coupled Transformer-CNN Network: Advancing Sea Surface Temperature Forecast Accuracy, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-8845, https://doi.org/10.5194/egusphere-egu26-8845, 2026.