Pretraining a foundation model using MODIS observations of the earth’s atmosphere

doi:https://doi.org/10.5194/egusphere-egu24-22461

Session ESSI1.1

EGU24-22461, updated on 11 Mar 2024

EGU General Assembly 2024

© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Valentine Anantharaj¹, Takuya Kurihana², Gabriele Padovani³, Ankur Kumar⁴, Aristeidis Tsaris¹, Udaysankar Nair⁴, Sandro Fiore³, and Ian Foster²

¹Oak Ridge National Laboratory, Oak Ridge, USA
²University of Chicago, Chicago, USA
³University of Trento, Trento, Italy
⁴University of Alabama - Huntsville, Huntsville, USA

The earth and atmospheric sciences research community has an unprecedented opportunity to exploit the vast amount of data available from earth observation (EO) satellites and earth system models (ESMs). Smaller and cheaper satellites with reduced operational costs have made a variety of EO data affordable, and technological advances have made the data accessible to a wide range of stakeholders, especially the scientific community (EY, 2023). The NASA ESDS program alone is expected to host 320 PB of data by 2030 (NASA ESDS, 2023). The rise and application of artificial intelligence foundation models (FMs) can be attributed to the availability of large volumes of curated data, access to extensive compute resources, and the maturity of deep learning architectures, especially the transformer (Bommasani et al., 2021).

Developing a foundation model involves pretraining a suitable deep learning architecture on large amounts of data, often via self-supervised learning (SSL) methods. The pretrained model can then be adapted to downstream tasks via fine-tuning, which requires less data than task-specific models do. Large language models (LLMs) are likely the type of foundation model most commonly encountered by the general public. Vision transformers (ViTs) adapt the transformer architecture used in LLMs to image and image-like data (Dosovitskiy et al., 2020), such as EO data and ESM simulation output. We are in the process of pretraining a ViT model for the earth’s atmosphere using a select few bands of 1-km Level-1B MODIS radiances and brightness temperatures, from the MOD021KM and MYD021KM products of the NASA Terra and Aqua satellites, respectively. We are using 200 million image chips of 128x128 pixels each. We are pretraining two ViT models, with 100 million and 400 million parameters respectively. The pretrained models will be fine-tuned for cloud classification and evaluated against AICCA (Kurihana et al., 2022). We will discuss our experiences involving data and computing, and present preliminary results.
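
To make the data layout concrete, the following is a minimal sketch of how a multi-band swath could be tiled into 128x128 chips and how each chip could then be split into the flattened patches that a ViT consumes as tokens. Only the 128x128 chip size comes from the abstract. The non-overlapping tiling, the dropped ragged edges, the 16x16 patch size, the three-band synthetic array and its dimensions, and the function names `extract_chips` and `patchify` are all illustrative assumptions, not a description of the authors' pipeline.

```python
import numpy as np

CHIP = 128   # chip edge length in pixels, as stated in the abstract
PATCH = 16   # ASSUMED ViT patch size; the abstract does not state one


def extract_chips(swath: np.ndarray, chip: int = CHIP) -> np.ndarray:
    """Tile a (rows, cols, bands) array into non-overlapping square chips.

    Rows and columns that do not fill a whole chip are dropped (an assumption).
    Returns an array of shape (n_chips, chip, chip, bands), in row-major order.
    """
    rows, cols, bands = swath.shape
    n_r, n_c = rows // chip, cols // chip
    trimmed = swath[: n_r * chip, : n_c * chip]
    tiles = trimmed.reshape(n_r, chip, n_c, chip, bands)
    # Bring the two tile indices to the front, then merge them.
    return tiles.transpose(0, 2, 1, 3, 4).reshape(n_r * n_c, chip, chip, bands)


def patchify(chips: np.ndarray, patch: int = PATCH) -> np.ndarray:
    """Split each chip into flattened patches (the ViT input tokens).

    Returns an array of shape (n_chips, n_patches, patch * patch * bands).
    """
    n, h, w, bands = chips.shape
    gh, gw = h // patch, w // patch
    x = chips.reshape(n, gh, patch, gw, patch, bands)
    return x.transpose(0, 1, 3, 2, 4, 5).reshape(n, gh * gw, patch * patch * bands)


# Synthetic stand-in for one swath; dimensions and band count are hypothetical.
rng = np.random.default_rng(0)
swath = rng.random((2030, 1354, 3), dtype=np.float32)

chips = extract_chips(swath)
tokens = patchify(chips)
print(chips.shape)   # (150, 128, 128, 3)
print(tokens.shape)  # (150, 64, 768)
```

Under these assumptions each chip yields an 8x8 grid of 64 tokens. A real pipeline would also need to handle calibration, fill values, and chip selection criteria, none of which are sketched here.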

References

Bommasani R, Hudson DA, Adeli E, Altman R, Arora S, et al.: On the opportunities and risks of foundation models. CoRR abs/2108.07258. https://arxiv.org/abs/2108.07258, 2021.

Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S. and Uszkoreit, J.: An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020.

Ernst & Young (EY): How can the vantage of space give you strategic advantage on Earth? https://www.ey.com/en_gl/technology/how-can-the-vantage-of-space-give-you-strategic-advantage-on-earth, 2023. Accessed 10 January 2024.

Kurihana, Takuya, Elisabeth J. Moyer, and Ian T. Foster: AICCA: AI-Driven Cloud Classification Atlas. Remote Sensing 14, no. 22: 5690. https://doi.org/10.3390/rs14225690, 2022.

NASA MODIS: MODIS - Level 1B Calibrated Radiances. DOI: 10.5067/MODIS/MOD021KM.061 and DOI: 10.5067/MODIS/MYD021KM.061

NASA ESDS: Earthdata Cloud Evolution. https://www.earthdata.nasa.gov/eosdis/cloud-evolution, 2023. Accessed 10 January 2024.

Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I: Attention is all you need. Adv Neural Inf Process Syst 30, 2017.

How to cite: Anantharaj, V., Kurihana, T., Padovani, G., Kumar, A., Tsaris, A., Nair, U., Fiore, S., and Foster, I.: Pretraining a foundation model using MODIS observations of the earth’s atmosphere, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-22461, https://doi.org/10.5194/egusphere-egu24-22461, 2024.
