EGU25-3328, updated on 14 Mar 2025
https://doi.org/10.5194/egusphere-egu25-3328
EGU General Assembly 2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Oral | Monday, 28 Apr, 08:35–08:45 (CEST) | Room -2.92
AI Foundation Models for Science: Current Initiatives, Workflow, and Future Roadmap
Rahul Ramachandran (1), Tsengdar Lee (2), and Kevin Murphy (3)
  • (1) NASA, MSFC, Huntsville, United States of America (rahul.ramachandran@nasa.gov)
  • (2) Earth Science Division, NASA Headquarters
  • (3) Office of the Chief Science Data Officer, NASA Headquarters

NASA has collected, and continues to amass, petabytes of scientific data spanning scales from galaxies to cellular biology. These ever-expanding datasets provide unparalleled opportunities for discovery but pose significant challenges for managing data and extracting meaningful insights. Artificial intelligence (AI) and machine learning (ML) are emerging as transformative tools for addressing these challenges. However, state-of-the-art deep neural networks often require large volumes of labeled training data, which are costly and time-consuming to generate. AI foundation models (FMs) offer a promising alternative: they use self-supervised learning to identify patterns in unlabeled data. Once pre-trained, an FM can be adapted to diverse applications with less compute and fewer labeled examples than training a separate model for each task.
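
As an illustration of what self-supervised learning means in practice, the sketch below shows masked reconstruction, one common pretext task in which the training targets come from the data itself rather than from human labels. It is a toy example, not the training procedure of any model named in this abstract, which does not specify the objective used. The function names (make_masked_example, reconstruction_loss, mean_fill) are hypothetical, and the mean-fill "model" is a stand-in for a neural network.

```python
import random


def make_masked_example(values, mask_ratio=0.5, seed=0):
    """Hide a random subset of values; the hidden originals act as targets."""
    if len(values) < 2:
        raise ValueError("need at least two values to mask one and keep one")
    rng = random.Random(seed)
    # Mask at least one value and always leave at least one visible.
    n_masked = min(len(values) - 1, max(1, int(len(values) * mask_ratio)))
    masked_idx = sorted(rng.sample(range(len(values)), n_masked))
    hidden = set(masked_idx)
    visible = [None if i in hidden else v for i, v in enumerate(values)]
    return visible, masked_idx


def reconstruction_loss(prediction, original, masked_idx):
    """Mean squared error computed over the masked positions only."""
    total = sum((prediction[i] - original[i]) ** 2 for i in masked_idx)
    return total / len(masked_idx)


def mean_fill(visible):
    """Toy stand-in for a model: fill each gap with the mean of visible values."""
    known = [v for v in visible if v is not None]
    fill = sum(known) / len(known)
    return [fill if v is None else v for v in visible]


# No labels are involved: the input and the target are both derived from `signal`.
signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
visible, masked_idx = make_masked_example(signal, mask_ratio=0.5)
prediction = mean_fill(visible)
loss = reconstruction_loss(prediction, signal, masked_idx)
print(f"masked positions: {masked_idx}, reconstruction loss: {loss:.3f}")
```

In real pre-training, a network would replace mean_fill and its parameters would be updated to reduce this loss across many samples; the resulting representation is what is later adapted to downstream tasks.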

NASA’s Office of the Chief Science Data Officer has formulated a "5+1" strategy to develop AI foundation models for science: one FM pre-trained on flagship datasets for each of NASA’s five science divisions, plus one science-specific language model to support cross-divisional applications. Key achievements include the release of INDUS, an encoder language model trained on scientific publications and technical documents; two versions of the Prithvi Geospatial model for environmental monitoring applications; and the Prithvi Weather and Climate model, designed to reconstruct atmospheric states from incomplete data and to forecast future states. Additionally, a heliophysics foundation model for space weather applications is under development and is scheduled for release by mid-2025.

To encourage NASA’s research and application communities to adopt these FMs, and to support NASA’s new Earth Science to Action Strategy, the Earth Science Division has developed additional research and application solicitations aimed at enhancing the models and building applications and tools on top of them. These announcements are available in NASA’s Research Opportunities in Space and Earth Science (ROSES 2025).

To build these models, NASA has forged strategic partnerships, grounded in open science principles, with private sector organizations, academia, and other entities. Each model is designed around a specific set of scientific use cases to ensure relevance and practical impact. All models and associated use case notebooks are shared openly.

This presentation will provide an overview of the foundation models released to date, the workflows used in their design and development, and the roadmap for future models. It will also highlight upcoming workshops aimed at equipping the broader scientific community to effectively integrate these models into their research.

How to cite: Ramachandran, R., Lee, T., and Murphy, K.: AI Foundation Models for Science: Current Initiatives, Workflow, and Future Roadmap, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-3328, https://doi.org/10.5194/egusphere-egu25-3328, 2025.