EGU24-14071, updated on 09 Mar 2024
https://doi.org/10.5194/egusphere-egu24-14071
EGU General Assembly 2024
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Community-Driven Development of Tools to Improve AI-Readiness of the Open Environmental Data

Yuhan "Douglas" Rao1, Rob Redmon2, and Eric Khin2
  • 1North Carolina State University, Cooperative Institute for Satellite Earth System Studies, Asheville, United States of America
  • 2National Oceanic and Atmospheric Administration, Boulder, United States of America

As artificial intelligence (AI) and machine learning (ML) gain broad interest in the Earth and space science community, demand is growing for AI-ready data that can support the development of responsible AI/ML applications with open environmental data. Through a broad community collaboration under Earth Science Information Partners, we have developed an AI-readiness checklist as a community guideline for the development of AI-ready open environmental data. The checklist was initially based on an early draft of an AI-ready matrix developed by the OSTP Open Science Sub-committee but has been modified substantially based on feedback from data users and AI/ML practitioners. The current version of the AI-readiness checklist can be used to holistically assess the documentation, quality, access, and pre-processing of a given dataset. The assessment result can then be summarized into a data card that provides human-readable metrics to help users determine whether the dataset meets their needs for AI/ML development. The next milestone of this effort is to develop a community-driven convention that builds on existing data conventions and standards to fill the gaps in current data management practice for AI-ready data. In this presentation, we will also showcase a collection of AI-ready climate datasets that apply the AI-readiness checklist and data card concept to support AI/ML applications in climate sciences. Developing AI-readiness requires active community engagement with data repositories, domain scientists, and AI/ML practitioners to establish a flexible framework so that modern data management can keep pace with the rapid evolution of AI/ML technologies.
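
To illustrate how a checklist assessment might be condensed into a data card, the following is a minimal sketch only. The four category names come from the abstract; the `ChecklistItem` class, the `build_data_card` function, the example questions, and the fraction-based scoring are all assumptions made for illustration and do not reproduce the actual checklist or data card format.

```python
from dataclasses import dataclass

# The four assessment areas named in the abstract.
CATEGORIES = ("documentation", "quality", "access", "preprocessing")


@dataclass
class ChecklistItem:
    category: str    # one of CATEGORIES
    question: str    # hypothetical wording, not taken from the real checklist
    satisfied: bool  # True if the dataset meets this item


def build_data_card(dataset_name, items):
    """Summarize checklist answers as the fraction satisfied per category."""
    card = {"dataset": dataset_name}
    for category in CATEGORIES:
        answers = [item.satisfied for item in items if item.category == category]
        # None marks a category with no assessed items.
        card[category] = sum(answers) / len(answers) if answers else None
    return card


# Hypothetical assessment of an imaginary dataset.
example_items = [
    ChecklistItem("documentation", "Is there a data dictionary?", True),
    ChecklistItem("documentation", "Is the license stated?", True),
    ChecklistItem("quality", "Are missing values flagged?", False),
    ChecklistItem("quality", "Is uncertainty reported?", True),
    ChecklistItem("access", "Is the data available without login?", True),
]

card = build_data_card("example-climate-dataset", example_items)
print(card)
```

A simple per-category fraction is only one possible summary. An actual data card could instead report categorical readiness levels or free-text notes.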

How to cite: Rao, Y., Redmon, R., and Khin, E.: Community-Driven Development of Tools to Improve AI-Readiness of the Open Environmental Data, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-14071, https://doi.org/10.5194/egusphere-egu24-14071, 2024.