EGU23-2941
https://doi.org/10.5194/egusphere-egu23-2941
EGU General Assembly 2023
© Author(s) 2023. This work is distributed under
the Creative Commons Attribution 4.0 License.

Modern Scientific Data Governance Framework

Rahul Ramachandran1, Ge Peng2, Shelby Bagwell2, Abdelhak Marouane2, Sumant Jha3, and Jerika Christman4
  • 1NASA/MSFC, Huntsville, United States of America (rahul.ramachandran@nasa.gov)
  • 2UAH
  • 3USRA
  • 4Barrios Tech

Science has entered the era of Big Data, which brings new challenges in data governance, stewardship, and management. Existing data governance practices must catch up to ensure proper data management. Current governance policies and stewardship best practices tend to be disconnected from operational data management and enforcement, and exist mainly in well-meaning documents or reports. These policies are, at best, partially implemented and rarely monitored or audited. In addition, they keep adding data management steps that require a human, ‘a data steward’, in the loop, and the cost of data management can no longer grow in proportion to current and future increases in data volume and complexity.

The goal of developing an updated data governance framework is to adapt scientific data governance to the reality of Big Data and to align it with current technology trends such as cloud computing and AI. The framework has two aims. The first is thoroughness: governance should adequately cover the entire data life cycle. The second is practicality: the framework should offer a consistent and repeatable process for different projects. Three core principles ground this framework. First, focus on just enough governance, so that data governance does not become a roadblock to the scientific process; remove any unnecessary processes and steps. Second, automate data management steps where possible; actively remove steps that require a ‘human in the loop’ so that the management process stays efficient and scales with increasing data. Third, continually optimize all processes using quantified metrics to streamline the monitoring and auditing workflows.

How to cite: Ramachandran, R., Peng, G., Bagwell, S., Marouane, A., Jha, S., and Christman, J.: Modern Scientific Data Governance Framework, EGU General Assembly 2023, Vienna, Austria, 24–28 Apr 2023, EGU23-2941, https://doi.org/10.5194/egusphere-egu23-2941, 2023.