EGU22-13338
https://doi.org/10.5194/egusphere-egu22-13338
EGU General Assembly 2022
© Author(s) 2022. This work is distributed under
the Creative Commons Attribution 4.0 License.

A workflow to standardize collection and management of large-scale data and metadata from environmental observatories

Dylan O'Ryan1,2, Charuleka Varadharajan1, Erek Alper3, Kristin Boye4, Madison Burrus1, Danielle Christianson1, Shreyas Cholia1, Robert Crystal-Ornelas1, Joan Damerow1, Wenming Dong1, Hesham Elbashandy1, Boris Faybishenko1, Valerie Hendrix1, Douglas Johnson3, Zarine Kakalia1,5, Roelof Versteeg3, Kenneth Williams1, Catherine Wong1,5, and Deborah Agarwal1
  • 1Lawrence Berkeley National Laboratory
  • 2California State University, Sacramento
  • 3Subsurface Insights
  • 4SLAC National Accelerator Laboratory
  • 5University of California, Berkeley

The Watershed Function Scientific Focus Area (WFSFA) is a U.S. Department of Energy research project that seeks to determine how mountainous watersheds retain and release water, carbon, nutrients, and metals. The WFSFA maintains a community field observatory at its primary field site in the East River, Colorado. The WFSFA collects diverse environmental data and has developed a “Field-Data” workflow that standardizes data management across the project, from field collection to laboratory analysis to publication. This workflow enables the WFSFA to address data quality and management challenges that environmental observatories face. 

Through this workflow, the WFSFA has increased the use of the data curated from the project by (1) providing detailed metadata with unique identifiers for samples, locations, and sensors, (2) streamlining data sharing and publication, with data shared early within the team and published on the ESS-DIVE repository after curation, and (3) adopting machine-readable community data standards that follow the FAIR principles (Findability, Accessibility, Interoperability, Reusability).

We describe an example application of this workflow to geochemical data, using a community data standard for water-soil-sediment chemistry (https://github.com/ess-dive-community/essdive-water-soil-sed-chem) developed by the Environmental Systems Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE). The standard is designed to standardize geochemical data, metadata, and file-level metadata, and was applied to WFSFA geochemical datasets, including ICP-MS, isotope, ammonia-N, anion, and DIC/NPOC/TDN datasets. Applying the standard ensures that important metadata is contained within the data file itself, such as the precision of the analysis, storage and sample processing information, detailed sample names, material information, and unique identifiers associated with the samples (IGSNs). This metadata is essential for understanding and reusing data products, and it makes the data machine-readable for future model applications. Detailed examples of the standardized geochemical data types were created and are now being used as templates by WFSFA researchers to standardize their geochemical data. The adoption of this community geochemical data standard, and more broadly of the Field-Data workflow, will improve the findability and reusability of WFSFA datasets.
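As an illustration of what keeping metadata within the data file makes possible, the following is a minimal Python sketch of a completeness check on a tabular data file. It is not the ESS-DIVE standard's own tooling: the column names (`Sample_Name`, `IGSN`, `Material`, `Precision`), the function `check_metadata`, and the sample values, including the identifiers, are hypothetical placeholders chosen for this example. The actual field names and rules are defined in the standard's repository.

```python
import csv
import io

# Hypothetical required columns for illustration only; the real standard
# defines its own field names and requirements.
REQUIRED_COLUMNS = ("Sample_Name", "IGSN", "Material", "Precision")


def check_metadata(csv_text, required=REQUIRED_COLUMNS):
    """Return a list of problems found in a CSV data file given as text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []

    # Report required columns that are absent from the header.
    problems = [f"missing column: {name}" for name in required if name not in header]

    # For the required columns that are present, report empty cells.
    present = [name for name in required if name in header]
    # The header is line 1, so data rows start on line 2.
    for line_number, row in enumerate(reader, start=2):
        for name in present:
            if not (row[name] or "").strip():
                problems.append(f"line {line_number}: empty {name}")
    return problems


# Made-up example data; the identifiers are placeholders, not real IGSNs.
EXAMPLE = """Sample_Name,IGSN,Material,Precision,Ca_mg_per_L
ER-001,IEXMP0001,Water,0.01,12.3
ER-002,,Water,0.01,11.8
"""

if __name__ == "__main__":
    for problem in check_metadata(EXAMPLE):
        print(problem)
```

In this sketch the second sample lacks an identifier, so the check reports it. A check of this kind could run before data are shared internally, so that gaps in the metadata are caught before curation and publication.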

How to cite: O'Ryan, D., Varadharajan, C., Alper, E., Boye, K., Burrus, M., Christianson, D., Cholia, S., Crystal-Ornelas, R., Damerow, J., Dong, W., Elbashandy, H., Faybishenko, B., Hendrix, V., Johnson, D., Kakalia, Z., Versteeg, R., Williams, K., Wong, C., and Agarwal, D.: A workflow to standardize collection and management of large-scale data and metadata from environmental observatories, EGU General Assembly 2022, Vienna, Austria, 23–27 May 2022, EGU22-13338, https://doi.org/10.5194/egusphere-egu22-13338, 2022.
