EGU25-13062, updated on 15 Mar 2025
https://doi.org/10.5194/egusphere-egu25-13062
EGU General Assembly 2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Oral | Thursday, 01 May, 09:45–09:55 (CEST) | Room 2.44
Microplastics and Trash Cleaning and Harmonization (MaTCH): Semantic Data Ingestion and Harmonization Using Artificial Intelligence
Hannah Hapich1, Win Cowger1,2, and Andrew B. Gray1,3
  • 1University of California, Riverside, Environmental Sciences, Riverside, CA, United States of America (hannahhapich@gmail.com)
  • 2Moore Institute for Plastic Pollution Research, Long Beach, CA, United States of America (wincowger@gmail.com)
  • 3Karlsruhe Institute of Technology, Karlsruhe, Germany (agray@ucr.edu)

The number of studies on trash and microplastics has increased rapidly in recent years, but with little data standardization. Because data are produced by a wide range of practitioners with differing study goals, expecting researchers to adhere to a single data standard may not be realistic. Post-hoc data harmonization offers an alternative pathway: it transforms non-standardized data from prior studies into harmonized, comparable databases. Harmonization, however, is hindered by the vast number of categorical descriptors used to describe trash and microplastics (thousands or more), which makes manual harmonization labor-intensive. Non-semantic misalignment also exists: studies measure plastic occurrence via different metrics (particle count, mass, volume, etc.) and evaluate differing size ranges, so concentrations must be rescaled before they can be meaningfully compared. We created Microplastics and Trash Cleaning and Harmonization (MaTCH), an AI-automated algorithm that draws on manually developed databases describing the relationships between categorical descriptors of trash and microplastic particles. MaTCH also integrates other data harmonization techniques to address non-semantic misalignment. All steps are combined into a single algorithm that can harmonize datasets from studies that differ in nomenclature, study methods, data formats, and reporting metrics. MaTCH is available as an open-source web tool that lets the research community rapidly and accurately leverage existing data from trash and microplastic studies to perform better meta-analyses and make more meaningful assessments of data trends. By providing MaTCH as a live web tool, we can incorporate data from new and emerging studies to improve algorithm performance and keep up with the rapid pace of discovery. In a field as labor-intensive as plastics research, we believe this may greatly expedite future discovery.
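The two kinds of harmonization described above can be illustrated with a minimal Python sketch. It is not the MaTCH implementation. The alias table, the function names, and the exponent `alpha` are hypothetical, fuzzy string matching from the standard library stands in for the AI-based semantic matching, and the size rescaling assumes that particle number density follows a power law in particle size, which the abstract does not state.

```python
from difflib import get_close_matches

# Hypothetical alias table (raw descriptor -> harmonized term), illustrative only.
ALIASES = {
    "fibre": "fiber",
    "fiber": "fiber",
    "filament": "fiber",
    "fragment": "fragment",
    "shard": "fragment",
    "bead": "sphere",
    "pellet": "sphere",
    "sphere": "sphere",
}


def harmonize_term(raw, aliases=ALIASES, cutoff=0.8):
    """Map a raw categorical descriptor to a harmonized term.

    Returns None when no sufficiently close alias is found.
    """
    key = raw.strip().lower()
    if key in aliases:
        return aliases[key]
    # Stand-in for semantic matching: fuzzy string similarity catches
    # plurals and small spelling differences, but not true synonyms.
    close = get_close_matches(key, list(aliases), n=1, cutoff=cutoff)
    return aliases[close[0]] if close else None


def rescale_count_concentration(conc, measured, target, alpha=2.0):
    """Rescale a particle-count concentration from one size range to another.

    Assumes number density n(x) is proportional to x**(-alpha).
    `measured` and `target` are (min_size, max_size) tuples in the same unit.
    `alpha` is a hypothetical exponent and must not equal 1.
    """
    if alpha == 1.0:
        raise ValueError("alpha == 1 requires a logarithmic integral")

    def integral(lo, hi):
        # Integral of x**(-alpha) from lo to hi.
        p = 1.0 - alpha
        return (hi**p - lo**p) / p

    return conc * integral(*target) / integral(*measured)


if __name__ == "__main__":
    for raw in ["Fibre ", "fragments", "unknown thing"]:
        print(repr(raw), "->", harmonize_term(raw))
    # A study counted particles of 100-5000 um; express as 1-5000 um.
    print(rescale_count_concentration(1.0, (100.0, 5000.0), (1.0, 5000.0)))
```

Descriptors that return `None` would need manual review or a richer matching model. The rescaled value is only as reliable as the assumed size distribution.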

How to cite: Hapich, H., Cowger, W., and Gray, A. B.: Microplastics and Trash Cleaning and Harmonization (MaTCH): Semantic Data Ingestion and Harmonization Using Artificial Intelligence, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-13062, https://doi.org/10.5194/egusphere-egu25-13062, 2025.