- ¹Clarkson University, Institute for a Sustainable Environment, Potsdam, NY, USA
- ²University of Rochester School of Medicine and Dentistry, Departments of Public Health Sciences and Environmental Medicine, Rochester, NY, USA
- ³U.S. Environmental Protection Agency (EPA), Office of Research and Development, Center for Environmental Measurement and Modeling, Athens, GA, USA
In environmental data analysis, source apportionment can be an important approach for extracting useful information that might otherwise remain hidden within the data. The United States Environmental Protection Agency (EPA) has developed an open-source Python package, the Environmental Source Apportionment Toolkit (ESAT), which enables source apportionment modeling and error estimation workflows. ESAT is intended to replace Positive Matrix Factorization v5 (PMF5), which has substantial data size limitations. ESAT is currently in alpha testing, with development plans for enhanced functionality and support of large datasets, high-performance computing (HPC) execution through a command line interface (CLI), and a standalone desktop graphical user interface (GUI). The alpha product of ESAT is publicly available and offers a complete application programming interface (API) to replicate the workflows and functionality of PMF5, with examples provided through Jupyter Notebooks. The ESAT computing module currently contains two non-negative matrix factorization (NMF) algorithms for model training and is designed so that other algorithms can be added easily. The two algorithms currently available are least-squares NMF (LS-NMF) and weighted-semi NMF (WS-NMF). Each algorithm offers different benefits depending on project or data requirements. The ESAT Python codebase has been optimized to run in a highly parallelized manner, with most of the numerical computations implemented in Rust, a low-level language comparable in performance to C. ESAT replicates the model error estimation methods of PMF5, namely bootstrap, displacement, and a hybrid of the two. To facilitate experimentation and testing, ESAT contains a synthetic dataset generator and model simulator that can evaluate how well ESAT recreates synthetic factors and contributions. New features are developed, tested, and added to the Python package on a regular basis.
One such feature is an uncertainty perturbation workflow, which runs a collection of models while slightly perturbing the uncertainty matrix and then evaluates the impact on the solution profiles and contributions. The alpha version of the ESAT Python package is available for installation from PyPI at https://pypi.org/project/esat/. Testing and development of the alpha version will continue toward a full release planned for late 2025. Development of a GUI desktop application is currently planned to begin after the ESAT full release.
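The ideas above (a synthetic dataset with known factors, an uncertainty-weighted NMF fit, and an uncertainty perturbation run) can be illustrated with a minimal NumPy sketch. It does not use the ESAT API and is not ESAT's implementation: `weighted_nmf`, the synthetic data settings, the lognormal perturbation, and the 10% perturbation scale are all assumptions chosen for illustration, and the update rule is a generic weighted multiplicative update in the spirit of LS-NMF.

```python
import numpy as np


def weighted_nmf(V, U, k, iters=300, seed=0, eps=1e-12):
    """Hypothetical helper: minimise Q = sum(((V - W @ H) / U) ** 2), W, H >= 0.

    V is the samples-by-features data matrix, U the matching uncertainty
    matrix, and k the number of factors.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps          # factor contributions
    H = rng.random((k, m)) + eps          # factor profiles
    weights = 1.0 / U**2                  # inverse-variance weights
    weighted_V = weights * V
    for _ in range(iters):
        # Multiplicative updates keep W and H non-negative.
        H *= (W.T @ weighted_V) / (W.T @ (weights * (W @ H)) + eps)
        W *= (weighted_V @ H.T) / ((weights * (W @ H)) @ H.T + eps)
    Q = float(np.sum(weights * (V - W @ H) ** 2))
    return W, H, Q


# Synthetic dataset with known factors (all sizes are illustrative).
rng = np.random.default_rng(42)
n_samples, n_features, n_factors = 200, 10, 3
W_true = rng.random((n_samples, n_factors))
H_true = rng.random((n_factors, n_features))
V_clean = W_true @ H_true
U = 0.05 * V_clean + 0.01                 # assumed 5% error plus a floor
V = np.clip(V_clean + rng.normal(0.0, U), 1e-6, None)

# Base model.
W, H, Q = weighted_nmf(V, U, n_factors)

# Uncertainty perturbation: refit with slightly perturbed uncertainties and
# look at how much the normalised profiles move between runs.
profiles = []
for _ in range(5):
    U_perturbed = U * rng.lognormal(mean=0.0, sigma=0.1, size=U.shape)
    _, H_run, _ = weighted_nmf(V, U_perturbed, n_factors)
    profiles.append(H_run / H_run.sum(axis=1, keepdims=True))
spread = np.std(profiles, axis=0)         # shape: (n_factors, n_features)

print(f"base Q = {Q:.1f}")
print(f"max profile spread across perturbed runs = {spread.max():.4f}")
```

Every run here starts from the same seed, so factors are compared by index. A fuller workflow would first match factors between runs (for example by profile correlation) before summarising their spread.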
How to cite: Hopke, P., Smith, D., Cyterski, M., Johnston, J., Wolfe, K., and Parmar, R.: Alpha Release of the US Environmental Protection Agency’s Environmental Source Apportionment Toolkit (ESAT), EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-6691, https://doi.org/10.5194/egusphere-egu25-6691, 2025.