- Barcelona Supercomputing Center, Earth Sciences, Spain (alessio.melli@bsc.es)
Chemical transport models (CTMs) operating at global and/or regional scale require massive computing resources to solve the system of ordinary differential equations associated with chemical kinetics. The sheer complexity of the atmosphere (in terms of the number of constituents and reactions), together with the orders-of-magnitude difference between the chemical and transport time scales, hinders the use of comprehensive mechanisms in large-scale 3D models. Rapid advances in machine learning (ML), alongside recent improvements in parallel computing, have given present-day atmospheric modelers new and powerful tools. A notable example is physics-informed ML, in which specialized network architectures are designed to satisfy the physical constraints of the system under investigation, with promising results in the emulation of physical processes. Physics may be introduced into the ML architecture at different stages, which determines the type of constraint (hard vs. soft) embedded in the model.
In this work, the baseline performance is defined by a fully connected multilayer perceptron (fc-MLP) trained to predict the concentration change from the composition vector at a given time. The dataset is generated by Sobol sampling of initial conditions within a specified concentration range to ensure comprehensive and efficient coverage of the input space. As a first attempt at including physics in the model architecture, we introduce the mech-MLP model, built from an array of MLPs, one for each chemical reaction in the mechanism, whose outputs (i.e., the changes in composition) are aggregated to determine the total change in each chemical species. Furthermore, chemical and physical soft constraints are introduced through custom loss functions that impose penalty terms on unphysical predictions (e.g., negative concentrations or deviations from stoichiometry). The trade-off between dataset size, creation cost, and training efficiency, the inductive biases arising from the choice of architecture, and the reliability of the model when tested on unseen conditions will be presented for two case studies: an illustrative mechanism involving 3 species and 2 reactions, and a simple, stiff air pollution mechanism (POLLU, doi.org/10.1137/0915076) comprising 20 species and 25 reactions.
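The aggregation of per-reaction outputs and the soft-constraint penalties described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The toy mechanism (A -> B, B -> C), the names `STOICH`, `CONSERVED`, `aggregate_reaction_outputs`, and `soft_constraint_loss`, the penalty weights, and the use of a conserved total as a stand-in for the stoichiometry penalty are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical toy mechanism (an assumption, not taken from the abstract):
# A -> B and B -> C. Rows are species (A, B, C); columns are reactions.
STOICH = np.array([[-1.0, 0.0],
                   [1.0, -1.0],
                   [0.0, 1.0]])

# For this toy mechanism the total A + B + C is conserved,
# i.e. CONSERVED @ STOICH is the zero vector.
CONSERVED = np.array([1.0, 1.0, 1.0])


def aggregate_reaction_outputs(per_reaction_delta):
    """Sum per-reaction composition changes into one change per species.

    per_reaction_delta has shape (batch, n_reactions, n_species); in a
    mech-MLP-like model each slice along axis 1 would come from one network.
    """
    return per_reaction_delta.sum(axis=1)


def soft_constraint_loss(conc, pred_delta, true_delta, w_neg=1.0, w_cons=1.0):
    """Mean squared error plus two illustrative penalty terms."""
    mse = np.mean((pred_delta - true_delta) ** 2)
    # Penalize only the negative part of the predicted new concentration.
    negative_part = np.minimum(conc + pred_delta, 0.0)
    neg_penalty = np.mean(negative_part ** 2)
    # Penalize any predicted change in the conserved total.
    cons_penalty = np.mean((pred_delta @ CONSERVED) ** 2)
    return mse + w_neg * neg_penalty + w_cons * cons_penalty


# Usage: a stoichiometrically consistent target built from reaction extents.
conc = np.array([[1.0, 0.5, 0.0]])
extents = np.array([[0.1, 0.05]])
true_delta = extents @ STOICH.T
bad_delta = np.array([[-2.0, 0.0, 0.0]])  # drives A negative, loses mass
loss_good = soft_constraint_loss(conc, true_delta, true_delta)
loss_bad = soft_constraint_loss(conc, bad_delta, true_delta)
```

In this sketch a prediction that matches the target incurs no penalty, while `bad_delta` is penalized both for the negative concentration of A and for the change in total mass. Because the penalties are soft, they discourage unphysical outputs during training but do not guarantee their absence.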
How to cite: Melli, A., Mouchel-Vallon, C., Petetin, H., and Jorba Casellas, O.: Emulating tropospheric chemistry with physics-informed machine learning, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-5628, https://doi.org/10.5194/egusphere-egu25-5628, 2025.