EGU25-9878, updated on 15 Mar 2025
https://doi.org/10.5194/egusphere-egu25-9878
EGU General Assembly 2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Oral | Thursday, 01 May, 16:20–16:30 (CEST) | Room K2
Theory and implementation of least-squares-based deep learning
Alireza Amiri-Simkooei
  • Delft University of Technology, Faculty of Aerospace Engineering, Department of Control and Operations, Delft, Netherlands (a.amirisimkooei@tudelft.nl)

Big data is one of the defining phenomena of the 21st century, creating unique opportunities and challenges for its processing and interpretation. Machine learning (ML), a subset of artificial intelligence (AI), has become a foundation of data science, enabling applications ranging from computer vision and geoscience to aviation and medicine. ML is particularly valuable when the complexity of a problem makes it impossible to establish explicit mathematical models connecting explanatory variables to predicted variables. Deep learning (DL), a subset of ML, has revolutionized fields such as speech recognition, email filtering, and time series analysis. However, DL methods face challenges such as high data demand, overfitting, and the “black box” problem.

We review least-squares-based deep learning (LSBDL), a framework that combines the interpretability of linear least-squares (LS) theory with the flexibility and power of DL. LS theory, widely used in engineering and the geosciences, provides powerful tools for parameter estimation, quality control, and reliability analysis through linear models. DL, in contrast, models complex nonlinear relationships where the mapping between explanatory and predicted variables is unknown. LSBDL bridges the two by formulating DL within the LS framework: a network is trained to establish the design matrix, the essential element of a linear model. Through this integration, LSBDL endows DL with transparency, statistical inference, and reliability. Gradient-based optimization methods, such as steepest descent and Gauss–Newton iteration, are used to construct an adaptive design matrix. By combining the transparency of LS theory with the data-driven adaptability of DL, LSBDL addresses challenges in fields including geoscience, aviation, and data science. The approach not only improves the interpretability of DL models but also extends the applicability of LS theory to nonlinear and complex systems, opening new opportunities for research and innovation.
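The core idea above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the network architecture (one tanh hidden layer), the finite-difference steepest-descent update, and all dimensions and names are assumptions chosen for brevity. The essential structure is that the network outputs a design matrix A, the parameters are recovered by ordinary least squares, and training minimizes the LS residual norm.

```python
import numpy as np

rng = np.random.default_rng(0)

def network_design_matrix(T, W1, W2):
    """Map input features T (n x d) to a learned design matrix A (n x p)."""
    H = np.tanh(T @ W1)   # hidden-layer activations (illustrative architecture)
    return H @ W2         # network output interpreted as the design matrix

def ls_estimate(A, y):
    """Ordinary least-squares estimate x_hat = (A^T A)^(-1) A^T y."""
    return np.linalg.solve(A.T @ A, A.T @ y)

# Toy data: observations y depend nonlinearly on inputs T.
n, d, h, p = 200, 3, 16, 4
T = rng.normal(size=(n, d))
y = np.sin(T[:, 0]) + 0.5 * T[:, 1] ** 2 + 0.01 * rng.normal(size=n)

W1 = 0.1 * rng.normal(size=(d, h))
W2 = 0.1 * rng.normal(size=(h, p))

def loss(w_flat):
    """LS residual norm ||y - A x_hat||^2 for the current design matrix."""
    A = network_design_matrix(T, W1, w_flat.reshape(h, p))
    x_hat = ls_estimate(A, y)
    r = y - A @ x_hat
    return r @ r

# Steepest descent on the output-layer weights only, with a
# finite-difference gradient (purely illustrative; a real
# implementation would use analytic or automatic differentiation).
lr, eps = 1e-2, 1e-6
w = W2.ravel().copy()
for _ in range(50):
    base = loss(w)
    g = np.zeros_like(w)
    for i in range(w.size):
        w_p = w.copy()
        w_p[i] += eps
        g[i] = (loss(w_p) - base) / eps
    w -= lr * g / (np.linalg.norm(g) + 1e-12)   # normalized descent step

A = network_design_matrix(T, W1, w.reshape(h, p))
x_hat = ls_estimate(A, y)
print("final residual norm:", np.linalg.norm(y - A @ x_hat))
```

Because the estimate x̂ is computed by LS at every step, the trained model remains a linear model in its final layer, which is what makes the statistical machinery of LS theory applicable.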

By embedding statistical foundations in the DL workflow, LSBDL offers a three-fold advantage: (i) direct computation of covariance matrices for predicted outcomes allows quantitative assessment of model uncertainty; (ii) well-established theories of hypothesis testing and outlier detection facilitate the identification of model misspecifications and outlying data; and (iii) the covariance matrix of the observations can be used to train networks on statistically correlated, inconsistent, or heterogeneous datasets. Incorporating least-squares principles increases model explainability, a critical aspect of interpretable and explainable artificial intelligence, and bridges the gap between traditional statistical methods and modern DL techniques. For example, LSBDL can incorporate prior knowledge through soft and hard physics-based constraints, a technique known as physics-informed machine learning (PIML).
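The three advantages above follow from standard weighted least-squares formulas once a design matrix A and an observation covariance Qy are in hand; pairing them with a network-learned A is the LSBDL contribution. The sketch below uses a random stand-in for A and a diagonal Qy, and the outlier screen is a simple standardized-residual test in the spirit of the w-test, not the full formulation. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 3
A = rng.normal(size=(n, p))             # stand-in for a network-produced design matrix
Qy = np.diag(rng.uniform(0.5, 2.0, n))  # heterogeneous observation covariance
x_true = np.array([1.0, -2.0, 0.5])
y = A @ x_true + rng.multivariate_normal(np.zeros(n), Qy)

# (iii) Weighted LS using the observation covariance:
W = np.linalg.inv(Qy)
N = A.T @ W @ A                          # normal matrix
x_hat = np.linalg.solve(N, A.T @ W @ y)

# (i) Covariance of the estimate and of the predicted outcomes:
Qx = np.linalg.inv(N)                    # covariance of x_hat
Qyhat = A @ Qx @ A.T                     # covariance of predictions y_hat = A x_hat

# (ii) Outlier screening via standardized residuals:
r = y - A @ x_hat
Qr = Qy - Qyhat                          # covariance of the LS residuals
w_stat = r / np.sqrt(np.diag(Qr))        # flag |w| > 2.58 at roughly the 1% level
print("suspect observations:", np.where(np.abs(w_stat) > 2.58)[0])
```

The same formulas apply unchanged when A comes from a trained network, which is why LSBDL can deliver covariance matrices, testing, and weighted training essentially for free.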

The approach is demonstrated through three examples: surface fitting, time series forecasting, and groundwater storage downscaling. Beyond these, LSBDL offers opportunities in geoscience, inverse problems, aviation, data assimilation, sensor fusion, and time series analysis.

How to cite: Amiri-Simkooei, A.: Theory and implementation of least-squares-based deep learning, EGU General Assembly 2025, Vienna, Austria, 27 Apr–2 May 2025, EGU25-9878, https://doi.org/10.5194/egusphere-egu25-9878, 2025.