- 1Shanghai Research Institute of Intelligent Autonomous System, Tongji University, Shanghai 200092, China (2311796@tongji.edu.cn)
- 2College of Surveying and Geo-Informatics, Tongji University, Shanghai 200092, China (liusjtj@tongji.edu.cn)
- 3Shanghai Key Laboratory of Planetary Mapping and Remote Sensing for Deep Space Exploration, Tongji University, Shanghai 200092, China (2311796@tongji.edu.cn)
- 4Shanghai Integrated Innovation Center for Manned Lunar Exploration, Tongji University, Shanghai 200092, China (cjxiao@tongji.edu.cn)
In GNSS-denied deep space exploration missions, high-precision state estimation and navigation positioning are critical to the successful completion of complex mission objectives. However, the environmental characteristics of extraterrestrial surfaces, such as drastic illumination changes, monotonous textures, and sparse features, often cause traditional visual navigation systems to fail. Meanwhile, inertial measurement units (IMUs), despite their high measurement rates and immunity to external interference, suffer from integration error accumulation caused by biases and noise. Although Visual-Inertial Odometry (VIO) combines the complementary strengths of both sensors through multi-source fusion, existing end-to-end deep learning methods often lack explicit physical modeling. This deficiency leads to sharp degradation in generalization and susceptibility to drift in extreme environments, and thus fails to meet the stringent standards required for aerospace-grade missions.
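The integration error accumulation mentioned above can be made concrete with a toy calculation (an illustrative sketch, not from the paper): a small constant accelerometer bias, double-integrated by naive dead reckoning, produces position drift that grows quadratically with time, which is why IMU-only navigation must be bounded by visual updates.

```python
# Illustrative sketch: double-integrate a constant accelerometer bias
# (true motion is zero) and watch position drift grow quadratically.

def dead_reckon_drift(bias: float, dt: float, steps: int) -> float:
    """Euler double integration of a constant acceleration bias."""
    v = 0.0  # velocity error (m/s)
    p = 0.0  # position error (m)
    for _ in range(steps):
        v += bias * dt
        p += v * dt
    return p

# A 0.01 m/s^2 bias sampled at 100 Hz: drift after 10 s vs. 100 s.
d10 = dead_reckon_drift(0.01, 0.01, 1_000)    # ~0.5 * b * t^2 ≈ 0.5 m
d100 = dead_reckon_drift(0.01, 0.01, 10_000)  # ≈ 50 m: 100x worse for 10x time
```

Ten times the elapsed time yields roughly one hundred times the drift, matching the closed-form 0.5·b·t² growth.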
To address the extreme environments of extraterrestrial bodies and the lack of physical consistency and limited generalization of existing methods, we propose a Physics-Aware Hybrid Deep Visual-Inertial Odometry (PDVIO) navigation method for extraterrestrial bodies. The framework deeply couples physics-driven kinematic priors with data-driven deep representations to construct a navigation system that is both robust and precise. This study makes three core contributions. First, to counter the integration drift caused by IMU noise, we design an analytical physical pre-integration module based on Lie group theory. Unlike traditional networks that directly regress pose parameters, this module explicitly constructs the IMU motion differential equations on the SE(3) manifold, embedding hard rigid-body dynamic constraints directly into the network structure and substantially reducing the risk of model divergence in extreme environments. Second, to cope with visual perception degradation caused by highly dynamic illumination changes and sparse textures, we introduce a FlowNet-enhanced multi-scale feature encoder. By extracting hierarchical spatiotemporal optical-flow features through a pyramid structure, the encoder enables the system to capture ego-motion from optical-flow-field consistency even in poorly textured regions, significantly improving the stability of front-end tracking. Finally, to remove the reliance of traditional methods on fixed noise covariances, we propose a differentiable factor-graph back-end framework based on Graph Attention Networks (GAT).
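The idea behind on-manifold pre-integration can be sketched generically (this is a standard textbook construction, not the authors' implementation): rotation increments from the gyroscope are composed via the SO(3) exponential map (Rodrigues' formula), so the integrated state never leaves the rotation manifold, which is the kind of hard geometric constraint the abstract describes embedding into the network.

```python
# Hedged sketch of on-manifold gyro integration: compose increments with
# the SO(3) exponential map so the result stays a valid rotation matrix.
import math

def hat(w):
    """Skew-symmetric (hat) matrix of a 3-vector."""
    x, y, z = w
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def so3_exp(w):
    """Rodrigues' formula: exp(hat(w)) for a rotation vector w."""
    theta = math.sqrt(sum(c * c for c in w))
    I = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    if theta < 1e-12:
        return I
    W = hat(w)
    W2 = matmul(W, W)
    a = math.sin(theta) / theta
    b = (1.0 - math.cos(theta)) / (theta * theta)
    return [[I[i][j] + a * W[i][j] + b * W2[i][j] for j in range(3)]
            for i in range(3)]

def preintegrate_rotation(gyro_samples, dt):
    """Compose gyro increments on SO(3): R <- R * exp(hat(omega * dt))."""
    R = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for omega in gyro_samples:
        R = matmul(R, so3_exp([o * dt for o in omega]))
    return R

# Constant yaw rate of pi/2 rad/s for 1 s at 100 Hz -> 90 deg about z.
R = preintegrate_rotation([[0.0, 0.0, math.pi / 2]] * 100, 0.01)
```

In the paper's framework this construction is extended to the full SE(3) pose (rotation plus translation), with the pre-integrated increments serving as physically consistent network-internal quantities rather than freely regressed outputs.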
The attention mechanism dynamically learns the confidence weights of the visual and inertial modalities according to the real-time dynamic environment, achieving adaptive end-to-end joint optimization from feature extraction to state estimation and greatly improving the system's adaptability and navigation accuracy in complex deep space environments.
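A minimal sketch of the adaptive-weighting idea (an assumed, simplified form; the actual network and factor-graph objective are not specified in this abstract): per-frame quality scores for the visual and inertial residuals pass through a softmax, and the resulting weights scale each modality's contribution, playing the role of an adaptive inverse noise covariance in place of a fixed one.

```python
# Hedged sketch: softmax-derived confidence weights for two modalities,
# standing in for learned attention scores over factor-graph residuals.
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def fuse(visual_est, inertial_est, visual_score, inertial_score):
    """Confidence-weighted fusion of two scalar state estimates."""
    w_v, w_i = softmax([visual_score, inertial_score])
    return w_v * visual_est + w_i * inertial_est, (w_v, w_i)

# Well-textured scene: high visual score -> trust vision.
est_good, w_good = fuse(1.00, 1.10, visual_score=2.0, inertial_score=0.0)
# Degraded texture: low visual score -> lean on the IMU.
est_bad, w_bad = fuse(1.00, 1.10, visual_score=-2.0, inertial_score=0.0)
```

In PDVIO the scores are produced by the GAT over the factor graph rather than supplied by hand, but the mechanism of softmax-normalized confidence reweighting is the same in spirit.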
Experiments conducted on simulation datasets and real-world ground data demonstrate that, while maintaining the efficiency of deep learning feature extraction, the method significantly enhances the robustness and generalization capability of the navigation system: trajectory estimation error is markedly reduced compared with traditional end-to-end models, effectively mitigating long-term integration drift. This study therefore not only validates the effectiveness of embedding physical priors into deep learning frameworks, addressing the limited robustness and autonomy of purely data-driven methods in aerospace scenarios, but also provides a highly reliable, high-precision navigation solution for future planetary exploration missions requiring precise pinpointing and navigation.
How to cite: Jiao, Y., Liu, S., Xiao, C., Ouyang, W., and Tong, X.: Physics-Aware Hybrid Deep Visual-Inertial Odometry Based on Graph Attention Networks for GNSS-denied Environment, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-2564, https://doi.org/10.5194/egusphere-egu26-2564, 2026.