- ¹School of Earth Sciences and Engineering, Hohai University, Nanjing, China
- ²Institute for Biodiversity and Ecosystem Dynamics (IBED), University of Amsterdam, Amsterdam, The Netherlands
- ³Department of Geodesy and Geoinformation, Vienna University of Technology, Vienna, Austria
Forests are critical ecosystems that sustain biodiversity, carbon cycling, and climate regulation. Recent advances in laser scanning technology have provided unprecedented opportunities for detailed forest inventory and monitoring. Airborne, unmanned aerial, mobile, and terrestrial laser scanning systems produce complementary 3D point clouds that capture forest structural attributes across multiple scales and viewing geometries. However, the inherent heterogeneity in data characteristics across platforms severely limits the generalizability of conventional data-driven models. Moreover, naive mixed-training strategies that simply pool multi-platform data often lead to negative transfer, degrading segmentation performance and preventing consistent results across acquisition systems. To address these challenges, we propose Multi-platform Synergistic Training (MST), a data- and model-driven representation learning paradigm that can be seamlessly integrated into both semantic (tree-component) and instance (individual-tree) segmentation deep learning architectures. MST explicitly captures shared structural representations of forest environments through Cross Platform Aware Tokens (CPATs) and a Context Integration Module (CIM), which together enhance transferability and stability across heterogeneous forest point clouds. Furthermore, MST employs a two-stage training strategy in which platform-invariant features are first learned by pre-training on virtual, synthetic multi-platform forest datasets, followed by fine-tuning on real-world data. This design lays the foundation for robust, platform-agnostic forest scene understanding while substantially reducing reliance on large volumes of manually annotated real-world training data. The code for the proposed representation learning framework is available at: https://github.com/jdjiang312/MST.
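The two-stage strategy described above can be sketched in miniature. The following is an illustrative toy example only, not the authors' implementation: the linear model, the synthetic and real data generators, and the `train` routine are all hypothetical stand-ins for the deep segmentation networks and forest point cloud datasets used by MST. It shows the essential control flow: pre-train on abundant synthetic samples, then fine-tune the resulting weights on a small labelled subset of real data.

```python
import random

def train(model_w, data, epochs=50, lr=0.05):
    """Fit a 1-D linear model y = w*x by stochastic gradient descent
    on squared error (a stand-in for training a segmentation network)."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2.0 * (model_w * x - y) * x  # d/dw of (w*x - y)^2
            model_w -= lr * grad
    return model_w

random.seed(0)
true_w = 3.0  # the underlying relation both datasets share

# Stage 1: abundant synthetic ("virtual forest") samples, noisier labels.
synthetic = [(x, true_w * x + random.gauss(0, 0.3))
             for x in (random.uniform(-1, 1) for _ in range(200))]

# Stage 2: only a small fraction of real annotations (cf. the 20% used in MST).
real = [(x, true_w * x + random.gauss(0, 0.1))
        for x in (random.uniform(-1, 1) for _ in range(40))]

w = train(0.0, synthetic)        # pre-training: learn shared structure
w = train(w, real, epochs=20)    # fine-tuning: adapt to real-world data
```

The pre-training stage does the bulk of the learning from cheap synthetic data, so the fine-tuning stage needs only a few real annotations to adapt, which mirrors the labelling savings reported for MST.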
The effectiveness of the proposed method is evaluated on nine benchmark forest point cloud datasets covering airborne, unmanned aerial, mobile, and terrestrial acquisitions, for both semantic and instance segmentation. Cross-dataset generalization experiments demonstrate that our framework achieves robust performance across all platform datasets and consistently outperforms models trained on single-platform data. Furthermore, by pre-training MST on a virtual synthetic forest point cloud dataset and subsequently fine-tuning on real-world data, the framework attains accuracy comparable to fully supervised training for both individual-tree segmentation and tree-component segmentation while relying on only 20% of the real annotations (semantic segmentation, mIoU: 69.42% fully supervised vs. 69.71% MST; instance segmentation, F1 score: 88.69% fully supervised vs. 86.96% MST). These results highlight MST as a promising paradigm for cross-platform forest point cloud analysis, significantly reducing labeling costs while improving robustness and scalability. The framework thus offers a practical tool to enhance forest monitoring, inventory, and ecosystem assessment.
How to cite: Jiang, J., Shen, Y., Wang, J., Hollaus, M., Kissling, W. D., Ferreira, V., and Pfeifer, N.: Toward Platform-Invariant Forest 3D Perception: A Multi-platform Synergistic Training for Forest Point Cloud Segmentation, EGU General Assembly 2026, Vienna, Austria, 3–8 May 2026, EGU26-4814, https://doi.org/10.5194/egusphere-egu26-4814, 2026.