EPSC Abstracts
Vol. 18, EPSC-DPS2025-606, 2025, updated on 09 Jul 2025
https://doi.org/10.5194/epsc-dps2025-606
EPSC-DPS Joint Meeting 2025
© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.
Unsupervised Meteoroid Stream Identification Using HDBSCAN
Eloy Peña-Asensio and Fabio Ferrari
  • Department of Aerospace Science and Technology, Politecnico di Milano, Via La Masa 34, 20156 Milano, Italy

Accurate classification of meteoroid streams is essential for understanding their evolution, dynamics, and source bodies. Conventional methods, such as orbital similarity metrics or look-up tables, have limitations: they are not mathematically consistent or they rely on subjective criteria. In this work, we evaluate the performance of the Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) algorithm [1] for unsupervised stream identification using data from the CAMS Meteoroid Orbit Database v3.0 [2], which contains 471,582 meteoroid entries observed from 2010 to 2016. After applying standard quality filters (convergence angle ≥15°, velocity uncertainty ≤10%, eccentricity e ≤ 1, and perihelion distance q ≤ 1 au), a total of 316,235 meteoroids are retained, of which ~70% are classified as sporadic by CAMS.
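As an illustration of the quality cuts listed above, the following minimal sketch applies them to a handful of made-up records. The field names are hypothetical (they are not the CAMS column names), and the velocity cut is assumed to act on the relative uncertainty of the geocentric velocity, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class Meteor:
    # All field names are hypothetical, not the CAMS database columns.
    convergence_angle_deg: float  # convergence angle, degrees
    vg_kms: float                 # geocentric velocity, km/s
    vg_sigma_kms: float           # 1-sigma velocity uncertainty, km/s
    e: float                      # eccentricity
    q_au: float                   # perihelion distance, au


def passes_quality_filters(m: Meteor) -> bool:
    """Apply the four cuts quoted in the text."""
    return (
        m.convergence_angle_deg >= 15.0
        and m.vg_sigma_kms <= 0.10 * m.vg_kms  # assumption: relative uncertainty on Vg
        and m.e <= 1.0
        and m.q_au <= 1.0
    )


# Made-up records, for illustration only.
sample = [
    Meteor(32.0, 59.1, 0.8, 0.95, 0.94),  # kept
    Meteor(9.0, 41.0, 0.5, 0.80, 0.70),   # rejected: convergence angle < 15 deg
    Meteor(25.0, 30.0, 4.5, 0.60, 0.50),  # rejected: velocity uncertainty > 10%
    Meteor(40.0, 66.0, 1.0, 1.02, 0.60),  # rejected: e > 1
]
retained = [m for m in sample if passes_quality_filters(m)]
```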

To characterize each meteoroid, we define three feature vectors. The LUTAB vector comprises the solar longitude (λ), the Sun-centered ecliptic radiant coordinates (α and β), and the geocentric velocity (Vg), following the parameters used in the CAMS classification look-up table. The ORBIT vector consists of the heliocentric orbital elements: perihelion distance (q), eccentricity (e), inclination (i), argument of perihelion (ω), and longitude of the ascending node (Ω), similar to [3]. The GEO vector encodes the solar longitude, the geocentric radiant via sin(λg − λ)·cos(βg), cos(λg − λ)·cos(βg), and sin(βg), and the geocentric velocity, following [4]. In all cases, the features are normalized: angular parameters are represented by their sine and cosine components to address periodicity, and each feature is standardized to zero mean and unit variance so that no single variable dominates the clustering.
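The GEO vector and the normalization step can be sketched as follows. This is a minimal illustration with NumPy, not the authors' code: the function and argument names are hypothetical, and encoding the solar longitude itself as a sine/cosine pair is an assumption based on the statement that angular parameters are split into those components.

```python
import numpy as np


def geo_raw(sol_lon_deg, rad_lon_deg, rad_lat_deg, vg_kms):
    """Unscaled GEO-style features; angles in degrees, velocity in km/s."""
    lam = np.radians(np.asarray(sol_lon_deg, dtype=float))     # solar longitude
    lam_g = np.radians(np.asarray(rad_lon_deg, dtype=float))   # radiant ecliptic longitude
    beta_g = np.radians(np.asarray(rad_lat_deg, dtype=float))  # radiant ecliptic latitude
    return np.column_stack([
        np.sin(lam), np.cos(lam),              # assumption: solar longitude as sin/cos
        np.sin(lam_g - lam) * np.cos(beta_g),  # radiant components from the text
        np.cos(lam_g - lam) * np.cos(beta_g),
        np.sin(beta_g),
        np.asarray(vg_kms, dtype=float),       # geocentric velocity
    ])


def standardize(features):
    """Scale every column to zero mean and unit variance."""
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std = np.where(std > 0, std, 1.0)  # leave constant columns unscaled
    return (features - mean) / std
```

With this encoding, a meteor observed at solar longitude 0° and one at 360° receive identical features, which is the purpose of the sine/cosine representation.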

HDBSCAN is executed with a range of minimum cluster sizes (10–1000) and with two cluster selection strategies: ‘excess of mass’ (EOM) and ‘leaf’. Clustering performance is assessed using the Silhouette score (for internal compactness), Normalized Mutual Information (NMI) with respect to CAMS classifications, and F1 score following cluster-label assignment via the Hungarian algorithm. Sporadic meteors are treated as noise by HDBSCAN.

The ORBIT vector yields the highest Silhouette scores, indicating the most internally consistent clusters (see Figure 1). However, the LUTAB vector provides the best agreement with CAMS classifications (see Figure 2), achieving NMI = 0.76 and 25 out of 45 HDBSCAN-confirmed streams matching the CAMS labels with F1 > 0.8. Using ORBIT, only 7 out of 25 clusters reach F1 > 0.8. The GEO vector results are intermediate. The EOM cluster selection method consistently outperforms ‘leaf’, producing more stable and coherent cluster structures. Increasing the minimum cluster size improves compactness but reduces agreement with CAMS, because smaller showers are excessively merged.

We observe that HDBSCAN naturally groups streams with similar orbital geometry even when CAMS classifies them separately. For example, the Orionids (8/ORI) and Eta Aquariids (31/ETA), both from 1P/Halley, are clustered together when using the ORBIT vector. Likewise, HDBSCAN merges multiple minor showers (see Figure 3), reducing classification granularity but improving statistical coherence.

These findings show that HDBSCAN, when applied to normalized orbital or geocentric vectors, can provide a robust, parameter-efficient framework for meteoroid stream classification. Compared to the CAMS look-up table, HDBSCAN yields more statistically consistent clusters. However, the physical validity of these clusters remains to be verified.

Figure 1. Silhouette score vs. minimum cluster size for ORBIT (blue), GEO (red), and LUTAB (green), drawn with solid, dashed, and dash-dotted lines, respectively. Dark/light shades: EOM/leaf methods. Markers at size 100 with reference lines.

Figure 2. NMI vs. minimum cluster size, comparing HDBSCAN to CAMS for ORBIT (solid blue), GEO (dashed red), and LUTAB (dash-dotted green). Darker shades: EOM; lighter: leaf.

Figure 3. X-axis: number of clusters; Y-axis: cluster size. Clusters sorted by size (largest on the left). Black: CAMS. Cyan: HDBSCAN with max NMI. Red: HDBSCAN with max Silhouette at min size 100. Yellow: HDBSCAN with overall max Silhouette.

Acknowledgements

The scientific activities of LUMIO are supported by the Italian Space Agency (ASI), whereas the Phase B engineering work has been conducted under ESA Contract No. 4000139301/22/NL/AS within the General Support Technology Programme (GSTP) through the support of the national delegations of Italy (ASI) and Norway (NOSA).

References

[1] Campello, R. J., Moulavi, D., & Sander, J. (2013, April). Density-based clustering based on hierarchical density estimates. In Pacific-Asia conference on knowledge discovery and data mining (pp. 160-172). Berlin, Heidelberg: Springer Berlin Heidelberg.

[2] Jenniskens, P., Baggaley, J., Crumpton, I., Aldous, P., Pokorny, P., Janches, D., ... & Ganju, S. (2018). Planetary and Space Science, 154, 21-29.

[3] Peña-Asensio, E., & Sánchez-Lozano, J. M. (2024). Advances in Space Research, 74(2), 1073-1089.

[4] Sugar, G., Moorhead, A., Brown, P., & Cooke, W. (2017). Meteoritics & Planetary Science, 52(6), 1048-1059.

How to cite: Peña-Asensio, E. and Ferrari, F.: Unsupervised Meteoroid Stream Identification Using HDBSCAN, EPSC-DPS Joint Meeting 2025, Helsinki, Finland, 7–12 Sep 2025, EPSC-DPS2025-606, https://doi.org/10.5194/epsc-dps2025-606, 2025.