EGU24-2046, updated on 08 Mar 2024
https://doi.org/10.5194/egusphere-egu24-2046
EGU General Assembly 2024
© Author(s) 2024. This work is distributed under
the Creative Commons Attribution 4.0 License.

Comparative Analysis of Random Forest and XGBoost in Classifying Ionospheric Signal Disturbances During Solar Flares

Filip Arnaut, Aleksandra Kolarski, and Vladimir Srećković
Filip Arnaut et al.
  • Institute of Physics Belgrade, University of Belgrade, Laboratory for astrophysics and physics of ionosphere, Belgrade, Serbia (filip.arnaut@ipb.ac.rs)

In our previous publication (Arnaut et al. 2023), we demonstrated the application of the Random Forest (RF) algorithm for classifying disturbances associated with solar flares (SF), erroneous signals, and measurement errors in VLF amplitude data i.e., anomaly detection in VLF amplitude data. The RF algorithm is widely regarded as a preferred option for conducting research in novel domains. Its advantages, such as its ability to avoid overfitting data and its simplicity, make it particularly valuable in these situations. Nevertheless, it is imperative to conduct thorough testing and evaluation of alternative algorithms and methods to ascertain their potential advantages and enhance the overall efficiency of the method. This brief communication demonstrates the application of the XGBoost (XGB) method on the exact dataset previously used for the RF algorithm, along with a comparative analysis between the two algorithms. Given that the problem is framed as a machine learning (ML) problem with a focus on the minority class, the comparative analysis is exclusively conducted using the minority (anomalous) data class. The data pre-processing methodology can be found in Arnaut et al. (2023). The XGB tuning process involved using a grid search method to optimize the hyperparameters of the model. The number of estimators (trees) was varied from 25 to 500 in increments of 25, and the learning rate was varied from 0.02 to 0.4 in increments of 0.02. The F1-Score for the anomalous data class is similar for both models, with a value of 0.508 for the RF model and 0.51 for the XGB model. These scores were calculated using the entire test dataset, which consists of 19 transmitter-receiver pairs. Upon closer examination, it becomes evident that the RF model exhibits a higher precision metric (0.488) than the XGB model (0.37), while the XGB model demonstrates a higher recall metric (0.84) compared to the RF model (0.53). Upon examining each individual transmitter-receiver pair, it was found that XGB outperformed RF in terms of F1-Scores in 10 out of 19 cases. The most significant disparities are observed in cases where the XGB model outperformed by a margin of 0.15 in terms of F1-Score, but conversely performed worse by approximately -0.16 in another instance for the anomalous data class. The XGB models outperformed the RF model by approximately 6.72% in terms of the F1-score for the anomalous data class when averaging all the 19 transmitter-receiver pairs. When utilizing a point-based evaluation metric that assigns rewards or penalties for each entry in the confusion matrix, the RF model demonstrates an overall improvement of approximately 5% compared to the XGB model. Overall, the comparison between the RF and XGB models is ambiguous. Both models have instances where one is superior to the other. Further research is necessary to fully optimize the method, which has benefits in automatically classifying VLF amplitude anomalous signals caused by SF effects, erroneous measurements, and other factors.

References:

Arnaut, F., Kolarski, A. and Srećković, V.A., 2023. Random Forest Classification and Ionospheric Response to Solar Flares: Analysis and Validation. Universe9(10), p.436.

How to cite: Arnaut, F., Kolarski, A., and Srećković, V.: Comparative Analysis of Random Forest and XGBoost in Classifying Ionospheric Signal Disturbances During Solar Flares, EGU General Assembly 2024, Vienna, Austria, 14–19 Apr 2024, EGU24-2046, https://doi.org/10.5194/egusphere-egu24-2046, 2024.

Comments on the supplementary material

AC: Author Comment | CC: Community Comment | Report abuse

supplementary materials version 1 – uploaded on 09 May 2024, no comments