This study analyzes a recent network dataset, HIKARI-2021, collected in 2021, with a focus on network anomaly detection. This initial work reports our evaluation results and observations obtained with Machine Learning (ML) and Deep Learning (DL) techniques, including tree-based ensemble methods and neural network structures. The first observation is that the data is highly unbalanced, with only a small fraction of attack instances (normal vs. attack = 93%:7%). This class imbalance degrades learning performance considerably, with an F-measure of 69.98% at best. Applying a sampling strategy is beneficial and significantly improves performance, raising the F-measure to as high as 99.64%. We also examine the feasibility of zero-day detection (identifying previously unseen types of attacks) using the learning models. Our observation is that detecting attack types absent from training is highly challenging, with an F1 score of approximately 70% at best. We analyze the experimental results with an embedding-based visualization tool, t-distributed stochastic neighbor embedding (t-SNE).
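To illustrate why a 93%:7% split calls for the F-measure rather than accuracy, and what a sampling step does to the class counts, the following is a minimal sketch on synthetic labels. It does not use HIKARI-2021 and does not reproduce the reported scores. The abstract does not name the sampling strategy, so random oversampling is used here only as an assumed stand-in, and the helper names `f1_score` and `random_oversample` are hypothetical.

```python
import random
from collections import Counter


def f1_score(y_true, y_pred, positive=1):
    """F-measure for the positive (attack) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def random_oversample(rows, labels, seed=0):
    """Duplicate random minority-class rows until all classes match the largest.

    Assumption: this is an illustrative stand-in, not the paper's method.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Synthetic labels mirroring the 93%:7% split (0 = normal, 1 = attack).
labels = [0] * 93 + [1] * 7
rows = [[float(i)] for i in range(len(labels))]  # placeholder features

# A degenerate model that always predicts "normal".
always_normal = [0] * len(labels)
accuracy = sum(t == p for t, p in zip(labels, always_normal)) / len(labels)
baseline_f1 = f1_score(labels, always_normal)

# Balance the classes by oversampling the attack class.
balanced_rows, balanced_labels = random_oversample(rows, labels)
balanced_counts = Counter(balanced_labels)
```

In this sketch the always-normal model reaches high accuracy while its attack-class F-measure is zero, which is why the F-measure is the more informative metric here. In practice, oversampling would normally be applied to the training split only, so that duplicated rows do not leak into the evaluation data.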
Kwon, D., Neagu, R. M., Rasakonda, P., Ryu, J. T., & Kim, J. (2023). Evaluating Unbalanced Network Data for Attack Detection. In SNTA 2023 - Proceedings of the 2023 on Systems and Network Telemetry and Analytics (pp. 23–26). Association for Computing Machinery, Inc. https://doi.org/10.1145/3589012.3594898