From CIC-IDS2017 to LYCOS-IDS2017: A corrected dataset for better performance

Abstract

As connected objects become the standard for quality of life, network intrusion detection is getting more critical than ever. Over the past decades, various datasets have been developed to address this security challenge. Analysis of earlier datasets, such as KDD-Cup99 and NSL-KDD, highlighted some of the issues, leading the way for newer datasets that have corrected the identified problems. CIC-IDS2017, one of the newest network intrusion detection datasets, has become a popular choice. Its advantage is the availability of raw data in PCAP files as well as flow-based features in CSV files. In this paper, a detailed analysis of this dataset is performed and we report several problems discovered in the flows retrieved from the network packets. To overcome these problems, a new feature extraction tool named LycoSTand is suggested. In addition, a feature selection is proposed considering correlations and feature importance. The performance comparison between the original and the new dataset shows significant improvements for all evaluated machine learning algorithms. Based on the improvements in CIC-IDS2017, we also examine other datasets affected by the same issues on which LycoSTand can be used to produce improved datasets for network intrusion detection.

Cite

Rosay, A., Carlier, F., Cheval, E., & Leroux, P. (2021). From CIC-IDS2017 to LYCOS-IDS2017: A corrected dataset for better performance. In ACM International Conference Proceeding Series (pp. 570–575). Association for Computing Machinery. https://doi.org/10.1145/3486622.3493973
