OPTIMIZING CYBERSECURITY: A DUAL-PHASE ML ARCHITECTURE FOR DETECTION AND CLASSIFICATION OF NETWORK ATTACKS

Abstract

The rapid evolution of cyber threats has underscored the critical need for robust, high-performance Intrusion Detection Systems (IDS). This research presents a comprehensive framework that uses five machine learning algorithms (Decision Tree, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Naïve Bayes, and Random Forest) to detect and classify network intrusions using the NSL-KDD dataset. The proposed methodology encompasses packet capture via a WinPcap-based sniffer, feature extraction across IP, TCP, UDP, and ICMP headers, and extensive preprocessing to aggregate session-level records. From an initial set of 41 attributes, an information-gain-driven feature selection process yielded 35 highly discriminative features. Each algorithm was trained on five million labeled connections and tested on two million unseen records, and was evaluated on Accuracy, Precision, Recall, and F1-Score. Comparative analysis demonstrated that Random Forest achieved superior performance (96.4% accuracy, 96.3% F1-Score), followed by SVM (93.7% accuracy) and Decision Tree (89.3% accuracy). Naïve Bayes and KNN trailed due to conditional independence assumptions and distance-based limitations, respectively. Confusion matrix analysis showed that Random Forest had the lowest False Negative rate, which is crucial for minimizing undetected attacks. Multi-class detection performance revealed that ensemble and kernel methods excel at handling diverse intrusion categories, notably the challenging R2L and U2R attacks. Feature importance ranking underscored the significance of service- and connection-based metrics, such as srv_count and dst_host_srv_count, for effective discrimination. Time-performance evaluation found Naïve Bayes best suited for resource-constrained environments, while Random Forest balanced training complexity with fast inference. These findings suggest that ensembles of decorrelated trees, supported by targeted feature engineering, provide the most reliable IDS solution.
The modular architecture and extensive benchmarking offer a scalable blueprint for industrial and cloud deployments. Future work will explore deep learning augmentations and adaptive online learning to address evolving threat landscapes and zero-day intrusion vectors.
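
As an illustration of the evaluation metrics named in the abstract, the following minimal Python sketch derives Accuracy, Precision, Recall, F1-Score, and the False Negative rate from binary confusion-matrix counts, treating "attack" as the positive class. It is not the authors' code: the names `ConfusionCounts` and `binary_metrics` and the example counts are hypothetical and chosen only for illustration; they are not results from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts with 'attack' as the positive class."""
    tp: int  # attacks correctly flagged as attacks
    fp: int  # normal traffic wrongly flagged as attacks
    fn: int  # attacks missed (classified as normal)
    tn: int  # normal traffic correctly passed


def _safe_div(numerator: float, denominator: float) -> float:
    """Return 0.0 instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(c: ConfusionCounts) -> dict:
    """Compute the standard metrics from the four confusion counts."""
    total = c.tp + c.fp + c.fn + c.tn
    precision = _safe_div(c.tp, c.tp + c.fp)
    recall = _safe_div(c.tp, c.tp + c.fn)
    return {
        "accuracy": _safe_div(c.tp + c.tn, total),
        "precision": precision,
        "recall": recall,
        "f1": _safe_div(2 * precision * recall, precision + recall),
        # Share of real attacks that went undetected (1 - recall).
        "false_negative_rate": _safe_div(c.fn, c.fn + c.tp),
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    example = ConfusionCounts(tp=80, fp=10, fn=20, tn=90)
    for name, value in binary_metrics(example).items():
        print(f"{name}: {value:.4f}")
```

A low False Negative rate corresponds to high Recall on the attack class, which is why the abstract singles it out as the measure of undetected attacks.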

Cite

Ambhore, P. G., Dand, H., & Joshi, R. (2025). OPTIMIZING CYBERSECURITY: A DUAL-PHASE ML ARCHITECTURE FOR DETECTION AND CLASSIFICATION OF NETWORK ATTACKS. International Journal of Applied Mathematics, 38(1S), 727–744. https://doi.org/10.12732/ijam.v38i1s.41
