Imbalanced SVM-based anomaly detection algorithm for imbalanced training datasets

Gui Ping Wang; Jian Xi Yang; Ren Li

Journal Article (Open Access)

ETRI Journal (2017) 39(5) 621-631

DOI: 10.4218/etrij.17.0116.0879

34 Citations

32 Readers

Abstract

Abnormal samples are usually difficult to obtain in production systems, resulting in imbalanced training sample sets; that is, the number of positive samples is far smaller than the number of negative samples. Traditional Support Vector Machine (SVM)-based anomaly detection algorithms perform poorly on highly imbalanced datasets: the learned classification hyperplane skews toward the positive samples, resulting in a high false-negative rate. This article proposes a new imbalanced SVM (termed ImSVM)-based anomaly detection algorithm, which assigns a different weight to each positive support vector in the decision function. ImSVM adjusts the learned classification hyperplane so that the decision function achieves the maximum GMean measure value on the dataset. This problem is converted into an unconstrained optimization problem in order to search for the optimal weight vector. ImSVM is evaluated in experiments on both Cloud datasets and Knowledge Discovery and Data Mining datasets, for which highly imbalanced training sample sets are constructed. The experimental results show that ImSVM outperforms over-sampling techniques and several existing imbalanced SVM-based techniques.
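
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the common definition of GMean as the square root of the product of the true-positive rate and the true-negative rate, an RBF kernel, and a decision function of the usual SVM form in which each positive support vector's term is multiplied by an extra weight. The toy support vectors, coefficients, bias and the one-dimensional grid search are hypothetical stand-ins for illustration; the article itself formulates the weight search as an unconstrained optimization problem.

```python
import math

def g_mean(y_true, y_pred):
    """Geometric mean of true-positive and true-negative rates (labels +1 / -1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p != 1)
    tn = sum(1 for t, p in pairs if t == -1 and p == -1)
    fp = sum(1 for t, p in pairs if t == -1 and p != -1)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(tpr * tnr)

def rbf(u, v, gamma=1.0):
    """RBF kernel; the kernel choice is an assumption of this sketch."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def weighted_predict(x, svs, coefs, labels, weights, bias, gamma=1.0):
    """Sign of bias + sum_i s_i * alpha_i * y_i * K(sv_i, x), where s_i is the
    extra weight for positive support vectors and 1 for negative ones."""
    total = bias
    for sv, alpha, y, w in zip(svs, coefs, labels, weights):
        scale = w if y == 1 else 1.0
        total += scale * alpha * y * rbf(sv, x, gamma)
    return 1 if total >= 0 else -1

# Hypothetical toy model: one negative and one positive support vector,
# with a bias that leaves the boundary too close to the positive side.
SVS = [(0.0,), (2.0,)]
COEFS = [1.0, 1.0]
LABELS = [-1, 1]
BIAS = -0.4

# Hypothetical evaluation set: two negatives, two positives.
X_EVAL = [(0.0,), (0.5,), (1.2,), (2.0,)]
Y_EVAL = [-1, -1, 1, 1]

def evaluate(pos_weight):
    """GMean on the toy set when the positive support vector gets pos_weight."""
    weights = [1.0, pos_weight]  # the entry for the negative SV is ignored
    preds = [weighted_predict(x, SVS, COEFS, LABELS, weights, BIAS) for x in X_EVAL]
    return g_mean(Y_EVAL, preds)

# Crude grid search over the single weight, standing in for the optimizer.
candidates = [1.0 + 0.25 * k for k in range(9)]
best_weight = max(candidates, key=evaluate)

if __name__ == "__main__":
    print("GMean with weight 1.0:", evaluate(1.0))
    print("best weight on the grid:", best_weight, "GMean:", evaluate(best_weight))
```

In this toy setting the unweighted decision function misses the positive sample nearest the boundary, and raising the weight of the positive support vector moves the boundary back toward the negative class, which increases GMean.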

Cite (APA)

Wang, G. P., Yang, J. X., & Li, R. (2017). Imbalanced SVM-based anomaly detection algorithm for imbalanced training datasets. ETRI Journal, 39(5), 621–631. https://doi.org/10.4218/etrij.17.0116.0879
