An improved algorithm for SVMs classification of imbalanced data sets

Abstract

Support Vector Machines (SVMs) have strong theoretical foundations and have achieved excellent empirical success in many pattern recognition and data mining applications. However, when trained on imbalanced data sets, in which examples of the target (minority) class are outnumbered by examples of the non-target (majority) class, SVM classifiers tend to perform less well. In medical diagnosis and text classification, for instance, small and heavily imbalanced data sets are common. In this paper, we propose the Boundary Elimination and Domination (BED) algorithm to improve SVM class-prediction accuracy in applications with imbalanced class distributions. BED is an informative resampling strategy that operates in input space. To balance the class distributions, the algorithm uses density information in the training set to remove noisy examples of the majority class and to generate new synthetic examples of the minority class. In our experiments, we compared BED with the original SVM and with the Synthetic Minority Oversampling Technique (SMOTE), a popular resampling strategy in the literature. Our results demonstrate that this new approach improves SVM classifier performance on several real-world imbalanced problems. © 2009 Springer-Verlag.
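
The general idea of density-informed resampling described in the abstract can be illustrated with a minimal sketch. This is not the authors' BED implementation: the abstract does not give the scoring rule, so the k-nearest-neighbour label fraction used as a density estimate, the threshold `tol`, the function names, and the pairwise interpolation used to create synthetic points are all assumptions made for illustration.

```python
import numpy as np

def knn_indices(X, k):
    """Return, for each row of X, the indices of its k nearest neighbours (self excluded)."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbour
    return np.argsort(dist, axis=1)[:, :k]

def density_resample(X, y, k=3, tol=0.5, seed=0):
    """Hypothetical density-based resampling; y uses 1 = minority, 0 = majority.

    Assumed rule (not taken from the paper): the local density of a point's
    own class is the fraction of its k nearest neighbours sharing its label.
    """
    rng = np.random.default_rng(seed)
    nn = knn_indices(X, k)
    same = (y[nn] == y[:, None]).mean(axis=1)

    # Remove majority examples surrounded mostly by minority examples (treated as noise).
    keep = ~((y == 0) & (same < tol))
    X_kept, y_kept = X[keep], y[keep]

    # Only minority examples in sufficiently dense minority regions seed new points.
    seeds = X[(y == 1) & (same >= tol)]
    n_new = int((y_kept == 0).sum() - (y_kept == 1).sum())
    if n_new <= 0 or len(seeds) < 2:
        return X_kept, y_kept

    # Synthetic minority examples: random points on segments between pairs of seeds.
    a = seeds[rng.integers(len(seeds), size=n_new)]
    b = seeds[rng.integers(len(seeds), size=n_new)]
    t = rng.random((n_new, 1))
    synthetic = a + t * (b - a)

    X_out = np.vstack([X_kept, synthetic])
    y_out = np.concatenate([y_kept, np.ones(n_new, dtype=y.dtype)])
    return X_out, y_out
```

The resampled arrays could then be passed to any SVM trainer in place of the original training set. Interpolating between pairs of retained minority examples is a SMOTE-like simplification chosen to keep the sketch short.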

Cite

Castro, C. L., Carvalho, M. A., & Braga, A. P. (2009). An improved algorithm for SVMs classification of imbalanced data sets. In Communications in Computer and Information Science (Vol. 43 CCIS, pp. 108–118). https://doi.org/10.1007/978-3-642-03969-0_11
