Support Vector Machines (SVMs) have strong theoretical foundations and have shown excellent empirical results in many pattern recognition and data mining applications. However, when trained on imbalanced data sets, where the examples of the target (minority) class are outnumbered by the examples of the non-target (majority) class, SVM classifiers perform considerably worse. Small and heavily imbalanced data sets are common in medical diagnosis and text classification, for instance. In this paper, we propose the Boundary Elimination and Domination algorithm (BED) to improve SVM class-prediction accuracy in applications with imbalanced class distributions. BED is an informed resampling strategy that operates in input space. To balance the class distributions, the algorithm uses density information from the training set to remove noisy examples of the majority class and to generate new synthetic examples of the minority class. In our experiments, we compared BED with the original SVM and with the Synthetic Minority Oversampling Technique (SMOTE), a popular resampling strategy in the literature. Our results show that the new approach improves SVM classifier performance on several real-world imbalanced problems. © 2009 Springer-Verlag.
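The abstract does not give BED's exact rules, so the following is not the authors' algorithm. It is a minimal sketch of the general idea described above, under stated assumptions: local density is approximated by the class makeup of a point's k nearest neighbors, a majority example counts as noisy when fewer than `min_same_frac` of its neighbors share its label, and synthetic minority examples are created by SMOTE-style interpolation between minority neighbors. The function names, the label convention (0 = majority, 1 = minority), and both parameters are hypothetical.

```python
import math
import random


def knn_indices(points, query, k, skip=None):
    """Return indices of the k points nearest to query (Euclidean distance)."""
    candidates = (i for i in range(len(points)) if i != skip)
    return sorted(candidates, key=lambda i: math.dist(points[i], query))[:k]


def resample(X, y, k=3, min_same_frac=0.5, seed=0):
    """Illustrative density-based resampling (an assumption, not BED itself).

    X is a list of feature tuples, y a list of labels (0 = majority, 1 = minority).
    Assumes at least two minority examples and more than k examples overall.
    """
    rng = random.Random(seed)

    # Step 1: drop majority examples whose neighborhood is dominated by the
    # minority class (treated here as "noisy").
    keep = []
    for i, (x, label) in enumerate(zip(X, y)):
        if label == 0:
            nbrs = knn_indices(X, x, k, skip=i)
            same_frac = sum(1 for j in nbrs if y[j] == 0) / len(nbrs)
            if same_frac < min_same_frac:
                continue
        keep.append(i)
    X_kept = [X[i] for i in keep]
    y_kept = [y[i] for i in keep]

    # Step 2: interpolate between minority neighbors until classes are balanced.
    minority = [x for x, label in zip(X_kept, y_kept) if label == 1]
    n_majority = y_kept.count(0)
    synthetic = []
    while len(minority) + len(synthetic) < n_majority:
        i = rng.randrange(len(minority))
        nbrs = knn_indices(minority, minority[i], min(k, len(minority) - 1), skip=i)
        j = rng.choice(nbrs)
        t = rng.random()  # position along the segment between the two points
        synthetic.append(
            tuple(a + t * (b - a) for a, b in zip(minority[i], minority[j]))
        )

    return X_kept + synthetic, y_kept + [1] * len(synthetic)
```

The resampled set returned by `resample` would then be passed to an ordinary SVM trainer. Because new points are convex combinations of existing minority examples, they stay inside the region the minority class already occupies.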
CITATION STYLE
Castro, C. L., Carvalho, M. A., & Braga, A. P. (2009). An improved algorithm for SVMs classification of imbalanced data sets. In Communications in Computer and Information Science (Vol. 43 CCIS, pp. 108–118). https://doi.org/10.1007/978-3-642-03969-0_11