Instance selection is a pre-processing technique for machine learning and data mining. The main problem is that previous approaches still have difficulty producing effective samples for training classifiers. In recent research, a new sampling technique called Progressive Border Sampling (PBS) has been proposed to produce a small sample from the original labelled training set by identifying and augmenting border points. However, border sampling on multi-class domains is not a trivial issue, and in practical applications training sets contain much redundancy and noise. In this work, we discuss several issues related to PBS and show that PBS can produce effective samples for training classifiers by removing redundancy and noise from training sets. We compare this new technique with previous instance selection techniques for learning classifiers, especially Naïve Bayes-like classifiers, on multi-class domains, with the exception of one binary case taken from a practical application. © 2009 Springer.
CITATION STYLE
Li, G., Japkowicz, N., Stocki, T. J., & Ungar, R. K. (2009). Instance selection by border sampling in multi-class domains. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5678 LNAI, pp. 209–221). https://doi.org/10.1007/978-3-642-03348-3_22