Instance selection by border sampling in multi-class domains

Guichong Li; Nathalie Japkowicz; Trevor J. Stocki; R. Kurt Ungar

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2009), vol. 5678 LNAI, pp. 209-221

DOI: 10.1007/978-3-642-03348-3_22

Abstract

Instance selection is a pre-processing technique for machine learning and data mining. Previous approaches, however, still have difficulty producing effective samples for training classifiers. In recent research, a new sampling technique called Progressive Border Sampling (PBS) has been proposed to produce a small sample from the original labelled training set by identifying and augmenting border points. Border sampling in multi-class domains is not trivial, however, and in practical applications training sets contain considerable redundancy and noise. In this work, we discuss several issues related to PBS and show that it can produce effective samples for training classifiers by removing redundancy and noise from training sets. We compare PBS with previous instance selection techniques for learning classifiers, especially Naïve Bayes-like classifiers, on multi-class domains, with the exception of one binary case drawn from a practical application. © 2009 Springer.

Cite (APA)

Li, G., Japkowicz, N., Stocki, T. J., & Ungar, R. K. (2009). Instance selection by border sampling in multi-class domains. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5678 LNAI, pp. 209–221). https://doi.org/10.1007/978-3-642-03348-3_22
