Random feature subset selection for ensemble based classification of data with missing features

Abstract

We report on our recent progress in developing an ensemble-of-classifiers-based algorithm for addressing the missing feature problem. Inspired in part by the random subspace method, and in part by an AdaBoost-type distribution update rule for creating a sequence of classifiers, the proposed algorithm generates an ensemble of classifiers, each trained on a different subset of the available features. An instance with missing features is then classified using only those classifiers whose training feature subset did not include the currently missing features. Within this framework, we experiment with several bootstrap sampling strategies, each using a slightly different distribution update rule. We also analyze the effect of the algorithm's primary free parameter (the number of features used to train each classifier) on its performance. We show that the algorithm is able to accommodate data with up to 30% missing features with little or no significant performance drop. © Springer-Verlag Berlin Heidelberg 2007.
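The paper itself provides no source code; the sketch below only illustrates the core idea summarized in the abstract, assuming scikit-learn decision trees as base classifiers and uniform random sampling of feature subsets. The AdaBoost-style distribution update and the bootstrap variants studied in the paper are omitted, and all class and parameter names are illustrative rather than taken from the paper.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier


class RandomFeatureSubsetEnsemble:
    """Simplified sketch: each base classifier is trained on a random
    subset of the features, and an instance with missing features is
    classified only by the members whose feature subset avoids the
    missing features."""

    def __init__(self, n_classifiers=50, n_features_per_clf=5, random_state=None):
        self.n_classifiers = n_classifiers
        # The algorithm's primary free parameter: features per base classifier.
        self.n_features_per_clf = n_features_per_clf
        self.rng = np.random.default_rng(random_state)
        self.members = []  # list of (feature_indices, fitted classifier)

    def fit(self, X, y):
        n_features = X.shape[1]
        self.classes_ = np.unique(y)
        for _ in range(self.n_classifiers):
            # Uniform feature sampling; the paper instead updates this
            # distribution with an AdaBoost-style rule (not reproduced here).
            feats = self.rng.choice(n_features, size=self.n_features_per_clf, replace=False)
            clf = DecisionTreeClassifier(random_state=int(self.rng.integers(1 << 31)))
            clf.fit(X[:, feats], y)
            self.members.append((feats, clf))
        return self

    def predict(self, X):
        # Missing features are marked as NaN in X.
        preds = []
        for x in X:
            missing = np.isnan(x)
            votes = np.zeros(len(self.classes_))
            for feats, clf in self.members:
                # Use only members trained without the currently missing features.
                if not missing[feats].any():
                    label = clf.predict(x[feats].reshape(1, -1))[0]
                    votes[np.searchsorted(self.classes_, label)] += 1
            if votes.sum() == 0:
                # No usable member: fall back arbitrarily to the first class label.
                preds.append(self.classes_[0])
            else:
                preds.append(self.classes_[np.argmax(votes)])
        return np.array(preds)
```

For example, with n_features_per_clf=5 on a 15-feature dataset, an instance missing feature 3 is voted on only by those members whose five training features exclude index 3; per the abstract, this strategy tolerates up to about 30% missing features with little or no significant performance drop.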

Citation (APA)

DePasquale, J., & Polikar, R. (2007). Random feature subset selection for ensemble based classification of data with missing features. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4472 LNCS, pp. 251–260). Springer Verlag. https://doi.org/10.1007/978-3-540-72523-7_26
