Best subset feature selection for massive mixed-type problems

Abstract

We address the problem of identifying a non-redundant subset of important variables. All modern feature selection approaches, including filters, wrappers, and embedded methods, experience problems in very general settings with massive mixed-type data and complex relationships between the inputs and the target. We propose an efficient ensemble-based approach that measures statistical independence between a target and a potentially very large number of inputs, including any meaningful order of interactions between them; removes redundancies from the relevant inputs; and finally ranks the variables in the identified minimum feature set. Experiments with synthetic data illustrate the sensitivity and selectivity of the method, and its scalability is demonstrated with a real car sensor database. © Springer-Verlag Berlin Heidelberg 2006.
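The abstract names three stages (relevance testing, redundancy removal, ranking) without giving details. The following is a minimal sketch of one way such a pipeline could look, not the authors' algorithm. Everything in it is an assumption made for illustration: the bootstrap ensemble of single-split learners, the permuted "shadow" column used as a null reference, the residual-based redundancy step, the 0.95 win threshold, and all function and column names.

```python
import random
from statistics import fmean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def fit_groups(x, y):
    """Return a function mapping a feature value to a group key.

    Strings are grouped by category; numbers by the single threshold
    (searched over a thinned set of cut points) that minimizes squared error.
    """
    if isinstance(x[0], str):
        return lambda v: v
    cuts = sorted(set(x))[:-1]
    best_t, best_err = None, float("inf")
    for t in cuts[:: max(1, len(cuts) // 10)]:
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        err = sse(left) + sse(right)
        if err < best_err:
            best_t, best_err = t, err
    return lambda v: best_t is not None and v <= best_t


def split(x, y, key):
    """Collect target values per group key."""
    groups = {}
    for xi, yi in zip(x, y):
        groups.setdefault(key(xi), []).append(yi)
    return groups


def gain(x, y):
    """Reduction in squared error of y obtained by grouping rows on x."""
    groups = split(x, y, fit_groups(x, y))
    return sse(y) - sum(sse(g) for g in groups.values())


def residual(x, y):
    """Remove from y the part explained by grouping on x."""
    key = fit_groups(x, y)
    means = {k: fmean(g) for k, g in split(x, y, key).items()}
    return [yi - means[key(xi)] for xi, yi in zip(x, y)]


def relevance(x, y, rng, n_rep=20):
    """Win rate of x against a permuted shadow copy, and mean gain, over bootstraps."""
    n, wins, total = len(y), 0, 0.0
    for _ in range(n_rep):
        shadow = rng.sample(x, n)  # permutation of the column: same values, no signal
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap row indices
        yb = [y[i] for i in idx]
        g = gain([x[i] for i in idx], yb)
        wins += g > gain([shadow[i] for i in idx], yb)
        total += g
    return wins / n_rep, total / n_rep


def select(features, y, seed=0, min_win=0.95):
    """Greedy loop: keep the strongest relevant feature, then continue on residuals."""
    rng = random.Random(seed)
    remaining, selected, target = dict(features), [], list(y)
    while remaining:
        scores = {name: relevance(col, target, rng) for name, col in remaining.items()}
        passed = {name: s[1] for name, s in scores.items() if s[0] >= min_win}
        if not passed:
            break
        best = max(passed, key=passed.get)
        selected.append(best)  # selection order doubles as the ranking
        target = residual(remaining.pop(best), target)
    return selected


# Hypothetical toy data: one numeric driver, an exact duplicate of it,
# one categorical driver, and one pure-noise column.
rng = random.Random(1)
n = 150
load = [rng.randrange(10) for _ in range(n)]
color = [rng.choice(["red", "green", "blue"]) for _ in range(n)]
noise = [rng.random() for _ in range(n)]
effect = {"red": 1.0, "green": 0.0, "blue": -1.0}
y = [4.0 * (l >= 5) + effect[c] + rng.gauss(0, 0.5) for l, c in zip(load, color)]
features = {"load": load, "load_copy": list(load), "color": color, "noise": noise}
selected = select(features, y)
print(selected)
```

Comparing each column with a permuted copy of itself puts numeric and categorical inputs on the same footing, since the shadow has the same number of distinct values as the original. Refitting on residuals is what handles redundancy in this sketch: once one of `load` and `load_copy` is chosen, the other has little left to explain. With only 20 replicates the test is noisy, so a null column can occasionally pass the threshold.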

Cite

Tuv, E., Borisov, A., & Torkkola, K. (2006). Best subset feature selection for massive mixed-type problems. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4224 LNCS, pp. 1048–1056). Springer Verlag. https://doi.org/10.1007/11875581_125
