Using data complexity measures for thresholding in feature selection rankers

14 citations · 19 Mendeley readers

Abstract

In recent years, feature selection has become essential for confronting the dimensionality problem by removing irrelevant and redundant information. Ranker methods are a commonly used approach for this purpose, since they do not compromise computational efficiency. A ranker returns an ordered ranking of all the features, so a threshold must be established to reduce the number of features retained. In this work, a practical subset of features is selected according to three different data complexity measures, releasing the user from the task of choosing a fixed threshold in advance. The proposed approach was tested on six DNA microarray datasets, which pose a difficult challenge for researchers because of the large number of gene expression features and the small number of patients. Its adequacy in terms of classification error was checked using an ensemble of ranker methods with a Support Vector Machine as the classifier. This study shows that our approach achieved results competitive with those obtained by the fixed-threshold approach, which is the standard in most research works.
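The abstract describes a two-step pipeline: rank all features, then cut the ranking at a point chosen by a data complexity measure instead of a threshold fixed in advance. The three measures used in the paper are not named in the abstract; the sketch below is a minimal illustration only, assuming an averaged Fisher's discriminant ratio (related to Ho and Basu's F1 complexity measure) as the criterion. All function names and the candidate cut-off points are hypothetical, not taken from the paper.

```python
import numpy as np

def avg_fisher_ratio(X, y):
    """Average Fisher's discriminant ratio of the columns of X for a
    two-class problem (labels 0/1). Note: the classical F1 complexity
    measure takes the maximum ratio over features; the average is used
    here (an assumption) so that adding noisy features visibly
    degrades the score, which is what makes thresholding meaningful."""
    c0, c1 = X[y == 0], X[y == 1]
    num = (c0.mean(axis=0) - c1.mean(axis=0)) ** 2
    den = c0.var(axis=0) + c1.var(axis=0)
    # Guard against zero-variance features to avoid division by zero.
    ratios = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return ratios.mean()

def threshold_ranking(X, y, ranking, candidates=(1, 5, 10, 25)):
    """Given a precomputed feature ranking (best feature first), keep
    the candidate cut-off whose feature subset looks most separable,
    i.e. least 'complex', under the chosen measure."""
    best_k, best_score = candidates[0], -np.inf
    for k in candidates:
        score = avg_fisher_ratio(X[:, ranking[:k]], y)
        if score > best_score:
            best_k, best_score = k, score
    return ranking[:best_k]
```

In a full version of the approach, `ranking` would come from one ranker (or an ensemble of rankers, as in the paper), and the retained subset would then be passed to the classifier; the complexity measure only decides where to cut.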

APA Citation

Seijo-Pardo, B., Bolón-Canedo, V., & Alonso-Betanzos, A. (2016). Using data complexity measures for thresholding in feature selection rankers. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9868 LNAI, pp. 121–131). Springer Verlag. https://doi.org/10.1007/978-3-319-44636-3_12
