A review of feature selection methods on synthetic data

Abstract

With the advent of high-dimensional data, adequately identifying the relevant features has become indispensable in real-world scenarios. In this context, the importance of feature selection is beyond doubt, and many different methods have been developed. However, with such a vast body of algorithms available, choosing an adequate feature selection method is not easy, and it is necessary to check their effectiveness in different situations. The assessment of relevant features is difficult in real datasets, because the truly relevant features are usually unknown, so an interesting option is to use artificial data. In this paper, several synthetic datasets are employed for this purpose, aiming at reviewing the performance of feature selection methods in the presence of a growing number of irrelevant features, noise in the data, redundancy and interaction between attributes, as well as a small ratio between the number of samples and the number of features. Seven filters, two embedded methods, and two wrappers are applied over eleven synthetic datasets and tested with four classifiers, so as to be able to choose a robust method, paving the way for its application to real datasets. © 2012 Springer-Verlag London Limited.
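
The idea of evaluating a selector on artificial data, where the relevant features are known in advance, can be illustrated with a minimal sketch. This is not one of the eleven datasets or any of the methods studied in the paper: the data generator, the correlation-based filter, and the names `make_synthetic`, `correlation_filter`, and `recovery_rate` are all assumptions made up for illustration. The sketch builds a dataset with a few relevant features and many irrelevant ones, ranks the features with a simple univariate filter, and checks how many of the known relevant features appear at the top of the ranking.

```python
import numpy as np


def make_synthetic(n_samples=200, n_relevant=3, n_irrelevant=20, seed=0):
    """Hypothetical generator: the label depends only on the first n_relevant columns."""
    rng = np.random.default_rng(seed)
    X_rel = rng.normal(size=(n_samples, n_relevant))
    X_irr = rng.normal(size=(n_samples, n_irrelevant))  # pure noise columns
    y = (X_rel.sum(axis=1) > 0).astype(int)             # class label from relevant columns only
    X = np.hstack([X_rel, X_irr])
    relevant = np.arange(n_relevant)                     # ground truth, known by construction
    return X, y, relevant


def correlation_filter(X, y):
    """Illustrative filter: rank features by absolute Pearson correlation with the label."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    scores = np.abs(Xc.T @ yc) / denom
    return np.argsort(-scores)                           # best-scoring feature first


def recovery_rate(ranking, relevant):
    """Fraction of the truly relevant features found among the top-k ranked ones."""
    top_k = set(ranking[: len(relevant)].tolist())
    return len(top_k & set(relevant.tolist())) / len(relevant)


X, y, relevant = make_synthetic()
ranking = correlation_filter(X, y)
rate = recovery_rate(ranking, relevant)
print(f"relevant features recovered in top-{len(relevant)}: {rate:.2f}")
```

Raising `n_irrelevant` or lowering `n_samples` in this sketch mimics two of the conditions named in the abstract (more irrelevant features, fewer samples per feature). A univariate filter like this one would not be expected to cope with interacting attributes, which is one reason a comparison across several method families is informative.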

Cite

Bolón-Canedo, V., Sánchez-Maroño, N., & Alonso-Betanzos, A. (2013, March 1). A review of feature selection methods on synthetic data. Knowledge and Information Systems. Springer London. https://doi.org/10.1007/s10115-012-0487-8
