A time efficient approach for distributed feature selection partitioning by features

Abstract

With the advent of high-dimensional data, feature selection has become indispensable in real-world scenarios. However, most traditional methods work only in a centralized manner, which, ironically, increases their running time when they are applied to this type of data. For this reason, we propose a distributed filter approach for vertically partitioned data. The idea is to split the data by features and apply a filter to each partition, performing several rounds to obtain a final subset of features. Unlike existing procedures for combining the partial outputs of the different partitions, we propose a merging process based on the theoretical complexity of these feature subsets instead of the classification error. Experimental results on five datasets show that the running time decreases considerably. Moreover, in terms of classification accuracy, our approach was able to match, and in some cases even improve on, the standard algorithms applied to the non-partitioned datasets.
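
The procedure outlined in the abstract (split by features, filter each partition, merge by a complexity criterion, repeat over several rounds) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not name the filter, the complexity measure, or the rule for combining rounds, so everything below is an assumption chosen for illustration: an absolute-correlation filter, a Fisher-style separability score as the complexity measure, a hypothetical `min_gain` acceptance threshold, and a majority vote across rounds. All function and parameter names are hypothetical, and binary labels coded as 0/1 are assumed.

```python
import numpy as np


def filter_scores(X, y):
    """Stand-in univariate filter: |Pearson correlation| of each feature with the label."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12
    return np.abs(Xc.T @ yc) / denom


def separability(X, y):
    """Stand-in complexity score for a feature subset (assumes binary labels 0/1).

    Fisher-style criterion d^T Sw^-1 d; a higher value means the classes
    are easier to separate, i.e. the subset is less complex.
    """
    a, b = X[y == 0], X[y == 1]
    d = a.mean(axis=0) - b.mean(axis=0)
    sw = np.atleast_2d(np.cov(a, rowvar=False)) + np.atleast_2d(np.cov(b, rowvar=False))
    sw = sw + 1e-6 * np.eye(sw.shape[0])  # small ridge for numerical stability
    return float(d @ np.linalg.solve(sw, d))


def distributed_select(X, y, n_parts=4, top_k=2, rounds=3, min_gain=0.05, seed=0):
    """Sketch of feature selection over vertically partitioned data."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    votes = np.zeros(n_features, dtype=int)
    for _ in range(rounds):
        # Each round shuffles the features and splits them into disjoint partitions.
        perm = rng.permutation(n_features)
        merged = []
        for part in np.array_split(perm, n_parts):
            # Apply the filter locally and keep the top_k features of this partition.
            scores = filter_scores(X[:, part], y)
            cand = part[np.argsort(scores)[::-1][:top_k]].tolist()
            trial = merged + cand
            # Merge by complexity, not classification error: accept the partition's
            # candidates only if separability improves by at least min_gain (assumed rule).
            if not merged or separability(X[:, trial], y) > (1 + min_gain) * separability(X[:, merged], y):
                merged = trial
        votes[merged] += 1
    # Assumed combination rule: keep features selected in a majority of rounds.
    return np.flatnonzero(votes > rounds / 2)
```

In this sketch the filter only ever sees one partition at a time, so the per-partition work could run in parallel on separate nodes; only the short candidate lists need to be brought together for the merge step.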

Cite

Morán-Fernández, L., Bolón-Canedo, V., & Alonso-Betanzos, A. (2015). A time efficient approach for distributed feature selection partitioning by features. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9422, pp. 245–254). Springer Verlag. https://doi.org/10.1007/978-3-319-24598-0_22
