Feature Selection for High Dimensional Data Using Weighted K-Nearest Neighbors and Genetic Algorithm

Abstract

Too many input features may lead to over-fitting and degrade the performance of a learning algorithm. Moreover, in most cases, features carry different amounts of information and therefore affect the prediction target to different degrees. This paper proposes WKNNGAFS, a feature selection method that calculates the importance of each feature. In this method, a genetic algorithm (GA) searches for the optimal weight vector, whose i-th component corresponds to the degree to which the i-th feature contributes to the classification from a global perspective. A weighted K-nearest neighbors algorithm (WKNN), which takes into account both the different contributions of the nearest neighbors and the different classification ability of each feature, is then used to determine the target label. To evaluate its effectiveness, the proposed method is compared with nine existing feature selection methods on 13 real datasets, including 6 high-dimensional microarray datasets. Experimental results demonstrate that the proposed method is more effective and can improve classification performance.
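
The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the distance function, the neighbor weighting scheme, the fitness function, or the GA operators, so everything below is an assumption chosen for illustration. The sketch assumes a feature-weighted Euclidean distance, inverse-distance neighbor votes, validation accuracy as fitness, and a simple GA with truncation selection, uniform crossover, and random-reset mutation. The names `wknn_predict`, `fitness`, and `ga_feature_weights` are hypothetical.

```python
import numpy as np

def wknn_predict(X_train, y_train, X_test, w, k=5):
    """Feature-weighted, distance-weighted KNN (illustrative assumption)."""
    # Weighted Euclidean distance: feature j is scaled by its weight w[j].
    diff = X_test[:, None, :] - X_train[None, :, :]
    dist = np.sqrt((w * diff ** 2).sum(axis=2))
    nearest = np.argsort(dist, axis=1)[:, :k]
    classes = np.unique(y_train)
    preds = np.empty(len(X_test), dtype=y_train.dtype)
    for i, nbrs in enumerate(nearest):
        # Closer neighbors get larger votes (inverse-distance weighting).
        votes = 1.0 / (dist[i, nbrs] + 1e-9)
        scores = [votes[y_train[nbrs] == c].sum() for c in classes]
        preds[i] = classes[int(np.argmax(scores))]
    return preds

def fitness(w, X_tr, y_tr, X_val, y_val, k=5):
    """Assumed fitness: validation accuracy of WKNN under weight vector w."""
    return float(np.mean(wknn_predict(X_tr, y_tr, X_val, w, k) == y_val))

def ga_feature_weights(X_tr, y_tr, X_val, y_val, pop_size=20,
                       generations=15, mutation_rate=0.1, k=5, seed=0):
    """Search for a weight vector in [0, 1]^d with a simple GA."""
    rng = np.random.default_rng(seed)
    n_features = X_tr.shape[1]
    pop = rng.random((pop_size, n_features))
    for _ in range(generations):
        scores = np.array([fitness(w, X_tr, y_tr, X_val, y_val, k) for w in pop])
        # Truncation selection: keep the better half unchanged (elitism).
        elite = pop[np.argsort(scores)[::-1][: pop_size // 2]]
        n_children = pop_size - len(elite)
        pa = elite[rng.integers(len(elite), size=n_children)]
        pb = elite[rng.integers(len(elite), size=n_children)]
        # Uniform crossover: each gene comes from either parent.
        children = np.where(rng.random(pa.shape) < 0.5, pa, pb)
        # Random-reset mutation: replace some genes with fresh random weights.
        mutate = rng.random(children.shape) < mutation_rate
        children = np.where(mutate, rng.random(children.shape), children)
        pop = np.vstack([elite, children])
    scores = np.array([fitness(w, X_tr, y_tr, X_val, y_val, k) for w in pop])
    best = int(np.argmax(scores))
    return pop[best], float(scores[best])
```

Under these assumptions, features could then be ranked by the returned weights, for example with `np.argsort(best_w)[::-1]`, and the top-ranked ones kept as the selected subset.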

Cite

Li, S., Zhang, K., Chen, Q., Wang, S., & Zhang, S. (2020). Feature Selection for High Dimensional Data Using Weighted K-Nearest Neighbors and Genetic Algorithm. IEEE Access, 8, 139512–139528. https://doi.org/10.1109/ACCESS.2020.3012768
