Unsupervised feature selection by heuristic search with provable bounds on suboptimality

Abstract

Identifying a small number of features that can represent the data is a well-known problem that arises in areas such as machine learning, knowledge representation, data mining, and numerical linear algebra. Computing an optimal solution is believed to be NP-hard, and there is extensive work on approximation algorithms. Classic approaches exploit the algebraic structure of the underlying matrix, while more recent approaches use randomization. An entirely different approach that uses the A∗ heuristic search algorithm to find an optimal solution was recently proposed. Not surprisingly, it is effective only when selecting a small number of features. We propose a similar approach related to the Weighted A∗ algorithm. This gives algorithms that are not guaranteed to find an optimal solution but run much faster than the A∗ approach, enabling effective selection of many features from large datasets. We demonstrate experimentally that these new algorithms are more accurate than the current state of the art while still being practical. Furthermore, they come with an adjustable guarantee on how much their error may exceed the smallest possible (optimal) error. Their accuracy can always be increased at the expense of a longer running time.
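
The trade-off between running time and a bound on suboptimality can be illustrated with the generic, textbook form of Weighted A∗. The sketch below is not the feature-selection algorithm of the paper: it runs on a hypothetical four-node shortest-path graph, and the names `weighted_a_star`, `graph`, and `heuristic` are assumptions made for illustration. It only shows the standard property that, with an admissible heuristic and weight w ≥ 1, the returned cost is at most w times the optimal cost.

```python
import heapq


def weighted_a_star(graph, heuristic, start, goal, w=1.0):
    """Generic Weighted A* on an explicit graph (illustrative only).

    Nodes are ordered by f = g + w * h. With an admissible heuristic and
    w >= 1 the returned cost is at most w times the optimal cost; w = 1
    recovers plain A*.
    """
    frontier = [(w * heuristic[start], 0.0, start, [start])]
    best_g = {start: 0.0}
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if g > best_g.get(node, float("inf")):
            continue  # stale entry: a cheaper route to this node was found
        for neighbor, edge_cost in graph.get(node, []):
            new_g = g + edge_cost
            if new_g < best_g.get(neighbor, float("inf")):
                best_g[neighbor] = new_g
                f = new_g + w * heuristic[neighbor]
                heapq.heappush(frontier, (f, new_g, neighbor, path + [neighbor]))
    return None  # goal unreachable


# Hypothetical toy graph: the optimal route S-B-G costs 5, S-A-G costs 6.
graph = {"S": [("A", 1), ("B", 4)], "A": [("G", 5)], "B": [("G", 1)]}
# Admissible heuristic (never overestimates the remaining cost).
heuristic = {"S": 4, "A": 1, "B": 1, "G": 0}

optimal_cost, optimal_path = weighted_a_star(graph, heuristic, "S", "G", w=1.0)
fast_cost, fast_path = weighted_a_star(graph, heuristic, "S", "G", w=3.0)
print(optimal_cost, optimal_path)
print(fast_cost, fast_path)
```

In this toy instance the run with w = 3 stops after expanding fewer nodes and returns the route through A, whose cost is higher than the optimum but within the factor-3 bound. Lowering w toward 1 tightens the guarantee at the price of more search.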

Cite

Arai, H., Maung, C., Xu, K., & Schweitzer, H. (2016). Unsupervised feature selection by heuristic search with provable bounds on suboptimality. In 30th AAAI Conference on Artificial Intelligence, AAAI 2016 (pp. 666–672). AAAI Press. https://doi.org/10.1609/aaai.v30i1.10082
