Speeding up the wrapper feature subset selection in regression by mutual information relevance and redundancy analysis

Gert Van Dijck; Marc M. Van Hulle

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2006) 4131 LNCS - I 31-40

DOI: 10.1007/11840817_4

34 Citations

38 Readers

Abstract

A hybrid filter/wrapper feature subset selection algorithm for regression is proposed. First, features are filtered by a relevance and redundancy filter that uses the mutual information between regression and target variables. We introduce permutation tests to find statistically significant relevant and redundant features. Second, a wrapper searches for good candidate feature subsets by taking the regression model into account. The advantage of a hybrid approach is threefold. First, the filter provides interesting features independently of the regression model and hence allows for easier interpretation. Second, because the filter part is computationally less expensive, the overall algorithm provides good candidate subsets faster than a stand-alone wrapper approach. Finally, the wrapper takes the bias of the regression model into account, because the regression model guides the search for optimal features. Results are shown for the 'Boston housing' and 'orange juice' benchmarks based on the multilayer perceptron regression model. © Springer-Verlag Berlin Heidelberg 2006.
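The relevance step of the filter can be illustrated with a minimal sketch. The abstract does not state which mutual information estimator or how many permutations the authors use, so everything below is an assumption made for illustration: an equal-width histogram (plug-in) estimator, 200 permutations, and the hypothetical helper names `binned`, `mutual_information`, and `permutation_p_value`. The idea is that shuffling the target destroys any dependence on the feature, so the shuffled mutual information values approximate the null distribution against which the observed value is compared.

```python
import math
import random
from collections import Counter


def binned(values, n_bins=8):
    """Map continuous values to equal-width bin indices (assumed discretization)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for constant input
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def mutual_information(x_bins, y_bins):
    """Plug-in mutual information estimate (in nats) of two discrete sequences."""
    n = len(x_bins)
    joint = Counter(zip(x_bins, y_bins))
    px = Counter(x_bins)
    py = Counter(y_bins)
    # Sum p(a,b) * log(p(a,b) / (p(a) p(b))) over the observed cells.
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in joint.items()
    )


def permutation_p_value(x, y, n_perm=200, n_bins=8, seed=0):
    """Estimate how often a shuffled target reaches the observed MI."""
    rng = random.Random(seed)
    xb, yb = binned(x, n_bins), binned(y, n_bins)
    observed = mutual_information(xb, yb)
    shuffled = list(yb)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any feature/target dependence
        if mutual_information(xb, shuffled) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_perm + 1)


# Synthetic data: one feature that tracks the target, one that is pure noise.
rng = random.Random(1)
target = [rng.gauss(0, 1) for _ in range(300)]
relevant = [t + rng.gauss(0, 0.3) for t in target]
noise = [rng.gauss(0, 1) for _ in range(300)]

p_relevant = permutation_p_value(relevant, target)
p_noise = permutation_p_value(noise, target)
```

A feature would be kept as relevant when its p-value falls below a chosen significance level. A redundancy test could plausibly reuse the same function with a second feature in place of the target, though the abstract does not spell out that procedure.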

Cite

Van Dijck, G., & Van Hulle, M. M. (2006). Speeding up the wrapper feature subset selection in regression by mutual information relevance and redundancy analysis. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4131 LNCS-I, pp. 31–40). Springer Verlag. https://doi.org/10.1007/11840817_4
