Measuring the stability of feature selection


Abstract

In feature selection algorithms, “stability” is the sensitivity of the chosen feature set to variations in the supplied training data. As such, it can be seen as a concept analogous to the statistical variance of a predictor. However, unlike variance, there is no unique definition of stability; numerous measures have been proposed over 15 years of literature. In this paper, instead of defining a new measure, we start from an axiomatic point of view and identify what properties would be desirable. Somewhat surprisingly, we find that the simple Pearson’s correlation coefficient has all the necessary properties, yet has somehow been overlooked in favour of more complex alternatives. Finally, we illustrate how the use of this measure in practice can provide better interpretability and more confidence in the model selection process. The data and software related to this paper are available at https://github.com/nogueirs/ECML2016.
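As a rough illustration of the idea (not the paper's exact formulation — the function name and setup below are assumptions for this sketch), stability can be estimated by running a feature selector on several resampled datasets, encoding each selected feature set as a binary indicator vector, and averaging the pairwise Pearson correlation between those vectors:

```python
import numpy as np

def stability(Z):
    """Average pairwise Pearson correlation between binary
    feature-selection indicator vectors.

    Z: (M, d) 0/1 matrix; Z[i, j] = 1 if feature j was selected
    in run i of the feature selection algorithm.
    NOTE: the correlation is undefined for a run that selects
    all features or none (zero variance in the indicator vector).
    """
    Z = np.asarray(Z, dtype=float)
    M = Z.shape[0]
    corrs = []
    for i in range(M):
        for j in range(i + 1, M):
            # Pearson correlation between the two runs' selections
            r = np.corrcoef(Z[i], Z[j])[0, 1]
            corrs.append(r)
    return float(np.mean(corrs))

# Identical selections across runs give maximal stability (1.0);
# completely disjoint selections of the same size give -1.0.
print(stability([[1, 1, 0, 0], [1, 1, 0, 0]]))
print(stability([[1, 0, 1, 0], [0, 1, 0, 1]]))
```

A value near 1 indicates the selector returns nearly the same feature set regardless of the training sample drawn.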


Citation (APA)

Nogueira, S., & Brown, G. (2016). Measuring the stability of feature selection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9852 LNAI, pp. 442–457). Springer Verlag. https://doi.org/10.1007/978-3-319-46227-1_28
