Feature selection methods identify subsets of features in large datasets. Such methods have become popular in data-intensive areas, and performing feature selection prior to model construction may reduce the computational cost and improve the model quality. Monte Carlo Feature Selection (MCFS) is a feature selection method aimed at finding features to use for classification. Here we suggest a strategy that uses a z-test to compute the significance of a feature in MCFS. We used simulated data with both informative and random features, and compared the z-test with a permutation test and with a test built into the MCFS software. The z-test agreed more closely with the permutation test than the built-in test did. Furthermore, it avoided a bias related to the distribution of feature values that may have affected the built-in test. In conclusion, the suggested method has the potential to improve feature selection using MCFS.
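The abstract does not spell out how the z-test is constructed, so the following is only a minimal sketch of the general idea, under stated assumptions: a feature's observed importance score is compared against a set of scores that represent the null case (for example, scores obtained after permuting class labels), and the null scores are assumed to be approximately normal. The function names `z_test_p_value` and `permutation_p_value` and all input values are hypothetical and are not taken from the MCFS software.

```python
import math
import statistics


def z_test_p_value(observed, null_scores):
    """Return (z, one-sided p-value) for an observed importance score.

    Assumption: null_scores are importance scores for the same feature
    under the null hypothesis and are approximately normally distributed.
    """
    mu = statistics.mean(null_scores)
    sigma = statistics.stdev(null_scores)  # sample standard deviation
    if sigma == 0:
        raise ValueError("null scores have zero variance")
    z = (observed - mu) / sigma
    # Upper-tail probability of the standard normal: P(Z >= z)
    p = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p


def permutation_p_value(observed, null_scores):
    """Empirical one-sided p-value, with the usual +1 correction."""
    exceed = sum(1 for s in null_scores if s >= observed)
    return (exceed + 1) / (len(null_scores) + 1)


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    null = [0.8, 1.1, 0.9, 1.3, 1.0, 0.7, 1.2, 1.0]
    z, p_z = z_test_p_value(2.5, null)
    p_perm = permutation_p_value(2.5, null)
    print(f"z = {z:.2f}, z-test p = {p_z:.3g}, permutation p = {p_perm:.3g}")
```

One design point this sketch makes visible: the empirical permutation p-value can never fall below 1 / (n + 1) for n null scores, whereas the normal approximation can yield much smaller p-values from the same number of permutations, at the cost of relying on the normality assumption.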
Bornelöv, S., & Komorowski, J. (2015). Selection of significant features using Monte Carlo feature selection. In Challenges in Computational Statistics and Data Mining (Vol. 605, pp. 25–38). Springer International Publishing. https://doi.org/10.1007/978-3-319-18781-5_2