Both wrapper and hybrid feature selection methods require the intervention of a learning algorithm to train model parameters. The preset parameters and the dataset are used to construct several sub-optimal models, from which the final model is selected. This raises two questions: how should the performance of these sub-optimal models be evaluated, and what effect do different evaluation methods for the sub-optimal models have on the result of feature selection? Aiming at this model evaluation problem in feature selection, we chose a hybrid feature selection algorithm, FDHSFFS, and conducted comparative experiments on four UCI datasets with large differences in feature dimension and sample size, using five different cross-validation (CV) methods. The experimental results show that, in the process of feature selection, twofold CV and leave-one-out CV are more suitable for model evaluation on low-dimensional, small-sample datasets, while tenfold nested CV and tenfold CV are more suitable for model evaluation on high-dimensional datasets; tenfold nested CV is close to an unbiased estimate, and different optimal models may select the same approximate optimal feature subset.
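To make the evaluation setup concrete, the sketch below shows nested tenfold CV around a feature selection pipeline. This is not the FDHSFFS algorithm (whose details are not given in the abstract); it is a minimal, hedged illustration using scikit-learn, with a stand-in dataset and a hypothetical SelectKBest + SVC pipeline. The inner loop selects the feature-subset size and hyperparameters; the outer loop estimates generalization performance, which is the estimate the abstract describes as being close to unbiased.

```python
# Minimal sketch (not FDHSFFS): nested 10-fold CV for evaluating candidate
# models produced during feature selection. Dataset, pipeline, and parameter
# grid are illustrative assumptions, not taken from the paper.
from sklearn.datasets import load_breast_cancer
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV, cross_val_score, KFold

X, y = load_breast_cancer(return_X_y=True)  # stand-in for a UCI dataset

pipe = Pipeline([
    ("select", SelectKBest(score_func=f_classif)),  # feature selection step
    ("clf", SVC()),                                 # learning algorithm
])
param_grid = {"select__k": [5, 10, 20], "clf__C": [0.1, 1, 10]}

inner_cv = KFold(n_splits=10, shuffle=True, random_state=0)  # model selection
outer_cv = KFold(n_splits=10, shuffle=True, random_state=1)  # model evaluation

# Inner CV chooses the feature subset and hyperparameters; the outer CV score
# evaluates the whole selection procedure, avoiding selection bias.
search = GridSearchCV(pipe, param_grid, cv=inner_cv)
nested_scores = cross_val_score(search, X, y, cv=outer_cv)
print("nested 10-fold CV accuracy: %.3f +/- %.3f"
      % (nested_scores.mean(), nested_scores.std()))
```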
CITATION STYLE
Qi, C., Diao, J., & Qiu, L. (2019). On Estimating Model in Feature Selection with Cross-Validation. IEEE Access, 7, 33454–33463. https://doi.org/10.1109/ACCESS.2019.2892062