Stability based sparse LSI/PCA: Incorporating feature selection in LSI and PCA

6Citations
Citations of this article
14Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

The stability of sample based algorithms is a concept commonly used for parameter tuning and validity assessment. In this paper we focus on two well studied algorithms, LSI and PCA, and propose a feature selection process that provably guarantees the stability of their outputs. The feature selection process is performed such that me level of (statistical) accuracy of the LSI/PCA input matrices is adequate for computing meaningful (stable) eigenvectors. The feature selection process "sparsifies" LSI/PCA, resulting in the projection of the instances on the eigenvectors of a principal submatrix of the original input matrix, thus producing sparse factor loadings that are linear combinations solely of the selected features. We utilize bootstrapping confidence intervals for assessing the statistical accuracy of the input sample matrices, and matrix perturbation theory in order to relate the statistical accuracy to the stability of eigenvectors. Experiments on several UCI-datasets verify empirically our approach. © Springer-Verlag Berlin Heidelberg 2007.

Cite

CITATION STYLE

APA

Mavroeidis, D., & Vazirgiannis, M. (2007). Stability based sparse LSI/PCA: Incorporating feature selection in LSI and PCA. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4701 LNAI, pp. 226–237). Springer Verlag. https://doi.org/10.1007/978-3-540-74958-5_23

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free