Abstract
Sparse representation, which originates from compressed sensing theory in signal processing, has attracted increasing interest in the computer vision research community. In this paper, we present a novel non-parametric feature selection method for text classification based on sparse representation. To address the problem of polysemy and synonymy in the vector space model (VSM), we construct a semantic structure that represents documents using probabilistic latent semantic analysis (PLSA). Motivated by the fact that the kernel trick can capture nonlinear similarity between features, which may reduce the feature quantization error, we propose Empirical Kernel Sparse Representation (EKSR). We apply EKSR to reconstruct the weight vectors between samples, then design an evaluation mechanism, the Kernel Sparsity Score (KSS), to select a good feature subset. Owing to the natural discriminative power of EKSR, KSS can find "good" features that preserve the original structure with less information loss. Experimental results on both English and Chinese datasets demonstrate the effectiveness of the proposed method. © 2012 Asian Network for Scientific Information.
Cite
Deng, Z., Hu, G., Pan, Z., & Zhang, Y. (2012). Kernel sparse feature selection based on semantics in text classification. Information Technology Journal, 11(3), 319–323. https://doi.org/10.3923/itj.2012.319.323