Dimensionality reduction can efficiently improve the computing performance of classifiers in text categorization, and non-negative matrix factorization (NMF) can easily map the high-dimensional term space into a low-dimensional semantic subspace. Moreover, the non-negativity of the basis vectors provides a meaningful interpretation of the semantic subspace. However, as a linear reconstruction method, NMF is sensitive to noise, missing data, and outliers, so it usually cannot achieve satisfactory classification performance on its own. This paper proposes a novel approach in which the training text and its category information are fused, and a transformation matrix that maps the term space into a semantic subspace is obtained by a basis-orthogonal non-negative matrix factorization followed by truncation. With these transformations, the dimensionality can be reduced aggressively. Experimental results show that the proposed approach retains good classification performance even in very low-dimensional cases. © 2011 Springer-Verlag.
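As a generic illustration of the underlying technique, the following sketch applies standard NMF (Lee–Seung multiplicative updates) to a toy term-document matrix. It is an assumption-laden stand-in, not the paper's basis-orthogonal NMF with category-information fusion: the matrix sizes, subspace dimensionality, and iteration count are arbitrary choices for demonstration.

```python
import numpy as np

# Toy non-negative term-document matrix V (rows: terms, columns: documents).
# Real text categorization would build V from term frequencies or TF-IDF.
rng = np.random.default_rng(0)
V = rng.random((50, 8))  # 50 terms, 8 documents (illustrative sizes)

k = 3  # target dimensionality of the semantic subspace (illustrative)
W = rng.random((50, k))  # non-negative basis vectors spanning the subspace
H = rng.random((k, 8))   # document coordinates in the subspace

# Standard Lee-Seung multiplicative updates minimizing ||V - WH||_F^2;
# non-negativity of W and H is preserved at every step.
eps = 1e-10  # guard against division by zero
for _ in range(200):
    H *= (W.T @ V) / (W.T @ W @ H + eps)
    W *= (V @ H.T) / (W @ H @ H.T + eps)

# W's columns act as an interpretable, non-negative semantic basis; each
# document is now represented by k coordinates instead of 50 term weights.
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(W.shape, H.shape, round(rel_err, 3))
```

A new document vector could then be projected into the same k-dimensional subspace (for instance by non-negative least squares against W) before being handed to a classifier, which is the dimensionality-reduction role NMF plays in the abstract above.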
Zheng, W., Qian, Y., & Tang, H. (2011). Dimensionality reduction with category information fusion and non-negative matrix factorization for text categorization. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7004 LNAI, pp. 505–512). https://doi.org/10.1007/978-3-642-23896-3_62