Sprinkling: Supervised latent semantic indexing

Abstract

Latent Semantic Indexing (LSI) is an established dimensionality reduction technique for Information Retrieval applications. However, LSI-generated dimensions are not optimal in a classification setting, since LSI fails to exploit the class labels of training documents. We propose an approach that uses class information to influence LSI dimensions: class labels of training documents are encoded as new terms, which are appended to the documents. When LSI is carried out on the augmented term-document matrix, terms pertaining to the same class are pulled closer to each other. Evaluation over experimental data reveals significant improvement in classification accuracy over LSI. The results also compare favourably with naive Support Vector Machines. © Springer-Verlag Berlin Heidelberg 2006.
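
The augmentation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (sprinkle, lsi_term_vectors, cosine), the copies parameter, the toy matrix, and the choice of NumPy's SVD as the LSI step are all assumptions made for illustration.

```python
import numpy as np


def sprinkle(term_doc, labels, n_classes, copies=1):
    """Append artificial class-label terms (rows) to a terms x documents matrix.

    `copies` (an assumed parameter) is the number of label terms added per class.
    """
    n_docs = term_doc.shape[1]
    label_rows = np.zeros((n_classes * copies, n_docs))
    for doc, cls in enumerate(labels):
        # Each document receives `copies` extra terms that mark its class.
        label_rows[cls * copies:(cls + 1) * copies, doc] = 1.0
    return np.vstack([term_doc, label_rows])


def lsi_term_vectors(matrix, k):
    """Return one k-dimensional vector per row (term) from a rank-k truncated SVD."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy data (hypothetical): four documents, each containing one distinct term.
# Documents 0 and 1 belong to class 0; documents 2 and 3 belong to class 1.
term_doc = np.eye(4)
labels = [0, 0, 1, 1]

augmented = sprinkle(term_doc, labels, n_classes=2)
vectors = lsi_term_vectors(augmented, k=2)

print("same class:", cosine(vectors[0], vectors[1]))
print("different class:", cosine(vectors[0], vectors[2]))
```

In this toy setup, terms 0 and 1 never share a document, so their cosine similarity in the raw matrix is 0. After sprinkling and a rank-2 decomposition, the two same-class terms receive essentially identical vectors, while terms from different classes remain orthogonal. How unlabeled test documents are handled is outside the scope of this sketch.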

Cite

Chakraborti, S., Lothian, R., Wiratunga, N., & Watt, S. (2006). Sprinkling: Supervised latent semantic indexing. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3936 LNCS, pp. 510–514). Springer Verlag. https://doi.org/10.1007/11735106_53
