Latent Semantic Indexing (LSI) is an established dimensionality reduction technique for Information Retrieval applications. However, LSI-generated dimensions are not optimal in a classification setting, since LSI fails to exploit the class labels of training documents. We propose an approach that uses class information to influence LSI dimensions: class labels of training documents are encoded as new terms, which are appended to the documents. When LSI is carried out on the augmented term-document matrix, terms pertaining to the same class are pulled closer to each other. Evaluation over experimental data reveals significant improvement in classification accuracy over LSI. The results also compare favourably with naive Support Vector Machines. © Springer-Verlag Berlin Heidelberg 2006.
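The augmentation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `sprinkle` and `lsi_term_vectors`, the number of label copies per class, the toy matrix, and the rank `k` are all assumptions made for illustration, and the abstract does not say how these are chosen. The sketch appends class-indicator rows to a terms-by-documents matrix and then runs a truncated SVD, which is the usual way LSI is computed.

```python
import numpy as np


def sprinkle(term_doc, labels, n_classes, copies=1):
    """Append `copies` class-indicator rows per class to a (terms x docs) matrix.

    Each added row acts as an artificial term that occurs in every
    training document of one class (hypothetical helper, assumed design).
    """
    labels = np.asarray(labels)
    n_docs = term_doc.shape[1]
    indicator = np.zeros((n_classes, n_docs))
    indicator[labels, np.arange(n_docs)] = 1.0
    # np.repeat duplicates each class row `copies` times, class by class.
    return np.vstack([term_doc, np.repeat(indicator, copies, axis=0)])


def lsi_term_vectors(matrix, k):
    """Rank-k LSI via SVD: return term coordinates U_k scaled by S_k."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]


# Toy data (assumed): 5 terms x 4 training documents, two classes.
term_doc = np.array([
    [1, 1, 0, 0],   # term seen only in class 0
    [1, 0, 0, 0],   # term seen only in class 0
    [0, 0, 1, 1],   # term seen only in class 1
    [0, 0, 0, 1],   # term seen only in class 1
    [1, 0, 1, 0],   # term shared by both classes
], dtype=float)
labels = np.array([0, 0, 1, 1])

augmented = sprinkle(term_doc, labels, n_classes=2, copies=2)

# Keep only the rows of the original terms; the sprinkled rows are
# dropped after they have influenced the decomposition.
term_vectors = lsi_term_vectors(augmented, k=2)[: term_doc.shape[0]]
```

Increasing `copies` gives the class labels more weight relative to the ordinary terms, so it is a natural parameter to tune on held-out data. How test documents, which have no labels, are handled is not covered by the abstract and is left out of this sketch.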
CITATION STYLE
Chakraborti, S., Lothian, R., Wiratunga, N., & Watt, S. (2006). Sprinkling: Supervised latent semantic indexing. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3936 LNCS, pp. 510–514). Springer Verlag. https://doi.org/10.1007/11735106_53