An empirical comparison of text categorization methods

Abstract

In this paper we present a comprehensive comparison of the performance of a number of text categorization methods on two different data sets. In particular, we evaluate the Vector and Latent Semantic Analysis (LSA) methods, a classifier based on Support Vector Machines (SVM), and the k-Nearest Neighbor (k-NN) variations of the Vector and LSA models. We report results using the Mean Reciprocal Rank (MRR), an evaluation measure commonly used for question answering tasks, as the measure of overall performance. We argue that this measure is also very well suited to text categorization tasks. Our results show that, overall, SVMs and k-NN LSA perform better than the other methods, and that the difference is statistically significant. © Springer-Verlag Berlin Heidelberg 2003.
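As an illustration of the evaluation measure named in the abstract, the following is a minimal sketch of how Mean Reciprocal Rank could be computed for a categorization task. It assumes each document has exactly one correct category and that each classifier returns a ranked list of candidate categories; a document whose correct category is missing from its ranking contributes 0. The function name, the category labels, and the sample rankings are hypothetical and are not taken from the paper.

```python
def mean_reciprocal_rank(ranked_predictions, true_labels):
    """Average of 1/rank of the correct category over all documents.

    ranked_predictions: list of lists, best-ranked category first.
    true_labels: list with the single correct category per document.
    A document whose correct category is not in its ranking scores 0.
    """
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("need one ranking per true label")
    if not true_labels:
        raise ValueError("need at least one document")

    total = 0.0
    for ranking, truth in zip(ranked_predictions, true_labels):
        if truth in ranking:
            # list.index is 0-based, ranks are 1-based
            total += 1.0 / (ranking.index(truth) + 1)
    return total / len(true_labels)


# Hypothetical example: three documents whose correct category is "earn",
# ranked first, second, and third respectively.
rankings = [
    ["earn", "acq", "crude"],
    ["acq", "earn", "crude"],
    ["crude", "acq", "earn"],
]
truths = ["earn", "earn", "earn"]
mrr = mean_reciprocal_rank(rankings, truths)  # (1 + 1/2 + 1/3) / 3
```

Unlike plain accuracy, which only rewards a correct top-ranked category, this measure also gives partial credit when the correct category appears lower in the ranking.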

Cardoso-Cachopo, A., & Oliveira, A. L. (2003). An empirical comparison of text categorization methods. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2857, 183–196. https://doi.org/10.1007/978-3-540-39984-1_14
