In this paper we present a comprehensive comparison of the performance of several text categorization methods on two different data sets. In particular, we evaluate the Vector and Latent Semantic Analysis (LSA) methods, a classifier based on Support Vector Machines (SVM), and the k-Nearest Neighbor (k-NN) variations of the Vector and LSA models. We report results using the Mean Reciprocal Rank (MRR), an evaluation measure commonly used for question answering tasks, as the measure of overall performance. We argue that this evaluation measure is also very well suited for text categorization tasks. Our results show that, overall, SVMs and k-NN LSA perform better than the other methods, and that the difference is statistically significant. © Springer-Verlag Berlin Heidelberg 2003.
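As an illustration of the evaluation measure named above, the following is a minimal sketch of how Mean Reciprocal Rank could be computed for a categorization task. It assumes that each document has exactly one correct category and that each classifier returns a ranked list of candidate categories; the function name `mean_reciprocal_rank`, the choice to score a missing category as 0, and the sample labels are illustrative assumptions, not details taken from the paper.

```python
def mean_reciprocal_rank(ranked_predictions, true_labels):
    """Average of 1/rank of the correct category over all documents.

    ranked_predictions: one list per document, categories ordered best first.
    true_labels: the single correct category for each document (assumption).
    A document whose correct category is absent from its ranking scores 0
    (an assumption of this sketch).
    """
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("need one ranking per true label")
    if not true_labels:
        return 0.0

    total = 0.0
    for ranking, truth in zip(ranked_predictions, true_labels):
        if truth in ranking:
            # list.index is 0-based, ranks are 1-based
            total += 1.0 / (ranking.index(truth) + 1)
    return total / len(true_labels)


# Hypothetical data: the correct category appears at ranks 1, 2 and 3.
rankings = [
    ["sports", "politics", "tech"],
    ["tech", "sports", "politics"],
    ["politics", "tech", "sports"],
]
truths = ["sports", "sports", "sports"]
mrr = mean_reciprocal_rank(rankings, truths)  # (1 + 1/2 + 1/3) / 3
```

Unlike plain accuracy, this measure gives partial credit when the correct category is ranked second or third, which is one plausible reading of why it suits classifiers that output ranked category lists.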
CITATION STYLE
Cardoso-Cachopo, A., & Oliveira, A. L. (2003). An empirical comparison of text categorization methods. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2857, 183–196. https://doi.org/10.1007/978-3-540-39984-1_14