We examine how semantic differences between categories affect classification effectiveness. Specifically, we measure the broadness or narrowness of a category by its distance to the root of a hierarchically organized thesaurus. Using categories at four different levels of broadness, we show that classifying documents into narrow categories gives better scores than classifying them into broad categories, which we attribute to the fact that more specific categories are associated with terms that have higher discriminatory power. © Springer-Verlag Berlin Heidelberg 2006.
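The broadness measure described above, distance to the root of a hierarchical thesaurus, can be illustrated with a minimal sketch. The thesaurus below, its category names, and the `depth` helper are all hypothetical and invented for illustration; they are not taken from the paper, which does not specify its implementation here.

```python
from typing import Dict, Optional

# Hypothetical toy thesaurus, stored as child -> parent.
# The root has no parent (None). These names are illustrative only.
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "diseases": "root",
    "infections": "diseases",
    "bacterial infections": "infections",
    "tuberculosis": "bacterial infections",
}


def depth(category: str, parent: Dict[str, Optional[str]] = PARENT) -> int:
    """Return the number of edges between a category and the root.

    A small depth means a broad category; a large depth means a narrow one.
    Assumes the hierarchy is a tree (no cycles, a single root).
    """
    steps = 0
    node = category
    while parent[node] is not None:
        node = parent[node]
        steps += 1
    return steps


if __name__ == "__main__":
    # List the categories from broadest to narrowest.
    for name in sorted(PARENT, key=depth):
        print(f"{depth(name)}  {name}")
```

Under this assumed representation, categories could be grouped by depth and a classifier evaluated separately per group, which is one way to compare broad and narrow categories.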
CITATION STYLE
Bouma, L., & De Rijke, M. (2006). Specificity helps text classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3936 LNCS, pp. 539–542). Springer Verlag. https://doi.org/10.1007/11735106_60