We introduce two novel methods of text categorization in which documents are split into fragments. We conducted experiments on English, French, and Czech texts. In all cases, the task was binary document classification. We find that both methods increase the accuracy of text categorization. For the Naïve Bayes classifier, this increase is significant.
Blaták, J., Mráková, E., & Popelínský, L. (2004). Fragments and text categorization. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 2004-July). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1219044.1219078