Abstract
In this paper, we address the problem of handling large collections of data and propose a text classification method that combines two well-known machine learning techniques, Naive Bayes (NB) and Support Vector Machines (SVMs). NB assumes that the words in a text are independent of one another, which makes it far more efficient to compute. SVMs, on the other hand, can handle large feature spaces, which makes better classification performance possible. The training data for the SVMs are extracted using NB classifiers according to the category hierarchies, which reduces the amount of computation needed for classification without sacrificing accuracy.
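To illustrate why the word-independence assumption keeps NB cheap, the following is a minimal sketch of a multinomial Naive Bayes classifier in plain Python. It is not the authors' implementation: the function names (`train_nb`, `classify_nb`), the toy documents, and the use of add-one (Laplace) smoothing are all assumptions made for illustration. Under the independence assumption, the score of a category is simply the log prior plus a sum of per-word log probabilities, so classification is linear in the length of the text.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """Estimate NB parameters from a list of (tokens, label) pairs.

    Returns (log_prior, log_likelihood, vocab). Add-one smoothing is an
    assumption of this sketch, not something stated in the abstract.
    """
    label_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        word_counts[label].update(tokens)
        vocab.update(tokens)

    total_docs = len(docs)
    log_prior = {c: math.log(n / total_docs) for c, n in label_counts.items()}

    log_likelihood = {}
    for c in label_counts:
        # Denominator: all word tokens seen in category c, plus one per vocab word.
        denom = sum(word_counts[c].values()) + len(vocab)
        log_likelihood[c] = {
            w: math.log((word_counts[c][w] + 1) / denom) for w in vocab
        }
    return log_prior, log_likelihood, vocab


def classify_nb(tokens, log_prior, log_likelihood, vocab):
    """Return the category with the highest log posterior score.

    Word independence means the score is a plain sum over the words;
    words never seen in training are skipped.
    """
    scores = {}
    for c in log_prior:
        scores[c] = log_prior[c] + sum(
            log_likelihood[c][w] for w in tokens if w in vocab
        )
    return max(scores, key=scores.get)


# Hypothetical toy corpus, for illustration only.
TOY_DOCS = [
    ("stock market shares rise".split(), "finance"),
    ("bank shares fall".split(), "finance"),
    ("team wins match".split(), "sports"),
    ("player scores goal match".split(), "sports"),
]

model = train_nb(TOY_DOCS)
predicted = classify_nb("shares rise".split(), *model)
```

In the approach the abstract describes, classifiers of this kind would be used to extract training data for the SVMs according to the category hierarchies; that selection step is not sketched here because the abstract does not specify it.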
Cite
Fukumoto, F., & Suzuki, Y. (2002). Manipulating Large Corpora for Text Classification. In Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing, EMNLP 2002 (pp. 196–203). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1118693.1118719