When text categorization is applied to complex tasks, hand-labeling the large amounts of training data needed for good performance is tedious and expensive. In this paper, we put forward an approach to text categorization that requires no labeled documents. The proposed approach automatically learns features using bootstrapping. The input consists of a small set of keywords per class and a large amount of easily obtained unlabeled documents. Using these automatically learned features, we develop a naïve Bayes classifier. The classifier achieves 82.8% F1 when classifying a set of web documents into 10 categories, outperforming a supervised naïve Bayes classifier when the number of features is small. © Springer-Verlag 2004.
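The abstract does not spell out the bootstrapping procedure, so the following is only a minimal sketch of the general idea under stated assumptions: seed keywords pseudo-label the unlabeled documents, words that occur only in one class's pseudo-labeled documents are promoted to new keywords, and a multinomial naïve Bayes classifier with add-one smoothing is then trained over the learned keywords. The function names, the stop-word list, the promotion rule, the `rounds` and `per_round` parameters, and the toy documents are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict

STOPWORDS = {"the", "a", "in", "and"}  # assumed, tiny illustrative list


def tokenize(text):
    return [t for t in text.lower().split() if t not in STOPWORDS]


def pseudo_label(docs, keywords):
    """Label a document with the class whose keywords it mentions most; skip ties."""
    labeled = []
    for doc in docs:
        tokens = tokenize(doc)
        scores = {c: sum(t in kws for t in tokens) for c, kws in keywords.items()}
        ranked = sorted(scores.values(), reverse=True)
        best = max(scores, key=scores.get)
        runner_up = ranked[1] if len(ranked) > 1 else 0
        if scores[best] > runner_up:
            labeled.append((tokens, best))
    return labeled


def bootstrap_features(docs, seeds, rounds=2, per_round=2):
    """Grow each class's keyword set from its pseudo-labeled documents (assumed rule)."""
    keywords = {c: set(kws) for c, kws in seeds.items()}
    for _ in range(rounds):
        doc_freq = defaultdict(Counter)
        for tokens, c in pseudo_label(docs, keywords):
            doc_freq[c].update(set(tokens))
        for c in keywords:
            others = set().union(*(doc_freq[o].keys() for o in keywords if o != c))
            known = set().union(*keywords.values())
            # Candidates: words seen only in class c's documents, not yet keywords.
            candidates = [(n, w) for w, n in doc_freq[c].items()
                          if w not in others and w not in known]
            candidates.sort(key=lambda p: (-p[0], p[1]))
            keywords[c].update(w for _, w in candidates[:per_round])
    return keywords


def train_naive_bayes(labeled, vocabulary):
    """Multinomial naive Bayes with add-one smoothing, restricted to `vocabulary`."""
    class_docs = Counter(c for _, c in labeled)
    word_counts = defaultdict(Counter)
    for tokens, c in labeled:
        word_counts[c].update(t for t in tokens if t in vocabulary)
    total_docs = sum(class_docs.values())
    model = {}
    for c in class_docs:
        denom = sum(word_counts[c].values()) + len(vocabulary)
        log_likelihood = {w: math.log((word_counts[c][w] + 1) / denom)
                          for w in vocabulary}
        model[c] = (math.log(class_docs[c] / total_docs), log_likelihood)
    return model


def classify(model, text):
    tokens = tokenize(text)

    def score(c):
        prior, log_likelihood = model[c]
        return prior + sum(log_likelihood[t] for t in tokens if t in log_likelihood)

    return max(model, key=score)


# Toy data, invented for illustration only.
docs = [
    "the team won the football match",
    "the striker scored a goal in the match",
    "the coach praised the striker",
    "the bank raised interest rates",
    "investors fear higher interest rates and inflation",
    "inflation worries investors",
]
seeds = {"sports": ["football"], "finance": ["bank"]}

keywords = bootstrap_features(docs, seeds)
vocabulary = set().union(*keywords.values())
model = train_naive_bayes(pseudo_label(docs, keywords), vocabulary)
print(classify(model, "the striker scored a goal"))
```

Restricting the classifier's vocabulary to the bootstrapped keywords mirrors the abstract's emphasis on a small number of features; a real system would need a more careful promotion criterion than the exclusivity rule assumed here.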
CITATION STYLE
Chen, W., Zhu, J., Wu, H., & Yao, T. (2004). Automatic learning features using bootstrapping for text categorization. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2945, 571–579. https://doi.org/10.1007/978-3-540-24630-5_70