Automatic learning features using bootstrapping for text categorization

Wenliang Chen; Jingbo Zhu; Honglin Wu; Tianshun Yao

Journal Article

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2004) 2945 571-579

DOI: 10.1007/978-3-540-24630-5_70

2 Citations

7 Readers

Abstract

When text categorization is applied to complex tasks, hand-labeling the large amount of training data needed for good performance is tedious and expensive. In this paper, we put forward an approach to text categorization that requires no labeled documents. The proposed approach learns features automatically by bootstrapping. The input consists of a small set of keywords per class and a large number of easily obtained unlabeled documents. Using these automatically learned features, we build a naïve Bayes classifier. The classifier achieves an F1 of 82.8% when classifying a set of web documents into 10 categories, outperforming a supervised naïve Bayes classifier when the number of features is small. © Springer-Verlag 2004.
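The abstract does not spell out how features are scored or how many are added per round, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a simple loop: pseudo-label unlabeled documents by seed-keyword matches, promote the word most concentrated in each class to a feature, repeat, and then train a multinomial naïve Bayes classifier on the pseudo-labeled documents. The toy corpus, the concentration score, and the parameters `rounds`, `grow`, and `min_count` are all illustrative assumptions.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


def pseudo_label(docs, features):
    """Label each document with the class whose features it matches most.

    Documents with no match, or with a tie for first place, are skipped.
    """
    labeled = []
    for tokens in docs:
        scores = {c: sum(tokens.count(w) for w in ws) for c, ws in features.items()}
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        best, best_score = ranked[0]
        if best_score > 0 and (len(ranked) == 1 or best_score > ranked[1][1]):
            labeled.append((tokens, best))
    return labeled


def bootstrap_features(docs, seeds, rounds=3, grow=1, min_count=2):
    """Grow per-class feature sets from seed keywords (assumed scoring rule)."""
    features = {c: set(ws) for c, ws in seeds.items()}
    for _ in range(rounds):
        labeled = pseudo_label(docs, features)
        per_class, total = defaultdict(Counter), Counter()
        for tokens, c in labeled:
            per_class[c].update(tokens)
            total.update(tokens)
        known = set().union(*features.values())
        for c in features:
            # Assumed score: share of a word's occurrences that fall in class c,
            # with the in-class count and the word itself as tie-breakers.
            cands = sorted(
                ((n / total[w], n, w) for w, n in per_class[c].items()
                 if w not in known and n >= min_count),
                reverse=True,
            )
            new_words = [w for _, _, w in cands[:grow]]
            features[c].update(new_words)
            known.update(new_words)
    return features


def train_nb(labeled, vocab):
    """Multinomial naive Bayes with add-one smoothing over the learned features."""
    class_docs = Counter(c for _, c in labeled)
    counts = defaultdict(Counter)
    for tokens, c in labeled:
        counts[c].update(t for t in tokens if t in vocab)
    model = {}
    for c, n_docs in class_docs.items():
        denom = sum(counts[c].values()) + len(vocab)
        model[c] = (
            math.log(n_docs / len(labeled)),
            {w: math.log((counts[c][w] + 1) / denom) for w in vocab},
        )
    return model


def classify(model, text):
    tokens = tokenize(text)

    def score(c):
        prior, loglik = model[c]
        return prior + sum(loglik[t] for t in tokens if t in loglik)

    return max(model, key=score)


# Toy unlabeled corpus and seed keywords (invented for illustration).
raw = [
    "football match ended with a late goal by the striker",
    "football fans cheered the striker after the match",
    "the striker scored one goal in the match",
    "the bank raised interest rates on loans",
    "bank investors worry about interest rates",
    "investors watch interest rates and the stock market",
]
seeds = {"sports": ["football"], "finance": ["bank"]}

docs = [tokenize(t) for t in raw]
features = bootstrap_features(docs, seeds)
vocab = set().union(*features.values())
model = train_nb(pseudo_label(docs, features), vocab)
print(features)
print(classify(model, "the striker scored a goal"))
```

In this sketch the third and sixth documents contain no seed keyword and are only picked up after the first round adds new features, which is the effect bootstrapping is meant to produce. A realistic setup would need a much larger corpus, stop-word handling, and a more careful feature-scoring function.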

Cite

Chen, W., Zhu, J., Wu, H., & Yao, T. (2004). Automatic learning features using bootstrapping for text categorization. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2945, 571–579. https://doi.org/10.1007/978-3-540-24630-5_70
