A new PU learning algorithm for text classification

Hailong Yu; Wanli Zuo; Tao Peng

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2005) 3789 LNAI 824-832

DOI: 10.1007/11579427_84

Abstract

This paper studies the problem of building text classifiers from positive and unlabeled examples. The primary challenge, compared with the classical text classification problem, is that no labeled negative documents are available in the training set. We call this problem PU-oriented text classification. Our text classifier adopts the traditional two-step approach, making use of both positive and unlabeled examples. In the first step, we improve the 1-DNF algorithm so that it identifies many more reliable negative documents with a very low error rate. In the second step, we build a set of classifiers by iteratively applying the SVM algorithm to a training set that is augmented during the iterations. Unlike previous work on PU-oriented text classification, we construct the final classifier from the weighted vote of all classifiers generated during the iterations, rather than choosing one of them as the final classifier. Experimental results on the Reuters data set show that our method increases classifier performance (F1-measure) by 1.734 percent compared with PEBL. © Springer-Verlag Berlin Heidelberg 2005.
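
The abstract does not spell out how the authors improve 1-DNF or how the vote weights are chosen, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes the baseline 1-DNF heuristic as commonly described for PEBL (a word is a "positive feature" if it appears in a larger fraction of positive documents than of unlabeled ones, and an unlabeled document with no positive feature is taken as a reliable negative) and a generic weighted vote over classifiers that return +1 or -1. The function names, toy documents, stand-in classifiers, and weights are all hypothetical.

```python
from collections import Counter


def positive_features(positive_docs, unlabeled_docs):
    """Words whose document-frequency fraction is higher in P than in U."""
    pos_df = Counter(w for doc in positive_docs for w in set(doc))
    unl_df = Counter(w for doc in unlabeled_docs for w in set(doc))
    n_pos, n_unl = len(positive_docs), len(unlabeled_docs)
    return {w for w, c in pos_df.items() if c / n_pos > unl_df[w] / n_unl}


def reliable_negatives(positive_docs, unlabeled_docs):
    """Unlabeled documents that contain no positive feature (baseline 1-DNF)."""
    pf = positive_features(positive_docs, unlabeled_docs)
    return [doc for doc in unlabeled_docs if not pf & set(doc)]


def weighted_vote(classifiers, weights, doc):
    """Combine +1/-1 predictions by a weighted sum; a non-positive sum is negative."""
    score = sum(w * clf(doc) for clf, w in zip(classifiers, weights))
    return 1 if score > 0 else -1


# Toy tokenized documents (hypothetical data).
P = [["stock", "market", "profit"], ["market", "shares"]]
U = [["market", "profit", "rise"], ["football", "match"], ["rain", "weather", "football"]]

negatives = reliable_negatives(P, U)

# Stand-ins for the classifiers produced in successive iterations; in the
# paper these are SVMs, and the weights here are arbitrary assumptions.
classifiers = [
    lambda d: 1 if "market" in d else -1,
    lambda d: 1 if "profit" in d else -1,
    lambda d: -1,
]
weights = [0.6, 0.3, 0.1]

label_a = weighted_vote(classifiers, weights, ["market", "rise"])
label_b = weighted_vote(classifiers, weights, ["football"])
```

On this toy data the first unlabeled document shares "market" and "profit" with the positive set and is therefore not selected, while the two sports and weather documents are returned as reliable negatives. In a full two-step pipeline, these negatives and the positive set would seed the iterative SVM training described in the abstract.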

Cite

Yu, H., Zuo, W., & Peng, T. (2005). A new PU learning algorithm for text classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3789 LNAI, pp. 824–832). Springer Verlag. https://doi.org/10.1007/11579427_84
