Knowledge supervised text classification with no labeled documents

Congle Zhang; Gui Rong Xue; Yong Yu

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2008) 5351 LNAI 509-520

DOI: 10.1007/978-3-540-89197-0_47

Abstract

In traditional text classification approaches, the semantic meaning of each class is described by labeled documents. Because labeling documents is often time-consuming and expensive, a promising alternative is to ask users to provide a few keywords that describe each class instead of labeling any documents. However, a short list of keywords may not contain enough information and may therefore lead to an unreliable classifier. Fortunately, a large amount of public data is readily available in web directories such as ODP and Wikipedia, among others. We are interested in exploiting the enormous crowd intelligence contained in such public data to enhance text classification. In this paper, we propose a novel text classification framework called "Knowledge Supervised Learning" (KSL), which uses the knowledge in keywords together with crowd intelligence to learn a classifier without any labeled documents. We design a two-stage risk minimization (TSRM) approach for the KSL problem. It can optimize the expected prediction risk and build a high-quality classifier. Empirical results verify our claim: our algorithm achieves a Micro-F1 above 0.9 on average, which is much better than the baselines and even comparable to an SVM classifier trained on labeled documents. © 2008 Springer Berlin Heidelberg.
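The abstract reports its results as Micro-F1. For readers unfamiliar with the metric, the following is a minimal sketch of how micro-averaged F1 is conventionally computed for single-label classification. It is not taken from the paper: the function name `micro_f1` and the toy labels are hypothetical, and the abstract does not describe the authors' exact evaluation protocol.

```python
from collections import Counter


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over all classes, then compute F1 once."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    tp_sum = sum(tp.values())
    fp_sum = sum(fp.values())
    fn_sum = sum(fn.values())

    denom = 2 * tp_sum + fp_sum + fn_sum
    return 2 * tp_sum / denom if denom else 0.0


# Toy example with made-up labels (illustration only, not data from the paper).
y_true = ["sports", "sports", "politics", "tech"]
y_pred = ["sports", "politics", "politics", "tech"]
score = micro_f1(y_true, y_pred)
print(f"Micro-F1 = {score:.2f}")
```

When every document has exactly one true label and one predicted label, as in this sketch, each error counts once as a false positive and once as a false negative, so Micro-F1 coincides with plain accuracy.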

Cite

Zhang, C., Xue, G. R., & Yu, Y. (2008). Knowledge supervised text classification with no labeled documents. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5351 LNAI, pp. 509–520). https://doi.org/10.1007/978-3-540-89197-0_47
