Automatic webpage classification enhanced by unlabeled data

Seong Bae Park; Byoung Tak Zhang

Journal Article

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2004) 2690 821-825

DOI: 10.1007/978-3-540-45080-1_113

Abstract

This paper describes a novel method for webpage classification that uses unlabeled data. The proposed method is based on sequential learning of classifiers that are first trained on a small number of labeled examples and then augmented with a large number of unlabeled examples. By taking advantage of unlabeled data, the number of labeled examples needed is significantly reduced and the classification accuracy is increased. The use of unlabeled data is important because obtaining labeled data, especially in a Web environment, is difficult and time-consuming. Experiments on two standard datasets show substantial improvements over the method that does not use unlabeled data. © Springer-Verlag 2003.
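
The abstract gives only the outline of the approach, so the following is a generic self-training sketch of the idea it names (train on a few labeled pages, then fold in confidently classified unlabeled pages and retrain), not the authors' algorithm. The word-frequency scorer, the `threshold` and `max_rounds` parameters, the function names, and the toy texts are all assumptions made for illustration.

```python
from collections import Counter


def tokenize(text):
    return text.lower().split()


def train(labeled):
    """Build per-class word counts from (text, label) pairs."""
    model = {}
    for text, label in labeled:
        model.setdefault(label, Counter()).update(tokenize(text))
    return model


def predict(model, text):
    """Return (label, confidence); (None, 0.0) if no word is known."""
    words = tokenize(text)
    scores = {}
    for label, counts in model.items():
        total = sum(counts.values())
        # Sum of the relative frequencies of the text's words in this class.
        scores[label] = sum(counts[w] for w in words) / total
    norm = sum(scores.values())
    if norm == 0:
        return None, 0.0
    best = max(scores, key=scores.get)
    return best, scores[best] / norm


def self_train(labeled, unlabeled, threshold=0.8, max_rounds=5):
    """Hypothetical self-training loop: retrain after absorbing
    unlabeled texts whose predicted label is confident enough."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = train(labeled)
        confident, remaining = [], []
        for text in pool:
            label, conf = predict(model, text)
            if conf >= threshold:
                confident.append((text, label))
            else:
                remaining.append(text)
        if not confident:
            break  # nothing new to learn from
        labeled.extend(confident)
        pool = remaining
    return train(labeled)


# Toy data (made up for this sketch, not from the paper's datasets).
labeled = [("football match score", "sports"),
           ("stock market price", "finance")]
unlabeled = ["football score tonight", "market price falls", "tonight falls"]

baseline = train(labeled)
model = self_train(labeled, unlabeled)
```

In this toy run the word "tonight" never appears in the labeled set, so `baseline` cannot classify it, while `model` has picked it up from an unlabeled text that was confidently assigned to "sports". Real systems would use a proper probabilistic classifier and held-out evaluation in place of this scorer.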

Cite

Park, S. B., & Zhang, B. T. (2004). Automatic webpage classification enhanced by unlabeled data. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2690, 821–825. https://doi.org/10.1007/978-3-540-45080-1_113
