This paper describes a novel method for webpage classification that exploits unlabeled data. The proposed method learns classifiers sequentially: they are first trained on a small amount of labeled data and then augmented with a large amount of unlabeled data. By taking advantage of unlabeled data, the method significantly reduces the amount of labeled data needed and increases classification accuracy. Using unlabeled data matters because obtaining labeled data, especially in a Web environment, is difficult and time-consuming. Experiments on two standard datasets show substantial improvements over the method that does not use unlabeled data. © Springer-Verlag 2003.
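The abstract does not spell out the learning procedure, so the following is only a minimal sketch of one common way to augment a classifier with unlabeled data: a generic self-training loop. It is not the authors' algorithm. The nearest-centroid classifier, the `min_margin` confidence threshold, and all function names are assumptions chosen for illustration, and pages are assumed to be already converted to numeric feature vectors.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
Model = Dict[str, List[float]]  # class label -> centroid (hypothetical model type)


def centroid(points: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit(X: List[Vector], y: List[str]) -> Model:
    """Nearest-centroid stand-in classifier: one mean vector per class."""
    return {
        c: centroid([x for x, label in zip(X, y) if label == c])
        for c in sorted(set(y))
    }


def predict_with_margin(model: Model, x: Vector) -> Tuple[str, float]:
    """Return the closest class and the distance gap to the runner-up class."""
    ranked = sorted((sq_dist(x, c), label) for label, c in model.items())
    best_dist, best_label = ranked[0]
    margin = ranked[1][0] - best_dist if len(ranked) > 1 else float("inf")
    return best_label, margin


def self_train(
    X_lab: List[Vector],
    y_lab: List[str],
    X_unlab: List[Vector],
    min_margin: float = 1.0,  # assumed confidence threshold
    max_rounds: int = 10,
) -> Model:
    """Train on labeled data, then repeatedly absorb confident unlabeled points."""
    X, y = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    model = fit(X, y)
    for _ in range(max_rounds):
        confident, rest = [], []
        for x in pool:
            label, margin = predict_with_margin(model, x)
            (confident if margin >= min_margin else rest).append((x, label))
        if not confident:
            break  # nothing left that the current model is sure about
        for x, label in confident:
            X.append(x)
            y.append(label)
        pool = [x for x, _ in rest]
        model = fit(X, y)  # retrain on the enlarged training set
    return model
```

Calling `self_train` with a handful of labeled vectors and a larger unlabeled pool returns a model usable with `predict_with_margin`. The threshold controls the trade-off: a low `min_margin` absorbs more unlabeled pages per round but risks reinforcing early mistakes.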
Park, S. B., & Zhang, B. T. (2004). Automatic webpage classification enhanced by unlabeled data. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2690, 821–825. https://doi.org/10.1007/978-3-540-45080-1_113