Is Unlabeled Data Suitable for Multiclass SVM-based Web Page Classification?

Abstract

Support Vector Machines (SVMs) are an effective approach to automated classification tasks. Although an SVM natively handles only binary, supervised problems, several works have extended it to multiclass and semi-supervised settings. A previous study of supervised and semi-supervised SVM classification over binary taxonomies showed that the latter clearly outperforms the former, demonstrating the suitability of unlabeled data for the learning phase in this kind of task. However, the suitability of unlabeled data for multiclass tasks using SVMs has never been tested. In this work, we study whether unlabeled data can improve results for multiclass web page classification using SVMs. We conclude by encouraging reliance on labeled data alone, both to improve (or at least equal) performance and to reduce computational cost.
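The abstract notes that an SVM is binary by nature and must be adapted for multiclass use. The sketch below illustrates one common adaptation, one-vs-rest, in which a separate binary SVM is trained per class and the highest-scoring class wins. It is not taken from the paper: the abstract does not say which multiclass strategy the authors used, and the function names, the toy 2-D features, and the class labels (`news`, `sports`, `tech`) are all assumptions made for illustration. The binary learner is a plain linear SVM fitted by subgradient descent on the hinge loss, using only labeled data.

```python
def train_binary_svm(X, y, epochs=500, lr=0.01, lam=0.001):
    """Fit a linear SVM (hinge loss + L2 penalty) by subgradient descent.

    X is a list of feature vectors, y a list of +1 / -1 labels.
    All hyperparameter defaults are illustrative assumptions.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # The L2 penalty shrinks the weights slightly on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:
                # Hinge loss is active: push the boundary away from this example.
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def train_one_vs_rest(X, labels):
    """Train one binary SVM per class: that class (+1) against all others (-1)."""
    models = {}
    for cls in sorted(set(labels)):
        y = [1 if lab == cls else -1 for lab in labels]
        models[cls] = train_binary_svm(X, y)
    return models


def predict(models, x):
    """Return the class whose binary SVM gives the largest decision value."""
    def score(cls):
        w, b = models[cls]
        return sum(wj * xj for wj, xj in zip(w, x)) + b
    return max(models, key=score)


# Hypothetical labeled "pages", each reduced to two made-up features.
X = [
    (0, 0), (1, 0), (0, 1), (1, 1),   # news
    (5, 0), (6, 0), (5, 1), (6, 1),   # sports
    (0, 5), (1, 5), (0, 6), (1, 6),   # tech
]
labels = ["news"] * 4 + ["sports"] * 4 + ["tech"] * 4

models = train_one_vs_rest(X, labels)
print([predict(models, p) for p in [(0.5, 0.5), (5.5, 0.5), (0.5, 5.5)]])
```

A semi-supervised variant would additionally feed unlabeled pages into training. That is the setting whose benefit the study questions for multiclass tasks.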

Cite

Zubiaga, A., Fresno, V., & Martínez, R. (2009). Is Unlabeled Data Suitable for Multiclass SVM-based Web Page Classification? In NAACL HLT 2009 - Semi-Supervised Learning for Natural Language Processing, Proceedings of the Workshop (pp. 28–36). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1621829.1621833
