Importance-based web page classification using cost-sensitive SVM

Wei Liu; Gui Rong Xue; Yong Yu; Hua Jun Zeng

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2005) 3739 LNCS 127-137

DOI: 10.1007/11563952_12

6 Citations

6 Readers

Abstract

Web page classification faces great challenges because of the sheer size and diversity of the web's information repository. Web pages vary in both content and quality, as PageRank suggests. Typical machine learning algorithms train a classifier from positive and negative examples; however, they neglect that each instance can carry a different weight, which may be predefined by the user. This paper presents an effective algorithm based on the Cost-Sensitive Support Vector Machine (CS-SVM) to improve classification accuracy. During CS-SVM training, different cost factors are attached to the training errors to generate an optimized hyperplane. Our experiments show that CS-SVM outperforms SVM on the standard ODP data set. Web pages with relatively high PageRank values contribute most to the classifier, and training on them can outperform random sampling. © Springer-Verlag Berlin Heidelberg 2005.
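
To make the idea of per-instance cost factors concrete, here is a minimal sketch of a linear SVM in which each training error is scaled by its own cost. It is not the authors' implementation. It assumes a linear kernel, plain subgradient descent on the cost-weighted hinge loss, and toy two-dimensional data. The `importance` values are hypothetical stand-ins for PageRank-derived scores, and using the raw score directly as the cost is also an assumption.

```python
import random


def train_cost_sensitive_svm(X, y, costs, lam=0.01, lr=0.1, epochs=200, seed=0):
    """Subgradient descent on
        lam/2 * ||w||^2 + sum_i costs[i] * max(0, 1 - y[i] * (w . X[i] + b)).
    Labels in y must be +1 or -1; costs must be non-negative.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            # Gradient step on the regularizer shrinks w slightly.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if y[i] * score < 1.0:
                # Margin violation: the step is scaled by this instance's cost,
                # so high-cost (important) pages pull the hyperplane harder.
                w = [wj + lr * costs[i] * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * costs[i] * y[i]
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy feature vectors (hypothetical), labels, and importance scores.
X = [[2.0, 2.0], [3.0, 1.0], [2.5, 3.0], [-2.0, -1.0], [-1.0, -2.5], [-3.0, -2.0]]
y = [1, 1, 1, -1, -1, -1]
importance = [0.9, 0.5, 0.2, 0.8, 0.3, 0.1]  # assumed PageRank-like values

w, b = train_cost_sensitive_svm(X, y, costs=importance)
print([predict(w, b, x) for x in X])
```

Setting every cost to the same constant reduces this sketch to an ordinary soft-margin linear SVM, which gives a baseline for comparing the weighted and unweighted classifiers.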

Citation (APA)

Liu, W., Xue, G. R., Yu, Y., & Zeng, H. J. (2005). Importance-based web page classification using cost-sensitive SVM. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3739 LNCS, pp. 127–137). Springer Verlag. https://doi.org/10.1007/11563952_12
