A novel web page categorization algorithm based on block propagation using query-log information


Abstract

Most existing web page classification algorithms, including content-based, link-based, and query-log analysis methods, treat pages as the smallest units. However, web pages usually contain noisy or biased information that can degrade classification performance. In this paper, we propose a Block Propagation Categorization (BPC) algorithm, which mines web structure in depth and treats blocks as the basic semantic units. Moreover, using query-log information, BPC propagates only suitable information (blocks) among web pages to emphasize their topics. We also optimize the BPC algorithm to significantly speed up the block propagation process without losing any precision. Our experiments on ODP and an MSN search engine log show that BPC achieves a substantial improvement over traditional approaches. © Springer-Verlag Berlin Heidelberg 2006.
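The abstract does not give the details of block propagation, so the following is only a minimal sketch of the general idea, not the authors' BPC algorithm. It assumes that pages are already segmented into text blocks, that the query log is a list of (query, clicked page) pairs, and that a block is "suitable" when it shares at least one term with the query. The names `tokenize` and `propagate_blocks` and the selection rule are all hypothetical.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; any tokenizer would do)."""
    return text.lower().split()


def propagate_blocks(pages, click_log):
    """Build bag-of-words features per page, enriched with propagated blocks.

    pages:     dict mapping page id -> list of block strings
    click_log: iterable of (query, page id) pairs
    Hypothetical rule: a block is propagated to the other pages clicked for
    the same query only if it shares at least one term with that query.
    """
    # Group clicked pages by query, ignoring pages we have no content for.
    pages_by_query = defaultdict(set)
    for query, page_id in click_log:
        if page_id in pages:
            pages_by_query[query].add(page_id)

    # Start from each page's own blocks.
    features = {
        pid: Counter(t for block in blocks for t in tokenize(block))
        for pid, blocks in pages.items()
    }

    seen = set()  # (target, source, block index): propagate each block once
    for query, clicked in pages_by_query.items():
        query_terms = set(tokenize(query))
        for source in clicked:
            for idx, block in enumerate(pages[source]):
                block_terms = tokenize(block)
                if not query_terms & set(block_terms):
                    continue  # block is unrelated to the query, keep it local
                for target in clicked:
                    key = (target, source, idx)
                    if target == source or key in seen:
                        continue
                    seen.add(key)
                    features[target].update(block_terms)
    return features


# Toy data, invented for illustration only.
pages = {
    "a": ["python tutorial for beginners", "buy cheap watches"],
    "b": ["advanced python tips", "site navigation home contact"],
}
click_log = [("python", "a"), ("python", "b")]
features = propagate_blocks(pages, click_log)
```

In this toy run, page "a" receives the topical block from page "b" but not its navigation block, so the shared topic term is reinforced while the noisy block stays local. The resulting term counts could then be passed to any standard text classifier.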

Citation (APA)

Dai, W., Yu, Y., Zhang, C. L., Han, J., & Xue, G. R. (2006). A novel web page categorization algorithm based on block propagation using query-log information. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4016 LNCS, pp. 435–446). Springer Verlag. https://doi.org/10.1007/11775300_37
