This paper addresses the problem of topic distillation on the World Wide Web: given a typical user query, find quality documents related to the query topic. Connectivity analysis has been shown to be useful in identifying high-quality pages within a topic-specific graph of hyperlinked documents. The essence of our approach is to augment a previous connectivity-analysis-based algorithm with content analysis. We identify three problems with the existing approach and devise algorithms to tackle them. We report the results of a user evaluation, which show an improvement in precision at 10 documents of at least 45% over pure connectivity analysis.
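To make "connectivity analysis" concrete, here is a minimal sketch of link-based scoring on a small graph. The abstract does not spell out the prior algorithm, so the hub and authority formulation is an assumption, in the style of mutually reinforcing scores. The function names and the toy graph are hypothetical, and the content-analysis extensions the paper proposes are not shown.

```python
from math import sqrt


def normalize(scores):
    """Scale scores to unit Euclidean length; leave an all-zero vector as is."""
    norm = sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {page: v / norm for page, v in scores.items()}


def hubs_and_authorities(links, iterations=50):
    """Assumed baseline: iterate mutually reinforcing hub/authority scores.

    links maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hub = {page: 1.0 for page in pages}
    auth = {page: 1.0 for page in pages}

    for _ in range(iterations):
        # A page's authority is the sum of hub scores of pages linking to it.
        new_auth = {page: 0.0 for page in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]

        # A page's hub score is the sum of authority scores of pages it links to.
        new_hub = {
            page: sum(new_auth[dst] for dst in links.get(page, []))
            for page in pages
        }

        auth = normalize(new_auth)
        hub = normalize(new_hub)

    return hub, auth


# Hypothetical toy graph: "a" and "b" are link lists, "c" and "d" are targets.
example_links = {"a": ["c", "d"], "b": ["c"], "c": [], "d": []}
hub, auth = hubs_and_authorities(example_links)
best_authority = max(auth, key=auth.get)
best_hub = max(hub, key=hub.get)
```

In this toy graph, the page with the most incoming links from good hubs ends up as the top authority, and the page pointing at the most authoritative targets ends up as the top hub. Scoring of this kind uses link structure only, which is the gap content analysis is meant to fill.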
Bharat, K., & Henzinger, M. R. (1998). Improved Algorithms for Topic Distillation in a Hyperlinked Environment. In SIGIR 1998 - Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 104–111). Association for Computing Machinery, Inc. https://doi.org/10.1145/290941.290972