Agglomerative clustering of a search engine query log

Doug Beeferman; Adam Berger

Conference Proceedings

Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2000) 407–416

DOI: 10.1145/347090.347176

602 Citations

210 Readers

Abstract

This paper introduces a technique for mining a collection of user transactions with an Internet search engine to discover clusters of similar queries and similar URLs. The information we exploit is "clickthrough data": each record consists of a user's query to a search engine along with the URL that the user selected from among the candidates offered by the search engine. By viewing this dataset as a bipartite graph, with the vertices on one side corresponding to queries and those on the other side to URLs, one can apply an agglomerative clustering algorithm to the graph's vertices to identify related queries and URLs. One noteworthy feature of the proposed algorithm is that it is "content-ignorant": the algorithm makes no use of the actual content of the queries or URLs, only of how they co-occur within the clickthrough data. We describe how to enlist the discovered clusters to assist users in web search, and measure the effectiveness of the discovered clusters in the Lycos search engine.
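The idea in the abstract can be illustrated with a small sketch. The abstract does not state the similarity measure or the stopping rule, so the code below assumes Jaccard overlap between neighbor sets and a hypothetical `threshold` parameter; it also clusters only one side of the bipartite graph at a time, which is a simplification and not necessarily the authors' procedure. The function names and sample records are invented for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Overlap of two neighbor sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_side(records, threshold=0.5):
    """Greedy agglomerative clustering of the left-hand vertices.

    records: iterable of (left, right) pairs, e.g. (query, clicked_url).
    Only co-occurrence is used; the text of queries and URLs is ignored.
    """
    # Each cluster (a frozenset of left vertices) maps to its right-side neighbors.
    clusters = {}
    for left, right in records:
        clusters.setdefault(frozenset([left]), set()).add(right)

    while len(clusters) > 1:
        # Find the most similar pair of current clusters.
        a, b = max(
            combinations(clusters, 2),
            key=lambda pair: jaccard(clusters[pair[0]], clusters[pair[1]]),
        )
        if jaccard(clusters[a], clusters[b]) < threshold:
            break  # assumed stopping rule: no pair is similar enough
        # Merge the pair and pool their neighbor sets.
        clusters[a | b] = clusters.pop(a) | clusters.pop(b)

    return sorted(sorted(c) for c in clusters)


# Invented clickthrough records: (query, clicked URL)
records = [
    ("jaguar car", "jaguar.com"),
    ("jaguar auto", "jaguar.com"),
    ("jaguar car", "cars.example"),
    ("jaguar auto", "cars.example"),
    ("big cats", "zoo.example"),
]

query_clusters = cluster_side(records)
url_clusters = cluster_side([(url, query) for query, url in records])
```

Here the two "jaguar" queries share exactly the same clicked URLs, so they merge, while "big cats" stays on its own. Swapping each pair clusters the URL side in the same way. An algorithm that merges both sides in alternation would let a URL merge change the neighborhoods seen by the queries, which this sketch does not model.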

Cite

Beeferman, D., & Berger, A. (2000). Agglomerative clustering of a search engine query log. In Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 407–416). Association for Computing Machinery (ACM). https://doi.org/10.1145/347090.347176