Clustering of web documents using graph representations

Adam Schenker; Horst Bunke; Mark Last; Abraham Kandel

Journal Article

Studies in Computational Intelligence (2007) 52 247-265

DOI: 10.1007/978-3-540-68020-8_10

Abstract

In this paper we describe a clustering method that allows the use of graph-based representations of data instead of traditional vector-based representations. Using this new method we conduct content-based clustering of two web document collections. Clustering of web documents is performed to organize the documents with little or no human intervention. Benefits of clustering include easier browsing and improved retrieval speed. In order to measure the performance of our graph-matching approach, we compare it to the popular vector-based k-means method. We perform experiments using different graph distance measures as well as various document representations that utilize graphs. The results with the k-means clustering algorithm show that the graph-based approach can outperform traditional vector-based methods. © Springer-Verlag Berlin Heidelberg 2007.
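
The abstract does not spell out the distance measure or how cluster centers are computed, so the following is only a minimal sketch of how k-means might run on graphs instead of vectors. It assumes each document becomes a graph whose nodes are unique terms and whose directed edges link adjacent words, a distance based on the maximum common subgraph, and a set median graph as each cluster's representative. All function names here are hypothetical and are not taken from the paper.

```python
def doc_to_graph(text):
    """Assumed representation: nodes are unique terms, and a directed
    edge links each word to the word that immediately follows it."""
    words = text.lower().split()
    nodes = frozenset(words)
    edges = frozenset(zip(words, words[1:]))
    return nodes, edges


def graph_size(graph):
    nodes, edges = graph
    return len(nodes) + len(edges)


def mcs_distance(g1, g2):
    """Assumed distance: 1 - |mcs(g1, g2)| / max(|g1|, |g2|).
    Because node labels are unique, the maximum common subgraph
    reduces to intersecting the node sets and the edge sets."""
    largest = max(graph_size(g1), graph_size(g2))
    if largest == 0:
        return 0.0
    common = len(g1[0] & g2[0]) + len(g1[1] & g2[1])
    return 1.0 - common / largest


def set_median(graphs):
    """Member graph with the smallest total distance to the others;
    it plays the role that the centroid plays for vectors."""
    return min(graphs, key=lambda g: sum(mcs_distance(g, h) for h in graphs))


def graph_kmeans(graphs, k, max_iter=20):
    """k-means loop with graph distance and median graphs as centers."""
    centers = list(graphs[:k])  # naive deterministic initialization (assumption)
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest center under the graph distance.
        new_labels = [
            min(range(k), key=lambda c: mcs_distance(g, centers[c]))
            for g in graphs
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: replace each center with its cluster's set median.
        for c in range(k):
            members = [g for g, lab in zip(graphs, labels) if lab == c]
            if members:
                centers[c] = set_median(members)
    return labels


docs = [
    "web graph clustering",
    "graph clustering web pages",
    "cooking pasta recipe",
    "pasta recipe cooking tips",
]
labels = graph_kmeans([doc_to_graph(d) for d in docs], k=2)
print(labels)
```

The only vector-specific parts of k-means are the distance and the mean, so swapping in a graph distance and a median graph is enough to keep the rest of the loop unchanged. Other graph distance measures could be substituted for `mcs_distance` without touching `graph_kmeans`.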

APA

Schenker, A., Bunke, H., Last, M., & Kandel, A. (2007). Clustering of web documents using graph representations. Studies in Computational Intelligence, 52, 247–265. https://doi.org/10.1007/978-3-540-68020-8_10
