A graph-based framework for web document mining

Abstract

In this paper we describe methods of performing data mining on web documents, where the web document content is represented by graphs. We show how traditional clustering and classification methods, which usually operate on vector representations of data, can be extended to work with graph-based data. Specifically, we give graph-theoretic extensions of the k-Nearest Neighbors classification algorithm and the k-means clustering algorithm that process graphs, and show how the retention of structural information can lead to improved performance over the vector model approach. We introduce several different types of web document representations that utilize graphs and compare their performance for clustering and classification. © Springer-Verlag 2004.
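As a rough illustration of what a graph-theoretic k-Nearest Neighbors classifier might look like, the sketch below replaces the usual vector distance with a distance between graphs. The abstract does not specify the distance measure or the graph construction, so everything here is an assumption: documents become graphs with one node per unique term and a directed edge between adjacent terms, and the distance is one minus the size of the common subgraph divided by the size of the larger graph. The names `build_graph`, `graph_distance`, and `knn_classify` are hypothetical.

```python
from collections import Counter


def build_graph(text):
    """Toy document graph (assumed representation): one node per unique
    term, one directed edge for each pair of adjacent terms."""
    terms = text.lower().split()
    nodes = frozenset(terms)
    edges = frozenset(zip(terms, terms[1:]))
    return nodes, edges


def size(graph):
    """Graph size, taken here as node count plus edge count (an assumption)."""
    nodes, edges = graph
    return len(nodes) + len(edges)


def graph_distance(g1, g2):
    """Distance in [0, 1] based on the common subgraph.

    Because node labels are unique, the common subgraph is simply the
    intersection of the node sets and of the edge sets.
    """
    common = (g1[0] & g2[0], g1[1] & g2[1])
    largest = max(size(g1), size(g2))
    if largest == 0:
        return 0.0  # two empty graphs are treated as identical
    return 1.0 - size(common) / largest


def knn_classify(query, training, k=3):
    """Majority vote among the k training graphs closest to the query.

    `training` is a list of (graph, label) pairs.
    """
    ranked = sorted(training, key=lambda item: graph_distance(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    training = [
        (build_graph("graph matching for web documents"), "graphs"),
        (build_graph("graph distance for web mining"), "graphs"),
        (build_graph("cheap flights and hotel deals"), "travel"),
    ]
    query = build_graph("graph matching for web mining")
    print(knn_classify(query, training, k=3))
```

Only the distance function touches the graph structure, so the rest of the classifier is the familiar k-NN procedure. The same substitution is what would let a clustering algorithm such as k-means operate on graphs, although that additionally requires some notion of a representative graph for each cluster, which this sketch does not attempt.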

Cite

Schenker, A., Bunke, H., Last, M., & Kandel, A. (2004). A graph-based framework for web document mining. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3163, 401–412. https://doi.org/10.1007/978-3-540-28640-0_38
