Finding significant web pages with lower ranks by Pseudo-Clique search

Abstract

In this paper, we discuss a method of finding useful clusters of web pages which are significant in the sense that their contents are similar or closely related to those of higher-ranked pages. Since we usually pay little attention to pages with lower ranks, they are unconditionally discarded even if their contents are similar to some pages with high ranks. We try to extract such hidden pages together with significant higher-ranked pages as a cluster. In order to obtain such clusters, we first extract semantic correlations among terms by applying Singular Value Decomposition (SVD) to the term-document matrix generated from a corpus w.r.t. a specific topic. Based on the correlations, we can evaluate potential similarities among the web pages from which we try to obtain clusters. The set of web pages is represented as a weighted graph G based on the similarities and their ranks. Our clusters can be found as pseudo-cliques in G. We present an algorithm for finding Top-N weighted pseudo-cliques. Our experimental result shows that quite valuable clusters can actually be extracted by our method. © Springer-Verlag Berlin Heidelberg 2005.
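To make the graph step concrete, here is a minimal sketch of extracting Top-N weighted pseudo-cliques from a small page graph. It is not the algorithm of the paper, which the abstract does not specify. It assumes that a pseudo-clique is a vertex set whose edge density is at least `min_density`, that each page carries a rank-derived weight, and that a cluster is scored by the sum of its page weights. The function names, thresholds, and toy data are all hypothetical, and the search is brute force, so it is only suitable for tiny graphs.

```python
from itertools import combinations


def density(nodes, edges):
    """Fraction of node pairs in `nodes` that are joined by an edge."""
    pairs = list(combinations(sorted(nodes), 2))
    if not pairs:
        return 1.0
    return sum(1 for p in pairs if p in edges) / len(pairs)


def top_n_pseudo_cliques(sim, weight, sim_threshold, min_density, n, min_size=3):
    """Return up to n (score, pages) tuples, highest score first.

    sim:    dict mapping a sorted page pair (a, b) to a similarity value
    weight: dict mapping each page to a rank-derived weight (assumed given)
    """
    pages = sorted(weight)
    # Connect two pages when their similarity reaches the threshold.
    edges = {
        (a, b)
        for a, b in combinations(pages, 2)
        if sim.get((a, b), 0.0) >= sim_threshold
    }
    candidates = []
    # Exhaustive enumeration: exponential in the number of pages.
    for k in range(min_size, len(pages) + 1):
        for subset in combinations(pages, k):
            if density(subset, edges) >= min_density:
                score = sum(weight[p] for p in subset)
                candidates.append((score, subset))
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return candidates[:n]


# Toy data (made up): C is a low-ranked page similar to the high-ranked A and B.
weight = {"A": 1.0, "B": 0.5, "C": 0.1, "D": 0.8, "E": 0.05}
sim = {
    ("A", "B"): 0.9,
    ("A", "C"): 0.8,
    ("B", "C"): 0.85,
    ("A", "D"): 0.1,
    ("C", "D"): 0.2,
    ("D", "E"): 0.7,
}

result = top_n_pseudo_cliques(sim, weight, sim_threshold=0.5, min_density=0.8, n=3)
for score, cluster in result:
    print(round(score, 2), cluster)
```

In this toy run the low-ranked page C ends up in the same cluster as A and B, which is the kind of hidden page the abstract describes. In the setting of the abstract, the similarity values would come from the SVD-based term correlations rather than being supplied by hand.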

Cite

Okubo, Y., Haraguchi, M., & Shi, B. (2005). Finding significant web pages with lower ranks by Pseudo-Clique search. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3735 LNAI, pp. 346–353). https://doi.org/10.1007/11563983_30
