Clustering web documents based on knowledge granularity

Abstract

We propose a new data model for Web document representation based on granulation computing, named the Expanded Vector Space Model (EVSM). Traditional Web document clustering is based on two levels of knowledge granularity: document and term. This can lead to clustering results that are "falsely relevant". In our approach, Web documents are represented at many levels of knowledge granularity. Knowledge granularity with sufficiently conceptual sentences helps knowledge engineers understand valuable relations hidden in data. With granularity calculation, data can be processed more efficiently and effectively, and knowledge engineers can handle the same dataset at different knowledge levels. This provides a more reliable basis for interpreting the results of various data analysis methods. We experimentally evaluate the proposed approach and demonstrate that our algorithm is promising and efficient. © Springer-Verlag Berlin Heidelberg 2006.
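
For readers unfamiliar with the two-level (document and term) representation that the abstract contrasts against, the following is a minimal sketch of a conventional vector space model, not the authors' EVSM. The helper names `term_vector` and `cosine`, the toy documents, and the reading of "false relevance" as similarity driven by a single ambiguous shared term are all assumptions made for illustration.

```python
import math
from collections import Counter


def term_vector(text):
    """Map a document to a bag-of-words term-frequency vector (hypothetical helper)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy documents (illustrative only): they share just the ambiguous term "jaguar".
car_doc = term_vector("jaguar car engine speed")
cat_doc = term_vector("jaguar cat jungle prey")

# At term granularity the two documents look partly similar even though
# their topics differ, because only surface terms are compared.
similarity = cosine(car_doc, cat_doc)
print(similarity)
```

Under this sketch, the nonzero score comes entirely from one shared surface term, which is one plausible way a term-level representation could group unrelated documents.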

Cite

Huang, F., & Zhang, S. (2006). Clustering web documents based on knowledge granularity. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3841 LNCS, pp. 85–96). Springer Verlag. https://doi.org/10.1007/11610113_9
