A unified probabilistic framework for clustering correlated heterogeneous web objects

Abstract

Most existing algorithms cluster highly correlated data objects (e.g. web pages and web queries) separately. Other algorithms do take the relationships between data objects into account, but they either merge content and link features into a single feature space or apply a hard clustering algorithm, which makes it difficult to fully exploit the correlated information across heterogeneous Web objects. In this paper, we propose a novel unified probabilistic framework that iteratively clusters correlated heterogeneous data objects until convergence. Our approach introduces two latent clustering layers, each of which serves as a probabilistic mixture model of the features. In each clustering iteration, we use the EM (Expectation-Maximization) algorithm to estimate the parameters of the mixture model in one latent layer and propagate them to the other layer. The experimental results show that our approach effectively combines the content and link features and improves clustering performance. © Springer-Verlag Berlin Heidelberg 2005.
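The EM step mentioned in the abstract can be illustrated with a minimal sketch of soft clustering for a single latent layer. This is not the authors' implementation: the multinomial mixture, the function name `em_multinomial_mixture`, the smoothing constant, and the toy count matrix are all assumptions made for illustration. The abstract does not specify how parameters are propagated between the two layers, so that step is omitted.

```python
import math
import random


def em_multinomial_mixture(counts, k, iters=50, seed=0, smooth=1e-2):
    """Soft-cluster the rows of a count matrix with a k-component
    multinomial mixture (illustrative assumption, not the paper's model).

    Returns (resp, priors, theta), where resp[i][c] approximates
    P(cluster c | object i).
    """
    rng = random.Random(seed)
    n, d = len(counts), len(counts[0])

    # Random soft initialisation: each row of resp sums to 1.
    resp = []
    for _ in range(n):
        row = [rng.random() + 1e-3 for _ in range(k)]
        total = sum(row)
        resp.append([r / total for r in row])

    priors, theta = [], []
    for _ in range(iters):
        # M-step: re-estimate cluster priors and per-cluster
        # feature distributions from the current soft assignments.
        priors = [sum(resp[i][c] for i in range(n)) / n for c in range(k)]
        theta = []
        for c in range(k):
            weights = [
                smooth + sum(resp[i][c] * counts[i][j] for i in range(n))
                for j in range(d)
            ]
            total = sum(weights)
            theta.append([w / total for w in weights])

        # E-step: posterior cluster probabilities, computed in log space
        # and normalised with the max-subtraction trick for stability.
        for i in range(n):
            logp = [
                math.log(priors[c] + 1e-12)
                + sum(counts[i][j] * math.log(theta[c][j]) for j in range(d))
                for c in range(k)
            ]
            m = max(logp)
            unnorm = [math.exp(lp - m) for lp in logp]
            total = sum(unnorm)
            resp[i] = [u / total for u in unnorm]
    return resp, priors, theta


# Hypothetical toy data: four objects described by four feature counts.
toy_counts = [[5, 4, 0, 0], [4, 6, 0, 0], [0, 0, 5, 5], [0, 1, 6, 4]]
toy_resp, toy_priors, toy_theta = em_multinomial_mixture(toy_counts, k=2)
for row in toy_resp:
    print([round(p, 3) for p in row])
```

Because the assignments are soft, each object keeps a probability for every cluster instead of a single label, which is the property that distinguishes this kind of model from the hard clustering approaches the abstract criticizes.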

Cite

Liu, G., Zhu, W., & Yu, Y. (2005). A unified probabilistic framework for clustering correlated heterogeneous web objects. In Lecture Notes in Computer Science (Vol. 3399, pp. 76–87). Springer-Verlag. https://doi.org/10.1007/978-3-540-31849-1_9
