Robust unsupervised cluster matching for network data

2Citations
Citations of this article
15Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Unsupervised cluster matching is a task to find matching between clusters of objects in different domains. Examples include matching word clusters in different languages without dictionaries or parallel sentences and matching user communities across different friendship networks. Existing methods assume that every object is assigned into a cluster. However, in real-world applications, some objects would not form clusters. These irrelevant objects deteriorate the cluster matching performance since mistakenly estimated matching affect on estimation of matching of other objects. In this paper, we propose a probabilistic model for robust unsupervised cluster matching that discovers relevance of objects and matching of object clusters, simultaneously, given multiple networks. The proposed method finds correspondence only for relevant objects, and keeps irrelevant objects unmatched, which enables us to improve the matching performance since the adverse impact of irrelevant objects is eliminated. With the proposed method, relevant objects in different networks are clustered into a shared set of clusters by assuming that different networks are generated from a common network probabilistic model, which is an extension of stochastic block models. Objects assigned into the same clusters are considered as matched. Edges for irrelevant objects are assumed to be generated from a noise distribution irrespective of cluster assignments. We present an efficient Bayesian inference procedure of the proposed model based on collapsed Gibbs sampling. In our experiments, we demonstrate the effectiveness of the proposed method using synthetic and real-world data sets, including multilingual corpora and movie ratings.

Cite

CITATION STYLE

APA

Iwata, T., & Ishiguro, K. (2017). Robust unsupervised cluster matching for network data. Data Mining and Knowledge Discovery, 31(4), 1132–1154. https://doi.org/10.1007/s10618-017-0509-y

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free