When does Label propagation fail? A view from a network generative model

15Citations
Citations of this article
51Readers
Mendeley users who have this article in their library.

Abstract

What kinds of data does Label Propagation (LP) work best on? Can we justify the solution of LP from a theoretical standpoint? LP is a semi-supervised learning algorithm that is widely used to predict unobserved node labels on a network (e.g., user's gender on an SNS). Despite its importance, its theoretical properties remain mostly unexplored. In this paper, we answer the above questions by interpreting LP from a statistical viewpoint. As our main result, we identify the network generative model behind the discretized version of LP (DLP), and we show that under specific conditions the solution of DLP is equal to the maximum a posteriori estimate of that generative model. Our main result reveals the critical limitations of LP. Specifically, we discover that LP would not work best on networks with (1) disassortative node labels, (2) clusters having different edge densities, (3) non-uniform label distributions, or (4) unreliable node labels provided. Our experiments under a variety of settings support our theoretical results.

Cite

CITATION STYLE

APA

Yamaguchi, Y., & Hayashi, K. (2017). When does Label propagation fail? A view from a network generative model. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 0, pp. 3224–3230). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2017/450

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free