Hypothesis testing for automated community detection in networks

Abstract

Community detection in networks is a key exploratory tool with applications in a diverse set of areas, ranging from finding communities in social and biological networks to identifying link farms in the World Wide Web. The problem of finding communities or clusters in a network has received much attention from statistics, physics and computer science. However, most clustering algorithms assume knowledge of the number of clusters k. We propose to determine k automatically in a graph generated from a stochastic block model by using a hypothesis test of independent interest. Our main contribution is twofold. First, we theoretically establish the limiting distribution of the principal eigenvalue of the suitably centred and scaled adjacency matrix and use that distribution for our test of the hypothesis that a random graph is of Erdős–Rényi (noise) type. Second, we use this test to design a recursive bipartitioning algorithm, which naturally uncovers nested community structure. Using simulations and quantifiable classification tasks on real-world networks with ground truth, we show that our algorithm outperforms state-of-the-art methods.
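
The test described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The centring and scaling used here (subtracting the estimated edge probability off the diagonal and dividing by sqrt((n - 1) p̂ (1 - p̂))) is an assumption about what "suitably centred and scaled" means, and the rejection threshold is calibrated by Monte Carlo simulation of Erdős–Rényi graphs rather than taken from the limiting distribution the abstract refers to. All function names (`edge_statistic`, `sample_er`, `sample_two_block`, `rejects_er`) are hypothetical.

```python
import numpy as np


def edge_statistic(A):
    """Scaled principal eigenvalue of a centred, scaled adjacency matrix.

    Assumed form: subtract p_hat off the diagonal and divide by
    sqrt((n - 1) * p_hat * (1 - p_hat)). Requires 0 < p_hat < 1.
    """
    n = A.shape[0]
    p_hat = A[np.triu_indices(n, k=1)].mean()  # estimated edge probability
    P_hat = p_hat * (np.ones((n, n)) - np.eye(n))
    A_tilde = (A - P_hat) / np.sqrt((n - 1) * p_hat * (1 - p_hat))
    lam_max = np.linalg.eigvalsh(A_tilde)[-1]  # eigenvalues come back ascending
    return n ** (2 / 3) * (lam_max - 2.0)


def sample_er(n, p, rng):
    """Symmetric Erdős–Rényi adjacency matrix with no self-loops."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(float)


def sample_two_block(n, p_in, p_out, rng):
    """Two equal-sized communities: a simple stochastic block model."""
    labels = np.arange(n) < n // 2
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, p_in, p_out)
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    return (upper | upper.T).astype(float)


def rejects_er(A, rng, n_sim=200, alpha=0.05):
    """Reject the Erdős–Rényi null if the statistic exceeds a simulated quantile.

    Monte Carlo calibration is a stand-in for the analytic limiting law.
    """
    n = A.shape[0]
    p_hat = A[np.triu_indices(n, k=1)].mean()
    null = [edge_statistic(sample_er(n, p_hat, rng)) for _ in range(n_sim)]
    return bool(edge_statistic(A) > np.quantile(null, 1 - alpha))


rng = np.random.default_rng(0)
er_graph = sample_er(200, 0.2, rng)
block_graph = sample_two_block(200, 0.3, 0.05, rng)
print("ER statistic:   ", edge_statistic(er_graph))
print("Block statistic:", edge_statistic(block_graph))
print("Reject ER null for block graph:", rejects_er(block_graph, rng))
```

A recursive bipartitioning scheme of the kind the abstract mentions would apply such a test to a graph, split it in two when the null is rejected, and repeat on each part until no part is rejected. The splitting step is left out of this sketch.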

Cite

Bickel, P. J., & Sarkar, P. (2016). Hypothesis testing for automated community detection in networks. Journal of the Royal Statistical Society. Series B: Statistical Methodology, 78(1), 253–273. https://doi.org/10.1111/rssb.12117
