Measuring graph clustering quality remains an open problem. Here, we introduce three statistical measures to address it. We empirically explore their behavior under a number of stress-test scenarios and compare it to that of the commonly used modularity and conductance. Our measures are robust, immune to the resolution limit, intuitive to interpret, and have a formal statistical interpretation. Our empirical stress-test results confirm that our measures compare favorably to the established ones. In particular, they are shown to be more responsive to graph structure, less sensitive to sample size and to breakdowns during numerical implementation, and less sensitive to uncertainty in connectivity. These features are especially important for larger data sets or when the data may contain errors in the connectivity patterns.
CITATION STYLE
Miasnikof, P., Shestopaloff, A. Y., Bonner, A. J., & Lawryshyn, Y. (2018). A statistical performance analysis of graph clustering algorithms. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10836 LNCS, pp. 170–184). Springer Verlag. https://doi.org/10.1007/978-3-319-92871-5_11