Social networks are a source of large-scale graphs. We study how social network algorithms behave on sparsified versions of such networks, with two motivations in mind: (1) in practice, it is challenging to collect, store and process an entire network, which is often constantly growing, so it is important to understand how algorithms behave on incomplete views of a network; (2) even if one has the full network, algorithms may be infeasible at such a large scale, and the only option may be to sparsify the network to make it computationally tractable while still maintaining the fidelity of the social network algorithms. We present a variety of methods for sparsifying a network, based on linear regression and linear algebraic sampling for graph reconstruction, and compare them against one another with respect to clustering. Specifically, given a graph G, we sample columns of its adjacency matrix and reconstruct the remaining columns using only the sampled columns to obtain Ĝ, the reconstructed approximation of G. We then perform clustering on G and on Ĝ to get two sets of clusters and compute their modularity, fitness and centrality. Our experiments show that graphs reconstructed through this methodology preserve (and in some cases even improve) community structure while being orders of magnitude more efficient in both storage and computation. We show similar results when the target is the prominence of nodes rather than clusters.
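The column-sampling step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the exact regression or sampling scheme, so the code below assumes an undirected graph (symmetric adjacency matrix), a ridge-regularized least-squares fit, and a 0.5 threshold to turn fitted values back into edges. The names `sample_columns`, `solve` and `reconstruct`, and the parameters `ridge` and `threshold`, are hypothetical. The idea is that, by symmetry, the sampled columns already reveal each unsampled column's entries at the sampled nodes, so each column can be regressed on the sampled columns using only those known entries.

```python
import random

def sample_columns(n, k, seed=0):
    """Pick k distinct column indices uniformly at random (assumed sampling scheme)."""
    return sorted(random.Random(seed).sample(range(n), k))

def solve(M, b):
    """Solve the square system M x = b by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / aug[i][i]
    return x

def reconstruct(A, idx, ridge=1e-8, threshold=0.5):
    """Rebuild a symmetric 0/1 matrix from the columns listed in idx only."""
    n, k = len(A), len(idx)
    C = [[A[i][j] for j in idx] for i in range(n)]  # n x k: the sampled columns
    W = [C[i] for i in idx]                         # k x k: sampled columns at sampled rows
    # Normal-equation matrix W^T W + ridge * I (ridge keeps it invertible).
    G = [[sum(W[r][a] * W[r][b] for r in range(k)) + (ridge if a == b else 0.0)
          for b in range(k)] for a in range(k)]
    A_hat = [[0] * n for _ in range(n)]
    for j in range(n):
        # By symmetry, column j's entries at the sampled rows equal row j of C.
        rhs = [sum(W[r][a] * C[j][r] for r in range(k)) for a in range(k)]
        x = solve(G, rhs)  # regression coefficients for column j
        for i in range(n):
            fitted = sum(C[i][a] * x[a] for a in range(k))
            A_hat[i][j] = 1 if fitted >= threshold else 0
    return A_hat

# Toy graph: two 3-node cliques, with self-loops so the matrix is exactly rank 2
# (the self-loops are an assumption made only to keep the example clean).
A = [[1 if (i < 3) == (j < 3) else 0 for j in range(6)] for i in range(6)]
idx = [0, 3]  # one sampled column per clique; sample_columns(6, 2) would pick at random
A_hat = reconstruct(A, idx)
```

On this toy matrix, one sampled column per clique is enough for `A_hat` to match `A`. On real networks the reconstruction is only approximate, and its quality would be judged by comparing clusterings of G and Ĝ, as the abstract describes.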
Hegde, K., Magdon-Ismail, M., Szymanski, B., & Kuzmin, K. (2017). Clustering, prominence and social network analysis on incomplete networks. Studies in Computational Intelligence, 693, 287–298. https://doi.org/10.1007/978-3-319-50901-3_23