The identification of the most central nodes of a graph is a fundamental task of data analysis. The current flow betweenness is a centrality index which considers how the information flows along all the paths of a graph, not only on the shortest ones. Finding the exact value of the current flow betweenness is computationally expensive for large graphs, so the definition of algorithms returning an approximation of this measure is mandatory. In this paper we propose a solution, based on the Gather Apply Scatter model, that estimates the current flow betweenness in a distributed setting using the Apache Spark framework. The experimental evaluation shows that the algorithm achieves high correlation with the exact value of the index and outperforms other algorithms.
CITATION STYLE
Bertolucci, M., Lulli, A., & Ricci, L. (2016). Current flow betweenness centrality with Apache Spark. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10048 LNCS, pp. 270–278). Springer Verlag. https://doi.org/10.1007/978-3-319-49583-5_21
Mendeley helps you to discover research relevant for your work.