Current flow betweenness centrality with Apache Spark

Massimiliano Bertolucci; Alessandro Lulli; Laura Ricci

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2016), vol. 10048 LNCS, pp. 270–278

DOI: 10.1007/978-3-319-49583-5_21

1 citation

4 readers

Abstract

The identification of the most central nodes of a graph is a fundamental task of data analysis. The current flow betweenness is a centrality index which considers how the information flows along all the paths of a graph, not only on the shortest ones. Finding the exact value of the current flow betweenness is computationally expensive for large graphs, so the definition of algorithms returning an approximation of this measure is mandatory. In this paper we propose a solution, based on the Gather Apply Scatter model, that estimates the current flow betweenness in a distributed setting using the Apache Spark framework. The experimental evaluation shows that the algorithm achieves high correlation with the exact value of the index and outperforms other algorithms.
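To make the measure in the abstract concrete, here is a minimal single-machine sketch of the exact current flow betweenness on a small, connected, unweighted graph. It is not the distributed Gather Apply Scatter approximation the paper proposes. It only illustrates the quantity that approximation targets. The function names, the dense Gauss-Jordan inverse, and the normalization by the number of source-sink pairs that exclude the node are assumptions chosen for illustration.

```python
from itertools import combinations


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices only)."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: is the graph connected?")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def current_flow_betweenness(n, edges):
    """Exact current flow betweenness for nodes 0..n-1 (assumes n >= 3).

    Each edge is treated as a unit resistor. For every source-sink pair,
    one unit of current enters at the source and leaves at the sink; a
    node's score is the average current passing through it.
    """
    # Build the graph Laplacian L = D - A.
    lap = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1.0
        lap[v][v] += 1.0
        lap[u][v] -= 1.0
        lap[v][u] -= 1.0

    # Ground the last node (fix its potential to 0) and invert the rest.
    inv = invert([row[:n - 1] for row in lap[:n - 1]])
    c = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        for j in range(n - 1):
            c[i][j] = inv[i][j]

    score = [0.0] * n
    for s, t in combinations(range(n), 2):
        # Node potentials when unit current flows from s to t.
        pot = [c[v][s] - c[v][t] for v in range(n)]
        flow = [0.0] * n
        for u, v in edges:
            current = abs(pot[u] - pot[v])  # Ohm's law with unit resistance
            flow[u] += current
            flow[v] += current
        for v in range(n):
            if v != s and v != t:
                # Half of the incident current is what passes through v.
                score[v] += 0.5 * flow[v]

    pairs = (n - 1) * (n - 2) / 2  # source-sink pairs that exclude a given node
    return [x / pairs for x in score]


if __name__ == "__main__":
    print(current_flow_betweenness(3, [(0, 1), (1, 2)]))
    print(current_flow_betweenness(3, [(0, 1), (1, 2), (0, 2)]))
```

On the three-node path, all current between the endpoints passes through the middle node, so it scores 1.0 and the endpoints score 0. On the triangle, one third of the current between any two nodes takes the two-hop detour through the third node, so every node scores about 1/3 even though no shortest path passes through it. The dense inverse costs cubic time in the number of nodes, which reflects the expense that motivates approximation on large graphs.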

Citation (APA)

Bertolucci, M., Lulli, A., & Ricci, L. (2016). Current flow betweenness centrality with Apache Spark. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10048 LNCS, pp. 270–278). Springer Verlag. https://doi.org/10.1007/978-3-319-49583-5_21
