Scalable data-driven PageRank: Algorithms, system issues, and lessons learned

Joyce Jiyoung Whang; Andrew Lenharth; Inderjit S. Dhillon; Keshav Pingali

Conference Proceedings (Open Access)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2015) 9233 438-450

DOI: 10.1007/978-3-662-48096-0_34

33 Citations

22 Readers

Abstract

Large-scale network and graph analysis has received considerable attention recently. Graph mining techniques often involve an iterative algorithm, which can be implemented in a variety of ways. Using PageRank as a model problem, we look at three algorithm design axes: work activation, data access pattern, and scheduling, and we investigate the impact of the choices made along each axis. Using these design axes, we design and test a variety of PageRank implementations, finding that data-driven, push-based algorithms achieve more than 28x the performance of standard PageRank implementations (e.g., those in GraphLab). The design choices affect both single-threaded performance and parallel scalability. The implementation lessons not only guide efficient implementations of many graph mining algorithms, but also provide a framework for designing new scalable algorithms.
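To make the terms "data-driven" and "push-based" concrete, the following is a minimal single-threaded sketch in Python. It is not the authors' implementation. The function name `push_pagerank`, the adjacency-list input `out_edges`, the FIFO worklist, the defaults `alpha=0.85` and `eps=1e-8`, and the unnormalized formulation x = alpha * P^T x + (1 - alpha) * e are all assumptions chosen for illustration. Work activation is data-driven because only vertices whose residual reaches `eps` are placed on the worklist. The access pattern is push-based because an active vertex writes to its out-neighbors' residuals rather than reading from its in-neighbors. Scheduling here is plain FIFO order.

```python
from collections import deque


def push_pagerank(out_edges, alpha=0.85, eps=1e-8):
    """Illustrative data-driven, push-based PageRank (unnormalized scores).

    out_edges[v] is the list of out-neighbors of vertex v (hypothetical
    input format). Residual mass pushed from a vertex with no out-edges
    is simply dropped in this sketch.
    """
    n = len(out_edges)
    x = [0.0] * n              # current PageRank estimates
    r = [1.0 - alpha] * n      # residual not yet folded into x
    worklist = deque(range(n)) # every vertex starts active
    queued = [True] * n        # avoids duplicate worklist entries

    while worklist:
        v = worklist.popleft()
        queued[v] = False
        rv = r[v]
        if rv < eps:
            continue           # nothing significant left to push
        x[v] += rv             # absorb the residual into the estimate
        r[v] = 0.0
        if out_edges[v]:
            share = alpha * rv / len(out_edges[v])
            for w in out_edges[v]:
                r[w] += share  # push: write to the out-neighbor's residual
                # data-driven activation: schedule w only when its
                # residual becomes significant
                if r[w] >= eps and not queued[w]:
                    worklist.append(w)
                    queued[w] = True
    return x


if __name__ == "__main__":
    # Directed 3-cycle: by symmetry all vertices receive the same score.
    print(push_pagerank([[1], [2], [0]]))
```

Replacing the FIFO worklist with a priority order on residuals would change only the scheduling axis, leaving the activation rule and the push-style access pattern unchanged.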

Citation (APA)

Whang, J. J., Lenharth, A., Dhillon, I. S., & Pingali, K. (2015). Scalable data-driven PageRank: Algorithms, system issues, and lessons learned. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9233, pp. 438–450). Springer Verlag. https://doi.org/10.1007/978-3-662-48096-0_34
