As graph datasets grow, faster data mining methods become indispensable. Random Walk with Restart (RWR), belief propagation, semi-supervised learning, and other graph methods can be expressed as a set of linear equations. In this work, we focus on solving such equations quickly and accurately when a large number of queries must be handled. We use RWR as a case study, since it is widely used not only to evaluate the importance of a node, but also as a basis for more complex tasks, e.g., representation learning and community detection. We introduce a new, intuitive two-step divide-and-conquer formulation and a corresponding parallelizable method, FlowR, for solving RWR with two goals: (i) fast and accurate computation under multiple queries; (ii) one-time message exchange between subproblems. We further speed up our proposed method by extending our formulation to carefully designed overlapping subproblems (FlowR-OV) and by leveraging the strengths of iterative methods (FlowR-Hyb). Extensive experiments on synthetic and real networks with up to ∼8 million edges show that our methods are accurate and outperform various state-of-the-art approaches in runtime, running up to 34× faster in preprocessing and up to 32× faster in query time.
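To make the "set of linear equations" view concrete, the sketch below solves RWR for several query nodes at once with a single dense linear solve. It is not FlowR or any of its variants; it is only a small baseline illustrating the multi-query setting. The function name `rwr_scores`, the restart probability `restart`, and the column-normalized transition matrix are assumptions chosen for illustration, using the common convention r = (1 - c) W r + c e.

```python
import numpy as np

def rwr_scores(adj, seeds, restart=0.15):
    """Baseline RWR via a direct solve (illustrative, not the FlowR method).

    adj     : (n, n) adjacency matrix, adj[i, j] = weight of edge j -> i
    seeds   : list of query node indices, one RWR vector per seed
    restart : assumed restart probability c
    Returns an (n, len(seeds)) array; column k holds the scores for seeds[k].
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]

    # Column-normalize so each column of W sums to 1 (zero columns left as-is).
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0
    W = adj / col_sums

    # One indicator column per query node.
    E = np.zeros((n, len(seeds)))
    E[seeds, np.arange(len(seeds))] = 1.0

    # r = (1 - c) W r + c e   <=>   (I - (1 - c) W) r = c e
    A = np.eye(n) - (1.0 - restart) * W
    return np.linalg.solve(A, restart * E)

# Usage: a 4-node undirected path graph 0 - 1 - 2 - 3, queried from both ends.
path = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]])
scores = rwr_scores(path, seeds=[0, 3])
print(scores.round(4))
```

All queries share the same system matrix, so work done on it once can be reused for every query; that shared cost is what makes a separate preprocessing phase and query phase meaningful in the multi-query setting.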
CITATION STYLE
Yan, Y., Heimann, M., Jin, D., & Koutra, D. (2018). Fast flow-based random walk with restart in a multi-query setting. In SIAM International Conference on Data Mining, SDM 2018 (pp. 342–350). Society for Industrial and Applied Mathematics Publications. https://doi.org/10.1137/1.9781611975321.39