We present new algorithms for graph summarization in which the loss in utility is fully controllable by the user. Specifically, we make three key contributions. First, we present a utility-driven graph summarization method, G-SCIS, based on a clique and independent set decomposition, that produces optimal compression with zero loss of utility. Its compression is significantly better than the state of the art in lossless graph summarization, while its runtime is two orders of magnitude lower. Second, we propose a highly scalable, utility-driven algorithm, T-BUDS, for fully controlled lossy summarization. It achieves high scalability by combining memory reduction using a Maximum Spanning Tree with a novel binary search procedure. T-BUDS drastically outperforms the state of the art in summarization quality and is about two orders of magnitude faster. In contrast to competing methods, we are able to handle web-scale graphs on a single machine without performance degradation as the utility threshold (and the size of the summary) decreases. Third, we show that our graph summaries can be used as-is to answer several important classes of queries, such as triangle enumeration, PageRank, and shortest paths.
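To make the idea of a lossless clique and independent set decomposition more concrete, the following is a minimal sketch and not the authors' G-SCIS implementation. It assumes that nodes can be merged without losing information when they share the same neighborhood: nodes with equal closed neighborhoods form a clique supernode, and nodes with equal open neighborhoods form an independent-set supernode. The function name `group_twins`, the adjacency-dictionary input format, and the example graph are all hypothetical.

```python
from collections import defaultdict


def group_twins(adj):
    """Group nodes of an undirected graph by identical neighborhoods.

    adj maps each node to the set of its neighbors (no self-loops).
    Returns a list of (kind, members) pairs. This is an illustrative
    assumption about lossless merging, not the published algorithm.
    """
    groups = []
    assigned = set()

    # Equal closed neighborhoods N(v) | {v}: members are pairwise adjacent,
    # so they can be represented as one clique supernode.
    by_closed = defaultdict(list)
    for v, nbrs in adj.items():
        by_closed[frozenset(nbrs | {v})].append(v)
    for members in by_closed.values():
        if len(members) > 1:
            groups.append(("clique", sorted(members)))
            assigned.update(members)

    # Equal open neighborhoods N(v): members are pairwise non-adjacent,
    # so they can be represented as one independent-set supernode.
    by_open = defaultdict(list)
    for v, nbrs in adj.items():
        if v not in assigned:
            by_open[frozenset(nbrs)].append(v)
    for members in by_open.values():
        if len(members) > 1:
            groups.append(("independent_set", sorted(members)))
            assigned.update(members)

    # Every remaining node stays as its own supernode.
    for v in adj:
        if v not in assigned:
            groups.append(("singleton", [v]))
    return groups


# Hypothetical example: a triangle a-b-c attached to hub d,
# plus two leaves e and f that are attached only to d.
example = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c", "e", "f"},
    "e": {"d"},
    "f": {"d"},
}
summary = group_twins(example)
```

In this example the six nodes collapse into three supernodes, and the original edges can be recovered exactly from the supernode kinds and the edges between supernodes, which is what "zero loss of utility" would require under the stated assumption.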
CITATION STYLE
Hajiabadi, M., Singh, J., Srinivasan, V., & Thomo, A. (2021). Graph Summarization with Controlled Utility Loss. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 536–546). Association for Computing Machinery. https://doi.org/10.1145/3447548.3467359