Query-Efficient Correlation Clustering

David García-Soriano; Konstantin Kutzkov; Francesco Bonchi; Charalampos Tsourakakis

Conference Proceedings, Open Access

The Web Conference 2020 - Proceedings of the World Wide Web Conference, WWW 2020 (2020) 1468-1478

DOI: 10.1145/3366423.3380220

Abstract

Correlation clustering is arguably the most natural formulation of clustering. Given n objects and a pairwise similarity measure, the goal is to cluster the objects so that, to the best possible extent, similar objects are put in the same cluster and dissimilar objects are put in different clusters. A main drawback of correlation clustering is that it requires as input the Θ(n²) pairwise similarities, which are often infeasible to compute or even just to store. In this paper we study query-efficient algorithms for correlation clustering. Specifically, we devise a correlation clustering algorithm that, given a budget of Q queries, attains a solution whose expected number of disagreements is at most 3·OPT + O(n³/Q), where OPT is the optimal cost for the instance. Its running time is O(Q), and it can easily be made non-adaptive (meaning it can specify all its queries at the outset and make them in parallel) with the same guarantees. Up to constant factors, our algorithm yields a provably optimal trade-off between the number of queries Q and the worst-case error attained, even for adaptive algorithms. Finally, we perform an experimental study of our proposed method on both synthetic and real data, showing the scalability and the accuracy of our algorithm.
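
The abstract does not spell out the procedure itself, so the following is only a minimal sketch of what clustering under a budget of Q queries can look like. It assumes a pivot-style scheme: pick a random unclustered object, query its similarity to every other unclustered object, and stop when the next round would exceed the budget. Objects left over are emitted as singletons. The names `budgeted_pivot_clustering`, `disagreements`, and the oracle `similar(u, v)` are hypothetical and chosen for illustration. This should not be read as the authors' exact algorithm.

```python
import random
from itertools import combinations


def budgeted_pivot_clustering(n, similar, budget, seed=0):
    """Pivot-style clustering that never spends more than `budget` queries.

    `similar(u, v)` is a hypothetical oracle returning True when objects
    u and v are similar; each call counts as one query.
    """
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)            # random pivot order
    unclustered = set(order)
    clusters = []
    queries = 0
    for pivot in order:
        if pivot not in unclustered:
            continue
        others = [v for v in order if v in unclustered and v != pivot]
        if queries + len(others) > budget:
            break                 # a full pivot round no longer fits
        queries += len(others)
        cluster = [pivot] + [v for v in others if similar(pivot, v)]
        unclustered.difference_update(cluster)
        clusters.append(cluster)
    # Assumption: objects never reached within the budget become singletons.
    clusters.extend([v] for v in order if v in unclustered)
    return clusters, queries


def disagreements(n, similar, clusters):
    """Count pairs whose placement contradicts the similarity oracle."""
    label = {v: i for i, c in enumerate(clusters) for v in c}
    return sum(
        (label[u] == label[v]) != similar(u, v)
        for u, v in combinations(range(n), 2)
    )
```

Note that `disagreements` inspects all pairs, so it is an evaluation aid for small instances and not part of the query budget. A smaller budget leaves more objects as singletons and so generally increases the cost, which is the kind of trade-off between Q and error that the abstract describes.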

Citation (APA)

García-Soriano, D., Kutzkov, K., Bonchi, F., & Tsourakakis, C. (2020). Query-Efficient Correlation Clustering. In The Web Conference 2020 - Proceedings of the World Wide Web Conference, WWW 2020 (pp. 1468–1478). Association for Computing Machinery, Inc. https://doi.org/10.1145/3366423.3380220
