Scalable and space-efficient Robust Matroid Center algorithms


This article is free to access.

Abstract

Given a dataset V of points from a metric space, a popular robust formulation of the k-center clustering problem asks for k points (centers) of V that minimize the maximum distance of any point of V from its closest center, where the z most distant points (outliers) are excluded from the computation of the maximum. In this paper, we focus on an important constrained variant of the robust k-center problem, namely the Robust Matroid Center (RMC) problem, where the set of returned centers is constrained to be an independent set of a matroid of rank k built on V. Instantiating the problem with the partition matroid yields a formulation of the fair k-center problem, which has attracted the interest of the ML community in recent years. We target accurate solutions of the RMC problem under general matroids when confronted with large inputs. Specifically, we devise a coreset-based algorithm affording efficient sequential, distributed (MapReduce), and streaming implementations. For any fixed ε > 0, the algorithm returns solutions featuring a (3 + ε)-approximation ratio, which is a mere additive term ε away from the 3-approximations achievable by the best known polynomial-time sequential algorithms. Moreover, the algorithm obliviously adapts to the intrinsic complexity of the dataset, captured by its doubling dimension D. For wide ranges of k, z, ε, and D, our MapReduce implementation requires two rounds and substantially sublinear local memory, while our streaming implementation requires one pass and substantially sublinear working memory. The theoretical results are complemented by an extensive set of experiments on real-world datasets, which provide clear evidence of the accuracy and efficiency of our algorithms and of their improved performance with respect to previous solutions.
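
To make the objective concrete, the following minimal Python sketch evaluates a candidate set of centers under the robust k-center objective and checks independence in a partition matroid (the fair k-center case). It is an illustration only, not the paper's coreset-based algorithm. The function names `robust_radius` and `is_independent`, the Euclidean metric, and the toy data are assumptions made for this example.

```python
import math
from collections import Counter


def robust_radius(points, centers, z):
    """Maximum distance of a point from its closest center,
    ignoring the z points that are farthest from the centers.
    Assumes Euclidean distance (an illustrative choice of metric)."""
    dists = sorted(min(math.dist(p, c) for c in centers) for p in points)
    kept = dists[:max(0, len(dists) - z)]  # drop the z largest distances
    return kept[-1] if kept else 0.0


def is_independent(center_labels, caps):
    """Partition-matroid independence test: a set of centers is
    independent if it uses at most caps[g] centers from each group g."""
    counts = Counter(center_labels)
    return all(counts[g] <= caps.get(g, 0) for g in counts)


# Hypothetical toy instance: five points on a line, one far outlier.
points = [(0, 0), (1, 0), (10, 0), (11, 0), (100, 0)]
centers = [(0, 0), (10, 0)]

print(robust_radius(points, centers, z=0))  # outlier dominates the radius
print(robust_radius(points, centers, z=1))  # outlier excluded

# One center allowed per group "a" and "b".
print(is_independent(["a", "b"], {"a": 1, "b": 1}))
print(is_independent(["a", "a"], {"a": 1, "b": 1}))
```

In this toy instance, excluding a single outlier shrinks the radius from the distance of the far point to the distance of the ordinary points from their centers, which is the effect the parameter z is meant to capture.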

Citation (APA)

Ceccarello, M., Pietracaprina, A., Pucci, G., & Soldà, F. (2023). Scalable and space-efficient Robust Matroid Center algorithms. Journal of Big Data, 10(1). https://doi.org/10.1186/s40537-023-00717-4
