Abstract
Federated learning (FL) is typically performed in a synchronous parallel manner, and the involvement of a slow client delays the training progress. Current FL systems employ a participant selection strategy to select fast clients with quality data in each iteration. However, this is not always possible in practice, and the selection strategy has to navigate a knotty tradeoff between the speed and the data quality. This paper makes a case for asynchronous FL by presenting Pisces, a new FL system with intelligent participant selection and model aggregation for accelerated training despite slow clients. To avoid incurring excessive resource cost and stale training computation, Pisces uses a novel scoring mechanism to identify suitable clients to participate in each training iteration. It also adapts the aggregation pace dynamically to bound the progress gap between the participating clients and the server, with a provable convergence guarantee in a smooth non-convex setting. We have implemented Pisces in an open-source FL platform, Plato, and evaluated its performance in large-scale experiments with popular vision and language models. Pisces outperforms the state-of-the-art synchronous and asynchronous alternatives, reducing the time-to-accuracy by up to 2.0X and 1.9X, respectively.
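The abstract describes a selector that scores clients by balancing data quality (utility), speed, and staleness. The paper's actual scoring formula is not given here; the sketch below is a hypothetical illustration of that idea, with made-up field names (`utility`, `latency`, `staleness`) and an invented staleness penalty, not Pisces's implementation.

```python
def client_score(utility, round_latency, staleness, staleness_weight=0.5):
    """Illustrative score (higher is better): prefer high-utility, fast,
    fresh clients. The weighting here is an assumption, not Pisces's formula."""
    speed = 1.0 / max(round_latency, 1e-6)           # faster clients score higher
    penalty = (1.0 + staleness) ** staleness_weight  # discount stale computation
    return utility * speed / penalty

def select_clients(clients, k):
    """Pick the k highest-scoring clients for the next training iteration."""
    ranked = sorted(
        clients,
        key=lambda c: client_score(c["utility"], c["latency"], c["staleness"]),
        reverse=True,
    )
    return ranked[:k]
```

Under this sketch, a slow client or one whose update would be very stale is deprioritized even if its data utility is high, which is the tradeoff the abstract attributes to the selection strategy.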
Citation
Jiang, Z., Wang, W., Li, B., & Li, B. (2022). Pisces: Efficient Federated Learning via Guided Asynchronous Training. In SoCC 2022 - Proceedings of the 13th Symposium on Cloud Computing (pp. 370–385). Association for Computing Machinery, Inc. https://doi.org/10.1145/3542929.3563463