Reducing tail latency using duplication: A multi-layered approach

Hafiz Mohsin Bashir; Abdullah Bin Faisal; M. Asim Jamshed; Peter Vondras; Ali Musa Iftikhar; Ihsan Ayyub Qazi; Fahad R. Dogar

Conference Proceedings | Open Access

CoNEXT 2019 - Proceedings of the 15th International Conference on Emerging Networking Experiments and Technologies (2019) 246-259

DOI: 10.1145/3359989.3365432

9 Citations

14 Readers

Abstract

Duplication can be a powerful strategy for overcoming stragglers in cloud services, but is often used conservatively because of the risk of overloading the system. We call for making duplication a first-class concept in cloud systems, and make two contributions in this regard. First, we present duplicate-aware scheduling or DAS, an aggressive duplication policy that duplicates every job, but keeps the system safe by providing suitable support (prioritization and purging) at multiple layers of the cloud system. Second, we present the D-Stage abstraction, which supports DAS and other duplication policies across diverse layers of a cloud system (e.g., network, storage, etc.). The D-Stage abstraction decouples the duplication policy from the mechanism, and facilitates working with legacy layers of a system. Using this abstraction, we evaluate the benefits of DAS for two data parallel applications (HDFS, an in-memory workload generator) and a network function (Snort-based IDS cluster). Our experiments on the public cloud and Emulab show that DAS is safe to use, and the tail latency improvement holds across a wide range of workloads.
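The two supports the abstract names for DAS, prioritization and purging, can be illustrated with a minimal sketch. It is not the paper's D-Stage interface: the `Stage` class, the `submit` and `complete` helpers, and the choice to serve duplicates at strictly lower priority than primaries are all assumptions made for illustration.

```python
from collections import deque

PRIMARY, DUPLICATE = 0, 1  # hypothetical priority classes


class Stage:
    """Hypothetical stage with strict two-level priority and purging."""

    def __init__(self, name):
        self.name = name
        self.queues = {PRIMARY: deque(), DUPLICATE: deque()}

    def enqueue(self, job_id, kind):
        self.queues[kind].append(job_id)

    def next_job(self):
        # Assumed policy: a duplicate runs only when no primary is waiting,
        # so duplicating every job cannot delay primary copies in the queue.
        for kind in (PRIMARY, DUPLICATE):
            if self.queues[kind]:
                return self.queues[kind].popleft(), kind
        return None

    def purge(self, job_id):
        # Drop any still-queued copy of a job that has finished elsewhere.
        removed = False
        for q in self.queues.values():
            if job_id in q:
                q.remove(job_id)
                removed = True
        return removed


def submit(job_id, primary_stage, duplicate_stage):
    # Duplicate every job: one primary copy and one duplicate copy.
    primary_stage.enqueue(job_id, PRIMARY)
    duplicate_stage.enqueue(job_id, DUPLICATE)


def complete(job_id, stages):
    # When one copy finishes, purge the others; return the stages purged.
    return [s.name for s in stages if s.purge(job_id)]
```

In this sketch a job submitted with its primary on one stage and its duplicate on another is removed from the second stage's queue as soon as `complete` is called, which is the sense in which purging keeps the extra load bounded.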

Cite


Bashir, H. M., Faisal, A. B., Jamshed, M. A., Vondras, P., Iftikhar, A. M., Qazi, I. A., & Dogar, F. R. (2019). Reducing tail latency using duplication: A multi-layered approach. In CoNEXT 2019 - Proceedings of the 15th International Conference on Emerging Networking Experiments and Technologies (pp. 246–259). Association for Computing Machinery, Inc. https://doi.org/10.1145/3359989.3365432
