Reducing tail latency using duplication: A multi-layered approach

Abstract

Duplication can be a powerful strategy for overcoming stragglers in cloud services, but it is often used conservatively because of the risk of overloading the system. We call for making duplication a first-class concept in cloud systems, and make two contributions in this regard. First, we present duplicate-aware scheduling (DAS), an aggressive duplication policy that duplicates every job but keeps the system safe by providing suitable support (prioritization and purging) at multiple layers of the cloud system. Second, we present the D-Stage abstraction, which supports DAS and other duplication policies across diverse layers of a cloud system (e.g., network and storage). The D-Stage abstraction decouples the duplication policy from the mechanism and facilitates working with legacy layers of a system. Using this abstraction, we evaluate the benefits of DAS for two data-parallel applications (HDFS and an in-memory workload generator) and a network function (a Snort-based IDS cluster). Our experiments on the public cloud and Emulab show that DAS is safe to use and that the tail latency improvement holds across a wide range of workloads.
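
The abstract names two supports that keep aggressive duplication safe: prioritization and purging. The Python sketch below is only an illustration of how those two ideas can fit together, not the authors' implementation or the D-Stage interface. The names `Server`, `dispatch`, and `complete`, the two-level priority scheme, and the lazy-deletion purge are all assumptions made for this example.

```python
import heapq
import itertools

# Hypothetical priority levels: a lower value is served first.
PRIMARY, DUPLICATE = 0, 1


class Server:
    """One queue that serves primaries strictly before duplicates."""

    def __init__(self, name):
        self.name = name
        self._heap = []
        self._seq = itertools.count()  # FIFO tie-breaker within a priority
        self._purged = set()

    def enqueue(self, job_id, priority):
        heapq.heappush(self._heap, (priority, next(self._seq), job_id))

    def purge(self, job_id):
        # Lazy deletion: the entry is skipped when it reaches the head.
        self._purged.add(job_id)

    def next_job(self):
        """Return (job_id, priority) of the next live job, or None."""
        while self._heap:
            priority, _, job_id = heapq.heappop(self._heap)
            if job_id not in self._purged:
                return job_id, priority
        return None


def dispatch(job_id, primary, secondary):
    """Duplicate every job: full priority on one server, low on the other."""
    primary.enqueue(job_id, PRIMARY)
    secondary.enqueue(job_id, DUPLICATE)


def complete(job_id, servers):
    """Once any copy finishes, purge the remaining copies everywhere."""
    for server in servers:
        server.purge(job_id)


if __name__ == "__main__":
    a, b = Server("a"), Server("b")
    dispatch("j1", primary=a, secondary=b)
    dispatch("j2", primary=b, secondary=a)
    print(a.next_job())        # primary copy of j1 is served first on a
    complete("j1", [a, b])     # j1's duplicate on b is purged
    print(b.next_job())        # b skips the purged duplicate and serves j2
```

In this toy model a duplicate never delays a primary on the same server, and a finished job leaves no pending copies behind, which is the intuition behind why duplicating every job need not overload the system.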

Cite

Bashir, H. M., Faisal, A. B., Jamshed, M. A., Vondras, P., Iftikhar, A. M., Qazi, I. A., & Dogar, F. R. (2019). Reducing tail latency using duplication: A multi-layered approach. In CoNEXT 2019 - Proceedings of the 15th International Conference on Emerging Networking Experiments and Technologies (pp. 246–259). Association for Computing Machinery, Inc. https://doi.org/10.1145/3359989.3365432