Scrooge: A cost-effective deep learning inference system


Abstract

Advances in deep learning (DL) have prompted the development of cloud-hosted DL-based media applications that process video and audio streams in real time. Such applications must satisfy throughput and latency objectives and adapt to novel types of dynamics, while incurring minimal cost. Scrooge, a system that provides media applications as a service, achieves these objectives by packing computations efficiently into GPU-equipped cloud VMs, using an optimization formulation to find the lowest-cost VM allocations that meet the performance objectives, and rapidly reacting to variations in input complexity (e.g., changes in the number of participants in a video). Experiments show that Scrooge can reduce serving cost by 16-32% (which translates to tens of thousands of dollars per year) relative to the state of the art while meeting latency objectives over 98% of the time under dynamic workloads.
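The abstract's core optimization, finding the lowest-cost set of VM allocations that still meets a performance objective, can be illustrated with a toy sketch. This is not Scrooge's actual formulation; the VM types, hourly prices, throughputs, and the brute-force search below are all hypothetical stand-ins for the paper's optimization.

```python
# Toy sketch (hypothetical, not Scrooge's formulation): choose the
# cheapest mix of GPU VM types whose combined throughput meets a
# target, via exhaustive search over small instance counts.
from itertools import product

# Hypothetical VM types: name -> (hourly cost in $, throughput in frames/sec)
VM_TYPES = {
    "gpu_small": (0.90, 30),
    "gpu_large": (3.06, 120),
}

def cheapest_allocation(target_fps, max_per_type=8):
    """Return (cost, allocation) for the lowest-cost mix meeting target_fps."""
    best = None
    names = list(VM_TYPES)
    for counts in product(range(max_per_type + 1), repeat=len(names)):
        fps = sum(c * VM_TYPES[n][1] for c, n in zip(counts, names))
        if fps < target_fps:
            continue  # infeasible: throughput objective not met
        cost = sum(c * VM_TYPES[n][0] for c, n in zip(counts, names))
        if best is None or cost < best[0]:
            best = (cost, dict(zip(names, counts)))
    return best

# For a 150 fps target, one small + one large VM (150 fps, $3.96/hr)
# beats five small VMs ($4.50/hr) or two large VMs ($6.12/hr).
cost, alloc = cheapest_allocation(150)
```

A real system would also fold in latency constraints and re-solve as input complexity changes, which is where Scrooge's rapid reaction to workload dynamics comes in.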

Citation (APA)

Hu, Y., Ghosh, R., & Govindan, R. (2021). Scrooge: A cost-effective deep learning inference system. In SoCC 2021 - Proceedings of the 2021 ACM Symposium on Cloud Computing (pp. 624–638). Association for Computing Machinery, Inc. https://doi.org/10.1145/3472883.3486993
