A Theory of Auto-Scaling for Resource Reservation in Cloud Services

Abstract

We consider a distributed server system consisting of a large number of servers, each with limited capacity on multiple resources (CPU, memory, etc.). Jobs with different rewards arrive over time and require certain amounts of resources for the duration of their service. When a job arrives, the system must decide whether to admit or reject it and, if it is admitted, on which server to schedule it. The objective is to maximize the expected total reward received by the system. This problem is motivated by the control of cloud computing clusters, in which jobs are requests for virtual machines (VMs) or containers that reserve resources for various services, and rewards represent the service priority of requests or the price paid per time unit of service. We study this problem in an asymptotic regime where the number of servers and the jobs’ arrival rates scale by a factor L, as L becomes large. We propose a resource reservation policy that asymptotically achieves at least 1/2 of the optimal expected reward and, under a certain monotone property of jobs’ rewards and resources, at least 1 − 1/e of it. The policy automatically scales the number of VM slots for each job type as the demand changes and decides in advance on which servers the slots should be created, without knowledge of the traffic rates.
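The system model in the abstract can be illustrated with a small sketch: servers with multi-resource capacity, slots reserved per job type in advance, and admission of a job only when a free slot of its type exists. This is not the policy proposed in the paper. The first-fit placement rule, the `Server` class, and the functions `reserve_slot` and `admit` are hypothetical choices made only to make the model concrete, and rewards and auto-scaling are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    capacity: tuple                             # e.g. (CPU, memory)
    slots: dict = field(default_factory=dict)   # job type -> reserved slots
    busy: dict = field(default_factory=dict)    # job type -> slots in use

    def used(self, demands):
        # Resources held by all reserved slots, occupied or not.
        return tuple(
            sum(n * demands[t][r] for t, n in self.slots.items())
            for r in range(len(self.capacity))
        )

    def can_reserve(self, job_type, demands):
        # A new slot fits only if every resource stays within capacity.
        return all(
            u + d <= c
            for u, d, c in zip(self.used(demands), demands[job_type], self.capacity)
        )


def reserve_slot(servers, job_type, demands):
    """Create one slot on the first server with room (first-fit, an assumption)."""
    for s in servers:
        if s.can_reserve(job_type, demands):
            s.slots[job_type] = s.slots.get(job_type, 0) + 1
            return True
    return False


def admit(servers, job_type):
    """Return the index of a server with a free slot of this type, else None (reject)."""
    for i, s in enumerate(servers):
        if s.busy.get(job_type, 0) < s.slots.get(job_type, 0):
            s.busy[job_type] = s.busy.get(job_type, 0) + 1
            return i
    return None


# Hypothetical job types with (CPU, memory) requirements.
demands = {"small": (1, 2), "large": (4, 8)}
servers = [Server(capacity=(4, 8))]
reserve_slot(servers, "small", demands)
```

In this sketch a job is rejected whenever no slot of its type is free, even if the server has spare raw capacity. That separation between creating slots and admitting jobs is what the abstract's "slots created in advance" refers to; how many slots each type should get is the part the paper's policy decides.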

Cite

Psychas, K., & Ghaderi, J. (2022). A Theory of Auto-Scaling for Resource Reservation in Cloud Services. Stochastic Systems, 12(3), 227–252. https://doi.org/10.1287/stsy.2021.0091
