A Theory of Auto-Scaling for Resource Reservation in Cloud Services

Konstantinos Psychas; Javad Ghaderi

Journal Article, Open Access

Stochastic Systems (2022) 12(3) 227-252

DOI: 10.1287/stsy.2021.0091

2 Citations

14 Readers

Abstract

We consider a distributed server system consisting of a large number of servers, each with limited capacity on multiple resources (CPU, memory, etc.). Jobs with different rewards arrive over time and require certain amounts of resources for the duration of their service. When a job arrives, the system must decide whether to admit or reject it and, if it is admitted, on which server to schedule it. The objective is to maximize the expected total reward received by the system. This problem is motivated by the control of cloud computing clusters, in which jobs are requests for virtual machines (VMs) or containers that reserve resources for various services, and rewards represent the service priority of requests or the price paid per time unit of service. We study this problem in an asymptotic regime where the number of servers and the jobs’ arrival rates scale by a factor L, as L becomes large. We propose a resource reservation policy that asymptotically achieves at least 1/2 and, under a certain monotone property on jobs’ rewards and resources, at least 1 − 1/e of the optimal expected reward. The policy automatically scales the number of VM slots for each job type as the demand changes and decides in which servers the slots should be created in advance, without knowledge of the traffic rates.
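The admit-or-reject setting described in the abstract can be illustrated with a small sketch. This is not the authors' policy: it only shows, under assumed names and numbers, what it means to create slots for job types in advance on a server with multi-resource capacity and to admit an arriving job only when a free slot of its type exists. The `DEMANDS` table, the `Server` class, and the `admit` and `release` helpers are all hypothetical, and the sketch ignores rewards and the automatic scaling of slot counts.

```python
from dataclasses import dataclass, field

# Hypothetical per-type resource demands; these numbers are not from the paper.
DEMANDS = {
    "small": {"cpu": 1, "mem": 2},
    "large": {"cpu": 4, "mem": 16},
}


@dataclass
class Server:
    capacity: dict                              # e.g. {"cpu": 8, "mem": 32}
    slots: dict = field(default_factory=dict)   # job type -> slots reserved in advance
    busy: dict = field(default_factory=dict)    # job type -> slots currently occupied

    def reserved(self, resource):
        """Total amount of one resource held by all reserved slots."""
        return sum(n * DEMANDS[t][resource] for t, n in self.slots.items())

    def reserve_slot(self, job_type):
        """Create one more slot of job_type if it fits on every resource."""
        demand = DEMANDS[job_type]
        if any(self.reserved(r) + demand[r] > self.capacity[r] for r in demand):
            return False
        self.slots[job_type] = self.slots.get(job_type, 0) + 1
        return True


def admit(servers, job_type):
    """Place the job in the first free slot of its type; None means reject."""
    for index, server in enumerate(servers):
        if server.busy.get(job_type, 0) < server.slots.get(job_type, 0):
            server.busy[job_type] = server.busy.get(job_type, 0) + 1
            return index
    return None


def release(server, job_type):
    """Free a slot when a job of job_type finishes service."""
    server.busy[job_type] -= 1
```

For example, a server with capacity `{"cpu": 8, "mem": 32}` can hold two "large" slots under these assumed demands; a third "large" arrival is then rejected until one of the two running jobs departs.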

Cite

Psychas, K., & Ghaderi, J. (2022). A Theory of Auto-Scaling for Resource Reservation in Cloud Services. Stochastic Systems, 12(3), 227–252. https://doi.org/10.1287/stsy.2021.0091
