Abstract
Multi-tenancy in modern datacenters is currently limited to a single latency-critical, interactive service, running alongside one or more low-priority, best-effort jobs. This limits the efficiency gains from multi-tenancy, especially as an increasing number of cloud applications are shifting from batch jobs to services with strict latency requirements. We present PARTIES, a QoS-aware resource manager that enables an arbitrary number of interactive, latency-critical services to share a physical node without QoS violations. PARTIES leverages a set of hardware and software resource partitioning mechanisms to adjust allocations dynamically at runtime, in a way that meets the QoS requirements of each co-scheduled workload, and maximizes throughput for the machine. We evaluate PARTIES on state-of-the-art server platforms across a set of diverse interactive services. Our results show that PARTIES improves throughput under QoS by 61% on average, compared to existing resource managers, and that the rate of improvement increases with the number of co-scheduled applications per physical host.
Cite
Chen, S., Delimitrou, C., & Martinez, J. F. (2019). PARTIES: QoS-Aware Resource Partitioning for Multiple Interactive Services. In International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS (pp. 107–120). Association for Computing Machinery. https://doi.org/10.1145/3297858.3304005