Cluster Usage Policy Enforcement Using Slurm Plugins and an HTTP API

Abstract

Managing and limiting cluster resource usage is a critical task for computing clusters with a large number of users. By enforcing usage limits, cluster managers are able to ensure fair availability for all users, bill users accordingly, and prevent the abuse of cluster resources. As this is such a common problem, there are naturally many existing solutions. However, to allow for greater control over usage accounting and submission behavior in Slurm, we present a system composed of: a web API which exposes accounting data; Slurm plugins that communicate with a REST-like HTTP implementation of that API; and client tools that use it to report usage. Key advantages of our system include a customizable resource accounting formula based on job parameters, preemptive blocking of user jobs at submission time, project-level and user-level resource limits, and support for the development of other web and command-line clients that query the extensible web API. We deployed this system on Berkeley Research Computing's institutional cluster, Savio, allowing us to automatically collect and store accounting data, and thereby easily enforce our cluster usage policy.
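The abstract does not give the accounting formula or the plugin logic, so the following is only a minimal sketch of how a formula based on job parameters could be combined with a submission-time check against project-level and user-level limits. Every name here (`Job`, `RATES`, `job_cost`, `check_submission`, the partition names, and the charge rates) is hypothetical and not taken from the paper. The sketch also assumes that the remaining allowances have already been fetched from the web API and that a job is charged for its full requested time limit.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical subset of the parameters a job carries at submission."""
    user: str
    project: str
    cpus: int
    time_limit_hours: float
    partition: str


# Hypothetical charge rates in service units per CPU-hour, per partition.
RATES = {"standard": 1.0, "bigmem": 2.0}


def job_cost(job: Job) -> float:
    """Assumed formula: CPUs x requested hours x partition rate."""
    return job.cpus * job.time_limit_hours * RATES[job.partition]


def check_submission(job: Job, remaining: dict) -> tuple:
    """Accept or reject a job before it is queued.

    `remaining` maps ("project", name) and ("user", name) to the service
    units still available. In a deployed system these values would come
    from the accounting API; here they are passed in directly.
    """
    cost = job_cost(job)
    for scope, name in (("project", job.project), ("user", job.user)):
        if cost > remaining.get((scope, name), 0.0):
            return False, f"{scope} '{name}' lacks allowance for cost {cost}"
    return True, "ok"


if __name__ == "__main__":
    allowances = {("project", "proj_a"): 100.0, ("user", "alice"): 5.0}
    job = Job("alice", "proj_a", cpus=4, time_limit_hours=2.0,
              partition="standard")
    print(check_submission(job, allowances))
```

Checking both scopes before the job enters the queue is what makes the blocking preemptive in this sketch: a job whose worst-case cost exceeds either allowance is rejected at submission rather than cancelled later.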

Cite

Li, M., Chan, N., Chandra, V., & Muriki, K. (2020). Cluster Usage Policy Enforcement Using Slurm Plugins and an HTTP API. In ACM International Conference Proceeding Series (pp. 232–238). Association for Computing Machinery. https://doi.org/10.1145/3311790.3397341
