Abstract
Managing and limiting cluster resource usage is a critical task for computing clusters with a large number of users. By enforcing usage limits, cluster managers are able to ensure fair availability for all users, bill users accordingly, and prevent the abuse of cluster resources. As this is such a common problem, there are naturally many existing solutions. However, to allow for greater control over usage accounting and submission behavior in Slurm, we present a system composed of: a web API which exposes accounting data; Slurm plugins that communicate with a REST-like HTTP implementation of that API; and client tools that use it to report usage. Key advantages of our system include a customizable resource accounting formula based on job parameters, preemptive blocking of user jobs at submission time, project-level and user-level resource limits, and support for the development of other web and command-line clients that query the extensible web API. We deployed this system on Berkeley Research Computing's institutional cluster, Savio, allowing us to automatically collect and store accounting data, and thereby easily enforce our cluster usage policy.
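As a rough illustration of two features named in the abstract (a usage formula based on job parameters, and blocking at submission time against user-level and project-level limits), the following is a minimal sketch and not the authors' implementation. The partition names, charge rates, the `Job` fields, and the functions `job_cost` and `can_submit` are all hypothetical; in the described system the remaining allowances would presumably come from the web API rather than being passed in directly.

```python
from dataclasses import dataclass

# Hypothetical charge rates in service units per CPU-hour; not from the paper.
PARTITION_RATES = {"standard": 1.0, "bigmem": 2.0}


@dataclass
class Job:
    user: str
    project: str
    partition: str
    cpus: int
    time_limit_hours: float


def job_cost(job: Job) -> float:
    """Estimate the charge from job parameters (assumed formula)."""
    return job.cpus * job.time_limit_hours * PARTITION_RATES[job.partition]


def can_submit(job: Job, user_remaining: float, project_remaining: float) -> bool:
    """Accept the job only if both the user and project allowances cover it."""
    cost = job_cost(job)
    return cost <= user_remaining and cost <= project_remaining
```

A submit-time hook would call `can_submit` with the requested resources and reject the job before it enters the queue when the check fails, which is the sense in which the blocking is preemptive.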
Li, M., Chan, N., Chandra, V., & Muriki, K. (2020). Cluster Usage Policy Enforcement Using Slurm Plugins and an HTTP API. In ACM International Conference Proceeding Series (pp. 232–238). Association for Computing Machinery. https://doi.org/10.1145/3311790.3397341