HyperSched: Dynamic Resource Reallocation for Model Development on a Deadline


Abstract

Prior research in resource scheduling for machine learning training workloads has largely focused on minimizing job completion times. These model training workloads typically search collectively over a large number of parameter values that control the learning process, known as a hyperparameter search. It is preferable to identify and maximally provision the best-performing hyperparameter configuration (trial) to reach the highest accuracy as soon as possible. To optimally trade off evaluating multiple configurations against training the most promising ones by a fixed deadline, we design and build HyperSched, a dynamic application-level resource scheduler that tracks, identifies, and preferentially allocates resources to the best-performing trials to maximize accuracy by the deadline. HyperSched leverages three properties of a hyperparameter search workload overlooked in prior work (trial disposability, progressively identifiable rankings among different configurations, and space-time constraints) to outperform standard hyperparameter search algorithms across a variety of benchmarks.
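The three workload properties named in the abstract can be illustrated with a toy simulation. The sketch below is not HyperSched's actual algorithm; it is a hypothetical deadline-aware allocation loop in which a fixed step budget is split into rounds, the worst-ranked trials are dropped each round (trial disposability, using rankings that become identifiable as training progresses), and the freed budget is reallocated to the survivors so the most promising configuration trains the most by the deadline. The `simulated_accuracy` function and all names are invented for illustration.

```python
def simulated_accuracy(quality, steps):
    # Hypothetical stand-in for real training: accuracy rises with
    # training steps, scaled by the configuration's quality in (0, 1].
    return (1 - 1 / (1 + 0.01 * steps)) * quality

def deadline_search(configs, total_budget, rounds):
    """Toy deadline-aware search: split the step budget into rounds;
    after each round, drop the bottom half of trials (disposability)
    and give their budget share to the survivors (reallocation)."""
    trials = [{"cfg": c, "steps": 0} for c in configs]
    per_round = total_budget // rounds
    for _ in range(rounds):
        # Reallocation: the per-round budget is split among the
        # trials that are still alive, so survivors get more steps.
        share = per_round // len(trials)
        for t in trials:
            t["steps"] += share
        # Progressively identifiable rankings: score trials on the
        # accuracy observed so far and keep only the top half.
        trials.sort(key=lambda t: simulated_accuracy(t["cfg"], t["steps"]),
                    reverse=True)
        if len(trials) > 1:
            trials = trials[: max(1, len(trials) // 2)]
    best = trials[0]
    return best["cfg"], best["steps"]

best_cfg, best_steps = deadline_search([0.1, 0.3, 0.9, 0.5],
                                       total_budget=4000, rounds=3)
# The best configuration (0.9) survives every round and absorbs the
# freed budget, ending with far more steps than an even split would give.
```

Under an even static split, each of the four trials would get only 1000 steps; here the surviving trial accumulates over twice that, which is the intuition behind preferentially provisioning the best trial under a deadline.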

Citation (APA)

Liaw, R., Bhardwaj, R., Dunlap, L., Zou, Y., Gonzalez, J. E., Stoica, I., & Tumanov, A. (2019). HyperSched: Dynamic Resource Reallocation for Model Development on a Deadline. In SoCC 2019 - Proceedings of the ACM Symposium on Cloud Computing (pp. 61–73). Association for Computing Machinery. https://doi.org/10.1145/3357223.3362719
