Efficient execution of multiple CUDA applications using transparent suspend, resume and migration

Abstract

GPUs are now mainstream high-performance processors that offer rich computational and bandwidth resources. However, an individual GPU application typically does not use all of a GPU's resources, so running multiple applications concurrently can reduce total execution time and energy by multiplexing them onto underutilized resources. Although modern GPU features such as Hyper-Q allow such concurrent execution, it risks exhausting device memory and thus crashing the application or even the entire node. Our Mobile CUDA realizes safe, concurrent execution of multiple unmodified CUDA applications using a transparent checkpointing approach, and achieves both improved throughput and energy savings for a mix of applications with different GPU resource requirements on multiple GPUs. A performance evaluation using the Rodinia benchmark suite shows that Mobile CUDA reduces total execution time by 18.4% and total energy by 5.5% on mixed workloads.

Cite

Suzuki, T., Nukada, A., & Matsuoka, S. (2015). Efficient execution of multiple CUDA applications using transparent suspend, resume and migration. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9233, pp. 687–699). Springer Verlag. https://doi.org/10.1007/978-3-662-48096-0_53
