gShare: A centralized GPU memory management framework to enable GPU memory sharing for containers


Abstract

Owing to low overhead and rapid deployment, containers are increasingly becoming an attractive system software platform for deep learning and high performance computing (HPC) applications that leverage GPUs. Unfortunately, existing container software does not control how much GPU memory each container allocates. Therefore, if a certain container consumes the majority of GPU memory, other containers may be unable to run their workloads because of insufficient memory. This paper presents gShare, a centralized GPU memory management framework that enables GPU memory sharing for containers. Like a modern operating system, gShare allocates the entire GPU memory inside the framework and manages it with sophisticated memory allocators. gShare is then able to enforce each container's GPU memory limit by mediating its memory allocation calls. To achieve this, gShare introduces API remoting components, a mediator, and a three-level memory allocator, which together enable lightweight and efficient GPU memory management. Our prototype implementation achieves near-native performance with secure isolation and little memory waste on popular deep learning and HPC workloads.
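The core idea the abstract describes, a central mediator that owns the GPU memory pool and checks each container's allocation calls against its configured limit, can be illustrated with a minimal sketch. This is not gShare's actual implementation (the paper's system intercepts real GPU allocation calls via API remoting); the class and method names below are hypothetical, and a simple bump allocator stands in for the paper's three-level allocator:

```python
# Hypothetical sketch of centralized mediation of GPU memory allocations.
# A mediator owns the whole GPU memory pool, tracks per-container usage,
# and rejects allocation requests that would exceed a container's limit.

class GpuMemoryMediator:
    def __init__(self, total_bytes, limits):
        self.total_bytes = total_bytes      # entire GPU memory owned by the framework
        self.limits = dict(limits)          # per-container memory limits (bytes)
        self.used = {c: 0 for c in limits}  # per-container usage counters
        self.next_addr = 0                  # bump allocator, for illustration only

    def alloc(self, container, size):
        """Mediate an allocation call: enforce the container's memory limit."""
        if self.used[container] + size > self.limits[container]:
            return None                     # would exceed this container's limit
        if self.next_addr + size > self.total_bytes:
            return None                     # GPU memory pool exhausted
        addr = self.next_addr
        self.next_addr += size
        self.used[container] += size
        return addr

    def free(self, container, size):
        """Return memory to the container's budget (pool reuse omitted)."""
        self.used[container] -= size

# Usage: two containers share an 8 GiB GPU, each capped at 4 GiB, so neither
# can starve the other even if it requests memory first.
GiB = 1 << 30
m = GpuMemoryMediator(8 * GiB, {"c1": 4 * GiB, "c2": 4 * GiB})
assert m.alloc("c1", 3 * GiB) is not None   # within c1's limit
assert m.alloc("c1", 2 * GiB) is None       # rejected: would exceed c1's 4 GiB cap
assert m.alloc("c2", 4 * GiB) is not None   # c2's budget is unaffected by c1
```

The sketch shows why centralizing allocation matters: because every request passes through one mediator, a greedy container is stopped at its limit rather than exhausting the device for everyone, which is the failure mode the abstract describes for unmediated containers.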


Citation (APA)

Lee, M., Ahn, H., Hong, C. H., & Nikolopoulos, D. S. (2022). gShare: A centralized GPU memory management framework to enable GPU memory sharing for containers. Future Generation Computer Systems, 130, 181–192. https://doi.org/10.1016/j.future.2021.12.016
