CDLRM: Look ahead caching for scalable training of recommendation models

Keshav Balasubramanian; Abdulla Alshabanah; Joshua D. Choe; Murali Annavaram

Conference Proceedings (Open Access)

RecSys 2021 - 15th ACM Conference on Recommender Systems (2021) 263-272

DOI: 10.1145/3460231.3474246

13 Citations

14 Readers

Abstract

Deep learning recommendation models (DLRMs) are typically composed of two sets of parameters: large embedding tables to handle sparse categorical inputs, and neural networks such as multi-layer perceptrons (MLPs) to handle dense non-categorical inputs. Current DLRM training practices keep both sets of parameters in GPU memory. But as the size of the embedding tables grows, this practice of storing model parameters in GPU memory requires dozens or even hundreds of GPUs. This is an unsustainable trend with severe environmental consequences. Furthermore, such a design leaves only a few conglomerates as the gatekeepers of model training. In this work, we propose cDLRM, which democratizes recommendation model training by storing all embedding tables in CPU memory, allowing a user to train on a single GPU regardless of the size of the embedding tables. A CPU-based pre-processor analyzes training batches to prefetch the embedding table slices accessed by those batches and caches them in GPU memory just in time. An associated caching protocol on the GPU enables efficient updates to the cached embedding table parameters. cDLRM decouples the embedding table size demands from the number of GPUs needed for compute. We first demonstrate that with cDLRM it is possible to train a large recommendation model using a single GPU regardless of model size. We then demonstrate that with its unique caching strategy, cDLRM enables pure data-parallel training. We use two publicly available datasets to show that cDLRM achieves identical model accuracy compared to a baseline trained completely on GPUs, while benefiting from a large reduction in GPU demand.
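
The look-ahead idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the caching protocol, so the function names (`prefetch`, `train_step`, `write_back`), the plain SGD update, and the write-back-at-the-end policy are all assumptions made for illustration. Plain Python lists stand in for CPU and GPU memory.

```python
# Illustrative sketch only. All names and the write-back policy are
# assumptions; they are not taken from the cDLRM paper.

def make_table(num_rows, dim):
    """Full embedding table, standing in for CPU memory."""
    return [[0.0] * dim for _ in range(num_rows)]

def prefetch(cpu_table, upcoming_batches):
    """Look ahead: find the unique rows the upcoming batches will access
    and copy only those rows into a small cache (standing in for GPU memory)."""
    needed = sorted({idx for batch in upcoming_batches for idx in batch})
    return {idx: list(cpu_table[idx]) for idx in needed}

def train_step(cache, batch, grads, lr=0.1):
    """Apply a plain SGD update to the cached rows only."""
    for idx, grad in zip(batch, grads):
        row = cache[idx]
        for d in range(len(row)):
            row[d] -= lr * grad[d]

def write_back(cpu_table, cache):
    """Copy the updated cached rows back into the full table."""
    for idx, row in cache.items():
        cpu_table[idx] = list(row)

table = make_table(num_rows=1000, dim=2)
batches = [[3, 7, 3], [7, 42]]       # sparse indices per training batch
cache = prefetch(table, batches)     # only 3 of 1000 rows are cached
for batch in batches:
    grads = [[1.0, 1.0] for _ in batch]  # dummy gradients
    train_step(cache, batch, grads)
write_back(table, cache)
```

In this toy version the cache holds three rows while the full table holds a thousand, which is the decoupling the abstract describes: the memory needed on the accelerator depends on the rows touched by the upcoming batches, not on the size of the embedding tables.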

Cite (APA)

Balasubramanian, K., Alshabanah, A., Choe, J. D., & Annavaram, M. (2021). CDLRM: Look ahead caching for scalable training of recommendation models. In RecSys 2021 - 15th ACM Conference on Recommender Systems (pp. 263–272). Association for Computing Machinery, Inc. https://doi.org/10.1145/3460231.3474246
