Layup

  • Jiang W
  • Ma Y
  • Liu B
  • et al.
Citations: N/A
Readers: 8 (Mendeley users who have this article in their library)

Abstract

Although GPUs have emerged as the mainstream accelerators for convolutional neural network (CNN) training, their physical memory is usually limited, which makes it hard to train large-scale CNN models. Many memory optimization methods have been proposed to decrease the memory consumption of CNNs and to accommodate the increasing scale of these networks; however, this optimization comes at the cost of a noticeable drop in time performance. We propose a new memory optimization strategy named Layup that achieves both better memory efficiency and better time performance. First, a fast layer-type-specific method for memory optimization is presented, based on the new finding that a single memory optimization often shows dramatic differences in time performance across different types of layers. Second, a new memory reuse method is presented that pays greater attention to multi-type intermediate data such as convolutional workspaces and cuDNN handle data. Experiments show that Layup can significantly increase the scale of extra-deep network models on a single GPU with lower performance loss. It can even train a ResNet with 2,504 layers using 12 GB of memory, outperforming the state-of-the-art SuperNeurons, which reaches 1,920 layers (batch size = 16).
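The first idea in the abstract, selecting a memory optimization per layer type according to its measured time cost, can be illustrated with a small sketch. The code below is illustrative only and does not reproduce Layup's actual algorithm; the layer types, strategy names, and cost numbers are hypothetical assumptions introduced for the example.

    # Illustrative sketch only: a toy selector that picks a memory-optimization
    # strategy (e.g., recomputation vs. CPU offloading) per layer type by weighing
    # measured time overhead against memory savings. All numbers, layer types, and
    # strategy names are hypothetical and are not taken from the Layup paper.

    from dataclasses import dataclass

    @dataclass
    class StrategyProfile:
        name: str                 # e.g., "recompute" or "offload"
        time_overhead_ms: float   # measured extra time per training iteration
        memory_saved_mb: float    # feature-map memory released by the strategy

    def pick_strategy(profiles, memory_needed_mb):
        """Pick the cheapest strategy (in time) that frees enough memory.

        Falls back to the strategy with the largest saving if none is sufficient.
        """
        sufficient = [p for p in profiles if p.memory_saved_mb >= memory_needed_mb]
        if sufficient:
            return min(sufficient, key=lambda p: p.time_overhead_ms)
        return max(profiles, key=lambda p: p.memory_saved_mb)

    # Hypothetical per-layer-type measurements: the same optimization can have very
    # different time costs on different layer types, which is the observation that
    # motivates a layer-type-specific choice.
    measurements = {
        "conv": [
            StrategyProfile("recompute", time_overhead_ms=9.0, memory_saved_mb=64.0),
            StrategyProfile("offload",   time_overhead_ms=3.5, memory_saved_mb=64.0),
        ],
        "relu": [
            StrategyProfile("recompute", time_overhead_ms=0.2, memory_saved_mb=16.0),
            StrategyProfile("offload",   time_overhead_ms=2.8, memory_saved_mb=16.0),
        ],
    }

    if __name__ == "__main__":
        for layer_type, profiles in measurements.items():
            chosen = pick_strategy(profiles, memory_needed_mb=16.0)
            print(f"{layer_type}: use {chosen.name} "
                  f"(+{chosen.time_overhead_ms} ms, frees {chosen.memory_saved_mb} MB)")

With hypothetical numbers like these, the selector chooses offloading for convolution layers but cheap recomputation for activation layers. The abstract's second idea, reusing multi-type intermediate data such as convolutional workspaces across layers instead of allocating them per layer, follows the same spirit of trading a small bookkeeping cost for a large reduction in peak GPU memory.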

Cite

CITATION STYLE

APA

Jiang, W., Ma, Y., Liu, B., Liu, H., Zhou, B. B., Zhu, J., … Jin, H. (2019). Layup. ACM Transactions on Architecture and Code Optimization, 16(4), 1–23. https://doi.org/10.1145/3357238
