Optimal memory-aware backpropagation of deep join networks

Olivier Beaumont; Julien Herrmann; Guillaume Pallez; Alena Shilova

Journal Article (Open Access)

Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences (2020) 378(2166)

DOI: 10.1098/rsta.2019.0049

Abstract

The memory required to train deep learning models can prevent users from considering large models and large batch sizes. In this work, we propose to use techniques from memory-aware scheduling and automatic differentiation (AD) to execute a backpropagation graph with a bounded memory requirement at the cost of extra recomputations. The case of a single homogeneous chain, i.e. a network whose stages are all identical and form a chain, is well understood, and optimal solutions have been proposed in the AD literature. The networks encountered in practice in deep learning are much more diverse, both in shape and in heterogeneity. In this work, we define the class of backpropagation graphs and extend the set of graphs for which a solution minimizing the total number of recomputations can be computed in polynomial time. In particular, we consider join graphs, which correspond to models such as siamese or cross-modal networks. This article is part of a discussion meeting issue 'Numerical algorithms for high-performance computational science'.
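
As an illustration of the homogeneous-chain case mentioned in the abstract, the following minimal sketch computes the smallest number of recomputed forward steps needed to reverse a chain under a memory bound, using the classical checkpointing recurrence from the AD literature. It is not the algorithm of this article for join graphs. The function name `min_recomputations` and the cost model are assumptions made for the example: every stage has unit cost, `slots` counts stored activations including the chain input, and a backward step only needs the input of its own stage.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def min_recomputations(length: int, slots: int) -> int:
    """Minimal number of forward steps needed to backpropagate through a
    homogeneous chain of `length` stages with `slots` activation slots.

    Assumed model (illustrative, not taken from the article): the chain
    input occupies one slot, each forward step costs 1, and the backward
    step of stage i only needs the input of stage i.
    """
    if length < 1 or slots < 1:
        raise ValueError("length and slots must both be at least 1")
    if length == 1:
        # The input of the only stage is already stored.
        return 0
    if slots == 1:
        # Only the chain input is kept: the input of stage i is
        # recomputed from scratch with i - 1 forward steps.
        return length * (length - 1) // 2
    # Advance j steps, checkpoint that activation, reverse the right part
    # with one slot fewer, then reverse the left part with all slots.
    return min(
        j + min_recomputations(length - j, slots - 1) + min_recomputations(j, slots)
        for j in range(1, length)
    )


if __name__ == "__main__":
    for slots in (1, 2, 3, 4):
        print(slots, min_recomputations(10, slots))
```

Under these assumptions the recurrence runs in polynomial time in the chain length and the number of slots, and it shows the trade-off the abstract describes: fewer slots force more recomputation, while one slot per stage leaves only the initial forward sweep.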

Citation (APA)

Beaumont, O., Herrmann, J., Pallez, G., & Shilova, A. (2020). Optimal memory-aware backpropagation of deep join networks. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 378(2166). https://doi.org/10.1098/rsta.2019.0049
