Kernel Mapping Techniques for Deep Learning Neural Network Accelerators


Abstract

Deep learning applications are compute-intensive and naturally parallel; this has spurred the development of new processor architectures tuned for the workload. In this paper, we consider structural differences between deep learning neural networks and more conventional circuits, highlighting how these differences impact strategies for mapping neural network compute kernels onto available hardware. We present an efficient mapping approach based on dynamic programming, along with a method for establishing performance bounds. We also propose an architectural approach to extend the practical lifetime of hardware accelerators, enabling the integration of a variety of heterogeneous processors into a high-performance system. Experimental results using benchmarks from a recent ISPD contest are also reported.
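
The abstract names dynamic programming as the basis of the mapping approach but gives no details of the formulation. As a hedged illustration only (the paper's actual cost model, hardware model, and objective are not stated here), one common DP formulation of kernel-to-accelerator mapping partitions a chain of kernels into contiguous groups, one per tile, minimizing the heaviest tile's load; the per-kernel costs and tile count below are hypothetical placeholders:

```python
from functools import lru_cache

def min_makespan(costs, tiles):
    """Partition a chain of kernels (with given compute costs) into
    `tiles` contiguous groups, minimizing the heaviest group's load.
    This is a textbook linear-partition DP, not the paper's algorithm."""
    n = len(costs)
    # Prefix sums let us compute any contiguous group's load in O(1).
    prefix = [0]
    for c in costs:
        prefix.append(prefix[-1] + c)

    @lru_cache(maxsize=None)
    def dp(i, t):
        # dp(i, t): best makespan for kernels i..n-1 spread over t tiles.
        if t == 1:
            return prefix[n] - prefix[i]
        best = float("inf")
        # First tile takes kernels i..j-1; recurse on the remainder.
        for j in range(i + 1, n - t + 2):
            load = prefix[j] - prefix[i]
            best = min(best, max(load, dp(j, t - 1)))
        return best

    return dp(0, tiles)

# Hypothetical per-kernel costs (e.g., MAC counts) on 3 identical tiles.
print(min_makespan([4, 8, 2, 6, 3, 7], 3))  # → 12 (groups [4,8] [2,6,3] [7])
```

The memoized recursion runs in O(n² · tiles) time; a heterogeneous-processor variant would simply make the group load a function of which tile it is assigned to.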

Citation (APA)

Özdemir, S., Khasawneh, M., Rao, S., & Madden, P. H. (2022). Kernel Mapping Techniques for Deep Learning Neural Network Accelerators. In Proceedings of the International Symposium on Physical Design (pp. 21–28). Association for Computing Machinery. https://doi.org/10.1145/3505170.3506730
