Tailor: Altering Skip Connections for Resource-Efficient Inference

Abstract

Deep neural networks use skip connections to improve training convergence. However, these skip connections are costly in hardware, requiring extra buffers and increasing on- and off-chip memory utilization and bandwidth requirements. In this article, we show that skip connections can be optimized for hardware when tackled with a hardware-software codesign approach. We argue that while a network's skip connections are needed for the network to learn, they can later be removed or shortened to provide a more hardware-efficient implementation with minimal to no accuracy loss. We introduce Tailor, a codesign tool whose hardware-aware training algorithm gradually removes or shortens a fully trained network's skip connections to lower the hardware cost. Tailor improves resource utilization by up to 34% for block random access memories (BRAMs), 13% for flip-flops (FFs), and 16% for look-up tables (LUTs) for on-chip, dataflow-style architectures. Tailor increases performance by 30% and reduces memory bandwidth by 45% for a two-dimensional processing element array architecture.
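The abstract only summarizes the approach, so the following is a minimal, hypothetical sketch of the underlying idea: scale each skip path by a factor that is gradually annealed toward zero during fine-tuning, so the connection can ultimately be dropped from the hardware implementation. This is not Tailor's published algorithm; the class name `AnnealedSkipBlock`, the linear schedule, and all layer choices are illustrative assumptions.

```python
# Illustrative sketch only -- NOT Tailor's actual training algorithm.
# It shows one way a skip connection could be attenuated during fine-tuning
# so that the hardware no longer needs to buffer the skip path.
import torch
import torch.nn as nn


class AnnealedSkipBlock(nn.Module):
    """Residual block whose skip path is scaled by `alpha`.

    alpha = 1.0 reproduces the original skip connection;
    alpha = 0.0 removes it entirely.
    """

    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)
        # Not a trained parameter: the fine-tuning schedule sets it directly.
        self.register_buffer("alpha", torch.tensor(1.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # Scaled skip connection: identity path fades out as alpha -> 0.
        return self.relu(out + self.alpha * x)


def anneal_skips(model: nn.Module, epoch: int, total_epochs: int) -> None:
    """Linearly ramp every block's skip weight from 1 down to 0 over fine-tuning."""
    alpha = max(0.0, 1.0 - epoch / total_epochs)
    for m in model.modules():
        if isinstance(m, AnnealedSkipBlock):
            m.alpha.fill_(alpha)
```

Once the schedule reaches zero and accuracy has been recovered by fine-tuning, the block can be re-exported without the skip path at all, which is what removes the extra buffering and bandwidth cost described above.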

Cite

Weng, O., Marcano, G., Loncar, V., Khodamoradi, A., Abarajithan, G., Sheybani, N., … Kastner, R. (2024). Tailor: Altering Skip Connections for Resource-Efficient Inference. ACM Transactions on Reconfigurable Technology and Systems, 17(1). https://doi.org/10.1145/3624990
