Tailor: Altering Skip Connections for Resource-Efficient Inference

Abstract

Deep neural networks use skip connections to improve training convergence. However, these skip connections are costly in hardware, requiring extra buffers and increasing on- and off-chip memory utilization and bandwidth requirements. In this article, we show that skip connections can be optimized for hardware when tackled with a hardware-software codesign approach. We argue that while a network's skip connections are needed for the network to learn, they can later be removed or shortened to provide a more hardware-efficient implementation with minimal to no accuracy loss. We introduce Tailor, a codesign tool whose hardware-aware training algorithm gradually removes or shortens a fully trained network's skip connections to lower the hardware cost. Tailor improves resource utilization by up to 34% for block random access memories (BRAMs), 13% for flip-flops (FFs), and 16% for look-up tables (LUTs) for on-chip, dataflow-style architectures. Tailor increases performance by 30% and reduces memory bandwidth by 45% for a two-dimensional processing element array architecture.
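The abstract only summarizes the approach, so the following is a minimal, hypothetical sketch of the underlying idea: scale each skip path by a factor that is gradually annealed toward zero during fine-tuning, so the connection can ultimately be dropped from the hardware implementation. This is not Tailor's published algorithm; the class name `AnnealedSkipBlock`, the linear schedule, and all layer choices are illustrative assumptions.

```python
# Illustrative sketch only -- NOT Tailor's actual training algorithm.
# It shows one way a skip connection could be attenuated during fine-tuning
# so that the hardware no longer needs to buffer the skip path.
import torch
import torch.nn as nn


class AnnealedSkipBlock(nn.Module):
    """Residual block whose skip path is scaled by `alpha`.

    alpha = 1.0 reproduces the original skip connection;
    alpha = 0.0 removes it entirely.
    """

    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)
        # Not a trained parameter: the fine-tuning schedule sets it directly.
        self.register_buffer("alpha", torch.tensor(1.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # Scaled skip connection: identity path fades out as alpha -> 0.
        return self.relu(out + self.alpha * x)


def anneal_skips(model: nn.Module, epoch: int, total_epochs: int) -> None:
    """Linearly ramp every block's skip weight from 1 down to 0 over fine-tuning."""
    alpha = max(0.0, 1.0 - epoch / total_epochs)
    for m in model.modules():
        if isinstance(m, AnnealedSkipBlock):
            m.alpha.fill_(alpha)
```

Once the schedule reaches zero and accuracy has been recovered by fine-tuning, the block can be re-exported without the skip path at all, which is what removes the extra buffering and bandwidth cost described above.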

Cite

Weng, O., Marcano, G., Loncar, V., Khodamoradi, A., Abarajithan, G., Sheybani, N., … Kastner, R. (2024). Tailor: Altering Skip Connections for Resource-Efficient Inference. ACM Transactions on Reconfigurable Technology and Systems, 17(1). https://doi.org/10.1145/3624990
