DREW: Efficient Winograd CNN Inference with Deep Reuse

Ruofan Wu; Feng Zhang; Jiawei Guan; Zhen Zheng; Xiaoyong Du; Xipeng Shen

Conference Proceedings, Open Access

WWW 2022 - Proceedings of the ACM Web Conference 2022 (2022) 1807-1816

DOI: 10.1145/3485447.3511985

12 Citations

6 Readers

Abstract

Deep learning is used in many domains, including Web services. Convolutional neural networks (CNNs), a representative class of deep learning models, are among the most popular neural networks in Web systems. However, CNNs are computationally intensive. Compared with training, inference is more often performed on low-power computing equipment. Limited computing resources combined with high computational demand restrict the effective use of CNN algorithms in industry. Fortunately, a minimal filtering algorithm called Winograd can reduce the cost of convolution by minimizing the number of multiplications. We find that Winograd convolution can be sped up further with the deep reuse technique, which reuses similar data and computation. In this paper, we propose a new inference method, called DREW, which combines deep reuse with Winograd to further accelerate CNNs. DREW addresses three difficulties. First, it detects similarities within the complex minimal filtering patterns by clustering. Second, it keeps the online clustering cost within a reasonable range. Third, it provides an adjustable clustering granularity that balances performance and accuracy. Experiments show that 1) DREW accelerates Winograd convolution by a further 2.06× on average; 2) when DREW is applied to end-to-end Winograd CNN inference, it achieves an average speedup of 1.71× with almost no (<0.4%) accuracy loss; 3) DREW reduces the number of convolution operations to 11% of the original on average.
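The two ideas named in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. The first part is the standard one-dimensional Winograd F(2,3) minimal filtering step, which produces two outputs of a 3-tap filter with 4 multiplications instead of 6. The second part, `reuse_winograd`, is a hypothetical stand-in for deep reuse: it assumes that "similar" tiles can be grouped by rounding each value to a grid of width `step`, whereas the paper uses clustering. All function names and the `step` parameter are assumptions made for illustration.

```python
def direct_f23(d, g):
    """Direct computation: 2 outputs of a 3-tap filter, 6 multiplications."""
    return [d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
            d[1] * g[0] + d[2] * g[1] + d[3] * g[2]]


def transform_filter(g):
    """Filter transform for F(2,3); computed once per filter, offline."""
    return [g[0], (g[0] + g[1] + g[2]) / 2, (g[0] - g[1] + g[2]) / 2, g[2]]


def transform_input(d):
    """Input transform for F(2,3); additions and subtractions only."""
    return [d[0] - d[2], d[1] + d[2], d[2] - d[1], d[1] - d[3]]


def winograd_f23(d, u):
    """Winograd F(2,3) given a 4-element tile d and a transformed filter u."""
    v = transform_input(d)
    m = [vi * ui for vi, ui in zip(v, u)]  # the only 4 multiplications
    return [m[0] + m[1] + m[2], m[1] - m[2] - m[3]]


def reuse_winograd(tiles, g, step=0.5):
    """Hypothetical reuse sketch: tiles that round to the same grid point
    share one Winograd result. Grid rounding is an assumption standing in
    for the clustering described in the abstract."""
    u = transform_filter(g)
    cache = {}
    outputs = []
    for d in tiles:
        key = tuple(round(x / step) for x in d)
        if key not in cache:
            representative = [k * step for k in key]
            cache[key] = winograd_f23(representative, u)
        outputs.append(cache[key])
    # Return the per-tile outputs and how many tiles were actually computed.
    return outputs, len(cache)


if __name__ == "__main__":
    g = [0.5, 1.0, -1.0]
    d = [1.0, 2.0, 3.0, 4.0]
    print(direct_f23(d, g), winograd_f23(d, transform_filter(g)))
    tiles = [d, [1.1, 2.1, 2.9, 4.1], [0.0, 0.0, 1.0, 0.0]]
    print(reuse_winograd(tiles, g))
```

A larger `step` merges more tiles and saves more computation at the cost of larger approximation error, which loosely mirrors the adjustable clustering granularity the abstract describes.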

Cite


Wu, R., Zhang, F., Guan, J., Zheng, Z., Du, X., & Shen, X. (2022). DREW: Efficient Winograd CNN Inference with Deep Reuse. In WWW 2022 - Proceedings of the ACM Web Conference 2022 (pp. 1807–1816). Association for Computing Machinery, Inc. https://doi.org/10.1145/3485447.3511985
