LoWino: Towards Efficient Low-Precision Winograd Convolutions on Modern CPUs

Guangli Li; Zhen Jia; Xiaobing Feng; Yida Wang

Conference Proceedings, Open Access

ACM International Conference Proceeding Series (2021)

DOI: 10.1145/3472456.3472464

Abstract

Low-precision computation, which is widely supported in contemporary hardware, is considered one of the most effective methods to accelerate convolutional neural networks. However, low-precision computation is not widely used to speed up Winograd, an algorithm for fast convolution computation, due to the numerical error introduced by combining Winograd transformation and quantization. In this paper, we propose a low-precision Winograd convolution approach, LoWino, based on post-training quantization, which employs a linear quantization method in the Winograd domain to reduce the precision loss caused by transformations. Moreover, we present an efficient implementation that integrates well-designed optimization techniques, thereby adequately exploiting the capability of low-precision computation on modern CPUs. We evaluate our approach on Intel Xeon Scalable Processors using representative convolutional layers from prevailing deep neural networks. Experimental results show that LoWino achieves up to 2.04× speedup over state-of-the-art implementations in the vendor library while maintaining accuracy at a reasonable level.
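
To make the idea of quantizing in the Winograd domain concrete, here is a minimal sketch in plain Python. It is not the LoWino implementation: it uses the textbook 1D Winograd F(2,3) transform matrices and assumes a symmetric, per-vector linear scale (maximum absolute value mapped to the largest signed integer), which is an illustrative choice rather than the calibration described in the paper. All function names are hypothetical.

```python
# Textbook 1D Winograd F(2,3): 2 outputs from a 4-element input tile and a 3-tap filter.
BT = [[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]]  # input transform
G = [[1, 0, 0], [0.5, 0.5, 0.5], [0.5, -0.5, 0.5], [0, 0, 1]]     # filter transform
AT = [[1, 1, 1, 0], [0, 1, -1, -1]]                               # output transform


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def quantize(values, bits=8):
    """Symmetric linear quantization (assumed scheme): returns (integers, scale)."""
    qmax = 2 ** (bits - 1) - 1
    peak = max(abs(x) for x in values)
    scale = peak / qmax if peak > 0 else 1.0
    q = [max(-qmax, min(qmax, round(x / scale))) for x in values]
    return q, scale


def direct_conv(d, g):
    """Reference result: plain sliding-window correlation."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


def winograd_f23(d, g):
    """Full-precision Winograd F(2,3)."""
    u, v = matvec(G, g), matvec(BT, d)
    return matvec(AT, [a * b for a, b in zip(u, v)])


def winograd_f23_quantized(d, g, bits=8):
    """Transform first, then quantize the transformed operands (Winograd domain)."""
    u, v = matvec(G, g), matvec(BT, d)
    qu, su = quantize(u, bits)
    qv, sv = quantize(v, bits)
    m_int = [a * b for a, b in zip(qu, qv)]   # integer element-wise product
    m = [x * su * sv for x in m_int]          # dequantize before the output transform
    return matvec(AT, m)


d = [1.0, 2.0, 3.0, 4.0]
g = [0.5, -1.0, 0.25]
exact = direct_conv(d, g)
approx = winograd_f23_quantized(d, g)
max_err = max(abs(a - b) for a, b in zip(exact, approx))
```

Comparing `approx` against `exact` (via `max_err`) shows the error introduced by 8-bit quantization for this single tile. Because the scales are computed after the transforms, the integer range is matched to the values that actually enter the element-wise multiplication.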

Cite

Li, G., Jia, Z., Feng, X., & Wang, Y. (2021). LoWino: Towards Efficient Low-Precision Winograd Convolutions on Modern CPUs. In ACM International Conference Proceeding Series. Association for Computing Machinery. https://doi.org/10.1145/3472456.3472464
