Towards real-time DNN inference on mobile platforms with model pruning and compiler optimization

Abstract

High-end mobile platforms are rapidly becoming primary computing devices for a wide range of Deep Neural Network (DNN) applications. However, the constrained computation and storage resources of these devices still pose significant challenges for real-time DNN inference. To address this problem, we propose a set of hardware-friendly structured model pruning and compiler optimization techniques to accelerate DNN execution on mobile devices. This demo shows that these optimizations can enable real-time mobile execution of multiple DNN applications, including style transfer, DNN coloring, and super resolution.
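The abstract's "hardware-friendly structured pruning" refers to removing whole structural units (e.g. entire convolution filters) rather than individual weights, so the remaining tensor stays dense and regular for the compiler to optimize. Below is a minimal sketch of one common variant, magnitude-based filter pruning; the function name and the use of L2 norms as the saliency criterion are illustrative assumptions, not the authors' exact method.

```python
import numpy as np

def prune_filters(weights, ratio):
    """Illustrative structured-pruning sketch (not the paper's exact scheme):
    drop the `ratio` fraction of output filters with the smallest L2 norm.
    `weights` has shape (out_channels, in_channels, kH, kW); the result is a
    smaller dense tensor, which is what makes this pruning hardware-friendly.
    """
    # One L2 norm per output filter.
    norms = np.linalg.norm(weights.reshape(weights.shape[0], -1), axis=1)
    n_keep = max(1, int(round(weights.shape[0] * (1 - ratio))))
    # Keep the n_keep filters with the largest norms, preserving their order.
    keep = np.sort(np.argsort(norms)[-n_keep:])
    return weights[keep], keep

# Hypothetical usage: prune half of the filters in an 8-filter conv layer.
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 3, 3, 3))
pruned, kept = prune_filters(w, 0.5)
print(pruned.shape)  # (4, 3, 3, 3)
```

Because entire filters are removed, the pruned layer is just a smaller dense convolution; a downstream compiler can then generate regular, vectorized code for it, unlike the irregular sparsity left by unstructured weight pruning.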

Cite

CITATION STYLE

APA

Niu, W., Zhao, P., Zhan, Z., Lin, X., Wang, Y., & Ren, B. (2020). Towards real-time DNN inference on mobile platforms with model pruning and compiler optimization. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2021-January, pp. 5306–5308). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2020/778
