CoDL: Efficient CPU-GPU Co-execution for Deep Learning Inference on Mobile Devices

Abstract

Concurrent inference execution on heterogeneous processors is critical to improving the performance of increasingly heavy deep learning (DL) models. However, available inference frameworks either use only one processor at a time or achieve little speedup from concurrent execution compared to using a single processor. This is due to the challenges of 1) reducing data sharing overhead, and 2) properly partitioning each operator between processors. By solving these challenges, we propose CoDL, a concurrent DL inference framework for the CPU and GPU on mobile devices. It can fully utilize the heterogeneous processors to accelerate each operator of a model. It integrates two novel techniques: 1) hybrid-type-friendly data sharing, which allows each processor to use its efficient data type for inference; to reduce data sharing overhead, we also propose hybrid-dimension partitioning and operator chain methods; 2) non-linearity- and concurrency-aware latency prediction, which directs proper operator partitioning by building an extremely light-weight but accurate latency predictor for each processor. Based on these two techniques, we build the end-to-end CoDL inference framework and evaluate it on different DL models. The results show up to 4.93× speedup and 62.3% energy saving compared with the state-of-the-art concurrent execution system.
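To make the partitioning idea concrete, the following is an illustrative sketch (not the authors' code) of latency-prediction-guided operator partitioning: split one operator along a single dimension (e.g. output height) so that the CPU and GPU finish at roughly the same time. The linear latency models here are hypothetical stand-ins; CoDL's actual predictors are non-linearity- and concurrency-aware.

```python
def predict_latency_cpu(rows: int) -> float:
    # Hypothetical model: small launch overhead plus a per-row cost.
    return 0.5 + 0.02 * rows

def predict_latency_gpu(rows: int) -> float:
    # Hypothetical model: larger launch overhead, cheaper per-row cost.
    return 2.0 + 0.005 * rows

def best_partition(total_rows: int):
    """Try every split along one dimension and keep the one that
    minimizes the max of the two processors' predicted latencies
    (the makespan of the concurrently executed operator)."""
    best = None
    for cpu_rows in range(total_rows + 1):
        gpu_rows = total_rows - cpu_rows
        lat = max(predict_latency_cpu(cpu_rows) if cpu_rows else 0.0,
                  predict_latency_gpu(gpu_rows) if gpu_rows else 0.0)
        if best is None or lat < best[1]:
            best = (cpu_rows, lat)
    return best  # (rows assigned to the CPU, predicted makespan)

cpu_rows, makespan = best_partition(224)
```

Under these toy models, the balanced split beats running the whole operator on either processor alone, which is the intuition behind co-execution: each operator's work is divided in proportion to the processors' predicted throughput.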

Citation (APA)

Jia, F., Zhang, D., Cao, T., Jiang, S., Liu, Y., Ren, J., & Zhang, Y. (2022). CoDL: Efficient CPU-GPU Co-execution for Deep Learning Inference on Mobile Devices. In MobiSys 2022 - Proceedings of the 2022 20th Annual International Conference on Mobile Systems, Applications and Services (pp. 209–221). Association for Computing Machinery, Inc. https://doi.org/10.1145/3498361.3538932
