Deep Neural Networks (DNNs) are resilient to reduced data precision, which motivates exploiting low-precision data formats for more efficient computation, especially on custom hardware accelerators. Multiple low-precision types can be mixed to fit the dynamic ranges of different DNN layers. However, these formats are often not supported by popular microprocessors or Deep Learning (DL) frameworks, so developers must manually implement and optimize such novel data types and integrate them with multiple DL framework components, which is tedious and error-prone. This paper first reviews three major challenges in programming mixed-precision DNNs: generating high-performance arithmetic and typecast functions, reducing the recompilation time and bloated binary size caused by excessive template specialization, and optimizing mixed-precision DNN computational graphs. We then present Lowgen, a framework that addresses these challenges. For each challenge, we describe our solution, implemented and tested on our in-house, TensorFlow-like DL framework. Empirical evaluation shows that Lowgen automatically generates efficient data type implementations that enable significant speed-ups, greatly lowering development effort and enhancing research productivity on mixed-precision DNNs.
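To make the first two challenges concrete, the sketch below shows the kind of code a developer would otherwise write by hand: a minimal C++ fixed-point type with saturating multiplication and float typecasts. All names here (FixedPoint, FRAC_BITS) are hypothetical illustrations, not Lowgen's actual output or API. Note how each distinct FRAC_BITS value forces a separate template instantiation, which is exactly the source of the recompilation time and binary bloat the paper identifies.

// Hypothetical sketch of the manual work an automated generator would
// replace: an 8-bit fixed-point type with a float typecast and one
// saturating arithmetic operator. Illustrative only.
#include <cstdint>
#include <algorithm>
#include <cstdio>

template <int FRAC_BITS>            // every distinct FRAC_BITS creates a new
struct FixedPoint {                 // specialization -> recompilation + bloat
    int8_t bits;

    // Quantize a float into the fixed-point range, saturating at int8 limits.
    static FixedPoint FromFloat(float x) {
        float scaled = x * (1 << FRAC_BITS);
        scaled = std::min(std::max(scaled, -128.0f), 127.0f);
        return FixedPoint{static_cast<int8_t>(scaled)};
    }

    // Cast back to float by undoing the fractional scaling.
    float ToFloat() const {
        return static_cast<float>(bits) / (1 << FRAC_BITS);
    }

    // Widen to int16 before multiplying, rescale, then saturate to int8.
    FixedPoint operator*(FixedPoint other) const {
        int16_t prod = static_cast<int16_t>(bits) * other.bits;
        prod >>= FRAC_BITS;
        prod = std::min<int16_t>(std::max<int16_t>(prod, -128), 127);
        return FixedPoint{static_cast<int8_t>(prod)};
    }
};

int main() {
    auto a = FixedPoint<4>::FromFloat(1.5f);    // Q3.4 format
    auto b = FixedPoint<4>::FromFloat(2.25f);
    std::printf("%f\n", (a * b).ToFloat());     // prints ~3.375 (quantized)
    return 0;
}

Even this toy type needs careful widening, shifting, and saturation logic, and a real deployment would repeat this for every operator, every precision configuration, and every framework integration point, which is the development burden Lowgen aims to remove.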
Zhao, R., Luk, W., Xiong, C., Niu, X., & Tsoi, K. H. (2020). On the challenges in programming mixed-precision deep neural networks. In Proceedings of the 4th ACM SIGPLAN International Workshop on Machine Learning and Programming Languages (MAPL 2020, co-located with PLDI 2020) (pp. 20–28). Association for Computing Machinery. https://doi.org/10.1145/3394450.3397468