Multi-task learning systems are widely deployed in real-world AI applications such as intelligent robots and self-driving vehicles. Instead of improving single-network performance, this work proposes a specialized Multi-Task Deep Learning Accelerator architecture, MT-DLA, which improves the performance of concurrently executing networks by exploiting the features and parameters shared across these models. Our evaluation with realistic multi-task workloads shows that MT-DLA dramatically reduces the memory and computation overhead incurred by shared parameters, activations, and computation results. In experiments with real-world multi-task learning workloads, MT-DLA delivers a 1.4x-7.0x energy-efficiency improvement over a baseline neural network accelerator without multi-task support.
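The sharing that MT-DLA exploits can be illustrated at the software level: in a typical multi-task setup, concurrent networks reuse a common backbone, so its weights and intermediate activations only need to be fetched and computed once. The sketch below is a minimal PyTorch illustration of that structure, not the accelerator design itself; the class and layer names are hypothetical.

```python
import torch
import torch.nn as nn

class MultiTaskModel(nn.Module):
    """Hypothetical multi-task network: one shared backbone, per-task heads.

    An accelerator with multi-task support can load the backbone's
    parameters once and reuse its activations for every task, instead of
    recomputing them per network as a single-task accelerator would.
    """
    def __init__(self, num_classes_a=10, num_classes_b=5):
        super().__init__()
        # Shared feature extractor: its parameters and activations are
        # common to all tasks, so they are loaded/computed a single time.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        # Task-specific heads: the only per-task memory and compute.
        self.head_a = nn.Linear(16, num_classes_a)  # e.g., classification
        self.head_b = nn.Linear(16, num_classes_b)  # e.g., attribute prediction

    def forward(self, x):
        features = self.backbone(x)  # computed once, shared by both heads
        return self.head_a(features), self.head_b(features)

model = MultiTaskModel()
out_a, out_b = model(torch.randn(1, 3, 32, 32))
print(out_a.shape, out_b.shape)  # torch.Size([1, 10]) torch.Size([1, 5])
```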
CITATION STYLE
Wang, M., Li, B., Wang, Y., Liu, C., Ma, X., Zhao, X., & Zhang, L. (2021). MT-DLA: An Efficient Multi-Task Deep Learning Accelerator Design. In Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI (pp. 1–8). Association for Computing Machinery. https://doi.org/10.1145/3453688.3461514