Unified virtual memory support for deep CNN accelerator on SoC FPGA

Tao Xiao; Yuran Qiao; Junzhong Shen; Qianming Yang; Mei Wen

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2015) 9528 64-76

DOI: 10.1007/978-3-319-27119-4_5

Abstract

Cooperation between the CPU and a hardware accelerator on an SoC FPGA to accomplish computationally intensive tasks provides significant advantages in performance and energy efficiency. However, current operating systems provide little support for accelerators: the OS is unaware that a computational task can be executed on either a CPU core or an accelerator, and it provides no assistance in efficiently managing data shared between the CPU and the accelerator in DRAM, such as zero-copy transfers and data coherence. It is also hard for a current OS to allocate a large contiguous physical memory space for an accelerator. In this paper, we select the Xilinx ZYNQ as the target and qualitatively analyze methods of sharing data. Besides using the high-performance (HP) AXI interfaces of the ZYNQ device, we develop a novel memory management system for FPGA-based accelerators. It provides a unified virtual space for the CPU cores and the accelerator so that they can access the same memory space in the operating system's user space. For a deep convolutional neural network task, our design achieves a speed-up of up to 5.34x compared to traditional processor-accelerator cooperation.

Cite

Xiao, T., Qiao, Y., Shen, J., Yang, Q., & Wen, M. (2015). Unified virtual memory support for deep CNN accelerator on SoC FPGA. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9528, pp. 64–76). Springer Verlag. https://doi.org/10.1007/978-3-319-27119-4_5
