Unified virtual memory support for deep CNN accelerator on SoC FPGA

Abstract

Cooperation between the CPU and a hardware accelerator on an SoC FPGA to accomplish computationally intensive tasks provides significant advantages in performance and energy efficiency. However, current operating systems provide little support for accelerators: the OS is unaware that a computational task can be executed either on a CPU core or on an accelerator, and it provides no assistance in efficiently managing data shared between CPU and accelerator in DRAM, such as zero-copy transfers and data coherence. It is also hard for a current OS to allocate a large contiguous physical memory space for an accelerator. In this paper, we select the Xilinx ZYNQ as the target and qualitatively analyze methods of sharing data. Besides using the high-performance (HP) AXI interfaces of the ZYNQ device, we develop a novel memory management system for FPGA-based accelerators. It provides a unified virtual space for the CPU cores and the accelerator so that they can access the same memory space in the operating system's user space. For a deep convolutional neural network task, our design achieves a speed-up of up to 5.34x compared to traditional processor-accelerator cooperation.

Cite

Xiao, T., Qiao, Y., Shen, J., Yang, Q., & Wen, M. (2015). Unified virtual memory support for deep CNN accelerator on SoC FPGA. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9528, pp. 64–76). Springer Verlag. https://doi.org/10.1007/978-3-319-27119-4_5
