Field-programmable gate arrays (FPGAs) have gained attention in high-performance computing (HPC) research because their computation and communication capabilities have improved dramatically in recent years, driven by advances in semiconductor integration technology that follow Moore's Law. In addition to these performance improvements, FPGA vendors now offer OpenCL-based development toolchains, which reduce the programming effort required compared with the past. These improvements make it feasible to offload, on the fly, computation at which CPUs and GPUs perform poorly to FPGAs while moving data with low latency. We think that this concept is one of the keys to further improving the performance of modern heterogeneous supercomputers that use accelerators such as GPUs. In this paper, we propose a high-performance GPU-FPGA data communication method using mixed OpenCL and Verilog HDL programming so that both devices work together smoothly. OpenCL is used to program application algorithms and data movement control, while Verilog HDL is used to implement low-level components for memory copies between the two devices. Experimental results using toy programs showed that our proposed method achieves a latency of 0.6 μs and a bandwidth of up to 6.9 GB/s between the GPU and the FPGA, confirming that the proposed method is effective at realizing high-performance GPU-FPGA cooperative computation.
CITATION STYLE
Kobayashi, R., Fujita, N., Yamaguchi, Y., & Boku, T. (2019). OpenCL-enabled high performance direct memory access for GPU-FPGA cooperative computation. In ACM International Conference Proceeding Series (pp. 6–9). Association for Computing Machinery. https://doi.org/10.1145/3317576.3317581