High-performance computing places strict demands on the bandwidth of data transfers between processing units and memory. Consequently, it is crucial to build high-performance, scalable, and energy-efficient architectures capable of completing data transfer requests at satisfactory rates. Because it uses high-speed serial links instead of traditional parallel ones, PCI Express achieves higher transfer rates and offers a promising solution to the connectivity problem in today's complex heterogeneous architectures. In this chapter, we first cover the principles of interfacing using PCI Express. To illustrate a practical situation, we select the Xilinx Zynq device and develop an example architecture in which the x86 CPU cores of the host system, the ARM cores of the Zynq device, and the hardware accelerators realized directly on the FPGA fabric of the Zynq share the available DRAM for efficient data exchange. We provide estimates of the data transfer bandwidths achievable in our architecture.
CITATION STYLE
Sadri, M., De Schryver, C., & Wehn, N. (2015). High-Bandwidth Low-Latency Interfacing with FPGA Accelerators Using PCI Express. In FPGA Based Accelerators for Financial Applications (pp. 117–141). Springer International Publishing. https://doi.org/10.1007/978-3-319-15407-7_6