Towards High-Bandwidth-Utilization SpMV on FPGAs via Partial Vector Duplication

Abstract

Sparse matrix-vector multiplication (SpMV) is widely used in many fields and usually dominates the execution time of a task. With large off-chip memory bandwidth, customizable on-chip resources, and high-performance floating-point operations, the FPGA is a promising platform for accelerating SpMV. However, SpMV is memory-intensive, and the compressed data formats it relies on usually introduce irregular memory access, so implementing an SpMV accelerator on an FPGA that achieves high bandwidth utilization (BU) is challenging. Existing works either eliminate irregular memory access at the cost of increased data redundancy or try to locally reduce the port conflicts introduced by irregular memory access, and both approaches yield only a limited BU improvement. To address this, this paper proposes a high-bandwidth-utilization SpMV accelerator on FPGAs using partial vector duplication, built around a read-conflict-free vector buffer, a writing-conflict-free adder tree, and ping-pong-like accumulator registers. The FPGA implementation results show that the proposed design achieves an average performance speedup of 1.10x over the state-of-the-art work.
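
To illustrate the irregular memory access mentioned above, the following is a minimal software sketch of SpMV over a compressed format. It assumes the common CSR (compressed sparse row) layout, which the abstract does not name, and the function and variable names (`spmv_csr`, `row_ptr`, `col_idx`, `values`) are illustrative only. It is not the accelerator design proposed in the paper.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form (assumed format)."""
    num_rows = len(row_ptr) - 1
    y = [0.0] * num_rows
    for row in range(num_rows):
        acc = 0.0
        # Nonzeros of this row occupy positions row_ptr[row] .. row_ptr[row + 1] - 1.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # values and col_idx are read sequentially, but x is read at a
            # data-dependent index: this is the irregular access pattern.
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Small example matrix:
# [[2, 0, 1],
#  [0, 3, 0],
#  [4, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [2.0, 1.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(row_ptr, col_idx, values, x)
print(y)  # [5.0, 6.0, 19.0]
```

When several nonzeros are processed in parallel, their `x[col_idx[k]]` reads can target the same on-chip memory bank in the same cycle, which is one way the port conflicts described in the abstract can arise.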

Cite

Liu, B., & Liu, D. (2023). Towards High-Bandwidth-Utilization SpMV on FPGAs via Partial Vector Duplication. In Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC (pp. 33–38). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1145/3566097.3567839
