A 0.57-GOPS/DSP Object Detection PIM Accelerator on FPGA

Abstract

This paper presents an object detection accelerator with a processing-in-memory (PIM) architecture on FPGAs. PIM architectures are well known for their energy efficiency and for avoiding the memory wall. In the accelerator, each PIM unit is built from BRAM and LUT-based counters, which also helps to improve the DSP performance density. The overall architecture consists of 64 PIM units and three memory buffers that store inter-layer results. A shrunk and quantized Tiny-YOLO network is mapped to the PIM accelerator, and DRAM access is fully eliminated during inference. The design achieves a throughput of 201.6 GOPS at a 100 MHz clock rate and, correspondingly, a performance density of 0.57 GOPS/DSP.
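The headline figures in the abstract can be related to one another with simple arithmetic. The sketch below is only a back-of-the-envelope check, not a calculation taken from the paper: it uses the four numbers quoted above (201.6 GOPS, 100 MHz, 0.57 GOPS/DSP, 64 PIM units), and the function names and the derived quantities (operations per cycle, implied DSP count) are assumptions introduced here for illustration. The abstract does not state a DSP count, so the value computed is only what the rounded density figure implies.

```python
# Figures quoted in the abstract.
THROUGHPUT_GOPS = 201.6       # reported throughput
CLOCK_MHZ = 100.0             # reported clock rate
DENSITY_GOPS_PER_DSP = 0.57   # reported performance density
NUM_PIM_UNITS = 64            # reported number of PIM units


def ops_per_cycle(throughput_gops: float, clock_mhz: float) -> float:
    """Operations completed per clock cycle across the whole design."""
    return (throughput_gops * 1e9) / (clock_mhz * 1e6)


def implied_dsp_count(throughput_gops: float, density_gops_per_dsp: float) -> float:
    """DSP slices implied by throughput / density (hypothetical helper;
    the abstract itself does not give this number)."""
    return throughput_gops / density_gops_per_dsp


total_ops = ops_per_cycle(THROUGHPUT_GOPS, CLOCK_MHZ)
ops_per_unit = total_ops / NUM_PIM_UNITS
dsp_estimate = implied_dsp_count(THROUGHPUT_GOPS, DENSITY_GOPS_PER_DSP)

print(f"ops per cycle (whole design): {total_ops:.1f}")
print(f"ops per cycle per PIM unit:   {ops_per_unit:.1f}")
print(f"implied DSP slices (approx.): {dsp_estimate:.1f}")
```

Under these assumptions the design completes about 2016 operations per cycle, or 31.5 per PIM unit if the work is spread evenly across the 64 units, and the rounded density figure implies roughly 354 DSP slices. How the paper counts an "operation" is not specified in the abstract, so these derived values should be read as approximate.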

Cite

Jiao, B., Zhang, J., Xie, Y., Wang, S., Zhu, H., Kang, X., … Chen, C. (2021). A 0.57-GOPS/DSP Object Detection PIM Accelerator on FPGA. In Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC (pp. 13–14). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1145/3394885.3431659
