Efficient GEMM Implementation for Vision-Based Object Detection in Autonomous Driving Applications

Abstract

Convolutional Neural Networks (CNNs) have proven highly effective for object detection tasks. YOLOv4 is a state-of-the-art object detection algorithm suited to embedded systems; it builds on YOLOv3 with improved accuracy, speed, and robustness. However, deploying CNNs on embedded platforms such as Field-Programmable Gate Arrays (FPGAs) is difficult because of their limited resources. To address this issue, FPGA-based CNN architectures have been developed to improve resource utilization, yielding better accuracy and speed. This paper examines the use of General Matrix Multiplication (GEMM) operations to accelerate the execution of YOLOv4 on embedded systems. It reviews the most recent GEMM implementations and evaluates their accuracy and robustness, and it discusses the challenges of deploying YOLOv4 on autonomous driving datasets. Finally, the paper presents a case study demonstrating a successful GEMM-based implementation of YOLOv4 on an Intel Arria 10 embedded system.
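
The abstract does not give implementation details, but the general technique it refers to, accelerating convolution layers by expressing them as GEMM, is standard: input patches are unrolled with im2col and the layer becomes one matrix multiplication. The C sketch below is only an illustration of that idea under simplifying assumptions (single channel, stride 1, no padding, naive loop ordering); it is not taken from the paper's Intel Arria 10 implementation.

    /* Minimal sketch (not from the paper): lowering a convolution to GEMM
     * via im2col. Sizes, layout, and loop order are illustrative only. */
    #include <stdio.h>
    #include <string.h>

    /* Naive GEMM: C[M x N] += A[M x K] * B[K x N], row-major. */
    static void gemm(int M, int N, int K,
                     const float *A, const float *B, float *C) {
        for (int m = 0; m < M; ++m)
            for (int k = 0; k < K; ++k)
                for (int n = 0; n < N; ++n)
                    C[m * N + n] += A[m * K + k] * B[k * N + n];
    }

    /* im2col: unroll KxK patches of an HxW single-channel input
     * (stride 1, no padding) into columns, so the convolution
     * becomes a single GEMM. */
    static void im2col(const float *in, int H, int W, int K, float *col) {
        int OH = H - K + 1, OW = W - K + 1;
        for (int ky = 0; ky < K; ++ky)
            for (int kx = 0; kx < K; ++kx)
                for (int oy = 0; oy < OH; ++oy)
                    for (int ox = 0; ox < OW; ++ox)
                        col[(ky * K + kx) * (OH * OW) + oy * OW + ox] =
                            in[(oy + ky) * W + (ox + kx)];
    }

    int main(void) {
        enum { H = 4, W = 4, K = 3, OH = H - K + 1, OW = W - K + 1 };
        float in[H * W], w[K * K], col[K * K * OH * OW], out[OH * OW];

        for (int i = 0; i < H * W; ++i) in[i] = (float)i; /* dummy input */
        for (int i = 0; i < K * K; ++i) w[i] = 1.0f;      /* box filter  */
        memset(out, 0, sizeof out);

        im2col(in, H, W, K, col);
        /* One output map: 1 x (K*K) weights times (K*K) x (OH*OW) patches. */
        gemm(1, OH * OW, K * K, w, col, out);

        for (int i = 0; i < OH * OW; ++i) printf("%.0f ", out[i]);
        printf("\n");
        return 0;
    }

In a real multi-channel CNN layer the same pattern scales up: the weight matrix has one row per output channel and K*K*C_in columns, so the bulk of the work is a single large GEMM, which is why GEMM efficiency dominates inference performance on accelerators.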

Citation (APA)

Guerrouj, F. Z., Rodríguez Flórez, S., Abouzahir, M., El Ouardi, A., & Ramzi, M. (2023). Efficient GEMM Implementation for Vision-Based Object Detection in Autonomous Driving Applications. Journal of Low Power Electronics and Applications, 13(2). https://doi.org/10.3390/jlpea13020040
