Convolutional neural networks (CNNs) are increasingly relied upon in real-time applications, especially image classification and humanoid robots, which motivates the search for an optimal way to accelerate computation in hardware-based systems. The multiply-accumulate (MAC) unit is the most computationally demanding unit in any CNN architecture. In this paper, three optimized 2D MAC hardware architectures are designed in VHDL and synthesized for an FPGA platform, chosen for its support for parallel architectures. The logic utilization, power dissipation, and timing of the three proposed 2D MAC units were analyzed using the Quartus II tools; the results show that the third MAC design achieves 18.34 giga operations per second (GOPS) while keeping the core dynamic thermal power dissipation at 303.67 mW.
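As a purely illustrative aid, the following Python sketch shows what a 2D MAC computes in software: the sum of element-wise products between an input window and a kernel, applied at every window position of an image. It is not the authors' VHDL design and does not model any of the three proposed architectures; the function names `mac_2d` and `conv2d_valid` are hypothetical, and the sketch assumes integer inputs, no padding, and a stride of 1.

```python
def mac_2d(window, kernel):
    """Multiply-accumulate over one 2D window: sum of window[i][j] * kernel[i][j]."""
    same_rows = len(window) == len(kernel)
    if not same_rows or any(len(w) != len(k) for w, k in zip(window, kernel)):
        raise ValueError("window and kernel must have the same shape")
    acc = 0
    for w_row, k_row in zip(window, kernel):
        for w, k in zip(w_row, k_row):
            acc += w * k  # one multiply-accumulate step
    return acc


def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    As is usual in CNNs, the kernel is not flipped, so this is
    strictly a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            mac_2d([row[c:c + kw] for row in image[r:r + kh]], kernel)
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]
```

In software the inner loop runs sequentially; a hardware 2D MAC unit can instead evaluate the products of a window in parallel, which is the kind of parallelism the abstract attributes to the FPGA platform.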
CITATION STYLE
Ahmed, H. O., Ghoneima, M., & Dessouky, M. (2018). High-speed 2D parallel MAC unit hardware accelerator for convolutional neural network. In Advances in Intelligent Systems and Computing (Vol. 868, pp. 655–663). Springer Verlag. https://doi.org/10.1007/978-3-030-01054-6_47