Integrated Imager and 3.22 μs/Kernel-Latency All-Digital In-Imager Global-Parallel Binary Convolutional Neural Network Accelerator for Image Processing

Abstract

This paper presents an innovative approach to ultralow-latency convolutional neural network (CNN) processing, which is critical for real-time image processing applications such as autonomous driving and virtual reality. Traditional CNN accelerators based on in/near-array computing (including in/near-memory computing and in/near-sensor computing) have struggled to meet real-time requirements because conventional column-parallel processing creates a latency bottleneck for image processing. To address this challenge, we propose a novel, all-digital in-imager global-parallel binary convolutional neural network (IIGP-BNN) accelerator. The accelerator adopts a global-parallel processing concept in which multiply-and-accumulate operations (MACs) are executed simultaneously across the imager array in a 2-D manner, eliminating the additional latency of row-by-row processing and of data access from random access memories (RAMs). With this design, convolution and subsampling with a 3 × 3 kernel are completed in only nine global-parallel processing steps, regardless of image size. This yields a theoretical reduction of more than 88.5% in repeated row scans compared with conventional column-parallel processing architectures, significantly lowering computing latency. We designed and prototyped a 30 × 30 integrated imager and IIGP-BNN accelerator IC in a 0.18-μm CMOS process. The prototype achieved a latency of 3.22 μs/kernel on the first-layer convolution at a 1-V supply and a 35.7-MHz clock frequency, a 35.6% latency reduction compared with state-of-the-art in/near-imager-computing works. The proposed global-parallel processing concept opens up the potential for processing high-resolution 4K and 8K images with the same ultralow latency, marking a significant advancement in high-speed image processing.
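To make the nine-step idea concrete, the following is a minimal software sketch, not the authors' hardware, of a global-parallel 3 × 3 binary convolution in the spirit described above. It assumes inputs and weights binarized to ±1 and models each of the nine steps as applying one kernel tap to a globally shifted copy of the entire image, so the step count is independent of image size; the function name, padding choice, and sign activation are illustrative assumptions, not details from the paper.

```python
# Conceptual sketch (assumed, not the authors' circuit) of nine-step
# global-parallel 3x3 binary convolution: one step per kernel tap,
# each step updating every pixel's accumulator at once.
import numpy as np

def global_parallel_bconv3x3(image_pm1, kernel_pm1):
    """image_pm1: (H, W) array of +/-1; kernel_pm1: (3, 3) array of +/-1."""
    H, W = image_pm1.shape
    acc = np.zeros((H, W), dtype=np.int32)            # per-pixel accumulators
    padded = np.pad(image_pm1, 1, constant_values=1)  # assumed border handling
    # Nine global-parallel steps: each touches all H x W pixels simultaneously.
    for dy in range(3):
        for dx in range(3):
            shifted = padded[dy:dy + H, dx:dx + W]    # global 2-D shift
            acc += kernel_pm1[dy, dx] * shifted       # one MAC step, whole array
    return np.sign(acc)                               # binarized output activation

rng = np.random.default_rng(0)
img = rng.choice([-1, 1], size=(30, 30))   # 30 x 30, matching the prototype array size
ker = rng.choice([-1, 1], size=(3, 3))
out = global_parallel_bconv3x3(img, ker)
print(out.shape)                           # (30, 30)
```

In the actual in-imager circuit the per-pixel arithmetic would be realized with binary digital logic rather than integer math; the sketch only mirrors the step count and the all-pixels-at-once parallelism that distinguish global-parallel from column-parallel processing.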

Cite (APA)

Wang, R., Wu, C. H., & Takamiya, M. (2023). Integrated Imager and 3.22 μs/Kernel-Latency All-Digital In-Imager Global-Parallel Binary Convolutional Neural Network Accelerator for Image Processing. IEEE Access, 11, 74364–74378. https://doi.org/10.1109/ACCESS.2023.3296429
