A coarse-grained reconfigurable processing unit (RPU) consisting of 16 × 16 multi-functional processing elements (PEs) interconnected by area-efficient line-switched mesh connect (LSMC) routing is implemented on a 5.4 mm × 3.1 mm die in TSMC 65 nm LP1P8M CMOS technology. A hierarchical configuration context (HCC) organization scheme is proposed to reduce the implementation overhead and the energy spent on fast reconfiguration. The proposed RPU is integrated into two systems-on-chip (SoCs) targeting multiple-standard video decoding. The high-performance chip (named REMUS-HPP), comprising two RPU processors, can decode 1920 × 1080 H.264 video streams at 30 frames per second (fps) at 200 MHz. REMUS-HPP achieves a 25% performance gain over the XPP-III reconfigurable processor while consuming only 280 mW, resulting in a 14.3× improvement in energy efficiency. The other chip (named REMUS-LPP), targeting low-power applications, integrates only one RPU processor. REMUS-LPP can decode 720 × 480 H.264 video streams at 35 fps while consuming 24.5 mW at 75 MHz, achieving a 76% reduction in power dissipation and a 3.96× improvement in energy efficiency compared with the ADRES reconfigurable processor.
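To put the two operating points on a common footing, the reported resolution, frame rate, power, and clock can be turned into per-frame and per-pixel figures. The following is a minimal sketch, not taken from the paper: the class `OperatingPoint` and its property names are hypothetical, the derived metrics (energy per frame, energy per pixel, clock cycles per pixel) are simple ratios of the numbers quoted above, and the power values are assumed to be the chip-level figures stated in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    """One reported decoding operating point (hypothetical helper type)."""
    name: str
    width: int        # frame width in pixels
    height: int       # frame height in pixels
    fps: float        # decoded frames per second
    power_mw: float   # reported power in milliwatts
    clock_mhz: float  # reported clock frequency in MHz

    @property
    def pixels_per_second(self) -> float:
        # Luma-resolution pixel throughput: width * height * fps.
        return self.width * self.height * self.fps

    @property
    def energy_per_frame_mj(self) -> float:
        # mW divided by frames/s gives millijoules per frame.
        return self.power_mw / self.fps

    @property
    def energy_per_pixel_nj(self) -> float:
        # Convert mW to W, divide by pixels/s, then convert J to nJ.
        return self.power_mw * 1e-3 / self.pixels_per_second * 1e9

    @property
    def cycles_per_pixel(self) -> float:
        # Clock cycles available per decoded pixel at the reported clock.
        return self.clock_mhz * 1e6 / self.pixels_per_second


# Figures as quoted in the abstract.
REMUS_HPP = OperatingPoint("REMUS-HPP", 1920, 1080, 30, 280.0, 200.0)
REMUS_LPP = OperatingPoint("REMUS-LPP", 720, 480, 35, 24.5, 75.0)

if __name__ == "__main__":
    for op in (REMUS_HPP, REMUS_LPP):
        print(
            f"{op.name}: {op.pixels_per_second / 1e6:.3f} Mpixel/s, "
            f"{op.energy_per_frame_mj:.2f} mJ/frame, "
            f"{op.energy_per_pixel_nj:.2f} nJ/pixel, "
            f"{op.cycles_per_pixel:.2f} cycles/pixel"
        )
```

These derived figures are only a convenience for comparing the two chips with each other. They do not reproduce the 14.3× and 3.96× energy-efficiency ratios, which depend on XPP-III and ADRES measurements that the abstract does not give.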
Liu, L., Wang, D., Zhu, M., Wang, Y., Yin, S., Cao, P., … Wei, S. (2015). An Energy-Efficient Coarse-Grained Reconfigurable Processing Unit for Multiple-Standard Video Decoding. IEEE Transactions on Multimedia, 17(10), 1706–1720. https://doi.org/10.1109/TMM.2015.2463735