This paper presents a many-core heterogeneous computational platform that employs a circuit-switched on-chip network compatible with globally asynchronous, locally synchronous (GALS) clocking. The platform targets streaming DSP and embedded applications that have a high degree of task-level parallelism among computational kernels. The test chip was fabricated in 65 nm CMOS and consists of 164 simple, small programmable cores, three dedicated-purpose accelerators, and three shared memory modules. Each processor is clocked by its own local oscillator, and communication uses a simple yet effective source-synchronous technique that allows each interconnection link between any two processors to sustain a peak throughput of one data word per cycle. A complete 802.11a WLAN baseband receiver was implemented on this platform. It has a real-time throughput of 54 Mbps with all processors running at 594 MHz and 0.95 V, and consumes an average of 174.76 mW, of which 12.18 mW (7.0%) is dissipated by its interconnection links. To take full advantage of the GALS architecture, each processor's oscillator is adjusted to run at a workload-based optimal clock frequency, with the chip's dual supply voltages set at 0.95 V and 0.75 V; the receiver then consumes only 123.18 mW, a 29.5% reduction in power. Power consumption measured on the real chip is within 2-5% of the estimated results, showing our design to be highly reliable and efficient. © 2009 IEEE.
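The two percentages quoted in the abstract follow directly from the reported power figures. The short check below is only an arithmetic sketch: the variable and function names are illustrative and do not come from the paper, and the inputs are the milliwatt values stated above.

```python
# Arithmetic check of the percentages quoted in the abstract.
# All names are illustrative; the input values are the figures reported above.

def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole

P_UNIFORM_MW = 174.76  # receiver power, all processors at 594 MHz and 0.95 V
P_LINKS_MW = 12.18     # portion dissipated by the interconnection links
P_TUNED_MW = 123.18    # receiver power with per-processor clocks and dual supplies

link_share = percent(P_LINKS_MW, P_UNIFORM_MW)
power_saving = percent(P_UNIFORM_MW - P_TUNED_MW, P_UNIFORM_MW)

print(f"interconnect share of power: {link_share:.1f}%")
print(f"power reduction after tuning: {power_saving:.1f}%")
```

Rounded to one decimal place, the two results match the 7.0% and 29.5% stated in the abstract.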
CITATION STYLE
Tran, A. T., Truong, D. N., & Baas, B. M. (2009). A GALS many-core heterogeneous DSP platform with source-synchronous on-chip interconnection network. In Proceedings - 2009 3rd ACM/IEEE International Symposium on Networks-on-Chip, NoCS 2009 (pp. 214–223). https://doi.org/10.1109/NOCS.2009.5071470