With the recent design shift toward increasing the number of processing elements on a chip, support for low power, low latency, and high bandwidth in the on-chip interconnect is essential. Much of the previous work has focused on router architectures and network topologies using wide/long channels. However, such solutions may result in a complicated router design and a high interconnect power/area cost. In this chapter, we present a method that exploits a table-based data compression technique, relying on value patterns in cache traffic. Compressing a large packet into a small one saves power by reducing the operations required in network components, and it decreases contention by increasing the effective bandwidth of shared resources. The main challenges are providing a scalable implementation of the tables and minimizing the latency overhead of compression. We propose a shared table scheme that needs one encoding table and one decoding table for each processing element, and a management protocol that does not require in-order delivery. This scheme eliminates the dependence of table size on network size, which makes it scalable and reduces the table overhead of compression. Our simulation results are presented for 8-core and 16-core designs. Overall, our compression method improves packet latency by up to 44%, with an average of 36%, and reduces network power consumption by 36% on average in a 16-core tiled design. © 2011 Springer Science+Business Media LLC.
Jin, Y., Yum, K. H., & Kim, E. J. (2011). Adaptive data compression for low-power on-chip networks. In Low Power Networks-On-Chip (pp. 151–174). Springer. https://doi.org/10.1007/978-1-4419-6911-8_6
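The general idea of table-based compression mentioned in the abstract can be illustrated with a minimal sketch. It is not the chapter's shared table scheme or its management protocol: the table here is static, and the 32-bit word size, 3-bit index, per-word flag bit, table contents, and all function names are assumptions chosen only for illustration.

```python
WORD_BITS = 32   # assumed width of an uncompressed data word
INDEX_BITS = 3   # assumed index width, i.e. at most 8 table entries
FLAG_BITS = 1    # one bit per word marks it as compressed or raw


def build_index(table):
    """Map each table value to its position (hypothetical helper)."""
    assert len(table) <= 1 << INDEX_BITS
    return {value: i for i, value in enumerate(table)}


def encode(words, table):
    """Replace words present in the table with short indices."""
    index_of = build_index(table)
    return [("idx", index_of[w]) if w in index_of else ("raw", w)
            for w in words]


def decode(symbols, table):
    """Recover the original words using an identical table at the receiver."""
    return [table[x] if kind == "idx" else x for kind, x in symbols]


def encoded_bits(symbols):
    """Payload size in bits after compression, including flag bits."""
    return sum(FLAG_BITS + (INDEX_BITS if kind == "idx" else WORD_BITS)
               for kind, _ in symbols)


# Hypothetical frequent values and one cache line of eight words.
table = [0x00000000, 0xFFFFFFFF, 0x00000001]
line = [0, 0, 0xFFFFFFFF, 0xDEADBEEF, 0, 1, 0, 0]

packet = encode(line, table)
assert decode(packet, table) == line
print(encoded_bits(packet), "bits instead of", len(line) * WORD_BITS)
```

Decoding is only correct while sender and receiver hold matching table contents. In this sketch the table never changes, so the question of keeping tables consistent, which the abstract assigns to a management protocol, does not arise.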