As the number of instructions executed in parallel increases, superscalar processors will require higher bandwidth from data caches. Because true multi-ported caches are costly, alternative cache designs must be evaluated. The purpose of this study is to examine the data cache bandwidth requirements of high-degree superscalar processors and to investigate alternative solutions. The designs studied range from classic solutions, such as multi-banked caches, to more complex solutions recently proposed in the literature. The performance tradeoffs of these cache designs are examined in detail. Then, using a chip area cost model, all solutions are compared with respect to both cost and performance. While many cache designs seem capable of achieving high cache bandwidth, the best cost/performance tradeoff varies significantly with the chip area dedicated to the data cache, ranging from multi-banked cache designs to hybrid multi-banked/multi-ported caches or even true multi-ported caches. For instance, we find that an 8-bank cache with minor optimizations performs 10% better than a true 2-port cache at half the cost, and that a 4-bank cache with 2 ports per bank performs better than a true 4-port cache while using 45% less chip area.
Juan, T., Navarro, J. J., & Temam, O. (1997). Data caches for superscalar processors. In Proceedings of the International Conference on Supercomputing (pp. 60–67). ACM. https://doi.org/10.1145/263580.263595