Abstract
This paper considers the design of a data memory hierarchy, with a level 1 (L1) data cache at the top, to support the data bandwidth demands of a future-generation superscalar processor capable of issuing about ten instructions per clock cycle. It introduces the notion of cache bandwidth, the bandwidth with which a cache can accept requests from the processor, and shows how the bandwidth of a standard, blocking cache can degrade greatly because of its inability to overlap the service of misses. Non-blocking or lockup-free caches are discussed as a way of reducing the bandwidth degradation due to misses. To improve the data bandwidth to greater than one request per cycle, multi-port, interleaved caches are introduced. Simulation results from a cycle-by-cycle simulator, using the MIPS R2000 instruction set, suggest that memory hierarchies with blocking L1 caches will be unable to support the bandwidth demands of future-generation superscalar processors. Multi-port, non-blocking (MPNB) L1 caches introduced in this paper for the top of the data memory hierarchy appear to be capable of supporting such data bandwidth demands. © 1991, ACM. All rights reserved.
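The abstract's central argument can be illustrated with a back-of-the-envelope model. The sketch below is not the paper's cycle-by-cycle simulator; it is a minimal analytical approximation in which hits take one cycle, a miss occupies the cache for a fixed penalty, and a non-blocking cache amortizes that penalty over the number of misses it can overlap. The function name and all parameter values are illustrative assumptions, not figures from the paper.

```python
def effective_bandwidth(miss_rate, miss_penalty, overlap=1):
    """Approximate requests serviced per cycle.

    Assumes hits complete in 1 cycle and each miss busies the cache
    for `miss_penalty` cycles.  A blocking cache has overlap=1; a
    non-blocking (lockup-free) cache overlapping k outstanding misses
    amortizes the penalty by a factor of k.  Purely illustrative.
    """
    cycles_per_request = (1 - miss_rate) * 1 + miss_rate * miss_penalty / overlap
    return 1.0 / cycles_per_request

# Illustrative parameters: 5% miss rate, 20-cycle miss penalty.
blocking = effective_bandwidth(0.05, 20, overlap=1)      # ~0.51 requests/cycle
non_blocking = effective_bandwidth(0.05, 20, overlap=4)  # ~0.83 requests/cycle
```

Even this crude model shows the effect the abstract describes: a modest miss rate drops a blocking cache well below one request per cycle, while overlapping misses recovers much of the loss; reaching more than one request per cycle then requires multiple ports, which this scalar model does not capture.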
Citation
Sohi, G. S., & Franklin, M. (1991). High-Bandwidth Data Memory Systems for Superscalar Processors. ACM SIGPLAN Notices, 26(4), 53–62. https://doi.org/10.1145/106973.106980