In current multi-core systems, the shared last-level cache (LLC) is physically distributed across the cores, so both the initial placement of data and its subsequent placement close to the requesting core can reduce memory access latency and power consumption. This paper extends a replication scheme that balances access latency against cache capacity in shared non-uniform cache architecture (NUCA) designs by selectively replicating frequently used cache lines close to the requesting cores. Simulated on an eight-core system, our scheme reduces completion time by 15% and energy consumption by 27% compared with the Static-NUCA (S-NUCA) management scheme.
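As a rough illustration of selective replication, the following sketch counts remote accesses per core and cache line and creates a local replica once a reuse threshold is reached. It is not the paper's implementation: the modulo home-slice mapping, the threshold value, and the names `home_slice` and `SelectiveReplicator` are all assumptions made for this example, and coherence actions and replica eviction are omitted.

```python
from collections import defaultdict

NUM_CORES = 8               # matches the eight-core system mentioned above
REPLICATION_THRESHOLD = 3   # assumed value, chosen only for illustration


def home_slice(line_addr: int) -> int:
    """Assumed S-NUCA-style static mapping: the address selects a fixed home slice."""
    return line_addr % NUM_CORES


class SelectiveReplicator:
    """Hypothetical model: replicate a line locally after enough remote accesses."""

    def __init__(self, threshold: int = REPLICATION_THRESHOLD):
        self.threshold = threshold
        self.reuse = defaultdict(int)  # (core, line) -> remote access count
        self.replicas = set()          # (core, line) pairs held as local replicas

    def access(self, core: int, line_addr: int) -> str:
        """Classify one access and update the replication state."""
        if home_slice(line_addr) == core:
            return "local-home"        # the line already lives in this core's slice
        key = (core, line_addr)
        if key in self.replicas:
            return "local-replica"     # served from the replica in the local slice
        self.reuse[key] += 1
        if self.reuse[key] >= self.threshold:
            self.replicas.add(key)     # later accesses by this core hit locally
        return "remote"                # this access still travels to the home slice
```

In this model, a line that a core touches only once or twice never consumes capacity in that core's slice, while a frequently reused line is served locally after the threshold is crossed; that is the latency-versus-capacity trade-off the abstract describes.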
Chaturvedi, N., Subramaniyan, A., & Gurunarayanan, S. (2015). Selective cache line replication scheme in shared last level cache. In Procedia Computer Science (Vol. 46, pp. 1095–1107). Elsevier B.V. https://doi.org/10.1016/j.procs.2015.01.022