Removing Communications in Clustered Microarchitectures through Instruction Replication

Abstract

The need to communicate values between clusters can result in a significant performance loss for clustered microarchitectures. In this work, we describe an optimization technique that removes communications by selectively replicating an appropriate set of instructions. Instruction replication is done carefully because it might degrade performance due to the increased contention it can place on processor resources. The proposed scheme is built on top of a previously proposed state-of-the-art modulo-scheduling algorithm. Though this algorithm has been proved to be very effective at reducing communications, results show that the number of communications can be further decreased by around one-third through replication, which results in a significant speedup. IPC is increased by 25% on average for a four-cluster microarchitecture and by as much as 70% for selected programs. We also show that replicating appropriate sets of instructions is more effective than doubling the intercluster connection network bandwidth. © 2004, ACM. All rights reserved.
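The trade-off described in the abstract can be illustrated with a toy model. The sketch below is not the paper's modulo-scheduling algorithm. It is a minimal greedy illustration under assumed simplifications: each instruction occupies one slot in every cluster that holds a copy of it, a communication is one (value, destination cluster) pair, and replication is tried one instruction at a time. The names `count_communications`, `replicate_greedily`, `deps`, `placement`, and `capacity` are hypothetical and introduced only for this example.

```python
from collections import defaultdict


def count_communications(deps, placement):
    """Count the (value, destination cluster) pairs that need an intercluster copy."""
    comms = set()
    for consumer, producers in deps.items():
        for cluster in placement[consumer]:
            for producer in producers:
                # A copy is needed when no replica of the producer lives in this cluster.
                if cluster not in placement[producer]:
                    comms.add((producer, cluster))
    return len(comms)


def replicate_greedily(deps, placement, capacity):
    """Replicate a producer into a consumer's cluster only when that lowers the
    communication count and the cluster still has a free slot (toy resource limit)."""
    placement = {instr: set(clusters) for instr, clusters in placement.items()}
    load = defaultdict(int)
    for clusters in placement.values():
        for cluster in clusters:
            load[cluster] += 1

    improved = True
    while improved:
        improved = False
        best = count_communications(deps, placement)
        for consumer, producers in deps.items():
            for cluster in list(placement[consumer]):
                for producer in producers:
                    if cluster in placement[producer] or load[cluster] >= capacity:
                        continue
                    # Tentatively add a replica, then keep it only if it pays off.
                    placement[producer].add(cluster)
                    now = count_communications(deps, placement)
                    if now < best:
                        load[cluster] += 1
                        best = now
                        improved = True
                    else:
                        placement[producer].discard(cluster)
    return placement


# Hypothetical example: value "a" is produced in cluster 0 and consumed in clusters 0 and 1.
deps = {"a": [], "b": ["a"], "c": ["a"]}
placement = {"a": {0}, "b": {0}, "c": {1}}
replicated = replicate_greedily(deps, placement, capacity=2)
print(count_communications(deps, placement))   # 1
print(count_communications(deps, replicated))  # 0
```

In this model a replica is rejected when the replicated instruction would itself need an operand from another cluster, since the communication count then does not drop. This mirrors, in a very simplified way, why the set of instructions to replicate has to be chosen selectively and why contention for cluster resources limits how much replication is worthwhile.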

Cite

Aletà, A., Codina, J. M., González, A., & Kaeli, D. (2004). Removing Communications in Clustered Microarchitectures through Instruction Replication. ACM Transactions on Architecture and Code Optimization, 1(2), 127–151. https://doi.org/10.1145/1011528.1011529
