Performance evaluation of hybrid hardware and software distributed shared memory protocols


Abstract

Hardware distributed shared memory (DSM) systems efficiently support fine-grain sharing of data by maintaining coherence at the level of individual cache lines and providing automatic replication in processor caches. Software DSM systems, on the other hand, amortize high communication costs by maintaining coherence at coarser granularities and replicating data at the level of local main memories. Even though software DSM systems have traditionally been targeted towards loosely coupled environments, some of their techniques are potentially useful in the context of tightly coupled multiprocessors. In particular, communicating data at a coarse grain can sometimes be more efficient than transferring the data as individual cache lines. Furthermore, replication in local memories can accommodate applications with larger working sets than replication in processor caches alone. Therefore, combining the two techniques in a hybrid protocol can potentially exploit the benefits of each approach. This paper proposes one such hybrid protocol and evaluates its performance in the context of the FLASH multiprocessor architecture [24]. The hybrid system allows the programmer to optionally identify regions of data shared at a coarse granularity. Coherence for such data is maintained at the grain of the entire region using a software-DSM-style protocol. We evaluate the performance gains of this approach through a detailed simulation study of several parallel applications. Our preliminary results show that the hybrid protocol can eliminate a substantial fraction of remote cache misses through bulk transfer of coarse-grain data regions and replication of such data in local memories. The performance gains over hardware cache coherence are modest at low network latencies, but increase substantially at higher network latencies and processor speeds. Finally, we show that, similar to cache-only memory architectures, the hybrid protocol is insensitive to data placement issues.
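
The claim that coarse-grain transfer can beat per-line transfer, and that the gap widens with network latency, can be illustrated with a toy cost model. The sketch below is not taken from the paper: the function names, the cost formulas, and every number (region size, line size, software overhead, bandwidth) are hypothetical, and the model deliberately ignores effects such as overlapping or pipelined misses, which would favor the per-line case.

```python
def line_by_line_cost(region_bytes, line_bytes, miss_latency):
    """Cycles to fetch a region one cache line at a time.

    Assumption: each line costs one full remote miss, with no overlap.
    """
    lines = -(-region_bytes // line_bytes)  # ceiling division
    return lines * miss_latency


def bulk_cost(region_bytes, miss_latency, software_overhead, bytes_per_cycle):
    """Cycles to fetch the same region in a single bulk transfer.

    Assumption: a fixed software overhead, one network latency,
    then the data streams at a constant bandwidth.
    """
    return software_overhead + miss_latency + region_bytes / bytes_per_cycle


if __name__ == "__main__":
    # All values below are illustrative, not measurements.
    REGION, LINE, OVERHEAD, BANDWIDTH = 4096, 128, 2000, 4
    for latency in (100, 400, 1600):
        fine = line_by_line_cost(REGION, LINE, latency)
        coarse = bulk_cost(REGION, latency, OVERHEAD, BANDWIDTH)
        print(f"latency={latency:5d}  per-line={fine:7d}  "
              f"bulk={coarse:8.1f}  ratio={fine / coarse:.2f}")
```

Under these assumed parameters the two approaches cost about the same at the lowest latency, and the bulk transfer pulls ahead as latency grows, which mirrors the qualitative trend the abstract reports.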

Cite

Chandra, R., Gharachorloo, K., Soundararajan, V., & Gupta, A. (1994). Performance evaluation of hybrid hardware and software distributed shared memory protocols. In Proceedings of the International Conference on Supercomputing (Vol. Part F129421, pp. 274–288). Association for Computing Machinery. https://doi.org/10.1145/181181.181543