Abstract
A network multicomputer is a multiprocessor in which the processors are connected by general-purpose networking technology, in contrast to current distributed memory multiprocessors, which use a dedicated special-purpose interconnect. The advent of high-speed general-purpose networks removes the bottleneck of current slow networks and provides the impetus for a new look at the network multiprocessor model. However, major software issues remain unsolved. A convenient machine abstraction must be developed that hides low-level details, such as message passing or machine failures, from the application programmer. We use distributed shared memory as a programming abstraction, and rollback recovery through consistent checkpointing to provide fault tolerance. Measurements of our implementations of distributed shared memory and consistent checkpointing show that these abstractions can be implemented efficiently.
Citation
Carter, J. B., Cox, A. L., Dwarkadas, S., Elnozahy, E. N., Johnson, D. B., Keleher, P., … Zwaenepoel, W. (1993). Network multicomputing using recoverable distributed shared memory. In 1993 IEEE Compcon Spring (pp. 519–527). IEEE. https://doi.org/10.1109/cmpcon.1993.289729