The very long and highly variable latencies in the deep memory hierarchy of a petaflop-scale architecture design, such as the Hybrid Technology Multi-Threaded Architecture (HTMT) [13], pose a new challenge for its programming and execution model. One way to cope with such high and variable latencies is to expose the different memory regions of the machine directly and explicitly to the program execution model, allowing better management of communication. In this paper we describe the novel percolation model that lies at the heart of the HTMT program execution model [13]. The percolation model combines multithreading with dynamic prefetching of coarse-grain contexts. Earlier prefetching techniques have concentrated on moving blocks of data within the memory hierarchy. Rather than moving only contiguous blocks of data, the thread percolation approach manages contexts that include data, program instructions, and control state. The main contributions of this paper are the specification of the HTMT runtime execution model based on the concept of percolation, and a discussion of the role of the compiler in a machine that exposes the memory hierarchy to the programming model.
CITATION
Ryan, S., Amaral, J. N., Gao, G., Ruiz, Z., Marquez, A., & Theobald, K. (1999). Coping with very high latencies in petaflop computer systems. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1615, pp. 71–82). Springer-Verlag. https://doi.org/10.1007/BFb0094912