High-performance scientific computing relies increasingly on high-level, large-scale, object-oriented software frameworks to manage both algorithmic complexity and the complexities of parallelism: distributed data management, process management, inter-process communication, and load balancing. This encapsulation of data management, together with the prescribed semantics of a typical fundamental component of such object-oriented frameworks (a parallel or serial array class library), provides an opportunity for increasingly sophisticated compile-time optimization techniques. This paper describes two optimizing transformations suitable for certain classes of numerical algorithms, one for reducing the cost of inter-processor communication and one for improving cache utilization; demonstrates and analyzes the resulting performance gains; and indicates how these transformations are being automated.
Bassetti, F., Davis, K., & Quinlan, D. (1998). Optimizing transformations of stencil operations for parallel object-oriented scientific frameworks on cache-based architectures. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1505, pp. 107–118). Springer Verlag. https://doi.org/10.1007/3-540-49372-7_10