Dependencies between iterations of an irregular DOACROSS loop cannot always be determined at compile time, because they may depend on input data that is known only at run time. Parallelizing such loops therefore requires run-time analysis. In this paper, we present a new algorithm for parallelizing these loops at run time. The proposed algorithm handles all types of data dependencies without requiring any special architectural support in the multiprocessor. Our scheme consists of an inspector, which builds the iteration schedule, and an executor, which uses the schedule to execute the iterations. The inspector stage requires no special synchronization instructions, and the executor can be implemented with or without synchronization support. The scheme allows overlap among dependent iterations and requires very little interprocessor communication. Further, the schedule formed by the inspector can be reused across loop invocations. The inspector performs consistently (i.e., its performance does not degrade rapidly with the number of iterations or accesses per iteration), and the executor achieves good speedup.
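To make the inspector/executor split concrete, here is a minimal sketch in Python. It is not the algorithm of the paper: it uses a generic wavefront (level-by-level) schedule, which runs dependent iterations strictly one after another and so does not provide the overlap among dependent iterations that the abstract describes. The loop body `a[writes[i]] = a[reads[i]] + 1`, the index arrays `reads` and `writes`, and the function names `inspect`, `execute`, and `run_sequential` are all assumptions made for illustration.

```python
def inspect(reads, writes):
    """Inspector (illustrative): assign each iteration a wavefront level
    using index arrays that are only known at run time."""
    last_write = {}  # element -> level of the most recent iteration writing it
    last_read = {}   # element -> highest level of any iteration reading it
    levels = []
    for r, w in zip(reads, writes):
        level = 0
        if r in last_write:   # flow dependence (read after write)
            level = max(level, last_write[r] + 1)
        if w in last_write:   # output dependence (write after write)
            level = max(level, last_write[w] + 1)
        if w in last_read:    # anti dependence (write after read)
            level = max(level, last_read[w] + 1)
        levels.append(level)
        last_write[w] = level
        last_read[r] = max(last_read.get(r, -1), level)
    return levels


def execute(a, reads, writes, levels):
    """Executor (illustrative): run the levels in order. Iterations that
    share a level are mutually independent, so a parallel implementation
    could hand them to different processors; here they run one by one."""
    schedule = {}
    for i, level in enumerate(levels):
        schedule.setdefault(level, []).append(i)
    for level in sorted(schedule):
        for i in schedule[level]:
            a[writes[i]] = a[reads[i]] + 1
    return a


def run_sequential(a, reads, writes):
    """Reference: the original loop in its sequential order."""
    for r, w in zip(reads, writes):
        a[w] = a[r] + 1
    return a


# Hypothetical access pattern standing in for run-time input data.
reads = [0, 1, 2, 1, 3]
writes = [1, 2, 1, 4, 0]
levels = inspect(reads, writes)  # computed once, reusable across invocations
data = [10, 20, 30, 40, 50]
assert execute(list(data), reads, writes, levels) == run_sequential(list(data), reads, writes)
```

Because `levels` depends only on `reads` and `writes`, the same schedule can be passed to `execute` again for as long as the access pattern stays the same, which mirrors the schedule reuse across loop invocations mentioned in the abstract.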
Prasad Krothapalli, V., Jeyaraman, T., & Giesbrecht, M. (1995). Run-time parallelization of irregular DOACROSS loops. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 980, pp. 75–80). Springer Verlag. https://doi.org/10.1007/3-540-60321-2_5