General-purpose computing is taking an irreversible step toward on-chip parallel architectures. One way to enhance the performance of chip multiprocessors is thread-level speculation (TLS). Identifying the points at which speculative threads are spawned is one of the critical issues for this kind of architecture. In this paper, we present a criterion for selecting the regions to be executed speculatively, in order to identify potential sources of speculative parallelism in general-purpose programs. A dynamic profiling method is provided to search a large space of TLS parallelization schemes and to locate where the parallelism lies within an application. We analyze the key factors that affect speculative thread-level parallelism in the SPEC CPU2000 benchmarks, evaluate whether a given application, or parts of it, is suitable for TLS, and study how to balance the thread partition to exploit speculative thread-level parallelism efficiently. The results show that inter-thread data dependences are ubiquitous and that a synchronization mechanism is necessary; return value prediction and loop unrolling are important for improving performance. The information obtained can be used to guide thread partitioning for TLS. © Springer-Verlag Berlin Heidelberg 2007.
Wang, Y., An, H., Liang, B., Wang, L., Cong, M., & Ren, Y. (2007). Balancing thread partition for efficiently exploiting speculative thread-level parallelism. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4847 LNCS, pp. 40–49). Springer Verlag. https://doi.org/10.1007/978-3-540-76837-1_8