OpenMP applications with abundant parallelism often achieve high performance. Unfortunately, OpenMP applications with many synchronization or serialization points perform poorly because of blocking, i.e., threads have to wait for each other. In this paper, we present methods based on hardware transactional memory (HTM) for executing OpenMP barrier, critical, and taskwait directives without blocking. Although HTM is still relatively new in the Intel and IBM architectures, we experimentally show a 73% performance improvement over traditional locking approaches, and a 23% improvement over other HTM approaches on critical sections. Speculation over barriers can decrease execution time by up to 41%. We expect that future systems with HTM support and more cores will benefit even more from our approach, as they are more likely to block.
Bonnichsen, L., & Podobas, A. (2015). Using transactional memory to avoid blocking in OpenMP synchronization directives: Don’t wait, speculate! In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9342, pp. 149–161). Springer Verlag. https://doi.org/10.1007/978-3-319-24595-9_11