The integration of the memory controller on the processor die enables ever larger core counts in commodity shared-memory systems with Non-Uniform Memory Architecture (NUMA) properties. Shared-memory parallelization with OpenMP is an elegant and widely used approach to leverage the power of such systems. The binding of OpenMP threads to compute cores and the corresponding memory association are becoming even more critical for obtaining optimal performance. In this work we provide a method to measure the amount of remote-socket memory accesses a thread generates. We use available CPU performance monitoring counters in combination with thread binding on a quad-socket Nehalem-EX system. For visualization of the collected data we use Vampir. © 2011 Springer-Verlag Berlin Heidelberg.
CITATION STYLE
Iwainsky, C., Reichstein, T., Dahnken, C., Mey, D. A., Terboven, C., Semin, A., & Bischof, C. (2011). An approach to visualize remote socket traffic on the Intel Nehalem-EX. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6586 LNCS, pp. 523–530). https://doi.org/10.1007/978-3-642-21878-1_64