A cache oblivious matrix transposition algorithm is implemented and analyzed using simulation and hardware performance counters. Contrary to its name, the cache oblivious matrix transposition algorithm is found to exhibit complex cache behavior, with a cache miss ratio that depends strongly on the associativity of the cache. In some circumstances the cache behavior is found to be worse than that of a naïve transposition algorithm. While the total cache size is an important factor in determining cache usage efficiency, the sub-block size, associativity, and cache line replacement policy are also shown to be very important. © Springer-Verlag Berlin Heidelberg 2004.
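For readers unfamiliar with the technique, the following is a minimal illustrative sketch of the general divide-and-conquer idea behind cache oblivious transposition, set against a naïve row-by-row transpose. It is not the authors' implementation: the function names, the row-major flat-list layout, and the `base` cutoff (standing in for a sub-block size, assumed to be at least 1) are all assumptions made for illustration.

```python
def transpose_naive(a, n_rows, n_cols):
    """Naive transpose of a row-major flat list (n_rows x n_cols)."""
    out = [0] * (n_rows * n_cols)
    for i in range(n_rows):
        for j in range(n_cols):
            # Element (i, j) of the input becomes element (j, i) of the output.
            out[j * n_rows + i] = a[i * n_cols + j]
    return out


def transpose_recursive(a, n_rows, n_cols, base=4):
    """Divide-and-conquer transpose; `base` is a hypothetical cutoff (>= 1)."""
    out = [0] * (n_rows * n_cols)

    def rec(r0, r1, c0, c1):
        h, w = r1 - r0, c1 - c0
        if h <= base and w <= base:
            # Small sub-block: copy it directly.
            for i in range(r0, r1):
                for j in range(c0, c1):
                    out[j * n_rows + i] = a[i * n_cols + j]
        elif h >= w:
            # Split the longer dimension (rows) in half.
            mid = r0 + h // 2
            rec(r0, mid, c0, c1)
            rec(mid, r1, c0, c1)
        else:
            # Split the longer dimension (columns) in half.
            mid = c0 + w // 2
            rec(r0, r1, c0, mid)
            rec(r0, r1, mid, c1)

    rec(0, n_rows, 0, n_cols)
    return out
```

Both functions produce the same result; they differ only in the order in which memory is touched, which is the property the paper studies. A pure-Python sketch like this shows the access order only and says nothing about the measured miss ratios.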
CITATION STYLE
Tsifakis, D., Rendell, A. P., & Strazdins, P. E. (2004). Cache oblivious matrix transposition: Simulation and experiment. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3037, 17–25. https://doi.org/10.1007/978-3-540-24687-9_3