# adiabatonauts
noticing that the three loops over i, j, and k in any #dense #linear #solver can be placed in any of the six permutations, and only some of those yield the desired data reuse that lets #vector #architectures show their stuff

... paper "In-Place Transposition of Rectangular Matrices" by Fred G. Gustavson and Tadeusz Swirszcz, which provides a solution for the problem. An online version of the paper can be found here: http://www.orcca.on.ca/conferences/cca2008/papers/gustavson.pdf

Cycles of Permutation Related to Rectangular Matrix Transposition
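
To make the link between permutation cycles and rectangular transposition concrete, here is a minimal sketch of the textbook cycle-following approach. It is not the algorithm from the Gustavson and Swirszcz paper; it only illustrates the underlying idea. It assumes a row-major matrix stored in a flat Python list, the function name `transpose_in_place` is hypothetical, and the `visited` bitmap is a simplification that costs extra memory, which a genuinely in-place method would avoid.

```python
def transpose_in_place(a, rows, cols):
    """Transpose a rows x cols row-major matrix held in the flat list `a`.

    Illustrative sketch only: the element at flat index i moves to
    (i * rows) mod (rows * cols - 1); the first and last elements stay put.
    """
    size = rows * cols
    if len(a) != size:
        raise ValueError("buffer length must equal rows * cols")
    if size < 2:
        return a
    last = size - 1
    # Assumption for simplicity: one flag per element to mark finished cycles.
    visited = [False] * size
    for start in range(1, last):
        if visited[start]:
            continue
        # Follow one cycle of the permutation, carrying the displaced element.
        carried = a[start]
        i = start
        while True:
            dest = (i * rows) % last
            carried, a[dest] = a[dest], carried
            visited[dest] = True
            i = dest
            if i == start:
                break
    return a


if __name__ == "__main__":
    m = [1, 2, 3,
         4, 5, 6]                # 2 x 3, row-major
    transpose_in_place(m, 2, 3)
    print(m)                     # now to be read as a 3 x 2 matrix
```

After the call, the same buffer is read as a `cols` x `rows` matrix. The index map follows from writing the flat index as `i = r * cols + c` and the target as `c * rows + r`, which equals `i * rows` modulo `rows * cols - 1`.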