No communication will occur when the data is local to the processor that references it. This relationship between data and computation is expressed by the following theorem.
Communication at the displacement level is inexpensive, since blocking can significantly reduce the amount of data transferred. The priority is therefore to find the best partitions and orientations, so we first focus on the version of Eqn. 2 that omits displacements. Letting the displacement terms be zero, Eqn. 2 reduces to

    C_j = D_x F_jx    (3)

for each array x referenced in each loop nest j.
Given the array index function matrices F_jx for each array x in each loop nest j, a decomposition is free of reorganization communication if the data decomposition matrices D_x and the computation decomposition matrices C_j are such that Eqn. 3 is true. A trivial solution that guarantees no communication is to execute everything sequentially by setting all computation decomposition matrices C = 0 and all data decomposition matrices D = 0. The nullspace of each C would then span the entire iteration space, and the nullspace of each D would span the entire array space: every iteration and every array element is assigned to the same processor. Maximizing parallelism, however, means finding data and computation decompositions such that the nullspace of C, for all loop nests, is as small as possible.
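The condition above can be checked mechanically: a reference is communication-free (ignoring displacements) exactly when C = D F, and the degree of sequentialization is the dimension of the nullspace of C. The following sketch illustrates this for a hypothetical two-deep loop nest containing a transposed reference x[i2, i1]; the matrices, the helper names `communication_free` and `nullspace_dim`, and the 1-D processor grid are all illustrative assumptions, not part of the original formulation.

```python
import numpy as np

def nullspace_dim(M, tol=1e-10):
    """Dimension of the nullspace of M, via the number of near-zero singular values."""
    s = np.linalg.svd(M, compute_uv=False)
    return M.shape[1] - int(np.sum(s > tol))

def communication_free(C, D, F):
    """Displacement-free condition of Eqn. 3 for one array reference: C = D F."""
    return np.allclose(C, D @ F)

# Hypothetical reference x[i2, i1] inside a 2-deep loop nest over (i1, i2):
# the index function matrix F swaps the two iteration-space axes.
F = np.array([[0, 1],
              [1, 0]])

# Distribute the iterations of the nest by i1 onto a 1-D processor grid.
C = np.array([[1, 0]])

# The array must then be distributed by its second dimension so that D F = C.
D = np.array([[0, 1]])

print(communication_free(C, D, F))   # the decomposition needs no communication
# dim of nullspace of C: 1 of the 2 iteration dimensions stays sequential;
# a smaller nullspace would mean more parallelism.
print(nullspace_dim(C))
```

Distributing the array by its first dimension instead (D = [1, 0]) would violate Eqn. 3 for this reference and force a data reorganization.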