The displacement specifies the offsets of the array elements and
iterations with respect to the processors.
In loop nest 2, accesses to the columns of array Y are offset by one
from the accesses to the rows of array Z.
In loop nest 1, accesses to arrays X and Y have no offset.
Assigning the columns of array X, the columns of array Y, and the rows
of array Z to processors with these offsets satisfies this requirement.
The iterations of the loop in the second loop nest are then assigned to
the corresponding processors.
The complete decompositions with displacements are illustrated in
Figure 1(c).
Formally, the data and computation displacements are the constant vectors
from Definitions 2.1 and 2.2, respectively.
The orientation matrix derived from the partition, together with the
displacement, forms the complete decomposition.
Figure 1(c) also shows the data and
computation displacements and the final decompositions for the example.
As was the case with orientations, there are also many possible
displacements that lead to communication-free decompositions.
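To make the role of the displacement concrete, the following minimal sketch treats each decomposition as an affine map (a matrix times an index vector, plus a constant displacement) onto a one-dimensional processor array. The loop statement, the helper names `affine_map` and `is_communication_free`, and all matrices are hypothetical and chosen only for illustration; they are not the example of Figure 1. The sketch checks that an offset of one on the columns of Y is needed, and that shifting every displacement by the same constant gives another communication-free decomposition.

```python
from itertools import product


def affine_map(matrix, displacement, index):
    """Return matrix * index + displacement as a tuple of ints."""
    return tuple(
        sum(m * x for m, x in zip(row, index)) + d
        for row, d in zip(matrix, displacement)
    )


# Hypothetical toy statement (an assumption, not the paper's Figure 1):
#   for i in 1..N: for j in 1..N: Z[i][j] = Y[j][i - 1]
# Each array access is affine in the iteration vector (i, j).
ACCESS_Z = ([[1, 0], [0, 1]], [0, 0])   # Z[i][j]
ACCESS_Y = ([[0, 1], [1, 0]], [0, -1])  # Y[j][i - 1]

# Decompositions as (matrix, displacement) pairs.
COMP = ([[1, 0]], [0])    # iteration (i, j) runs on processor i
DATA_Z = ([[1, 0]], [0])  # row r of Z lives on processor r
DATA_Y = ([[0, 1]], [1])  # column c of Y lives on processor c + 1


def is_communication_free(n, comp, data_z, data_y):
    """True if every iteration runs on the processor that owns its Z and Y elements."""
    for i, j in product(range(1, n + 1), repeat=2):
        proc = affine_map(*comp, (i, j))
        z_elem = affine_map(*ACCESS_Z, (i, j))
        y_elem = affine_map(*ACCESS_Y, (i, j))
        if affine_map(*data_z, z_elem) != proc:
            return False
        if affine_map(*data_y, y_elem) != proc:
            return False
    return True


if __name__ == "__main__":
    # The offset of one on Y makes the toy statement communication-free.
    print(is_communication_free(4, COMP, DATA_Z, DATA_Y))
    # Dropping the displacement on Y breaks it.
    print(is_communication_free(4, COMP, DATA_Z, ([[0, 1]], [0])))
    # Shifting every displacement by the same constant preserves it.
    print(is_communication_free(4, ([[1, 0]], [5]), ([[1, 0]], [5]), ([[0, 1]], [6])))
```

The brute-force check over all iterations is used here only because it is easy to read; it is not how the displacements are derived.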
We can now summarize the basis of our approach.
There are many different, yet equivalent, decompositions with the
same partition.
We reduce the complexity of finding the data decomposition function for
each array and the computation decomposition function for each loop nest
by first finding a partition that is guaranteed to lead to the desired
decomposition.
Then a simple calculation can be used to find the appropriate orientations and
displacements that completely specify the decompositions.
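One way to see why this final calculation can be simple is sketched below. The notation is an assumption made for illustration and may differ from Definitions 2.1 and 2.2: array $x$ is mapped to processors by $D_x\vec{a} + \vec{\delta}_x$, loop nest $j$ by $C_j\vec{\imath} + \vec{\gamma}_j$, and the loop nest references the array through an affine access $\vec{a} = F\vec{\imath} + \vec{k}$. A decomposition is communication-free for that reference if the element accessed by each iteration lives on the processor executing the iteration, for which the two conditions on the last lines are sufficient.

```latex
\begin{align*}
D_x\,(F\vec{\imath} + \vec{k}) + \vec{\delta}_x
  &= C_j\,\vec{\imath} + \vec{\gamma}_j
  && \text{for every iteration } \vec{\imath}, \\
\text{which holds whenever}\quad
D_x F &= C_j
  && \text{(orientations agree)}, \\
\vec{\delta}_x &= \vec{\gamma}_j - D_x\,\vec{k}
  && \text{(displacements agree)}.
\end{align*}
```

Under these assumptions, once the orientation matrices are fixed, each displacement follows from a single matrix-vector product and a subtraction. Adding the same constant vector to every $\vec{\delta}_x$ and $\vec{\gamma}_j$ leaves both conditions satisfied, which is consistent with the observation that many displacements are possible.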