When a loop nest accesses only a subsection of an array, the number of virtual processor dimensions may be larger than the nesting depth of that loop nest. As a result, only a fraction of the processors will be busy while the loop nest executes. To avoid idle processors, we use the computation decomposition to find those processor dimensions that have parallelism in all loop nests. The equation for the number of virtual processor dimensions n is modified so that n does not exceed the dimensionality of the smallest distributed iteration space:
Here is the array space accessed. We then reduce the number of virtual processor dimensions by projecting the n-dimensional virtual processor space onto a lower-dimensional processor space. The dimensions of the virtual processor space to project onto are chosen with respect to every computation decomposition matrix C, so that no projection targets a processor dimension that is idle during the execution of any loop nest.
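The selection of non-idle processor dimensions can be illustrated with a minimal sketch. It rests on simplifying assumptions that are not taken from the text: each computation decomposition matrix C has one row per virtual processor dimension, a dimension counts as idle in a loop nest when its row in C is entirely zero, and only axis-aligned projections are considered. The names `busy_processor_dims` and `project`, and the matrices `C1` and `C2`, are hypothetical.

```python
def busy_processor_dims(decompositions):
    """Return the virtual processor dimensions that are busy in every loop nest.

    Assumption: each decomposition C is a list of rows, and row k maps an
    iteration vector to virtual processor coordinate k. An all-zero row means
    coordinate k is constant over the loop nest, so dimension k is idle there.
    """
    n = len(decompositions[0])
    if any(len(C) != n for C in decompositions):
        raise ValueError("all decompositions must target the same processor space")
    return [
        k for k in range(n)
        if all(any(x != 0 for x in C[k]) for C in decompositions)
    ]


def project(point, dims):
    """Project a virtual processor coordinate onto the kept dimensions."""
    return tuple(point[k] for k in dims)


# Two hypothetical loop nests mapped onto a 2-D virtual processor space.
C1 = [[1, 0],
      [0, 1]]   # 2-deep nest: both processor dimensions vary
C2 = [[1],
      [0]]      # 1-deep nest: processor dimension 1 never varies

kept = busy_processor_dims([C1, C2])
print(kept)                   # [0]
print(project((3, 5), kept))  # (3,)
```

In this example the second loop nest leaves processor dimension 1 idle, so only dimension 0 is kept and the two-dimensional virtual processor space is projected onto a one-dimensional one. A general implementation would admit arbitrary projection vectors rather than coordinate axes.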