When a loop nest accesses only a subsection of an array, the
number of virtual processor dimensions may be larger than
the nesting depth of that loop nest.
As a result, only a fraction of the processors are busy while
the loop nest executes.
To avoid idle processors,
we use the computation decompositions to find those processor
dimensions that have parallelism in every loop nest.
The equation for the number of virtual processor dimensions **n**
is modified so that **n** is limited to the minimum distributed iteration space:

Here
is the array space accessed.
We then reduce the number of virtual processor dimensions by projecting
the **n**-dimensional virtual processor space onto a
lower-dimensional processor space.
In choosing the dimensions in the virtual processor space
to project onto, vectors are selected
such that
for all computation decomposition
matrices **C**.
Consequently, no processor dimension that we project onto
is idle during the execution of any loop nest.
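
The selection of never-idle processor dimensions can be illustrated with a minimal Python sketch. It is not the algorithm of this section: it assumes each computation decomposition matrix **C** is given as a list of rows (one row per virtual processor dimension, one column per loop), it considers only the coordinate axes of the virtual processor space as candidate projection vectors, and it treats a dimension as idle for a loop nest when the corresponding row of **C** is all zeros. The function names `busy_dimensions`, `never_idle_dimensions`, and `project` are hypothetical.

```python
def busy_dimensions(C):
    """Indices of virtual processor dimensions that C actually distributes over.

    Assumption: a dimension is idle for this loop nest when its row in C is
    all zeros, i.e. no loop index contributes to that processor coordinate.
    """
    return {k for k, row in enumerate(C) if any(x != 0 for x in row)}


def never_idle_dimensions(decompositions):
    """Dimensions that are busy in every loop nest (intersection over all C)."""
    common = None
    for C in decompositions:
        busy = busy_dimensions(C)
        common = busy if common is None else common & busy
    return sorted(common) if common else []


def project(point, keep):
    """Project a virtual processor coordinate onto the kept dimensions."""
    return tuple(point[k] for k in keep)


# Two loop nests sharing a 2-D virtual processor space.
# Nest 1 is two deep and distributes both loops.
C1 = [[1, 0],
      [0, 1]]
# Nest 2 is one deep, so it can only fill the first processor dimension.
C2 = [[1],
      [0]]

keep = never_idle_dimensions([C1, C2])
reduced = project((3, 5), keep)
```

In this example only the first processor dimension is busy in both loop nests, so the 2-D virtual processor space is projected onto that single dimension. A full treatment would allow arbitrary projection vectors rather than only coordinate axes.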

Fri Apr 7 14:39:58 PDT 1995