The decomposition problem is very complex, as there are many inter-related issues that must be addressed. This paper addresses the full problem of automatically calculating data and computation decompositions for programs in a systematic way.

Our algorithms are based on the mathematical model of decompositions as affine functions. This framework is general enough to handle a broad class of array access patterns. Using the affine model we structure decompositions into three components: partition, orientation and displacement. Since equivalent decompositions have the same partition, we solve for the partition first and can therefore evaluate many possible decomposition designs simultaneously.

To maximize parallelism, our algorithm exploits ** forall**
parallelism, as well as ** doacross** parallelism using tiling.
To minimize communication, the algorithm tries to find a
static decomposition
that exploits the maximum degree of parallelism available in the
program such that there is no reorganization nor pipeline communication.
The algorithm will trade off extra degrees of parallelism to
eliminate communication.
If communication is needed, the algorithm
will try to
reduce expensive reorganization communication to inexpensive pipelined
communication by tiling.
Finally, any necessary data reorganization
communication is inserted into the least frequently executed parts of
the program.

Fri Apr 7 14:39:58 PDT 1995