Next: The Foldability Criteria Up: Folding Previous: Depthwise Folding

Widthwise Folding

I have shown how to reduce the hardware required from to PEs, but we can reduce it still further. Suppose this time that we feed the left half of the inputs in on the first eight clock cycles, and the right half of the inputs on the second eight clock cycles, as shown in Figure 2.5. Except for a data dependency between the nodes on the line dividing the array in half, the two halves operate independently, and are not in use at the same time. We can use short term memory to store the results that cross the dividing line, and then the array can be executed with half the number of processors as was done earlier. Again the PEs must switch tasks as necessary. This kind of folding is called widthwise folding.

The properties of widthwise folding are:

Depthwise folding and widthwise folding can be combined arbitrarily to execute the desired array with as little as a single PE. This brings us to the first major property of an origami array:

The amount of time required to execute an origami array is inversely proportional to the number of processors available, assuming the width and height of the desired array are perfect multiples of the width and height of the available processor array.



Next: The Foldability Criteria Up: Folding Previous: Depthwise Folding


Robert French