Go to the previous, next section.

Tiling Transforms

The tiling transformation, in the general case, maps an n-deep loop nest into a 2n-deep loop nest where the inner n loops include only a small fixed number of iterations.

The method tile_transform in the loop_transform class is used to generate the tiling transformations. The loop nest can be separated into regions, where each region is tiled separately. Each region can optionally be coalesced. For a region of size r, if coalescing is turned off, then the region is transformed into 2r loops. If coalescing is turned on for the region, then it is transformed into 2r - 1 loops. A region of size one will not be transformed, regardless of whether it is coalesced.

The tile_transform method takes four parameters:

trip
An array of integer trip sizes, one for each loop. This specifies the size of the block for the corresponding loop.

nregions
The number of tiling regions.

coalesce
An array of booleans, one for each region. If true then the corresponding region is coalesced.

first
An index array giving the position of the first loop in the region. This array must have its last element (at position nregions) set to the nesting depth of the entire loop nest.

In addition, the tiling transformation requires that the boolean array of doalls in the loop_transform class be filled in. See section Loop Transformations.

The method returns a boolean to indicate if the transformation was successful. In general, given the following region:

for (i = lb0; i <= ub0; i++) {
    for (j = lb1(i); j <= ub1(i); j++) {
        ...
    }
}

Using blocks of size b0 for the i loop, and blocks of size b1 for the j loop, the tiling transformation turns the code into:

for (i_tile = lb0; i_tile <= ub0; i_tile += b0) {
    for (j_tile = lb1(i_tile);
         j_tile <= max(ub1(i_tile), ub1(i_tile+b0-1)); j_tile += b1) {

        for (i = i_tile; i <= min(ub0, i_tile+b0-1), i++) {
            for (j = max(lb1(i), j_tile); 
                 j <= min(ub1(i), j_tile+b1-1); j++) {
                        ...
            }
        }
    }
}

In this example, we assume the lower and upper bounds of the loops are increasing functions of the enclosing loops plus a constant (the resulting code for decreasing functions is slightly different). If possible, the lower and upper bounds are calculated at compile-time. Otherwise, the max and min instructions in the bounds are performed at run-time.

Go to the previous, next section.