In this section we describe how to generate multiple tiled regions in a
single loop nest.
We can treat the following loop nest of depth 4 as two regions of depth 2:
the i and j loops are tiled together, and the k and l loops are tiled
together.
for (i = 0; i <= N; i++) {
  for (j = 0; j <= N; j++) {
    for (k = 0; k <= N; k++) {
      for (l = 0; l <= N; l++) {
        ...
      }
    }
  }
}
To create blocks of size 100, we'd use the following parameters:
trip[0] = 100;
trip[1] = 100;
trip[2] = 100;
trip[3] = 100;
nregions = 2;
coalesce[0] = FALSE;
coalesce[1] = FALSE;
first[0] = 0;
first[1] = 2;
first[2] = 4;
The resulting code is:
for (i_tile = 0; i_tile <= N; i_tile += 100) {
  for (j_tile = 0; j_tile <= N; j_tile += 100) {
    for (i = i_tile; i <= min(N, i_tile + 99); i++) {
      for (j = max(0, j_tile); j <= min(N, j_tile + 99); j++) {
        for (k_tile = 0; k_tile <= N; k_tile += 100) {
          for (l_tile = 0; l_tile <= N; l_tile += 100) {
            for (k = k_tile; k <= min(N, k_tile + 99); k++) {
              for (l = max(0, l_tile); l <= min(N, l_tile + 99); l++) {
                ...
              }
            }
          }
        }
      }
    }
  }
}
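One way to convince yourself that a two-region tiling of this shape preserves the iteration space is to run it on a small problem and count visits. The sketch below is only an illustration, not output of the tiling pass: the function name tiled_nest_covers_once, the MAXN cap, and the min_int helper are hypothetical, and the tile size is passed as a parameter b instead of the fixed 100 used above. It assumes the loop body is just a visit counter.

```c
#include <assert.h>
#include <string.h>

#define MAXN 8  /* hypothetical cap so the visit table fits in static storage */

static int min_int(int a, int b) { return a < b ? a : b; }

/* visits[i][j][k][l] counts how often the tiled nest reaches each point. */
static unsigned char visits[MAXN][MAXN][MAXN][MAXN];

/* Runs a two-region tiled nest (tile size b) over 0..n inclusive in every
   dimension. Returns 1 if each point was visited exactly once, else 0. */
int tiled_nest_covers_once(int n, int b)
{
    int i, j, k, l, i_tile, j_tile, k_tile, l_tile;

    assert(n >= 0 && n < MAXN && b > 0);
    memset(visits, 0, sizeof visits);

    /* Region 1: the i and j loops, tiled together. */
    for (i_tile = 0; i_tile <= n; i_tile += b)
      for (j_tile = 0; j_tile <= n; j_tile += b)
        for (i = i_tile; i <= min_int(n, i_tile + b - 1); i++)
          for (j = j_tile; j <= min_int(n, j_tile + b - 1); j++)
            /* Region 2: the k and l loops, tiled together. */
            for (k_tile = 0; k_tile <= n; k_tile += b)
              for (l_tile = 0; l_tile <= n; l_tile += b)
                for (k = k_tile; k <= min_int(n, k_tile + b - 1); k++)
                  for (l = l_tile; l <= min_int(n, l_tile + b - 1); l++)
                    visits[i][j][k][l]++;

    /* Every point of the original nest must have been reached once. */
    for (i = 0; i <= n; i++)
      for (j = 0; j <= n; j++)
        for (k = 0; k <= n; k++)
          for (l = 0; l <= n; l++)
            if (visits[i][j][k][l] != 1)
              return 0;
    return 1;
}
```

Calling tiled_nest_covers_once(7, 3) exercises the case where the tile size does not divide the trip count, so the min bound on the last tile matters. A tile size larger than the trip count, as in tiled_nest_covers_once(5, 100), degenerates to a single tile per loop.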