Table 2 counts the loops in the SUIF-parallelized program that require a particular technique to be parallelizable. We count all parallelizable loops, including those nested within other parallel loops, which would consequently not be executed in parallel under our parallelization strategy. The first column gives the number of loops that are parallelizable in the baseline system. The next three columns measure the applicability of the intraprocedural versions of the advanced array analyses: reduction recognition, privatization, and both reduction recognition and privatization, respectively. The remaining four columns all include interprocedural data dependence analysis. The fifth column uses that analysis alone, and, as in the intraprocedural case, the sixth to eighth columns measure the effect of adding interprocedural reduction recognition, privatization, and both reduction recognition and privatization, respectively.
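To make the column categories concrete, the following is a minimal sketch of a loop nest whose outer loop would fall under "both reduction recognition and privatization". It is illustrative only: the function `sum_scaled`, the array size `N`, and the kernel itself are hypothetical and are not taken from the benchmark suite.

```c
#define N 8

/* Hypothetical kernel, for illustration only.
 * The outer i-loop carries two apparent dependences:
 *   - tmp[] is reused by every iteration, but each iteration writes all
 *     of tmp[] before reading it, so giving each iteration its own copy
 *     (privatization) removes the dependence;
 *   - total is updated only through an associative "+=", so it can be
 *     accumulated per processor and combined at the end (reduction). */
double sum_scaled(double a[N][N], double scale) {
    double total = 0.0;
    double tmp[N];

    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            tmp[j] = a[i][j] * scale;   /* tmp fully defined before use */
        }
        for (int j = 0; j < N; j++) {
            total += tmp[j];            /* sum reduction on total */
        }
    }
    return total;
}
```

If the inner loops were moved into a separate procedure called from the outer loop, recognizing the same two properties would require the interprocedural versions of the analyses.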
Table 2: Static Measurements: Number of Loops Using Each Technique
We see from this table that the advanced array analyses are applicable to a majority of the programs in the benchmark suite, and several programs can take advantage of all the interprocedural array analyses. Although the techniques do not apply uniformly to all the programs, the frequency with which they apply to this relatively small set of programs demonstrates that the techniques are general and useful. Many more loops do not require any new array techniques. However, loops parallelized with advanced array analyses often involve more computation and, as shown below, can make a substantial difference in overall performance.