The interprocedural parallelization analysis described in the previous sections is implemented as part of the Stanford SUIF compiler. This section provides an empirical evaluation of the results of the parallelization analysis on a collection of benchmark programs.
Previous evaluations of interprocedural parallelization systems have provided static measurements of the number of additional loops parallelized as a result of interprocedural analysis [13,14,18,24]. We have compared our results with the most recent of these empirical studies, which examines the SPEC89 and PERFECT benchmark suites. Considering only the loops that contain calls in this set of 16 programs, the SUIF system parallelizes more than five times as many of these loops. The key difference between the two systems is that SUIF contains full interprocedural array analysis, including array privatization and reduction recognition (see Section 5).
Static loop counts, however, are not good indicators of whether parallelization will be successful. In particular, parallelizing just one outermost loop can have a profound impact on a program's performance. Dynamic measurements provide much more insight into whether a program may benefit from parallelization. Thus, in addition to static measurements on the benchmark suites, we also present a series of results gathered from executing the programs on a parallel machine. We present overall speedup results, as well as measurements of some of the factors that determine the speedup. We also provide results that identify the contributions of the individual analysis components of our system, focusing on the advanced array analyses.
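To illustrate why static counts can mislead, consider a rough sketch based on Amdahl's law. The symbols and figures here are assumptions chosen for the example, not measurements from this study: let c be the fraction of sequential execution time spent in parallelized loops and p the number of processors, and ignore all parallelization overhead. The ideal speedup is then bounded by

\[
S(c, p) = \frac{1}{(1 - c) + c/p}.
\]

Suppose a hypothetical program has 100 loops, one of which is an outermost loop accounting for 90% of the execution time. Parallelizing only that loop on p = 8 processors gives

\[
S(0.9, 8) = \frac{1}{0.1 + 0.9/8} = \frac{1}{0.2125} \approx 4.7,
\]

whereas parallelizing the other 99 loops, which together cover the remaining 10%, gives

\[
S(0.1, 8) = \frac{1}{0.9 + 0.1/8} = \frac{1}{0.9125} \approx 1.1.
\]

Under these assumptions, the compiler that parallelizes 99 loops looks far better by static count yet delivers almost no speedup, which is why the dynamic measurements are needed.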