
Related Work

In the late 1980s, a series of papers presented results on interprocedural parallelization analysis [13,18,24]. Their common approach was to determine the sections of arrays that are modified or referenced by each procedure call; a loop containing calls can then be parallelized whenever each invocation modifies array elements distinct from those referenced or modified by the other invocations. These techniques were shown to be effective in parallelizing linear algebra libraries. More recently, the FIDA system was developed at IBM to obtain more precise array sections through partial inlining of array accesses [14] (see Section 6).
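A minimal sketch of the kind of loop these analyses target appears below. The names update_block, N, and M are ours, not from the cited work, and C is used only for illustration (the cited systems analyze Fortran).

    #include <stdio.h>

    #define N 8    /* number of loop iterations */
    #define M 16   /* elements written per call */

    /* Each call writes only the section a[i*M .. i*M+M-1]. */
    void update_block(double *a, int i) {
        for (int j = 0; j < M; j++)
            a[i * M + j] = (double)(i + j);
    }

    int main(void) {
        static double a[N * M];
        /* An array-section summary of update_block shows that the
           sections written by distinct iterations are disjoint, so
           the loop can be parallelized despite containing a call. */
        for (int i = 0; i < N; i++)
            update_block(a, i);
        printf("%g\n", a[N * M - 1]);
        return 0;
    }

Without a summary of the sections each call touches, the compiler must conservatively assume the calls conflict and leave the loop sequential.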

Irigoin et al. have developed PIPS, an interprocedural analysis system that is part of an environment for parallel programming [16]. More recently, PIPS has been extended to incorporate interprocedural array privatization [15,6]. Of these systems, PIPS is the most similar to our work, but it lacks three important features: (1) path-specific interprocedural information, such as that obtained through selective procedure cloning; (2) interprocedural reductions; and (3) extensive interprocedural scalar data-flow analysis, such as scalar privatization.
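To make the array-privatization feature concrete, the following is a minimal sketch of the pattern it handles; the names t, fill, and sum are hypothetical, and the point is that the write and the read of the scratch array are hidden behind separate calls, so proving the write-before-read pattern requires interprocedural analysis.

    #include <stdio.h>

    #define N 8
    #define M 16

    double t[M];  /* shared scratch array: source of a false dependence */

    /* Every iteration writes all of t (in fill) before reading it
       (in sum), so t can be replaced by a per-iteration private
       copy, removing the loop-carried dependence. */
    void fill(int i) {
        for (int j = 0; j < M; j++)
            t[j] = (double)(i + j);
    }

    double sum(void) {
        double s = 0.0;
        for (int j = 0; j < M; j++)
            s += t[j];
        return s;
    }

    int main(void) {
        double out[N];
        for (int i = 0; i < N; i++) { /* parallel after privatizing t */
            fill(i);        /* write t through one call    */
            out[i] = sum(); /* read t through another call */
        }
        printf("%g\n", out[N - 1]);
        return 0;
    }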

The Polaris system at the University of Illinois is also being developed to advance the state of the art in parallelization technology [2]. The most fundamental difference between our system and Polaris is that Polaris performs no interprocedural analysis, relying instead on fully inlining programs to obtain interprocedural information. The Polaris group has demonstrated that good coverage results (the percentage of the program parallelized) can be obtained automatically. Although they report that full inlining is feasible on eight medium-sized programs, this approach will have difficulty parallelizing large loops containing thousands of lines of code.

A few commercial parallelizing compilers include initial interprocedural analysis support. Most notably, the Convex Applications Compiler performs flow-insensitive array analysis and interprocedural constant propagation, and it obtains some path-specific information through inlining and procedure cloning [21]. Applied Parallel Research has demonstrated good speedups on some of the programs presented here; those programs were parallelized with programmer directives that instruct the compiler to ignore dependences and to privatize certain variables. We know of no commercial system that currently employs any flow-sensitive array analysis, in particular interprocedural array privatization.
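For illustration only, the effect of such directives can be expressed in a modern notation like OpenMP; APR's actual directive syntax differs, and this sketch is an analogy rather than a rendering of their system.

    #include <stdio.h>
    #include <math.h>

    #define N 1024

    /* The directive asserts that iterations are independent and
       requests a private copy of the scratch t, analogous in spirit
       (not in syntax) to the APR directives described above. */
    void scale(const double *a, double *b) {
        double t;                            /* shared by default     */
        #pragma omp parallel for private(t)  /* privatize t, assume   */
        for (int i = 0; i < N; i++) {        /* no other dependences  */
            t = sqrt(a[i]);
            b[i] = t * t + t;
        }
    }

    int main(void) {
        static double a[N], b[N];
        for (int i = 0; i < N; i++) a[i] = (double)i;
        scale(a, b);
        printf("%g\n", b[N - 1]);
        return 0;
    }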





