In the late 1980s, a series of papers presented results on interprocedural parallelization analysis [9,15,20]. Their common approach was to determine the sections of arrays that are modified or referenced by each procedure call, enabling parallelization of some loops containing calls whenever each invocation modifies array elements distinct from those referenced or modified by other invocations. These techniques were shown to be effective in parallelizing linear algebra libraries. More recently, the FIDA system was developed at IBM to obtain more precise array sections through partial inlining of array accesses (see Section 7).
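The parallelization condition above can be sketched concretely. The following is a minimal illustration, not any of the cited systems' actual algorithms; the helper names and the striped access pattern are hypothetical. It summarizes the array section written by each call and declares the loop parallelizable only when the per-invocation sections are pairwise disjoint.

```python
# Illustrative sketch of section-based parallelization analysis.
# (All names are hypothetical; real systems summarize sections
# symbolically, e.g. as triplet or region descriptors, not as sets.)

def write_section(i, m):
    """Summary of a hypothetical callee: invocation i writes a[i*m : i*m+m]."""
    return set(range(i * m, i * m + m))

def loop_is_parallel(n, m):
    # Parallelizable when no two invocations touch overlapping
    # sections of the array.
    sections = [write_section(i, m) for i in range(n)]
    return all(sections[i].isdisjoint(sections[j])
               for i in range(n) for j in range(i + 1, n))

print(loop_is_parallel(8, 4))  # disjoint stripes -> True
```

Real analyses represent sections symbolically so the disjointness test is decided without enumerating elements; the set-based version here only makes the underlying condition explicit.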
Irigoin et al. developed an interprocedural analysis system, PIPS, as part of an environment for parallel programming. More recently, PIPS has been extended to incorporate interprocedural array privatization [5,11]. PIPS is the system most similar to ours, but it lacks three important features: (1) path-specific interprocedural information, such as that obtained through selective procedure cloning; (2) interprocedural reductions; and (3) extensive interprocedural scalar data-flow analysis, such as scalar privatization.
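To make the privatization idea concrete, here is a minimal sketch (with illustrative names, not PIPS's or our system's actual transformation): a scratch array that is fully written before it is read in every iteration carries no values across iterations, so giving each iteration a private copy removes the only cross-iteration dependence.

```python
# Sketch of array privatization. `tmp` is defined before any use in
# each iteration, so it can be made private per iteration (or per
# parallel worker) without changing the result.

def kernel_serial(n, m):
    tmp = [0.0] * m            # shared scratch array: a false dependence
    out = []
    for i in range(n):
        for j in range(m):     # tmp fully written first...
            tmp[j] = i + j
        out.append(sum(tmp))   # ...then read, so nothing flows between iterations
    return out

def kernel_privatized(n, m):
    def body(i):
        tmp = [i + j for j in range(m)]  # private copy per iteration
        return sum(tmp)
    return [body(i) for i in range(n)]   # iterations are now independent

assert kernel_serial(4, 3) == kernel_privatized(4, 3)
```

The same written-before-read reasoning applies interprocedurally when the scratch array is written inside a called procedure, which is why the analysis must summarize callee effects rather than treat calls as opaque.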
The Polaris system at the University of Illinois is also pushing the state of the art in parallelization technology. The most fundamental difference between our system and Polaris is that Polaris performs no interprocedural analysis, relying instead on fully inlining programs to obtain interprocedural information. The Polaris group has demonstrated that good coverage (the fraction of the program parallelized) can be obtained automatically. Although they report that full inlining is feasible for eight medium-sized programs, this approach will likely have difficulty parallelizing large loops containing thousands of lines of code.