A simple example motivating the need for array privatization is the K loop in Figure 2(a), a 160-line loop taken from the NAS sample benchmark appbt. (To concisely present these examples, we use Fortran 90 array notation in place of the actual loops.) Although the same array locations in TM are defined and used on different iterations of the outer loop, no value flows across iterations. Consequently, it is safe to parallelize this loop if a private copy of TM is accessed by each process. Finding privatizable arrays requires that data-flow analysis previously only performed for scalar variables be applied to individual array elements; this analysis is called array data-flow analysis.
Figure 2: Array analysis examples.
Array privatization with initialization. Array privatization is usually applied to the case where each iteration first defines the values in the array before they are used. However, it is also applicable to loops whose iterations use values computed outside the loop; the private copies must be initialized with these values before parallel execution begins. In other words, array privatization is illegal only when iterations refer to values generated by preceding iterations in the loop. An example of array privatization with initialization is shown in Figure 2(b). The figure shows a portion of a 1002-line loop in the Perfect benchmark spec77 (see Section 6.3.2). Here, part of array ZE, the second row, is modified before referenced; the remainder of the array is not modified at all in the loop. Array ZE is privatizable in the outer loop by giving each processor a private copy with all but the second row initialized with the original values.
Recognizing complicated regions. Data flow analysis on arrays is intrinsically more difficult than analysis on scalar variables, as it is necessary to keep track of accesses to individual array elements. In some cases, the reads and writes to an array may appear in multiple loops, and the read and write operations may be interleaved. The compiler must keep precise enough information for the analysis while maintaining efficiency. Figure 2(b), which is also part of the 1002-line loop from spec77, illustrates the complexity of the problem. Each statement in array notation here corresponds to a doubly nested loop. The compiler can determine that the array W is privatizable by inferring from the collection of write operations that W is completely defined before it is read.
Figure 3: Reduction analysis examples.