
In the SUIF vs. KAP tables (see section Performance Results), a number of
the *KAP only* loops are due to SUIF parallelizing an outer loop
while KAP is parallelizing an inner loop. However, there are a number
of limitations in the current SUIF compiler that cause it to miss some
loops that KAP is able to parallelize.

          DO 30 K=1,NZ
            UM(K)=REAL(WM(K))
            VM(K)=AIMAG(WM(K))
            WRITE(6,40) K,ZET(K),UG(K),VG(K),TM(K),DKM(K),UM(K),VM(K)
            WRITE(8,40) K,ZET(K),UG(K),VG(K),TM(K),DKM(K),UM(K),VM(K)
    30    CONTINUE

However, KAP is able to parallelize some of the statements in the loop by applying loop distribution:

          DO 2 K=1,NZ
            UM(K)=REAL(WM(K))
            VM(K)=AIMAG(WM(K))
    2     CONTINUE
          DO 3 K=1,NZ
            WRITE(6,40) K,ZET(K),UG(K),VG(K),TM(K),DKM(K),UM(K),VM(K)
            WRITE(8,40) K,ZET(K),UG(K),VG(K),TM(K),DKM(K),UM(K),VM(K)
    3     CONTINUE

The `DO 2 K` loop can now be parallelized, and the `DO 3 K` loop
runs sequentially.
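As a rough illustration of why the distributed form is safe, the sketch below models both versions in Python (chosen only so the behavior can be checked, not because SUIF or KAP work this way). The function names, the `out` list standing in for the two `WRITE` units, and the sample data in the usage note are all assumptions made for this example.

```python
def original_loop(wm):
    """Model of the DO 30 loop: compute and write in the same iteration."""
    n = len(wm)
    um, vm, out = [0.0] * n, [0.0] * n, []
    for k in range(n):
        um[k] = wm[k].real
        vm[k] = wm[k].imag
        out.append((k + 1, um[k], vm[k]))  # stands in for the WRITE statements
    return out


def distributed_loops(wm):
    """Model of the DO 2 / DO 3 pair produced by loop distribution."""
    n = len(wm)
    um, vm, out = [0.0] * n, [0.0] * n, []
    # DO 2: each iteration touches only element k, so any order is valid.
    # Running it backwards here mimics an arbitrary parallel schedule.
    for k in reversed(range(n)):
        um[k] = wm[k].real
        vm[k] = wm[k].imag
    # DO 3: the I/O keeps its original order.
    for k in range(n):
        out.append((k + 1, um[k], vm[k]))
    return out
```

Calling both functions on the same list of complex values, for example `[1+2j, 3+4j]`, should give identical `out` lists, which is the property loop distribution relies on here.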

Each of the remaining SUIF limitations listed below has a lesser impact
than loop distribution and equivalences. Each accounts for only a
small number of the *KAP only* loops.

          NS2 = (N+1)/2
          NP2 = N+2
          ...
          DO 102 K=2,NS2
            KC = NP2-K
            XH(K) = W(K-1)*X(KC)+W(KC-1)*X(K)
            XH(KC) = W(K-1)*X(K)-W(KC-1)*X(KC)
    102   CONTINUE

SUIF replaces `NS2` with `(N+1)/2` and `KC` with `N+2-K`, leaving the
following code:

          DO 102 K=2,(N+1)/2
            KC = NP2-K
            XH(K) = W(K-1)*X(N+2-K)+W(N-K+1)*X(K)
            XH(N+2-K) = W(K-1)*X(K)-W(N-K+1)*X(N+2-K)
    102   CONTINUE

However, since SUIF cannot determine whether `N+1` is even, the
function `(N+1)/2` is non-linear. The dependence library
then does not use the bounds to determine that the accesses to
`XH(K)` and `XH(N+2-K)` are independent.
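The independence that the loop bound guarantees can be checked by brute force. The Python sketch below is only an illustration under the assumption that `N` is non-negative; the function name is hypothetical and nothing here reflects how SUIF's dependence library works. It enumerates the elements of `XH` written by the loop (`XH(K)` and `XH(N+2-K)` in the code above) and reports whether any element is written more than once.

```python
def writes_are_independent(n):
    """Return True if no element of XH is written twice by the DO 102 loop."""
    seen = set()
    for k in range(2, (n + 1) // 2 + 1):   # DO 102 K=2,(N+1)/2
        for index in (k, n + 2 - k):       # subscripts of the two writes
            if index in seen:
                return False
            seen.add(index)
    return True
```

Since `K <= (N+1)/2` implies `2*K <= N+1`, the mirrored subscript `N+2-K` is always at least `K+1`, so the check should succeed for every `N` whether or not `N+1` is even.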

          DO 30 K=1,NZTOP
            DO 20 J=1,NY
              DO 10 I=1,NX
                L=L+1
                DCDX(L)=-(UX(L)+UM(K))*DCDX(L)-(VY(L)+VM(K))*DCDY(L)+Q(L)
    10        CONTINUE
    20      CONTINUE
    30    CONTINUE

The auxiliary induction variable `L` is replaced with a function of
the loop indices, and the resulting accesses to the array `DCDX`
become `DCDX(NY * NX * (K-1) + NX * (J-1) + I - 1)`. Because
of the multiple symbolic coefficients `NY` and `NX`, the
dependence library is not able to analyze the expression.
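To make the substitution concrete, the Python sketch below compares the running counter with the closed-form expression for small loop bounds. It rests on two assumptions that the text does not state: that `L` is zero on entry to the loop nest, and that the trailing `- 1` in the subscript is a zero-based offset for element `L`. The function name is hypothetical.

```python
def counter_matches_closed_form(nx, ny, nztop):
    """Compare the incremented counter L with the closed-form subscript."""
    l = 0  # assumption: L = 0 before the DO 30 loop
    for k in range(1, nztop + 1):
        for j in range(1, ny + 1):
            for i in range(1, nx + 1):
                l += 1
                closed_form = ny * nx * (k - 1) + nx * (j - 1) + i - 1
                if l - 1 != closed_form:   # zero-based offset of element L
                    return False
    return True
```

Under those assumptions every iteration maps to a distinct offset, so the iterations do not conflict, even though the products of `NY`, `NX` and the loop indices put the subscript outside what a linear dependence test can handle.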

    14    KBOT = KK - 1
          KTOP = KK
          ... code containing IF statements deleted ...
          DO 18 K=KK,KTOP
    18    Q(K) = Q(KBOT)

KAP is able to replace `KBOT` with `KK-1` and can thus determine
that the two accesses to array `Q` are independent.

The `porky -scalarize` pass will only turn array elements into scalars if they are
accessed by constant indices throughout the entire program.

`EXP` (on complex values) gets translated to a procedure
with two arguments in the
          DO 30 I=2,N2P
    30    WORK(1) = MAX(WORK(1), WORK(I))

Elements `WORK(2)` through `WORK(N2P)` are reduced into
`WORK(1)`. Currently, SUIF will not find reductions of arrays into
a subsection of the array.
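The Python sketch below shows one way such a reduction could be split into independent pieces: each chunk of `WORK(2)` through `WORK(N2P)` yields a partial maximum, and the partial results are combined with `WORK(1)` at the end. This is a generic illustration of a max-reduction, not a description of what KAP generates; the function names and the `chunks` parameter are assumptions.

```python
def sequential_max(work):
    """Model of the DO 30 loop: fold WORK(2..N2P) into WORK(1)."""
    work = list(work)
    for i in range(1, len(work)):
        work[0] = max(work[0], work[i])
    return work[0]


def chunked_max(work, chunks=4):
    """Same result from per-chunk partial maxima combined at the end."""
    tail = work[1:]
    size = max(1, -(-len(tail) // chunks))  # ceiling division, at least 1
    # Each partial maximum depends only on its own chunk.
    partials = [max(tail[s:s + size]) for s in range(0, len(tail), size)]
    return max([work[0]] + partials)
```

Because `MAX` is associative and commutative, both functions should return the same value for any input, which is what allows the chunks to be processed in any order.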

`porky -scalarize` pass. However, the reduction
recognition pass cannot handle the indirect reductions through the
temporaries.

          DO 40 I=0,249
            IREG(I)=IREG(I+LVEC)
    40    CONTINUE

This loop is sequential only when `LVEC <= 249`, and
thus the second loop below can run in parallel:

          IF (LVEC .LE. 249) THEN
    C Sequential
            DO 40 I=0,249
              IREG(I)=IREG(I+LVEC)
    40      CONTINUE
          ELSE
    C Parallel
            DO 2 I=0,249
              IREG(I) = IREG(I+LVEC)
    2       CONTINUE
          END IF
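The run-time test can be motivated with a small brute-force check. The Python sketch below, with a hypothetical function name and a `last` parameter standing in for the constant upper bound 249, compares the set of elements written (`IREG(I)`) with the set of elements read (`IREG(I+LVEC)`).

```python
def accesses_overlap(lvec, last=249):
    """Return True if some element of IREG is both read and written."""
    writes = set(range(0, last + 1))                # IREG(I), I = 0..last
    reads = {i + lvec for i in range(0, last + 1)}  # IREG(I+LVEC)
    return bool(writes & reads)
```

For any `LVEC` greater than 249 the two sets are disjoint, so the iterations cannot interfere. The check is conservative: an overlap (for instance with `LVEC = 0`) does not by itself prove that the loop carries a dependence, which matches the one-directional "only when" wording above.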
