Figure 5(B1) shows that the advanced array analyses dramatically increase parallelism coverage on 3 of the 10 programs. This implies that the major loops requiring sophisticated array analyses do not contain any loops that can be parallelized using conventional techniques; had they contained such loops, those loops would already have contributed to coverage. The newly parallelized loops are also rather coarse grained, as can be observed from Figure 5(C1). Overall, the compiler achieves good results parallelizing SPEC92FP: coverage is above 80% for 8 of the 10 programs, and a speedup is achieved on all 8.
The results also show that coverage is necessary but not sufficient for high speedups. Programs with fine-grained parallelism, even those with high coverage such as su2cor, tomcatv and nasa7, tend to have lower speedups. Data locality is another important factor that affects speedups: two of these programs, tomcatv and nasa7, have poor memory behavior. The performance of these programs can be improved significantly via data and loop transformations that improve cache locality and via techniques that minimize synchronization.
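The claim that coverage is necessary but not sufficient can be illustrated with a small sketch based on Amdahl's law. It assumes that coverage means the fraction of sequential execution time spent in parallelized loops, which is an interpretation rather than a definition given in this section. The function names and the `overhead` parameter are hypothetical and introduced only for illustration; the numbers are not measurements from the paper.

```python
def amdahl_bound(coverage: float, processors: int) -> float:
    """Upper bound on speedup when `coverage` (a fraction in [0, 1]) of the
    sequential execution time runs perfectly in parallel on `processors`."""
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return 1.0 / ((1.0 - coverage) + coverage / processors)


def speedup_with_overhead(coverage: float, processors: int, overhead: float) -> float:
    """Same model with an extra cost term. `overhead` is a hypothetical
    parameter: time lost to synchronization and cache misses, expressed as
    a fraction of the sequential execution time."""
    if overhead < 0.0:
        raise ValueError("overhead must be non-negative")
    bound = amdahl_bound(coverage, processors)
    # Add the overhead to the parallel execution time (1 / bound).
    return 1.0 / (1.0 / bound + overhead)


if __name__ == "__main__":
    for cov in (0.5, 0.8, 0.99):
        print(cov, round(amdahl_bound(cov, 4), 2))
    # High coverage, but fine-grained loops pay a synchronization cost.
    print(round(speedup_with_overhead(0.99, 4, 0.25), 2))
```

Under this model, a program with 80% coverage cannot exceed a speedup of 2.5 on 4 processors, or 5 on any number of processors, which is why high coverage is necessary. The overhead term shows why it is not sufficient: frequent synchronization in fine-grained loops or poor cache behavior adds time that coverage does not account for.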