While parallel speedups measure the overall effectiveness of a parallel system, they are also highly machine-dependent. Not only do speedups depend on the number of processors, but they are also sensitive to many aspects of the architecture, such as the cost of synchronization, the interconnect bandwidth, and the memory subsystem. Furthermore, speedups measure the effectiveness of the entire compiler system, not just the parallelization analysis, which is the focus of this paper. For example, techniques that improve data locality and minimize synchronization can greatly improve the speedups obtained. Thus, to capture more precisely how well the parallelization analysis performs, we use the following two metrics:
For the sake of completeness, we also present a set of speedup measurements. The programs in the benchmark suite have relatively short execution times and fine granularities of parallelism, as shown in Figure 5(C), so most of them cannot effectively utilize a large number of processors. For our experiment, we run all the programs on a 4-processor 200 MHz SGI Challenge. Speedup is calculated as the ratio of the execution time of the original sequential program to the parallel execution time. The results are shown in Figure 5(D).
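For concreteness, the speedup metric just described can be stated as follows (the symbols $T_{\text{seq}}$ and $T_{\text{par}}(p)$ are notation introduced here for illustration, not taken from the original):
\[
\text{speedup}(p) \;=\; \frac{T_{\text{seq}}}{T_{\text{par}}(p)},
\]
where $T_{\text{seq}}$ is the wall-clock execution time of the original sequential program and $T_{\text{par}}(p)$ is the wall-clock execution time of the parallelized program on $p$ processors (here $p = 4$).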