Data and Computation Transformations for Multiprocessors
Jennifer M. Anderson, Saman P. Amarasinghe and Monica S. Lam
Computer Systems Laboratory
Stanford University, CA 94305
This research was supported in part by ARPA contracts DABT63-91-K-0003
and DABT63-94-C-0054, an NSF Young Investigator Award and fellowships
from Digital Equipment Corporation's Western Research Laboratory and
In Proceedings of the Fifth ACM SIGPLAN Symposium on Principles and
Practice of Parallel Programming (PPoPP '95)
Santa Barbara, CA, July 19--21, 1995
Effective memory hierarchy utilization is critical to the performance
of modern multiprocessor architectures. We have developed the first
compiler system that fully automatically parallelizes sequential
programs and changes the original array layouts to improve memory
system performance. Our optimization algorithm consists of two steps.
The first step chooses the parallelization and computation assignment
such that synchronization and data sharing are minimized. The second
step then restructures the layout of the data in the shared address
space with an algorithm that is based on a new data transformation
framework.
We ran our compiler on a set of application programs and measured their
performance on the Stanford DASH multiprocessor.
Our results show that the compiler can effectively optimize parallelism
in conjunction with memory subsystem performance.
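
As an illustration of the second step, the following is a minimal sketch
of one kind of layout change that makes each processor's data contiguous
in the shared address space. It is not the compiler's algorithm. It
assumes an n x n column-major array whose rows are distributed in equal
blocks across the processors, and it strip-mines the row index and moves
the resulting block number to the outermost position. The function names
and the sizes (n = 8, four processors) are hypothetical and chosen only
for the example.

```python
def original_address(i, j, n):
    """Column-major offset of element (i, j) in an n x n array."""
    return j * n + i


def transformed_address(i, j, n, block):
    """Offset after strip-mining the row index and permuting dimensions.

    The row index i is split into (owner, local); the owner becomes the
    slowest-varying dimension, so each owner's elements are adjacent.
    """
    owner, local = divmod(i, block)
    return owner * (block * n) + j * block + local


def owned_addresses(proc, n, block, address):
    """Sorted offsets of every element in the block of rows owned by proc."""
    rows = range(proc * block, (proc + 1) * block)
    return sorted(address(i, j) for i in rows for j in range(n))


def is_contiguous(addresses):
    """True if the sorted offsets form one unbroken range."""
    return all(b - a == 1 for a, b in zip(addresses, addresses[1:]))


# Assumed example sizes: 8 x 8 array, 4 processors, 2 rows per processor.
n, procs = 8, 4
block = n // procs

before = owned_addresses(1, n, block,
                         lambda i, j: original_address(i, j, n))
after = owned_addresses(1, n, block,
                        lambda i, j: transformed_address(i, j, n, block))

print(is_contiguous(before))  # processor 1's data in the original layout
print(is_contiguous(after))   # processor 1's data in the transformed layout
```

In the original layout the rows owned by one processor are interleaved
with rows owned by the others in every column, so its data is scattered
across the address space; after the transformation the same elements
occupy a single contiguous range.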