References

1
J. R. Allen and K. Kennedy. Automatic translation of Fortran programs to vector form. ACM Transactions on Programming Languages and Systems, 9(4):491--542, October 1987.
2
S. P. Amarasinghe and M. S. Lam. Communication optimization and code generation for distributed memory machines. In Proceedings of the SIGPLAN '93 Conference on Programming Language Design and Implementation, June 1993.
3
C. Ancourt and F. Irigoin. Scanning polyhedra with DO loops. In Proceedings of the Third ACM/SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 39--50, April 1991.
4
V. Balasundaram, G. Fox, K. Kennedy, and U. Kremer. A static performance estimator to guide data partitioning decisions. In Proceedings of the Third ACM/SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 213--222, April 1991.
5
D. Callahan. A Global Approach to Detection of Parallelism. PhD thesis, Rice University, April 1987. Published as COMP-TR-87-50.
6
A. Carle, K. Kennedy, U. Kremer, and J. Mellor-Crummey. Automatic data layout for distributed-memory machines in the D programming environment. Technical Report CRPC-TR93-298, Rice University, February 1993.
7
B. Chapman, P. Mehrotra, and H. Zima. Programming in Vienna Fortran. Scientific Programming, 1(1):31--50, Fall 1992.
8
S. Chatterjee, J. R. Gilbert, R. Schreiber, and S.-H. Teng. Automatic array alignment in data-parallel programs. In Proceedings, 20th Annual ACM Symposium on Principles of Programming Languages, pages 16--28, January 1993.
9
E. Dahlhaus, D. S. Johnson, C. H. Papadimitriou, P. D. Seymour, and M. Yannakakis. The complexity of multiway cuts. In Proceedings of the 24th ACM Symposium on the Theory of Computing, pages 241--251, May 1992.
10
G. R. Gao, R. Olsen, V. Sarkar, and R. Thekkath. Collective loop fusion for array contraction. In Proceedings of the Fifth Workshop on Programming Languages and Compilers for Parallel Computing, pages 171--181, August 1992.
11
J. R. Gilbert and R. Schreiber. Optimal expression evaluation for data parallel architectures. Journal of Parallel and Distributed Computing, 13(1):58--64, September 1991.
12
M. Gupta and P. Banerjee. Demonstration of automatic data partitioning techniques for parallelizing compilers on multicomputers. IEEE Transactions on Parallel and Distributed Systems, 3(2):179--193, March 1992.
13
High Performance Fortran Forum. High Performance Fortran Language Specification, November 1992. Version 0.4.
14
S. Hiranandani, K. Kennedy, and C.-W. Tseng. Compiling Fortran D for MIMD distributed-memory machines. Communications of the ACM, 35(8):66--80, August 1992.
15
C. H. Huang and P. Sadayappan. Communication-free hyperplane partitioning of nested loops. In U. Banerjee, D. Gelernter, A. Nicolau, and D. Padua, editors, Languages and Compilers for Parallel Computing, pages 186--200. Springer-Verlag, Berlin, Germany, 1992.
16
Y.-T. Hwang and Y. H. Hu. On systolic mapping of multi-stage algorithms. In Proceedings of the IEEE International Conference on Application Specific Array Processors, pages 47--61, August 1992.
17
Intel Corporation, Santa Clara, CA. iPSC/2 and iPSC/860 User's Guide, June 1990.
18
F. Irigoin and R. Triolet. Supernode partitioning. In Proceedings, 15th Annual ACM Symposium on Principles of Programming Languages, pages 319--329, January 1988.
19
Y. Ju and H. Dietz. Reduction of cache coherence overhead by compiler data layout and loop transformation. In U. Banerjee, D. Gelernter, A. Nicolau, and D. Padua, editors, Languages and Compilers for Parallel Computing, pages 344--358. Springer-Verlag, Berlin, Germany, 1992.
20
K. Kennedy and K. S. McKinley. Optimizing for parallelism and data locality. In Proceedings of the 1992 ACM International Conference on Supercomputing, pages 323--334, July 1992.
21
K. Knobe, J. D. Lukas, and G. L. Steele. Data optimization: Allocation of arrays to reduce communication on SIMD machines. Journal of Parallel and Distributed Computing, 8:102--118, 1990.
22
C. Koelbel, P. Mehrotra, and J. Van Rosendale. Supporting shared data structures on distributed memory architectures. In Proceedings of the Second ACM/SIGPLAN Symposium on Principles and Practice of Parallel Programming, pages 177--186, March 1990.
23
D. Kulkarni, K. G. Kumar, A. Basu, and A. Paulraj. Loop partitioning for distributed memory multiprocessors as unimodular transformations. In Proceedings of the 1991 ACM International Conference on Supercomputing, pages 206--215, June 1991.
24
K. G. Kumar, D. Kulkarni, and A. Basu. Deriving good transformations for mapping nested loops on hierarchical parallel machines in polynomial time. In Proceedings of the 1992 ACM International Conference on Supercomputing, pages 82--91, July 1992.
25
M. S. Lam and M. E. Wolf. Compilation techniques to achieve parallelism and locality. In Proceedings of the DARPA Software Technology Conference, pages 150--158, April 1992.
26
D. Lenoski, K. Gharachorloo, J. Laudon, A. Gupta, J. Hennessy, M. Horowitz, and M. Lam. The Stanford DASH Multiprocessor. IEEE Computer, 25(3):63--79, March 1992.
27
J. Li and M. Chen. Generating explicit communication from shared-memory program references. In Supercomputing 1990, pages 865--876. IEEE, May 1990.
28
J. Li and M. Chen. Index domain alignment: Minimizing cost of cross-referencing between distributed arrays. In Proceedings of Frontiers '90: The Third Symposium on the Frontiers of Massively Parallel Computation, pages 424--432. IEEE, October 1990.
29
D. E. Maydan. Accurate Analysis of Array References. PhD thesis, Stanford University, September 1992. Published as CSL-TR-92-547.
30
J. F. Prins. A framework for efficient execution of array-based languages on SIMD computers. In Proceedings of Frontiers '90: The Third Symposium on the Frontiers of Massively Parallel Computation, pages 462--470. IEEE, October 1990.
31
A. Rogers and K. Pingali. Compiling for locality. In Proceedings of the 1990 International Conference on Parallel Processing, pages 142--146, June 1990.
32
V. Sarkar and G. R. Gao. Optimization of array accesses by collective loop transformations. In Proceedings of the 1991 ACM International Conference on Supercomputing, pages 194--204, June 1991.
33
C.-W. Tseng. An Optimizing Fortran D Compiler for MIMD Distributed-Memory Machines. PhD thesis, Rice University, January 1993. Published as Rice COMP TR93-199.
34
P.-S. Tseng. A Parallelizing Compiler for Distributed Memory Parallel Computers. PhD thesis, Carnegie Mellon University, May 1989. Published as CMU-CS-89-148.
35
S. Wholey. Automatic Data Mapping for Distributed-Memory Parallel Computers. PhD thesis, Carnegie Mellon University, May 1991. Published as CMU-CS-91-121.
36
M. E. Wolf. Improving Locality and Parallelism in Nested Loops. PhD thesis, Stanford University, August 1992. Published as CSL-TR-92-538.
37
M. E. Wolf and M. S. Lam. A data locality optimizing algorithm. In Proceedings of the SIGPLAN '91 Conference on Programming Language Design and Implementation, pages 30--44, June 1991.
38
M. E. Wolf and M. S. Lam. A loop transformation theory and an algorithm to maximize parallelism. IEEE Transactions on Parallel and Distributed Systems, 2(4):452--470, October 1991.
39
M. J. Wolfe. Optimizing Supercompilers for Supercomputers. MIT Press, Cambridge, MA, 1989.
40
H. P. Zima, H.-J. Bast, and M. Gerndt. SUPERB: A tool for semi-automatic MIMD / SIMD parallelization. Parallel Computing, 6(1):1--18, January 1988.



Jennifer-Ann M. Anderson
Fri Apr 7 14:39:58 PDT 1995