ISBN: 3540428623 (print)
The proceedings contain 28 papers. The special focus in this conference is on languages and compilers for parallel computing. The topics include: Accurate shape analysis for recursive data structures; cost hierarchies for abstract parallel machines; recursion unrolling for divide and conquer programs; an empirical study of selective optimization; an application centric approach to high performance computing; extending scalar optimizations for arrays; searching for the best FFT formulas with the SPL compiler; on materializations of array-valued temporaries; experimental evaluation of energy behavior of iteration space tiling; improving offset assignment for embedded processors; improving locality for adaptive irregular scientific codes; automatic coarse grain task parallel processing on SMP using OpenMP; compiler synthesis of task graphs for parallel program performance prediction; optimizing the use of high performance software libraries; compiler techniques for flat neighborhood networks; exploiting ownership sets in HPF; a performance advisor tool for shared-memory parallel programming; a comparative analysis of dependence testing mechanisms; safe approximation of data dependencies in pointer-based structures; OpenMP extensions for thread groups and their run-time support; compiling data intensive applications with spatial coordinates; efficient dynamic local enumeration for HPF; issues of the automatic generation of HPF loop programs; set operations for orthogonal processor groups; compiler-based scheduling of Java mobile agents and a bytecode optimizer to engineer bytecodes for performance.
ISBN: 9783540678588 (print)
The proceedings contain 39 papers. The special focus in this conference is on Java, Low-Level Transformations, Data Distribution and High-Level Transformations. The topics include: High performance numerical computing in Java; comprehensive redundant load elimination for the IA-64 architecture; minimum register instruction scheduling; unroll-based copy elimination for enhanced pipeline scheduling; a linear algebra formulation for optimising replication in data parallel programs; accurate data and context management in message-passing programs; a compiler framework for tiling imperfectly-nested loops; parallel programming with interacting processes; application of the polytope model to functional programs; multilingual debugging support for data-driven and thread-based parallel languages; an analytical comparison of the I-test and the Omega test; a precise fixpoint reaching definition analysis for arrays; demand-driven interprocedural array property analysis; language support for pipelining wavefront computations; a machine-independent abstraction for managing customized data motion; optimization of memory usage requirement for a class of loops implementing multi-dimensional integrals; compile-time based performance prediction; designing the Agassiz compiler for concurrent multithreaded architectures; speculative predication across arbitrary interprocedural control flow; a geometric semantics for program representation in the polytope model; compiler and run-time support for improving locality in scientific codes; code restructuring for improving real-time response through code speed and size trade-offs on limited-memory embedded DSPs; symbolic analysis in the PROMIS compiler; run-time parallelization optimization techniques; an empirical study of function pointers using SPEC benchmarks and a parallel program model for scheduling.
ISBN: 9783319174730 (print); 9783319174723
Partitioned Global Address Space (PGAS) languages are a popular alternative when building applications to run on large-scale parallel machines. Unified Parallel C (UPC) is a well-known PGAS language that is available on most high performance computing systems. Good performance of UPC applications is often an important requirement for a system acquisition. This paper presents the memory management techniques employed by the IBM XL UPC compiler to achieve optimal performance on systems with Remote Direct Memory Access (RDMA). Additionally, we describe a novel technique employed by the UPC run-time for transforming remote memory accesses on the same shared-memory node into local memory accesses, to further improve performance. We evaluate the proposed memory allocation policies for various UPC benchmarks using the IBM® Power® 775 supercomputer [1].
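A minimal, self-contained sketch in plain C of the node-local shortcut described above; this is not the XL UPC run-time itself, and all names here (pgas_ptr, node_of, pgas_read, the per-thread heap array) are hypothetical stand-ins. The point it illustrates is that when the owning thread lives on the same shared-memory node, a nominally remote read can be served by an ordinary load instead of an RDMA transfer.

```c
/* Hypothetical sketch of a PGAS run-time read path.  A real run-time would
 * read through a shared-memory mapping set up at start-up; the per-thread
 * heap array below only stands in for that mapping. */
#include <stdio.h>

#define THREADS    4
#define HEAP_WORDS 8

typedef struct { int owner; int offset; } pgas_ptr;   /* pointer-to-shared */

static int    node_of[THREADS] = { 0, 0, 1, 1 };      /* thread -> node    */
static double heap[THREADS][HEAP_WORDS];              /* per-thread heaps  */
static int    my_thread = 0;                          /* "we" are thread 0 */

static void rdma_get(double *dst, pgas_ptr src)
{
    /* Stand-in for a real RDMA transfer between nodes. */
    printf("RDMA get from thread %d\n", src.owner);
    *dst = heap[src.owner][src.offset];
}

static double pgas_read(pgas_ptr src)
{
    double v;
    if (node_of[src.owner] == node_of[my_thread]) {
        /* Owner is co-located: turn the remote access into a local load. */
        v = heap[src.owner][src.offset];
    } else {
        rdma_get(&v, src);
    }
    return v;
}

int main(void)
{
    heap[1][3] = 42.0;                  /* owned by thread 1, same node    */
    heap[2][5] = 7.0;                   /* owned by thread 2, another node */
    pgas_ptr p_near = { 1, 3 }, p_far = { 2, 5 };
    printf("near = %g (served by a plain load)\n", pgas_read(p_near));
    printf("far  = %g\n", pgas_read(p_far));
    return 0;
}
```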
Writing correct and efficient programs for parallel computers remains a challenging task, even after some decades of research in this area. One way to generate parallel programs is to write sequential programs and let...
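As a small, hedged illustration of that sequential-first approach (this code is ours, not from the paper), the loop below has no loop-carried dependence other than a reduction, which is exactly the kind of sequential code a parallelizing compiler, or a programmer adding a single OpenMP directive, can turn into a parallel program.

```c
/* Illustrative only: a sequential dot product whose iterations are
 * independent apart from the accumulation into dot, so it parallelizes
 * with one directive (or automatically, by a parallelizing compiler). */
#include <stdio.h>

#define N 1000000

int main(void)
{
    static double x[N], y[N];
    for (int i = 0; i < N; i++) { x[i] = i; y[i] = 2.0 * i; }

    double dot = 0.0;
    #pragma omp parallel for reduction(+:dot)
    for (int i = 0; i < N; i++)
        dot += x[i] * y[i];

    printf("dot = %g\n", dot);
    return 0;
}
```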
We consider a generalization of the SPMD programming model for distributed memory machines based on orthogonal processor groups. In this model different partitions of the processors into disjoint processor groups exis...
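A short sketch of the orthogonal-groups idea in C with MPI; it is illustrative only and not the programming model or library from the paper. The same set of processes, viewed as an R x C grid, is partitioned twice: once into row groups and once into column groups, giving two orthogonal partitions into disjoint processor groups.

```c
/* Two orthogonal partitions of MPI_COMM_WORLD via MPI_Comm_split. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int cols = 2;              /* illustrative grid width                 */
    int row  = rank / cols;    /* row group this rank joins               */
    int col  = rank % cols;    /* column group this rank joins            */

    MPI_Comm row_comm, col_comm;
    MPI_Comm_split(MPI_COMM_WORLD, row, rank, &row_comm);  /* row groups    */
    MPI_Comm_split(MPI_COMM_WORLD, col, rank, &col_comm);  /* column groups */

    int row_rank, col_rank;
    MPI_Comm_rank(row_comm, &row_rank);
    MPI_Comm_rank(col_comm, &col_rank);
    printf("rank %d of %d: row group %d (rank %d), column group %d (rank %d)\n",
           rank, size, row, row_rank, col, col_rank);

    MPI_Comm_free(&row_comm);
    MPI_Comm_free(&col_comm);
    MPI_Finalize();
    return 0;
}
```

Running this with, for example, mpirun -np 6 shows each process reporting its rank within both of the disjoint groups it belongs to.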
Optimizing a parallel program is often difficult. This is true, in particular, for inexperienced programmers who lack the knowledge and intuition of advanced parallel programmers. We have developed a framework that ad...
ISBN: 9783540693291 (print)
We have designed and implemented an interprocedural algorithm to analyze the symbolic value ranges that variables can assume at any given point in a program. Our algorithm contrasts with related work on interprocedural value range analysis in that it extends the ability to handle symbolic range expressions. It builds on our previous work on intraprocedural symbolic range analysis. We have evaluated our algorithm using 11 Perfect Benchmarks and 10 SPEC floating-point benchmarks from the CPU95 and CPU2000 suites. We have measured its ability to perform test elision and dead code elimination and to detect data dependences. We have also evaluated the algorithm's ability to help detect zero-trip loops for induction variable substitution and subscript ranges for array reductions.
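An illustrative C fragment (ours, not taken from the paper) showing the kind of opportunity such an analysis exposes: if every call site passes n >= 1, an interprocedural symbolic range analysis can prove the guard below is dead code, show that the loop is never zero-trip, and bound the subscripts used on a[].

```c
/* The guard and the loop trip count both depend on n; knowing the range of
 * n across all call sites is what enables test elision, dead code
 * elimination and tighter dependence testing. */
#include <stdio.h>

static double sum_prefix(const double *a, int n)
{
    if (n <= 0)                  /* provably false if all callers pass n >= 1 */
        return 0.0;
    double s = 0.0;
    for (int i = 0; i < n; i++)  /* never zero-trip under the same fact       */
        s += a[i];               /* subscript range: 0 .. n-1                 */
    return s;
}

int main(void)
{
    double a[4] = { 1.0, 2.0, 3.0, 4.0 };
    /* Both call sites pass a constant in [1, 4], so an interprocedural
     * analysis derives the symbolic range 1 <= n <= 4 inside sum_prefix. */
    printf("%g\n", sum_prefix(a, 4));
    printf("%g\n", sum_prefix(a, 2));
    return 0;
}
```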