The proceedings contain 47 papers. Topics discussed include memory performance and architecture, communication networks and routing, allocation and load balancing, algorithms and techniques, synchronization, communication and prefetching, tools, environment and techniques, simulation, tools and techniques, parallel systems and algorithms.
ISBN: 081867038X (print)
The proceedings contain 42 papers. The topics discussed include: improvement of duplication scheduling heuristic algorithm with nonstrict triggering of program graph nodes; cohesion: an efficient distributed shared memory system supporting multiple memory consistency models; supercompilers for massively parallel architectures; investigation of some hardware accelerators for relational algebra operations; implementing higher-order gamma on MasPar: a case study; a framework for visual parallel programming; parallelizing a PDE solver: experiences with PISCES-MP; efficient scalable mesh algorithms for merging, sorting and selection; and constructing parallel implementations with algebraic programming tools.
ISBN: 0818678704 (print)
A parallel distributed scheme is presented for coding random patterns generated via iteration of contraction mappings. The scheme is implemented by a decentralized computational process and a distributed parameter system. The distributed parameter system generates a representation of the missing probability for the attractor as the basis for designing the contraction mappings. The local minima of the missing probability are extracted and aggregated as the decentralized representation of the contraction mappings. The coding scheme is verified through computer simulation.
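As a rough illustration of the forward process only (generating a pattern by randomly iterating contraction mappings), the following Python sketch may help. The three affine maps, the function names, and the seed are assumptions chosen for the example; the abstract's coding scheme based on missing probability and a distributed parameter system is not reproduced here.

```python
import random

# Hypothetical example maps (not from the paper): three contractions
# w_i(p) = (p + c_i) / 2, each with contraction ratio 1/2.
CORNERS = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]


def apply_map(index, point):
    """Apply the contraction w_index to a 2-D point."""
    cx, cy = CORNERS[index]
    x, y = point
    return ((x + cx) / 2.0, (y + cy) / 2.0)


def iterate(start, choices):
    """Apply the maps selected by `choices` in order and return the orbit."""
    point = start
    orbit = []
    for index in choices:
        point = apply_map(index, point)
        orbit.append(point)
    return orbit


def random_choices(count, seed=0):
    """Draw a reproducible random sequence of map indices."""
    rng = random.Random(seed)
    return [rng.randrange(len(CORNERS)) for _ in range(count)]
```

Because every map halves distances, two orbits driven by the same sequence of choices converge toward each other regardless of their starting points, which is why the generated pattern depends on the maps rather than on the initial point.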
ISBN: 0818678704 (print)
A new parallel algorithm for Householder bidiagonalization on parallel computers with dynamic ring architecture is presented. Householder bidiagonalization is the core of the singular value decomposition (SVD), which has been found to be very useful as an analytical tool in the presence of roundoff error and inexact data. A two-sided Householder reduction/expansion technique is applied for bidiagonalization. Innovative systolic-like communication techniques are proposed that eliminate the need to compute the transpose of the matrix explicitly. The experimental study on the CM-5 shows that the parallel algorithm developed in this paper achieves high speedup for large matrices.
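For readers unfamiliar with the underlying computation, here is a minimal sequential sketch of two-sided Householder bidiagonalization in plain Python. It follows the standard textbook approach of alternating left and right reflections and is not the paper's parallel ring algorithm or its systolic-like communication scheme; the function names are assumptions made for the example.

```python
import math


def householder_vector(x):
    """Return a unit vector v such that (I - 2 v v^T) x is a multiple of e_1.

    Returns None when x is the zero vector (no reflection needed).
    """
    norm_x = math.sqrt(sum(t * t for t in x))
    if norm_x == 0.0:
        return None
    v = list(x)
    # Add sign(x[0]) * ||x|| to the first entry to avoid cancellation.
    v[0] += math.copysign(norm_x, x[0])
    norm_v = math.sqrt(sum(t * t for t in v))
    return [t / norm_v for t in v]


def bidiagonalize(a):
    """Reduce an m x n matrix (m >= n, list of rows) to upper bidiagonal form.

    Returns a new matrix B = U^T A V; U and V are not accumulated.
    """
    b = [row[:] for row in a]
    m, n = len(b), len(b[0])
    for k in range(n):
        # Left reflection: zero out column k below the diagonal.
        v = householder_vector([b[i][k] for i in range(k, m)])
        if v is not None:
            for j in range(k, n):
                dot = sum(v[i - k] * b[i][j] for i in range(k, m))
                for i in range(k, m):
                    b[i][j] -= 2.0 * v[i - k] * dot
        # Right reflection: zero out row k beyond the superdiagonal.
        if k < n - 2:
            v = householder_vector(b[k][k + 1:])
            if v is not None:
                for i in range(k, m):
                    dot = sum(v[j - k - 1] * b[i][j] for j in range(k + 1, n))
                    for j in range(k + 1, n):
                        b[i][j] -= 2.0 * v[j - k - 1] * dot
    return b
```

Since only orthogonal reflections are applied, the result has the same singular values as the input, which is what makes the bidiagonal form a useful starting point for an SVD.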
ISBN: 0818678704 (print)
The problem of global synchronization in massively parallel systems is discussed at the level of models represented by asynchronous cellular automata arrays. Synchronization is called global if a given asynchronous automata array functions in logical time so that its behavior can be homomorphically mapped to the behavior of the prototype synchronous system in physical time. Our approach decomposes the asynchronous array into a synchro-stratum, which acts as a distributed asynchronous clock, and an automata stratum, whose automata have a construction similar to that of the synchronous prototype array automata. For various disciplines of prototype synchronization, the corresponding variants of synchro-stratum implementation for the asynchronous analogue are discussed.
ISBN: 0818678704 (print)
The mpC language is an ANSI C superset supporting modular parallel programming for distributed memory machines. It allows the user to specify an application topology dynamically, and the mpC programming environment uses this information at run time to provide the most efficient execution of the program on any particular distributed memory machine. The paper describes the features of mpC and its programming environment that allow them to be used for developing libraries of parallel programs.
ISBN: 0818678704 (print)
As the resolution of simulation models increases, scientific visualization algorithms that take advantage of the large memory and parallelism of Massively Parallel Processors (MPPs) are becoming increasingly important. For large applications, rendering on the MPP tends to be preferable to rendering on a graphics workstation due to the MPP's abundant resources: memory, disk, and numerous processors. The challenge becomes developing algorithms that can exploit these resources while minimizing overhead, typically communication costs. This paper will describe recent efforts in parallel rendering for polygonal primitives as well as parallel volumetric techniques. This paper presents rendering algorithms, developed for MPPs, for polygonal, sphere, and volumetric data. The polygon algorithm uses a data parallel approach, whereas the sphere and volume renderers use a MIMD approach. Implementations of these algorithms are presented for the Thinking Machines Corporation CM-5 and the Cray Research Inc. T3D.
ISBN: 0818678704 (print)
Superscalar and VLIW architectures are based on instruction-level parallelism (ILP), which ideally achieves high performance by executing multiple instructions in parallel. However, system performance is restricted by the von Neumann bottleneck. Therefore, the memory hierarchy design is very important in this kind of architecture. We have proposed a computer architecture named Jetpipeline, which can execute both vector and scalar instructions in parallel. To make full use of the computing ability of Jetpipeline, this paper presents the memory hierarchy design for Jetpipeline and evaluates the effect of the design on its system performance through simulations.