The proceedings contain 47 papers. Topics discussed include memory performance and architecture, communication networks and routing, allocation and load balancing, algorithms and techniques, synchronization, communication and prefetching, tools, environment and techniques, simulation, tools and techniques, parallel systems and algorithms.
ISBN: 0818678704 (print)
A parallel distributed scheme is presented for coding random patterns generated via iteration of contraction mappings. The scheme is implemented by a decentralized computational process and a distributed parameter system. The distributed parameter system generates a representation of the missing probability for the attractor as the basis for the design of the contraction mappings. The local minima of the missing probability are extracted and aggregated as the decentralized representation of the contraction mappings. The coding scheme is verified through computer simulation.
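The phrase "random patterns generated via iteration of contraction mappings" can be illustrated with a minimal random-iteration sketch. It does not reproduce the coding scheme of the abstract: the three affine maps, their vertices, the starting point, and the function names are all assumptions chosen for illustration.

```python
import math
import random

# Three affine contractions w_i(p) = (p + v_i) / 2, each with ratio 1/2.
# The vertices are illustrative choices, not taken from the paper.
VERTICES = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]


def apply_map(index, point):
    """Apply contraction number `index` to a 2-D point."""
    vx, vy = VERTICES[index]
    x, y = point
    return ((x + vx) / 2.0, (y + vy) / 2.0)


def random_iteration(n_points, seed=0, burn_in=20):
    """Generate a point pattern by applying randomly chosen contractions."""
    rng = random.Random(seed)
    point = (0.3, 0.3)  # arbitrary start; its influence decays geometrically
    pattern = []
    for step in range(n_points + burn_in):
        point = apply_map(rng.randrange(len(VERTICES)), point)
        if step >= burn_in:  # discard the first few points near the start
            pattern.append(point)
    return pattern


def contraction_ratio(index, p, q):
    """Distance between the images of p and q divided by their distance."""
    return math.dist(apply_map(index, p), apply_map(index, q)) / math.dist(p, q)
```

Because every map halves distances, the generated points settle onto the attractor of the map family regardless of the starting point, which is the property an attractor-based code relies on.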
A new parallel algorithm for Householder bidiagonalization on parallel computers with a dynamic ring architecture is presented. Householder bidiagonalization is the core of the singular value decomposition (SVD), which has been found to be very useful as an analytical tool in the presence of roundoff error and inexact data. A two-sided Householder reduction/expansion technique is applied for bidiagonalization. Innovative systolic-like communication techniques are proposed which eliminate the need to compute the transpose of the matrix explicitly. An experimental study on the CM-5 shows that the parallel algorithm developed in this paper achieves high speedup for large matrices.
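For readers unfamiliar with the underlying reduction, here is a minimal serial sketch of two-sided Householder bidiagonalization in plain Python. It shows only the textbook alternation of left and right reflections; the ring communication scheme and transpose-free technique of the paper are not reproduced, and the function names are hypothetical. The sketch assumes a dense matrix with at least as many rows as columns.

```python
import math


def householder_vector(x):
    """Unit vector v such that (I - 2 v v^T) x is a multiple of e_1.

    Returns None when x is the zero vector (nothing to reflect).
    """
    norm_x = math.sqrt(sum(t * t for t in x))
    if norm_x == 0.0:
        return None
    v = list(x)
    v[0] += math.copysign(norm_x, x[0])  # sign choice avoids cancellation
    norm_v = math.sqrt(sum(t * t for t in v))
    return [t / norm_v for t in v]


def bidiagonalize(a):
    """Return an upper-bidiagonal matrix with the singular values of a."""
    b = [row[:] for row in a]
    m, n = len(b), len(b[0])
    for k in range(n):
        # Left reflection: zero column k below the diagonal.
        v = householder_vector([b[i][k] for i in range(k, m)])
        if v is not None:
            for j in range(k, n):
                dot = sum(v[i - k] * b[i][j] for i in range(k, m))
                for i in range(k, m):
                    b[i][j] -= 2.0 * dot * v[i - k]
        # Right reflection: zero row k to the right of the superdiagonal.
        if k < n - 2:
            v = householder_vector(b[k][k + 1:])
            if v is not None:
                for i in range(k, m):
                    dot = sum(v[j - k - 1] * b[i][j] for j in range(k + 1, n))
                    for j in range(k + 1, n):
                        b[i][j] -= 2.0 * dot * v[j - k - 1]
    return b
```

Entries outside the two diagonals come out as rounding-level residue rather than exact zeros. Since only orthogonal reflections are applied, the Frobenius norm of the matrix is preserved, which gives a quick sanity check.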
The problem of global synchronization in massively parallel systems is discussed at the level of models represented by asynchronous cellular automata arrays. Synchronization is called global if a given asynchronous automata array functions in logical time so that its behavior can be homomorphically mapped to the behavior of the prototype synchronous system in physical time. Our approach is to decompose the asynchronous array into a synchro-stratum, which acts as a distributed asynchronous clock, and an automata stratum, whose automata have a construction similar to that of the synchronous prototype array automata. For various disciplines of prototype synchronization, the corresponding variants of synchro-stratum implementation for the asynchronous analogue are discussed.
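The idea of an asynchronous array that reproduces a synchronous prototype in logical time can be sketched with per-cell logical clocks. This is a simple local-clock scheme in the spirit of the abstract, not the synchro-stratum construction of the paper; the ring topology, the XOR rule, and all names are assumptions.

```python
import random


def rule(left, centre, right):
    """Illustrative local rule (XOR of the two neighbours)."""
    return left ^ right


def synchronous_run(cells, steps):
    """Prototype: every cell updates simultaneously at each step."""
    n = len(cells)
    for _ in range(steps):
        cells = [rule(cells[i - 1], cells[i], cells[(i + 1) % n])
                 for i in range(n)]
    return cells


def asynchronous_run(cells, steps, seed=0):
    """Cells fire in random order, gated by per-cell logical clocks."""
    rng = random.Random(seed)
    n = len(cells)
    clock = [0] * n          # logical time of each cell
    current = list(cells)    # state at the cell's own logical time
    previous = list(cells)   # state one logical step earlier
    while min(clock) < steps:
        i = rng.randrange(n)
        left, right = (i - 1) % n, (i + 1) % n
        # A cell may step only if it is unfinished and no neighbour lags it.
        if clock[i] >= steps or clock[left] < clock[i] or clock[right] < clock[i]:
            continue
        # A neighbour that is one step ahead still exposes its older state.
        seen_left = current[left] if clock[left] == clock[i] else previous[left]
        seen_right = current[right] if clock[right] == clock[i] else previous[right]
        new_state = rule(seen_left, current[i], seen_right)
        previous[i], current[i] = current[i], new_state
        clock[i] += 1
    return current
```

The gating condition keeps neighbouring clocks within one step of each other, so keeping a single previous state per cell is enough for the asynchronous run to reach the same final configuration as the synchronous one.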
The mpC language is an ANSI C superset supporting modular parallel programming for distributed memory machines. It allows the user to specify an application topology dynamically, and the mpC programming environment uses this information at run time to provide the most efficient execution of the program on any particular distributed memory machine. The paper describes the features of mpC and its programming environment that make them suitable for developing libraries of parallel programs.
Superscalar and VLIW architectures are based on instruction-level parallelism (ILP), which ideally achieves high performance by executing multiple instructions in parallel. However, system performance is restricted by the von Neumann bottleneck, so the memory hierarchy design is very important in this kind of architecture. We have proposed a computer architecture named Jetpipeline, which can execute both vector and scalar instructions in parallel. To make full use of the computing ability of Jetpipeline, this paper presents the memory hierarchy design for Jetpipeline and evaluates the effect of the design on the system performance of Jetpipeline through simulations.
This paper presents an outline of our strategy for program synthesis within the VIM film technology, where special-purpose animation films are used as a new type of abstraction. To specify an algorithm/method, the user develops his/her own film. New films are created through cutting, editing and a few other click operations, as well as by combining and merging component films. These operations predefine transformation rules to be performed with templates related to the system films. These templates are hand-made programs or files of a few hierarchical levels, taking into account various kinds of programming know-how and techniques for the efficient implementation of computation on a target computer system. The program synthesis is performed by sequential transformations of the above-mentioned programs and files.
This paper presents a new approach to the simulation of parallel machines, based on the discrete-event system specification (DEVS) formalism. Our simulation approach is directed at simulating parallel machines at the level of concurrent threads and is applicable to analysing the influence of internal algorithm/application concurrency on the performance characteristics of parallel machines. It uses an abstraction of the parallel program's concurrent threads as the model's environment. The description of the modeled parallel machine is based on template models of threads. We consider a program environment for discrete-event simulation of parallel computers based on our simulation approach. We also present some performance/utilization results from the simulation of a parallel database machine class, performed with our simulation environment.
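To make the DEVS vocabulary concrete, the following minimal sketch shows one atomic model with the usual four elements (time advance, output, internal transition, external transition) driven by a small event loop. It is not the simulation environment of the paper: the `Processor` model, its fixed service time, and the `simulate` driver are assumptions, and ties between internal and external events are resolved in favour of the internal event as a simplification.

```python
import math


class Processor:
    """Atomic DEVS-style model that serves queued jobs one at a time."""

    def __init__(self, service_time):
        self.service_time = service_time
        self.queue = []
        self.remaining = math.inf  # time left until the next internal event

    def time_advance(self):
        return self.remaining

    def output(self):
        """Output emitted just before an internal transition."""
        return self.queue[0]

    def internal_transition(self):
        """The job in service finishes; start the next one if any."""
        self.queue.pop(0)
        self.remaining = self.service_time if self.queue else math.inf

    def external_transition(self, elapsed, job):
        """A job arrives `elapsed` time units after the last event."""
        if self.queue:
            self.remaining -= elapsed  # the job in service keeps running
        else:
            self.remaining = self.service_time
        self.queue.append(job)


def simulate(model, arrivals):
    """Run the model on time-sorted (time, job) arrivals.

    Returns a list of (finish_time, job) pairs.
    """
    finished = []
    last_event = 0.0
    pending = list(arrivals)
    while True:
        next_internal = last_event + model.time_advance()
        next_external = pending[0][0] if pending else math.inf
        if next_internal == math.inf and next_external == math.inf:
            return finished
        if next_internal <= next_external:
            finished.append((next_internal, model.output()))
            model.internal_transition()
            last_event = next_internal
        else:
            time, job = pending.pop(0)
            model.external_transition(time - last_event, job)
            last_event = time
```

For example, `simulate(Processor(2.0), [(0.0, "a"), (1.0, "b"), (5.0, "c")])` returns `[(2.0, "a"), (4.0, "b"), (7.0, "c")]`: job `b` waits for `a`, while `c` arrives at an idle processor.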
The paper presents parallel algorithms for efficient solution of the SVD (Singular Value Decomposition) problem by the block two-sided Jacobi method. It is shown how the method can be applied to MIMD computers with hypercube and ring topologies. Three types of orderings for solving the SVD on block-structured submatrices are analysed from the point of view of communication requirements and suitability for parallel execution of the computational process, which is carried out on block-columns of the matrix. All three orderings fit the hypercube topology well. Two of them can also be implemented directly on rings, where optimality in parallelization of the method and in data transfers has been achieved within each sweep. For the third scheme, an efficient numbering of processor nodes is discussed. Computational results obtained on an Intel Paragon system are shown for a chosen ordering.
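The abstract does not name its three orderings, so the sketch below uses the classic round-robin (tournament) ordering purely to illustrate what a parallel Jacobi ordering provides: a sweep split into steps, each step pairing all block columns into disjoint pairs that can be processed concurrently. The function name and the even-block-count restriction are assumptions.

```python
def round_robin_sweep(n_blocks):
    """Return one sweep as a list of steps, each a list of disjoint pairs.

    Every pair of block indices meets exactly once per sweep.
    Assumes an even number of blocks (two per processor).
    """
    if n_blocks < 2 or n_blocks % 2 != 0:
        raise ValueError("n_blocks must be an even number >= 2")
    ring = list(range(n_blocks))
    steps = []
    for _ in range(n_blocks - 1):
        # Pair the i-th entry from the front with the i-th from the back.
        pairs = [tuple(sorted((ring[i], ring[n_blocks - 1 - i])))
                 for i in range(n_blocks // 2)]
        steps.append(pairs)
        # Keep the first entry fixed and rotate the rest by one position.
        ring = [ring[0], ring[-1]] + ring[1:-1]
    return steps
```

With `n_blocks` block columns, the sweep has `n_blocks - 1` steps of `n_blocks // 2` pairs each. The rotation moves most blocks one position along the list per step, which hints at why orderings of this kind are attractive on ring-like topologies.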
As the resolution of simulation models increases, scientific visualization algorithms which take advantage of the large memory and parallelism of Massively Parallel Processors (MPPs) are becoming increasingly important. For large applications, rendering on the MPP tends to be preferable to rendering on a graphics workstation due to the MPP's abundant resources: memory, disk, and numerous processors. The challenge becomes developing algorithms that can exploit these resources while minimizing overhead, typically communication costs. This paper will describe recent efforts in parallel rendering for polygonal primitives as well as parallel volumetric techniques. This paper presents rendering algorithms, developed for MPPs, for polygons, spheres, and volumetric data. The polygon algorithm uses a data-parallel approach, whereas the sphere and volume renderers use a MIMD approach. Implementations of these algorithms are presented for the Thinking Machines Corporation CM-5 and the Cray Research Inc. T3D.