We develop a new software layer, the Automatic parallel Detection Layer (APDL), for the automatic transformation of sequential code into parallel code. The research focuses on loop-level parallelism, because significant parallelism in programs almost invariably occurs in loops. APDL transforms code in five stages: parsing the sequential source code, analyzing its data dependences, partitioning, scheduling both tasks and data, and generating parallel source code. Several case studies are used to evaluate the layer. For each case study, performance is measured by comparing the execution times of the sequential code, the code written by a parallel programmer, and the code generated by APDL. The results show that APDL greatly improves execution time relative to sequential execution and avoids the high cost of a parallel programmer.
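The dependence-analysis and partitioning stages named in this abstract can be illustrated with a minimal sketch. It is not APDL's implementation, which the abstract does not describe. It assumes array subscripts that are linear in the loop index and uses the classic GCD test, and the function names `may_depend` and `partition` are hypothetical.

```python
from math import gcd


def may_depend(write_coef, write_off, read_coef, read_off):
    """GCD test for a[write_coef*i + write_off] versus a[read_coef*j + read_off].

    Returns False only when the two subscripts can never be equal for any
    integers i and j, so the loop carries no dependence through this pair.
    A True result is conservative: a dependence is possible, not proven.
    """
    g = gcd(write_coef, read_coef)
    if g == 0:
        # Both subscripts are constants.
        return write_off == read_off
    return (read_off - write_off) % g == 0


def partition(n_iters, n_workers):
    """Split range(n_iters) into contiguous chunks of near-equal size."""
    base, extra = divmod(n_iters, n_workers)
    chunks, start = [], 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks
```

For example, a loop that writes `a[2*i]` and reads `a[2*i + 1]` passes the test with no dependence, so its iterations could be handed out in the chunks returned by `partition`.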
ISBN: 0769509878 (print)
Cellular automata are a nature-inspired parallel processing model. They were originally proposed by J. von Neumann to simulate complex dynamical processes. In the past two decades, several models of cellular automata that differ from von Neumann's original have been defined for modeling real-world systems and phenomena. This paper describes the design and implementation of standard and nonstandard parallel cellular automata in the CARPET language. CARPET is a cellular-automata-based language that has been implemented on MIMD parallel computers. The language is specifically designed for programming cellular computations and supports concise and efficient coding of parallel cellular algorithms. The paper analyzes the main features of the language and describes how they can be exploited to implement different cellular automata on parallel computers, from the standard model to its modifications and generalizations. Inhomogeneous, partitioned, asynchronous, and probabilistic cellular automata programmed in CARPET are presented.
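As a rough illustration of the standard (homogeneous, synchronous) model mentioned in this abstract, here is a minimal Python sketch of one update step of a one-dimensional, two-state cellular automaton. It is not CARPET code. The function name `step`, the periodic boundary, and the default rule number are assumptions made for the example.

```python
def step(cells, rule=90):
    """Apply one synchronous update of an elementary cellular automaton.

    cells is a list of 0/1 states on a ring (periodic boundary, assumed).
    rule is the Wolfram rule number: bit k of the rule gives the next
    state for the neighborhood whose (left, center, right) bits encode k.
    """
    n = len(cells)
    nxt = []
    for i in range(n):
        # cells[i - 1] wraps to the last cell when i == 0.
        neighborhood = (cells[i - 1] << 2) | (cells[i] << 1) | cells[(i + 1) % n]
        nxt.append((rule >> neighborhood) & 1)
    return nxt
```

Every new state depends only on the previous generation, so the loop body can run for all cells at once, which is what makes the model a natural fit for parallel machines.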
Authors: B. Krysztop, H. Krawczyk (Faculty of Electronics, Telecommunication and Informatics, Computer Architecture Department, Technical University of Gdańsk, Gdańsk, Poland)
A new approach for developing efficient and flexible component-based distributed applications is proposed. It is based on a new programming platform, TL (Transformation Language), which allows both the abstract sequential code and the parallel processing model of an application to be expressed. To minimize execution cost and maximize flexibility, a Distributed Partial Executor (DPE) tool and an optimization algorithm are introduced. A distributed image processing application is considered as an example, and its optimization in TL is analyzed. The obtained results confirm the usability of the proposed methodology.
Incremental stack-copying is a technique that has been used successfully to support efficient parallel execution of a variety of search-based AI systems, e.g., logic-based and constraint-based systems. The idea of incremental stack-copying is to copy only the difference between the data areas of two agents, instead of copying them entirely, when distributing parallel work. To further reduce communication during stack-copying and make its implementation efficient on message-passing platforms, a new technique called stack-splitting has recently been proposed. In this paper, we describe a scheme to effectively combine stack-splitting with incremental stack-copying to achieve superior parallel performance in a non-shared memory environment. We also describe a scheduling scheme for this incremental stack-splitting strategy. These techniques are currently being implemented in the PALS system, a parallel constraint logic programming system.
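The core idea of copying only the difference between two agents' data areas can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the PALS implementation: each data area is modeled as a flat list, and the shared portion is found by comparing entries, whereas an actual engine would be expected to know it from its own bookkeeping. The names `common_prefix_len` and `incremental_copy` are hypothetical.

```python
def common_prefix_len(a, b):
    """Return the number of leading entries that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def incremental_copy(src_stack, dst_stack):
    """Make dst_stack equal to src_stack by transferring only the delta.

    Entries of dst_stack beyond the shared prefix are discarded, then the
    differing suffix of src_stack is appended. Returns the number of
    entries transferred, which stands in for the communication cost.
    """
    k = common_prefix_len(src_stack, dst_stack)
    del dst_stack[k:]
    delta = src_stack[k:]
    dst_stack.extend(delta)
    return len(delta)
```

With a source stack of `['a', 'b', 'c', 'd']` and an idle agent holding `['a', 'b', 'x']`, only two entries are transferred instead of four.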
ISBN: 0780372239 (print)
Presents a visualization technique based on particle tracking. The technique consists of defining a set of points distributed on a closed surface and following the deformations of the surface as the velocity field changes in time. Deformations of the surface contain information about the dynamics of the flow; in particular, it is possible to identify zones where flow stretching and folding occur. Because the points on the surface are independent of each other, the trajectory of each point can be calculated concurrently. Two parallel algorithms are studied: the first for a shared memory Origin 2000 supercomputer and the second for a distributed memory PC cluster. The technique is applied to a fluid moving by natural convection inside a cubic container.
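Because each surface point is tracked independently, the computation maps directly onto a pool of workers. The sketch below illustrates that structure only. It is not either of the paper's two algorithms. The velocity field (a rigid rotation about the z axis), the forward Euler integrator, the use of threads, and the names `velocity`, `trace`, and `advect_surface` are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def velocity(p, t):
    """Hypothetical velocity field: rigid rotation about the z axis."""
    x, y, z = p
    return (-y, x, 0.0)


def trace(p, t0=0.0, dt=0.01, steps=100):
    """Follow one point through the field with forward Euler steps."""
    t = t0
    for _ in range(steps):
        vx, vy, vz = velocity(p, t)
        p = (p[0] + dt * vx, p[1] + dt * vy, p[2] + dt * vz)
        t += dt
    return p


def advect_surface(points, workers=4):
    """Trace every surface point; no point needs data from any other."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(trace, points))
```

The threads here only stand in for the processors of a shared or distributed memory machine; the point is that `trace` takes a single point and needs no communication.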
ISBN: 0769509878 (print)
Cyclic reduction for the solution of linear equation systems with banded matrices exhibits fine- to medium-grain potential parallelism with regular but diverse data dependencies. We consider the parallel implementation of this algorithm on a distributed shared memory machine with different programming models. As the distributed shared memory machine we use the Convex SPP2000. We compare the runtime results with results from a Cray T3E.
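For orientation, here is a minimal sequential sketch of cyclic reduction for the simplest banded case, a tridiagonal system. It is not the paper's implementation and makes several assumptions: the system is written as `a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i]` with `a[0] == 0` and `c[-1] == 0`, no pivoting is done (so the matrix should be, for example, diagonally dominant), and the function name `cyclic_reduction` is hypothetical.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a tridiagonal system by recursive cyclic reduction.

    a, b, c are the sub-, main and super-diagonals and d the right-hand
    side, all lists of length n, with a[0] == 0 and c[n-1] == 0.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Reduction: eliminate the even-indexed unknowns from each odd equation.
    # Each odd index is processed independently of the others.
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):
        has_next = i + 1 < n
        alpha = a[i] / b[i - 1]
        gamma = c[i] / b[i + 1] if has_next else 0.0
        ra.append(-alpha * a[i - 1])
        rb.append(b[i] - alpha * c[i - 1] - (gamma * a[i + 1] if has_next else 0.0))
        rc.append(-gamma * c[i + 1] if has_next else 0.0)
        rd.append(d[i] - alpha * d[i - 1] - (gamma * d[i + 1] if has_next else 0.0))

    # Solve the half-size system for the odd-indexed unknowns.
    x_odd = cyclic_reduction(ra, rb, rc, rd)

    x = [0.0] * n
    for k, i in enumerate(range(1, n, 2)):
        x[i] = x_odd[k]

    # Back substitution: each even unknown needs only its odd neighbors.
    for i in range(0, n, 2):
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x
```

Within one level, the iterations of the reduction loop and of the back-substitution loop are mutually independent, while successive levels depend on each other with a stride that doubles each time. This is one way to read the abstract's "regular but diverse data dependencies".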
ISBN: 0769509878 (print)
Drago is an experimental Ada extension designed to facilitate the implementation of fault-tolerant and cooperative distributed applications. It is the result of an effort to impose discipline and give linguistic support to the main concepts of the group communication paradigm. In this paper we focus our attention on the Drago linguistic support for the implementation of distributed cooperative applications. We introduce Drago and give some simple examples of its use.
ISBN: 0769509878 (print)
While traditional parallel computing systems are still struggling to gain wider acceptance, the largest parallel computer that has ever been available is currently growing around the Internet as a communication resource. Unfortunately, it too is rarely used in the parallel computation field. The main reason for the rejection of parallel computers is the difficulty of parallel programming. In this paper we propose the Self Distributing Associative ARChitecture (SDAARC). It is derived from the Cache Only Memory Architecture (COMA). COMAs provide a distributed shared memory (DSM) with automatic distribution of data. We show how this paradigm of data distribution can be extended to the automatic distribution of instruction sequences (microthreads). We show how microthreads can be extracted from legacy C code to produce code that can be parallelized automatically by SDAARC at run time. We also discuss how SDAARC can be implemented on tightly coupled multiprocessor systems, on heterogeneous LAN-based computer networks (Intranet), and on WANs of computing resources.