ISBN: 0769509363 (print)
In this paper, we describe the design and implementation of a portable run-time system for GOP, a graph-oriented programming framework that aims to provide high-level abstractions for configuring and programming cooperative parallel processes. The run-time system provides an interface, with a library of programming primitives, to the low-level facilities required to support graph-oriented communication and synchronization. The implementation runs on top of the Parallel Virtual Machine (PVM) in a local area network of Sun workstations. Issues related to the implementation of graph operations in a distributed environment are discussed. The performance of the run-time system is evaluated by estimating the overheads of using GOP primitives as opposed to PVM.
We present a synchronous parallel programming model designed for massively parallel, fine-grained applications such as cellular automata, finite element methods, or partial differential equations. In this model we assume that the number of parallel processes in a program is much larger than the number of processors of the machine on which it runs. We present the computational model and the communication model. We introduce the virtual cellular machine, an abstract machine implementing this programming model. It requires means to simulate the execution of many processes on a single processor efficiently and to use the available communication bandwidth efficiently. Finally, we show an example program written in a prototype language designed for programming the virtual machine.
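The core idea of simulating many synchronous processes on one processor can be sketched as follows. This is only an illustration under stated assumptions: the one-dimensional ring of cells, the `synchronous_step` helper, and the choice of rule 90 are hypothetical and are not taken from the paper or its prototype language.

```python
from typing import Callable, List


def synchronous_step(cells: List[int],
                     rule: Callable[[int, int, int], int]) -> List[int]:
    """Advance every virtual process (cell) by one synchronous step.

    Every cell reads only the old state and the results go into a new
    list, so the outcome does not depend on the order in which the
    single processor visits the cells.
    """
    n = len(cells)
    return [rule(cells[(i - 1) % n], cells[i], cells[(i + 1) % n])
            for i in range(n)]


def rule_90(left: int, centre: int, right: int) -> int:
    # Elementary cellular automaton rule 90: XOR of the two neighbours.
    return left ^ right


# Hypothetical example: seven virtual processes on a ring, two steps.
state = [0, 0, 0, 1, 0, 0, 0]
for _ in range(2):
    state = synchronous_step(state, rule_90)
```

Writing into a fresh buffer is what keeps the step synchronous: no cell can observe a neighbour's updated value within the same step.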
ISBN: 0769517609 (print)
Shared-object Distributed Shared Memory (DSM) minimizes the problem of false sharing by allowing the programmer to control the sharing size. This shared-object approach to distributed parallel programming works well for task parallelism but not for data parallelism. When the data of a shared object is being modified, a lock on that object must be enforced to exclude any concurrent access to the same object. If the shared data within an object is large, internal false sharing becomes a problem. We present a multi-locking mechanism for shared-object DSM that allows multiple locks to be applied to different data sets of a shared object and thus enhances its concurrency.
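A single-process analogy may make the multi-locking idea concrete. The sketch below uses threads rather than a DSM system, and the `MultiLockObject` class, its slice-per-lock layout, and the `worker` function are all hypothetical; the paper's actual mechanism may partition and lock data differently.

```python
import threading


class MultiLockObject:
    """Shared object whose data is split into independently locked slices."""

    def __init__(self, data, num_locks):
        self.data = list(data)
        self.num_locks = num_locks
        self.locks = [threading.Lock() for _ in range(num_locks)]
        # Size of the contiguous slice guarded by each lock (ceiling division).
        self.chunk = max(1, -(-len(self.data) // num_locks))

    def _lock_for(self, index):
        # Map an element index to the lock that guards its slice.
        return self.locks[min(index // self.chunk, self.num_locks - 1)]

    def add(self, index, delta):
        # Only the lock covering `index` is taken, so writers to other
        # slices of the same object are not excluded.
        with self._lock_for(index):
            self.data[index] += delta


def worker(obj, start, stop, repeats):
    for _ in range(repeats):
        for i in range(start, stop):
            obj.add(i, 1)


# Two writers update disjoint halves of one shared object.
shared = MultiLockObject([0] * 8, num_locks=2)
threads = [
    threading.Thread(target=worker, args=(shared, 0, 4, 1000)),
    threading.Thread(target=worker, args=(shared, 4, 8, 1000)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With a single object-wide lock the two writers would serialize on every update; with one lock per slice they contend only when they touch the same slice.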
ISBN: 0769515126 (print)
This paper presents an investigation of the parallel computation of non-ideal 3-D detonation wave propagation on a high-performance computer based on the CC-NUMA architecture. After analyzing and testing the previous serial program, the computation of curvature and of the first-order and second-order differences was identified as the main target of parallelization. Several techniques were applied to convert the serial program into a parallel program, such as a "Divide and Conquer" strategy and load balancing. Numerical simulation with the parallel program yields a large increase in the computing speed of non-ideal 3-D detonation wave propagation.
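As a rough illustration of dividing a difference computation among workers, the sketch below splits a one-dimensional second-order difference into nearly equal index ranges. It is not the paper's program: the 1-D grid, the function names, and the use of a thread pool are assumptions made for the example, and in CPython the threads show the decomposition rather than a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def second_difference(u, lo, hi):
    """Second-order central difference u[i-1] - 2*u[i] + u[i+1] for lo <= i < hi."""
    return [u[i - 1] - 2 * u[i] + u[i + 1] for i in range(lo, hi)]


def parallel_second_difference(u, parts):
    """Split the interior points into `parts` nearly equal ranges (divide),
    compute each range independently, then concatenate the pieces (conquer)."""
    interior = len(u) - 2
    # Range boundaries over interior indices 1 .. len(u) - 2, balanced to
    # within one point so that each worker gets a similar load.
    bounds = [1 + interior * k // parts for k in range(parts + 1)]
    with ThreadPoolExecutor(max_workers=parts) as pool:
        chunks = pool.map(lambda b: second_difference(u, b[0], b[1]),
                          zip(bounds, bounds[1:]))
        return [value for chunk in chunks for value in chunk]


# For u[i] = i*i the second difference is exactly 2 at every interior point.
samples = [i * i for i in range(8)]
result = parallel_second_difference(samples, parts=3)
```

Each range reads one neighbour on either side of its boundaries, which is the data a distributed version would have to exchange between workers.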
ISBN: 3540440496 (print)
We introduce a new reusable component for parallel programming, the double-scan skeleton. For this skeleton, we formulate and formally prove sufficient conditions under which the double-scan can be parallelized, and we develop an efficient MPI implementation of it. The solution of a tridiagonal system of equations serves as our case study. We describe how this application can be developed using the double-scan and report experimental results for both the absolute performance and the performance predictability of the skeleton-based solution.
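For orientation, the classical sequential way to solve a tridiagonal system (the Thomas algorithm) already has a two-pass shape: one left-to-right sweep followed by one right-to-left sweep. The sketch below shows that sequential algorithm only. It is not the paper's skeleton or its MPI implementation, and the assumption that the two sweeps correspond to the two scans of the double-scan is ours.

```python
def solve_tridiagonal(a, b, c, d):
    """Solve a tridiagonal system with the Thomas algorithm.

    a: sub-diagonal (a[0] unused), b: main diagonal,
    c: super-diagonal (c[-1] unused), d: right-hand side.
    Assumes no zero pivot occurs (e.g. a diagonally dominant matrix).
    """
    n = len(d)
    cp = [0.0] * n
    dp = [0.0] * n

    # First pass, left to right: forward elimination.
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom

    # Second pass, right to left: back substitution.
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


# Hypothetical 3x3 system whose exact solution is [1, 2, 3].
solution = solve_tridiagonal(a=[0.0, 1.0, 1.0],
                             b=[2.0, 2.0, 2.0],
                             c=[1.0, 1.0, 0.0],
                             d=[4.0, 8.0, 8.0])
```

Both loops carry a dependency from one element to the next, which is why parallelizing them requires conditions of the kind the abstract mentions.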
ISBN: 0769516866 (print)
The increasing complexity of distributed applications and the commoditization of resources through grids are making the task of deploying those applications harder. There is a clear need for standard tools allowing versatile deployment and analysis of distributed applications. We present here a solution for the deployment and monitoring of applications written using ProActive, an experimental Java-based library for concurrent, distributed, and mobile computing. We describe the use of XML-based descriptors for the deployment part of a distributed application and the use of IC2D (Interactive Control and Debugging of Distribution) for the monitoring and steering of the running application. These ideas, concepts, and experiments are a contribution towards the construction of integrated environments for component-based grid programming.
ISBN: 0769515525 (print)
The performance of a parallel simulation system depends heavily on partitioning the simulation workload evenly among the processors in the computing environment to ensure load balance between processors. Most parallel simulation systems employ user-defined static partitioning. However, static partitioning requires in-depth domain knowledge of the specific simulation model under study. It is not effective if the workload of a simulation model cannot be quantified accurately or changes over time during a simulation run. Dynamic load balancing allows the simulation system to balance the workload of different simulation models automatically, without user input. In this paper, the use of dynamic load balancing in the context of the BSP Time Warp optimistic protocol is examined. Based on the BSP cost model, a dynamic load-balancing algorithm for the BSP Time Warp protocol is developed. Using different simulation models, we show that to achieve consistent performance, the dynamic load-balancing algorithm for BSP Time Warp needs to consider both computation and communication workload, as well as lookaheads between processors.
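To illustrate how a cost model can drive migration decisions, the sketch below evaluates the standard BSP superstep cost (maximum local work, plus g times the maximum h-relation, plus the barrier latency l) and performs one greedy migration step. It is not the paper's algorithm: the function names and the migration rule are hypothetical, and the rule balances computation only, ignoring the communication and lookahead factors that the abstract says are needed.

```python
def superstep_cost(work, traffic, g, l):
    """Standard BSP cost of one superstep: max work + g * max h-relation + l."""
    return max(work) + g * max(traffic) + l


def processor_loads(assignment, lp_work, num_procs):
    """Total work per processor; assignment[k] is the processor of logical process k."""
    loads = [0] * num_procs
    for lp, proc in enumerate(assignment):
        loads[proc] += lp_work[lp]
    return loads


def rebalance_once(assignment, lp_work, num_procs):
    """Move the lightest logical process from the busiest processor to the
    idlest one, but only if the receiver stays below the sender's old load."""
    loads = processor_loads(assignment, lp_work, num_procs)
    src = loads.index(max(loads))
    dst = loads.index(min(loads))
    candidates = [lp for lp, proc in enumerate(assignment) if proc == src]
    lp = min(candidates, key=lambda k: lp_work[k])
    new_assignment = list(assignment)
    if loads[dst] + lp_work[lp] < loads[src]:
        new_assignment[lp] = dst
    return new_assignment


# Hypothetical workload: four logical processes on two processors.
lp_work = [5, 3, 2, 1]
before = [0, 0, 0, 1]
after = rebalance_once(before, lp_work, num_procs=2)
cost_before = superstep_cost(processor_loads(before, lp_work, 2), [4, 4], g=2, l=10)
cost_after = superstep_cost(processor_loads(after, lp_work, 2), [4, 4], g=2, l=10)
```

The example keeps the traffic term fixed for simplicity; in practice a migration also changes which messages cross processor boundaries, so the h-relation would have to be recomputed.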
ISBN: 0769516866 (print)
As the Internet began its exponential growth into a global information environment, software was often unreliable and slow, and had difficulty interoperating with other systems. Supercomputing node counts also continue to follow high growth trends. Supercomputer and grid resource management software must mature into a reliable computational platform in much the same way that web services matured for the Internet. DOGMA The Next Generation (DOGMA-NG) improves on current resource management approaches by using tested off-the-shelf enterprise technologies to build a robust, scalable, and extensible resource management platform. Distributed web service technologies constitute the core of DOGMA-NG's design and provide fault tolerance and scalability. DOGMA-NG's use of open standard web technologies and efficient management algorithms promises to reduce management time and accommodate the growing size of future supercomputers. The use of web technologies also provides the opportunity for a new parallel programming paradigm, enterprise web services parallel programming, which also benefits from the scalable, robust component architecture.
ISBN: 0769518532 (print)
This paper presents work carried out within the CRUX project, which aims to design a complete environment for parallel programming and is under development in the Graduate Program in Computer Science at the Federal University of Santa Catarina. The paper evaluates the performance of several real-time scheduling algorithms from the literature, using a simulation model that represents both the processing and the communication of this multicomputer. The objective was to quantify the effect of the scheduling algorithm and other factors on selected performance metrics, in order to verify the applicability of CRUX to real-time systems.
ISBN: 0769518400 (print)
Program performance may be improved by efficiently programming some key sections of the software. In this paper, we present a methodology for converting selected portions of source code into automatically scalable multithreaded routines, without forcing programmers to concentrate on parallel programming issues. The resulting routines can be reused across various projects, operating systems, and system architectures. To support this methodology, two separate but tightly coupled tools have been developed: the PARSA(TM) Software Development Environment (SDE) and the ThreadMan(TM) Thread Manager. The SDE addresses programming issues by offering a graphical, object-based approach to developing multithreaded routines that shields users from parallel programming. ThreadMan manages the software developed using the SDE. It is a user-level thread manager that automatically spawns and schedules threads at runtime. Two examples have been developed using this methodology to demonstrate that there is virtually no degradation in performance compared to sequential code on a single-processor system, and that scalability is achieved as the number of processors is increased.