Parallel computing on interconnected workstations is becoming a viable and attractive proposition due to the rapid growth in speeds of interconnection networks and processors. In the case of workstation clusters, there is always a considerable amount of unused computing capacity available in the network. However, heterogeneity in architectures and operating systems, load variations on machines, variations in machine availability, and failure susceptibility of networks and workstations complicate the situation for the programmer. In this context, new programming paradigms that reduce the burden involved in programming for distribution, load adaptability, heterogeneity, and fault tolerance gain importance. This paper identifies the issues involved in parallel computing on a network of workstations. The Anonymous Remote Computing (ARC) paradigm is proposed to address the issues specific to parallel programming on workstation systems. ARC differs from the conventional communicating process model by treating a program as one single entity consisting of several loosely coupled remote instruction blocks instead of treating it as a collection of processes. The ARC approach results in distribution transparency and heterogeneity transparency. At the same time, it provides fault tolerance and load adaptability to parallel programs on workstations. ARC is developed in a two-tiered architecture consisting of high-level language constructs and low-level ARC primitives. The paper describes an implementation of the ARC kernel supporting ARC primitives.
The performance of a parallel simulation system depends very much on partitioning the simulation workload evenly among the set of processors in the computing environment to ensure load balance between processors. Most parallel simulation systems employ user-defined static partitioning. However, static partitioning requires in-depth domain knowledge of the specific simulation model under study. It is not effective if the workload of a simulation model cannot be quantified accurately or changes over time during a simulation run. Dynamic load balancing allows the simulation system to automatically balance the workload of different simulation models without user input. In this paper, the use of dynamic load balancing in the context of the BSP Time Warp optimistic protocol is examined. Based on the BSP cost model, a dynamic load-balancing algorithm for the BSP Time Warp protocol is developed. Using different simulation models, the paper shows that, to achieve consistent performance, the dynamic load-balancing algorithm for BSP Time Warp needs to consider both computation and communication workload, as well as lookaheads between processors.
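As a rough illustration of why computation and communication both matter under the BSP cost model, the sketch below evaluates the standard BSP cost of one superstep, w + g·h + l, for two candidate partitions. It is not the paper's load-balancing algorithm, and it does not model lookahead; the function name `superstep_cost` and all numbers are assumptions chosen for illustration.

```python
from typing import Sequence


def superstep_cost(work: Sequence[float], h_relation: Sequence[float],
                   g: float, l: float) -> float:
    """Standard BSP cost of one superstep.

    work[i]       -- local computation on processor i
    h_relation[i] -- max messages sent or received by processor i
    g             -- communication cost per message (machine parameter)
    l             -- barrier synchronization cost (machine parameter)
    """
    # The slowest processor and the busiest communicator bound the superstep.
    return max(work) + g * max(h_relation) + l


# Hypothetical partitions of the same simulation on two processors.
# Partition A balances computation perfectly but cuts many event links.
cost_a = superstep_cost(work=[10, 10], h_relation=[8, 8], g=2, l=5)
# Partition B is slightly unbalanced but exchanges far fewer messages.
cost_b = superstep_cost(work=[12, 8], h_relation=[3, 3], g=2, l=5)

print(cost_a, cost_b)
```

In this made-up case the computation-balanced partition is the more expensive one, which is the kind of effect a balancer that looks only at computation would miss.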
In recent years, cluster computing has been widely accepted as a parallel platform because of its high performance at an affordable cost. To make the best use of cluster computing resources, a resource monitoring program is needed. The information collected can be used by any parallel application, e.g., parallel motion estimation, to handle load variation on typical time-sharing computers, so that the parallel workload can be distributed properly among n processors. In this paper, we present the development of resource monitoring for cluster computing using the MPI programming model and its application to parallel motion estimation. Results show that our method is effective, achieving a faster parallel execution time.
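One simple way monitored load figures could drive the distribution of work among n processors is a proportional split, sketched below. This is an assumption for illustration only: the abstract does not state the distribution rule used, the sketch uses no MPI calls, and the function name `split_workload` and the load values are hypothetical.

```python
from typing import List, Sequence


def split_workload(total_items: int, cpu_load: Sequence[float]) -> List[int]:
    """Split total_items among processors in proportion to idle capacity.

    cpu_load[i] is the monitored load of processor i in [0, 1];
    its idle capacity is taken to be 1 - cpu_load[i].
    """
    n = len(cpu_load)
    capacity = [max(0.0, 1.0 - load) for load in cpu_load]
    total_capacity = sum(capacity)
    if total_capacity == 0:
        # Every node is saturated: fall back to an even split.
        capacity = [1.0] * n
        total_capacity = float(n)

    # Round each proportional share down, then hand out the leftover
    # items one at a time, starting with the least-loaded processors.
    shares = [int(total_items * c / total_capacity) for c in capacity]
    remainder = total_items - sum(shares)
    order = sorted(range(n), key=lambda i: capacity[i], reverse=True)
    for k in range(remainder):
        shares[order[k % n]] += 1
    return shares


# Hypothetical example: 100 macroblock rows, one idle and two half-busy nodes.
print(split_workload(100, [0.0, 0.5, 0.5]))
```

In an MPI program, the monitoring process would typically gather the load values and send each rank its share; that communication step is left out here.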
We report the development of an SPMD parallel application which computes the macroscopic thermal dispersion in porous media. The performance of SPMD programs is strongly affected by dynamic load-imbalance factors. The use of a suitable load balancing algorithm is essential for overcoming the effects of these factors. We developed nine versions of the SPMD application, each one adopting a different load balancing strategy. The main contribution of this work is the performance evaluation and comparison of these nine versions. The experimental results showed the importance of using a load balancing strategy appropriate to the characteristics of this scientific parallel application.
The paper considers the creation of intelligent solving machines and the organization of parallel programming in intelligent distributed multiprocessor systems based on them. Some main concepts are proposed. A system is designed for programming in the C+Graph high-level language. C+Graph provides efficient operations on knowledge (complicated data structures) and centralized-decentralized control exercised in a virtually distributed computation space. The parallel C+Graph programming model is based on the model used for multithreaded single-processor programming in the basic Java language. The model operates in a virtual C+Graph machine network. The proposed approach can be regarded as an efficient development of structural high-level language interpretation applied to multi-microprocessor systems. The hardware structure of the basic version of intelligent solving machines is considered and some characteristics are discussed.
The presented multi-level storage memory system uses a self-adaptive method that improves the cell model with each successive program cycle, and accommodates cell variations and noise. An accuracy of 5 mV is achieved within eight cycles, which total 125 μs. Algorithm control circuits occupy 1 mm² of area in a 0.5 μm SSI FLASH process.
Presents the design of the Coven framework for construction of problem solving environments (PSEs) for parallel computers. PSEs are an integral part of modern high performance computing (HPC), and Coven attempts to simplify PSE construction. Coven targets Beowulf cluster parallel computers but is independent of any particular domain for the PSE. Coven creates multithreaded parallel applications that are capable of supporting most of the constructs in a typical parallel programming language. Coven uses an agent-based front-end which allows multiple custom interfaces to be constructed. Examples of the use of Coven in the construction of prototype PSEs are shown, and the effectiveness of these PSEs is evaluated in terms of the performance of the applications they generate.
ISBN (print): 0769514928
In this paper, we introduce DLoVe, a new paradigm for designing and implementing distributed and nondistributed virtual reality applications, using one-way constraints. DLoVe allows programs written in its framework to be executed on multiple computers for improved performance. It also allows easy specification and implementation of multi-user interfaces. DLoVe hides all the networking aspects of message passing among the machines in the distributed environment and performs the needed network optimizations. As a result, a user of DLoVe does not need to understand parallel and distributed programming to use the system; he or she need only be able to use the serial version of the user interface description language. Parallelizing the computation is performed by DLoVe, without modifying the interface description.
ISBN (print): 9781581135626
Run-time errors in concurrent programs are generally due to the wrong usage of synchronization primitives such as monitors. Conventional validation techniques such as testing become ineffective for concurrent programs since the state space increases exponentially with the number of concurrent processes. In this paper, we propose an approach in which 1) the concurrency control component of a concurrent program is formally specified, 2) it is verified automatically using model checking, and 3) the code for the concurrency control component is automatically generated. We use monitors as the synchronization primitive to control access to a shared resource by multiple concurrent processes. Since our approach decouples the concurrency control component from the rest of the implementation, it is scalable. We demonstrate the usefulness of our approach by applying it to a case study on Airport Ground Traffic *** use the Action Language to specify the concurrency control component of a system. Action Language is a specification language for reactive software systems. It is supported by an infinite-state model checker that can verify systems with boolean, enumerated and unbounded integer variables. Our code generation tool automatically translates the verified Action Language specification into a Java monitor. Our translation algorithm employs symbolic manipulation techniques and the specific notification pattern to generate an optimized monitor class by eliminating the context switch overhead introduced as a result of unnecessary thread notification. Using counting abstraction, we show that we can automatically verify the monitor specifications for an arbitrary number of threads.
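The idea behind the specific notification pattern can be sketched with a hand-written bounded buffer: each waiting condition gets its own condition variable, so an operation wakes only threads whose guard may have become true instead of broadcasting to every waiter. The sketch below is an assumption-laden analogue in Python rather than Java, it is written by hand and not generated from an Action Language specification, and the class name `BoundedBuffer` is hypothetical.

```python
import threading
from collections import deque


class BoundedBuffer:
    """Monitor-style bounded buffer using one condition per guard."""

    def __init__(self, capacity: int) -> None:
        self._capacity = capacity
        self._items = deque()
        # One lock protects the state; two conditions share it so that
        # producers and consumers wait on separate queues.
        self._lock = threading.Lock()
        self._not_full = threading.Condition(self._lock)
        self._not_empty = threading.Condition(self._lock)

    def put(self, item) -> None:
        with self._lock:
            while len(self._items) >= self._capacity:
                self._not_full.wait()
            self._items.append(item)
            # Only a consumer can benefit from this change, so wake one
            # consumer rather than every waiting thread.
            self._not_empty.notify()

    def get(self):
        with self._lock:
            while not self._items:
                self._not_empty.wait()
            item = self._items.popleft()
            # Only a producer can benefit from the freed slot.
            self._not_full.notify()
            return item
```

With a single shared condition and a broadcast, every `put` would also wake blocked producers that immediately go back to sleep; separating the conditions avoids those wasted context switches.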