Many thread packages are freely available on the Internet. Yet, most parallel language design groups seem to have rejected all existing packages and implemented their own. This is unsurprising. Existing thread package...
To use parallel computers effectively, users need a tool that helps them understand how to parallelize their programs. Because we believe visualization is useful for parallelization, we are developing a tool named Nar...
parallel applications with inconstant usage patterns presents a big challenge to programmers in that the spawning of tasks and the communication between them may be conditional (named »conditional parallel progra...
With the increased complexity of applications, parallel computing has proved to be an alternative to supercomputing in solving large problems. However, developing parallel applications is more difficult than sequential programming. Visual technologies can be employed to aid the multi-dimensional tasks of parallel programming. This paper presents a case study of the use of an integrated visual programming environment in creating parallel applications. Techniques for the hierarchical construction of parallel programs are presented, together with an evaluation of the performance of the generated code for a matrix multiplication application.
This paper proposes a novel queue-based programming abstraction, Parallel Dispatch Queue (PDQ), that enables efficient parallel execution of fine-grain software communication protocols. Parallel systems often use fine-grain software handlers to integrate a network message into computation. Executing such handlers in parallel requires access synchronization around resources. Much as a monitor construct in a concurrent language protects accesses to a set of data structures, PDQ allows messages to include a synchronization key protecting handler accesses to a group of protocol resources. By simply synchronizing messages in a queue prior to dispatch, PDQ not only eliminates the overhead of acquiring and releasing synchronization primitives but also prevents busy-waiting within handlers. In this paper, we study PDQ's impact on software protocol performance in the context of fine-grain distributed shared memory (DSM) on an SMP cluster. Simulation results running shared-memory applications indicate that: (i) parallel software protocol execution using PDQ significantly improves performance in fine-grain DSM, (ii) tight integration of PDQ and embedded processors into a single custom device can offer performance competitive with or better than an all-hardware DSM, and (iii) PDQ best benefits cost-effective systems that use idle SMP processors (rather than custom embedded processors) to execute protocols. On a cluster of four 16-way SMPs, a PDQ-based parallel protocol running on idle SMP processors improves application performance by a factor of 2.6 over a system running a serial protocol on a single dedicated processor.
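The key-based dispatch idea described in this abstract can be illustrated with a small, single-threaded toy model. This is a sketch under stated assumptions, not the paper's implementation: the class name `KeyedDispatchQueue`, its methods, and the message contents are all hypothetical. It only shows the rule that a message is handed to a handler when no other in-flight message holds the same synchronization key, so the handler itself needs no lock.

```python
class KeyedDispatchQueue:
    """Toy model (hypothetical names): at most one in-flight message per key."""

    def __init__(self):
        self._pending = []   # waiting messages, in arrival order
        self._busy = set()   # keys whose handler is currently running

    def enqueue(self, key, payload):
        """Add a message tagged with its synchronization key."""
        self._pending.append((key, payload))

    def dispatch(self):
        """Return the oldest message whose key is free, or None if all are blocked."""
        for i, (key, payload) in enumerate(self._pending):
            if key not in self._busy:
                self._pending.pop(i)
                self._busy.add(key)   # the key is held until complete() is called
                return key, payload
        return None

    def complete(self, key):
        """Called when a handler finishes; frees its key for the next message."""
        self._busy.discard(key)


q = KeyedDispatchQueue()
q.enqueue("block A", "read")
q.enqueue("block A", "write")
q.enqueue("block B", "read")

first = q.dispatch()    # "block A" is free, so its oldest message is dispatched
second = q.dispatch()   # "block A" is busy, so the "block B" message goes next
third = q.dispatch()    # only a "block A" message remains, and that key is busy
q.complete("block A")
fourth = q.dispatch()   # the key is free again
```

In this model, messages with different keys can be handed to different processors at once, while messages sharing a key are serialized in the queue rather than by spinning inside a handler.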
A parallel programming system called MPC++ provides parallel primitives such as remote function invocation, a global pointer, and a synchronization structure using the C++ template feature. The system has run on a cluster of homogeneous computers. In this paper, the runtime system is extended to run on a cluster of heterogeneous computers. Unlike other distributed or parallel programming systems for heterogeneous computers, this extension lets the same program that runs in the homogeneous environment also run in the heterogeneous environment.
This paper presents P-RIO, a parallel programming environment that supports an object-based software configuration methodology. It promotes a clear separation of the individual sequential computation components from the interconnection structure used for the interaction between these components. This makes the data and control interactions explicit, simplifying program visualization and understanding. P-RIO includes a graphical tool that helps to configure, monitor, and debug parallel programs.
We describe Actors, a flexible, scalable and efficient model of computation, and develop a framework for analyzing the parallel complexity of programs written in it. Actors are asynchronous, autonomous objects which interact by message-passing. The data and process decomposition inherent in Actors simplifies modeling real-world systems. High-level concurrent programming abstractions have been developed to simplify program development using Actors; such abstractions do not compromise an efficient and portable implementation. In this paper, we define a parallel complexity model for Actors. The model we develop gives an accurate measure of performance on realistic architectures. We illustrate its use by analyzing a number of examples.
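The description of actors as asynchronous, autonomous objects that interact by message passing can be made concrete with a minimal sketch. The `CounterActor` class and its message names are hypothetical and chosen only for illustration; the sketch uses a thread and a mailbox queue from the Python standard library and does not reflect the abstractions or the complexity model developed in the paper.

```python
import queue
import threading


class CounterActor:
    """Hypothetical actor: private state, changed only by messages from its mailbox."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._count = 0   # never touched directly by other threads
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def send(self, message, reply_to=None):
        """Asynchronous send: the caller does not wait for the message to be handled."""
        self._mailbox.put((message, reply_to))

    def _run(self):
        # The actor handles one message at a time, in arrival order.
        while True:
            message, reply_to = self._mailbox.get()
            if message == "stop":
                break
            if message == "increment":
                self._count += 1
            elif message == "get" and reply_to is not None:
                reply_to.put(self._count)

    def join(self):
        self._thread.join()


actor = CounterActor()
for _ in range(3):
    actor.send("increment")

reply = queue.Queue()          # the sender's own mailbox for the answer
actor.send("get", reply_to=reply)
total = reply.get()            # blocks until the actor replies

actor.send("stop")
actor.join()
```

Because the state is reachable only through the mailbox, no lock is needed around `_count`; the data and process decomposition is carried by the actor itself.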
ISBN: 9780818681172 (print)
An important development in cluster computing is the availability of multiprocessor workstations. These are able to provide additional computational power to the cluster without increasing network overhead and allow multiparadigm parallelism, which we define to be the simultaneous application of both distributed and shared memory parallel processing techniques to a single problem. In this paper we compare execution times and speedup of parallel programs written in a pure message-passing paradigm with those that combine message passing and shared-memory primitives in the same application. We consider three basic applications that are common building blocks for many scientific and engineering problems: numerical integration, matrix multiplication and Jacobi iteration. Our results indicate that the added complexity of combining shared- and distributed-memory programming methods in the same program does not contribute sufficiently to performance to justify the added programming complexity.
ISBN: 0818678763 (print)
This paper presents a new structured parallel programming model, "SEQ OF PAR", based on the Communication Closed Layer (CCL) principle of causal composition for parallel programs and the Bird-Meertens formalism (BMF) of locality-based parallel computation. The model supports more general, architecture-independent parallel programming. It provides a structured approach to integrating task (or process) parallelism and data parallelism in one framework. The well-founded algebra of CCL and BMF also makes it possible to derive, optimize, and verify parallel programs through algebraic transformations. Experimental results show that this programming model is very promising for producing efficient, portable parallel code.
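The kind of algebraic transformation that BMF makes available can be shown with a standard example: a map followed by a reduction with an associative operator gives the same result whether it is applied to the whole list or to independent chunks whose partial results are then combined. The sketch below is a sequential illustration of that identity only. The function names are hypothetical, and it does not implement the SEQ OF PAR model or the CCL composition described in the paper.

```python
from functools import reduce
from operator import add


def map_reduce(f, op, identity, xs):
    """Sequential form: reduce with op after mapping f over xs."""
    return reduce(op, map(f, xs), identity)


def chunked(xs, n_chunks):
    """Split xs into at most n_chunks contiguous pieces."""
    size = max(1, -(-len(xs) // n_chunks))   # ceiling division
    return [xs[i:i + size] for i in range(0, len(xs), size)]


def layered_map_reduce(f, op, identity, xs, n_chunks):
    """Transformed form: independent per-chunk work, then one combining step.

    Valid when op is associative and identity is its unit.
    """
    # Each chunk could be processed by a different worker; here they run in turn.
    partials = [map_reduce(f, op, identity, chunk) for chunk in chunked(xs, n_chunks)]
    return reduce(op, partials, identity)


def square(x):
    return x * x


data = list(range(1, 11))
sequential = map_reduce(square, add, 0, data)
layered = layered_map_reduce(square, add, 0, data, 4)
```

Both forms compute the sum of squares of 1 through 10; the second exposes the chunk-level independence that a parallel implementation could exploit.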