Dataflow computation models enable simpler and more efficient management of the memory hierarchy, a key barrier to the performance of many parallel programs. This paper enumerates some advantages of using the dataflow model: it argues that the programming model is simple and easily managed by a programmer, and demonstrates some of the efficiencies that the dataflow model allows an underlying run-time system to achieve.
While Java provides a mechanism for concurrent programming implemented as language constructs, it is too rudimentary for most programmers and has certain limitations that make programs unnecessarily complex and prevent fine-grained concurrency. We have implemented Java4P, an extension of the Java language that offers a simpler concurrency model and overcomes Java's limitations. Threads are no longer associated with thread objects, allowing concurrency at any level of granularity. Thread creation is made implicit and synchronisation is achieved through method guards. Synchronisation specification is separated from the functional specification to provide a parallel programming model closer to sequential programming.
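The abstract does not show Java4P syntax, so the following is only a rough illustration of the method-guard idea, emulated in Python rather than written in Java4P. A call to a guarded method blocks until its guard predicate holds, and the guard is declared apart from the method body. The `guarded` decorator, the `BoundedBuffer` class, and `run_demo` are hypothetical names chosen for this sketch.

```python
import threading
from collections import deque
from functools import wraps


def guarded(predicate):
    """Hypothetical decorator: block the call until predicate(self) is true."""
    def decorate(method):
        @wraps(method)
        def wrapper(self, *args, **kwargs):
            with self._cond:
                # Wait until the guard holds; it is re-checked after every wake-up.
                self._cond.wait_for(lambda: predicate(self))
                result = method(self, *args, **kwargs)
                # State may have changed, so let other waiters re-test their guards.
                self._cond.notify_all()
                return result
        return wrapper
    return decorate


class BoundedBuffer:
    """Method bodies hold only functional code; guards carry the synchronisation."""

    def __init__(self, capacity):
        self._cond = threading.Condition()
        self._items = deque()
        self._capacity = capacity

    @guarded(lambda self: len(self._items) < self._capacity)
    def put(self, item):
        self._items.append(item)

    @guarded(lambda self: len(self._items) > 0)
    def get(self):
        return self._items.popleft()


def run_demo():
    """One producer and one consumer exchange five items through a 2-slot buffer."""
    buf = BoundedBuffer(capacity=2)
    received = []
    consumer = threading.Thread(
        target=lambda: received.extend(buf.get() for _ in range(5)))
    consumer.start()
    for i in range(5):
        buf.put(i)
    consumer.join()
    return received
```

The method bodies of `put` and `get` contain no locking or waiting logic, which mirrors the separation of synchronisation from functional specification that the abstract describes.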
Runtime systems are critical to the implementation of concurrent object-oriented programming languages. This paper describes a concurrent object-oriented programming language, Balinda C++, running on a distributed-memory system, and its runtime implementation. The runtime system is built on top of the Nexus communication library. The tuplespace is the key to Balinda C++. A distributed tuplespace model is presented to improve data locality. Experiments were carried out to verify the model, and the results indicate that it is effective at improving system performance.
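For readers unfamiliar with tuplespaces, here is a minimal Linda-style sketch in Python. It is a single-process, centralised illustration only: it does not reproduce the Balinda C++ API, the Nexus-based runtime, or the distributed tuplespace model, none of which the abstract details. The class name `TupleSpace`, the method names `out`, `rd`, and `take`, and the use of `None` as a wildcard are assumptions made for this example.

```python
import threading


class TupleSpace:
    """Hypothetical in-memory tuple space shared by threads of one process."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, *tup):
        """Deposit a tuple and wake any thread waiting for a match."""
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    @staticmethod
    def _matches(pattern, tup):
        # None in the pattern acts as a wildcard for that field.
        return len(pattern) == len(tup) and all(
            p is None or p == t for p, t in zip(pattern, tup))

    def _find(self, pattern):
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def rd(self, *pattern):
        """Block until a matching tuple exists, then return it without removing it."""
        with self._cond:
            self._cond.wait_for(lambda: self._find(pattern) is not None)
            return self._find(pattern)

    def take(self, *pattern):
        """Block until a matching tuple exists, then remove and return it."""
        with self._cond:
            self._cond.wait_for(lambda: self._find(pattern) is not None)
            tup = self._find(pattern)
            self._tuples.remove(tup)
            return tup
```

Processes coordinate only by depositing and withdrawing tuples, for example `ts.out("task", 1)` in a producer and `ts.take("task", None)` in a worker, so where tuples are stored relative to the workers that use them directly affects data locality.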
Parallel simulation has the potential to accelerate the execution of simulation applications. However, developing a parallel discrete-event simulation from scratch requires an in-depth knowledge of the mapping process from the physical model to the simulation model, and a substantial effort in optimising performance. This paper presents an overview of the SPaDES (Structured Parallel Discrete-Event Simulation) parallel simulation framework. We focus on the performance analysis of SPaDES/C++, an implementation of SPaDES on a distributed-memory Fujitsu AP3000 parallel computer. SPaDES/C++ hides the underlying complex parallel simulation synchronization and parallel programming details from the simulationist. Our empirical results show that the SPaDES framework can deliver good speedup if the process granularity is properly optimised.
Gang scheduling has been widely used as a practical solution to the dynamic parallel job scheduling problem. Parallel threads of a single job are scheduled for simultaneous execution on a parallel computer even if the job does not fully utilize all available processors. Unallocated processors go idle for the duration of the time quantum assigned to the threads. In this paper we propose a class of scheduling policies, dubbed Concurrent Gang, that is a generalization of gang scheduling and allows for the flexible simultaneous scheduling of multiple parallel jobs, thus improving the space-sharing characteristics of gang scheduling. However, all the advantages of gang scheduling, such as responsiveness, efficient sharing of resources, and ease of programming, are maintained.
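The idle-processor problem described above can be illustrated with a small Python sketch. It compares running one job per time quantum against packing several jobs into the same quantum. The first-fit packing used here is an assumption chosen for illustration, not the Concurrent Gang policy itself, which the abstract does not specify. All function names are hypothetical, and every job is assumed to need no more threads than there are processors.

```python
def one_job_per_slot(job_sizes):
    """Each time slot runs exactly one job; its leftover processors stay idle."""
    return [[size] for size in job_sizes]


def first_fit_slots(job_sizes, processors):
    """Place each job in the first slot where its threads still fit together."""
    slots = []
    for size in job_sizes:
        for slot in slots:
            if sum(slot) + size <= processors:
                slot.append(size)
                break
        else:
            # No existing slot has room, so open a new time slot.
            slots.append([size])
    return slots


def idle_fraction(slots, processors):
    """Fraction of processor-quanta left unused over one pass through all slots."""
    used = sum(sum(slot) for slot in slots)
    return 1 - used / (len(slots) * processors)
```

With thread counts `[4, 2, 2, 6, 8]` on 8 processors, one job per slot needs five slots and leaves 45% of processor-quanta idle, while first-fit packing needs three slots (`[[4, 2, 2], [6], [8]]`) and leaves about 8% idle.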