ISBN: 0897913418 (print)
One of the major difficulties of explicit parallel programming for a shared memory machine model is detecting the potential for nondeterminacy and identifying its causes. There will often be shared variables in a parallel program, and the tasks comprising the program may need to be synchronized when accessing these variables. This paper discusses this problem and presents a method for automatically detecting nondeterminacy in parallel programs that use event-style synchronization instructions, namely the Post, Wait, and Clear primitives. With event-style synchronization, especially when there are many references to the same event, the difficulty lies in computing the execution order that is guaranteed given the synchronization instructions and the sequential components of the program. The main result of this paper is an algorithm that computes such an execution order and yields a Task Graph to which a nondeterminacy detection algorithm can be applied. We have focused on events because they are a frequently used synchronization mechanism in parallel versions of Fortran, including Cray [Cray87], IBM [IBM88], Cedar [GPHL88], and PCF Fortran [PCF88].
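As a rough illustration of event-style synchronization (not the paper's algorithm), the sketch below maps Post, Wait, and Clear onto Python's `threading.Event` methods `set`, `wait`, and `clear`. The names `run_once`, `producer`, and `consumer` are hypothetical. The Wait in the consumer guarantees that its read of the shared variable is ordered after the producer's write, which is the kind of guaranteed ordering the abstract refers to.

```python
import threading


def run_once():
    """Run a producer and a consumer task that share one variable."""
    shared = {"x": 0}        # shared variable accessed by both tasks
    ev = threading.Event()   # assumed stand-in for a Fortran-style event
    seen = []

    def producer():
        shared["x"] = 42     # the write happens before the Post
        ev.set()             # Post(ev)

    def consumer():
        ev.wait()            # Wait(ev): blocks until the event is posted
        seen.append(shared["x"])  # this read is ordered after the write

    # Start the consumer first to show that start order does not matter.
    tasks = [threading.Thread(target=consumer),
             threading.Thread(target=producer)]
    for t in tasks:
        t.start()
    for t in tasks:
        t.join()

    ev.clear()               # Clear(ev): reset the event to "not posted"
    return seen[0], ev.is_set()
```

Without the `ev.wait()` call, the consumer could read either 0 or 42 depending on scheduling, which is the nondeterminacy the detection method is meant to flag.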
The University of Illinois has traditionally been a major center for computer architecture education and research in the nation. This short paper briefly describes the computer architecture curriculum at the Universit...
Collective movement simulations are challenging and important in many areas, including life science, mathematics, physics, information science, and public [...]. In this survey, we provide a comprehensive review of the state-of-the-art techniques for collective movement simulation. We start with a discussion of certain concepts to help beginners understand the topic more [...]. [...], we analyze the intelligence among different collective objects and the emphasis in different [...]. [...], we classify existing collective movement simulation methods into four categories according to their effects, namely versatility, accuracy, dynamic adaptability, and assessment feedback [...]. [...], we introduce five applications: layout optimization, emergency control, dispatching, unmanned systems, and other derivative [...]. [...], we summarize possible future research directions.
Software Defined Radio (SDR) is a radio communication technique which converts hardware problems to software problems using a programmable processing system. With the increase of SDR applications, the demand for high ...
The development of efficient numerical programs and library routines for high-performance parallel computers is a complex task requiring not only an understanding of the algorithms to be implemented, but also detailed...
In this paper, we develop and evaluate a high-density multi-GPU hardware sub-system for high performance computing and deep learning. The high-density multi-GPU hardware is implemented as an out-of-box hardware by ext...
ISBN: 0818621583 (print)
Directory schemes and software schemes have been proposed to solve the cache coherence problem for MIN-based large-scale multiprocessor systems. The authors compare the performance of the two schemes using trace-driven simulation, including the effect of false sharing caused by a nontrivial cache line size. It is shown that the simplest software scheme can have a hit ratio and shared memory traffic comparable to those of the directory scheme. The invalidations and the sharing behavior of the directory scheme are classified and analyzed.
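To make the notion of false sharing concrete, here is a minimal sketch, assuming byte addresses and a line size given in bytes. The helper names `cache_line` and `falsely_shared` are hypothetical and are not taken from the paper. Two distinct addresses can share a line only when the line holds more than one addressable unit, which is why the effect depends on a nontrivial cache line size.

```python
def cache_line(address, line_size):
    """Index of the cache line that holds the given byte address."""
    return address // line_size


def falsely_shared(addr_a, addr_b, line_size):
    """True when two distinct addresses fall on the same cache line."""
    return (addr_a != addr_b
            and cache_line(addr_a, line_size) == cache_line(addr_b, line_size))
```

For example, with 16-byte lines, addresses 0 and 8 land on the same line, so writes by different processors to those two locations would invalidate each other's copies even though no datum is actually shared.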
ISBN: 0897913418 (print)
General purpose parallel processing machines are increasingly being used to speed up a variety of VLSI CAD applications. This paper addresses logic simulation on parallel machines by exploiting the concurrency in the circuit being simulated (called data parallelism) as opposed to exploiting parallelism inherent in the simulation algorithm itself (called functional parallelism). The most crucial step in obtaining the maximum parallelism using data parallelism is the partitioning of circuit elements. We introduce a cost function which tries to model the simulation of a logic circuit in a parallel environment. The cost function tries to estimate the parallel run time for logic simulation given the processor assignment and the underlying multiprocessor architecture. We then present different heuristic algorithms to partition the circuit and evaluate the efficiency of these algorithms using the proposed cost function. Partitioning algorithms for both event-driven and compiled code simulation are given.
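The abstract does not give the cost function itself, so the following is only a hypothetical sketch of the general idea: estimate parallel run time from a processor assignment by summing per-processor compute cost and charging a fixed penalty for every wire that crosses a partition boundary. The function name, the per-gate costs, and the flat `comm_cost` parameter are all assumptions made for illustration.

```python
def estimated_run_time(gate_cost, edges, assignment, comm_cost):
    """Hypothetical estimate: load of the busiest processor.

    gate_cost  -- dict mapping gate name to its evaluation cost
    edges      -- list of (source_gate, destination_gate) wires
    assignment -- dict mapping gate name to a processor id
    comm_cost  -- assumed flat cost of one inter-processor message
    """
    load = {}
    for gate, cost in gate_cost.items():
        proc = assignment[gate]
        load[proc] = load.get(proc, 0) + cost
    for src, dst in edges:
        p_src, p_dst = assignment[src], assignment[dst]
        if p_src != p_dst:
            # A cut wire costs both the sender and the receiver.
            load[p_src] += comm_cost
            load[p_dst] += comm_cost
    return max(load.values())
```

A partitioning heuristic could then compare candidate assignments by this estimate. For a three-gate chain with costs 2, 3, and 1 and `comm_cost=1`, placing everything on one processor gives 6, while alternating processors gives 5.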
In large multiprocessor systems, fast synchronisation is crucial for high performance. However, synchronisation traffic tends to create "hot-spots" in shared memory and cause network congestion. Multistage s...
ISBN: 0897913698 (print)
In the past three years, the Perfect Benchmark™ Suite has evolved from a supercomputer performance evaluation plan, presented by Kuck and Sameh at the 1987 International Conference on Supercomputing, to a vigorous international activity. This paper surveys the current state of this supercomputer performance evaluation effort with particular focus on the adopted methodology. While there has been considerable success in achieving the goals of the plan, some issues remain unresolved, and new questions have surfaced.