ISBN: 0769521118 (print)
Designing fast parallel discrete event simulation systems for shared-memory parallel computers is simplified by the efficient communication operations enabled by the common memory space. The difficulties involved in designing large shared-memory computers, and the resulting high cost of even modest-sized systems, have led to the proliferation of computer systems consisting of small shared-memory computers connected via low-latency message-passing interconnection networks. This paper describes how a network simulation system using a simulation kernel optimized for high-performance operation on shared-memory parallel computers has been extended to operate on computers that mix shared-memory and message-passing paradigms. Results are presented showing that the system can achieve over 60 million simulated packet transmissions per second on 32 4-processor nodes. The results demonstrate that, in many cases, a mixture of shared memory and message passing has an advantage over message passing alone.
Memory management issues associated with Time Warp simulation are examined. The discussion focuses on parallel simulation executing on a distributed memory computer. Software developed for such systems is characterized by the fact that dynamic memory is allocated from a memory pool that is shared by all of the processes at a given processor. In this connection, a new memory management protocol, pruneback, is presented which recovers space by discarding previous states and not by changing the local virtual time of one or more processes. An empirical study is presented which suggests that pruneback is significantly more effective than artificial rollback on a distributed memory computer.
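The idea of recovering space by discarding saved states, rather than by moving local virtual time backwards, can be illustrated with a minimal sketch. The abstract does not say which states pruneback selects, so the policy below (keep the oldest and newest snapshots, drop every other snapshot in between) is purely an assumption, and the names `StateQueue`, `pruneback`, and `restore_point` are hypothetical.

```python
import bisect


class StateQueue:
    """Saved-state queue of one Time Warp process (illustrative only)."""

    def __init__(self):
        self.times = []   # sorted timestamps of retained snapshots
        self.states = {}  # timestamp -> snapshot of the process state

    def save(self, t, state):
        # Record a snapshot taken at simulation time t.
        if t not in self.states:
            bisect.insort(self.times, t)
        self.states[t] = dict(state)

    def pruneback(self):
        """Free memory by discarding snapshots; local virtual time is untouched.

        Assumed policy (not taken from the paper): keep the oldest and the
        newest snapshot and drop every other snapshot in between.
        Returns the number of snapshots freed.
        """
        interior = self.times[1:-1]
        victims = interior[::2]
        for t in victims:
            del self.states[t]
        self.times = [t for t in self.times if t in self.states]
        return len(victims)

    def restore_point(self, t):
        """Timestamp of the latest retained snapshot at or before time t."""
        i = bisect.bisect_right(self.times, t)
        if i == 0:
            raise ValueError("no retained snapshot at or before t")
        return self.times[i - 1]


q = StateQueue()
for ts in (10, 20, 30, 40, 50):
    q.save(ts, {"clock": ts})
freed = q.pruneback()
```

The cost of this kind of pruning shows up only if a rollback occurs: a rollback to time 45 in the example must restart from the snapshot at 30 and re-execute forward, because the snapshot at 40 has been discarded.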
The proceedings contain 30 papers from the Workshop on Principles of Advanced and Distributed Simulation, PADS 2005. The topics discussed include: modeling and simulating the brain as a system; towards time-parallel road traffic simulation; parallel event-driven neural network simulations using the Hodgkin-Huxley neuron model; optimistic protocol analysis in a performance analyzer and prediction tool; distributed worm simulation with a realistic internet model; mobile contagion: simulation of infection and defense; efficient simulation of wireless networks using lazy MAC state update; the WarpIV simulation kernel; preliminary simulation of the effect of scanning worm activity on multicast; and merging parallel simulation programs.
This paper presents the results of an experimental study to evaluate the effectiveness of parallel simulation in reducing the execution time of gate-level models of VLSI circuits. Specific contributions of this paper include (i) the design of a gate-level parallel simulator that can be executed without any changes on both distributed memory and shared memory parallel architectures; (ii) demonstrated speedups with both conservative and optimistic simulation protocols (almost all previous studies on circuit simulation have failed to extract speedups with conservative protocols), in particular a speedup of about 3 on 8 processors of a Sparc1000 for conservative algorithms and about 2 for optimistic algorithms, for circuits in the ISCAS85 benchmark suite; and (iii) a performance comparison between shared memory and distributed memory implementations of the simulator.
Presented is a conservative algorithm for the parallel simulation of billiard balls. A spatial approach to these simulations is commonly employed, in which the billiard table is partitioned into segments which are simulated by different processors. The conservative algorithm differs from previous approaches in that it makes use of shared variables to enable processors to ascertain the state of the computation at neighboring processors. Each shared variable corresponds to a region at the boundary of the table segments. By making use of shared variables, a significant speed-up is obtained.
ISBN: 0818684577 (print)
The implementation of a cloning mechanism that allows for the evaluation of multiple simulated futures is presented and its performance is analyzed. A running parallel discrete event simulation is dynamically cloned at decision points to explore different execution paths concurrently. In this way, what-if and alternative scenario analyses in gaming, tactical, and strategic applications can be evaluated interactively or non-interactively. Performance results show that virtual logical processes, a new mechanism developed to avoid repeating common computations among clones, improve efficiency.
ISBN: 0769511058 (print)
One of the six categories of management services provided by the Run Time Infrastructure (RTI) to federated simulations is Time Management. Currently, it provides only two message ordering policies: time stamp ordering and receive ordering. Temporal anomalies that occur during the execution of a federation because of heterogeneous latencies in the communication network are not handled by receive ordering. While time stamp ordering eliminates temporal anomalies entirely, it incurs high communication latency and a large bandwidth requirement. This paper presents a new time management mechanism which provides a less costly message ordering service, namely causal ordering, to federates. It does not require the specification of lookahead and allows federates that do not require stringent message ordering properties to achieve much more efficient execution. A series of experiments has been carried out to benchmark the performance of this new time management mechanism, and the results show that it incurs a slight overhead compared with the receive ordering mechanism but achieves a significant performance improvement over the time stamp ordering mechanism.
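Causal ordering sits between receive ordering and time stamp ordering: a message is delivered only after every message that causally precedes it. The abstract does not describe how the proposed mechanism implements this, so the sketch below uses the standard vector-clock delivery rule as a stand-in. The class `CausalReceiver` and its methods are hypothetical names, not part of any RTI API.

```python
from typing import List, Tuple


class CausalReceiver:
    """Delivers incoming messages in causal order using vector clocks."""

    def __init__(self, n: int) -> None:
        self.delivered = [0] * n  # messages delivered so far, per sender
        self.pending: List[Tuple[int, List[int], str]] = []
        self.log: List[str] = []  # payloads in delivery order

    def _deliverable(self, sender: int, stamp: List[int]) -> bool:
        # The message must be the next one from its sender ...
        if stamp[sender] != self.delivered[sender] + 1:
            return False
        # ... and everything the sender had seen must already be delivered here.
        return all(
            stamp[k] <= self.delivered[k]
            for k in range(len(stamp))
            if k != sender
        )

    def receive(self, sender: int, stamp: List[int], payload: str) -> None:
        # Buffer the message, then deliver whatever has become deliverable.
        self.pending.append((sender, stamp, payload))
        progress = True
        while progress:
            progress = False
            for msg in list(self.pending):
                s, st, p = msg
                if self._deliverable(s, st):
                    self.pending.remove(msg)
                    self.delivered[s] += 1
                    self.log.append(p)
                    progress = True


# Federate 0 sends m1; federate 1 sees m1 and then sends m2.
# The network delivers m2 to the receiver before m1.
r = CausalReceiver(3)
r.receive(1, [1, 1, 0], "m2")   # held back: m1 has not arrived yet
held = list(r.log)
r.receive(0, [1, 0, 0], "m1")   # m1 is delivered, which releases m2
```

No lookahead value appears anywhere in this rule, which is consistent with the abstract's claim that causal ordering does not require lookahead; the price is the vector stamp carried on every message.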
ISBN: 0769518532 (print)
This paper reports on work carried out within the CRUX project, which aims to design a complete environment for parallel programming and is under development in the graduate program in Computer Science of the Santa Catarina Federal University. The paper evaluates the performance of several real-time scheduling algorithms found in the literature, using a simulation model that represents both the processing and the communication of this multicomputer. The objective was to quantify the effect of the scheduling algorithm and other factors on selected performance metrics, in order to verify the applicability of CRUX to real-time systems.
Many simulations in the natural sciences and engineering require the numerical solution of nonlinear differential equations. For this class of numerical methods, we propose an appropriate parallel computation model on distributed memory machines that supports the prediction of execution times. As a case study, we investigate the parallel implementation of the diagonal-implicitly iterated Runge-Kutta method, a solution method for stiff systems of ordinary differential equations. An implementation on the Intel iPSC/860 confirms the accuracy of the prediction model.