ISBN: 0769523838 (print)
The widespread use of sequential simulation in large-scale parameter studies means that large cost savings can be made by improving the performance of these simulators. Sequential discrete event simulation systems usually employ a central event list to manage future events. This is a priority queue ordered by event timestamps. Many different priority queue algorithms have been developed with the aim of improving simulator performance. Researchers developing asynchronous conservative parallel discrete event simulations have reported exceptional performance for their systems running sequentially in certain cases. This paper compares the performance of simulations using a selection of high-performance central event list implementations to that achieved using techniques borrowed from the parallel simulation community. Theoretical and empirical analysis of the algorithms is presented, demonstrating the range of performance that can be achieved and the benefits of employing parallel simulation techniques in a sequential execution environment.
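The central event list described in this abstract can be illustrated with a minimal sketch, assuming a binary heap as the priority queue. The class name `EventList`, the `run` helper, and the use of string labels as events are hypothetical choices made for illustration; the paper compares several queue implementations and this is not one of them specifically.

```python
import heapq
import itertools


class EventList:
    """Central event list: a priority queue ordered by event timestamp."""

    def __init__(self):
        self._heap = []
        # Insertion counter breaks ties so equal timestamps leave in FIFO order.
        self._counter = itertools.count()

    def schedule(self, timestamp, event):
        """Insert a future event; O(log n) with a binary heap."""
        heapq.heappush(self._heap, (timestamp, next(self._counter), event))

    def pop_next(self):
        """Remove and return the (timestamp, event) pair with the smallest timestamp."""
        timestamp, _, event = heapq.heappop(self._heap)
        return timestamp, event

    def __len__(self):
        return len(self._heap)


def run(event_list):
    """Drain the list, returning events in the order a simulator would process them."""
    processed = []
    while len(event_list) > 0:
        processed.append(event_list.pop_next())
    return processed
```

For example, scheduling events at times 5.0, 1.0 and 1.0 and then calling `run` yields the two events at time 1.0 first, in insertion order, followed by the event at time 5.0.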
The proceedings contain 30 papers from the Workshop on Principles of Advanced and Distributed Simulation, PADS 2005. The topics discussed include: modeling and simulating the brain as a system; towards time-parallel road traffic simulation; parallel event-driven neural network simulations using the Hodgkin-Huxley neuron model; optimistic protocol analysis in a performance analyzer and prediction tool; distributed worm simulation with a realistic internet model; mobile contagion: simulation of infection and defense; efficient simulation of wireless networks using lazy MAC state update; the WarpIV simulation kernel; preliminary simulation of the effect of scanning worm activity on multicast; and merging parallel simulation programs.
ISBN: 0769523838 (print)
Parallel discrete event simulation (PDES) of models with fine-grained computation remains a challenging problem. We explore the usage of POSE, our Parallel Object-oriented Simulation Environment, for application performance prediction on large parallel machines such as BlueGene. This study involves the simulation of communication at the packet level through a detailed network model. This presents an extremely fine-grained simulation: events correspond to the transmission and receipt of packets. Computation is minimal, communication dominates, and strong dependencies between events result in a low degree of parallelism. There is limited look-ahead capability, since the outcome of many events is determined by the application whose performance the simulation is predicting. Thus conservative synchronization approaches are challenging for this type of problem. We present recent experiences and performance results for our network simulator and illustrate the utility of our simulator through prediction and validation studies for a molecular dynamics application.
ISBN: 0769522947 (print)
We describe a model for an interdisciplinary course in scientific modeling and simulation. We discuss the course structure and content, as well as the results of our evaluation process.
ISBN: 0769523838 (print)
In earlier work, cloning was proposed as a means for efficiently splitting a running simulation midway through its execution into multiple parallel simulations. In simulation cloning, clones are usually able to share computations that occur early in the simulation, but as their states diverge, individual logical processes (LPs) are replicated as necessary so that their computations proceed independently. However, if over time the states of the clones (or their constituent LPs) converge, there is as yet no means of recombining them. In this case some efficiency is lost because they will execute identical events. We propose merging, the reverse of cloning: we merge LPs that have previously been cloned, and we show that this can further increase efficiency because the merged LPs perform, once, computations that would otherwise be duplicated. We discuss our implementation of merging and illustrate its effectiveness in several example simulation scenarios.
ISBN: 0769523838 (print)
Simulation of large-scale networks requires enormous amounts of memory and processing time. One way of speeding up these simulations is to distribute the model over a number of connected workstations. However, this introduces inefficiencies caused by the need for synchronization and message passing between machines. In distributed network simulation, one of the factors affecting message passing overhead is the amount of cross-traffic between machines. We perform an independent benchmark of the Parallel/Distributed Network Simulator (PDNS) based on experimental results obtained on the Australian Centre for Advanced Computing and Communications (AC3) supercomputing cluster. We measure the effect of cross-traffic on the wall-clock time needed to complete a simulation for a set of basic network topologies by comparing the result with the wall-clock time needed on a single processor. Our results show that although efficiency is reduced with large amounts of cross-traffic, speedup can still be achieved with PDNS. From these results, we developed a performance model that can be used as a guideline for designing future simulations.
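The comparison of distributed and single-processor wall-clock times in this abstract rests on the usual definitions of speedup and parallel efficiency. A minimal sketch of those two ratios follows; the function names and the sample timings in the usage note are hypothetical and are not measurements from the paper, and the paper's own performance model is not reproduced here.

```python
def speedup(t_single, t_distributed):
    """Speedup: single-processor wall-clock time divided by distributed wall-clock time."""
    if t_single <= 0 or t_distributed <= 0:
        raise ValueError("wall-clock times must be positive")
    return t_single / t_distributed


def efficiency(t_single, t_distributed, n_processors):
    """Parallel efficiency: speedup divided by the number of processors used."""
    if n_processors < 1:
        raise ValueError("at least one processor is required")
    return speedup(t_single, t_distributed) / n_processors
```

With made-up timings of 120 s on one processor and 40 s on four, `speedup(120.0, 40.0)` gives 3.0 and `efficiency(120.0, 40.0, 4)` gives 0.75, which is the kind of "reduced efficiency but still a speedup" outcome the abstract describes.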
ISBN: 0769524478 (print)
In this work we illustrate the design and implementation guidelines of a recently developed middleware that supports the parallel and distributed simulation of large-scale, complex and dynamically interacting system models. The distributed simulation of complex system models may suffer from the communication and synchronization required to maintain the causality constraints between distributed model components. We designed and implemented the ARTIS middleware as a new framework incorporating a set of features that allow adaptive optimization by exploiting many complex and dynamic characteristics of the model and of the distributed simulation. As an example, a dynamic migration mechanism for the run-time adaptive allocation of model entities has been designed and exploited for dynamic load and communication balancing. Optimizations have been introduced to obtain the maximum advantage from heterogeneous and asymmetric communication systems, from shared memory to LAN and Internet communication. Other optimizations exploit concurrent replications of parallel and distributed simulations in order to increase resource utilization and maximize the speedup of simulation processes. Solutions have been designed, implemented and tuned to obtain a significant reduction in the communication and synchronization overheads between the physical execution units, together with increased model scalability and simulation speedup, even under worst-case modeling assumptions and simulation scenarios.
ISBN: 0769523838 (print)
In this paper we introduce a new concept, network atomic operations (NAOs), to create a zero-cost consistent cut. Using NAOs, we define a wall-clock-time-driven GVT algorithm called Seven O'Clock that is an extension of Fujimoto's shared memory GVT algorithm. Using this new GVT algorithm, we report good optimistic parallel performance on a cluster of state-of-the-art Itanium-II quad-processor systems for both benchmark applications such as PHOLD and real-world applications such as a large-scale TCP/Internet model. In some cases, super-linear speedup is observed.
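For readers unfamiliar with the quantity a GVT algorithm computes, the sketch below shows only the textbook definition of global virtual time: the minimum over all LP local clocks and the timestamps of all messages still in transit, taken at a consistent cut. It is not the Seven O'Clock algorithm and does not model NAOs; it assumes the consistent snapshot is already available, which is exactly the hard part that the paper addresses. The function and parameter names are hypothetical.

```python
def global_virtual_time(lp_clocks, in_transit_timestamps):
    """Return GVT for one consistent snapshot of an optimistic simulation.

    lp_clocks: local virtual time of every logical process at the cut.
    in_transit_timestamps: timestamps of messages sent but not yet received.
    No rollback can reach below the returned value, so state older than it
    can be reclaimed (fossil collection).
    """
    candidates = list(lp_clocks) + list(in_transit_timestamps)
    if not candidates:
        raise ValueError("snapshot contains no LPs and no messages")
    return min(candidates)
```

Note that an in-transit message can hold GVT below every LP clock: with LP clocks of 12.0, 9.5 and 15.0 and a pending message stamped 8.0, the GVT is 8.0 rather than 9.5.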
ISBN: 0769523838 (print)
In this paper, a new event scheduling mechanism, XEQ, and a new rollback procedure, rb-messages, are proposed for use in optimistic logic simulation. We incorporate both of these techniques in a simulator, XTW. XTW groups LPs into clusters and makes use of a multi-level queue, XEQ, to schedule events in the cluster. XEQ has an O(1) event scheduling time complexity. Our new rollback mechanism replaces the use of anti-messages with an rb-message and eliminates the need for an output queue at each LP. Experimental comparisons to Time Warp reveal superior performance on the part of XTW, while experimental results over large circuits (5 million to 25 million gates) show that XTW scales well with both circuit size and the number of processors.
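The abstract does not describe how XEQ achieves O(1) scheduling. As a point of comparison only, the sketch below shows a generic single-level timing wheel, a well-known structure that gives O(1) insertion when timestamps are integers within a bounded horizon, as is common in gate-level logic simulation. It is not XEQ, which is a multi-level queue; the class name `TimingWheel` and its interface are hypothetical.

```python
from collections import deque


class TimingWheel:
    """Bucketed event queue: O(1) insertion for integer timestamps within a fixed horizon."""

    def __init__(self, size):
        self._size = size
        self._buckets = [deque() for _ in range(size)]
        self._now = 0      # current simulation time
        self._count = 0    # number of pending events

    def schedule(self, timestamp, event):
        """Append the event to the bucket for its timestamp (constant time)."""
        if not (self._now <= timestamp < self._now + self._size):
            raise ValueError("timestamp outside the wheel's horizon")
        self._buckets[timestamp % self._size].append(event)
        self._count += 1

    def pop_next(self):
        """Advance time to the next non-empty bucket and return (time, event)."""
        if self._count == 0:
            raise IndexError("no pending events")
        while not self._buckets[self._now % self._size]:
            self._now += 1
        self._count -= 1
        return self._now, self._buckets[self._now % self._size].popleft()
```

The trade-off in this simplified design is that insertion is constant time but removal may scan empty buckets, and events beyond the horizon are rejected; multi-level designs exist to relax such limits.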
Efficient computer simulation of complex physical phenomena has long been challenging due to their multi-physics and multi-scale nature. In contrast to traditional time-stepped execution methods, we describe an approa...