Most successful examples of parallel simulation models were developed for parallel execution from the beginning. A number of simulation models are designed only for sequential simulation, even in languages, such as PARSEC, that support both sequential and parallel simulation algorithms. Converting such simulation models to a form that yields good performance with a parallel implementation can be non-trivial. In this paper we describe a case study showing this conversion process for a simulation model of replicated file systems. The major steps taken in converting the simulation into a parallel simulation are presented: correctness changes; performance changes such as communication topology simplification and lookahead specification; and modeling changes to eliminate performance bottlenecks. The details and performance improvement of each step are reported.
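The lookahead specification mentioned above is the kind of change that is easy to illustrate. Below is a minimal sketch, assuming a conservative synchronization scheme in which each logical process declares the minimum delay on any event it will send; the class and method names are illustrative and are not PARSEC's actual API.

```python
# Illustrative sketch (not PARSEC's actual API): a logical process that
# declares a fixed lookahead so a conservative scheduler can let its
# neighbours advance safely.
import heapq
import itertools

class LogicalProcess:
    def __init__(self, name, lookahead):
        self.name = name
        self.lookahead = lookahead    # minimum delay on any event this LP will send
        self.clock = 0.0
        self.pending = []             # local future-event list (a heap)
        self._seq = itertools.count() # tie-breaker so equal timestamps never compare payloads

    def earliest_output_time(self):
        # A neighbour may safely process its own events up to this bound.
        return self.clock + self.lookahead

    def schedule(self, timestamp, payload):
        heapq.heappush(self.pending, (timestamp, next(self._seq), payload))

    def advance_to(self, safe_time):
        # Consume all local events whose timestamps fall within the safe bound.
        while self.pending and self.pending[0][0] <= safe_time:
            self.clock, _, payload = heapq.heappop(self.pending)
            # ... model-specific handling of `payload` would go here ...
```

The larger the declared lookahead, the further neighbouring processes can run ahead before they must block, which is why topology simplification and lookahead specification matter so much for performance.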
Traditionally, parallel discrete-event simulators based on the Time Warp synchronization protocol have been implemented using either the shared memory programming model or the distributed memory, message passing programming model. This was because the preferred hardware platform was either a shared memory multiprocessor workstation or a network of uniprocessor workstations. However, with the advent of "clumps" (clusters of shared memory multiprocessors), a change in this dichotomous view becomes necessary. This paper explores the design and implementation issues involved in exploiting this new platform for Time Warp simulations. Specifically, this paper presents two generic strategies for implementing Time Warp simulators on clumps. In addition, we present our experiences in implementing these strategies on an existing distributed memory, message passing Time Warp simulator (WARPED). Preliminary performance results comparing the modified, clump-specific simulation kernel to the unmodified distributed memory, message passing simulation kernel are also presented.
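A hedged sketch of one such clump strategy, stated as an assumption rather than a description of the WARPED code: events between logical processes on the same SMP node travel through shared-memory queues, while events bound for another node fall back to the message-passing layer.

```python
# Hedged sketch of a hybrid "clump" event router (not the WARPED implementation):
# intra-node sends use a shared in-memory queue, inter-node sends use message passing.
import queue

class ClumpRouter:
    def __init__(self, node_id, lp_to_node, mp_send):
        self.node_id = node_id
        self.lp_to_node = lp_to_node      # map: LP id -> node id hosting that LP
        self.mp_send = mp_send            # callable(dest_node, message), e.g. an MPI wrapper
        self.local_queues = {}            # LP id -> thread-safe queue on this node

    def register_local_lp(self, lp_id):
        self.local_queues[lp_id] = queue.Queue()

    def send(self, dest_lp, event):
        dest_node = self.lp_to_node[dest_lp]
        if dest_node == self.node_id:
            # Intra-node: cheap shared-memory hand-off, no trip across the network.
            self.local_queues[dest_lp].put(event)
        else:
            # Inter-node: route through the message-passing layer.
            self.mp_send(dest_node, (dest_lp, event))
```

The design question the paper raises is exactly where this split should sit: per-node processes with internal threads, or one process per CPU that happens to share a node with others.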
In this paper, we present a parallel simulator (SWiMNet) for PCS networks that uses a combination of optimistic and conservative paradigms. The proposed methodology exploits event precomputation permitted by model independence within the PCS components. The low percentage of blocked calls is exploited in the channel allocation simulation of precomputed events by means of an optimistic approach. Experiments were conducted with various call arrival rates and mobile host densities on a cluster of Pentium workstations. Performance results indicate that SWiMNet achieves a speedup of 6 on 8 workstations, and a speedup of 12 on 16 workstations.
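The precomputation idea can be sketched under the assumption stated in the abstract, namely that mobility and call arrivals do not depend on channel state; only the channel allocation then needs optimistic treatment. The parameter names and distributions below are illustrative, not SWiMNet's actual model.

```python
# Minimal sketch of event precomputation for a PCS model: each mobile's
# call start/end events are generated independently of channel state, then
# handed to a separate (optimistic) channel-allocation stage.
import random

def precompute_call_events(num_mobiles, arrival_rate, mean_holding, horizon,
                           rng=random.Random(1)):
    events = []
    for mobile in range(num_mobiles):
        t = rng.expovariate(arrival_rate)
        while t < horizon:
            duration = rng.expovariate(1.0 / mean_holding)
            events.append((t, "call_start", mobile))
            events.append((t + duration, "call_end", mobile))
            t += rng.expovariate(arrival_rate)
    events.sort()
    return events   # consumed later by the channel-allocation stage
```

Because blocked calls are rare, the channel-allocation stage can optimistically assume every precomputed call succeeds and correct itself only when a block actually occurs.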
The High Level Architecture (HLA) provides the specification of a software architecture for distributed simulation. The baseline definition of the HLA includes the HLA Rules, the HLA Interface Specification, and the HLA Object Model Template (OMT). The HLA Rules are a set of 10 basic rules that define the responsibilities and relationships among the components of an HLA federation. The HLA Interface Specification provides a specification of the functional interfaces between HLA federates and the HLA Runtime Infrastructure. The HLA OMT provides a common presentation format for HLA Simulation and Federation Object Models. The HLA was developed over the past three years. It is currently being applied to simulations developed for analysis, training, and test and evaluation, and is being incorporated into industry standards for distributed simulation by both the Object Management Group and the IEEE. This paper provides a discussion of key areas where there are technology challenges in the future implementation and application of the HLA.
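To make the federate/Runtime Infrastructure split concrete, here is a schematic federate main loop. The `rti` object and its method names are a hypothetical wrapper invented for illustration; the real functional interfaces are those defined by the HLA Interface Specification.

```python
# Schematic federate loop; `rti` is a hypothetical wrapper, not the actual
# interface names from the HLA Interface Specification.
class Federate:
    def __init__(self, rti, federation, name):
        self.rti = rti
        self.rti.join(federation, name)   # join the federation execution
        self.time = 0.0

    def step(self, dt):
        # Ask the Runtime Infrastructure for a time advance; it grants the
        # advance only when doing so cannot violate ordering across federates.
        granted = self.rti.request_time_advance(self.time + dt)
        for update in self.rti.receive_updates(up_to=granted):
            self.reflect(update)          # apply other federates' attribute updates
        self.time = granted

    def reflect(self, update):
        pass  # model-specific handling of reflected attributes
```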
This paper describes a new, auto-adaptive algorithm for dead reckoning in DIS. In general, dead-reckoning algorithms use a fixed threshold to control extrapolation errors. Since a fixed threshold cannot adequately handle the dynamic relationships between moving entities, a multi-level threshold scheme is proposed. The threshold levels are defined using the concepts of area of interest (AOI) and sensitive region (SR), and the threshold level is adaptively adjusted based on the relative distance between entities during the simulation. Various experiments were conducted. The results show that the proposed auto-adaptive dead-reckoning algorithm achieves a considerable reduction in update packets without sacrificing extrapolation accuracy.
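A minimal sketch of the multi-level threshold idea follows; the specific radii and threshold values are assumptions chosen for illustration, not the paper's parameters.

```python
# Hedged sketch of distance-adaptive dead reckoning: the extrapolation-error
# threshold grows with the distance to the nearest observing entity, so
# far-away entities tolerate more error before an update packet is sent.
import math

def pick_threshold(distance, sr_radius=50.0, aoi_radius=500.0):
    if distance <= sr_radius:      # inside the sensitive region: tight threshold
        return 0.1
    elif distance <= aoi_radius:   # inside the area of interest: medium threshold
        return 1.0
    else:                          # outside the AOI: loose threshold
        return 5.0

def needs_update(true_pos, dead_reckoned_pos, nearest_observer_dist):
    # Divergence between the real state and the remotely extrapolated state.
    error = math.dist(true_pos, dead_reckoned_pos)
    return error > pick_threshold(nearest_observer_dist)
```

An entity sends a state update only when `needs_update` returns True, so tightly coupled entities stay accurate while distant ones generate little traffic.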
We have developed a set of performance prediction tools which help to estimate the achievable speedups from parallelizing a sequential simulation. The tools focus on two important factors in the actual speedup of a parallel simulation program: (a) the simulation protocol used, and (b) the inherent parallelism in the simulation model. The first two tools are a performance/parallelism analyzer for a conservative, asynchronous simulation protocol, and a similar analyzer for a conservative, synchronous ("super-step") protocol. Each analyzer allows us to study how the speedup of a model changes with an increasing number of processors when a specific protocol is used. The third tool, a critical path analyzer, gives an ideal upper bound on the model's speedup. This paper gives an overview of the prediction tools, and reports the predictions from applying the tools to a discrete-event wafer fabrication simulation model. The predictions are close to the speedups from actual parallel implementations. These tools help us to set realistic expectations of the speedup from a parallel simulation program, and to focus our work on issues which are more likely to yield performance improvement.
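The critical-path bound is simple enough to sketch: treat the event dependencies as a weighted DAG, take the longest path as the unavoidable serial time, and divide total work by it. The sketch below is a generic illustration of that idea, not the analyzer described in the paper.

```python
# Minimal critical-path analysis: total_work / longest_weighted_path gives
# an ideal ceiling on parallel speedup, regardless of protocol.
def critical_path_speedup(costs, preds):
    # costs: event id -> execution cost; preds: event id -> prerequisite event ids
    finish = {}
    def finish_time(e):
        if e not in finish:
            finish[e] = costs[e] + max((finish_time(p) for p in preds.get(e, [])),
                                       default=0.0)
        return finish[e]
    critical_path = max(finish_time(e) for e in costs)
    return sum(costs.values()) / critical_path

# Example: e2 and e3 both depend on e1, so at best two processors overlap them.
print(critical_path_speedup({"e1": 1.0, "e2": 2.0, "e3": 2.0},
                            {"e2": ["e1"], "e3": ["e1"]}))   # prints 5/3 ~ 1.67
```

Protocol-specific analyzers then tell how much of that ideal bound a conservative asynchronous or super-step execution can actually capture.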
This paper presents a checkpointing scheme for optimistic simulation which is a mixed approach between periodic and probabilistic checkpointing. The probabilistic part, based on statistical data collected during the simulation, aims at recording as checkpoints those states of a logical process that have a high probability of being restored due to rollback (this is done in order to make those states immediately available). The periodic part prevents performance degradation due to state reconstruction (coasting forward) cost whenever the collected statistics do not allow states highly likely to be restored to be identified. Overall, this scheme can be seen as a highly general solution to the checkpointing problem in optimistic simulation. A performance comparison with previous solutions is carried out through a simulation study of a store-and-forward communication network in a two-dimensional torus topology.
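A minimal sketch of how the two criteria could be combined, assuming the simulator can supply a per-state estimate of restore probability; the period and probability threshold below are placeholders, not the paper's tuning.

```python
# Illustrative mix of periodic and probabilistic checkpointing: save a state
# either when collected rollback statistics say it is likely to be restored,
# or when too many events have passed since the last checkpoint (bounding
# the coasting-forward cost).
class CheckpointPolicy:
    def __init__(self, period=8, prob_threshold=0.3):
        self.period = period                    # assumed periodic bound
        self.prob_threshold = prob_threshold    # assumed probability cutoff
        self.events_since_checkpoint = 0

    def should_checkpoint(self, estimated_restore_prob):
        self.events_since_checkpoint += 1
        probabilistic = estimated_restore_prob >= self.prob_threshold
        periodic = self.events_since_checkpoint >= self.period
        if probabilistic or periodic:
            self.events_since_checkpoint = 0
            return True
        return False
```

When the statistics are uninformative the periodic rule dominates, which is exactly the fallback behavior the abstract describes.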
Ordering of simultaneous events in DES is an important issue, as it has an impact on modelling expressiveness, model correctness, and causal dependencies. In sequential DES this problem has attracted much attention over the years, and most systems provide the user with tools to deal with such issues. It has also attracted some attention within the PDES community, and we present an overview of these efforts. We have, however, not yet found a scheme which provides us with the desired functionality. Thus, we present and evaluate some simple schemes to achieve a well-defined ordering of events and means to identify both causally dependent and independent events with identical timestamps in the context of optimistic simulations. These schemes should also be applicable to conservative PDES.
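One common family of such schemes uses a compound timestamp whose extra fields break ties deterministically and expose causal chains. The field names and ordering below are an assumption for illustration, not the scheme adopted in the paper.

```python
# Sketch of a compound timestamp for simultaneous events: `age` grows when an
# event schedules another event at the same simulation time (causal chain),
# while (lp_id, seq) makes the remaining order deterministic and repeatable.
from dataclasses import dataclass

@dataclass(order=True, frozen=True)
class Timestamp:
    time: float
    age: int = 0     # causal depth among events sharing the same time
    lp_id: int = 0   # consistent tie-break across runs
    seq: int = 0     # per-LP sequence number

def child_timestamp(parent: Timestamp, new_time: float, lp_id: int, seq: int) -> Timestamp:
    # An event scheduled at the same simulation time is causally after its parent.
    age = parent.age + 1 if new_time == parent.time else 0
    return Timestamp(new_time, age, lp_id, seq)
```

With such a total order, two events with identical `time` but different `age` are known to be causally related, whereas equal `age` with different `lp_id` marks them as independent.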
This paper introduces the Critical Channel Traversing (CCT) algorithm, a new scheduling algorithm for both sequential and parallel discrete event simulation. CCT is a general conservative algorithm that is aimed at the simulation of low-granularity network models on shared-memory multiprocessor machines. An implementation of the CCT algorithm within a kernel called TasKit has demonstrated excellent performance for large ATM network simulations when compared to previous sequential, optimistic and conservative kernels. TasKit has achieved two to three times speedup on a single processor with respect to a splay tree central-event-list based sequential kernel. On a 16 processor (R8000) Silicon Graphics PowerChallenge, TasKit has achieved an event rate of 1.2 million events per second and a speedup of 26 relative to the sequential kernel for a large ATM network model. This is achieved through a multi-level scheduling scheme that supports the scheduling of large grains of computation even with low-granularity events. Performance is also enhanced by supporting good cache behavior and automatic load balancing. This paper describes the algorithm and its motivation, proves its correctness, and briefly presents performance results for TasKit.
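The core intuition behind critical channel scheduling can be illustrated in a few lines; this is a greatly simplified view under the usual channel-clock assumption of conservative simulation, not the TasKit implementation.

```python
# Simplified critical-channel idea: each input channel carries a clock, an LP
# may safely consume events only up to the minimum of those clocks, and the
# scheduler keeps working on the channel currently holding that minimum
# (the "critical" channel), since advancing it is what unblocks progress.
def critical_channel(channels):
    # channels: name -> {"clock": float, "events": [(timestamp, payload), ...]}
    return min(channels, key=lambda name: channels[name]["clock"])

def process_lp(channels, handle_event):
    safe_time = min(ch["clock"] for ch in channels.values())
    name = critical_channel(channels)
    ch = channels[name]
    while ch["events"] and ch["events"][0][0] <= safe_time:
        timestamp, payload = ch["events"].pop(0)
        handle_event(timestamp, payload)
```

Batching work along the critical channel is what lets the scheduler form large grains of computation even when individual events are cheap.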