We have previously shown that the mathematical technique of uniformization can serve as the basis of synchronization for the parallelsimulation of continuous-time Markov chains. This paper reviews the basic method an...
详细信息
ISBN:
(纸本)1565550552
We have previously shown that the mathematical technique of uniformization can serve as the basis of synchronization for the parallelsimulation of continuous-time Markov chains. This paper reviews the basic method and compares four different methods based on uniformization, evaluating their strengths and weaknesses as a function of problem characteristics. The methods vary in their use of optimism, logical aggregation, communication management, and adaptivity. Performance evaluation is conducted on the Intel Touchstone Delta multiprocessor, using up to 256 processors.
The major goal of this work has been to develop an implementation of a parallel partitioning algorithm which is suitable for use in a conservatively synchronized parallel Discrete Event simulation (PDES) environment. ...
详细信息
ISBN:
(纸本)1565550552
The major goal of this work has been to develop an implementation of a parallel partitioning algorithm which is suitable for use in a conservatively synchronized parallel Discrete Event simulation (PDES) environment. Effective partitioning is essential for performance and capacity consideration, for any PDES problem. The performance of the partitioning algorithm is very important, to the overall simulation performance. There are two possible approaches to improve performance for the partitioning step: algorithm modifications;and parallelize the partitioning process. In this work, an efficient parallelized version of the iterative improvement based partitioning algorithm (Fiduccia and Mattheyses, 1982) is developed. The basic algorithm has been modified, first for parallel execution with a similar quality of final partition;and then further modified to increase the parallelism of the algorithm, at the expense of partition quality.
In this paper we consider the effect of using bus interconnection structures on the overheads present in conservative parallelsimulations of multicomputer programs. We use a modified version of the Poker Programming ...
详细信息
ISBN:
(纸本)1565550552
In this paper we consider the effect of using bus interconnection structures on the overheads present in conservative parallelsimulations of multicomputer programs. We use a modified version of the Poker Programming Environment to empirically measure the overhead in three parallel algorithms using buses. We discuss the sources of overhead and compare them with those found using point-to-point communication. Preliminary results indicate that the overheads encountered using a bus interconnection structure were not predicted by our previous results using point-to-point communications.
An approach for high performance parallel logic simulation on a local area network of workstation computers is discussed in this paper. The single, shared transmission medium often found in such networks places limita...
详细信息
ISBN:
(纸本)1565550552
An approach for high performance parallel logic simulation on a local area network of workstation computers is discussed in this paper. The single, shared transmission medium often found in such networks places limitations on parallel execution, hence a reduction in the frequency of synchronization is pursued by combining a circuit partitioning methodology with a specific synchronization constraint. A consequence of the partitioning methodology is replication of objects between blocks of a partition. A partitioning procedure based on iterative improvement is described for reducing replication while preserving load balance. Two interprocessor synchronization techniques for parallelsimulation are studied: conservative and optimistic synchronization. Experiments conducted on three large sequential circuits indicate that reasonable speedup is achievable for well-balanced partitions, and that optimistic synchronization provides a modest improvement in performance over conservative synchronization.
Time-parallelsimulations exploit parallelism by partitioning the time domain of a simulation model. Exploiting temporal parallelism requires predicting future states of a simulation model. A poor prediction of future...
详细信息
ISBN:
(纸本)1565550552
Time-parallelsimulations exploit parallelism by partitioning the time domain of a simulation model. Exploiting temporal parallelism requires predicting future states of a simulation model. A poor prediction of future states may cause extensive recomputation so that a time-parallelsimulation requires more real time to execute than a corresponding sequential simulation. Recurrent states of a simulation model provide a way to predict future states. In this paper, we propose a time-parallelsimulation method which uses a pre-simulation to identify recurrent states. For simulation models in which recurrent states do not exist or can not support sufficient time parallelism, an approximation technique is suggested to extend the class of simulation models to which time-parallelsimulation can be applied. Several queueing network models are investigated with the proposed time-parallelsimulation. Experimental results suggest that the proposed approach can exploit massive parallelism while yielding accurate results.
Global virtual time (GVT) is used in distributedsimulations to reclaim memory, commit output, detect termination, and handle errors. It is a global function that is computed many times during the course of a simulati...
详细信息
ISBN:
(纸本)1565550552
Global virtual time (GVT) is used in distributedsimulations to reclaim memory, commit output, detect termination, and handle errors. It is a global function that is computed many times during the course of a simulation. A small GVT latency (delay between its occurrence and detection) allows for more efficient use of resources. We present an algorithm which minimizes the latency, and we prove its correctness. The algorithm is unique in that a target virtual time (TVT) is predetermined by an initiator who then detects when GVT≥TVT. This approach eliminates the avalanche effect because the collection phase is spread out over time, and it allows for regular and timely GVT updates. The algorithm does not require messages to be acknowledged, which significantly reduces the message overhead of the simulation. One possible application is with interactive simulators, where regular and timely updates would produce output that is up to date and appears smooth.
Time Warp and Breathing Time Buckets are two general-purpose optimistic synchronization strategies for supporting parallel discrete-event simulations. However, each one of these approaches has potential fatal shortcom...
详细信息
ISBN:
(纸本)1565550552
Time Warp and Breathing Time Buckets are two general-purpose optimistic synchronization strategies for supporting parallel discrete-event simulations. However, each one of these approaches has potential fatal shortcomings. Time Warp may exhibit rollback explosions that can cause an avalanche of antimessages. Breathing Time Buckets, on the other hand, may not be able to process enough events per synchronization cycle to remain efficient. A new strategy, called Breathing Time Warp, has been developed in the Synchronous parallel Environment for Emulation and Discrete-Event simulation (SPEEDES) operating system. This new strategy solves both of these problems by mixing the two algorithms together, resulting in the best of both methods. This paper describes the implementation of the Breathing Time Warp algorithm in SPEEDES, and then shows how this new approach sometimes improves the performance of parallel discrete-event simulations.
A hardware-based framework which supports a wide range of parallel discrete event synchronization protocols has been proposed in [Reyn92]. This framework offloads all synchronization activity from the host processors ...
详细信息
ISBN:
(纸本)1565550552
A hardware-based framework which supports a wide range of parallel discrete event synchronization protocols has been proposed in [Reyn92]. This framework offloads all synchronization activity from the host processors and host communication network in the system. The underlying hardware computes results of global, binary associative operations, or global reductions. In this paper we present results of simulations that strongly suggest the need for a next-generation reduction network which computes and disseminates results of target-specific reductions to support both aggressive and non-aggressive parallel discrete event simulations. Target-specific reductions allow a logical process to receive synchronization information only from those logical processes which may have a direct or indirect impact on its performance.
The two main approaches to parallel discrete event simulation - conservative and optimistic - are likely to encounter some limitations when the size and complexity of the simulation system increases. For such large sc...
详细信息
ISBN:
(纸本)1565550552
The two main approaches to parallel discrete event simulation - conservative and optimistic - are likely to encounter some limitations when the size and complexity of the simulation system increases. For such large scale simulations, the conservative approach appears to be limited by blocking overhead and sensitivity to lookahead, whereas the optimistic approach may become prone to cascading rollbacks, state saving overhead, and demands for larger memory space. These drawbacks restrict the synchronization schemes based on each of the two approaches from scaling up. A combined approach may resolve these limitations, while preserving and utilizing potential advantages of each method. However, the schemes proposed so far integrate the two views at the same level, i.e. local to a logical process, and hence may not be able to fully solve the problems. In this paper we propose the Local Time Warp method for parallel discrete-event simulation and present a novel synchronization scheme for it called HCTW. The new scheme hierarchically combines a Conservative Time Window algorithm with Time Warp and aims at reducing cascade rollbacks, sensitivity to lookahead, and the scalability problems. Local Time Warp is believed to be suitable for parallel machines equipped with thousands of processors and thus an appropriate candidate for simulation of large and complex systems.
The successful application of optimistic synchronization techniques in parallelsimulation requires that rollback overheads be contained. The chief contributions to rollback overhead in a Time Warp simulation are the ...
详细信息
ISBN:
(纸本)1565550552
The successful application of optimistic synchronization techniques in parallelsimulation requires that rollback overheads be contained. The chief contributions to rollback overhead in a Time Warp simulation are the time required to save state information and the time required to restore a previous state. Two competing techniques for reducing rollback overhead are periodic checkpointing (Lin and Lazowska, 1989) and incremental state saving (Bauer et al., 1991). This paper analytically compares the relative performance of periodic checkpointing to incremental state savings. The analytical model derived for periodic checkpointing is based almost entirely on the previous model developed by Lin (Lin and Lazowska, 1989). The analytical model for incremental state saving has been developed for this study. The comparison assumes an optimal checkpoint interval and shows under what simulation parameters each technique performs best.
暂无评论