In traditional distributed simulation schemes, the entire simulation must be restarted if any of the participating logical processes (LPs) crashes. This is highly undesirable for long-running simulations, so some form of fault tolerance is required to minimize the wasted computation. We integrate a rollback-based optimistic fault tolerance scheme with an optimistic distributed simulation scheme. In rollback recovery schemes, checkpoints are periodically saved on stable storage; after a crash, these saved checkpoints are used to restart the computation. We make use of the novel insight that a failure can be modeled as a straggler event with a receive time equal to the virtual time of the last checkpoint saved on stable storage. This saves implementation effort and reduces overhead. We define stable global virtual time (SGVT) as the virtual time such that no state with a lower timestamp will ever be rolled back despite crash failures. A simple change to existing GVT algorithms suffices to compute SGVT. Our use of transitive dependency tracking eliminates antimessages. LPs are grouped into clusters to minimize stable storage access time.
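The idea of treating a crash as a straggler can be illustrated with a toy model. The following is a minimal sketch, not the authors' implementation: the class `LogicalProcess`, its fields, and the helper `stable_gvt` are hypothetical names, re-execution of rolled-back events is omitted, and the SGVT formula (the minimum of GVT and every LP's last stable checkpoint time) is an assumption about what the "simple change" to a GVT algorithm might look like.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalProcess:
    """Toy optimistic LP that saves one state per processed virtual time."""
    lvt: int = 0                                             # local virtual time
    history: dict = field(default_factory=lambda: {0: 0})    # virtual time -> state
    stable_ckpt: int = 0                                     # time of last stable checkpoint

    def rollback(self, to_time: int) -> None:
        # Discard every saved state later than to_time and restore the latest survivor.
        self.history = {t: s for t, s in self.history.items() if t <= to_time}
        self.lvt = max(self.history)

    def process(self, recv_time: int, delta: int) -> None:
        if recv_time < self.lvt:
            # Straggler: undo optimistic work beyond its receive time.
            self.rollback(recv_time)
        state = self.history[self.lvt] + delta
        self.lvt = recv_time
        self.history[recv_time] = state

    def checkpoint(self) -> None:
        # Pretend the current state has been written to stable storage.
        self.stable_ckpt = self.lvt

    def crash_and_recover(self) -> None:
        # Assumption: recovery reuses the straggler path, with the receive
        # time set to the virtual time of the last stable checkpoint.
        self.rollback(self.stable_ckpt)


def stable_gvt(gvt: int, lps: list) -> int:
    # Assumed formula: no state below this time can be undone by a crash.
    return min([gvt] + [lp.stable_ckpt for lp in lps])


lp = LogicalProcess()
lp.process(10, 1)
lp.process(20, 1)
lp.checkpoint()          # stable checkpoint at virtual time 20
lp.process(30, 1)        # optimistic work beyond the checkpoint
lp.crash_and_recover()   # handled exactly like a straggler at time 20
```

Because recovery goes through the same `rollback` path as an ordinary straggler, the sketch needs no separate recovery logic, which is the saving in implementation effort the abstract refers to.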
In message-passing environments, the message send time is dominated by overheads that are relatively independent of the message size. Therefore, fine-grained applications (such as Time Warp simulators) suffer high overheads because of frequent communication. We investigate the optimization of the communication subsystem of Time Warp simulators using dynamic message aggregation. Under this scheme, Time Warp messages that have the same destination LP and occur in close temporal proximity are dynamically aggregated and sent as a single physical message. We develop several aggregation strategies that attempt to minimize the communication overhead without harming the progress of the simulation by delaying messages. The performance of the strategies is evaluated on a network of workstations and on an SMP, using a number of applications with different communication behavior.
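One simple way to picture message aggregation is a fixed-window buffer per destination. The sketch below is illustrative only and is not one of the strategies evaluated in the paper: the class `Aggregator`, the `window` parameter, and the `send` callback are hypothetical, and the clock is passed in explicitly (in arbitrary integer ticks) so that the behaviour is deterministic.

```python
from collections import defaultdict


class Aggregator:
    """Buffers messages per destination LP and flushes each buffer as one physical message."""

    def __init__(self, window, send):
        self.window = window              # maximum age of the oldest buffered message
        self.send = send                  # callable(dest, messages): one physical send
        self.buffers = defaultdict(list)  # dest -> messages waiting to be sent
        self.oldest = {}                  # dest -> time the first waiting message arrived

    def enqueue(self, dest, msg, now):
        self.buffers[dest].append(msg)
        self.oldest.setdefault(dest, now)  # only the first message starts the window

    def tick(self, now):
        # Flush every destination whose oldest message has waited a full window.
        due = [d for d, t0 in self.oldest.items() if now - t0 >= self.window]
        for dest in due:
            self.send(dest, self.buffers.pop(dest))
            del self.oldest[dest]


sent = []
agg = Aggregator(window=5, send=lambda dest, msgs: sent.append((dest, msgs)))
agg.enqueue("lp1", "a", now=0)
agg.enqueue("lp1", "b", now=1)
agg.enqueue("lp2", "c", now=4)
agg.tick(now=5)    # lp1's window has expired, lp2's has not
agg.tick(now=10)   # lp2's window has now expired
```

A larger `window` amortizes the fixed per-send overhead over more messages but delays delivery, which can slow the receiver or cause extra rollbacks; that tension is the trade-off the aggregation strategies in the abstract try to manage.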
Processor and network management have a great impact on the performance of distributed-memory parallel computers. Dynamic process migration allows load balancing and communication balancing at execution time. One of the problems that dynamic process migration raises is managing the communications that involve the migrating process. To study this problem, which we call the message integrity problem, we have analysed six algorithms. These algorithms have been studied by sequential simulation and have also been implemented on a parallel machine for different user process patterns in the presence of dynamic migration. To compare the algorithms, different performance parameters have been considered. The results give preliminary information about the algorithms' behaviour and have allowed us to perform an initial comparative evaluation of them.
The performance of the Chandy-Misra algorithm in distributed simulation has been studied in the context of a particular simulation application: a cellular network. The logical process structure under the algorithm is modified in such a way that the excessive synchronisation caused by the algorithm can be avoided. The synchronisation is minimised by reducing the number of connections between logical processes (LPs). The concept of a neighbourhood of an LP is defined in such a way that an LP is connected via logical channels only to those LPs that belong to its neighbourhood. A broadcast messages method is proposed to handle communication between non-connected logical processes, i.e. those outside the neighbourhood. Simulation experiments are carried out in a previously implemented distributed simulation environment, Diworse. A GSM network is used as the simulation application, where the target of the simulation is to obtain estimates for the channel utilisation. Carrier-to-interference (C/I) values for GSM channels are used for determining the need for handovers. The execution time of the simulation and the deviations in the C/I values are measured for the completely connected and broadcast messages methods in order to determine the effect of connection reduction. The results indicate that the broadcast messages method enables significantly faster simulation. With the GSM application, the proposed method has only a negligible distorting effect on the simulation.
As parallel discrete event simulation (PDES) becomes increasingly important to the solution of very large systems design problems, it becomes increasingly critical to establish whether PDES technology will scale up with increasing problem size and architecture. We address the problem in a general setting, and provide a resounding conclusion: maybe. To scale requires that the simulation model not grow in ways that defeat the ability to balance load or that overwhelm any one processor with communication. It requires an architecture that scales as well. It requires a partitioner that balances workload and exploits locality of communication. The specific partition strategy we examined is very simple; our point is not to promote its specific use. Our point is that scalability is possible using it, and hence if a more refined partitioner can balance workload and maintain locality of communication, then simulations built using it will also scale. If these conditions apply, then we demonstrate by example a simple conservative synchronization protocol, QS, that scales. Using QS, we then examine the trade-off between load imbalance and synchronization overhead. We show how to manage that trade-off efficiently by probing the space of potential restricted partitions.