This paper discusses distributed checkpointing with "Time Warp techniques", a typical uncoordinated checkpointing approach that is often used in parallel and distributed simulations. Relaxing an assumption of the previous model of Soliman et al., we present a discrete-time model in which the number of checkpoints each process can hold is finite. In addition, we propose an adaptive distributed checkpointing technique that gives an effective time arrangement of checkpoints for a recovery point distribution, and we give numerical examples.
ISBN: 9780769516080 (print)
This paper presents two new versions of the Critical Channel Traversing (CCT) algorithm. CCT is a conservative parallel discrete event simulation algorithm that has been shown to achieve very high performance when used in a wide area computer network simulator. The first of the new algorithms, called simple sender side CCT, is similar to the original, but busy waiting is eliminated. The results presented show that simple sender side CCT avoids performance problems that can be caused by busy waiting. The second new algorithm, called receive side CCT, employs a different strategy for updating channel clocks and determining which objects should be scheduled on critical channels. Performance results show that this version scales better with respect to the connectivity of the model, at the expense of some added complexity.
ISBN: 9780769516080 (print)
Of critical importance to any real-time system is the issue of predictability. We divide overall system predictability into two parts: algorithmic and systemic. Algorithmic predictability is concerned with ensuring that the parallel simulation engine and model are able, from a complexity point of view, to consistently yield results within a real-time deadline. Systemic predictability is concerned with ensuring that OS scheduling, interrupt, and virtual memory overheads are consistent over a real-time period. To provide a framework for investigating systemic predictability, we define a new class of parallel simulation called Extreme Simulation, or XSim. An XSim is any analytic parallel simulation that is able to generate a statistically valid result by a real-time deadline. Typically, this deadline is between 10 and 100 milliseconds. XSims are expected to provide decision support to existing complex, real-time systems. As a new design and implementation methodology for realizing XSims, we embed a state-of-the-art optimistic simulator into the Linux operating system. In this operating environment, OS scheduling and interrupts are disabled. Given a 50 millisecond model completion deadline, we observe that the XSim has a systemic predictability measure of 98%, compared with only 56% for the same Time Warp system operating at user level.
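The abstract does not define how the systemic predictability measure is computed. One plausible reading, assumed here purely for illustration, is the fraction of repeated runs that complete within the real-time deadline. The function name `systemic_predictability` and the sample timings are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def systemic_predictability(completion_times_ms: Sequence[float],
                            deadline_ms: float) -> float:
    """Fraction of runs that finished within the deadline.

    Assumption: this deadline-hit ratio is only one possible reading of
    the "systemic predictability measure"; the abstract does not define it.
    """
    if not completion_times_ms:
        raise ValueError("at least one run is required")
    # Count the runs whose completion time does not exceed the deadline.
    met = sum(1 for t in completion_times_ms if t <= deadline_ms)
    return met / len(completion_times_ms)


if __name__ == "__main__":
    # Hypothetical completion times (ms) for four runs, 50 ms deadline.
    runs = [40.0, 45.0, 60.0, 49.0]
    print(systemic_predictability(runs, deadline_ms=50.0))
```

Under this reading, a value of 98% would mean that 98 out of every 100 runs met the 50 millisecond deadline.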
ISBN: 9780769516080 (print)
Parallel Discrete Event Simulation (PDES) on a cluster of workstations is a fine-grained application in which communication performance can dictate the efficiency of the simulation. The high performance Local/System Area Networks used in high-end clusters are capable of delivering data with high bandwidth and low latency. Unfortunately, the communication rate far outpaces the capabilities of workstation nodes to handle it (I/O bus, memory bus, CPU resources). For this reason, many vendors are offering a programmable processor on the NIC to allow application specific optimization of the communication path. This invites a new implementation model for distributed applications where: (i) application specific communication optimizations can be implemented on the NIC; (ii) portions of the application that are most heavily communicating can be migrated to the NIC; (iii) some messages can be filtered out at the NIC without burdening the primary processor resources; and (iv) critical events are detected and handled early. The aim of our research is to investigate the utility of this model for PDES and to gain initial experience with the implementation challenges and potential performance improvement. In this paper, we present our experiences with Early Cancellation, an optimization for Time Warp that cancels messages in place upon early discovery of a rollback. We believe that there is a large scope for additional optimizations using this model.
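The idea of cancelling messages in place can be illustrated with a minimal sketch. It models the NIC send buffer as an in-memory queue and assumes that, when a logical process rolls back to virtual time `rollback_time`, every message it sent at or after that time and that is still queued can simply be dropped instead of being chased by an anti-message. The class and field names (`NicSendQueue`, `Message`, `cancel_on_rollback`) are hypothetical; the paper's actual NIC firmware interface is not described in the abstract.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List


@dataclass(frozen=True)
class Message:
    sender: int       # id of the sending logical process (LP)
    send_time: float  # virtual time at which the message was sent
    payload: str


class NicSendQueue:
    """Toy stand-in for a NIC send buffer (hypothetical, not a real NIC API)."""

    def __init__(self) -> None:
        self._queue: Deque[Message] = deque()

    def enqueue(self, msg: Message) -> None:
        self._queue.append(msg)

    def cancel_on_rollback(self, sender: int, rollback_time: float) -> int:
        """Drop queued messages invalidated by a rollback of `sender`.

        Assumption: a message is invalid if it was sent by the rolled-back
        LP at a virtual time >= rollback_time. Returns the number dropped.
        """
        kept = deque(
            m for m in self._queue
            if not (m.sender == sender and m.send_time >= rollback_time)
        )
        cancelled = len(self._queue) - len(kept)
        self._queue = kept
        return cancelled

    def pending(self) -> List[Message]:
        return list(self._queue)


if __name__ == "__main__":
    nic = NicSendQueue()
    nic.enqueue(Message(sender=1, send_time=5.0, payload="a"))
    nic.enqueue(Message(sender=1, send_time=12.0, payload="b"))
    nic.enqueue(Message(sender=2, send_time=15.0, payload="c"))
    # LP 1 rolls back to virtual time 10: only message "b" is cancelled.
    print(nic.cancel_on_rollback(sender=1, rollback_time=10.0))
    print([m.payload for m in nic.pending()])
```

Messages that have already left the queue would still need conventional anti-messages; the sketch only covers the in-place case.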
ISBN: 9780769516080 (print)
Faster-than-real-time simulation (FRTS) can be used for performance evaluation of system behavior in real time, providing significant capabilities for studying systems with time-varying behavior. FRTS enables model validation by comparing simulation results with the corresponding system observations. However, experimentation proves to be rather demanding, as both delivering output results and ensuring their reliability must be accomplished within a predetermined time frame. Output analysis of system observations and model results, and the relevant timing issues, are discussed. A method is introduced that determines whether it is possible to execute the "optimal" faster-than-real-time experiment, in which case multiple replications are scheduled for execution, or whether a compromise has to be made between the ability to predict far into the future and the degree of reliability achieved for the predictions. FRTS experimental results are also presented to support the effectiveness of the proposed method.
ISBN: 0769514448 (print)
Reconfigurable architectures represent an innovative approach to computer system design that tries to cope with the inefficiency of conventional computing systems caused by their general purpose nature. Cellular automata, in turn, are attractive computing models due to their fine grain parallelism, simple computational structures and local communication patterns. The inherently parallel cellular automata model is well suited to implementation on reconfigurable hardware architectures such as field programmable gate arrays (FPGAs), which can provide significant speedup. This paper describes the CAREM system, which provides an efficient implementation of cellular automata algorithms on FPGA systems, exploiting their reconfigurable features to execute different cellular automata rules. Its application to image processing and to forest fire simulation is presented and discussed. A performance evaluation and a comparison with different implementations of cellular automata are also presented.
ISBN: 0769514448 (print)
Clusters of workstations (COWs) are becoming increasingly popular as a cost-effective alternative to parallel computers. The in-transit buffer (ITB) mechanism can improve network performance when applied to COWs with irregular topology and source routing. This mechanism considerably improves the performance of this kind of network when compared to current source routing algorithms; however, it introduces a latency penalty. An implementation of this mechanism was performed, showing that the latency overhead of the mechanism may be noticeable, especially for short messages and at low network loads. In this paper, we analyze in detail the latency overhead of ITBs, proposing several mechanisms to reduce, hide and remove it. Firstly, we show, by simulation, the effect of an ITB implementation that is much slower than the one implemented. Then we propose three mechanisms that try to overcome the latency penalty. All the mechanisms are simple and can be easily implemented; also, they are out of the critical path of the ITB packet-processing procedure. The results show very good behaviour of the proposed mechanisms, considerably reducing or even completely removing the latency overhead.
ISBN: 9780769516080 (print)
Recently, a Checkpointing and Communication Library (CCL) to support optimistic parallel simulation on Myrinet based clusters has been presented. Beyond classical low latency message delivery functionalities, this library additionally offers CPU offloaded checkpointing functionalities based on the data transfer capabilities provided by a programmable DMA engine on board Myrinet network cards. A re-synchronization functionality is also supported for both logical (i.e. data consistency) and practical (i.e. hardware contention) reasons. It is implemented according to the following semantics: at any re-synchronization point, the simulation application is momentarily frozen until the last activated DMA based checkpoint operation is completed. If long freezing periods are experienced, the checkpointing functionalities offered by CCL might not be fully effective in reducing the real checkpointing overhead at the simulation application level. To tackle this drawback, we present an alternative semantics for re-synchronization, namely conditional checkpoint abort, which freezes the application only if at least a threshold fraction of the state vector currently being checkpointed has already been transferred into the checkpoint buffer. Otherwise, the checkpoint operation is aborted and the simulation application is immediately allowed to proceed, thus avoiding excessive checkpointing overhead (due to freezing) at the simulation application level. We also report the results of an evaluation, carried out using classical parameterized synthetic benchmarks, which show that the execution speed of the simulation application can be significantly increased by the alternative semantics we propose.
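The decision rule behind conditional checkpoint abort can be sketched as follows. This is a minimal illustration of the rule as stated in the abstract, not the CCL interface: the names `DmaCheckpoint`, `resynchronize` and `threshold` are hypothetical, and the in-flight DMA transfer is modeled as two byte counters.

```python
from dataclasses import dataclass


@dataclass
class DmaCheckpoint:
    """Hypothetical model of an in-flight DMA based checkpoint operation."""
    state_size: int         # size in bytes of the state vector being saved
    bytes_transferred: int  # bytes already copied into the checkpoint buffer

    def fraction_done(self) -> float:
        # An empty state vector is treated as already fully transferred.
        if self.state_size == 0:
            return 1.0
        return self.bytes_transferred / self.state_size


def resynchronize(ckpt: DmaCheckpoint, threshold: float) -> str:
    """Decide what happens at a re-synchronization point.

    Returns "freeze" if at least `threshold` of the state vector has been
    transferred (the application waits for the DMA to complete), otherwise
    "abort" (the checkpoint is dropped and the application proceeds).
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must lie in [0, 1]")
    if ckpt.fraction_done() >= threshold:
        return "freeze"
    return "abort"


if __name__ == "__main__":
    print(resynchronize(DmaCheckpoint(1000, 800), threshold=0.5))
    print(resynchronize(DmaCheckpoint(1000, 100), threshold=0.5))
```

In this sketch, a threshold of 0 always freezes, which corresponds to the original re-synchronization semantics described above.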
The proceedings contain 22 papers from the 15th Workshop on Parallel and Distributed Simulation. Topics discussed include: improving lookahead in parallel discrete event simulations of large-scale applications using compiler analysis; looking ahead of real time in hybrid component networks; lock-free scheduling of logical processes in parallel simulation; and the dependence list in Time Warp.
ISBN: 076951104X (print)
Conference proceedings front matter may contain various advertisements, welcome messages, committee or program information, and other miscellaneous conference information. This may in some cases also include the cover art, table of contents, copyright statements, title-page or half title-pages, blank pages, venue maps or other general information relating to the conference that was part of the original conference proceedings.