In this paper we consider the distributed simulation of queueing networks of FCFS servers with infinite buffers and irreducible Markovian routing. We first show that, for either the conservative or the optimistic synchronization protocol, the simulation of such networks can prematurely block owing to event buffer exhaustion. Buffer exhaustion can occur in the simulator whether or not the simulator is stable and, unlike in simulators of feedforward networks, cannot be prevented by interprocessor flow control alone. We propose a simple technique (which we call compactfication) which, when used in conjunction with interprocessor flow control, prevents buffer exhaustion. This leads to a general algorithm, for both conservative and optimistic synchronization, that allows one to simulate the queueing network within the finite amount of memory available at each processor. For each algorithm presented we also provide a proof that it cannot deadlock owing to buffer exhaustion.
This paper presents an analytical model for evaluating the performance of Time Warp simulators. The proposed model is formalized based on two important time components in parallel and distributed processing: computation time and communication time. The communication time is modeled by buffer access time and message transmission time. The logical processes of the Time Warp simulation and the processors executing them are assumed to be homogeneous. Performance metrics such as rollback probability, rollback distance, elapsed time and Time Warp efficiency are derived. More importantly, we also analyze the impact of cascading rollback waves on the overall Time Warp performance. By rendering the deviation in state numbers of sender-receiver pairs, we investigate the performance of the throttled Time Warp scheme. Our analytical model shows that the deviation in state numbers and the communication delay have a profound impact on Time Warp efficiency. The performance model has been validated against implementation results obtained on a Fujitsu AP3000 parallel computer. The analytical framework can be readily used to estimate performance before the Time Warp simulator is implemented.
In traditional distributed simulation schemes, the entire simulation needs to be restarted if any participating LP crashes. This is highly undesirable for long-running simulations. Some form of fault tolerance is required to minimize the wasted computation. In this paper, a rollback-based optimistic fault-tolerance scheme is integrated with an optimistic distributed simulation scheme. In rollback recovery schemes, checkpoints are periodically saved on stable storage. After a crash, these saved checkpoints are used to restart the computation. We make use of the novel insight that a failure can be modeled as a straggler event with a receive time equal to the virtual time of the last checkpoint saved on stable storage. This saves implementation effort and reduces overheads. We define stable global virtual time (SGVT) as the virtual time such that no state with a lower timestamp will ever be rolled back despite crash failures. A simple change is made in existing GVT algorithms to compute SGVT. Our use of transitive dependency tracking eliminates antimessages. LPs are grouped into clusters to minimize stable storage access time.
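The idea of treating a crash as a straggler can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `LogicalProcess` class, its fields and the integer state are hypothetical, and re-execution of undone events, dependency tracking and the SGVT computation are all omitted. It only shows that crash recovery can reuse the ordinary rollback routine, called with the virtual time of the last stable checkpoint.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalProcess:
    """Hypothetical LP with a volatile state history and one stable checkpoint."""
    lvt: float = 0.0                      # local virtual time
    state: int = 0                        # toy state: a running sum
    history: list = field(default_factory=lambda: [(0.0, 0)])  # volatile snapshots
    stable: tuple = (0.0, 0)              # last checkpoint on "stable storage"

    def execute(self, recv_time: float, delta: int) -> None:
        """Process an event; a receive time below LVT is a straggler."""
        if recv_time < self.lvt:
            self.rollback(recv_time)
        self.state += delta
        self.lvt = recv_time
        self.history.append((self.lvt, self.state))

    def checkpoint(self) -> None:
        """Save the current state to stable storage."""
        self.stable = (self.lvt, self.state)

    def rollback(self, t: float) -> None:
        """Restore the latest snapshot with timestamp <= t.
        Undone events are simply dropped here; a real simulator re-executes them."""
        self.history = [(ts, s) for ts, s in self.history if ts <= t] or [self.stable]
        self.lvt, self.state = self.history[-1]

    def crash(self) -> None:
        """A crash loses volatile history; recovery is a rollback to the
        virtual time of the last stable checkpoint (failure as straggler)."""
        self.history = [self.stable]
        self.rollback(self.stable[0])
```

In this sketch, an LP that checkpoints at virtual time 2 and crashes after processing an event at time 3 resumes at time 2 with the checkpointed state, exactly as if a straggler with receive time 2 had arrived.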
In previous papers, we have described a reduction model for computing near-perfect state information (NPSI) in support of adaptive synchronization in a parallel discrete-event simulation. Here, we report on an implementation of this model on a popular high-performance computing platform, a network of workstations, without the use of special-purpose hardware. The specific platform is a set of Pentium Pro PCs interconnected by Myrinet, a Gbps network. We describe the reduction model and its use in our Elastic Time Algorithm. We summarize our design, described in an earlier paper, and focus on the details of the implementation of this design. We present performance results that indicate that NPSI is feasible for simulations with medium to large event granularity.
ISBN (print): 0818683325
The proceedings contain 69 papers. The topics discussed include: exploiting the parallel divide-and-conquer method to solve the symmetric tridiagonal eigenproblem; biological sequence analysis on distributed-shared memory multiprocessors; scheduling tasks in DAG to heterogeneous processor system; effective scheduling in a mixed parallel and sequential computing environment; distributed computation in a three-dimensional mesh with communication delays; automatic performance evaluation of parallel programs; architecture-dependent partitioning of dependence graphs; modeling and simulation of integrated modular avionics; improving parallel computer communication: dynamic routing balancing; using channel pipelining in reconfigurable interconnection networks; and exploiting write semantics in implementing partially replicated causal objects.
In message-passing environments, the message send time is dominated by overheads that are relatively independent of the message size. Therefore, fine-grained applications (such as Time-Warp simulators) suffer high overheads because of frequent communication. In this paper, we investigate the optimization of the communication subsystem of Time-Warp simulators using dynamic message aggregation. Under this scheme, Time-Warp messages with the same destination LP that occur in close temporal proximity are dynamically aggregated and sent as a single physical message. Several aggregation strategies are developed that attempt to minimize the communication overhead without harming the progress of the simulation (because of messages being delayed). The performance of the strategies is evaluated for a network of workstations and an SMP, using a number of applications that have different communication behavior.
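As an illustration of the aggregation idea, the following Python sketch buffers messages per destination LP and sends each buffer as one physical message. The policy shown (flush when a batch is full or when its oldest message has waited too long) is an assumption chosen for simplicity, not one of the strategies evaluated in the paper; the `Message` and `Aggregator` names, the `send` callback and the `max_batch` and `max_age` parameters are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Message:
    dest_lp: int        # destination logical process
    recv_time: float    # virtual receive timestamp
    payload: str


class Aggregator:
    """Hypothetical aggregator: one buffer per destination LP, flushed when
    it holds max_batch messages or its oldest message is max_age old."""

    def __init__(self, send, max_batch=4, max_age=0.005):
        self.send = send                  # callback: send(dest_lp, list_of_messages)
        self.max_batch = max_batch
        self.max_age = max_age            # wall-clock seconds a message may wait
        self.buffers = defaultdict(list)
        self.first_enqueued = {}          # dest_lp -> wall-clock time of oldest message

    def enqueue(self, msg, now):
        """Buffer a message; flush immediately if the batch is full."""
        buf = self.buffers[msg.dest_lp]
        if not buf:
            self.first_enqueued[msg.dest_lp] = now
        buf.append(msg)
        if len(buf) >= self.max_batch:
            self.flush(msg.dest_lp)

    def tick(self, now):
        """Flush every buffer whose oldest message has waited at least max_age."""
        expired = [d for d, t in self.first_enqueued.items()
                   if now - t >= self.max_age]
        for dest in expired:
            self.flush(dest)

    def flush(self, dest):
        """Send the buffered messages for one destination as a single batch."""
        batch = self.buffers.pop(dest, [])
        self.first_enqueued.pop(dest, None)
        if batch:
            self.send(dest, batch)
```

The `max_age` bound reflects the trade-off named in the abstract: a larger window amortizes the per-message overhead over more messages, but delays delivery and can slow the progress of the receiving LP.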
ISBN (print): 9780897919548
Parallelization is a popular technique for improving the performance of discrete event simulation. Due to the complex, distributed nature of parallel simulation algorithms, debugging implemented systems is a daunting, if not impossible, task. Developers are plagued with transient errors that prove difficult to replicate and eliminate. Recently, researchers at the University of Cincinnati developed a parallel simulation kernel, WARPED, implementing a generic parallel discrete event simulator based on the Time Warp optimistic synchronization algorithm. The intent was to provide a common base from which domain-specific simulators can be developed. Due to the complexity of the Time Warp algorithm and the dependence of many simulators on the simulation kernel's correctness, a formal specification was developed and verified for critical aspects of the Time Warp system. This paper describes these specifications, their verification and their interaction with the development process.
ISBN (print): 0818683325
Mastering the increasing complexity of civil airborne equipment systems requires new architectural concepts, mainly based on modular design, generic resources and multiplexed communication buses. These new architectures, such as the Integrated Modular Avionics (IMA) architecture, rely on the definition of standardized hardware and software. However, the development of an IMA architecture requires new tools that enable the platform designer, applications developer and system integrator to describe and evaluate different implementation choices. This paper identifies and characterizes the different levels of models needed to capture essential information for performance evaluation of avionics applications integrated in IMA. Four model levels are proposed: application model, architectural model, execution model and allocation model. These different modeling levels allow the generation of a simulation model of avionics systems allocated on an IMA platform.
ISBN (print): 0818685824
This paper describes a database approach to parallel discrete event simulation. It employs a set of production rules to describe the behavior of active objects in a simulation system so that production rules can be merged and evaluated collectively in a rule network. To maintain correctness and exploit all possible parallelism, each token is time-stamped and can be processed asynchronously. An object relational database is employed to allow simulators located at different sites of an Intranet to communicate with each other. A dynamic, object-relational query tool is provided for the user to interact with the simulation system.
ISBN (print): 0818683651
The proceedings contain 18 papers. The topics discussed include: scheduling resources in multi-user, heterogeneous, computing environments with SmartNet; the Globus project: a status report; Netsolve: a network-enabled solver; examples and users; implementing distributed synthetic forces simulations in metacomputing environments; CCS resource management in networked HPC systems; a dynamic matching and scheduling algorithm for heterogeneous computing systems; dynamic, competitive scheduling of multiple DAGS in a distributed heterogeneous environment; the relative performance of various mapping algorithms is independent of sizable variances in run-time predictions; modeling the slowdown of data-parallel applications in homogeneous and heterogeneous clusters of workstations; specification and control of cooperative work in a heterogeneous computing environment; a mathematical model, heuristic, and simulation study for a basic data staging problem in a heterogeneous networking environment; modular heterogeneous system development: a critical analysis of java; fault-tolerance: Java's missing buzzword; heterogeneous parallel computing with Java: jabber or justified?; on the interaction between mobile processes and objects; steps toward understanding performance in Java; and heterogeneous programming with Java: gourmet blend or just a hill of beans?