ISBN: (print) 9781565550278
The event horizon is an important concept for both parallel and sequential discrete-event simulations. By exploiting the event horizon, parallel simulations can process events in a manner that is risk-free (i.e., no antimessages) in adaptable “breathing” time cycles with variable time widths. Additionally, exploiting the event horizon can greatly reduce the event list management overhead that is common to virtually all discrete-event simulations. This paper develops an analytic model describing the event horizon from first principles using equilibrium considerations and the hold model (where each event, when consumed, generates a single new event with future-time statistics described by a known probability function). Exponential and Beta density functions are used to verify the mathematics presented in this paper.
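The hold model described above can be sketched in a few lines. This is an illustrative sketch only, not the paper's analytic model: it assumes the event horizon of a cycle is the earliest time stamp among the events generated during that cycle, and the names `event_horizon_cycles` and `draw_delay` are hypothetical.

```python
import heapq
import random


def event_horizon_cycles(n_events, draw_delay, n_cycles, seed=0):
    """Simulate the hold model and return the number of events per cycle.

    Assumption (not from the abstract): a cycle processes pending events in
    time order until the next pending time stamp exceeds the event horizon,
    i.e. the earliest time stamp generated so far in that cycle.
    """
    rng = random.Random(seed)
    # Initial population: one pending event per "slot" in the hold model.
    pending = [draw_delay(rng) for _ in range(n_events)]
    heapq.heapify(pending)

    cycle_sizes = []
    for _ in range(n_cycles):
        new_events = []          # events generated in this cycle, held back
        horizon = float("inf")   # event horizon of the current cycle
        processed = 0
        while pending and pending[0] <= horizon:
            t = heapq.heappop(pending)
            t_new = t + draw_delay(rng)   # each event generates one new event
            new_events.append(t_new)
            horizon = min(horizon, t_new)
            processed += 1
        # Merge the held-back events only at the end of the cycle.
        for t_new in new_events:
            heapq.heappush(pending, t_new)
        cycle_sizes.append(processed)
    return cycle_sizes


if __name__ == "__main__":
    sizes = event_horizon_cycles(1000, lambda rng: rng.expovariate(1.0), 200)
    print("mean events per cycle:", sum(sizes) / len(sizes))
```

Because new events are merged into the pending list once per cycle rather than once per event, the sketch also hints at why the event horizon can reduce event list management overhead.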
Can parallel simulations efficiently exploit a network of workstations? Why haven't PDES models followed standard modeling methodologies? Will the field of PDES survive, and if so, in what form? Researchers in the PDES field have addressed these questions and others in a series of papers published in the last few years [1,2,3,4]. The purpose of this paper is to shed light on these questions, by documenting an actual case study of the development of an optimistically synchronized PDES application on a network of workstations. This paper is unique in that its focus is not necessarily on performance, but on the whole process of developing a model, from the physical system being simulated, through its conceptual design, validation, implementation, and, of course, its performance. This paper also presents the first reported performance results indicating the impact of risk on performance. The results suggest that the optimal value of risk is sensitive to the latency parameters of the communications network.
The proceedings contain 76 articles. Topics discussed include systems, networking, distributed simulation, queueing systems, multiprocessor architecture, modeling techniques, parallel systems, tools, processors, network and system simulation, optimizing parallel programs, Petri nets, neural networks and genetic algorithms, real-time systems, and systems modelling.
We simulate ballistic particle deposition wherein a large number of spherical particles are 'dropped' vertically over a planar horizontal surface. Upon first contact (with the surface or with a previously deposited particle) each particle stops. This model helps materials scientists study adsorption and sediment formation [1]. The model is sequential, with particles deposited one by one. We have found an equivalent formulation using a continuous-time random process, and we simulate the latter in parallel using a method similar to the one previously employed for simulating Ising spins [2]. We augment the parallel algorithm for simulating Ising spins with several techniques aimed at making the production of the particle configuration and the collection of statistics more efficient. Some of these techniques are similar to [3], [4], and [5]. We implement the resulting algorithm on a 16K PE MasPar MP-1 and a 4K PE MasPar MP-2. The parallel code runs on MasPar computers two orders of magnitude faster than an optimized sequential code runs on a fast workstation.
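The sequential deposition rule (drop vertically, stop at first contact) can be illustrated with a simplified sketch. It is not the authors' code: it swaps the spheres over a plane for equal-radius disks over a line, uses a naive quadratic search, and the name `deposit_disks` and its parameters are hypothetical.

```python
import math
import random


def deposit_disks(n, width, radius=0.5, seed=0):
    """Sequentially drop n equal disks over the segment [0, width].

    Simplifying assumption: a 2D analogue (disks over a line) of the
    spheres-over-a-plane model. Each disk falls vertically and stops at
    its first contact with the floor or with an earlier disk.
    """
    rng = random.Random(seed)
    d = 2.0 * radius          # centre-to-centre distance at contact
    disks = []                # list of (x, y) centres
    for _ in range(n):
        x = rng.uniform(0.0, width)
        y = radius            # resting height if only the floor is hit
        for xj, yj in disks:
            dx = abs(x - xj)
            if dx < d:
                # Height at which the falling disk would touch disk j;
                # the first contact while descending is the highest one.
                y = max(y, yj + math.sqrt(d * d - dx * dx))
        disks.append((x, y))
    return disks


if __name__ == "__main__":
    deposit = deposit_disks(500, width=20.0)
    print("deposit height:", max(y for _, y in deposit))
```

Each particle depends on all earlier ones, which is what makes the one-by-one formulation inherently sequential.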
ISBN: (print) 9781565550278
This paper examines the cost/performance of simulating a hypothetical target parallel computer using a commercial host parallel computer. We address the question of whether parallel simulation is simply faster than sequential simulation, or if it is also more cost-effective. To answer this, we develop a performance model of the Wisconsin Wind Tunnel (WWT), a system that simulates cache-coherent shared-memory machines on a message-passing Thinking Machines CM-5. The performance model uses Kruskal and Weiss's fork-join model to account for the effect of event processing time variability on WWT's conservative fixed-window simulation algorithm. A generalization of Thiebaut and Stone's footprint model accurately predicts the effect of cache interference on the CM-5. The model is calibrated using parameters extracted from a fully parallel simulation (p=N), and validated by measuring the speedup as the number of processors (p) ranges from one to the number of target nodes (N). Together with simple cost models, the performance model indicates that for target system sizes of 32 nodes and larger, parallel simulation is more cost-effective than sequential simulation. The key intuition behind this result is that large simulations require large memories, which dominate the cost of a uniprocessor; parallel computers allow multiple processors to simultaneously access this large memory.
Compared to highly optimized optimistic simulators which use local event queues for individual processors on a shared-memory computer, we demonstrate that employing a single global event queue drastically reduces the number of rollbacks, brings down the storage requirements, and achieves superior load balance. On a bus-based Silicon Graphics multiprocessor, these virtues consistently translated into faster execution times and higher speedups on those synthetic networks of medium- to coarse-grained logical processes which were ridden with rollbacks and load imbalance on local-queue-based simulators. A dynamic randomization-based load distribution scheme for local-event-queue simulators is also shown to be an effective improvement.
Optimistic computation methods typically save copies of objects' state information, so that they can recover from erroneous 'over-optimistic' computations. Such state saving is generally time and space consuming, and can be rather complicated both to implement and to use. I show how the data structure community's theory of persistence can be used not only to analyse and explain the treatment of state in optimistic systems, but also as a simple yet general mechanism for performing the necessary state saving with minimal impact on application code. Preliminary results based on a benchmark application and an existing optimistic simulator are presented, showing that providing support for fully general object states is a realistic and practical option. In addition, I show how some existing state saving techniques - including support for shared state - can be derived, and discuss a number of ways in which the model might be extended.
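The idea of using persistence for state saving can be sketched as follows. This is a minimal illustration, not the mechanism from the paper: it assumes a path-copying binary search tree as the persistent structure, and the names `Node`, `put`, `get`, and `VersionedState` are hypothetical.

```python
from typing import Any, NamedTuple, Optional


class Node(NamedTuple):
    """Immutable node of a persistent (path-copying) binary search tree."""
    key: str
    value: Any
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def put(root, key, value):
    """Return a new tree; only nodes on the search path are copied."""
    if root is None:
        return Node(key, value)
    if key < root.key:
        return root._replace(left=put(root.left, key, value))
    if key > root.key:
        return root._replace(right=put(root.right, key, value))
    return root._replace(value=value)


def get(root, key, default=None):
    """Look up key in the tree rooted at root."""
    while root is not None:
        if key == root.key:
            return root.value
        root = root.left if key < root.key else root.right
    return default


class VersionedState:
    """Object state whose old versions stay reachable for rollback."""

    def __init__(self):
        # Each entry is (virtual time of the write, tree root).
        self.versions = [(float("-inf"), None)]

    def write(self, time, key, value):
        _, root = self.versions[-1]
        self.versions.append((time, put(root, key, value)))

    def read(self, key, default=None):
        return get(self.versions[-1][1], key, default)

    def rollback(self, time):
        """Discard every version written at a virtual time later than time."""
        while self.versions[-1][0] > time:
            self.versions.pop()
```

A write never mutates an earlier version, so recovering from an over-optimistic computation amounts to dropping the newer roots; unchanged subtrees are shared between versions rather than copied.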
This paper describes an extension of the TNE algorithm, the objective of which is to increase its parallelism and to break the inter-processor deadlocks inherent in the use of TNE. The algorithm, which we call the SGTNE algorithm (Semi Global TNE), is executed over a cluster of processors, as opposed to TNE, which is executed over a cluster of processes assigned to a single processor. SGTNE helps to break the inter-processor deadlocks by executing a shortest path algorithm over a snapshot of the LPs in a cluster of processors. This paper discusses the algorithm and its implementation and reports on the performance results of simulations of a partitioned FCFS queueing network model executed on the Intel Paragon A4 multiprocessor machine. We also examine the impact of partitioning on the efficient implementation of the SGTNE algorithm. The results obtained indicate that SGTNE yields good speedups and that a partitioning which makes use of a strongly connected component algorithm reduces the running time of a simulation by 30% compared to simple partitioning strategies. The results also indicate that SGTNE outperforms TNE.
An implementation of a conservative parallel simulator with deadlock avoidance is presented. Its performance when working with a realistic model of a message routing network is evaluated and contrasted against a seque...
The design of a specialized computer architecture for qualitative simulation is presented. Our interest focuses on the hardware design of an application-specific computer architecture which is composed of programmable...