Many simulations in the natural sciences and engineering require the numerical solution of nonlinear differential equations. For this class of numerical methods, we propose an appropriate parallel computation model on distributed memory machines that supports the prediction of execution times. As a case study, we investigate the parallel implementation of the diagonal-implicitly iterated Runge-Kutta method, a solution method for stiff systems of ordinary differential equations. An implementation on the Intel iPSC/860 confirms the accuracy of the prediction model.
In a distributed memory environment the communication overhead of Time Warp is the dominating performance factor. In order to limit the optimism to the extent that can be justified from the inherent model parallelism, an optimism control mechanism is proposed. After investigating statistical forecast methods, it is shown that arrival processes in the context of Time Warp simulations of timed Petri nets have certain predictable and consistent ARIMA characteristics.
ISBN: 0769518532 (print)
In this paper we discuss and compare three different causality inconsistency tracking mechanisms in support of preemptive rollback in optimistic parallel simulation on Myrinet clusters. These mechanisms exhibit different communication/processing overhead and also different effectiveness in revealing causality inconsistency of the currently executed simulation event. From the results of an empirical study on a classical simulation benchmark, we have found some trade-offs between these mechanisms, which indicate the application contexts for which each mechanism is expected to be well suited.
ISBN: 9780769528984 (print)
We examine a parallel processing method for simulations of large-scale networks with a hybrid traffic representation combining both a time-stepped fluid model and a discrete-event packet-oriented model. This method benefits from the observation that the time it takes to propagate fluid characteristics along the path taken by the traffic flows has a lower bound equal to the minimum link delay, as manifested by the governing ordinary differential equations (ODEs). A better lookahead can thus be used to allow parallel simulation of the hybrid model to run without more synchronization overhead than the corresponding discrete-event packet-oriented model. We derive an analytical model comparing the fluid model and the packet-oriented model for both sequential and parallel simulations. We demonstrate the benefit of the parallel hybrid model through a series of simulation experiments of a large-scale network consisting of over 170,000 hosts and 1.6 million traffic flows on a small parallel cluster.
ISBN: 9780769537139 (print)
SystemC is a system-level modeling language and simulation framework that facilitates the design and verification of processor designs at different levels. SystemC has recently become a popular choice for designers of both System-on-Chip (SoC) and embedded processors, due to its adaptability at cycle as well as transaction levels and its ability to model concurrent processes. However, the single-threaded simulation kernel inherent to SystemC prevents it from utilizing the potential computing power of symmetric multiprocessing (SMP) machines to speed up hardware simulation. We present a parallel SystemC simulation kernel, which is implemented using parallel programming techniques and leverages the parallel execution capabilities of multi-core machines to speed up hardware simulation. We discuss the mechanism we use for mapping parallel SystemC modules onto different cores. Finally, we report the performance of the parallelized SystemC kernel using a linear pipelined performance model and a pipelined performance model tailored to exhibit the behavior of real-world simulation. Our results demonstrate that the performance improvement obtained by using parallelized SystemC for simulation of the above models is significant and improves with increasing design complexity of the simulated design and the number of cores in the machine running the simulators.
ISBN: 9780769531595 (print)
Lookahead is a critical factor in conservative parallel simulation. Greater lookahead usually brings better performance. However, in the simulation of computer networks, lookahead is usually determined by the minimal delay of the border links between any two subnets that are simulated by different sequential logical processes (LPs), which is too small to achieve good performance. Traditionally, lookahead exploitation reflects only the parallelism among LPs, which may waste the potential parallelism inside each LP, especially when each LP simulates thousands of entities. Here we present a simple method called micro-synchronization to exploit the parallelism inside each LP. Unlike previous work such as lookahead accumulation and local time warp, we keep the traditional usage of lookahead among LPs unchanged; however, we impose relaxed sequential event scheduling inside each LP, which can indirectly improve the lookahead. We also present a state causality model to prove the correctness of our method, which means that there is no risk in the relaxed sequential execution. Finally, experiments evaluate our method and show that it can improve the performance of conservative parallel simulation of computer networks to some extent.
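As a rough illustration of the traditional inter-LP use of lookahead that the abstract above takes as its starting point (not the micro-synchronization method itself), the following Python sketch computes how far one LP may safely advance. The function names `safe_bound` and `safe_events`, the dictionary layout, and the sample numbers are all hypothetical and chosen only for this example.

```python
from math import inf


def safe_bound(neighbor_clocks, link_delays):
    """Lower bound on the timestamp of any message that can still arrive.

    neighbor_clocks: neighbor LP name -> its current simulation time.
    link_delays: neighbor LP name -> minimal delay of the border link
                 from that neighbor (the lookahead it guarantees).
    Both layouts are assumptions made for this sketch.
    """
    return min(
        (neighbor_clocks[n] + link_delays[n] for n in neighbor_clocks),
        default=inf,  # an LP with no neighbors is never blocked
    )


def safe_events(pending_timestamps, bound):
    """Timestamps of local events that can be processed without risk.

    An event is safe only if it is strictly earlier than the bound,
    because a message stamped exactly at the bound may still arrive.
    """
    return sorted(t for t in pending_timestamps if t < bound)


# Hypothetical example: two neighbors with different border-link delays.
clocks = {"A": 10.0, "B": 12.0}
delays = {"A": 0.5, "B": 2.0}
bound = safe_bound(clocks, delays)
ready = safe_events([10.1, 10.4, 10.5, 11.0], bound)
```

In this sketch the small border-link delay to neighbor A limits the safe window, which matches the abstract's remark that a lookahead set by minimal border-link delays can be too small for good performance.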
This paper presents the results of an experimental study to evaluate the effectiveness of multiple synchronization protocols and partitioning algorithms in reducing the execution time of switch-level models of VLSI circuits. Specific contributions of this paper include: (i) parallelizing an existing switch-level simulator such that the model can be executed using conservative and optimistic simulation protocols with minor changes, (ii) evaluating the effectiveness of several partitioning algorithms for parallel simulation, and (iii) demonstrating speedups with both conservative and optimistic simulation protocols for seven circuits, ranging in size from 3 K transistors to about 87 K transistors.
ISBN: 0818684577 (print)
With the emergence of broadband networks based on ATM technology, performance evaluation tools that allow the study of large systems are desperately needed. In this article we present our experiments in distributed simulation of large and complex ATM network models with a conservative simulator. The goal here was not to achieve the maximum speedup with well-shaped topologies but rather to see what speedup can be obtained with a realistic model on a "state-of-the-art" parallel computer. A network model with 78 switches is simulated on a Cray T3E using three different traffic loads. The performance results show that good speedups can be achieved but also highlight partitioning problems and bottlenecks in the simulation model that can seriously limit the speedup of realistic model simulations.
ISBN: 0769519709 (print)
Large-scale parallel discrete event simulations of massive networks, such as the Internet, are "Grand Challenge" problems: packet-level simulation of even a small fraction of the Internet would consume the resources of the most powerful computers available. We reimplement the Scalable Simulation Framework (SSF) so we can run large-scale network simulations originally written for DaSSF. Our implementation, CraySSF, is designed for the Cray-MTA, a multi-threaded supercomputer architecture developed specifically to address large-scale computations of the kind that are not easily distributed. This paper describes the architecture, implementation issues, and preliminary performance results on a variety of (stock) serial and parallel architectures.
Advances in massively parallel platforms are increasing the prospects for high-performance discrete event simulation. Still, the difficulty of parallel programming persists, and there is increasing demand for high-level support for building discrete event models to execute on such platforms. We present a parallel DEVS-based (Discrete Event System Specification) simulation environment that can execute on distributed memory multicomputer systems, with benchmarking results for a class of high-resolution, large-scale ecosystem models. Underlying the environment is a parallel container class library that hides the details of message passing technology while providing high-level abstractions for hierarchical, modular DEVS models. The C++ implementation working on the Thinking Machines CM-5 demonstrates that the desire for high-level modeling support need not be irreconcilable with sustained high performance.