ISBN: 0769522947 (print)
We describe a model for an interdisciplinary course in scientific modeling and simulation. We discuss the course structure and content, as well as the results of our evaluation process.
ISBN: 0769523838 (print)
Earlier work proposed cloning as a means of efficiently splitting a running simulation midway through its execution into multiple parallel simulations. In simulation cloning, clones can usually share computations that occur early in the simulation, but as their states diverge, individual logical processes (LPs) are replicated as necessary so that their computations proceed independently. However, if the states of the clones (or of their constituent LPs) converge over time, there is as yet no means of recombining them. In this case some efficiency is lost because the clones execute identical events. Merging is the reverse of cloning: we merge LPs that were previously cloned, and we show that this can further increase efficiency because the new, uncloned LPs complete computations that would otherwise be duplicated. We discuss our implementation of merging and illustrate its effectiveness in several example simulation scenarios.
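The merging idea in this abstract can be illustrated with a minimal sketch. Everything here is hypothetical: the `LogicalProcess` class, its fields, and the convergence test (identical state and identical pending events) are assumptions made for illustration, not the authors' implementation, which the abstract does not detail.

```python
from dataclasses import dataclass


@dataclass
class LogicalProcess:
    """Hypothetical LP record; field names are assumptions, not from the paper."""
    state: dict           # LP state variables
    pending_events: list  # (timestamp, event_name) tuples not yet executed
    clone_ids: set        # clones that currently share this LP


def can_merge(a: LogicalProcess, b: LogicalProcess) -> bool:
    # Assumed convergence test: same state and same future event set,
    # so both copies would go on to execute identical events.
    return a.state == b.state and sorted(a.pending_events) == sorted(b.pending_events)


def merge(a: LogicalProcess, b: LogicalProcess) -> LogicalProcess:
    """Recombine two converged copies into one LP shared by both clones."""
    if not can_merge(a, b):
        raise ValueError("LPs have not converged; they cannot be merged")
    return LogicalProcess(
        state=dict(a.state),
        pending_events=sorted(a.pending_events),
        clone_ids=a.clone_ids | b.clone_ids,
    )


lp1 = LogicalProcess({"queue_len": 3}, [(10.0, "arrive")], {1})
lp2 = LogicalProcess({"queue_len": 3}, [(10.0, "arrive")], {2})
shared = merge(lp1, lp2)  # one LP now serves clones 1 and 2
```

After the merge, the single shared LP executes each event once on behalf of both clones, which is where the saving described in the abstract would come from.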
ISBN: 0769523838 (print)
Simulation of large-scale networks requires enormous amounts of memory and processing time. One way of speeding up these simulations is to distribute the model over a number of connected workstations. However, this introduces inefficiencies caused by the need for synchronization and message passing between machines. In distributed network simulation, one of the factors affecting message-passing overhead is the amount of cross-traffic between machines. We perform an independent benchmark of the Parallel/Distributed Network Simulator (PDNS), based on experimental results obtained on the Australian Centre for Advanced Computing and Communications (AC3) supercomputing cluster. We measure the effect of cross-traffic on the wall-clock time needed to complete a simulation for a set of basic network topologies, comparing it with the wall-clock time needed on a single processor. Our results show that although efficiency is reduced with large amounts of cross-traffic, speedup can still be achieved with PDNS. From these results, we developed a performance model that can be used as a guideline for designing future simulations.
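The comparison described in this abstract rests on the usual definitions of speedup and parallel efficiency. The sketch below uses those standard textbook definitions; the function names and the sample timings are illustrative assumptions and are not results from the paper.

```python
def speedup(t_single: float, t_distributed: float) -> float:
    """Single-processor wall-clock time divided by distributed wall-clock time."""
    if t_single <= 0 or t_distributed <= 0:
        raise ValueError("wall-clock times must be positive")
    return t_single / t_distributed


def efficiency(t_single: float, t_distributed: float, n_procs: int) -> float:
    """Speedup per processor; 1.0 would mean ideal linear scaling."""
    if n_procs < 1:
        raise ValueError("n_procs must be at least 1")
    return speedup(t_single, t_distributed) / n_procs


# Made-up timings: 120 s on one processor, 40 s on four machines.
s = speedup(120.0, 40.0)        # 3.0
e = efficiency(120.0, 40.0, 4)  # 0.75
```

A run can therefore show reduced efficiency (below 1.0) while still achieving speedup (above 1.0), which is the situation the abstract reports for heavy cross-traffic.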
ISBN: 0769524478 (print)
In this work we illustrate the design and implementation guidelines of a recently developed middleware that supports the parallel and distributed simulation of large-scale, complex and dynamically interacting system models. The distributed simulation of complex system models may suffer from the communication and synchronization required to maintain the causality constraints between distributed model components. We designed and implemented the ARTIS middleware as a new framework incorporating a set of features that allow adaptive optimization by exploiting many complex and dynamic characteristics of the model and of the distributed simulation. As an example, a dynamic migration mechanism for the run-time adaptive allocation of model entities has been designed and exploited for dynamic load and communication balancing. Optimizations have been introduced to obtain the maximum advantage from heterogeneous and asymmetric communication systems, from shared memory to LAN and Internet communication. Other optimizations exploit concurrent replications of parallel and distributed simulations in order to increase resource utilization and to maximize the speedup of simulation processes. Solutions have been designed, implemented and tuned to obtain a significant reduction in the communication and synchronization overheads between the physical execution units, as well as increased model scalability and simulation speedup, even under worst-case modeling assumptions and simulation scenarios.
ISBN: 0769523838 (print)
In this paper we introduce a new concept, network atomic operations (NAOs), to create a zero-cost consistent cut. Using NAOs, we define a wall-clock-time-driven GVT algorithm called Seven O'Clock that is an extension of Fujimoto's shared-memory GVT algorithm. Using this new GVT algorithm, we report good optimistic parallel performance on a cluster of state-of-the-art Itanium-II quad-processor systems, both for benchmark applications such as PHOLD and for real-world applications such as a large-scale TCP/Internet model. In some cases, super-linear speedup is observed.
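For readers unfamiliar with GVT (global virtual time), the quantity any such algorithm must compute over a consistent cut is the minimum of every process's local virtual time and the timestamps of all messages still in transit. The sketch below shows only that standard definition; it is not the Seven O'Clock algorithm, and the function and parameter names are assumptions.

```python
from typing import Iterable


def gvt(local_virtual_times: Iterable[float],
        in_transit_timestamps: Iterable[float]) -> float:
    """Standard GVT definition over a consistent cut: the smallest timestamp
    among all LP clocks and all sent-but-unreceived messages."""
    candidates = list(local_virtual_times) + list(in_transit_timestamps)
    if not candidates:
        raise ValueError("need at least one LP clock or in-transit message")
    return min(candidates)


# An in-transit message stamped 8.0 holds GVT below every LP clock.
bound = gvt([12.0, 9.5, 15.0], [8.0])
```

No event with a timestamp below this bound can ever be rolled back, so an optimistic simulator can safely reclaim the memory and commit the I/O associated with such events.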
ISBN: 0769523838 (print)
In this paper, a new event scheduling mechanism, XEQ, and a new rollback procedure, rb-messages, are proposed for use in optimistic logic simulation. We incorporate both techniques in a simulator, XTW. XTW groups LPs into clusters and uses a multi-level queue, XEQ, to schedule events within the cluster. XEQ has O(1) event scheduling time complexity. Our new rollback mechanism replaces anti-messages with an rb-message and eliminates the need for an output queue at each LP. Experimental comparisons with Time Warp reveal superior performance on the part of XTW, while experimental results on large circuits (5 million to 25 million gates) show that XTW scales well with both the size of the circuits and the number of processors.
Efficient computer simulation of complex physical phenomena has long been challenging due to their multi-physics and multi-scale nature. In contrast to traditional time-stepped execution methods, we describe an approa...
Parallelising sequential discrete event simulation programs is often a tedious process, with no guarantee of speedup. This paper describes a performance analyser tool developed to predict the execution performance of...
This paper provides an overview of the WarpIV Simulation Kernel that was designed to be an initial implementation of the Standard Simulation Architecture (SSA). WarpIV is the next generation replacement for the Synchr...
ISBN: 1595930876 (print)
This paper describes a method for evolutionary component-based development of families of parallel programs to attain performance goals on multiple execution environments for multiple family instances, together with an implementation of the method. The method combines component-oriented development with the integration of parallel/distributed execution and parallel/distributed simulation. Each component may have multiple representations at multiple levels of realization, from analytical timing models to production code. Each component is encapsulated with an associative interface specifying its properties and behaviors, which makes it possible to distinguish among different implementations (or abstractions) of the same functional behavior that may have different performance behavior. Evolutionary development evolves a program from an abstract performance model to a complete program and may continue the evolution during runtime. Performance can be estimated at any stage of realization. The implementation consists of a compiler, which composes parallel/distributed programs from components encapsulated with associative interfaces, and a runtime system, which supports integrated execution/simulation of parallel programs composed from components at different levels of abstraction, as well as program evolution at runtime by component replacement. Case studies in the application of the evolutionary development method, including performance results, are given. Copyright 2005 ACM.