A key tool of computer architects is computer simulation at the level of detail that can execute program executables. The time and memory requirements of such simulations can be enormous, especially when the machine u...
详细信息
ISBN:
(纸本)9780818675393
A key tool of computer architects is computer simulation at the level of detail that can execute program executables. The time and memory requirements of such simulations can be enormous, especially when the machine under design-the target-is a parallel machine. Thus, it is attractive to use parallelsimulation, as successfully demonstrated by the Wisconsin Wind Tunnel (WWT). WWT uses a conservative simulation algorithm and eschews network simulation to make lookahead adequate. Nevertheless, we find most of WWT's slowdown to be due to the synchronization overhead in the conservative simulation algorithm. This paper examines the use of optimistic algorithms to perform parallelsimulations of parallel machines. We first show that we can make optimistic algorithms work correctly even with WWT's direct execution of program executables. We checkpoint processor registers (integer, floating-point, and condition codes) and use executable editing to log the value of memory words just before they are overwritten by stores. Second, we consider the performance of two optimistic algorithms. The first executes programs optimistically, but performs protocol events (e.g., sending messages) conservatively. The second executes everything optimistically and is similar to Time Warp with lazy message cancellation. Unfortunately, both approaches make parallelsimulation performance worse for the default WWT assumptions. We conclude by speculating on the performance of optimistic simulation when simulating (1) target network details, and (2) on hosts with high message latencies and no synchronization hardware.
One of the promises of parallelized discrete-event simulation is that it might provide significant speedups over sequential simulation. In reality, high performance cannot be achieved unless the system is fine-tuned t...
详细信息
ISBN:
(纸本)9780818675393
One of the promises of parallelized discrete-event simulation is that it might provide significant speedups over sequential simulation. In reality, high performance cannot be achieved unless the system is fine-tuned to balance computation, communication, and synchronization requirements. In this paper, we discuss our experiments in automated load balancing using the SPEEDES simulation framework. Specifically, we examine three mapping algorithms that use run-time measurements. Using simulation models of queuing networks and the National Airspace System, we investigate (i) the use of run-time data to guide mapping, (ii) the utility of considering communication costs in a mapping algorithm, (iii) the degree to which computational ``hot-spots'' ought to be broken up in the linearization, and (iv) the relative execution costs of the different algorithms. We compare the performance of the three algorithms using results from the Intel Paragon.
The partitioning of complex processor models on the gate and register-transfer level for parallel functional simulation based on the clock-cycle algorithm is considered. We introduce a hierarchical partitioning scheme...
详细信息
ISBN:
(纸本)9780818675393
The partitioning of complex processor models on the gate and register-transfer level for parallel functional simulation based on the clock-cycle algorithm is considered. We introduce a hierarchical partitioning scheme combining various partitioning algorithms in the frame of a competing strategy. Melting together different partitioning results within one level using superpositions we crossover to a mixture of experts one. This approach is improved applying genetic algorithms. In addition we present two new partitioning algorithms both of them taking cones as fundamental units for building partitions.
With increasing sophistication of parallel tool development technologies, runtime data collection and management activities are receiving greater attention from tool developers. We define an instrumentation system (IS...
详细信息
With increasing sophistication of parallel tool development technologies, runtime data collection and management activities are receiving greater attention from tool developers. We define an instrumentation system (IS) as a set of modules and functions that collect runtime data from concurrent processes and support an integrated tool environment by sharing this data along with specific control information. Presently, it is not standard practice to formally evaluate the performance and functionality of a tool early in its development. Usability and efficiency studies of prototypical tools are emerging to alleviate this situation. However; the underlying IS is removed from the end-user and is part of system infrastructure, thus necessitating more rigorous evaluation. We are developing a structured approach to plan, design, model, realize, and test an IS, called the Vista IS, to appropriately address these issues.
The event horizon is a very important concept that applies to both parallel and sequential discrete-event simulations. By exploiting the event horizon, parallelsimulations can processes events optimistically in a ris...
详细信息
ISBN:
(纸本)9780818675393
The event horizon is a very important concept that applies to both parallel and sequential discrete-event simulations. By exploiting the event horizon, parallelsimulations can processes events optimistically in a risk-free manner (i.e., without requiring antimessages) using adaptable "breathing" time cycles with variable time widths. Additionally, exploiting the event horizon can significantly reduce the overhead of event list management that is common to virtually all discrete-event simulations. This paper is a continuation of work previously reported at PADS94. In that report, a complete mathematical formulation of the event horizon was derived under equilibrium conditions using the hold model. Various forms of the beta density function were consequently used to verify the predicted results of the analytic model. This second report describes how the concept of the event horizon can also be applied to event list management. By exploiting the event horizon, the performance of several priority queue data structures are improved including: linked lists, various binary trees, and heaps. A somewhat detailed description of these modified data structures along with other relevant background information is provided for completeness. Performance results for each of these priority queue data structure is provided.
We simulate models of ATM communication systems on a massively parallel SIMD computer. Fast simulations of ATM models are needed because the regimes of interest usually involve high volumes of traffic and low failure ...
详细信息
ISBN:
(纸本)9780818675393
We simulate models of ATM communication systems on a massively parallel SIMD computer. Fast simulations of ATM models are needed because the regimes of interest usually involve high volumes of traffic and low failure rates. Unexpected practical and theoretical difficulties, partly due to the massive parallelism and SIMD aspects, were encountered and we show how to cope with them. In a replica-parallelsimulation of an ATM system, large variations in computed statistics are caused by small differences in the distribution of employed random number generators. A comparison of these distributions using a secondary statistical measure served to disambiguate the results. It was also found that time-parallelsimulations of ATM systems with Markov sources can be efficiently performed using parallel prefix methods only when the sources have a small number of states, while more complex sources require end-state matching for efficient simulation. We discovered that, with the proper choice of initial state distributions and partial regeneration points, the time and memory requirements can be much improved. Our simulations were carried out on the MasPar MP-1216 system with 16,384 processors, which was compared against an SGI workstation. We achieved about 60%-70% efficiency (speed-up of approx 35 compared to the ideal of approx 51).
We present an Incremental State Saving technique for which the state saving calls are inserted automatically by directly editing the application executable. This method has the advantage of being easy to use since it ...
详细信息
ISBN:
(纸本)9780818675393
We present an Incremental State Saving technique for which the state saving calls are inserted automatically by directly editing the application executable. This method has the advantage of being easy to use since it is fully automatic, and has good performance since it adds overhead only where state is being modified. Since the editing happens on executable code, the method is independent of the compiler, and allows third party libraries to be used. None of the previous incremental state saving methods have both of these features. We find that it is beneficial to use Automatic Incremental State Saving if less than 15% of the state is modified in each event as compared to copy state saving. This technique allows us to efficiently parallelize existing simulations, and makes Time Warp more accessible to non-Time Warp experts.
In this paper, we present two new methods to simulate Petri Nets: a data parallelsimulation and a distributedsimulation. Both simulations use an equational representation of the net in the so called (min, +) algebra...
详细信息
In this paper, we present two new methods to simulate Petri Nets: a data parallelsimulation and a distributedsimulation. Both simulations use an equational representation of the net in the so called (min, +) algebra. The data parallelsimulation is based on the use of matrix representation of these equations, and the distributedsimulation on the decomposition of a Petri Net into marked graph components.
暂无评论