This paper presents the results of an experimental study to evaluate the effectiveness of multiple synchronization protocols and partitioning algorithms in reducing the execution time of switch-level models of VLSI circuits. Specific contributions of this paper include: parallelizing an existing switch-level simulator such that the model can be executed using conservative and optimistic simulation protocols with minor changes; evaluating the effectiveness of several partitioning algorithms for parallel simulation; and demonstrating speedups with both conservative and optimistic simulation protocols for seven circuits, ranging in size from 3 K transistors to about 87 K transistors.
Performance analysis of distributed systems may be performed on different levels of abstraction. An accurate but time-consuming method is the construction of a simulation model which includes the different subsystems, the communication system, and the load profile. In particular, this approach seems to be very powerful for real-time systems because of the inherent possibility of precise calculations of delays and processing times. A VHDL-based approach is presented which supports the performance analysis of mixed discrete-continuous distributed systems.
The problem of executing sequential programs in parallel using the optimistic algorithm Time Warp is considered. This is done by first mapping the sequential execution to a control tree and then assigning timestamps to each node in the tree. For such timestamps to be effective in either hardware or software they must be finite; this implies that they must be periodically rescaled to allow old timestamps to be reused. A number of timestamp representations are described and compared on the basis of: their complexity; the frequency and cost of rescaling; and the cost of performing basic operations, including comparison and creation of new timestamps.
ISBN: (print) 3540634762
The proceedings contain 21 papers. The special focus in this conference is on Evolutionary Approaches to Issues in Biology and Economics. The topics include: simulating pricing behaviours using a genetic algorithm; biologically inspired computational ecologies; modelling bounded rationality using evolutionary techniques; the abstract theory of evolution of the living; an evolutionary algorithm for single objective nonlinear constrained optimization problems; on recombinative sampling; the evolution of mutation, plasticity and culture in cyclically changing environments; on the structure and transformation of landscapes; island model genetic algorithms and linearly separable problems; empirical validation of the performance of a class of transient detector; the construction and evaluation of decision trees; parallel distributed genetic programming applied to the evolution of natural language recognisers; scheduling planned maintenance of the south wales region of the national grid; solving generic scheduling problems with a distributed genetic algorithm; directing the search of evolutionary and neighbourhood-search optimisers for the flowshop sequencing problem with an idle-time heuristic; multiobjective genetic algorithms for pump scheduling in water supply; use of rules and preferences for schedule builders in genetic algorithms for production scheduling; a voxel based approach to evolutionary shape optimisation; an evolutionary, agent-assisted strategy for conceptual design space decomposition and task scheduling with use of classifier systems.
We address the problem of efficiently performing parallel discrete-event simulation in the case where event elaboration is independent of other processes' local states. We propose a parallel simulation policy, called State Query Time Warp (SQTW), based on the Time Warp mechanism. We present experiments performed by means of an SQTW-based parallel simulator on a T-800 transputer machine for solving performance models based on state-dependent routing queueing network models. The experiments are used to assess the overheads and efficiency of SQTW; the results show that high efficiency is achievable and, surprisingly, reveal that SQTW is able to globally reduce rollback overheads with respect to corresponding Time Warp simulations.
The paper describes an implementation of a conservative parallel distributed simulator that has been used to simulate a high-fidelity model of ATM networks. Important optimisations of the simulator for this application are described. The performance of the simulator is reported on up to 12 processors and compared with a sequential implementation. It is seen that the simulator gives good speedup and better performance than the sequential implementation. It is noted that the low overhead of the simulator relies on there being good lookahead in realistic models of ATM networks. Some situations where this lookahead is significantly reduced are described together with future extensions to fix these problems.
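The role of lookahead in a conservative simulator can be illustrated with a minimal Python sketch. It is not the simulator described in the abstract; the function names `safe_bound` and `safe_events`, and the data layout, are assumptions made for illustration. Each upstream process promises that nothing it sends will arrive earlier than its current clock plus its lookahead, so a logical process may only execute events strictly below the smallest such promise.

```python
def safe_bound(upstream_clocks, lookahead):
    """Earliest time at which any upstream process could still deliver an event.

    upstream_clocks: dict mapping upstream process name -> its current clock
    lookahead:       dict mapping upstream process name -> guaranteed minimum delay
    (Both dictionaries are hypothetical inputs for this sketch.)
    """
    return min(upstream_clocks[p] + lookahead[p] for p in upstream_clocks)


def safe_events(pending, upstream_clocks, lookahead):
    """Return the pending event timestamps that can be executed without
    risking a causality violation, in timestamp order."""
    bound = safe_bound(upstream_clocks, lookahead)
    return sorted(t for t in pending if t < bound)


clocks = {"switch_a": 10.0, "switch_b": 12.0}
pending = [11.0, 12.5, 13.0, 14.0]

# Good lookahead: the bound is min(10 + 5, 12 + 1) = 13.
good = safe_events(pending, clocks, {"switch_a": 5.0, "switch_b": 1.0})

# Zero lookahead: the bound collapses to min(10, 12) = 10 and nothing is safe.
none = safe_events(pending, clocks, {"switch_a": 0.0, "switch_b": 0.0})
```

With reduced lookahead the bound shrinks toward the slowest upstream clock, fewer events are safe per synchronisation step, and the relative overhead grows, which is consistent with the dependence on lookahead noted in the abstract.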
Presents an algorithm for computing a sum of products, realizing a fundamental compound multiply-and-add operation of high-speed arithmetic. Two new cellular pipelined algorithms and architectures (2D and 3D) are proposed. The initial data and results are binary signed-digit integers. The multipliers are loaded digit-serially, while the multiplicands are loaded in a digit-parallel fashion and the results are produced in the same way. The design is performed in terms of cellular technology, based on an original model of distributed computation (the parallel substitution algorithm). The time and structural complexities are obtained.
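The arithmetic behind a digit-serial sum of products can be sketched sequentially in Python. This is only an illustration of the number representation, not the cellular 2D or 3D architecture of the paper. It assumes radix-2 signed digits drawn from {-1, 0, 1}, most significant digit first, and for brevity takes the multiplicands as ordinary integers; `sd_to_int` and `sum_of_products` are hypothetical names.

```python
def sd_to_int(digits):
    """Value of a binary signed-digit number (digits in {-1, 0, 1},
    most significant digit first)."""
    value = 0
    for d in digits:
        value = 2 * value + d
    return value


def sum_of_products(multipliers, multiplicands):
    """Compute sum(x_k * y_k), consuming one digit of every multiplier x_k
    per step while all multiplicands y_k are available in full.

    multipliers:   list of equal-length signed-digit lists
    multiplicands: list of integers
    """
    acc = 0
    for i in range(len(multipliers[0])):
        # Shift the accumulator one position, then add each multiplicand
        # weighted by the current digit of its multiplier.
        acc = 2 * acc + sum(x[i] * y for x, y in zip(multipliers, multiplicands))
    return acc


xs = [[1, 0, -1], [0, 1, 1]]   # signed-digit encodings of 3 and 3
ys = [5, -2]
result = sum_of_products(xs, ys)   # 3*5 + 3*(-2) = 9
```

Because every digit is -1, 0, or 1, each step needs only additions, subtractions, and a shift, which is what makes the representation attractive for pipelined hardware.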
For parallel simulation of VLSI circuits on the transistor level, a sophisticated partitioning of the circuits into subcircuits is crucial. Each net connecting the subcircuits causes additional communication and computation effort. As the slave processors simulating the subcircuits advance synchronously in time, the computation effort for each subcircuit should be approximately the same. In this paper a new approach for partitioning VLSI circuits on the transistor level yielding a low number of interconnects between the subcircuits and balanced subcircuit sizes is presented. Simulation of industrial circuits using this partitioning is up to 41% faster than with other known partitioning approaches for parallel analog simulation.
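The two goals named in the abstract, few interconnects and balanced subcircuit sizes, can be expressed as simple metrics. The Python sketch below only scores a given partition; it is not the partitioning approach of the paper. The netlist layout, the unit device weights, and the names `cut_nets` and `imbalance` are assumptions made for illustration.

```python
def cut_nets(nets, part_of):
    """Count nets whose devices lie in more than one subcircuit.

    nets:    dict mapping net name -> list of device names on that net
    part_of: dict mapping device name -> subcircuit id
    """
    return sum(1 for devices in nets.values()
               if len({part_of[d] for d in devices}) > 1)


def imbalance(part_of):
    """Largest subcircuit size divided by the mean size (1.0 is perfectly
    balanced). Every device counts as one unit of work in this sketch."""
    sizes = {}
    for part in part_of.values():
        sizes[part] = sizes.get(part, 0) + 1
    mean = sum(sizes.values()) / len(sizes)
    return max(sizes.values()) / mean


# A chain of four transistors joined by three nets.
nets = {"n1": ["m1", "m2"], "n2": ["m2", "m3"], "n3": ["m3", "m4"]}

balanced = {"m1": 0, "m2": 0, "m3": 1, "m4": 1}
skewed = {"m1": 0, "m2": 1, "m3": 1, "m4": 1}

scores = {
    "balanced": (cut_nets(nets, balanced), imbalance(balanced)),
    "skewed": (cut_nets(nets, skewed), imbalance(skewed)),
}
```

Both partitions cut a single net, but the skewed one leaves one processor with three times the work of the other, so a synchronously advancing simulation would be limited by the larger subcircuit.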
Large-scale ecological simulations are natural candidates for distributed discrete event simulation. In optimistic simulation of spatially explicit models, a difficult problem arises when individuals migrate between physical regions simulated by different logical processes. We present a solution to this problem that uses shared object states. Shared states allow for efficient communication between LPs and for early detection of canceled events. We briefly describe an optimistic simulation environment called EcoKit, which operates on top of the WarpKit implementation of Time Warp. Our experiments with this system on a shared memory multiprocessor show that EcoKit promises to scale well both with the number of processors and the number of individuals simulated.
We present a novel approach to parallel discrete event simulation based on the Cilk model of multithreaded computation. Cilk's runtime system not only manages the low-level aspects of program execution, but also provides the user with an algorithmic model of performance which can be used to predict the execution time of a parallel simulation. Moreover, a Cilk application can "scale down" to run on a single processor with nearly the same performance as that of serial code. A conservative parallel discrete event simulation algorithm has been developed in which communication between logical processes is achieved using Cilk's virtual memory model, dag-consistent shared memory. The simulation executes in cycles, where each cycle involves a divide-and-conquer computation. Although local lookahead information can be exploited, the algorithm is robust in that it also calculates a global simulation time for each cycle. It can therefore be used for applications where zero lookahead may occur.