By offering a shared address space across a number of processors connected by a local area network, the distributed shared memory model offers an attractive way of programming parallel-distributed applications. Such programming can be done using a memory model based on either objects or linear memory. Very few performance studies have been made of such systems. The author describes the motivation and the methodology for a project which compares the performance of the object model with that of the linear memory model. Execution-driven simulation is used to analyze the performance and scalability of the systems for emerging fast processors and new high-speed networks.
ISBN: 9781565550551 (print)
The major goal of this work has been to develop an implementation of a parallel partitioning algorithm suitable for use in a conservatively synchronized parallel discrete event simulation (PDES) environment. Effective partitioning is essential, for both performance and capacity reasons, in any PDES problem, and the performance of the partitioning algorithm itself is very important to overall simulation performance. There are two possible approaches to improving the performance of the partitioning step: modifying the algorithm and parallelizing it. A parallel version of the partitioning algorithm of Fiduccia and Mattheyses (1982) is developed. The basic algorithm has been modified, first for parallel execution with a similar quality of final partition, and then further to increase the parallelism of the algorithm at the expense of partition quality.
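As a rough illustration of the move-gain idea that underlies Fiduccia-Mattheyses style partitioning, the sketch below computes the cut size of a two-way partition and the gain of moving each vertex to the other block. It is a simplified assumption-laden example, not the implementation described in the abstract: it works on plain graph edges rather than hypergraph nets, it ignores the balance constraint and vertex locking, and the names `cut_size` and `move_gains` are hypothetical.

```python
def cut_size(edges, side):
    """Count edges whose two endpoints lie in different blocks."""
    return sum(1 for u, v in edges if side[u] != side[v])


def move_gains(edges, side):
    """Return, for each vertex, the reduction in cut size if that
    vertex alone were moved to the other block.

    A cut edge stops being cut when one endpoint moves (+1 gain);
    an internal edge becomes cut (-1 gain).
    """
    gains = {v: 0 for v in side}
    for u, v in edges:
        delta = 1 if side[u] != side[v] else -1
        gains[u] += delta
        gains[v] += delta
    return gains


if __name__ == "__main__":
    # Hypothetical 4-vertex example: blocks are labelled 0 and 1.
    edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
    side = {"a": 0, "b": 0, "c": 1, "d": 1}
    print(cut_size(edges, side))
    print(move_gains(edges, side))
```

A full pass would repeatedly pick the highest-gain unlocked vertex that keeps the blocks balanced, move it, and update the gains of its neighbours; the sketch only shows the quantity being optimized.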
ISBN: 9781565550551 (print)
Recent experiments have shown that conservative methods can achieve good performance by exploiting the characteristics of the system being simulated. In this paper we focus on the interrelationship between the run time and the synchronization requirements of a distributed simulation. We develop a metric that considers the effect of lookahead and the physical rate of message transmission, together with an arrival approximation that models the effect of synchronization requirements on the run time. It is shown that even when good lookahead is exploited in the system, poor run-time performance results if an inefficient mapping of LPs to processors is used.
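To make the role of lookahead in conservative synchronization concrete, the following minimal sketch computes the bound below which a logical process (LP) can safely execute events. It assumes each sending LP reports its current local clock and a fixed lookahead (the minimum delay before any message it sends can take effect); the function names and data layout are hypothetical and are not taken from the paper, and the paper's own metric is not reproduced here.

```python
def safe_time(sender_clock, lookahead):
    """Lower bound on the timestamp of any message that may still arrive.

    sender_clock: dict mapping sender LP name -> its current local clock
    lookahead:    dict mapping sender LP name -> its minimum send delay
    """
    return min(sender_clock[s] + lookahead[s] for s in sender_clock)


def safe_events(pending, sender_clock, lookahead):
    """Timestamps of pending events that can be executed without risk
    of a causality violation (strictly below the safe-time bound)."""
    bound = safe_time(sender_clock, lookahead)
    return sorted(t for t in pending if t < bound)


if __name__ == "__main__":
    clocks = {"A": 10, "B": 12}
    look = {"A": 5, "B": 1}
    print(safe_time(clocks, look))
    print(safe_events([14, 11, 13], clocks, look))
```

In this sketch a sender with a small lookahead (here `B`) dominates the bound, which is one way to see why synchronization requirements can limit run-time performance even when other LPs offer good lookahead.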
ISBN: 9781565550551 (print)
An approach for high performance parallel logic simulation on a local area network of workstation computers is discussed in this paper. The single, shared transmission medium often found in such networks places limitations on parallel execution, hence a reduction in the frequency of synchronization is pursued by combining a circuit partitioning methodology with a specific synchronization constraint. A consequence of the partitioning methodology is replication of objects between blocks of a partition. A partitioning procedure based on iterative improvement is described for reducing replication while preserving load balance. Two interprocessor synchronization techniques for parallel simulation are studied: conservative and optimistic synchronization. Experiments conducted on three large sequential circuits indicate that reasonable speedup is achievable for well-balanced partitions, and that optimistic synchronization provides a modest improvement in performance over conservative synchronization.
ISBN: 9781565550551 (print)
Time Warp has evolved into a common technique for distributed simulation. Speedup in Time Warp simulation systems mainly depends on two overhead factors: first, the load on the simulators has to be well balanced, and second, communication and rollbacks have to be kept to a minimum. Both of these factors are influenced by the partitioning of the simulated system. In this paper, we focus on various static partitioning schemes used to partition digital circuits for distributed simulation. A new hierarchical partitioning approach is presented and compared with other partitioning schemes by evaluating benchmark circuits. Partitioning is done in two steps: a fine-grained clustering step based on corollas and a coarse-grained step forming partitions using the connectivity matrix. The corolla approach yields very good partitioning results even for a large number of partitions. The achieved speedups are almost linear (up to 12 partitions for larger circuits), as long as the partition sizes are large enough that communication between the simulators is not a bottleneck. The results reveal the great impact of partitioning on the acceleration of distributed logic simulation and show the effectiveness of the presented corolla partitioning scheme.
The authors describe a new parallel image understanding machine RTA/1 design based on the recursive Torus architecture, and propose a data level parallel processing scheme using parallel data structures. Various type...
ISBN: 9781565550551 (print)
The two main approaches to parallel discrete event simulation – conservative and optimistic – are likely to encounter some limitations when the size and complexity of the simulation system increases. For such large scale simulations, the conservative approach appears to be limited by blocking overhead and sensitivity to lookahead, whereas the optimistic approach may become prone to cascading rollbacks, state saving overhead, and demands for larger memory space. These drawbacks restrict the synchronization schemes based on each of the two approaches from scaling up. A combined approach may resolve these limitations, while preserving and utilizing potential advantages of each method. However, the schemes proposed so far integrate the two views at the same level, i.e. local to a logical process, and hence may not be able to fully solve the problems. In this paper we propose the Local Time Warp method for parallel discrete-event simulation and present a novel synchronization scheme for it called HCTW. The new scheme hierarchically combines a Conservative Time Window algorithm with Time Warp and aims at reducing cascade rollbacks, sensitivity to lookahead, and the scalability problems. Local Time Warp is believed to be suitable for parallel machines equipped with thousands of processors and thus an appropriate candidate for simulation of large and complex systems.
ISBN: 9781565550551 (print)
A hardware-based framework which supports a wide range of parallel discrete event synchronization protocols has been proposed in [Reyn92]. This framework offloads all synchronization activity from the host processors and host communication network in the system. The underlying hardware computes results of global, binary associative operations, or global reductions. In this paper we present results of simulations that strongly suggest the need for a next-generation reduction network which computes and disseminates results of target-specific reductions to support both aggressive and non-aggressive parallel discrete event simulations. Target-specific reductions allow a logical process to receive synchronization information only from those logical processes which may have a direct or indirect impact on its performance.
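The difference between a global reduction and a target-specific reduction can be sketched in software as follows. This is only an illustration under stated assumptions: the reduction operator is taken to be `min` over per-LP values (for example, local clocks), the "may have a direct or indirect impact" relation is given as a directed graph `affects`, and all names are hypothetical. The framework in the abstract performs these reductions in dedicated hardware, which this sketch does not model.

```python
def influencers(affects, target):
    """Set of LPs that can reach `target` by following `affects`
    edges, directly or indirectly.

    affects: dict mapping an LP to the list of LPs it can send to.
    """
    # Build the reverse adjacency (who sends to whom).
    preds = {}
    for src, dsts in affects.items():
        for dst in dsts:
            preds.setdefault(dst, set()).add(src)
    seen = set()
    stack = [target]
    while stack:
        node = stack.pop()
        for p in preds.get(node, ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def global_min(values):
    """Global reduction: the minimum over every LP's value."""
    return min(values.values())


def target_specific_min(values, affects, target):
    """Reduction restricted to the LPs that can influence `target`.
    Returns infinity when no LP can influence it."""
    sources = influencers(affects, target)
    return min((values[lp] for lp in sources), default=float("inf"))


if __name__ == "__main__":
    affects = {"A": ["B"], "B": ["C"], "D": ["E"]}
    values = {"A": 7, "B": 9, "C": 20, "D": 2, "E": 4}
    print(global_min(values))
    print(target_specific_min(values, affects, "C"))
```

In the example, LP `C` is held back by the slow, unrelated LP `D` under the global reduction, whereas the target-specific result depends only on `A` and `B`.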
In this paper, a modular neurocontroller for an arbitrary N-DOF (degree of freedom) manipulator is proposed. The recursive nature of the Newton-Euler formulation is used as a base for the modular neurocontroller. The neural modules can be trained by the direct inverse or indirect adaptive control schemes. Computer simulation results for a 2-DOF SCARA manipulator are given. Due to its modular structure, this neurocontroller can be applied to a manipulator with arbitrary degrees of freedom such as distributed or cellular robotic systems.