This paper describes low-voltage CMOS interface circuitry with self-correcting pull-up and pull-down output resistances that match the printed circuit board transmission line impedance. The circuitry is suitable for both series and parallel high-speed digital communication between integrated circuits, and for interfacing CMOS to heterogeneous logic (e.g. ECL). The output driver impedance is controlled by a pull-up and a pull-down binary-weighted transistor ladder, in conjunction with an A-to-D converter and an external resistor that represents the transmission line impedance.
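The abstract does not say how the ladder code is chosen. As a rough sketch only, the following assumes a successive-approximation search, which is one common way an A-to-D style loop could settle on a code. The names `ladder_resistance`, `calibrate`, and `unit_ohms` (the resistance of the smallest ladder leg) are hypothetical and are not taken from the paper.

```python
def ladder_resistance(code, unit_ohms):
    """Resistance of a binary-weighted ladder with the given enable code.

    Assumption: bit i enables a leg with 2**i times the unit conductance,
    so the total conductance is code / unit_ohms.
    """
    return float("inf") if code == 0 else unit_ohms / code


def calibrate(target_ohms, unit_ohms, bits):
    """Successive approximation: find the largest code whose ladder
    resistance is still at or above the target (external) resistance."""
    code = 0
    for bit in reversed(range(bits)):
        trial = code | (1 << bit)
        # Keep the bit only if enabling it does not undershoot the target.
        if ladder_resistance(trial, unit_ohms) >= target_ohms:
            code = trial
    return code


if __name__ == "__main__":
    code = calibrate(target_ohms=50.0, unit_ohms=800.0, bits=5)
    print(code, ladder_resistance(code, 800.0))
```

In a driver of the kind described, the same search would presumably run once for the pull-up ladder and once for the pull-down ladder, each compared against the external reference resistor.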
ISBN: 1565550552 (print)
This conference proceedings contains 22 papers on advances in the design and analysis of algorithms for parallel and distributed simulation. Topics discussed include selecting the checkpoint interval in time warp parallel simulation, parallel algorithms for simulating continuous time Markov chains, determining initial states for time-parallel simulations, global synchronization for optimistic parallel discrete event simulation, an algorithm for minimally latent global virtual time, a parallel partitioning technique for use with conservative parallel simulation, disseminating critical synchronization information in parallel discrete event simulations, shared variables in distributed simulation, high performance parallel logic simulation on a network of workstations, corolla partitioning for distributed logic simulation of VLSI circuits, efficient implementation of event sets in time warp, an analytical comparison of periodic checkpointing and incremental state saving, parallel simulation of communicating finite state machines, the effect of synchronization requirements on the performance of distributed simulations, and time warp simulation in time-constrained systems.
ISBN: 1565550552 (print)
Synchronization is a significant cost in many parallel programs, and can be a major bottleneck if it is handled in a centralized fashion using traditional shared-memory constructs such as barriers. In a parallel time-stepped simulation, the use of global synchronization primitives limits scalability, increases the sensitivity to load imbalance, and reduces the potential for exploiting locality to improve cache behavior. This paper presents the results of an initial one-application study quantifying the costs and performance benefits of distributed, nearest-neighbor synchronization. The application studied, MP3D, is a particle-based wind tunnel simulation. Our results for this one application on current shared-memory multiprocessors show a significant decrease in synchronization time using these techniques. We prototyped an application-independent library that implements distributed synchronization. The library allows a variety of parallel simulations to exploit these techniques without increasing the application programming effort beyond that of conventional approaches.
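To illustrate the general idea of nearest-neighbor synchronization in a time-stepped simulation, here is a minimal sketch that is not the library described in the paper. It assumes a hypothetical 1-D smoothing problem with one worker thread per cell. Each worker starts step t as soon as its immediate neighbors have finished step t-1, instead of waiting at a global barrier. For brevity all workers share a single condition variable; a real implementation would more likely use per-neighbor flags so that the waiting itself is also distributed.

```python
import threading


def smooth(prev, i):
    """Average of cell i and its existing neighbors (toy update rule)."""
    window = [prev[j] for j in (i - 1, i, i + 1) if 0 <= j < len(prev)]
    return sum(window) / len(window)


def run_sequential(initial, steps):
    cur = list(initial)
    for _ in range(steps):
        cur = [smooth(cur, i) for i in range(len(cur))]
    return cur


def run_neighbor_sync(initial, steps):
    n = len(initial)
    # Keep every step's values so a fast worker never overwrites data
    # that a slower neighbor still has to read.
    values = [list(initial)] + [[0.0] * n for _ in range(steps)]
    progress = [0] * n  # progress[i] = last step worker i has completed
    cond = threading.Condition()

    def worker(i):
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < n]
        for t in range(1, steps + 1):
            with cond:
                # Wait only for the neighbors, not for every worker.
                cond.wait_for(
                    lambda: all(progress[j] >= t - 1 for j in neighbors))
            values[t][i] = smooth(values[t - 1], i)
            with cond:
                progress[i] = t
                cond.notify_all()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return values[steps]


if __name__ == "__main__":
    start = [0.0, 3.0, 0.0, 6.0]
    print(run_neighbor_sync(start, 5) == run_sequential(start, 5))
```

Because a worker depends only on its neighbors' progress, workers far apart in the domain may be several steps out of phase with each other, which is the slack that a global barrier removes.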
ISBN: 1565550552 (print)
Recent experiments have shown that conservative methods can achieve good performance by exploiting the characteristics of the system being simulated. In this paper we focus on the interrelationship between run time and synchronization requirements of a distributed simulation. We develop a metric that considers the effect of lookahead and the physical rate of transmission of messages, and an arrival approximation that models the effect of synchronization requirements on the run time. It is shown that even when good lookahead is exploited in the system, run-time performance is poor if an inefficient mapping of logical processes (LPs) to processors is used.
ISBN: 1565550552 (print)
A number of optimistic synchronization schemes for parallel simulation rely upon a global synchronization. The problem is to determine when every processor has completed all its work and there are no messages in transit in the system that will cause more work. Most previous solutions to the problem have used distributed termination algorithms, which are inherently serial; other parallel mechanisms may be inefficient. In this paper we describe an efficient parallel algorithm derived from a common 'barrier' synchronization algorithm used in parallel processing. The algorithm's principal attractions are speed and generality: it is designed to be used in contexts more general than parallel discrete-event simulation. To establish our claim to speed, we compare our algorithm's performance with the standard barrier algorithm, and find that its additional costs are not excessive. Our experiments are conducted using up to 256 processors on the Intel Touchstone Delta.
ISBN: 1565550552 (print)
Several mathematical and algorithmic problems that have arisen in discrete event simulations of large systems are described. The simulated systems belong to the areas of computational physics, queueing networks, and econometric models.
ISBN: 1565550552 (print)
We have previously shown that the mathematical technique of uniformization can serve as the basis of synchronization for the parallel simulation of continuous-time Markov chains. This paper reviews the basic method and compares four different methods based on uniformization, evaluating their strengths and weaknesses as a function of problem characteristics. The methods vary in their use of optimism, logical aggregation, communication management, and adaptivity. Performance evaluation is conducted on the Intel Touchstone Delta multiprocessor, using up to 256 processors.
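For readers unfamiliar with uniformization, the sketch below shows the textbook sequential form of the technique, not any of the four parallel methods the paper compares. Candidate event times are drawn from a single Poisson process whose rate is at least the largest exit rate, and at each candidate time the chain either jumps or takes a self-loop "pseudo-event". The function names and the example generator matrix are illustrative assumptions, and the code assumes at least one state has a nonzero exit rate.

```python
import random


def uniformize(Q, rate):
    """Return the discrete-time transition matrix P = I + Q / rate."""
    n = len(Q)
    return [[(1.0 if i == j else 0.0) + Q[i][j] / rate for j in range(n)]
            for i in range(n)]


def simulate(Q, start, t_end, rng):
    """Sample a CTMC path on [0, t_end] by uniformization.

    Returns a list of (time, state) pairs, one per real state change.
    """
    rate = max(-Q[i][i] for i in range(len(Q)))  # largest exit rate
    P = uniformize(Q, rate)
    t, state = 0.0, start
    path = [(0.0, start)]
    while True:
        # Candidate event times do not depend on the current state.
        t += rng.expovariate(rate)
        if t > t_end:
            return path
        nxt = rng.choices(range(len(Q)), weights=P[state])[0]
        if nxt != state:  # otherwise a pseudo-event: nothing changes
            state = nxt
            path.append((t, state))


if __name__ == "__main__":
    Q = [[-2.0, 2.0],
         [1.0, -1.0]]  # hypothetical two-state generator matrix
    print(simulate(Q, 0, 5.0, random.Random(1)))
```

The candidate event times are independent of the chain's state, so they can be generated ahead of time; that property is what makes the technique usable as a basis for synchronization.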
ISBN: 1565550552 (print)
The major goal of this work has been to develop an implementation of a parallel partitioning algorithm suitable for use in a conservatively synchronized parallel discrete event simulation (PDES) environment. Effective partitioning is essential for performance and capacity in any PDES problem, and the performance of the partitioning algorithm itself is very important to overall simulation performance. There are two possible approaches to improving the performance of the partitioning step: modifying the algorithm and parallelizing the partitioning process. In this work, an efficient parallelized version of the iterative-improvement partitioning algorithm (Fiduccia and Mattheyses, 1982) is developed. The basic algorithm has been modified first for parallel execution with a similar quality of final partition, and then further modified to increase the parallelism of the algorithm at the expense of partition quality.
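As background for the iterative-improvement approach, the following is a simplified sequential sketch of a single Fiduccia-Mattheyses-style pass on a two-way partition. It is not the parallel version developed in the paper, and it recomputes gains naively rather than using the bucket lists of the original algorithm. The function names, the `max_imbalance` parameter, and the example netlist are assumptions made for illustration.

```python
def cut_size(nets, side):
    """Number of nets with cells on both sides of the partition."""
    return sum(1 for net in nets if len({side[c] for c in net}) > 1)


def gain(cell, nets, side):
    """Reduction in cut size if `cell` were moved to the other side."""
    g = 0
    for net in nets:
        if cell not in net or len(net) < 2:
            continue
        same = sum(1 for c in net if side[c] == side[cell])
        if same == 1:            # cell is alone on its side: move uncuts net
            g += 1
        elif same == len(net):   # net entirely on cell's side: move cuts it
            g -= 1
    return g


def fm_pass(nets, side, max_imbalance=2):
    """One pass: move each cell at most once, remember the best partition."""
    side = dict(side)
    locked = set()
    best_side, best_cut = dict(side), cut_size(nets, side)
    for _ in range(len(side)):
        candidates = []
        for c in side:
            if c in locked:
                continue
            n_from = sum(1 for s in side.values() if s == side[c])
            n_to = len(side) - n_from
            # Only allow moves that keep the two sides roughly balanced.
            if abs((n_from - 1) - (n_to + 1)) <= max_imbalance:
                candidates.append(c)
        if not candidates:
            break
        cell = max(candidates, key=lambda c: gain(c, nets, side))
        side[cell] = 1 - side[cell]  # moves with negative gain are allowed
        locked.add(cell)
        cut = cut_size(nets, side)
        if cut < best_cut:
            best_side, best_cut = dict(side), cut
    return best_side, best_cut


if __name__ == "__main__":
    nets = [("a", "b"), ("c", "d")]          # hypothetical two-pin nets
    start = {"a": 0, "b": 1, "c": 0, "d": 1}
    print(fm_pass(nets, start))
```

Accepting negative-gain moves within a pass and then rolling back to the best partition seen is what lets this style of algorithm escape some local minima.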
ISBN: 1565550552 (print)
In this paper we consider the effect of using bus interconnection structures on the overheads present in conservative parallel simulations of multicomputer programs. We use a modified version of the Poker Programming Environment to empirically measure the overhead in three parallel algorithms using buses. We discuss the sources of overhead and compare them with those found using point-to-point communication. Preliminary results indicate that the overheads encountered using a bus interconnection structure were not predicted by our previous results using point-to-point communications.
ISBN: 1565550552 (print)
This paper discusses an approach for high performance parallel logic simulation on a local area network of workstation computers. The single, shared transmission medium often found in such networks places limitations on parallel execution. A reduction in the frequency of synchronization is therefore pursued by combining a circuit partitioning methodology with a specific synchronization constraint. A consequence of the partitioning methodology is replication of objects between blocks of a partition. A partitioning procedure based on iterative improvement is described for reducing replication while preserving load balance. Two interprocessor synchronization techniques for parallel simulation are studied: conservative and optimistic synchronization. Experiments conducted on three large sequential circuits indicate that reasonable speedup is achievable for well-balanced partitions, and that optimistic synchronization provides a modest improvement in performance over conservative synchronization.