This paper describes the implementation of transmission-line matrix (TLM) method algorithms on a massively parallel computer (DECmpp 12000), the technique of distributed computing in the UNIX environment, and the combination of TLM analysis with Prony's method as well as with autoregressive moving average (ARMA) digital signal processing for electromagnetic field modelling. By combining these advanced computation techniques, typical electromagnetic field modelling of microwave structures by TLM analysis can be accelerated by a few orders of magnitude.
To address the speed and shallow-reasoning problems encountered in current research on fault diagnosis expert systems, this paper presents a model-based parallel fault diagnosis expert system for energy managemen...
Octrees offer a powerful means for representing and manipulating 3-D objects. This paper presents an implementation of octree manipulations using a new approach on a shared memory architecture. Octrees are hierarchical data structures used to model 3-D objects, and manipulating them involves performing independent computations on each node of the octree. Octrees are much easier to deal with than other representations of 3-D objects, especially where extensive manipulations are involved. When these operations are distributed among multiple processing elements (PEs) and executed simultaneously, a significant speedup may be achieved. The manipulations described in this paper (complement, union, and intersection, along with operations such as finding the volume and centroid) are implemented on the Sequent Balance multiprocessor. In this approach the PEs are allocated dynamically, resulting in uniform load balancing among them. The experimental results presented illustrate the feasibility of the approach. Although this evaluation was originally done for shared memory machines, it provides insight for the evaluation of other architectures.
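The set operations named in this abstract can be illustrated with a minimal sequential sketch. It is not the paper's Sequent Balance implementation: the node encoding (the strings "F" and "E" for full and empty leaves, an 8-tuple for a subdivided node) and all function names are assumptions made for illustration, and the inputs are assumed to be already simplified (no 8-tuple whose children are all full or all empty). The eight recursive calls per node are independent of one another, which is the property that lets such work be handed out to separate PEs.

```python
# Hypothetical octree encoding (an assumption, not taken from the paper):
#   "F"            -> leaf cell entirely inside the object
#   "E"            -> leaf cell entirely outside the object
#   8-tuple        -> subdivided cell, one entry per octant
FULL, EMPTY = "F", "E"


def simplify(children):
    """Collapse eight identical leaves back into a single leaf."""
    children = tuple(children)
    if all(c == FULL for c in children):
        return FULL
    if all(c == EMPTY for c in children):
        return EMPTY
    return children


def complement(t):
    """Swap full and empty leaves; subdivided nodes recurse per octant."""
    if t == FULL:
        return EMPTY
    if t == EMPTY:
        return FULL
    return tuple(complement(c) for c in t)


def union(a, b):
    """Cells covered by either object."""
    if a == FULL or b == FULL:
        return FULL
    if a == EMPTY:
        return b
    if b == EMPTY:
        return a
    return simplify(union(x, y) for x, y in zip(a, b))


def intersection(a, b):
    """Cells covered by both objects."""
    if a == EMPTY or b == EMPTY:
        return EMPTY
    if a == FULL:
        return b
    if b == FULL:
        return a
    return simplify(intersection(x, y) for x, y in zip(a, b))


def volume(t, cell=1.0):
    """Occupied volume, where `cell` is the volume of the current node."""
    if t == FULL:
        return cell
    if t == EMPTY:
        return 0.0
    return sum(volume(c, cell / 8) for c in t)


# Two objects in a unit cube, each filling a different octant.
a = (FULL,) + (EMPTY,) * 7
b = (EMPTY, FULL) + (EMPTY,) * 6
print(volume(union(a, b)))        # 0.25
print(intersection(a, b))         # E
print(union(a, complement(a)))    # F
```

A parallel variant would replace the per-octant recursion near the root with tasks taken from a shared work queue; that scheduling is not shown here.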
An analytical study of potential pathological performance areas of the Seamless architecture is presented. Seamless is a latency-tolerant, distributed memory, multiprocessor architecture. A key component of the philosophy of Seamless, however, is the use of standard, commodity components for a large part of the system. A discussion of the unavoidable implementation compromises imposed by this decision is presented, followed by a summary of some optimistic performance studies. Then an analytical study that parameterizes and predicts the worst-case impact of using standard components is provided. Finally, it is shown that these bottlenecks are manageable via careful generation of target machine code, so that the optimistic performance studies become realistic expectations for a range of program behaviors and granularities.
For real-time radar processing, it is very desirable to have an algorithm that does not assume restricted statistics of the input data and can be implemented for high-speed processing (without a high cost) to meet real-time requirements. We therefore apply the QR decomposition-based least-squares method for linear prediction to the problem of computing the reflection coefficients of a lattice predictor, instead of using the conventional Burg algorithm. We also propose a modified one-dimensional ring architecture for implementing the QR method of least-squares. The particular application considered in this case is that of surveillance radar systems for air traffic control.
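As a rough illustration of QR-based least-squares linear prediction, the sketch below triangularizes the prediction equations with Givens rotations and back-substitutes for the predictor weights. It is a simplified stand-in under stated assumptions: it computes direct-form (tapped-delay-line) predictor coefficients rather than the lattice reflection coefficients the abstract targets, it runs sequentially rather than on the proposed ring architecture, and the function names `givens_lstsq` and `linear_predictor` are hypothetical. It also assumes the data matrix has full column rank.

```python
import math


def givens_lstsq(A, b):
    """Solve min ||A w - b||_2 via Givens-rotation QR. A is a list of rows."""
    m, n = len(A), len(A[0])
    # Augmented copy [A | b], so every rotation updates both sides together.
    R = [list(row) + [bi] for row, bi in zip(A, b)]
    for j in range(n):
        for i in range(j + 1, m):
            if R[i][j] == 0.0:
                continue  # entry is already zero, no rotation needed
            r = math.hypot(R[j][j], R[i][j])
            c, s = R[j][j] / r, R[i][j] / r
            # Rotate rows j and i so that R[i][j] becomes zero.
            for k in range(j, n + 1):
                top, low = R[j][k], R[i][k]
                R[j][k] = c * top + s * low
                R[i][k] = -s * top + c * low
    # Back substitution on the upper-triangular n x n block.
    w = [0.0] * n
    for j in reversed(range(n)):
        acc = R[j][n] - sum(R[j][k] * w[k] for k in range(j + 1, n))
        w[j] = acc / R[j][j]
    return w


def linear_predictor(x, order):
    """Weights w such that x[t] is approximated by sum_k w[k-1] * x[t-k]."""
    rows = [[x[t - k] for k in range(1, order + 1)]
            for t in range(order, len(x))]
    targets = [x[t] for t in range(order, len(x))]
    return givens_lstsq(rows, targets)


# Synthetic noise-free test signal: x[t] = 0.9 x[t-1] - 0.5 x[t-2].
x = [1.0, 0.5]
for _ in range(30):
    x.append(0.9 * x[-1] - 0.5 * x[-2])

w = linear_predictor(x, order=2)
print(w)  # close to [0.9, -0.5]
```

Because each Givens rotation touches only two rows, rotations on disjoint row pairs can proceed concurrently, which is what makes this formulation attractive for array-style hardware.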
The effectiveness of applying two types of parallel block predictor-corrector algorithms presented by L.F. Shampine and H.A. Watts (Math. Comput., vol.23, p.731-40, 1969; BIT, vol.12, p.252-66, 1972) and L.G. Birta and O. Abdou-Rabia (IEEE Trans. on Comput., vol.C-36, no.3, p.299-311, March 1987) to the simulation of a Space Shuttle main rocket engine is discussed. Comparisons between the sequential and parallel versions of the algorithms are made, based upon implementations of Shuttle main engine simulations on a four-node multiple-instruction multiple-data parallel processing environment consisting of Inmos T800 transputers. The results of these simulations are reported in terms of certain performance measurements, including execution time, processor utilization, simulation accuracy and simulation stability. The expected performance of expanded versions of the algorithms which employ more than four transputers is also calculated assuming a hypercube-type topology.
Numerical Weather Prediction (NWP) is acknowledged as being of vital importance to the economy. The demand that NWP places on computing system performance has increased dramatically since the introduction of computer syst...
An architecture to provide for the expert control of numerical processing is described. Simulations run to explore the possibility of using parallelism show positive results in the areas of numerical processing and expert symbolic processing. The architecture presented, while multiple-instruction multiple-data (MIMD) massively parallel, is an optimum parallel environment for real-time applications and follows the idea of intelligent processing to create a coupled system to support real-time simulation and control tasks.
Hypercube architectures are introduced. The reasons behind their becoming the first widespread commercial massively parallel processors are outlined. A classification for image pattern recognition is proposed and char...