This paper discusses the architecture and performance of a prototype switch for interconnecting IBM RISC System/6000 workstations. The paper describes the interconnection architecture and performance on a cluster of f...
详细信息
This paper discusses the architecture and performance of a prototype switch for interconnecting IBM RISC System/6000 workstations. The paper describes the interconnection architecture and performance on a cluster of four IBM RISC System 6000 model 340 workstations. It also describes the driver level software interface to the switch and the features incorporated to minimize communication overhead. The performance measurements cover communication latency and bandwidth. In addition, performance measurements of Express, a popular parallel-programming interface, are provided.< >
The author has implemented a set of computational physics codes on a network of IBM RS/6000 workstations used as a distributed parallel computer. He compares the performance of the codes on this network, using both st...
详细信息
The author has implemented a set of computational physics codes on a network of IBM RS/6000 workstations used as a distributed parallel computer. He compares the performance of the codes on this network, using both standard Ethernet connections and a fast prototype switch, and also on the nCUBE/2, a MIMD parallel computer. The algorithms used range from simple, local, and regular to complex, non-local, and irregular. He describes his experiences with the hardware, software and parallel languages used, and discusses ideas for making distributed parallel computing on workstation networks more easily usable for computational physicists.< >
The maximum clique problem is to find the maximum complete subgraph of a given graph G. A computation model for large-scale maximum clique problems is proposed and was tested. A parallel algorithm based on the maximum...
详细信息
ISBN:
(纸本)0780312813
The maximum clique problem is to find the maximum complete subgraph of a given graph G. A computation model for large-scale maximum clique problems is proposed and was tested. A parallel algorithm based on the maximum neural network which resembles the winner-takes-all circuit is introduced which solves large-scale problems in reasonable computation time that the best known algorithms cannot solve. The maximum clique problem is first formulated as an unconstrained quadratic zero-one programming problem and is solved by minimizing the weight summation over the same partition in a newly constructed graph.< >
The parallel implementation of the revised simplex algorithm (RSA) using eta-factorization holds the promise of significant improvement in the execution time by virtue of the existence of a high degree of parallelism ...
详细信息
The parallel implementation of the revised simplex algorithm (RSA) using eta-factorization holds the promise of significant improvement in the execution time by virtue of the existence of a high degree of parallelism in the computation within an iteration of the algorithm. However, the scheme employed to partition key data structures in a distributed memory parallel processor has a great impact on the achievable performance. The paper explores the trade-offs between block-row and block-column partitioning schemes for the matrix of constraint coefficients vis-a-vis the communication overheads and granularity of parallel computations. The results of an approximate analysis of the compute-communication balance are compared with measurements from practical implementation of the partitioning schemes on C-DAC's PARAM 8000 distributed memory parallel processor.< >
The local-network-based computer system, in which some workstations are connected is coming into practical use. Software development is, however, very difficult for end-users because the system has complicated problem...
详细信息
The local-network-based computer system, in which some workstations are connected is coming into practical use. Software development is, however, very difficult for end-users because the system has complicated problems such as load balancing, communication among processes on different workstations and so on. The authors propose a C-specific parallelizing compiler to solve these problems. The compiler adopts function call parallelization, and takes less communication overhead than DO loop parallelization.< >
We present a parallel algorithm for solving the dual transshipment problem. The traditional dual simplex method does not offer much scope for parallelization, because it moves from one basic feasible solution to anoth...
详细信息
We present a parallel algorithm for solving the dual transshipment problem. The traditional dual simplex method does not offer much scope for parallelization, because it moves from one basic feasible solution to another, performing one pivot operation at a time. We present a new method called modified network dual simplex method which uses concurrent pivots. This departure from the traditional LP approach raises several issues such as the need to convert a non-basic feasible solution to a basic feasible solution. We present our strategies to handle these issues as well as the corresponding parallel algorithms. We also present results of testing this algorithm on large graphs to solve the integrated layout compaction and wire balancing problem.< >
In object oriented databases, objects are connected to each other by various kinds of relationships and form a general directed graph. The idea of leveled clustering is utilized to take advantage of the relationships ...
详细信息
In object oriented databases, objects are connected to each other by various kinds of relationships and form a general directed graph. The idea of leveled clustering is utilized to take advantage of the relationships and solve the problem of object allocation and storage in a distributed object oriented database. Both nonreplicative and replicative algorithms are developed to allocate objects to different sites and organize them in the storage of the local databases.< >
A highly sophisticated robot controller with enhanced performance is very efficacious for laboratory use. The programming of the task in the robot task space and integration of external sensors becomes more important ...
详细信息
A highly sophisticated robot controller with enhanced performance is very efficacious for laboratory use. The programming of the task in the robot task space and integration of external sensors becomes more important to the user. Available commercial robot controllers are 'black boxes' and they are unable to be modified. The authors' multiprocessor laboratory robot controller is an open system and with transputers as central processors, they achieved great processing power and very fast communication. In this way, parallel computing became possible and execution times were reduced to 1 ms. An open system of this kind allows the use of various control algorithms, the described joint acceleration controller in Cartesian space being one of them, as well as the integration of external sensors.< >
Lee's (1961) maze-routing algorithm has been a popular method for routing wires in VLSI circuits. It can also be applied to a variety of other problems, such as robot path planning. Although the algorithm is simpl...
详细信息
Lee's (1961) maze-routing algorithm has been a popular method for routing wires in VLSI circuits. It can also be applied to a variety of other problems, such as robot path planning. Although the algorithm is simple and easy to implement, its computation time can be quite high. Therefore, it is a very attractive candidate for implementation on parallel systems. The major issue in parallelizing this algorithm is mapping the grid space of the problem to the processor space. The communication cost and processor utilization can be greatly affected by the mapping strategy used. Won and Sahni (1987) have studied a class of mapping strategies for Lee's algorithm and analyzed their performance. The authors propose two new mapping strategies. First, they modify Won and Sahni's mapping algorithm by using the concept of mirror images to allow higher processor utilization while reducing the number of boundary cells. The new algorithm is shown to be better than the original one in an obstacle-free grid space. Then, they propose a dynamic mapping algorithm. This new mapping algorithm is shown to give an optimal mapping in an obstacle-free grid space. Also, they performed simulation to study the relative performance of these mapping algorithms for grid spaces with obstacles. The results show that the new algorithms are substantially faster than the earlier ones.< >
Nonlinear filters have been used in many signal processing applications, for example, to obtain optimum signal extraction or detection in the presence of random noise. The weighted median filter (WMF), of which the st...
详细信息
Nonlinear filters have been used in many signal processing applications, for example, to obtain optimum signal extraction or detection in the presence of random noise. The weighted median filter (WMF), of which the standard median is a special case, is a novel nonlinear technique designed for 2D image processing. A major advantage of the WMF is its flexibility in design to deal with a wide variety of properties. This paper describes a commonly used class W(4,4,1) of the WMF. AS with most nonlinear methods, the computational demands of this technique are high and require a non-trivial number of "expensive" operations. A data parallel approach for efficient implementation of the WMF is described and implemented on two architecturally dissimilar supercomputers, the Convex C3840 and the Connection Machine CM-200. An analysis of the performance obtained from these two high performance parallel platforms is presented.< >
暂无评论