A parallel computation approach to two-dimensional shape recognition is proposed and illustrated. The approach uses parallel techniques for contour extraction, parallel computation of normalized contour-based feature ...
详细信息
A parallel computation approach to two-dimensional shape recognition is proposed and illustrated. The approach uses parallel techniques for contour extraction, parallel computation of normalized contour-based feature strings independent of scale and orientation, and parallel string-matching algorithms. The string matching can be applied in a manner independent of rotation. An implementation on the exclusive read, exclusive write parallel random access memory (EREW PRAM) architecture is discussed, but it can be adapted to other parallelarchitectures. An illustrated example is presented.
The array constructs of Fortran 90 (formerly called Fortran 8×) map naturally onto SIMD (single-instruction-stream, multiple-data-stream) architectures that support a data parallelprogramming style, such as that...
详细信息
ISBN:
(纸本)0818620536
The array constructs of Fortran 90 (formerly called Fortran 8×) map naturally onto SIMD (single-instruction-stream, multiple-data-stream) architectures that support a data parallelprogramming style, such as that of the Connection Machine computer system. The FORALL statement, an extension to Fortran 90 allowing for the expression of simultaneous execution of certain DO loop bodies, enhances this natural fit. A Fortran 90 compiler for data parallel machines is extended naturally to handle the FORALL statement. The data structures and algorithms used to effect this extension are described, and some examples of source code fragments and the target operations generated for them are presented.
We present new techniques for mapping computations onto hypercubes. Our methods speed up classical implementations of grid and tree communications by a factor of Θ(n), where n is the number of hypercube dimensions. T...
详细信息
ISBN:
(纸本)0897913701
We present new techniques for mapping computations onto hypercubes. Our methods speed up classical implementations of grid and tree communications by a factor of Θ(n), where n is the number of hypercube dimensions. The speed-ups are the best possible. We obtain these speed-ups by mapping each edge of the guest graph onto short, edge-disjoint paths in the hypercube. These multiple-path embeddings can be used to reduce communication time for large grid-based scientific computations, to increase tolerance to link faults, and for fast routing of large messages. We also develop a general technique for deriving multiple-path embeddings from multiple-copy embeddings. Multiple-copy embeddings are one-to-one maps of independent copies of the guest graph within the We present an efficient multiple-copy embedding of the cube-connected-cycles network within the hypercube. This embedding is used to derive efficient multiple-path embeddings of trees and butterfly networks in hypercubes.
Let P be a simple rectilinear convex polygon of size O(n) inside which lie n pairwise disjoint rectangular rectilinear obstacles. We provide parallel techniques for computing rectilinear shortest paths that avoid the ...
详细信息
ISBN:
(纸本)0897913701
Let P be a simple rectilinear convex polygon of size O(n) inside which lie n pairwise disjoint rectangular rectilinear obstacles. We provide parallel techniques for computing rectilinear shortest paths that avoid the set of obstacles in P. Specifically, we compute descriptions of shortest paths in O(log2 n) time, with O(n2/log2n) processors in the CREWPRAM model if source and destination are on the boundary of P, with O(n2/log n) processors if the source is an obstacle vertex and the destination a vertex of P, and with O(n2) processors if both source and destination are obstacle vertices. The descriptions we compute enable one processor to obtain the path length of any query pair of vertices in constant time, or O(n/log n) processors to retrieve the shortest path itself in logarithmic time. If the two query points are arbitrary rather than vertices, then one processor takes O(log n) time (instead of constant time) for finding the path length. A number of other related shortest paths problems are solved. The techniques we use involve a fast computation of separator staircases and, most importantly, a scheme for partitioning the obstacles' boundaries into families in a way that ensures that the resulting path length matrices have a monotonicity property that is apparently absent before applying our partitioning scheme.
A parallel algorithm for the implementation of rete networks on a parallel array processor is proposed. Although special-purpose architectures have been proposed for directly implementing production systems, the avail...
详细信息
A parallel algorithm for the implementation of rete networks on a parallel array processor is proposed. Although special-purpose architectures have been proposed for directly implementing production systems, the availability of such machines is very limited. Proposed implementation algorithms, which take advantage of more widely available parallel machines, are more practical. Such an approach was taken in a previous work where a number of workstations connected via a local area network were considered. Here an SIMD (single instruction, multiple data) array processor is used. The rete network is functionally divided into two subnetworks: the discrimination network and the join network. It is the join network where most of the processing load is carried out. The operation of the join network is the main objective of the SIMD implementation. The proposed algorithm emphasizes data parallelism as opposed to functional parallelism. In this type of parallelism, the data are partitioned into sets of approximately equal size and distributed among several processors. The data are then operated upon simultaneously by similar operators. The underlying architecture is described.
A vectorized algorithm for entering data into a hash table is presented. A program that enters multiple data could not be executed on vector processors by conventional vectorization techniques because of data dependen...
详细信息
A vectorized algorithm for entering data into a hash table is presented. A program that enters multiple data could not be executed on vector processors by conventional vectorization techniques because of data dependences. The proposed method enables execution of multiple data entry by conventional vector processors and improves the performance by a factor of 12.7, compared with the normal sequential method, when 4099 pieces of data are entered on the Hitachi S-810. This method is applied to address calculation sorting and the distribution counting sort, whose main part was unvectorizable by previous techniques. It improves performance by a factor of 12.8 when n = 214 on the S-810.
A modification of the classic Kung-Robinson timestamp-based concurrency control algorithm is described. The algorithm is based on two innovative techniques: query killing notes and weak serializability of transactions...
详细信息
A modification of the classic Kung-Robinson timestamp-based concurrency control algorithm is described. The algorithm is based on two innovative techniques: query killing notes and weak serializability of transactions. In particular, it prefers long transactions over short queries and thus reduces considerably the number of transaction rollbacks required. In order to test the validity and evaluate the performance of the proposed algorithm, a simulation program was written and run using a realistic set of transactions. The simulation was performed using Flat Concurrent Prolog (FCP). The advantages of FCP for specifying and implementing parallel algorithms include its refined granularity of parallelism, its declarativeness and conciseness, and its powerful communication and synchronization primitives. Results of algorithm performance are presented.
This conference proceedings contains 200 papers. The topics covered are neural networks and signal processing;heterojunction devices for high-speed signal processing, software prototypes in silicon compilation researc...
详细信息
This conference proceedings contains 200 papers. The topics covered are neural networks and signal processing;heterojunction devices for high-speed signal processing, software prototypes in silicon compilation research, parallel algorithms and architectures for DSP applications, topics in graph theory and engineering networks, topics in fault analysis, filter theory, telecommunication circuits, switched capacitor networks, graph theory and applications, DSP hardware systems, application of artificial neural networks to power systems, digital implementations of neural networks, analog implementations of neural networks multidimensional systems, VLSI for digital video signal processing, GaAs analog integrated circuits and system applications, adoptive systems, and solid state circuits and active filters.
Traditionally, a programmer's role has been reduced to designing computer vision algorithms and coding them in a machine independent language since the knowledge of the architecture has been implicit in the compil...
详细信息
暂无评论