Embedding one parallel architecture into another is important in parallel processing because parallel architectures vary widely. Given a pyramid architecture with $(4^N-1)/3$ nodes and height $N$, this paper presents a mapping method that embeds the pyramid into a $2^{N-1-k} \times 2^{N-1-k} \times (4^{k+1}+2)/3$ mesh for $0 \le k \le N-1$. The method has dilation $\max\{4^k, 2^{N-2-k}\}$ and expansion $1+2/4^{k+1}$. Setting $k=(N-2)/3$ embeds the pyramid into a $2^{(2N-1)/3} \times 2^{(2N-1)/3} \times (4^{(N+1)/3}+2)/3$ mesh with dilation $4^{(N-2)/3}$ and expansion $1+2/4^{(N+1)/3}$. This result has optimal expansion when $N$ is sufficiently large and is superior to previous mapping methods on the same measures.
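The quantities in this abstract can be checked numerically. The sketch below only evaluates the stated formulas; it does not construct the mapping itself, which the abstract does not describe. The function names are illustrative, and expansion is taken in its usual sense as the ratio of host (mesh) nodes to guest (pyramid) nodes.

```python
from fractions import Fraction


def pyramid_nodes(n):
    """Number of nodes in a pyramid of height n: (4^n - 1) / 3."""
    return (4 ** n - 1) // 3


def mesh_dims(n, k):
    """Mesh dimensions 2^(n-1-k) x 2^(n-1-k) x (4^(k+1) + 2) / 3."""
    if not 0 <= k <= n - 1:
        raise ValueError("k must satisfy 0 <= k <= n - 1")
    side = 2 ** (n - 1 - k)
    depth = (4 ** (k + 1) + 2) // 3  # 4^m + 2 is always divisible by 3
    return side, side, depth


def dilation(n, k):
    """Dilation stated in the abstract: max{4^k, 2^(n-2-k)}."""
    return max(4 ** k, 2 ** (n - 2 - k))


def exact_expansion(n, k):
    """Ratio of mesh nodes to pyramid nodes, as an exact fraction."""
    a, b, c = mesh_dims(n, k)
    return Fraction(a * b * c, pyramid_nodes(n))


def stated_expansion(k):
    """Expansion as stated in the abstract: 1 + 2 / 4^(k+1)."""
    return 1 + Fraction(2, 4 ** (k + 1))


if __name__ == "__main__":
    n, k = 5, 1  # k = (n - 2) / 3, the balanced choice from the abstract
    print(pyramid_nodes(n), mesh_dims(n, k), dilation(n, k))
    print(float(exact_expansion(n, k)), float(stated_expansion(k)))
```

For $N=5$, $k=1$ the pyramid has 341 nodes and the mesh is $8 \times 8 \times 6$ (384 nodes), so the exact node ratio 384/341 is close to the stated $1+2/4^2 = 1.125$; the two agree more closely as $N$ grows.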
This paper describes a number of different coarse-grain GAs, including various migration strategies and connectivity schemes to address the premature convergence problem. These approaches are evaluated on a graph partitioning problem. Our experiments showed, first, that the sequential GAs used are not as effective as parallel GAs for this graph partitioning problem. Second, for coarse-grain GAs, the results indicate that using a large number of nodes and exchanging individuals asynchronously among them is very effective. Third, GAs that exchange solutions based on population similarity instead of a fixed connection topology get better results without any degradation in speed. Finally, we propose a new coarse-grain GA architecture, the Injection Island GA (iiGA). Preliminary results show iiGAs to be a promising new approach to coarse-grain GAs.
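To make "migration strategy" and "connectivity scheme" concrete, here is a minimal sketch of one migration step between islands on a fixed ring topology. It is an illustrative assumption, not the paper's method: the abstract favors asynchronous exchange and similarity-based partner selection, whereas this sketch is synchronous and uses a fixed ring. The name `migrate_ring` and the replace-worst policy are hypothetical.

```python
def migrate_ring(islands, fitness):
    """One synchronous migration step on a ring of islands.

    Each island sends a copy of its best individual to the next island,
    where it replaces that island's worst individual. Higher fitness is
    better. Returns new populations; the inputs are not modified.
    """
    # Pick emigrants first so every island sends its pre-migration best.
    best = [max(pop, key=fitness) for pop in islands]
    new_islands = []
    for i, pop in enumerate(islands):
        incoming = best[(i - 1) % len(islands)]  # from the ring predecessor
        worst_idx = min(range(len(pop)), key=lambda j: fitness(pop[j]))
        new_pop = list(pop)
        new_pop[worst_idx] = incoming
        new_islands.append(new_pop)
    return new_islands


if __name__ == "__main__":
    # Toy populations of integers whose fitness is their own value.
    islands = [[1, 5], [2, 3], [0, 4]]
    print(migrate_ring(islands, fitness=lambda x: x))
```

In a full coarse-grain GA this step would run every few generations between otherwise independent evolution on each island; changing which island index supplies `incoming` changes the connectivity scheme.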
Parallel implementations of two computer vision algorithms on distributed cluster platforms are described. The first algorithm is a square-error data clustering method whose parallel implementation is based on the well-known sequential CLUSTER program. The second algorithm is a motion parameter estimation algorithm used to determine correspondence between two images taken of the same scene. Both algorithms have been implemented and tested on cluster platforms using the PVM package. Performance measurements demonstrate that it is possible to attain good performance in terms of execution time and speedup for large-scale problems, provided that adequate memory, swap space, and I/O capacity are available at each node.
ISBN (print): 081864222X
Non-linear filters have been used in many signal processing applications, for example, to obtain optimum signal extraction or detection in the presence of random noise. The weighted median filter (WMF), of which the standard median is a special case, is a novel non-linear technique designed for 2D image processing. A major advantage of the WMF is its flexibility in design to deal with a wide variety of properties. This paper describes a commonly used class W(4,4,1) of the WMF. As with most non-linear methods, the computational demands of this technique are high and require a non-trivial number of "expensive" operations. A data parallel approach for efficient implementation of the WMF is described and implemented on two architecturally dissimilar supercomputers, the Convex C3840 and the Connection Machine CM-200. An analysis of the performance obtained from these two high performance parallel platforms is presented.
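As a rough illustration of the general idea, a weighted median repeats each sample in the window as many times as its integer weight and then takes the ordinary median. The sketch below assumes a 3x3 window with hypothetical weights; the abstract does not define the W(4,4,1) class, so no attempt is made to reproduce it. With all weights equal to 1 the filter reduces to the standard median, as the abstract notes.

```python
def weighted_median(values, weights):
    """Median of the multiset in which values[i] occurs weights[i] times."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    expanded = sorted(v for v, w in zip(values, weights) for _ in range(w))
    if len(expanded) % 2 == 0:
        raise ValueError("total weight must be odd for a unique median")
    return expanded[len(expanded) // 2]


def wmf_3x3(image, weights):
    """Apply a 3x3 weighted median filter to a 2D list of pixel values.

    Border pixels are copied unchanged (a simplifying assumption).
    Every output pixel depends only on the input image, so the loop
    body could run for all pixels at once on a data parallel machine.
    """
    rows, cols = len(image), len(image[0])
    flat_w = [w for wrow in weights for w in wrow]
    out = [row[:] for row in image]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = [image[r + dr][c + dc]
                      for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r][c] = weighted_median(window, flat_w)
    return out


if __name__ == "__main__":
    noisy = [[10, 10, 10], [10, 255, 10], [10, 10, 10]]
    unit = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
    print(wmf_3x3(noisy, unit)[1][1])
```

Raising the center weight makes the filter more conservative: with a center weight of 9 and unit weights elsewhere, the center sample holds a majority of the total weight and is always kept.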
ISBN (print): 0819412813
This work investigates the application of evolutionary programming for automatically configuring neural network architectures for pattern classification tasks. The evolutionary programming search procedure implements a parallel nonlinear regression technique and represents a powerful method for evaluating a multitude of neural network model hypotheses. The evolutionary programming search is augmented with the Solis & Wets random optimization method, thereby maintaining the integrity of the stochastic search while taking into account empirical information about the response surface. A network architecture is proposed which is motivated by the structures generated in projection pursuit regression and the cascade-correlation learning architecture. Results are given for the 3-bit parity, normally distributed data, and T-C classifier problems.
ISBN (print): 0819411361
As a time-sequential and Bayesian front-end for image sequence processing, we consider the square root information (SRI) realization of the Kalman filter. The computational complexity of the filter due to the dimension of the problem (the size of the state vector is on the order of the number of pixels in the image frame) is decreased drastically using a reduced-order approximation exploiting the natural spatial locality in the random field specifications. The actual computation for the reduced-order SRI filter is performed by an iterative and distributed algorithm for the unitary transformation steps, providing a potentially faster alternative to the common QR factorization-based methods. For the space-time estimation problems, near-optimal solutions can be obtained in a small number of iterations (e.g., fewer than 10), and each iteration can be performed in a finely parallel manner over the image frame, an attractive feature for a dedicated hardware implementation.
Describes an approach to edge detection particularly suited for implementation on distributed-memory massively parallel MIMD machines. One of the main tasks of this work is the identification of an optimal edge threshold, i.e., the value of the luminance gradient that allows actual edge pixels to be identified. This identification is done with a local approach, in which the image is partitioned a priori into small square windows and the optimal threshold is selected by ranking the outputs produced by several thresholds inside each window. The innovative contributions of this work are twofold: partitioning the image into suitably small windows maximizes the probability of having only one edge chain in each window (thus enhancing the effectiveness of the optimal threshold selection criterion), and it ensures the scalability of the application (due to the high number of simple processing tasks into which the algorithm is subdivided).
Nonlinear filters have been used in many signal processing applications, for example, to obtain optimum signal extraction or detection in the presence of random noise. The weighted median filter (WMF), of which the standard median is a special case, is a novel nonlinear technique designed for 2D image processing. A major advantage of the WMF is its flexibility in design to deal with a wide variety of properties. This paper describes a commonly used class W(4,4,1) of the WMF. As with most nonlinear methods, the computational demands of this technique are high and require a non-trivial number of "expensive" operations. A data parallel approach for efficient implementation of the WMF is described and implemented on two architecturally dissimilar supercomputers, the Convex C3840 and the Connection Machine CM-200. An analysis of the performance obtained from these two high performance parallel platforms is presented.
Data parallel visual reconstruction and partitioning algorithms and the associated code are developed for a vector random access machine (V-RAM). Finite element algorithms are constructed for solving the one-dimensional visual reconstruction problem with the input data consisting of a symmetrical top hat loading for the modeling of interacting step discontinuities. The advantage of the V-RAM implementation is the general code applicability to a variety of architectures. A specific implementation is performed on a Distributed Array Processor (DAP) simulator on the VAX 6000–420. Execution times on the DAP simulator are obtained and are found to be in agreement with the algorithmic complexities of the V-RAM code.
A distributed arithmetic implementation for two-dimensional FIR digital filters for real-time image processing is presented. Parallelism and pipelining are two features of the proposed filter structure that contribute to its high-speed performance. Speed performance and hardware complexity are evaluated, and the effects of finite-precision arithmetic are considered.
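Distributed arithmetic in general replaces the multiplications of a fixed-coefficient inner product with table lookups and shift-accumulate steps over the bit planes of the inputs. The sketch below shows that general technique in software for unsigned integer samples and integer coefficients; it is not the paper's filter structure, and the names `build_lut` and `da_dot` are hypothetical. One output of a 2D FIR filter is the inner product of the flattened coefficient mask with the flattened image window, so the same routine applies per output pixel.

```python
def build_lut(coeffs):
    """Precompute the sum of coefficients selected by each bit pattern.

    Entry addr holds sum(coeffs[i] for every i whose bit is set in addr).
    The table has 2**len(coeffs) entries, so it suits short filters only.
    """
    n = len(coeffs)
    return [sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
            for addr in range(1 << n)]


def da_dot(samples, lut, bits):
    """Inner product of coefficients and unsigned samples, multiplier-free.

    For each bit plane b, the b-th bits of all samples form a table
    address; the looked-up partial sum is weighted by 2**b and added.
    Samples must be non-negative and fit in the given number of bits.
    """
    acc = 0
    for b in range(bits):
        addr = 0
        for i, x in enumerate(samples):
            addr |= ((x >> b) & 1) << i
        acc += lut[addr] << b  # shift-accumulate instead of multiply
    return acc


if __name__ == "__main__":
    coeffs = [1, 2, -1, 3]   # a flattened 2x2 mask (illustrative values)
    window = [5, 3, 7, 2]    # a flattened 2x2 window of 3-bit pixels
    lut = build_lut(coeffs)
    print(da_dot(window, lut, bits=3))
    print(sum(c * x for c, x in zip(coeffs, window)))
```

In hardware the bit planes can be processed in a pipeline and several tables can be read in parallel, which is where the speed advantage comes from; the software loop here only demonstrates that the arithmetic is equivalent to the direct sum of products.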