This paper parallelizes the embedding strategy for mapping any two-dimensional grid into its optimal hypercube with minimal dilation. The parallelization allows each hypercube node to independently determine, in constant time, which grid node it will simulate and the communication paths it will take to reach the hypercube nodes that simulate its grid-neighbors. The paths between grid-neighbors are chosen so as to curb the congestion at each hypercube node and across each hypercube edge. Explicitly, the node congestion for our embedding is at most 6 (one above optimal), while the edge congestion is at most 5.
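The abstract does not spell out the embedding itself. As a rough illustration of a node computing its own role locally, the sketch below uses the classic reflected Gray code embedding, which applies only when both grid sides are powers of two and gives dilation 1. It is a swapped-in textbook technique, not the paper's construction for arbitrary grids, and the function names (`gray`, `inverse_gray`, `grid_to_cube`, `cube_to_grid`) are hypothetical.

```python
def gray(i: int) -> int:
    """Reflected binary Gray code of i."""
    return i ^ (i >> 1)


def inverse_gray(g: int) -> int:
    """Recover i from its Gray code g by XOR-ing all right shifts of g."""
    i = 0
    while g:
        i ^= g
        g >>= 1
    return i


def grid_to_cube(row: int, col: int, col_bits: int) -> int:
    """Hypercube label for grid node (row, col).

    Assumes the grid has 2**col_bits columns and a power-of-two number of rows.
    """
    return (gray(row) << col_bits) | gray(col)


def cube_to_grid(node: int, col_bits: int) -> tuple[int, int]:
    """Grid node simulated by a hypercube node, computed from its own label."""
    mask = (1 << col_bits) - 1
    return inverse_gray(node >> col_bits), inverse_gray(node & mask)
```

Because consecutive Gray codes differ in exactly one bit, grid-neighbors land on hypercube nodes joined by a single edge, so no routing paths are needed in this special case. The general case treated in the paper is what forces dilation above 1 and makes congestion a concern.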
This article concerns computing the largest (or smallest) eigenvalue and corresponding eigenvector of a large irreducible symmetric tridiagonal matrix. We divide the original extreme eigenpair problem into p subproblems by partitioning the matrix in p-block form. An inverse extreme eigenpair problem associated with this divide method is considered: find a (p - 1)-dimensional vector a such that the largest eigenvalue λ of a given symmetric tridiagonal matrix T is the common largest eigenvalue of p small matrices T_i(a), which are the diagonal blocks T_i of T updated with rank-one or rank-two corrections. We prove that the inverse problem is solvable and has a unique positive solution, and that the eigenvector of each T_i(a) for the positive solution a forms a section of the eigenvector of T corresponding to its largest eigenvalue. An iterative algorithm is proposed for solving the inverse problem; each iteration requires solving a system of nonlinear equations. It is proved that the iterative scheme converges for any positive initial vector and that its convergence is asymptotically quadratic. The new method can be used to compute the extreme eigenpairs of T and is well suited for parallel implementation. Numerical tests show the efficiency of the new parallelizable algorithm.
We show that the classical elimination procedure can be simply extended to uncouple partitioned tridiagonal systems for parallel processing of their solution. In each block of equations, we now need two simultaneous eliminations: one usual forward elimination and one backward elimination from across the succeeding block. Significantly, unlike Wang's method [6], ours is a one-stage elimination procedure, at the end of which the core system is reached. Once the core system is solved, the uncoupled subsystems are solved in parallel by back substitution.
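For reference, the classical elimination procedure that the abstract extends can be sketched sequentially as follows. This is the standard single-block forward elimination and back substitution (often called the Thomas algorithm), not the partitioned scheme of the paper. The function name `thomas_solve` and the array layout are assumptions: all four lists have length n, with `lower[0]` and `upper[n-1]` unused, and no pivoting is done, so the matrix is assumed to be diagonally dominant or otherwise safe to eliminate in order.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system by forward elimination and back substitution.

    Row i reads: lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i].
    """
    n = len(diag)
    c_prime = [0.0] * n  # scaled super-diagonal after elimination
    d_prime = [0.0] * n  # scaled right-hand side after elimination

    # Forward elimination: remove the sub-diagonal row by row.
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom

    # Back substitution: recover the unknowns from the last row upward.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x
```

The forward sweep is inherently sequential, which is the dependency the partitioned approach breaks by eliminating within blocks independently and coupling them only through a small core system.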
Parallel versions of the prestack Kirchhoff 3D integral migration algorithm, which is suitable for seismic data processing, are described in this paper. First, the inherent parallel characteristics of seismic data processing are analyzed. Then some principles of algorithm partitioning are discussed. Based on these analyses, the system architecture, and the communication mechanism, the algorithm is divided into four subtasks allocated to four nodes of the 990 STAR-l. We then describe in detail a module-partitioning method: I/O processing and communication are separated from the computation process, with the process handling I/O and communication allocated to the T805 transputer and the computation process allocated to the i860 processor. These two processes are synchronized through shared memory and a memory-lock mechanism, while communication between different nodes is implemented through the transputer links. Load balancing among the four processor modules is performed dynamically. Finally, we discuss the speed-up of the parallel versions of the prestack Kirchhoff 3D integral migration algorithm running on four nodes. Some directions for further research are also mentioned.
Given a run-length coded text of length 2n and a run-length coded pattern of length 2m, where m is typically much smaller than n, this paper first presents an O(n+m) time sequential algorithm for string matching, and then presents an O(1) time parallel algorithm on a two-dimensional m × n mesh with a reconfigurable bus system.
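To make the problem concrete, the sketch below matches directly on the run-length codes without decoding them. It is a naive scan that can take O(nm) time in the worst case, not the O(n+m) algorithm of the paper. It assumes runs are stored as (character, length) pairs with adjacent runs holding different characters; `rle_encode` and `rle_find` are hypothetical names. The key observation is that the interior runs of the pattern must equal text runs exactly, while the first and last pattern runs only need to fit inside text runs of the same character.

```python
def rle_encode(s):
    """Run-length encode a string into a list of (char, length) pairs."""
    runs = []
    for ch in s:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs


def rle_find(text_runs, pat_runs):
    """Return the character offsets of all occurrences of the pattern."""
    m = len(pat_runs)
    if m == 0:
        return []
    first_ch, first_len = pat_runs[0]
    last_ch, last_len = pat_runs[-1]
    hits = []
    offset = 0  # character offset at which the current text run starts
    for i, (ch, length) in enumerate(text_runs):
        if ch == first_ch and length >= first_len:
            if m == 1:
                # A single-run pattern fits at several offsets inside one run.
                hits.extend(range(offset, offset + length - first_len + 1))
            elif i + m <= len(text_runs):
                t_last_ch, t_last_len = text_runs[i + m - 1]
                if (text_runs[i + 1:i + m - 1] == pat_runs[1:-1]
                        and t_last_ch == last_ch and t_last_len >= last_len):
                    # The first pattern run must end where the text run ends.
                    hits.append(offset + length - first_len)
        offset += length
    return hits
```

For example, `rle_find(rle_encode("aaabbbaabbbb"), rle_encode("abb"))` reports the offsets where "abb" begins in the decoded text.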
Two fast algorithms for median filtering of images using parallel computers having 2-D mesh interconnections are given. Both algorithms assume that an n × n image is loaded onto the mesh with one processing element per pixel. One algorithm performs median filtering over d × d neighborhoods in O(d^2) time and works with pixel values in an arbitrarily large range. This algorithm, while theoretically suboptimal, achieves a lower constant than a previously published asymptotically optimal algorithm and is simpler to program. The second algorithm assumes that the range of pixel values is limited and relatively small, and it accomplishes median filtering in O(d) time.
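As a point of reference for what the mesh algorithms compute, here is a minimal sequential sketch of median filtering over d × d neighborhoods. It is not either of the parallel algorithms described above. The function name `median_filter` is hypothetical, d is assumed odd, and pixels outside the image are handled by clamping coordinates to the nearest border pixel, which is an assumption since the abstract does not state a border policy.

```python
def median_filter(image, d):
    """Apply a d x d median filter to a 2-D list of pixel values."""
    if d % 2 != 1:
        raise ValueError("window size d must be odd")
    height, width = len(image), len(image[0])
    radius = d // 2
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Gather the window, clamping coordinates at the image border.
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            ]
            window.sort()
            out[y][x] = window[len(window) // 2]  # middle of d*d sorted values
    return out
```

Sorting each window costs O(d^2 log d) per pixel here; on the mesh, every pixel has its own processing element, so the per-pixel work is what determines the running time.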
Given nonintersecting simple polygons P and Q, two vertices p ∈ P and q ∈ Q are said to be visible if the segment pq does not properly intersect P or Q. We present a parallel algorithm for finding a closest pair among all visible pairs (p, q) with p ∈ P and q ∈ Q. The algorithm runs in O(log n) time using O(n) processors on a CREW PRAM, where n = |P| + |Q|. This algorithm can be implemented serially in O(n) time, which gives a new optimal sequential solution for this problem.
In this paper, we study the region growing approach to segmentation. We examine the effect of the merge criterion on region growing processing time, and propose a fast-merge policy that provides fast parallel segmentation. Fast-merge not only minimizes the number of merge rejections and results in fast region growing processes, but it also enhances the quality of segmentations. Our algorithm was tested on the BBN TC2000 shared memory multiprocessor system. It provides a well balanced load on the multiprocessors, and processing time is consistently fast on a wide range of images.
We consider the following three problems for a given set of n circular arcs: (1) the articulation points problem, (2) the bridges problem, and (3) the biconnected components problem. We first give several optimal O(n log n) time sequential algorithms for these problems when the endpoints of the arcs are unsorted. If the endpoints are presorted, the problems can be solved in O(n) time. The proposed algorithms are suitable for implementation on a parallel computation model. On the EREW PRAM model, our algorithms run in O(log n) time using O(n) processors for the unsorted case, or in O(log n) time using O(n/log n) processors for the sorted case. The proposed algorithms achieve optimal speed-up in both cases.
This paper presents an improved neural network for channel assignment problems in cellular mobile communication systems under the new co-channel interference model. Sengoku et al. first proposed a neural network for the same problem, which in our simulations can find solutions only in small cellular systems with up to 40 cells. For practical use in next-generation cellular systems, the performance of our improved neural network is verified on large cellular systems with up to 500 cells. The newly defined energy function and the motion equation with two heuristics in our neural network achieve the goal of finding optimum or near-optimum solutions in nearly constant time.