Multiselection is the problem of selecting multiple elements at specified ranks from a set of arbitrary elements. In this paper, we first present an efficient algorithm for single-element selection that runs in $O(\sqrt{p} + (n/p) \log p \log(kp/n))$ time for selecting the $k$th smallest element from $n$ elements on a $\sqrt{p} \times \sqrt{p}$ mesh-connected computer of $p \le n$ processors, where the first component is for communication and the second is for computation (data comparisons). Our algorithm is more computationally efficient than the existing result when $p \ge n^{1/2+\epsilon}$ for any $0 < \epsilon < 1/2$. Combining our result for $p = \Omega(\sqrt{n})$ with the existing result for $p = O(\sqrt{n})$ yields an improved computation time complexity for the selection problem on the mesh: $t_{\mathrm{comp}}^{\mathrm{sel}} = O(\min\{(n/p) \log p \log(kp/n),\ (n/p + p) \log(n/p)\})$. Using this algorithm as a building block, we then present two efficient parallel algorithms for multiselection on mesh-connected computers. For selecting $r$ elements from a set of $n$ elements on a $\sqrt{p} \times \sqrt{p}$ mesh, $p, r \le n$, our first algorithm runs in time $O(p^{1/2} + t_{\mathrm{comp}}^{\mathrm{sel}} \min\{r \log r, \log p\})$ with processors operating in the SIMD mode, which is time-optimal when $p \le r$. Allowing processors to operate in the MIMD mode, our second algorithm runs in $O(p^{1/2} + t_{\mathrm{comp}}^{\mathrm{sel}} \log r)$ time and is time-optimal for any $r$ and $p$.
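To make the multiselection problem concrete, the following is a minimal sequential sketch in Python. It is not the mesh algorithm of the paper: it assumes a single processor and uses a randomized quickselect-style partition that serves all requested ranks in one recursion. The function name `multiselect` and the dictionary return format are illustrative choices only.

```python
import random


def multiselect(items, ranks):
    """Return {k: k-th smallest element} for each valid 1-based rank k."""
    result = {}

    def solve(values, wanted, offset):
        # values: remaining candidates
        # wanted: global 1-based ranks that fall inside `values`
        # offset: number of elements known to be smaller than all of `values`
        if not wanted or not values:
            return
        pivot = random.choice(values)
        less = [v for v in values if v < pivot]
        equal = [v for v in values if v == pivot]
        greater = [v for v in values if v > pivot]
        lo = offset + len(less)   # ranks <= lo are answered inside `less`
        hi = lo + len(equal)      # ranks in (lo, hi] are equal to the pivot
        for k in wanted:
            if lo < k <= hi:
                result[k] = pivot
        solve(less, [k for k in wanted if k <= lo], offset)
        solve(greater, [k for k in wanted if k > hi], hi)

    n = len(items)
    valid = sorted({k for k in ranks if 1 <= k <= n})
    solve(list(items), valid, 0)
    return result


if __name__ == "__main__":
    print(multiselect([7, 2, 9, 4, 1, 8], [1, 3, 6]))
```

Sharing one partition step among all requested ranks is what distinguishes multiselection from running a single-element selection once per rank.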
This paper presents an efficient and highly scalable parallel version of the Modified RBF Shepard's method presented in [5]. This method maintains the "metric" nature and the advantages of Shepard's method and, at the same time, improves its accuracy by exploiting the characteristics of flexibility and accuracy which have made the radial basis functions a well-established tool for multivariate interpolation. Due to its locality, this method can be easily and efficiently parallelized on a distributed memory parallel architecture. The performance of the parallel algorithm has been studied theoretically and the experimental results obtained by running its implementation on a Cray T3E parallel machine, using the MPI interface, confirm the theoretical efficiency.
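As background, the classical Shepard's method (inverse-distance weighting) can be sketched as follows. This is not the Modified RBF Shepard's method of the paper, whose local RBF interpolants are not described in the abstract; the function name `shepard_interpolate` and the default exponent `power=2.0` are assumptions made for illustration.

```python
import math


def shepard_interpolate(nodes, values, x, power=2.0):
    """Classical Shepard interpolant at point x (inverse-distance weights)."""
    weighted = []
    for node, value in zip(nodes, values):
        d = math.dist(node, x)
        if d == 0.0:
            # The interpolant reproduces the data exactly at a node.
            return value
        weighted.append((1.0 / d ** power, value))
    total = sum(w for w, _ in weighted)
    return sum(w * v for w, v in weighted) / total


if __name__ == "__main__":
    nodes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    values = [0.0, 2.0, 4.0]
    print(shepard_interpolate(nodes, values, (0.5, 0.0)))
```

In this sketch each evaluation point is computed independently of every other evaluation point, so a set of evaluation points can be divided among processors without communication.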
To improve video coding efficiency, sub-pixel motion estimation (ME) is used extensively in the existing video coding *** quarter pixel ME is one of the high complexity tools in H.264/*** this paper, a parallel quarter block motion estimation algorithm that not only accelerates the process of sub-pixel motion estimation but also maintains the accuracy of the original algorithm is *** Intel P4 CPU, the SIMD (single instruction, multiple data) technique is commonly used to provide an execution *** implementation of this algorithm using parallel processing on the P4 platform is *** proposed algorithm satisfies in particular the requirements of low-rate real-time video *** results show that the optimized video encoder is more than 13.5 times faster than the original reference software while approximately preserving the latter's accuracy.
ISBN: (print) 7506251817
Finding all roots of a polynomial is a basic and important task in numerical and algebraic computation. We develop a method in which only simple roots need to be computed, whether the roots are real or complex. First, a parallel algorithm for computing the greatest common factor of two polynomials is put forward; second, a standard decomposition of a polynomial is given; lastly, a method for converting the complex roots of a polynomial into the real roots of another polynomial is obtained.
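One standard way to reduce root finding to simple roots is to divide a polynomial by the greatest common factor of itself and its derivative. The sequential Python sketch below illustrates that reduction with exact rational arithmetic. It is an assumption that this matches the spirit of the decomposition mentioned above; the paper's parallel GCD algorithm is not reproduced, and all function names are illustrative. Coefficients are listed from the highest degree down, and the input is assumed to be a nonzero polynomial.

```python
from fractions import Fraction


def trim(p):
    """Drop leading zero coefficients, keeping at least one entry."""
    i = 0
    while i < len(p) - 1 and p[i] == 0:
        i += 1
    return p[i:]


def poly_divmod(a, b):
    """Long division of a by b (b must be nonzero); returns (quotient, remainder)."""
    a = trim([Fraction(c) for c in a])
    b = trim([Fraction(c) for c in b])
    quotient = []
    while len(a) >= len(b):
        factor = a[0] / b[0]
        quotient.append(factor)
        for i, c in enumerate(b):
            a[i] -= factor * c
        a.pop(0)  # the leading coefficient has just been cancelled
    return quotient or [Fraction(0)], trim(a or [Fraction(0)])


def poly_gcd(a, b):
    """Monic greatest common factor via the Euclidean algorithm."""
    a = trim([Fraction(c) for c in a])
    b = trim([Fraction(c) for c in b])
    while any(b):
        _, r = poly_divmod(a, b)
        a, b = b, r
    return [c / a[0] for c in a]


def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])] or [Fraction(0)]


def square_free_part(p):
    """Polynomial with the same roots as p, each of multiplicity one."""
    g = poly_gcd(p, derivative(p))
    q, _ = poly_divmod(p, g)
    return q


if __name__ == "__main__":
    # (x - 1)^2 (x + 2) = x^3 - 3x + 2; expected simple-root part: x^2 + x - 2
    print(square_free_part([1, 0, -3, 2]))
```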
A parallel algorithm for solving large linear systems with block-diagonal structure is presented. This algorithm is based on the "gradient-simplex" method. It partitions a large linear system into several small linear subsystems so that they can be solved in parallel. The algorithm has the merit of high speed and is suitable for large linear systems with few coupling constraints. The efficiency and applicability of the method are also analyzed.
By correcting the Clenshaw and Forsyth algorithms for deriving Chebyshev polynomials, two parallel algorithms for deriving common orthogonal polynomials are presented. Their stabilities are analyzed by discussing the rounding-off errors of the two algorithm matrices.
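For orientation, the standard sequential Clenshaw recurrence for evaluating a Chebyshev series is sketched below in Python. This is the textbook recurrence only; the corrected and parallel variants discussed in the abstract are not specified there and are not reproduced. The function name `clenshaw_chebyshev` is an illustrative assumption.

```python
import math


def clenshaw_chebyshev(coeffs, x):
    """Evaluate sum_k coeffs[k] * T_k(x) with the Clenshaw recurrence."""
    if not coeffs:
        return 0.0
    b1 = b2 = 0.0
    # b_k = c_k + 2x * b_{k+1} - b_{k+2}, run from k = n down to k = 1
    for c in reversed(coeffs[1:]):
        b1, b2 = c + 2.0 * x * b1 - b2, b1
    return coeffs[0] + x * b1 - b2


if __name__ == "__main__":
    x = 0.3
    coeffs = [1.0, 2.0, 3.0]
    # Reference value from T_k(x) = cos(k * arccos(x)) on [-1, 1]
    direct = sum(c * math.cos(k * math.acos(x)) for k, c in enumerate(coeffs))
    print(clenshaw_chebyshev(coeffs, x), direct)
```

The recurrence is a chain of dependent steps, and rounding errors can accumulate along that chain, which is the kind of stability question the abstract refers to.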
This paper presents a new parallel image matching algorithm based on the concept of the entropy feature vector and suited to SIMD computers. Compared with other algorithms, it has the following advantages: (1) The spatial information of an image is appropriately introduced into the definition of image entropy. (2) A large number of multiplication operations are eliminated, which speeds up the algorithm. (3) The shortcoming of having to perform a global calculation first is overcome. It is concluded that the algorithm has very good locality and is suitable for parallel processing.
A numerical method of solution which by its nature provides approximate solutions for tridiagonal systems with Toeplitz coefficient matrices is extended in two ways. First, the coefficient matrix is perturbed and replaced by a banded near-Toeplitz matrix. Then the algorithm is modified to allow for parallel processing. The perturbations are then taken into account in two separate correction processes. The convergence of the modified algorithm is examined. (C) 2001 Elsevier Science Inc. All rights reserved.
In order to speed up the process of transforming GRID into TIN, this paper presents a parallel distributed algorithm. The algorithm includes the following steps: (1) The host partitions the GRID into sub-GRIDs according to the number of processors and distributes them to all processors involved; (2) Each processor transforms its own sub-GRID into a TIN and returns the result; (3) The host collects all results and links them into a complete TIN. The method solves the edge problem of the linkage, and the output data structures are fully compatible with the TIN data structures of ARC/INFO.
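The three steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm: the GRID is a regular lattice of `n_rows` by `n_cols` points, every cell is split into two triangles along one diagonal, worker threads stand in for distributed processors, and vertices carry global indices so that neighbouring strips link without extra edge handling. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_cell_rows(n_rows, n_workers):
    """Step 1: the host partitions the cell rows into contiguous strips."""
    cell_rows = max(0, n_rows - 1)
    n_workers = max(1, min(n_workers, cell_rows))
    base, extra = divmod(cell_rows, n_workers)
    strips, start = [], 0
    for w in range(n_workers):
        stop = start + base + (1 if w < extra else 0)
        strips.append((start, stop))
        start = stop
    return strips


def triangulate_strip(strip, n_cols):
    """Step 2: one worker turns its sub-GRID into triangles."""
    start, stop = strip
    triangles = []
    for r in range(start, stop):
        for c in range(n_cols - 1):
            a = r * n_cols + c   # top-left vertex (global index)
            b = a + 1            # top-right
            d = a + n_cols       # bottom-left
            e = d + 1            # bottom-right
            triangles.append((a, b, e))
            triangles.append((a, e, d))
    return triangles


def grid_to_tin(n_rows, n_cols, n_workers=4):
    """Step 3: the host collects the partial results into one TIN."""
    strips = split_cell_rows(n_rows, n_workers)
    with ThreadPoolExecutor(max_workers=len(strips)) as pool:
        parts = list(pool.map(lambda s: triangulate_strip(s, n_cols), strips))
    return [t for part in parts for t in part]


if __name__ == "__main__":
    tin = grid_to_tin(3, 3, n_workers=2)
    print(len(tin), tin[:2])
```

Because the strips are collected in order, the merged triangle list does not depend on the number of workers in this sketch.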
We consider the discretization in time of an inhomogeneous parabolic equation in a Banach space setting, using a representation of the solution as an integral along a smooth curve in the complex left half-plane which, after transformation to a finite interval, is then evaluated to high accuracy by a quadrature rule. This reduces the problem to a finite set of elliptic equations with complex coefficients, which may be solved in parallel. The paper is a further development of earlier work by the authors, where we treated the homogeneous equation in a Hilbert space framework. Special attention is given here to the treatment of the forcing term. The method is combined with finite-element discretization in spatial variables.