In this paper, a parallel algorithm for analyzing connected components in binary images is described. It is based on an extension of Cylindrical Algebraic Decomposition (CAD) to a two-dimensional (2D) discrete space. This extension allows us to find the number of connected components, to determine their connectivity degree, and to solve the visibility problem. The parallel implementation of the algorithm is outlined and its time/space complexity is given.
We present an efficient algorithm for computing the matching polynomial of a series-parallel graph in O(n^2) time. This algorithm improves on the previous result of O(n^3). We also present a cost-optimal parallel algorithm for computing the matching polynomial of a series-parallel graph using an EREW PRAM computer with the number of processors p less than n^2 / log n.
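For readers unfamiliar with the object being computed, the following is a minimal sketch of the matching polynomial via the textbook edge-deletion recurrence M(G) = M(G - e) + x * M(G - u - v). It is not the O(n^2) series-parallel algorithm of the abstract, it takes exponential time in general, and it assumes the generating-function convention in which the coefficient of x^k counts matchings with k edges. The function name `matching_polynomial` is hypothetical.

```python
def matching_polynomial(edges):
    """Return [m_0, m_1, ...], where m_k counts matchings with k edges.

    Assumption: generating-function convention (no alternating signs).
    Brute-force recurrence, for illustration on small graphs only.
    """
    edges = list(edges)
    if not edges:
        return [1]  # only the empty matching remains
    (u, v), rest = edges[0], edges[1:]
    # Matchings that do not use edge (u, v).
    without = matching_polynomial(rest)
    # Matchings that use (u, v): discard every edge touching u or v.
    remaining = [(a, b) for (a, b) in rest
                 if a not in (u, v) and b not in (u, v)]
    with_edge = [0] + matching_polynomial(remaining)  # multiply by x
    size = max(len(without), len(with_edge))
    without = without + [0] * (size - len(without))
    with_edge = with_edge + [0] * (size - len(with_edge))
    return [p + q for p, q in zip(without, with_edge)]


# A 4-cycle is a small series-parallel graph.
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(matching_polynomial(cycle4))  # [1, 4, 2]
```

The 4-cycle has one empty matching, four single-edge matchings, and two perfect matchings, which gives the coefficients shown.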
New parallel algorithms and comparative test results are given for solving triangular systems of linear equations on distributed-memory multiprocessors. These results supplement those given in a previous paper. All of the new algorithms are variations on the cyclic algorithms discussed previously. The new algorithms are shown to provide substantial performance improvements.
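As a point of reference for the problem being parallelized, here is a minimal serial sketch of forward substitution for a lower-triangular system L x = b. It is the sequential baseline only and does not reproduce the cyclic distributed-memory algorithms the abstract refers to. The name `forward_substitution` and the list-of-lists matrix layout are assumptions made for illustration.

```python
def forward_substitution(L, b):
    """Solve L x = b for lower-triangular L with a nonzero diagonal.

    Serial reference sketch; L is a list of rows (assumed layout).
    """
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        # Subtract the contribution of the already-computed unknowns.
        s = sum(L[i][j] * x[j] for j in range(i))
        x[i] = (b[i] - s) / L[i][i]
    return x


L = [[2.0, 0.0, 0.0],
     [1.0, 3.0, 0.0],
     [4.0, 5.0, 6.0]]
b = [2.0, 7.0, 32.0]
print(forward_substitution(L, b))  # [1.0, 2.0, 3.0]
```

Each x[i] depends on all earlier components, and this dependency chain is what makes triangular solves hard to parallelize efficiently.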
Motivation: We have witnessed an enormous increase in ChIP-Seq data for histone modifications in the past few years. Discovering significant patterns in these data is an important problem for understanding biological mechanisms. Results: We propose probabilistic partitioning methods to discover significant patterns in ChIP-Seq data. Our methods take into account signal magnitude, shape, strand orientation and shifts. We compare our methods with some current methods and demonstrate significant improvements, especially with sparse data. Besides pattern discovery and classification, probabilistic partitioning can serve other purposes in ChIP-Seq data analysis. Specifically, we exemplify its merits in the context of peak finding and partitioning of nucleosome positioning patterns in human promoters.
We present an efficient and scalable coarse grained multicomputer (CGM) coloring algorithm that colors a graph G with at most Δ + 1 colors, where Δ is the maximum degree in G. This algorithm is given in two variants: randomized and deterministic. We show that on a p-processor CGM model the proposed algorithms require a parallel time of O(|G|/p) and a total work and overall communication cost of O(|G|). These bounds correspond to the average case for the randomized version and to the worst case for the deterministic variant. (C) 2003 Elsevier B.V. All rights reserved.
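The Δ + 1 bound itself can be illustrated with the classical sequential greedy heuristic, sketched below. This is not the CGM algorithm of the abstract: it is a single-processor illustration of why Δ + 1 colors always suffice, since a vertex has at most Δ neighbors and so at least one of the colors 0..Δ is free. The function name `greedy_coloring` and the adjacency-dictionary input are hypothetical.

```python
def greedy_coloring(adj):
    """Color vertices with integers so that neighbors differ.

    adj maps each vertex to a list of its neighbors (assumed format).
    Uses at most (maximum degree + 1) distinct colors.
    """
    color = {}
    for v in adj:
        # Colors already taken by previously colored neighbors.
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c  # smallest color not used by a neighbor
    return color


# A 5-cycle has maximum degree 2, so at most 3 colors are needed.
cycle5 = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
coloring = greedy_coloring(cycle5)
print(coloring)
```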
Parallel algorithms for the solution of linear-quadratic optimal control problems are described. The algorithms are based on a straightforward decomposition of the domain of the problem, and are related to multiple shooting methods for two-point boundary value problems. Their arithmetic cost is approximately twice that of the serial dynamic programming approach; however, they have the advantage that they can be efficiently implemented on a wide variety of parallel architectures. Extension to the case in which there are box constraints on the controls is simple. The algorithms can be used to solve linear-quadratic subproblems arising from the application of Newton's method or two-metric gradient projection methods to nonlinear problems.
An efficient parallel implementation of an algorithm for recursive least squares computations based upon the covariance updating method has been developed. The target architecture is a distributed-memory multiprocessor, and test results on an Intel iPSC/2 hypercube demonstrate the parallel efficiency of the algorithm. A 64-node system is measured to execute the algorithm over 48 times as fast as a single processor for the largest problem that fits on a single node (fixed-size speedup). Moreover, the computation times increase only slightly with an increase in the number of processors when the problem size per processor remains constant. Applications include robust regression in statistics and modification of the Hessian matrix in optimization, but the primary motivation for this work is the need for fast recursive least squares computations in adaptive filtering methods in signal processing.
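To make the underlying computation concrete, the following is a minimal sketch of one textbook recursive least squares step in its covariance form, with no forgetting factor. It is a serial illustration under stated assumptions, not the parallel formulation evaluated on the iPSC/2. The name `rls_update`, the plain-list matrix layout, and the large initial value on the diagonal of P are all choices made for this example.

```python
def rls_update(w, P, x, y):
    """One recursive least squares step (no forgetting factor).

    w: current weights, P: symmetric covariance-like matrix,
    x: new regressor, y: new observation. Returns (w_new, P_new).
    """
    n = len(w)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(x[i] * Px[i] for i in range(n))
    k = [v / denom for v in Px]                      # gain vector
    err = y - sum(x[i] * w[i] for i in range(n))     # a priori error
    w_new = [w[i] + k[i] * err for i in range(n)]
    # P is symmetric, so x^T P equals (P x)^T.
    P_new = [[P[i][j] - k[i] * Px[j] for j in range(n)] for i in range(n)]
    return w_new, P_new


# Recover y = 2*a + 3*b from noise-free samples (illustrative data).
w = [0.0, 0.0]
P = [[1e6, 0.0], [0.0, 1e6]]  # large initial P: weak prior on w
for x, y in [([1.0, 0.0], 2.0), ([0.0, 1.0], 3.0),
             ([1.0, 1.0], 5.0), ([2.0, 1.0], 7.0)]:
    w, P = rls_update(w, P, x, y)
print(w)  # close to [2.0, 3.0]
```

Each step costs O(n^2) operations for n weights, dominated by the matrix-vector product and the rank-one update of P.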
In this paper, a robust, iterative algorithm is presented for computing Karush–Kuhn–Tucker (KKT) points of nonlinear programs. This algorithm is a variation of the NE/SQP method for solving the nonlinear complementarity problem; it makes use of a merit function that combines the original objective function of the nonlinear program with a residual function of the KKT conditions formulated as a system of nonsmooth equations. Global convergence and a Q-quadratic rate of convergence of the algorithm are established under some standard constraint qualifications in nonlinear programming theory, but without the strict complementarity condition. Parallel implementations of the algorithm are discussed.
The control of robot manipulators is a complex problem because the differential equation that defines the system is nonlinear and coupled. In several control problems, such as linear quadratic optimal control, it is necessary to work with a first-order approximation of the nonlinear equations. To obtain this approximation, the Lagrange-Euler equations have been used, based on general theorems of mechanics. The linearization must be computed on-line, which is why parallel computing has been used. The parallel algorithms have been implemented on a low-cost parallel platform based on a PC plus a board with transputers.
In this paper, the author presents an optimised parallel implementation of a flexible maximum a-posteriori decoder for synchronisation error correcting codes, supporting a very wide range of code sizes and channel conditions. On mid-range GPUs the author demonstrates decoding speedups of more than two orders of magnitude over a central processing unit implementation of the same optimised algorithm, and more than an order of magnitude over the author's earlier GPU implementation. The prominent challenge is to maintain high parallelisation efficiency over a wide range of code sizes and channel conditions, and across different execution hardware. The author ensures this with a dynamic strategy for choosing parallel execution parameters at run-time. The author also presents a variant that trades off some decoding speed for a significantly reduced memory requirement, with no loss to the decoder's error correction performance. The increased throughput of the implementation and its ability to work with less memory allow larger codes and poorer channel conditions to be analysed, and make practical use of such codes more feasible.