In this paper a parallel implementation of an Adaptive Generalized Predictive Control (AGPC) algorithm is presented. Since the AGPC algorithm needs to be fed with knowledge of the plant transfer function, the parallelization of a standard Recursive Least Squares (RLS) estimator and a GPC predictor is discussed here. Also, since a matrix inversion operation is required in the GPC predictor algorithm, special attention is given to its parallelization. A small DSP network with up to 2 processors is used to investigate the performance of the parallel implementation. To exploit a heterogeneous architecture, the parallel algorithm is mapped onto a network built of transputers as communication elements and DSPs as computing elements. Furthermore, some heterogeneous topologies are compared. Execution times and efficiency results of the RLS and GPC steps are presented to show the performance of the parallel algorithm over different topologies.
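The parameter-estimation step that feeds the AGPC loop can be illustrated with a minimal sequential RLS sketch. The plant model, forgetting factor, and all variable names below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def rls_update(theta, P, phi, y, lam=0.99):
    """One Recursive Least Squares step with forgetting factor lam.

    theta : current parameter estimate, shape (n,)
    P     : current covariance matrix, shape (n, n)
    phi   : regressor vector, shape (n,)
    y     : new plant output sample
    """
    Pphi = P @ phi
    k = Pphi / (lam + phi @ Pphi)      # gain vector
    e = y - phi @ theta                # a priori prediction error
    theta = theta + k * e              # parameter update
    P = (P - np.outer(k, Pphi)) / lam  # covariance update
    return theta, P

# Identify y[t] = a*y[t-1] + b*u[t-1] from simulated (noiseless) data.
a_true, b_true = 0.8, 0.5
theta = np.zeros(2)
P = 1e3 * np.eye(2)
y_prev, u_prev = 0.0, 1.0
rng = np.random.default_rng(0)
for t in range(200):
    u = rng.standard_normal()
    y = a_true * y_prev + b_true * u_prev
    theta, P = rls_update(theta, P, np.array([y_prev, u_prev]), y)
    y_prev, u_prev = y, u
```

The matrix-vector products in the gain and covariance updates are the operations a parallel RLS would distribute across processors.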
We study the question of whether parallelization in the exploration of the feasible set can be used to speed up convex optimization, in the local oracle model of computation and in the high-dimensional regime. We show that the answer is negative for both deterministic and randomized algorithms applied to essentially any of the interesting geometries and nonsmooth, weakly-smooth, or smooth objective functions. In particular, we show that it is not possible to obtain a polylogarithmic (in the sequential complexity of the problem) number of parallel rounds with a polynomial (in the dimension) number of queries per round. In the majority of these settings and when the dimension of the space is polynomial in the inverse target accuracy, our lower bounds match the oracle complexity of sequential convex optimization, up to at most a logarithmic factor in the dimension, which makes them (nearly) tight. Another conceptual contribution of our work is in providing a general and streamlined framework for proving lower bounds in the setting of parallel convex optimization. Prior to our work, lower bounds for parallel convex optimization algorithms were only known in a small fraction of the settings considered in this paper, mainly applying to Euclidean (l2) and l∞ spaces.
We compare standard parallel algorithms for solving linear parabolic partial differential equations. The comparison is based on the combined effect of their numerical properties and their parallel performance. We discuss the classical explicit methods (forward Euler, Heun and DuFort-Frankel), the standard implicit methods (BDF1, BDF2 and Crank-Nicolson), the line Hopscotch technique and the ADI formula of McKee and Mitchell. Timing results obtained on a 16-processor Intel hypercube are given. It is shown that parallelism does not alter the ranking of the methods unless the number of grid points per processor is very small.
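As a point of reference for the explicit methods compared above, here is a minimal sequential forward-Euler solver for the 1D heat equation; the model problem, grid, and stability threshold are standard illustrations, not taken from the paper:

```python
import numpy as np

def heat_forward_euler(u0, alpha, dx, dt, steps):
    """Explicit forward-Euler solution of u_t = alpha * u_xx with
    homogeneous Dirichlet boundaries.  The scheme is only stable for
    r = alpha*dt/dx**2 <= 1/2, which is why explicit methods need
    small time steps despite their easy parallelization."""
    u = u0.copy()
    r = alpha * dt / dx**2
    for _ in range(steps):
        u[1:-1] = u[1:-1] + r * (u[2:] - 2.0 * u[1:-1] + u[:-2])
    return u

# One Fourier mode on [0, 1]; exact solution exp(-pi^2 t) * sin(pi x).
x = np.linspace(0.0, 1.0, 51)
u0 = np.sin(np.pi * x)
u = heat_forward_euler(u0, alpha=1.0, dx=x[1] - x[0], dt=1e-4, steps=1000)
```

In a parallel setting each processor would own a contiguous slab of grid points and exchange the two boundary values with its neighbors every step.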
This paper shows a simple algorithm for solving the single function coarsest partition problem on the CRCW PRAM model of parallel computation using O(n) processors in O(log n) time with O(n^{1+ε}) space.
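To make the problem concrete, a naive sequential refinement loop can be sketched as follows; the paper's contribution is the O(log n)-time CRCW PRAM algorithm, while this loop (quadratic in the worst case) only shows what is being computed:

```python
def coarsest_partition(f, blocks):
    """Coarsest refinement of `blocks` compatible with the function f:
    two elements may share a block only if their images lie in the
    same block.  f is given as a list, f[x] = image of x."""
    block_of = {x: i for i, b in enumerate(blocks) for x in b}
    changed = True
    while changed:
        changed = False
        new_blocks = []
        for b in blocks:
            # Split each block by the block of the image of its elements.
            groups = {}
            for x in b:
                groups.setdefault(block_of[f[x]], []).append(x)
            if len(groups) > 1:
                changed = True
            new_blocks.extend(groups.values())
        blocks = new_blocks
        block_of = {x: i for i, b in enumerate(blocks) for x in b}
    return blocks
```

For example, with f = [1, 3, 2, 2] and initial partition [[0, 1, 2], [3]], the first pass separates 1 from {0, 2}, and the second pass then separates 0 from 2, cascading down to singletons.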
Program slicing is the process of deleting statements in a program that do not affect a given set of variables at a chosen point in the program. In this paper the parallel slicing algorithm is introduced. It is shown how the control flow graph of the program to be sliced is converted into a network of concurrent processes, thereby producing a parallel version of Weiser's original static slicing algorithm.
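The underlying notion of a slice can be sketched as backward reachability over a dependence graph. This sequential sketch only illustrates the slicing criterion, not the process-network algorithm the paper introduces; the toy program and its edge set are invented for the example:

```python
from collections import deque

def backward_slice(deps, criterion):
    """Return the set of statements that may affect `criterion`,
    computed as backward reachability over a dependence graph.
    deps[s] lists the statements s depends on (data or control)."""
    keep = set(criterion)
    work = deque(criterion)
    while work:
        s = work.popleft()
        for d in deps.get(s, ()):
            if d not in keep:
                keep.add(d)
                work.append(d)
    return keep

# Toy program (statement numbers):
#   1: x = input()
#   2: y = input()
#   3: z = x + 1
#   4: w = y * 2
#   5: print(z)
deps = {3: [1], 4: [2], 5: [3]}
```

Slicing on statement 5 keeps {1, 3, 5} and deletes statements 2 and 4, which cannot affect the printed value.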
In this work, we study, on a distributed memory architecture, the parallelization of Hansen's algorithm, an interval Branch-and-Bound algorithm for solving the continuous global optimization problem with inequality constraints. Since this algorithm is dynamic and irregular, we propose, in particular, parallel algorithms that balance the load with respect to both the quantity and the quality of boxes. Our proposed techniques are based on the "best-first" strategy and on the cyclic redistribution of the boxes. The numerical simulations are performed using the PROFIL/BIAS libraries [1,2] for computation and the MPI/C++ environment for communication.
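The sequential skeleton of such a best-first interval Branch-and-Bound might look as follows. The objective, its hand-written interval extension, and all names are illustrative assumptions; the constraint handling and the paper's parallel load-balancing layer are not shown:

```python
import heapq

def isq(a, b):
    """Interval square: enclosure of {x*x : a <= x <= b}."""
    lo = 0.0 if a <= 0.0 <= b else min(a * a, b * b)
    return lo, max(a * a, b * b)

def f_range(a, b):
    """Natural interval extension of f(x) = (x - 1)**2 + 0.5."""
    lo, hi = isq(a - 1.0, b - 1.0)
    return lo + 0.5, hi + 0.5

def interval_bb_minimize(a, b, tol=1e-8):
    """Best-first interval branch-and-bound: always bisect the box
    with the smallest lower bound; discard boxes whose lower bound
    exceeds the best upper bound found so far."""
    best_ub = f_range(a, b)[1]
    heap = [(f_range(a, b)[0], a, b)]
    while heap:
        lb, a, b = heapq.heappop(heap)
        if b - a < tol:
            return lb, (a + b) / 2          # enclosure of the minimum
        if lb > best_ub:
            continue                        # pruned box
        m = (a + b) / 2
        for lo_, hi_ in ((a, m), (m, b)):
            l, u = f_range(lo_, hi_)
            best_ub = min(best_ub, u)       # any box's sup is an upper bound
            if l <= best_ub:
                heapq.heappush(heap, (l, lo_, hi_))
    return best_ub, None
```

The heap of pending boxes is exactly the shared work pool that the paper's cyclic redistribution keeps balanced, by both the number of boxes (quantity) and their lower bounds (quality).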
We present a parallel algorithm for the Concave Least Weight Subsequence (CLWS) problem that exhibits the following work-time trade-off: Given a parameter p, the algorithm runs in time using p processors. By a known reduction of the Huffman Tree problem to the CLWS problem, we obtain the same complexity bounds for the Huffman Tree problem. However, as we show, for the latter problem there exists a simpler (and, in fact, slightly better) algorithm that exhibits a similar trade-off: Namely, for a given parameter p, p≥1, the algorithm runs in time using p processors.
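For reference, the standard sequential Huffman construction (not the paper's parallel trade-off algorithm) can be written with a binary heap; the frequency table used below is an arbitrary example:

```python
import heapq
from itertools import count

def huffman_code_lengths(freqs):
    """Standard O(n log n) sequential Huffman construction.
    Returns the code length assigned to each symbol.
    freqs maps symbol -> weight."""
    tiebreak = count()  # keeps heap entries comparable
    heap = [(w, next(tiebreak), {s: 0}) for s, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; every symbol in them
        # moves one level deeper.
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        merged = {s: l + 1 for s, l in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]
```

The repeated extract-min/insert on a single heap is the serial bottleneck that the parallel CLWS-based and direct algorithms trade off against processor count.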
Decision trees have been found very effective for classification especially in data mining. Although classification is a well studied problem, most of the current classification algorithms need an in-memory data struc...
We propose a new parallel algorithm for computing the sign function of a matrix. The algorithm is based on the Padé approximation of a certain hypergeometric function which in turn leads to a rational function approximation to the sign function. Parallelism is achieved by developing a partial fraction expansion of the rational function approximation, since each fraction can be evaluated on a separate processor in parallel. For the sign function the partial fraction expansion is numerically attractive since the roots and the weights are known analytically and can be computed very accurately. We also present experimental results obtained on a Cray Y-MP.
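A common sequential baseline for the matrix sign function is the Newton iteration; the sketch below uses it only as a reference point, since the paper's method instead sums a partial-fraction expansion of a Padé approximant, one fraction per processor:

```python
import numpy as np

def matrix_sign(A, tol=1e-12, max_iter=100):
    """Newton iteration X <- (X + X^{-1}) / 2, which converges
    quadratically to sign(A) for matrices with no eigenvalues on the
    imaginary axis.  This is the classic sequential method, not the
    parallel partial-fraction scheme described in the abstract."""
    X = np.asarray(A, dtype=float)
    for _ in range(max_iter):
        X_new = 0.5 * (X + np.linalg.inv(X))
        if np.linalg.norm(X_new - X, 1) < tol * np.linalg.norm(X_new, 1):
            return X_new
        X = X_new
    return X
```

Each Newton step needs a full matrix inversion on one data set, whereas the partial-fraction form evaluates its independent fractions concurrently and only sums the results, which is why it maps well onto multiple processors.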
In the past several years, domain decomposition has been a very popular topic, motivated by the ease of parallelization. However, the question of whether it is better than parallelizing some standard sequential methods has seldom been directly addressed.