A tabu search based approach is studied as a method for solving in parallel the two-dimensional irregular cutting problem. We use and compare different variants of the method and various parallel computing systems. The systems used are based on the message-passing or shared-memory paradigm, and parallel algorithms using both methods of communication are proposed. The efficiency of computer system utilization is discussed in the context of unpredictable time requirements of parallel tasks. We present results for different variants of the method together with efficiency measures for parallel implementations, where IBM SP2 and CRAY T3E systems, respectively, have been used.
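The abstract gives no pseudocode, so as context for what a tabu search variant looks like, here is a minimal serial skeleton; the neighborhood, cost function, and toy problem are illustrative stand-ins, not the paper's cutting-problem formulation.

```python
# Minimal tabu-search skeleton. The short-term memory (the "tabu list")
# forbids revisiting recent solutions, letting the search escape local minima.
from collections import deque

def tabu_search(start, neighbors, cost, tenure=5, iters=100):
    best = current = start
    tabu = deque(maxlen=tenure)          # short-term memory of recent solutions
    for _ in range(iters):
        candidates = [n for n in neighbors(current) if n not in tabu]
        if not candidates:
            break
        current = min(candidates, key=cost)   # best admissible move, even uphill
        tabu.append(current)
        if cost(current) < cost(best):
            best = current
    return best

# Toy usage: minimize (x - 3)^2 over the integers, stepping by +/-1.
result = tabu_search(10, lambda x: [x - 1, x + 1], lambda x: (x - 3) ** 2)
```

A parallel variant would evaluate the candidate moves (or run several such searches with different tabu tenures) on separate processors, which is the kind of design the paper compares on message-passing and shared-memory systems.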
We study the question of whether parallelization in the exploration of the feasible set can be used to speed up convex optimization, in the local oracle model of computation and in the high-dimensional regime. We show that the answer is negative for both deterministic and randomized algorithms applied to essentially any of the interesting geometries and nonsmooth, weakly-smooth, or smooth objective functions. In particular, we show that it is not possible to obtain a polylogarithmic (in the sequential complexity of the problem) number of parallel rounds with a polynomial (in the dimension) number of queries per round. In the majority of these settings and when the dimension of the space is polynomial in the inverse target accuracy, our lower bounds match the oracle complexity of sequential convex optimization, up to at most a logarithmic factor in the dimension, which makes them (nearly) tight. Another conceptual contribution of our work is in providing a general and streamlined framework for proving lower bounds in the setting of parallel convex optimization. Prior to our work, lower bounds for parallel convex optimization algorithms were only known in a small fraction of the settings considered in this paper, mainly applying to Euclidean (l2) and l∞ spaces.
We compare standard parallel algorithms for solving linear parabolic partial differential equations. The comparison is based on the combined effect of their numerical properties and their parallel performance. We discuss the classical explicit methods (forward Euler, Heun and DuFort-Frankel), the standard implicit methods (BDF1, BDF2 and Crank-Nicolson), the line Hopscotch technique and the ADI formula of McKee and Mitchell. Timing results obtained on a 16-processor Intel hypercube are given. It is shown that parallelism does not alter the ranking of the methods unless the number of grid points per processor is very small.
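As a reference point for the simplest method in the comparison, here is a serial sketch of forward (explicit) Euler for the 1-D heat equation u_t = u_xx with zero Dirichlet boundaries; the grid, time step, and stability bound dt <= dx^2/2 are the textbook choices, not values from the paper.

```python
# Forward Euler for u_t = u_xx on [0, 1] with u = 0 at both boundaries.
def heat_forward_euler(u, dt, dx, steps):
    r = dt / dx**2                      # must satisfy r <= 0.5 for stability
    for _ in range(steps):
        u = [0.0] + [u[i] + r * (u[i+1] - 2*u[i] + u[i-1])
                     for i in range(1, len(u) - 1)] + [0.0]
    return u

n = 11
dx = 1.0 / (n - 1)
u0 = [0.0] * n
u0[n // 2] = 1.0                        # unit spike in the middle
u = heat_forward_euler(u0, dt=0.4 * dx**2, dx=dx, steps=50)
```

The stencil update at each grid point uses only its two neighbors, which is why explicit methods parallelize so naturally: each processor owns a strip of grid points and exchanges only boundary values per step.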
This paper presents a simple algorithm for solving the single-function coarsest partition problem on the CRCW PRAM model of parallel computation using O(n) processors in O(log n) time with O(n^(1+ε)) space.
Program slicing is the process of deleting statements in a program that do not affect a given set of variables at a chosen point in the program. In this paper the parallel slicing algorithm is introduced. It is shown how the control flow graph of the program to be sliced is converted into a network of concurrent processes, thereby producing a parallel version of Weiser's original static slicing algorithm.
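To make the slicing idea concrete, here is a minimal backward slice over a straight-line toy program; the paper's CFG-as-process-network is collapsed here into one sequential backward pass over def/use sets, which suffices only for code without branches.

```python
# Backward static slice on straight-line code. Each statement is a pair
# (defs, uses) of variable-name sets; we keep a statement iff it defines a
# variable that is (transitively) needed by the slicing criterion.
def backward_slice(stmts, criterion_vars):
    relevant = set(criterion_vars)      # variables still needed, walking backwards
    keep = set()
    for i in reversed(range(len(stmts))):
        defs, uses = stmts[i]
        if defs & relevant:
            keep.add(i)
            relevant = (relevant - defs) | uses
    return keep

# x = 1; y = 2; z = x + 1  -> slicing on z keeps lines 0 and 2, drops y's.
prog = [({"x"}, set()), ({"y"}, set()), ({"z"}, {"x"})]
slice_lines = backward_slice(prog, {"z"})
```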
We present a parallel algorithm for the Concave Least Weight Subsequence (CLWS) problem that exhibits the following work-time trade-off: given a parameter p, the algorithm runs in time using p processors. By a known reduction of the Huffman Tree problem to the CLWS problem, we obtain the same complexity bounds for the Huffman Tree problem. However, as we show, for the latter problem there exists a simpler (and, in fact, slightly better) algorithm that exhibits a similar trade-off: namely, for a given parameter p, p≥1, the algorithm runs in time using p processors.
Decision trees have been found very effective for classification especially in data mining. Although classification is a well studied problem, most of the current classification algorithms need an in-memory data struc...
We propose a new parallel algorithm for computing the sign function of a matrix. The algorithm is based on the Padé approximation of a certain hypergeometric function, which in turn leads to a rational function approximation to the sign function. Parallelism is achieved by developing a partial fraction expansion of the rational function approximation, since each fraction can be evaluated on a separate processor in parallel. For the sign function the partial fraction expansion is numerically attractive since the roots and the weights are known analytically and can be computed very accurately. We also present experimental results obtained on a Cray Y-MP.
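The paper's partial-fraction Padé scheme needs the analytic roots and weights, which the abstract does not give; as a simpler illustration of the matrix sign function itself, here is the classical Newton iteration X ← (X + X⁻¹)/2 (a different, well-known scheme for the same function), written in pure Python for 2×2 matrices.

```python
# sign(A) maps every eigenvalue with positive real part to +1 and every
# eigenvalue with negative real part to -1. The Newton iteration
# X <- (X + X^-1)/2 converges quadratically to sign(A).
def inv2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def sign2(m, iters=50):
    x = [row[:] for row in m]
    for _ in range(iters):
        xi = inv2(x)
        x = [[(x[i][j] + xi[i][j]) / 2 for j in range(2)] for i in range(2)]
    return x

# A matrix with eigenvalues 2 and -3: its sign function is diag(1, -1).
s = sign2([[2.0, 0.0], [0.0, -3.0]])
```

The partial-fraction form in the paper replaces this inherently sequential iteration with a fixed set of independent linear solves, one per fraction, which is what exposes the parallelism.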
ISBN (print): 9781479941162
Reconfigurable models were shown to be very powerful in solving many problems faster than non-reconfigurable models. WECPAR W(M,N,k) is an M × N reconfigurable model that has point-to-point reconfigurable interconnection with k wires between neighboring processors. This paper studies several aspects of WECPAR. We first solve the list ranking problem on WECPAR. Among the results obtained, ranking one element in a list of N elements can be done on a W(N,N,N) WECPAR in O(1) time. Also, on W(N,N,k), ranking a list L(N) of N elements can be done in O((log N)⌈log_{k+1} N⌉) time. To transfer a large body of algorithms to work on WECPAR and to assess its relative computational power, several simulation algorithms are introduced between WECPAR and well-known models such as PRAM and RMBM. These simulations show that a PRIORITY CRCW PRAM of N processors and S shared memory locations can be simulated by a W(S,N,k) WECPAR in O(⌈log_{k+1} N⌉ + ⌈log_{k+1} S⌉) time. Also, we show that a PRIORITY CRCW Basic-RMBM(P,B), of P processors and B buses, can be simulated by a W(B, P+B, k) WECPAR in O(⌈log_{k+1}(P+B)⌉) time. This has the effect of migrating a large number of algorithms to work directly on WECPAR with the simulation overhead.
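For readers unfamiliar with list ranking, here is the standard pointer-jumping formulation of the problem the paper solves on WECPAR; this serial loop mimics what the processors would do synchronously in parallel, so the while-loop body corresponds to one parallel round and the number of rounds is O(log N).

```python
# List ranking by pointer jumping: each element repeatedly doubles its jump,
# halving its distance to the tail every round.
def list_rank(succ):
    """succ[i] is the successor of element i, or i itself at the tail.
    Returns each element's distance to the tail."""
    rank = [0 if succ[i] == i else 1 for i in range(len(succ))]
    nxt = succ[:]
    while any(nxt[i] != nxt[nxt[i]] for i in range(len(nxt))):
        rank = [rank[i] + rank[nxt[i]] for i in range(len(nxt))]   # one round
        nxt = [nxt[nxt[i]] for i in range(len(nxt))]               # jump twice as far
    return rank

# List 0 -> 1 -> 2 -> 3 (element 3 is the tail).
ranks = list_rank([1, 2, 3, 3])
```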
In this work, we study, on a distributed-memory architecture, the parallelization of Hansen's algorithm, an interval Branch-and-Bound algorithm for solving the continuous global optimization problem with inequality constraints. Since this algorithm is dynamic and irregular, we propose, in particular, parallel algorithms that balance the load with respect to both the quantity and the quality of boxes. Our proposed techniques are based on the criterion of the "best-first strategy" and on the cyclic redistribution of the boxes. The numerical simulations are performed using the PROFIL/BIAS libraries [1,2] for computation and the MPI/C++ environment for communication.
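To illustrate the "best-first strategy" the paper distributes across processors, here is a serial 1-D toy branch-and-bound; it uses a crude Lipschitz lower bound on each box rather than the interval arithmetic of PROFIL/BIAS, and the objective, bounds, and tolerance are invented for the example.

```python
# Best-first branch-and-bound: minimize f on [lo, hi] using the lower bound
# f(mid) - L*width/2, valid when |f'| <= L on the box. The priority queue
# always expands the box with the smallest lower bound ("best-first").
import heapq

def bb_minimize(f, lo, hi, lipschitz, tol=1e-6):
    best = min(f(lo), f(hi))                                  # incumbent value
    heap = [(f((lo + hi) / 2) - lipschitz * (hi - lo) / 2, lo, hi)]
    while heap:
        lb, a, b = heapq.heappop(heap)
        if lb > best - tol:
            continue                    # box cannot improve the incumbent: prune
        m = (a + b) / 2
        best = min(best, f(m))
        for c, d in ((a, m), (m, b)):   # bisect the box and push both halves
            mid = (c + d) / 2
            heapq.heappush(heap, (f(mid) - lipschitz * (d - c) / 2, c, d))
    return best

# Toy: minimize (x - 1)^2 on [-2, 4]; |f'| <= 6 there, so L = 6 is valid.
val = bb_minimize(lambda x: (x - 1) ** 2, -2.0, 4.0, lipschitz=6.0)
```

In the parallel setting the single heap becomes per-processor heaps, and the load-balancing question the paper addresses is precisely how to redistribute boxes so that every processor keeps working on boxes of good quality (small lower bound), not just a large quantity of them.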