ISBN: 9783959770927 (print)
We show that many classical optimization problems, such as $(1 \pm \epsilon)$-approximate maximum flow, shortest path, and transshipment, can be computed in $\tau_{\mathrm{mix}}(G) \cdot n^{o(1)}$ rounds of distributed message passing, where $\tau_{\mathrm{mix}}(G)$ is the mixing time of the network graph $G$. This extends the result of Ghaffari et al. [PODC'17], whose main result is a distributed MST algorithm in $\tau_{\mathrm{mix}}(G) \cdot 2^{O(\sqrt{\log n \log\log n})}$ rounds in the CONGEST model, to a much wider class of optimization problems. For many practical networks of interest, e.g., peer-to-peer or overlay network structures, the mixing time $\tau_{\mathrm{mix}}(G)$ is small, e.g., polylogarithmic. On these networks, our algorithms bypass the $\tilde{\Omega}(\sqrt{n} + D)$ lower bound of Das Sarma et al. [STOC'11], which holds for worst-case graphs and applies to all of the above optimization problems. For all of the problems except MST, this is the first distributed algorithm that takes $o(\sqrt{n})$ rounds on a (nontrivial) restricted class of network graphs. Towards deriving these improved distributed algorithms, our main contribution is a general transformation that simulates any work-efficient PRAM algorithm running in $T$ parallel rounds via a distributed algorithm running in $T \cdot \tau_{\mathrm{mix}}(G) \cdot 2^{O(\sqrt{\log n})}$ rounds. Work- and time-efficient parallel algorithms for all of the aforementioned problems follow by combining the work of Sherman [FOCS'13, SODA'17] and Peng and Spielman [STOC'14]. Thus, simulating these parallel algorithms using our transformation framework produces the desired distributed algorithms. The core technical component of our transformation is the algorithmic problem of solving multi-commodity routing (roughly, routing $n$ packets, each from a given source to a given destination) in random graphs. For this problem, we obtain a new algorithm running in $2^{O(\sqrt{\log n})}$ rounds, improving on the $2^{O(\sqrt{\log n \log\log n})}$-round algorithm of Ghaffari, Kuhn, and Su [PODC'17]. As a consequence, for the MST problem in particular, we obtain an improved distributed algorithm running ...
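As a brief worked step, the following shows why the simulation bound is consistent with the headline $\tau_{\mathrm{mix}}(G) \cdot n^{o(1)}$ round count. It reads the simulation overhead as $2^{O(\sqrt{\log n})}$ and assumes, as an illustration only, that the simulated PRAM algorithm runs in $T = n^{o(1)}$ parallel rounds; that value of $T$ is an assumption made here, not a figure stated in the abstract.

```latex
\begin{align*}
2^{O(\sqrt{\log n})}
  &= \left(2^{\log n}\right)^{O(1/\sqrt{\log n})}
   = n^{O(1/\sqrt{\log n})}
   = n^{o(1)}, \\
T \cdot \tau_{\mathrm{mix}}(G) \cdot 2^{O(\sqrt{\log n})}
  &= n^{o(1)} \cdot \tau_{\mathrm{mix}}(G) \cdot n^{o(1)}
   = \tau_{\mathrm{mix}}(G) \cdot n^{o(1)}.
\end{align*}
```

By contrast, an overhead of $2^{O(\log n)} = n^{O(1)}$ would be polynomial in $n$ and could not yield an $n^{o(1)}$ factor.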
We have presented in a unified way optimal parallel algorithms for the unweighted versions of the MIS, MCC, and MDS problems on circular-arc graphs using greedy methods. It would be interesting to investigate whether o...
In this paper, we present parallel algorithms for the coarse grained multicomputer (CGM) and the bulk synchronous parallel computer (BSP) for solving two well-known graph problems: (1) determining whether a graph G is...
Parallel processing developed because a single processor cannot solve large-scale problems in a reasonable amount of time. In this paper, we introduce one such problem and discuss the implementation of two parallel algorithms applied to the linear approximations. This study illustrates that an approximation method with a faster rate of convergence does not necessarily produce the best solution time.
The paper considers the problem of constructing a planar orthogonal grid drawing (or more simply, layout) of an n-vertex graph, with the goal of minimizing the number of bends along the edges. It exhibits graphs that ...
ISBN: 9780897917285 (print)
The parallelization of the dynamic programming algorithm for the integral knapsack problem is approached from several perspectives. Two of them proceed by dividing the set of objects, while a third one proceeds by partitioning the set of capacities. Furthermore, we propose a new sequential algorithm and its parallelization by reducing the integral knapsack problem to a maximum path problem. The theoretical complexity analysis of the algorithms proves that for all the algorithms the product of the number of processors by the parallel time equals the corresponding sequential time. Computational results are presented both for transputer networks using occam and LAN using PVM. Although for many cases the best running times are obtained for the LAN, the speedup and the scalability are better for the transputer network.
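For orientation, here is a minimal sequential sketch of the kind of dynamic program the abstract refers to. It assumes that the "integral knapsack problem" means the variant in which each object may be taken any nonnegative integer number of times; the function name `integral_knapsack` and its parameters are illustrative and are not taken from the paper, and none of the paper's parallelizations is reproduced here.

```python
def integral_knapsack(weights, profits, capacity):
    """Maximum profit when each object may be used any integer number of times.

    Assumption: weights are positive integers and capacity is a
    nonnegative integer. best[c] holds the optimal profit achievable
    with total weight at most c.
    """
    best = [0] * (capacity + 1)
    for c in range(1, capacity + 1):        # stages indexed by capacity
        for w, p in zip(weights, profits):  # try each object as the last one added
            if w <= c:
                best[c] = max(best[c], best[c - w] + p)
    return best[capacity]


if __name__ == "__main__":
    # Three objects with weights 3, 4, 5 and profits 4, 5, 7.
    print(integral_knapsack([3, 4, 5], [4, 5, 7], 10))
```

The two nested loops give a sequential time proportional to the number of objects times the capacity, which is the quantity that the processor-time product of a parallel version would be compared against.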
A parallel system to solve complex computational problems involves multiple instruction, simultaneous flows, communication structures, synchronisation and competition conditions between processes, as well as mapping an...
ISBN: 9780897916714 (print)
The queue-read, queue-write (QRQW) PRAM model [GMR94] permits concurrent reading and writing, but at a cost proportional to the number of readers/writers to a memory location in a given step. The QRQW model reflects the contention properties of most parallel machines more accurately than either the well-studied CRCW or EREW models: the CRCW model does not adequately penalize algorithms with high contention to shared memory locations, while the EREW model is too strict in its insistence on zero contention at each step. Of primary practical and theoretical interest, then, is the design of fast and efficient QRQW algorithms for problems for which all previous algorithms either suffer from high contention, fail to be fast, or fail to be efficient. This paper describes low-contention, fast, work-optimal QRQW PRAM algorithms for the fundamental problems of finding a random permutation, parallel hashing, load balancing, and sorting. There is no known fast, work-optimal EREW algorithm for finding a random permutation or for parallel hashing. For load balancing, we improve upon the EREW result whenever the ratio of the maximum to the average load is not too large. We show that the logarithmic dependence of the QRQW running time on this ratio is inherent by providing a matching lower bound. We demonstrate the performance advantage of a QRQW random permutation algorithm, compared with the popular EREW algorithm, by implementing and running both algorithms on the MasPar ***, we extend the work-time framework for the design of parallel algorithms to account for contention, and relate it to the QRQW PRAM model. We use our QRQW load balancing algorithm, as well as the QRQW linear compaction algorithm in [GMR94], to provide automatic tools for processor allocation, an issue that needs to be handled when translating an algorithm from its work-time presentation into the explicit PRAM description.
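The difference between the three memory-access rules can be made concrete with a small sketch. It charges one time unit per queued access and ignores local computation; both simplifications, along with the function name `step_cost`, are assumptions made for illustration rather than the formal definition in [GMR94].

```python
from collections import Counter


def step_cost(accesses, model):
    """Cost of one parallel step under a simplified contention rule.

    accesses: list of memory locations, one entry per processor access
              made in this step (hypothetical input format).
    model:    "EREW", "QRQW", or "CRCW".
    """
    # Contention = largest number of accesses aimed at a single location.
    contention = max(Counter(accesses).values(), default=0)
    if model == "CRCW":
        return 1                    # concurrent access is free
    if model == "QRQW":
        return max(1, contention)   # accesses to one location are queued
    if model == "EREW":
        if contention > 1:
            raise ValueError("EREW forbids concurrent access to a location")
        return 1
    raise ValueError(f"unknown model: {model}")


if __name__ == "__main__":
    step = [0, 0, 0, 1]  # three processors access location 0, one accesses location 1
    print(step_cost(step, "CRCW"), step_cost(step, "QRQW"))
```

Under this rule a step with contention 3 costs three units in the QRQW model, one unit in the CRCW model, and is simply not allowed in the EREW model, which matches the qualitative comparison in the abstract.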
The paper proposes and compares two parallel algorithms for GPU simulation of a mass-spring cloth model and an image-based collision detection and response approach. The algorithms are implemented using three different A...
The performance of an algorithm depends mainly on both computer architecture and software. An Intel Xeon processor-based HPC cluster and Intel Itanium2-based symmetric multiprocessing (SMP) architectures are used for perf...