The proceedings contain 49 papers. The topics discussed include: fast stencil computations using fast Fourier transforms;low-span parallelalgorithms for the binary-forking model;provable advantages for graph algorith...
ISBN:
(纸本)9781450380706
The proceedings contain 49 papers. The topics discussed include: fast stencil computations using fast Fourier transforms;low-span parallelalgorithms for the binary-forking model;provable advantages for graph algorithms in spiking neural networks;algorithms for right-sizing heterogeneous data centers;efficient parallel self-adjusting computation;speed scaling with explorable uncertainty;efficient online weighted multi-level paging;paging and the address-translation problem;massively parallelalgorithms for distance approximation and spanners;efficient load-balancing through distributed token dropping;finding subgraphs in highly dynamic networks;near-optimal time-energy trade-offs for deterministic leader election;and efficient stepping algorithms and implementations for parallel shortest paths.
An earlier parallel list ranking algorithm performs well for problem sizes N that are extremely large in comparison to the number of PUs P. However, no existing algorithm gives good performance for reasonable loads. W...
详细信息
ISBN:
(纸本)9780897918909
An earlier parallel list ranking algorithm performs well for problem sizes N that are extremely large in comparison to the number of PUs P. However, no existing algorithm gives good performance for reasonable loads. We present a novel family of algorithms, that achieve a better trade-off between the number of start-ups and the routing volume. We have implemented them on an Intel Paragon, and they turn out to considerably outperform all earlier algorithms: with P = 2 the sequential algorithm is already beaten for N = 25,000;for P = 100 and N = 107, the speed-up is 21, and for N = 108 it even reaches 30. A modification of one of our algorithms solves a theoretical question: we show that on one-dimensional processor arrays, list ranking can be solved with a number of steps equal to the diameter of the network.
We introduce a new preconditioner for solving a symmetric Toeplitz system of equations by the conjugate gradient method. This choice leads to an algorithm which is particularly suitable for parallel computations and, ...
详细信息
ISBN:
(纸本)0897913701
We introduce a new preconditioner for solving a symmetric Toeplitz system of equations by the conjugate gradient method. This choice leads to an algorithm which is particularly suitable for parallel computations and, compared to the circulant preconditioner of [C3], has a better asymptotic convergence rate and a lower arithmetic cost per iteration.
In this paper we give efficient parallelalgorithms for a number of problems from computational geometry by using generalized versions of parallel plane sweeping. We illustrate our approach with a number of applicatio...
详细信息
ISBN:
(纸本)0897913701
In this paper we give efficient parallelalgorithms for a number of problems from computational geometry by using generalized versions of parallel plane sweeping. We illustrate our approach with a number of applications, which include general hidden-surface elimination (even if the overlap relation contains cycles), CSG boundary evaluation, computing the contour of a collection of rectangles, and hidden-surface elimination for rectangles. Our algorithms are for the CREW PRAM.
The proceedings contain 45 papers. The topics discussed include: the price of clustering in bin-packing with applications to bin-packing with delays;faster matrix multiplication via sparse decomposition;NC algorithms ...
ISBN:
(纸本)9781450361842
The proceedings contain 45 papers. The topics discussed include: the price of clustering in bin-packing with applications to bin-packing with delays;faster matrix multiplication via sparse decomposition;NC algorithms for computing a perfect matching, the number of perfect matchings, and a maximum flow in one-crossing-minor-free graphs;improved MPC algorithms for edit distance and ulam distance;brief announcement: scalable diversity maximization via small-size composable core-sets;brief announcement: eccentricities via parallel set cover;dynamic algorithms for the massively parallel computation model;massively parallel computation via remote memory access;and brief announcement: ultra-fast asynchronous randomized rumor spreading.
The issue of effectiveness of private caches for processors were studied. Since time for all processors to access the shared memory simultaneously is usually much longer than the time for a processor to access its own...
详细信息
ISBN:
(纸本)9780897918091
The issue of effectiveness of private caches for processors were studied. Since time for all processors to access the shared memory simultaneously is usually much longer than the time for a processor to access its own private cache, scheduling with private caches falls into the distributed memory model where the lower bound applies. The effectiveness of private caches were shown by proving that a version of Dynamic Equi-partition Scheduling Policy (DEQ) achieves a mean response time with five times the optimal mean response time in the cache clock time for a large class of parallel jobs well accepted in the parallel scheduling community. This shows an improvement of system performance by using private caches over that of purely shared memory.
In this paper we explore a simple and general approach for developing parallelalgorithms that lead to good cache complexity on parallel machines with private or shared caches The approach is to design nested-parallel...
详细信息
ISBN:
(纸本)9781450300797
In this paper we explore a simple and general approach for developing parallelalgorithms that lead to good cache complexity on parallel machines with private or shared caches The approach is to design nested-parallelalgorithms that have low depth (span. critical path length) and for which the natural sequential evaluation order has low cache complexity in the cache-oblivious model We describe several cache-oblivious algorithms with optimal work, polylogarithmic depth, and sequential cache complexities that match the best sequential algorithms, including the first such algorithms for sorting and for sparse-matrix vector multiply on matrices with good vertex separators Using known mappings. our results lead to low cache complexities on shared-memory multiprocessors with a single level of private caches or a single shared cache We generalize these mappings to multi-level cache hierarchies of private or shared caches, implying that our algorithms also have low cache complexities on such hierarchies The key factor in obtaining these low parallel cache complexities is the low depth of the algorithms we propose.
We present lower bounds for time needed to solve basic problems on three general-purpose models of parallel computation: the shared-memory models QSM and s-QSW, and the distributed-memory model, the BSP. For each of t...
详细信息
ISBN:
(纸本)9780897919890
We present lower bounds for time needed to solve basic problems on three general-purpose models of parallel computation: the shared-memory models QSM and s-QSW, and the distributed-memory model, the BSP. For each of these models, we also obtain lower bounds for the number of rounds needed to solve these problems using a randomized algorithm on a p-processor machine. Our results on 'rounds' is of special interest in the context of designing work-efficient algorithms on a machine where latency and synchronization costs are high. Many of our lower bound results are complemented by upper bounds that match the lower bound or are close to it.
Let (G) denote the independence number of a graph G, that is the maximum number of pairwise independent vertices in G. We present a parallel algorithm that computes in a planar graph G = (V, E), an independent set I ⊂...
详细信息
The proceedings contain 7 papers from the symposium on parallelism in algorithms and architectures, SPAA 2003: 15th annualsymposium on parallelism in algorithms and architectures. The topics discussed include: a prac...
详细信息
The proceedings contain 7 papers from the symposium on parallelism in algorithms and architectures, SPAA 2003: 15th annualsymposium on parallelism in algorithms and architectures. The topics discussed include: a practical algorithm for constructing oblivious routing schemes;novel architectures for P2P applications: the continuous-discrete approach;quantifying instruction criticality for shared memory multiprocessors;relaxing the problem-size bound for out-of-core columnsort;the complexity of verifying memory coherence;a near optimal scheduler for switch-memory-switch routers;and on local algorithms for topology control and routing in ad hoc networks.
暂无评论