Over the last five years, major microprocessor manufacturers have released plans for a rapidly increasing number of cores per microprossesor, with upwards of 64 cores by 2015. In this setting, a sequential RAM compute...
详细信息
ISBN:
(纸本)9781595939739
Over the last five years, major microprocessor manufacturers have released plans for a rapidly increasing number of cores per microprossesor, with upwards of 64 cores by 2015. In this setting, a sequential RAM computer will no longer accurately reflect the architecture on which algorithms are being executed. In this paper we propose a model of low degree parallelism (LoPRAM) which builds upon the RAM and PRAM models yet better reflects recent advances in parallel (multi-core) architectures. This model supports a high level of abstraction that simplifies the design and analysis of parallel programs. More importantly we show that in many instances it naturally leads to work-optimal parallelalgorithms via simple modifications to sequential algorithms.
Two novel variations on sample sort, one using only two rounds of regular all-to-all personalized communication in a scheme that yields very good load balancing with virtually no overhead and another using regular sam...
详细信息
Two novel variations on sample sort, one using only two rounds of regular all-to-all personalized communication in a scheme that yields very good load balancing with virtually no overhead and another using regular sampling for choosing splitters, were studied. The two were coded in Split-C and were run on a variety of platforms. Results were consistent with theoretical analysis and illustrated the scalability and efficiency of the algorithms.
This paper presents the design and analysis of parallel approximation algorithms for facility-location problems, including NC and RNC algorithms for (metric) facility location, k-center, k-median, and k-means These pr...
详细信息
ISBN:
(纸本)9781450300797
This paper presents the design and analysis of parallel approximation algorithms for facility-location problems, including NC and RNC algorithms for (metric) facility location, k-center, k-median, and k-means These problems have received considerable attention during the past decades from the approximation algorithms community, which primarily concentrates on improving the approximation guarantees In this paper, we ask. Is it possible to parallelize some of the beautiful results from the sequential setting? Our starting point is a small, but diverse, subset of results in approximation algorithms for facility-location problems, with a primary goal of developing techniques for devising their efficient parallel counterparts. We focus on giving algorithms with low depth, near work efficiency (compared to the sequential versions), and low cache complexity
This volume of the proceedings contains 42 articles devoted to parallel processing algorithms and architectures used in computers with multiple processors. algorithms are presented for routing the processing steps, ma...
详细信息
ISBN:
(纸本)089791483X
This volume of the proceedings contains 42 articles devoted to parallel processing algorithms and architectures used in computers with multiple processors. algorithms are presented for routing the processing steps, mapping the data paths, sorting and interconnecting networks. Operating system program algorithms are discussed which optimize performance for hypercube, mesh connected arrays and to order sets, to select and label gray scale images and to schedule tasks.
We prove a number of negative results about practical (i.e., numerically accurate) algorithms for certain matrix factorizations. In particular, we prove that the popular Givens' method for computing the QR decompo...
详细信息
ISBN:
(纸本)9780897918909
We prove a number of negative results about practical (i.e., numerically accurate) algorithms for certain matrix factorizations. In particular, we prove that the popular Givens' method for computing the QR decomposition is inherently sequential over the realistic model of floating point arithmetic. We also prove a number of additional results concerning Gaussian Elimination for computing the LU decomposition. Altogether, the results of this paper support the widespread belief that there is a tradeoff between parallelism and accuracy in numerical algorithms.
A library, called PAD, of basic parallelalgorithms and data structures for the PRAM is currently being implemented using the PRAM programming language Fork95. Main motivations of the PAD project is to study the PRAM ...
详细信息
A library, called PAD, of basic parallelalgorithms and data structures for the PRAM is currently being implemented using the PRAM programming language Fork95. Main motivations of the PAD project is to study the PRAM as a practical programming model, and to provide an organized collection of basic PRAM algorithms for the SB-PRAM under completion at the University of Saarbruecken. We give a brief survey of Fork95, and describe the main components of PAD. Finally we report on the status of the language and library and discuss further developments.
Scheduling problems that are critical and prevalent in practical parallel computing are computed. A polynomial time makespan algorithm that produces a schedule of length O(V+Φ log T), which is therefore an O(log T) a...
详细信息
ISBN:
(纸本)9780897918091
Scheduling problems that are critical and prevalent in practical parallel computing are computed. A polynomial time makespan algorithm that produces a schedule of length O(V+Φ log T), which is therefore an O(log T) approximation is presented to solve these problems. The makespan algorithm can be extended to minimize the weighted average completion time over all the jobs to the same approximation factor of O(log T).
作者:
Koo, JaehyunMIT
77 Massachusetts Ave Cambridge MA 02139 USA
We present an O(1)-round fully-scalable deterministic massively parallel algorithm for computing the min-plus matrix multiplication of unit-Monge matrices. We use this to derive a O(log n)-round fully-scalable massive...
详细信息
ISBN:
(纸本)9798400704161
We present an O(1)-round fully-scalable deterministic massively parallel algorithm for computing the min-plus matrix multiplication of unit-Monge matrices. We use this to derive a O(log n)-round fully-scalable massively parallel algorithm for solving the exact longest increasing subsequence (LIS) problem. For a fully-scalable MPC regime, this result substantially improves the previously known algorithm of O(log(4) n)-round complexity, and matches the best algorithm for computing the (1 + epsilon)-approximation of LIS.
暂无评论