The authors recently introduced the technique of sparse mesh refinement to produce the first near-optimal sequential time bounds of O(n lg L/s+m) for inputs in any fixed dimension with piecewise-linear constraining (P...
详细信息
ISBN:
(纸本)9781595936677
The authors recently introduced the technique of sparse mesh refinement to produce the first near-optimal sequential time bounds of O(n lg L/s+m) for inputs in any fixed dimension with piecewise-linear constraining (PLC) features. This paper extends that work to the parallel case, refining the same inputs in time O(lg(L/s)lg m) on an EREW PRAM while maintaining the work bound;in practice, this means we expect linear speedup for any practical number of processors. This is faster than the best previously known parallel Delaunay mesh refinement algorithms in two dimensions. It is the first technique with work bounds equal to the sequential case. In higher dimension, it is the first provably fast parallel technique for any kind of quality mesh refinement with PLC inputs. Furthermore, the algorithm's implementation is straightforward enough that it is likely to be extremely fast in practice.
In this paper, we study the tradeoffs between the time and the number of communication rounds of the best arm identification problem in the heterogeneous collaborative learning model, where multiple agents interact wi...
详细信息
ISBN:
(纸本)9798400704161
In this paper, we study the tradeoffs between the time and the number of communication rounds of the best arm identification problem in the heterogeneous collaborative learning model, where multiple agents interact with possibly different environments and they want to learn in parallel an objective function in the aggregated environment. By proving almost tight upper and lower bounds, we show that collaborative learning in the heterogeneous setting is inherently more difficult than that in the homogeneous setting in terms of the time-round tradeoff.
The proceedings contain 37 papers. The topics discussed include: on triangulation of simple networks;strong-diameter decompositions of minor free graphs;approximation algorithms for multiprocessor scheduling under unc...
详细信息
ISBN:
(纸本)159593667X
The proceedings contain 37 papers. The topics discussed include: on triangulation of simple networks;strong-diameter decompositions of minor free graphs;approximation algorithms for multiprocessor scheduling under uncertainty;scheduling DAGs on asynchronous processors;scheduling to minimize gaps and power consumption;cache-oblivious streaming B-trees;an experimental comparison of cache-oblivious and cache-conscious programs;scheduling threads for constructive cache sharing on CMPs;proximity-aware directory-based coherence for multi-core processor architectures;a parallel dynamic programming algorithm on a multi-core architecture;tight bounds for distributed selection;local MST computation with short advice;distributed approximation of capacitated dominating sets;packing to angles and sectors;and the notion of a timed register and its application to indulgent synchronization.
In this paper we investigate algorithmic instruments leading to low power consumption in computing devices. While previous work on energy-efficient algorithms has mostly focused on single processor environments, in th...
详细信息
ISBN:
(纸本)9781595936677
In this paper we investigate algorithmic instruments leading to low power consumption in computing devices. While previous work on energy-efficient algorithms has mostly focused on single processor environments, in this paper we investigate multi-processor settings. We study the basic problem of scheduling a set of jobs, each specified by a release time, a deadline and a processing volume, on variable speed processors so as to minimize the total energy consumption. We first settle the complexity of speed scaling with unit size jobs. More specifically, we devise a polynomial time algorithm for agreeable deadlines and prove NP-hardness results for arbitrary release dates and deadlines. For the latter setting we also develop a polynomial time algorithm achieving a constant factor approximation guarantee that is independent of the number of processors. Additionally, we study speed scaling of jobs with arbitrary processing requirements and, again, develop constant factor approximation algorithms. We finally transform our offline algorithms into constant competitive online strategies.
The proceedings contains 48 papers from Thirteen annualacmsymposium on parallelalgorithms and architectures. Topics discussed include: compact routing schemes;simple on-line algorithms for the maximum disjoint path...
详细信息
The proceedings contains 48 papers from Thirteen annualacmsymposium on parallelalgorithms and architectures. Topics discussed include: compact routing schemes;simple on-line algorithms for the maximum disjoint paths problem;competitive buffer management for shared-memory switches;attack propagation in networks;computational power of pipelined memory hierarchies;and a data tracking scheme for general networks.
We discuss the high-performance parallel implementation and execution of dense linear algebra matrix operations on SMP architectures;with an eye towards multi-core processors with many cores. We argue that traditional...
详细信息
ISBN:
(纸本)9781595936677
We discuss the high-performance parallel implementation and execution of dense linear algebra matrix operations on SMP architectures;with an eye towards multi-core processors with many cores. We argue that traditional implementations, as those incorporated in LAPACK, cannot be easily modified to render high performance as well as scalability on these architectures. The solution we propose is to arrange the data structures and algorithms so that matrix blocks become the fundamental units of data;and operations on these blocks become the fundamental units of computation, resulting in algorithms-by-blocks as opposed to the snore traditional blocked algorithms. We show that this facilitates the adoption of techniques akin to dynamic scheduling and out-of-order execution usual in superscalar processors;which we name SuperMatrix Out-of-Order scheduling. Performance results on a 16 CPU Itanium2-based server are used to highlight opportunities and issues related to this new approach.
We address optimal makespan-minimizing identical parallel machine scheduling (P vertical bar vertical bar C-max) by introducing new pruning rules for branch-and-bound (BnB) and integrating them into a prior BnB algori...
详细信息
ISBN:
(纸本)9798400704161
We address optimal makespan-minimizing identical parallel machine scheduling (P vertical bar vertical bar C-max) by introducing new pruning rules for branch-and-bound (BnB) and integrating them into a prior BnB algorithm. Experimental results indicate that the presented rules are inexpensive to evaluate, applicable frequently, and extremely beneficial to the BnB algorithm's overall performance.
暂无评论