The proceedings contain 40 papers from SPAA 2004, the Sixteenth Annual ACM Symposium on Parallelism in Algorithms and Architectures. The topics discussed include: on delivery times in packet networks under adversarial traffic; balanced graph partitioning; online hierarchical cooperative caching; scheduling against an adversarial network; effectively sharing a cache among threads; online algorithms for network design; and dynamic analysis of the arrow distributed protocol.
The proceedings contain 46 papers from SPAA 2003, the Fifteenth Annual ACM Symposium on Parallelism in Algorithms and Architectures. The topics discussed include: optimal sharing of bags of tasks in heterogeneous clusters; minimizing total flow time and total completion time with immediate dispatching; a practical algorithm for constructing oblivious routing schemes; a polynomial-time tree decomposition to minimize congestion; and online oblivious routing.
ISBN: (print) 9781450300797
Energy consumption by computer systems has emerged as an important concern. However, the energy consumed in executing an algorithm cannot be inferred from its performance alone; it must be modeled explicitly. This paper analyzes the energy consumption of parallel algorithms executed on shared-memory multicore processors. Specifically, we develop a methodology to evaluate how the energy consumption of a given parallel algorithm changes as the number of cores and their frequency are varied. We use this analysis to establish the optimal number of cores to minimize the energy consumed by the execution of a parallel algorithm for a specific problem size while satisfying a given performance requirement. We study the sensitivity of our analysis to changes in parameters such as the ratio of the power consumed by a computation step to the power consumed in accessing memory. The results show that the relation between the problem size and the optimal number of cores is relatively unaffected for a wide range of these parameters.
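The trade-off described in this abstract can be illustrated with a toy model. The sketch below is not the paper's methodology: the step count `n / p + comm_cost * log2(p)`, the cubic power-versus-frequency law, and the names `steps`, `energy`, and `best_core_count` are all assumptions chosen for illustration. For each core count it picks the slowest frequency that still meets a deadline, then searches for the core count with the lowest energy.

```python
import math

def steps(n, p, comm_cost):
    # Assumed cost model: n/p computation steps per core plus a
    # communication term that grows as log2(p).
    return n / p + comm_cost * math.log2(p)

def energy(n, p, deadline, comm_cost=50.0):
    # Slowest frequency that still finishes within the deadline.
    f = steps(n, p, comm_cost) / deadline
    # Assumed power law: each active core draws power proportional to f^3.
    power_per_core = f ** 3
    return p * power_per_core * deadline

def best_core_count(n, deadline, max_cores=64, comm_cost=50.0):
    # Exhaustive search over core counts for the minimum-energy choice.
    return min(range(1, max_cores + 1),
               key=lambda p: energy(n, p, deadline, comm_cost))

if __name__ == "__main__":
    for n in (1_000, 10_000):
        print(n, best_core_count(n, deadline=100.0))
```

Under these assumptions, adding cores lets every core run slower, which saves energy until the communication term dominates, so the optimum lies strictly between one core and the maximum and grows with the problem size.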
ISBN: (print) 9781595936677
We discuss the high-performance parallel implementation and execution of dense linear algebra matrix operations on SMP architectures, with an eye towards multi-core processors with many cores. We argue that traditional implementations, such as those incorporated in LAPACK, cannot be easily modified to render high performance as well as scalability on these architectures. The solution we propose is to arrange the data structures and algorithms so that matrix blocks become the fundamental units of data, and operations on these blocks become the fundamental units of computation, resulting in algorithms-by-blocks as opposed to the more traditional blocked algorithms. We show that this facilitates the adoption of techniques akin to the dynamic scheduling and out-of-order execution usual in superscalar processors, which we name SuperMatrix Out-of-Order scheduling. Performance results on a 16-CPU Itanium2-based server are used to highlight opportunities and issues related to this new approach.
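The idea of treating block operations as schedulable tasks can be sketched as follows. This is an illustration only, not the SuperMatrix implementation: the choice of a Cholesky factorization as the example operation, the task names, and the helper functions `cholesky_tasks`, `dependencies`, and `waves` are assumptions. The code lists the block operations in program order, derives dependencies from the blocks each task reads and writes, and groups tasks into waves that could execute concurrently.

```python
def cholesky_tasks(nb):
    """Block operations of a right-looking Cholesky on an nb x nb block grid.

    Each task is (name, blocks_read, block_updated_in_place)."""
    tasks = []
    for k in range(nb):
        tasks.append((f"CHOL({k},{k})", [], (k, k)))
        for i in range(k + 1, nb):
            tasks.append((f"TRSM({i},{k})", [(k, k)], (i, k)))
        for i in range(k + 1, nb):
            for j in range(k + 1, i + 1):
                if i == j:
                    tasks.append((f"SYRK({i},{k})", [(i, k)], (i, i)))
                else:
                    tasks.append((f"GEMM({i},{j},{k})", [(i, k), (j, k)], (i, j)))
    return tasks

def dependencies(tasks):
    """For each task, the indices of earlier tasks it must wait for."""
    last_writer = {}   # block -> index of the last task that wrote it
    readers = {}       # block -> tasks that read it since its last write
    deps = []
    for t, (_, reads, out) in enumerate(tasks):
        d = set()
        for block in reads + [out]:        # the output block is also read
            if block in last_writer:
                d.add(last_writer[block])  # read-after-write / write-after-write
        d.update(readers.get(out, ()))     # write-after-read
        deps.append(d)
        for block in reads:
            readers.setdefault(block, []).append(t)
        last_writer[out] = t
        readers[out] = []
    return deps

def waves(tasks):
    """Group task names by earliest wave in which all dependencies are done."""
    deps = dependencies(tasks)
    level = []
    for t in range(len(tasks)):
        level.append(1 + max((level[d] for d in deps[t]), default=-1))
    grouped = {}
    for t, lv in enumerate(level):
        grouped.setdefault(lv, []).append(tasks[t][0])
    return [grouped[lv] for lv in sorted(grouped)]

if __name__ == "__main__":
    for wave in waves(cholesky_tasks(3)):
        print(wave)
```

Tasks within the same wave touch disjoint output blocks and have no pending inputs, so a runtime could dispatch them to different cores in any order, which is the sense in which scheduling by blocks resembles out-of-order execution.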
ISBN: (print) 9781450345934
The proceedings contain 45 papers. The topics discussed include: randomized composable coresets for matching and vertex cover; almost optimal streaming algorithms for coverage problems; bicriteria distributed submodular maximization in a few rounds; on energy conservation in data centers; asymptotically optimal approximation algorithms for coflow scheduling; online flexible job scheduling for minimum span; minimizing total weighted flow time with calibrations; brief announcement: scheduling parallelizable jobs online to maximize throughput; brief announcement: a new improved bound for coflow scheduling; and a communication-avoiding parallel algorithm for the symmetric eigenvalue problem.
ISBN: (print) 9781450300797
We present a scheduling algorithm for stream programs on multi-core architectures, called team scheduling. Compared to previous multi-core stream scheduling algorithms, team scheduling achieves 1) similar synchronization overhead, 2) coverage of a larger class of applications, 3) better control over buffer space, 4) deadlock-free feedback loops, and 5) lower latency. We compare team scheduling to the latest stream scheduling algorithm, SGMS, by evaluating 14 applications on a multi-core architecture with 16 cores. Team scheduling successfully targets applications that cannot be validly scheduled by SGMS due to excessive buffer requirements or deadlocks in feedback loops (e.g., GSM and W-CDMA). For applications that can be validly scheduled by SGMS, team scheduling shows on average 37% higher throughput within the same buffer space constraints.
ISBN: (print) 9781450361842
The proceedings contain 45 papers. The topics discussed include: the price of clustering in bin-packing with applications to bin-packing with delays; faster matrix multiplication via sparse decomposition; NC algorithms for computing a perfect matching, the number of perfect matchings, and a maximum flow in one-crossing-minor-free graphs; improved MPC algorithms for edit distance and Ulam distance; brief announcement: scalable diversity maximization via small-size composable core-sets; brief announcement: eccentricities via parallel set cover; dynamic algorithms for the massively parallel computation model; massively parallel computation via remote memory access; and brief announcement: ultra-fast asynchronous randomized rumor spreading.
ISBN: (print) 9781450380706
The proceedings contain 49 papers. The topics discussed include: fast stencil computations using fast Fourier transforms; low-span parallel algorithms for the binary-forking model; provable advantages for graph algorithms in spiking neural networks; algorithms for right-sizing heterogeneous data centers; efficient parallel self-adjusting computation; speed scaling with explorable uncertainty; efficient online weighted multi-level paging; paging and the address-translation problem; massively parallel algorithms for distance approximation and spanners; efficient load-balancing through distributed token dropping; finding subgraphs in highly dynamic networks; near-optimal time-energy trade-offs for deterministic leader election; and efficient stepping algorithms and implementations for parallel shortest paths.
ISBN: (print) 9781450357999
The proceedings contain 47 papers. The topics discussed include: parallel minimum cuts in near-linear work and low depth; trees for vertex cuts, hypergraph cuts and minimum hypergraph bisection; dynamic representations of sparse distributed networks: a locality-sensitive approach; constant-depth and subcubic-size threshold circuits for matrix multiplication; integrated model, batch, and domain parallelism in training neural networks; brief announcement: on approximating PageRank locally with sublinear query complexity; brief announcement: coloring-based task mapping for dragonfly systems; brief announcement: parallel transitive closure within 3D crosspoint memory; and lock-free contention adapting search trees.
ISBN: (print) 9781450335881
The proceedings contain 42 papers. The topics discussed include: sorting with asymmetric read and write costs; myths and misconceptions about threads; new streaming algorithms for parameterized maximal matching & beyond; efficient approximation algorithms for computing k disjoint restricted shortest paths; fast and better distributed MapReduce algorithms for k-center clustering; fair adaptive parallelism for concurrent transactional memory applications; towards a universal approach for the finite departure problem in overlay networks; a compiler-runtime application binary interface for pipe-while loops; the Cilkprof scalability profiler; efficiently detecting races in Cilk programs that use reducer hyperobjects; ThreadScan: automatic and scalable memory reclamation; temporal fairness of round robin: competitive analysis for Lk-norms of flow time; and space and time efficient parallel graph decomposition, clustering, and diameter approximation.