Most high-level parallel programming languages allow for fine-grained parallelism. Programs written in such languages can express the full parallelism in the program without specifying the mapping of program tasks to ...
详细信息
ISBN:
(纸本)9780897917179
Most high-level parallel programming languages allow for fine-grained parallelism. Programs written in such languages can express the full parallelism in the program without specifying the mapping of program tasks to processors. When executing such programs, the major concern is to dynamically schedule tasks to processors in order to minimize execution time and the amount of memory needed. In this paper, a class of parallel schedules that are provably efficient in both time and space, even for programs whose task structure is revealed only during execution are identified. Following this, an efficient dynamic scheduling algorithm that generates schedules in this class, for languages with nested fine-grained parallelism is described.
A unified framework for finding efficient permutation routes on parallel networks in an off-line setting is presented. If the underlying graph of a parallel network contains an appropriate 'approximate' produc...
详细信息
ISBN:
(纸本)0897913701
A unified framework for finding efficient permutation routes on parallel networks in an off-line setting is presented. If the underlying graph of a parallel network contains an appropriate 'approximate' product structure then our method guarantees the existence of non-blocking near-optimal permutation routes. The routes in question can be determined in polynomial time. Furthermore, our results are extended to finding permutation routes among the remaining 'live' nodes in a faulty network.
The proceedings contain 43 papers. The topics discussed include: publish and perish: definition and analysis of an n-person publication impact game;exponential separation of quantum and classical online space complexi...
详细信息
ISBN:
(纸本)1595934529
The proceedings contain 43 papers. The topics discussed include: publish and perish: definition and analysis of an n-person publication impact game;exponential separation of quantum and classical online space complexity;minimizing the stretch when scheduling flows of biological requests;position paper and brief announcement: the FG programming environment - good and good for you;efficient parallelalgorithms for dead sensor diagnosis and multiple access channels;on the communication complexity of randomized broadcasting in random-like graphs;strip packing with precedence constraints and strip packing with release times;on space-stretch trade-offs: lower bounds;a performance analysis of local synchronization;the cache complexity of multithreaded cache oblivious algorithms;and deterministic load balancing and dictionaries in the parallel disk model.
The authors recently introduced the technique of sparse mesh refinement to produce the first near-optimal sequential time bounds of O(n lg L/s+m) for inputs in any fixed dimension with piecewise-linear constraining (P...
详细信息
ISBN:
(纸本)9781595936677
The authors recently introduced the technique of sparse mesh refinement to produce the first near-optimal sequential time bounds of O(n lg L/s+m) for inputs in any fixed dimension with piecewise-linear constraining (PLC) features. This paper extends that work to the parallel case, refining the same inputs in time O(lg(L/s)lg m) on an EREW PRAM while maintaining the work bound;in practice, this means we expect linear speedup for any practical number of processors. This is faster than the best previously known parallel Delaunay mesh refinement algorithms in two dimensions. It is the first technique with work bounds equal to the sequential case. In higher dimension, it is the first provably fast parallel technique for any kind of quality mesh refinement with PLC inputs. Furthermore, the algorithm's implementation is straightforward enough that it is likely to be extremely fast in practice.
External memory (EM) algorithms are designed for computational problems in which the size of the internal memory of the computer is only a small fraction of the problem size. For certain large scale applications this ...
详细信息
ISBN:
(纸本)9780897918909
External memory (EM) algorithms are designed for computational problems in which the size of the internal memory of the computer is only a small fraction of the problem size. For certain large scale applications this is necessarily true. Typically, the cost models proposed for external memory algorithms have measured only the number of I/O operations, and the algorithms have been specially crafted for the EM situation. In the past, several attempts have been made to relate the large body of work based on parallelalgorithms to EM, but with limited success. In this paper we provide simulation techniques which produce efficient EM algorithms from efficient algorithms developed under BSP-like parallel computing models. Our techniques can accommodate one or multiple processors on the EM target machine, each with one or more disks, and they also adapt to the disk blocking factor of the target machine. In addition to the main simulation result we obtain a more comprehensive cost model for EM algorithms, which considers the total costs incurred by the algorithm including computation, I/O and communication costs.
In this paper we give efficient parallelalgorithms for a number of problems from computational geometry by using generalized versions of parallel plane sweeping. We illustrate our approach with a number of applicatio...
详细信息
ISBN:
(纸本)0897913701
In this paper we give efficient parallelalgorithms for a number of problems from computational geometry by using generalized versions of parallel plane sweeping. We illustrate our approach with a number of applications, which include general hidden-surface elimination (even if the overlap relation contains cycles), CSG boundary evaluation, computing the contour of a collection of rectangles, and hidden-surface elimination for rectangles. Our algorithms are for the CREW PRAM.
The proceedings contain 45 papers. The topics discussed include: randomized composable coresets for matching and vertex cover;almost optimal streaming algorithms for coverage problems;bicriteria distributed submodular...
ISBN:
(纸本)9781450345934
The proceedings contain 45 papers. The topics discussed include: randomized composable coresets for matching and vertex cover;almost optimal streaming algorithms for coverage problems;bicriteria distributed submodular maximization in a few rounds;on energy conservation in data centers;asymptotically optimal approximation algorithms for coflow scheduling;online flexible job scheduling for minimum span;minimizing total weighted flow time with calibrations;brief announcement: scheduling parallelizable jobs online to maximize throughput;brief announcement: a new improved bound for coflow scheduling;and a communication-avoiding parallel algorithm for the symmetric eigenvalue problem.
The proceedings contain 42 papers. The topics discussed include: on dynamic bin packing for resource allocation in the cloud;on the online fault-tolerant server consolidation problem;on computing maximal independent s...
ISBN:
(纸本)9781450328210
The proceedings contain 42 papers. The topics discussed include: on dynamic bin packing for resource allocation in the cloud;on the online fault-tolerant server consolidation problem;on computing maximal independent sets of hypergraphs in parallel;simple parallel and distributed algorithms for spectral graph sparsification;concurrent data structures for efficient streaming aggregation;provably good scheduling for parallel programs that use data structures through implicit batching;scheduling selfish jobs on multidimensional parallel machines;LP rounding and combinatorial algorithms for minimizing active and busy time;a note on multiprocessor speed scaling with precedence constraints;executing dynamic data-graph computations deterministically using chromatic scheduling;the PCL theorem. transactions cannot be parallel, consistent and live;adaptive integration of hardware and software lock elision techniques;and transaction-friendly condition variables.
NEighborhood MOdeling (NEMO) provides transparent parallelism to develop parallel spatial data processing. It addresses the following parallel issues: architecture and machine independence;communication bottlenecks;da...
详细信息
NEighborhood MOdeling (NEMO) provides transparent parallelism to develop parallel spatial data processing. It addresses the following parallel issues: architecture and machine independence;communication bottlenecks;data visualization;casualty errors;load balancing;and data coherence. NEMO is capable of processing three types of time consuming raster neighborhood models: cellular automata;propagation;and neighborhood analysis. NEMO achieves this flexibility by including five components to its design: the three application drivers such as the cellular automata driver, propagation driver, and neighborhood analysis automata driver;and the display manager and raster database manager.
The computational requirements for an adaptive solution of unsteady problems change as the simulation progresses. This causes workload imbalance among processors on a parallel machine which, in turn, requires signific...
详细信息
The computational requirements for an adaptive solution of unsteady problems change as the simulation progresses. This causes workload imbalance among processors on a parallel machine which, in turn, requires significant data movement at runtime. We present a dynamic load-balancing framework, called JOVE, that balances the workload across all processors with a global view each time the computational mesh is adapted. JOVE has been implemented on an SP2 in MPI for portability. Experimental results for two model meshes demonstrate that mesh adaption with load balancing gives more than a sixfold improvement over one without load balancing. Furthermore, JOVE gives a 24-fold speedup on 64 processors compared to sequential execution.
暂无评论