We present new BSP algorithms for deterministic sorting and randomized median finding. We sort n general keys by using a partitioning scheme that achieves the requirements of efficiency (one-optimality) and insensitiv...
详细信息
ISBN:
(纸本)9780897918091
We present new BSP algorithms for deterministic sorting and randomized median finding. We sort n general keys by using a partitioning scheme that achieves the requirements of efficiency (one-optimality) and insensitivity against data skew (the accuracy of the splitting keys depends solely on the step distance, which can be adapted to meet the worst-case requirements of our application). Although we employ sampling in order to realize efficiency, we can give a precise worst-case estimation of the maximum imbalance which might occur. We also investigate optimal randomized BSP algorithms for the problem of finding the median of n elements that require, with high-probability, 3n/(2p)+o(n/p) number of comparisons, for a wide range of values of n and p. Experimental results for the two algorithms are also presented.
this paper analyzes job scheduling for parallel computers by using theoretical and experimental means. Based on existing architectures we first present a machine and a job model. then, we propose a simple on-line algo...
详细信息
this paper analyzes job scheduling for parallel computers by using theoretical and experimental means. Based on existing architectures we first present a machine and a job model. then, we propose a simple on-line algorithm employing job preemption without migration and derive theoretical bounds for the performance of the algorithm. the algorithm is experimentally evaluated with trace data from a large computing facility. these experiments show that the algorithm is highly sensitive on parameter selection and that substantial performance improvements over existing (non-preemptive) scheduling methods are possible.
the proceedings contain 35 papers. the topics discussed include: storage requirements for deterministic polynomial time recognizable languages;linear time algorithm for isomorphism of planar graphs;testing graph conne...
the proceedings contain 35 papers. the topics discussed include: storage requirements for deterministic polynomial time recognizable languages;linear time algorithm for isomorphism of planar graphs;testing graph connectivity;efficient stable sorting with minimal extra space;an efficient algorithm for computing optimal disk merge patterns;limitations of synchronization primitives with conditional branching and global variables;and construction withparallel derivatives of the closure of a parallel program schema.
We present parallelalgorithms for union, intersection and difference on ordered sets using random balanced binary trees (treaps [26]). For two sets of size n and m (m ≤ n) the algorithms run in expected O(m lg(n/m))...
详细信息
ISBN:
(纸本)9780897919890
We present parallelalgorithms for union, intersection and difference on ordered sets using random balanced binary trees (treaps [26]). For two sets of size n and m (m ≤ n) the algorithms run in expected O(m lg(n/m)) work and O(lg n) depth (parallel time) on an EREW PRAM with scan operations (implying O(lg2 n) depth on a plain EREW PRAM). As withthe sequential algorithms on treaps for insertion and deletion the main advantage of our algorithms are their simplicity. In fact our algorithms for set operations seem simpler than previous sequential algorithms withthe same work bounds, and might therefore also be useful in a sequential context. To analyze the effectiveness of the algorithms we implemented both sequential and parallel versions of the algorithms and ran several experiments on them. Our parallel implementation uses the Cilk [5] shared memory runtime system on a 16 processor SGI Power Challenge and a 6 processor Sun Ultra Enterprise 3000. It shows reasonable speedup: 6.3 to 6.8 speedup on 8 processors of the SGI, and 4.1 to 4.4 speedup on 5 processors of the Sun.
the fundamental problems in dynamic load balancing and job scheduling in parallel and distributed computers involve moving load between processors. In this paper, we consider a new model for load movement in synchrono...
详细信息
It is well known that simple randomized load balancing schemes can balance load effectively while incurring only a small overhead, making such schemes appealing for practical systems. In this paper, we provide new ana...
详细信息
ISBN:
(纸本)9780897918909
It is well known that simple randomized load balancing schemes can balance load effectively while incurring only a small overhead, making such schemes appealing for practical systems. In this paper, we provide new analyses for several such dynamic randomized load balancing schemes. Unlike previous analyses, we do not assume that in equilibrium each server is stochastically independent from other servers. Our work extends a previous analysis of the supermarket model, a model that abstracts a simple, efficient load balancing scheme in the setting where jobs arrive at a large system of parallel processors. In this model, customers arrive at a system of n servers as a Poisson stream of rate λn, λ
the proceedings contain 47 papers. the topics discussed include: Quancurrent: a concurrent quantiles sketch;an efficient scheduler for task-parallel interactive applications;efficient synchronization-light work steali...
ISBN:
(纸本)9781450395458
the proceedings contain 47 papers. the topics discussed include: Quancurrent: a concurrent quantiles sketch;an efficient scheduler for task-parallel interactive applications;efficient synchronization-light work stealing;balanced allocations in batches: the tower of two choices;massively parallel tree embeddings for high dimensional spaces;deterministic massively parallel symmetry breaking for sparse graphs;an associativity threshold phenomenon in set-associative caches;increment - and - freeze: every cache, everywhere, all of the time;multidimensional approximate agreement with asynchronous fallback;a tight characterization of fast failover routing: resiliency to two link failures is possible;releasing memory with optimistic access: a hybrid approach to memory reclamation and allocation in lock-free programs;transactional composition of nonblocking data structures;applying hazard pointers to more concurrent data structures;and nearly optimal parallelalgorithms for longest increasing subsequence.
the symposium materials contain 118 papers on new developments in parallel processing. algorithms, architectures, mapping/scheduling, applications, special-purpose architectures, interconnection networks, software, an...
详细信息
ISBN:
(纸本)0818626720
the symposium materials contain 118 papers on new developments in parallel processing. algorithms, architectures, mapping/scheduling, applications, special-purpose architectures, interconnection networks, software, and distributed systems are among the main topics covered.
this paper introduces the serial-parallel decision problem. Consider an online scheduler that receives a series of tasks, where each task has both a parallel and a serial implementation. the parallel implementation ha...
详细信息
ISBN:
(纸本)9798400704161
this paper introduces the serial-parallel decision problem. Consider an online scheduler that receives a series of tasks, where each task has both a parallel and a serial implementation. the parallel implementation has the advantage that it can make progress concurrently on multiple processors, but the disadvantage that it is (potentially) work-inefficient. As tasks arrive, the scheduler must decide for each task which implementation to use. We begin by studying total awake time. We give a simple decide-on-arrival scheduler that achieves a competitive ratio of 3 for total awake time-this scheduler makes serial/parallel decisions immediately when jobs arrive. Our second result is an parallel-work-oblivious scheduler that achieves a competitive ratio of 6 for total awake time-this scheduler makes all of its decisions based only on the size of each serial job and without needing to know anything about the parallel implementations. Finally, we prove a lower bound showing that, if a scheduler wishes to achieve a competitive ratio of O(1), it can have at most one of the two aforementioned properties (decide-on-arrival or parallel-work-oblivious). We also prove lower bounds of the form 1 + Omega(1) on the optimal competitive ratio for any scheduler. Next, we turn our attention to optimizing mean response time. Here, we show that it is possible to achieve an O(1) competitive ratio with O(1) speed augmentation. this is the most technically involved of our results. We also prove that, in this setting, it is not possible for a parallel-work-oblivious scheduler to do well. In addition to these results, we present tight bounds on the optimal competitive ratio if we allow for arrival dependencies between tasks (e.g., tasks are components of a single parallel program), and we give an in-depth discussion of the remaining open questions.
Most high-level parallel programming languages allow for fine-grained parallelism. Programs written in such languages can express the full parallelism in the program without specifying the mapping of program tasks to ...
详细信息
ISBN:
(纸本)9780897917179
Most high-level parallel programming languages allow for fine-grained parallelism. Programs written in such languages can express the full parallelism in the program without specifying the mapping of program tasks to processors. When executing such programs, the major concern is to dynamically schedule tasks to processors in order to minimize execution time and the amount of memory needed. In this paper, a class of parallel schedules that are provably efficient in both time and space, even for programs whose task structure is revealed only during execution are identified. Following this, an efficient dynamic scheduling algorithm that generates schedules in this class, for languages with nested fine-grained parallelism is described.
暂无评论