Emerging applications in multi-media and the Human Genome Project require storage and searching of large databases of strings - a task for which parallelism seems the only hope. In this paper, we consider the parallel...
详细信息
Emerging applications in multi-media and the Human Genome Project require storage and searching of large databases of strings - a task for which parallelism seems the only hope. In this paper, we consider the parallelism in some of the fundamental problems in compressing strings and in matching large dictionaries of patterns against texts. We present the first work-optimal algorithms for these well-studied problems including the classical dictionary matching problem, optimal compression with a static dictionary and the universal data compression with dynamic dictionary of Lempel and Ziv. All our algorithms are randomized and they are of the Las Vegas type. Furthermore, they are fast, working in time logarithmic in the input size. Additionally, our algorithms seem suitable for a distributed implementation.
We present an NC algorithm for recognizing chordal graphs, and we present NC algorithms for finding the following objects on chordal graphs: all maximal cliques, an intersection graph representation, an optimal colori...
详细信息
ISBN:
(纸本)9780897912211
We present an NC algorithm for recognizing chordal graphs, and we present NC algorithms for finding the following objects on chordal graphs: all maximal cliques, an intersection graph representation, an optimal coloring, a perfect elimination scheme, a maximum independent set, a minimum clique cover, and the chromatic polynomial. The well known polynomial algorithms for these problems seem highly sequential, and therefore a different approach is needed to find parallelalgorithms.
The proceedings contains 40 papers from the conference on SPAA 2004 - Sixteenth annualacmsymposium on parallelism in algorithms and architectures. The topics discussed include: On delivery times in packet networksun...
详细信息
The proceedings contains 40 papers from the conference on SPAA 2004 - Sixteenth annualacmsymposium on parallelism in algorithms and architectures. The topics discussed include: On delivery times in packet networksunder adversarial traffic;balanced graph partitioning;online hierarchical cooperative caching;scheduling against an adversarial network;effectively sharing a cache among threads;online algorithms for network design and dynamic analysis of the arrow distributed protocol.
As the number of cores increases on chip multiprocessors, coherence is fast becoming a central issue for multi-core performance. This is exacerbated by the fact that interconnection speeds are not scaling well with te...
详细信息
ISBN:
(纸本)9781595936677
As the number of cores increases on chip multiprocessors, coherence is fast becoming a central issue for multi-core performance. This is exacerbated by the fact that interconnection speeds are not scaling well with technology. This paper describes mechanisms to accelerate coherence for a multi-core architecture that has multiple private L2 caches and a scalable point-to-point interconnect between cores. These techniques exploit the differences in geometry between chip multiprocessors and traditional multiprocessor architectures. Directory-based protocols have been proposed as a scalable alternative to snoop-based protocols. In this paper, we discuss implementations of coherence for CMPs and propose and evaluate a novel directory-based coherence scheme to improve the performance of parallel programs on such processors. Proximity-aware coherence accelerates read and write misses by initiating cache-to-cache transfers from the spatially closest sharer. This has the dual benefit of eliminating unnecessary accesses to off-Chip memory, and minimizing the distance over which communicated data moves across the network. The proposed schemes result in speedups up to 74.9% for our workloads.
The proceedings contains 46 papers from the conference on SPAA 2003 Fifteenth annualacmsymposium on parallelism in algorithms and architectures. The topics discussed include: optimal sharing of bags of tasks in hete...
详细信息
The proceedings contains 46 papers from the conference on SPAA 2003 Fifteenth annualacmsymposium on parallelism in algorithms and architectures. The topics discussed include: optimal sharing of bags of tasks in heterogeneous clusters;minimizing total flow time and total completion time with immediate dispatching;a practical algorithm for constructing oblivious routing schemes;a polynomial-time tree decomposition to minimize congestion and online oblivious routing.
A new approach to parallel sorting called parallel Sorting by OverPartitioning (PSOP) is presented. The approach limits the communication cost by moving each element between processors at most once, and leads to good ...
详细信息
We present parallelalgorithms for several graph and geometric problems, including transitive closure and topological sorting in planar st-graphs, preprocessing planar subdivisions for point location queries, and cons...
详细信息
We implemented and measured several methods to perform BMMC permutations on the MasPar MP-2. Our results indicate that, except for certain types of permutations or very high virtual processor ratios, the best method o...
详细信息
We implemented and measured several methods to perform BMMC permutations on the MasPar MP-2. Our results indicate that, except for certain types of permutations or very high virtual processor ratios, the best method overall is the naive method but with virtual-processor numbers computed in Gray-code order. For some permutations, however, the naive method performs very poorly;the best method in these cases is an adaptation of the block BMMC algorithm for parallel disk systems in which the processor elements are treated as independent devices.
In this paper, we study parallelalgorithms for private-cache chip multiprocessors (CMPs), focusing on methods for foundational problems that are scalable with the number of cores. By focusing on private-cache CMPs, w...
详细信息
ISBN:
(纸本)9781595939739
In this paper, we study parallelalgorithms for private-cache chip multiprocessors (CMPs), focusing on methods for foundational problems that are scalable with the number of cores. By focusing on private-cache CMPs, we show that we can design efficient algorithms that need no additional assumptions about the way cores are interconnected, for we assume that all inter-processor communication occurs through the memory hierarchy. We study several fundamental problems, including prefix sums, selection, and sorting, which often form the building blocks of other parallelalgorithms. Indeed, we present two sorting algorithms, a distribution sort and a mergesort. Our algorithms are asymptotically optimal in terms of parallel cache accesses and space complexity under reasonable assumptions about the relationships between the number of processors, the size of memory, and the size of cache blocks. In addition, we study sorting lower bounds in a computational model, which we call the parallel external-memory (PEM) model, that formalizes the essential properties of our algorithms for private-cache CMPs.
暂无评论