检索结果-内蒙古大学图书馆

proceedings of the 1998 10th annual acm symposium on parallel algorithms and architectures, SPAA

作者： Dittrich, W. Hutchinson, D. Maheshwari, A. Bosch Telecom GmbH Backnang Germany

ISBN: (纸本)9780897919890

External memory (EM) algorithms are designed for computational problems in which the size of the internal memory of the computer is only a small fraction of the problem size. Block-wise access to data is a central theme in the design of efficient EM algorithms. A similar requirement arises in the transmission of data between processors in certain parallel computation models such as BSP* and CGM, for which block-wise communication is a crucial issue. We consider multisearch problems, where a large number of queries are to be simultaneously processed and satisfied by navigating through large data structures on parallel computers. The examples used originate as BSP* algorithms and we adapt them to the EM situation where the queries and data structure are considered to be much larger than the size of the available internal memory. This paper presents techniques to achieve blocking for I/O as well as for communication in multisearch on the BSP* and EM-BSP* models. In the area of EM algorithms new algorithms for multisearch in balanced trees are described. For search trees up to size O(n log n) where n is the number of queries, we obtain work-optimal, parallel, EM multisearch algorithms whose I/O and communication time are the same, asymptotically, as the computation time. These algorithms are obtained via the simulation technique of [15]. For larger trees we describe a parallel, EM algorithm which is simultaneously c-optimal and I/O-optimal. We give a lower bound to the number of I/O operations required for filtering n queries through a binary or multiway search tree of size m when m ≥ n2+Ε, constant Ε > 0. In the domain of parallel, non-EM algorithms we describe a new 1-optimal algorithm for BSP* multisearch in optimal trees which communicates in a block-wise fashion.

关键词： parallel processing systems

来源：评论

学校读者我要写书评

暂无评论

A constant time optimal parallel algorithm for two-dimensional pattern matching

引用

SIAM JOURNAL ON COMPUTING 1998年第3期27卷 668-681页

作者： Crochemore, M Gasieniec, L Hariharan, R Muthukrishnan, S Rytter, W Univ Marne La Vallee Inst Gaspard Monge F-93160 Noisy Le Grand France Univ Warsaw Inst Informat PL-02097 Warszawa Poland Max Planck Inst Informat Saarbrucken Germany NYU Courant Inst New York NY USA AT&T Bell Labs Murray Hill NJ 07974 USA

We give an alphabet-independent deterministic parallel algorithm for finding all occurrences of a pattern array of size m(h) x m(w) in a text array of size n(h) x n(w) in the concurrent-read-concurrent-write-parallel-random-access-machine (CRCW-PRAM) model. Our algorithm runs in O(1) time performing optimal, that is, O(n(h) x n(w)) work, following preprocessing of the pattern. This improves the previous best bound of O(loglogm) time with optimal work [A. Amir, G. Benson, and M. Farach, proceedings 5th annual acm symposium on parallel algorithms and architectures, acm, New York, 1993, pp. 79-85], following preprocessing of the pattern, where m = max {m(h),m(w)}. The preprocessing required by our algorithm (and that due to Amir, Benson, and Farach) can be accomplished in O(loglogm) time and O(m(h) x m(w)) work [M. Crochemore et al., manuscript, 1993], [R. Cole et al., manuscript, 1993].

关键词： pattern matching PRAM two-dimensional witnesses duelling periodicity

来源：评论

学校读者我要写书评

暂无评论

Dynamic scheduling with incomplete information 98

Dynamic scheduling with incomplete information

引用

proceedings of the 1998 10th annual acm symposium on parallel algorithms and architectures, SPAA

作者： Bast, H. Max-Planck-Inst fur Informatik Saarbrucken Germany

ISBN: (纸本)9780897919890

We consider the following scheduling problem: Our goal is to execute a given amount of arbitrarily decomposable work on a distributed machine as quickly as possible. The work is maintained by a central scheduler that can assign chunks of work of an arbitrary size to idle processors. The difficulty is that the processing time required for a chunk is not exactly predictable - usually the less, the larger the chunk - and that processors suffer a delay for each assignment. Our objective is to minimize the total wasted time of the schedule, that is, the sum of all delays plus the idle times of processors waiting for the last processor to finish. We introduce a new deterministic model for this setting, based on estimated ranges [α(w), β(w)] for processing times of chunks of size w. Depending on α, β, and a measure for the overall deviation from these estimates, we can prove matching upper and lower bounds on the wasted time, the former being achieved by our new balancing strategy. This is in sharp contrast with previous work that, even under the strong assumption of independent, approximately normally distributed chunk processing times, proposed only heuristic scheduling schemes supported merely by empirical evidence. Our model naturally subsumes this stochastic setting, and our generic analysis is valid for most of the existing schemes too, proving them to be non-optimal.

关键词： parallel processing systems

来源：评论

学校读者我要写书评

暂无评论

Detecting data races in Cilk programs that use locks 98

Detecting data races in Cilk programs that use locks

引用

proceedings of the 1998 10th annual acm symposium on parallel algorithms and architectures, SPAA

作者： Cheng, G.-I. Feng, M. Leiserson, Ch.E. Randall, K.H. Stark, A.F. MIT Lab for Computer Science Cambridge MA United States

ISBN: (纸本)9780897919890

When two parallel threads holding no locks in common access the same memory location and at least one of the threads modifies the location, a 'data race' occurs, which is usually a bug. This paper describes the algorithms and strategies used by a debugging tool, called the Nondeterminator-2, which checks for data races in programs coded in the Cilk multithreaded language. Like its predecessor, the Nondeterminator, which checks for simple 'determinacy' races, the Nondeterminator-2 is a debugging tool, not a verifier, since it checks for data races only in the computation generated by a serial execution of the program on a given input. We give an algorithm, ALL-SETS, that determines whether the computation generated by a serial execution of a Cilk program on a given input contains a race. For a program that runs serially in time T, accesses V shared memory locations, uses a total of n locks, and holds at most k n locks simultaneously, ALL-SETS runs in. O(nkT α(V, V)) time and O(nkV) space, where α is Tarjan's functional inverse of Ackermann's function. Since ALL-SETS may be too inefficient in the worst case, we propose a much more efficient algorithm which can be used to detect races in programs that obey the 'umbrella' locking discipline, a programming methodology that is more flexible than similar disciplines proposed in the literature. We present an algorithm, BRELLY, which detects violations of the umbrella discipline in O(kT α(V, V)) time using O(kV) space. We also prove that any 'abelian' Cilk program, one whose critical sections commute, produces a determinate final state if it is deadlock free and if it generates any computation which is data-race free. Thus, the Nondeterminator-2's two algorithms can verify the determinacy of a deadlock-free abelian program running on a given input.

关键词： parallel algorithms

来源：评论

学校读者我要写书评

暂无评论

Deadlock-free routing in arbitrary networks via the flattest common supersequence method 98

Deadlock-free routing in arbitrary networks via the flattest...

引用

proceedings of the 1998 10th annual acm symposium on parallel algorithms and architectures, SPAA

作者： Laing, A.K. Cypher, R. The Johns Hopkins Univ Baltimore MD United States

ISBN: (纸本)9780897919890

In this paper we consider the problem of deadlock-free routing in arbitrary parallel and distributed computers. We focus on asynchronous routing algorithms which continuously receive new packets to route and which do not discard packets that encounter congestion. Specifically, we examine what we call the deadlock-free routing (DFR) problem. The input to the DFR problem consists of an arbitrary network and an arbitrary set of paths in the network. The output consists of a routing algorithm, which is a list of the buffers used along each of the paths. The routing algorithm is required to be free from deadlock and the goal is to minimize the number of buffers required in any one node. We study the DFR problem by converting it into an equivalent problem which we call the flattest common supersequence (FCS) problem. The input to the FCS problem consists of a set of sequences and the output consists of a single sequence that contains all of the input sequences as (possibly non-contiguous) subsequences. The goal of the FCS problem is to minimize the maximum frequency of any symbol in the output sequence. We present three main results. First, we prove that the decision version of the FCS problem is NP-complete, and has no polynomial-time approximation scheme unless P = NP. Next, we propose and experimentally evaluate a range of heuristics for FCS. Our experimental results show that one of these heuristics performs very well over a wide range of inputs. Lastly, we prove that this heuristic is in fact optimal for certain restricted classes of inputs.

关键词： parallel algorithms

来源：评论

学校读者我要写书评

暂无评论

Clustering and reassignment-based mapping strategy for message-passing architectures 1

Clustering and reassignment-based mapping strategy for messa...

引用

1st Merged International parallel Processing symposium/symposium on parallel and Distributed Processing (IPPS/SPDP 1998)

作者： Senar, MA Ripoll, A Cortes, A Luque, E Univ Autonoma Barcelona Dept Informat Unitat Arquitect Ordinadors & Sistemes Operatius Bellaterra 08193 Barcelona Spain

ISBN: (纸本)0818684038

A fundamental issue affecting the performance of a parallel program is the assignment of tasks to processors in order to get the minimum completion rime. In this paper, we present a compilation-time two-stage mapping strategy (denoted as CREMA) used for mapping arbitrary programs (modeled as TIG graphs) onto message-passing parallel systems with any architecture. In contrast to most of the other approaches found in the literature, CREMA is not tied to any particular architecture or any specific algorithm. The first stage is based on task clustering and task reassignment algorithms that contract the original task graph. The second stage takes the contracted graph and tries to successfully match the physical properties of the target system. It has been evaluated for a wide range of bath regular and irregular graphs that correspond to some well-known real applications. The results show that CREMA provides a good trade-off between mapping quality and computational complexity.

关键词： mapping problem task allocation task assignment graph partitioning clustering Heuristics

来源：评论

学校读者我要写书评

暂无评论

First- and second-order diffusive methods for rapid, coarse, distributed load balancing

引用

THEORY OF COMPUTING SYSTEMS 1998年第4期31卷 331-354页

作者： Muthukrishnan, S Ghosh, B Schultz, MH AT&T Bell Labs Lucent Technol Murray Hill NJ 07974 USA Informix Software Inc Menlo Park CA 94025 USA Yale Univ Dept Comp Sci New Haven CT 06520 USA Yale Univ Dept Comp Sci New Haven CT 06520 USA

We consider the following general problem modeling load balancing in a variety of distributed settings. Given an arbitrary undirected connected graph G = (V, E) and a weight distribution w(0) on the nodes, determine a schedule to move weights across edges in each step so as to (approximately) balance the weights on the nodes. We focus on diffusive schedules for this problem. All previously studied diffusive schedules can be modeled as w(t+l) = Mw(t) where w(t) is the weight distribution after t steps and M is a doubly stochastic matrix. We call these the first-order schedules. First-order schedules, although widely used in practice, are often slow. In this paper we introduce a new direction in diffusive schedules by considering schedules that are modeled as: w(1) = Mw(0);w(t+1) = beta Mw(t) + (1 - beta)w(t-1) for some appropriate beta;we call these the second-order schedules. In the idealized setting of weights being real numbers, we adopt known results to show that beta can be chosen so that the second-order schedule involves significantly fewer steps than the first-order method for approximate load balancing. In the realistic setting when the weights are positive integers, we simulate the idealized schedules by maintaining I Owe You units on the edges. Extensive experiments with simulated data and real-life data from JOSTLE, a mesh-partitioning software, show that the resultant realistic schedule is close to the idealized schedule, and it again involves fewer steps than the first-order schedules for approximate load balancing. Our main result is therefore a fast algorithm for coarse load balancing that can be used in a variety of applications.

关键词： Stochastic systems

来源：评论

学校读者我要写书评

暂无评论

A framework for simple sorting algorithms on parallel disk systems (extended abstract) 98

A framework for simple sorting algorithms on parallel disk s...

引用

proceedings of the tenth annual acm symposium on parallel algorithms and architectures

作者： Sanguthevar Rajasekaran Dept. of CISE University of Florida Gainesville FL

来源：评论

学校读者我要写书评

暂无评论

Distributed and parallel computing issues in data warehousing (abstract) 98

Distributed and parallel computing issues in data warehousin...

引用

proceedings of the tenth annual acm symposium on parallel algorithms and architectures

作者： Hector Garcia-Molina Wilburt J. Labio Janet L. Wiener Yue Zhuge Stanford University

No abstract available.

ISBN: (纸本)9780897919890

No abstract available.

关键词：

来源：评论

学校读者我要写书评

暂无评论

parallel continuous randomized load balancing (extended abstract) 98

Parallel continuous randomized load balancing (extended abst...

引用

proceedings of the tenth annual acm symposium on parallel algorithms and architectures

作者： Petra Berenbrink Tom Friedetzky Ernst W. Mayr Department of Mathematics and Computer Science Paderborn University Germany Institut für Informatik Technische Universität München Germany

来源：评论

学校读者我要写书评

暂无评论

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案：

请选择收藏分类：

通借通还

建议与咨询 留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案： 新增检索档案 确定 取消

请选择收藏分类： 新增自定义分类 确定 取消

通借通还

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

请选择保存的检索档案：

请选择收藏分类：