ISBN: (Print) 9781581139860
High-end shared storage systems serving multiple independent workloads must ensure that concurrently executing clients receive a fair or agreed-upon share of system I/O resources. In a parallel I/O system, an application makes requests for specific disks at different steps of its computation, depending on the data layout and its computational state. Different applications contend for disk access, making the problem of maintaining fair allocation challenging. We propose a model for differentiated disk bandwidth allocation based on lexicographic minimization, and provide new efficient scheduling algorithms to allocate the I/O bandwidth fairly among contending applications. A major contribution of our model is its ability to handle multiple parallel disks and contention for disks among the concurrent applications. Analysis and simulation-based evaluation show that our algorithms provide performance isolation and weighted allocation of resources, and are work conserving. The solutions are also applicable to other shared-resource environments dealing with non-uniform heterogeneous servers. Copyright 2005 ACM.
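The lexicographic model described here generalizes classic max-min fairness across multiple disks. As a rough illustration of the single-resource base case only (a sketch; the function and client names are hypothetical and not from the paper), a water-filling pass allocates one disk's bandwidth max-min fairly among contending clients:

```python
def max_min_share(capacity, demands):
    """Allocate one disk's bandwidth max-min fairly among clients.

    demands: dict client -> requested bandwidth. Repeatedly split the
    remaining capacity evenly among still-unsatisfied clients; a client
    never receives more than it asked for, so small demands are met
    exactly and the remaining clients share the leftover evenly.
    """
    alloc = {c: 0.0 for c in demands}
    active = {c for c, d in demands.items() if d > 0}
    remaining = capacity
    while active and remaining > 1e-12:
        share = remaining / len(active)
        for c in list(active):
            give = min(share, demands[c] - alloc[c])
            alloc[c] += give
            remaining -= give
        active = {c for c in active if demands[c] - alloc[c] > 1e-12}
    return alloc

# 10 units of bandwidth: small demands are satisfied, the rest split evenly.
alloc = max_min_share(10.0, {"app-a": 2.0, "app-b": 4.0, "app-c": 10.0})
```

Lexicographic minimization extends this idea by optimizing the sorted allocation vector across all disks at once, which is where contention between applications for the same disks enters.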
ISBN: (Print) 9781581139860
This paper introduces a parallel scheduling problem where a directed acyclic graph modeling t tasks and their dependencies needs to be executed on n unreliable workers. Worker i executes task j correctly with probability p_{i,j}. The goal is to find a regimen Σ that dictates how workers get assigned to tasks (possibly in parallel and redundantly) throughout execution, so as to minimize the expected completion time. This fundamental parallel scheduling problem arises in grid computing and project management, and has several practical applications. We show a polynomial-time algorithm for the problem restricted to the case when the dag width is at most a constant and the number of workers is also at most a constant. These two restrictions may appear too severe. However, they are fundamentally required. Specifically, we demonstrate that the problem is NP-hard with a constant number of workers when the dag width can grow, and is also NP-hard with constant dag width when the number of workers can grow. When both the dag width and the number of workers are unconstrained, the problem is inapproximable within a factor less than 5/4, unless P = NP. Copyright 2005 ACM.
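As a back-of-the-envelope illustration of why redundant assignment pays off (a toy single-task computation, not the paper's regimen-optimization algorithm): if several workers attempt the same task every round, the task completes in a round unless all of them fail, so the number of rounds is geometrically distributed:

```python
def expected_rounds(success_probs):
    """Expected rounds to finish one task when all listed workers attempt it
    each round: the per-round success probability is 1 - prod(1 - p_i), and
    the round count is geometric with that parameter."""
    fail_all = 1.0
    for p in success_probs:
        fail_all *= 1.0 - p
    return 1.0 / (1.0 - fail_all)

print(expected_rounds([0.5]))        # one coin-flip worker: 2 rounds on average
print(expected_rounds([0.5, 0.5]))   # redundancy helps: 4/3 rounds on average
```

The hard part the paper addresses is choosing, at every state of the dag, how to split workers across the currently eligible tasks, which couples these per-task calculations.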
ISBN: (Print) 9781581139860
We study the relatively old problem of asymptotically reducing the runtime of serial computations with polynomial-size Boolean circuits. To the best of our knowledge, no progress on this problem has been formally reported in the literature for general computational models, although we observe that early work of Chandra, Stockmeyer, and Vishkin implies the existence of non-uniform unbounded fan-in circuits of t^O(1) size and O(t/log log n) depth for time-t Turing machines. We give an algorithmic size-depth tradeoff for parallelizing time-t random access Turing machines, a model at least as powerful as logarithmic-cost RAMs. Our parallel simulation yields logspace-uniform t^O(1) size, O(t/log t) depth Boolean circuits having semi-unbounded fan-in gates. In fact, for appropriate d, uniform t^O(1)·2^O(t/d) size circuits of depth O(d) can simulate time t. One corollary is that any log-cost time-t RAM can be simulated by a log-cost CRCW PRAM using t^O(1) processors and O(t/log t) time. This is a major improvement over previous parallel speedups, which could only guarantee an Ω(log t) speedup with an exponential number of processors. Copyright 2005 ACM.
ISBN: (Print) 9781581139860
We consider a model with n players and m objects. Each player has a "preference vector" of length m that records his grade for each object. The grades are unknown to the players. A player can learn his grade for an object by probing that object, but performing a probe incurs a cost. The goal of a player is to learn his preference vector with minimal cost by adopting the results of probes performed by other players. To facilitate communication, we assume that players collaborate by posting their grades for objects on a shared billboard: reading from the billboard is free. We consider players whose preference vectors are popular, i.e., players whose preferences are common to many other players. We present distributed and sequential algorithms that solve the problem with logarithmic cost overhead. Copyright 2005 ACM.
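The billboard mechanic can be sketched in a few lines (a minimal illustration; the class, function, and parameter names are invented here, and the "trusted" set stands in for the paper's notion of players with popular, shared preferences):

```python
class Billboard:
    """Shared billboard: posting and reading grades is free; probing costs."""

    def __init__(self):
        self.posts = {}                      # (player_id, obj) -> grade

    def read(self, obj):                     # reading is free
        return [(pid, g) for (pid, o), g in self.posts.items() if o == obj]

    def post(self, player_id, obj, grade):
        self.posts[(player_id, obj)] = grade

def learn_vector(player_id, true_grades, billboard, trusted):
    """Learn this player's grades for all objects, paying cost 1 per probe;
    adopt for free any grade already posted by a trusted (like-minded) player."""
    cost, learned = 0, {}
    for obj, grade in enumerate(true_grades):
        posted = [g for pid, g in billboard.read(obj) if pid in trusted]
        if posted:
            learned[obj] = posted[0]         # adopt an existing result for free
        else:
            learned[obj] = grade             # probe the object ourselves
            billboard.post(player_id, obj, grade)
            cost += 1
    return learned, cost

bb = Billboard()
grades = [3, 1, 2]                           # one popular preference vector
_, cost_first = learn_vector(0, grades, bb, trusted={0, 1})
learned_second, cost_second = learn_vector(1, grades, bb, trusted={0, 1})
# the first player pays for every probe; the second adopts all posts for free
```

The paper's contribution is sharing the probing cost fairly among the popular players, rather than letting one player pay for everything as in this naive sketch.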
ISBN: (Print) 0769524451
In recent years, reconfigurable technology has emerged as a popular choice for implementing various types of cryptographic functions. Nevertheless, an insufficient amount of effort has been placed into fully exploiting the tremendous amount of parallelism intrinsic to FPGAs for this class of algorithms. In this paper, we focus on block cipher architectures and explore design decisions that leverage the multi-grained parallelism inherent in many of these algorithms. We demonstrate the usefulness of this approach with a highly parallel FPGA implementation of the AES standard, and present results detailing the area/delay tradeoffs resulting from our design decisions.
ISBN: (Print) 9781581139860
We present two methods for weighted consistent hashing, also known as weighted distributed hash tables. The first method, called the Linear Method, combines the standard consistent hashing introduced by Karger et al. [9] with a linear weighted distance measure. By using node copies and different partitions of the hash space, the balance of this scheme approximates the fair weight relationship with high probability. The second method, called the Logarithmic Method, uses a logarithmic weighted distance between the peers and the data to find the corresponding node. For distributing one data element it provides perfect weighted balance. To provide this distribution for many data elements we use partitions to achieve a fair balance with high probability. These methods provide small fragmentation, which means that the hash space is divided into at most O(n log n) intervals. Furthermore, there is an efficient data structure that assigns data elements to the nodes in expected time O(log n). If small fragmentation is not an issue, one can replace the use of partitions by a method we call double hash functions. This method needs O(n) time for assigning elements to a node, yet it can be used directly for Storage Area Networks, where the number of nodes is small compared to the number of participating nodes in peer-to-peer networks. Copyright 2005 ACM.
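A weighted distance measure over a hash ring can be sketched as follows (an illustrative reading of the idea, not the paper's Linear Method with node copies and partitions; the node names and weights are made up):

```python
import hashlib

def h(key):
    """Hash a string to a point in [0, 1), the unit hash ring."""
    return int(hashlib.sha256(key.encode()).hexdigest(), 16) / 2**256

def clockwise_dist(a, b):
    """Clockwise distance from point a to point b on the unit ring."""
    return (b - a) % 1.0

def assign(item, nodes):
    """Assign an item to the node minimizing clockwise distance from the
    item's hash, scaled down by the node's weight, so that a heavier node
    captures a larger share of the ring."""
    x = h(item)
    return min(nodes, key=lambda n: clockwise_dist(x, h(n[0])) / n[1])

nodes = [("node-a", 1.0), ("node-b", 2.0), ("node-c", 1.0)]
counts = {}
for i in range(4000):
    name, _ = assign(f"item-{i}", nodes)
    counts[name] = counts.get(name, 0) + 1
```

Without the paper's node copies and hash-space partitions this simple version only approximates weight-proportional balance, which is exactly the gap their high-probability analysis closes.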
ISBN: (Print) 9781581139860
The generalized windows scheduling problem for n jobs on multiple machines is defined as follows: given a sequence I = 〈(w_1, l_1), (w_2, l_2), ..., (w_n, l_n)〉 of n pairs of positive integers associated with the jobs 1, 2, ..., n, respectively, where the processing length of job i is l_i slots (a slot is the processing time of one length unit), the goal is to repeatedly and non-preemptively schedule all the jobs on the fewest possible parallel machines such that the gap (window) between two consecutive executions of the first slot of job i is at most w_i slots. This problem arises in push broadcast systems in which data is transmitted on parallel channels. The problem is NP-hard even for unit-length jobs, and a (1 + ε)-approximation algorithm is known for this case by approximating the natural lower bound W(I) = Σ_{i=1}^n (1/w_i). The techniques used for approximating unit-length jobs cannot be applied to arbitrary-length jobs, mainly because the optimal number of machines might be arbitrarily larger than the generalized lower bound W(I) = Σ_{i=1}^n (l_i/w_i). Our main result is an 8-approximation algorithm for the generalized problem using new methods, different from those used for the unit-length case. We also present an algorithm that uses 2(1 + ε)W(I) + log w_max machines, and a greedy algorithm that is based on a new tree representation of schedules. The greedy algorithm is optimal for some special cases, and simulations show that it performs very well in practice. Copyright 2005 ACM.
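The generalized lower bound is straightforward to compute; a short sketch (the job data below is made up for illustration):

```python
import math

def generalized_lower_bound(jobs):
    """W(I) = sum(l_i / w_i): scheduling job i every w_i slots for l_i slots
    occupies an l_i/w_i fraction of one machine, so any feasible schedule
    needs at least ceil(W(I)) machines."""
    return sum(l / w for w, l in jobs)

jobs = [(2, 1), (4, 1), (4, 2), (8, 2)]   # (window w_i, length l_i) pairs
W = generalized_lower_bound(jobs)          # 1/2 + 1/4 + 2/4 + 2/8 = 1.5
machines = math.ceil(W)                    # at least 2 machines are required
```

The abstract's point is that, unlike the unit-length case, this bound can be arbitrarily far below the true optimum, so the 8-approximation cannot simply round it up.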
ISBN: (Print) 9781581139860
We study randomized algorithms for placing a sequence of n nodes on a circle with unit perimeter. Nodes divide the circle into disjoint arcs. We desire that a newly-arrived node (which is oblivious of its index in the sequence) choose its position on the circle by learning the positions of as few existing nodes as possible. At the same time, we desire that the variation in arc-lengths be small. To this end, we propose a new algorithm that works as follows: the kth node chooses r random points on the circle, inspects the sizes of v arcs in the vicinity of each random point, and places itself at the mid-point of the largest arc encountered. We show that for any combination of r and v satisfying rv ≥ c log k, where c is a small constant, the ratio of the largest to the smallest arc-length is at most eight w.h.p., for an arbitrarily long sequence of n nodes. This strategy of node placement underlies a novel decentralized load-balancing algorithm that we propose for Distributed Hash Tables (DHTs) in peer-to-peer environments. Underlying the analysis of our algorithm is Structured Coupon Collection over n/b disjoint cliques with b nodes per clique, for any n, b ≥ 1. Nodes are initially uncovered. At each step, we choose d nodes independently and uniformly at random. If all the nodes in the corresponding cliques are covered, we do nothing. Otherwise, from among the chosen cliques with at least one uncovered node, we select one at random and cover an uncovered node within that clique. We show that as long as bd ≥ c log n, O(n) steps are sufficient to cover all nodes w.h.p., and each of the first Ω(n) steps succeeds in covering a node w.h.p. These results are then utilized to analyze a stochastic process for growing binary trees that are highly balanced: the leaves of the tree belong to at most four different levels with high probability. Copyright 2005 ACM.
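The placement rule is simple enough to sketch directly (one possible reading of "v arcs in the vicinity": the v arcs clockwise of each sampled point; the function name and parameter choices below are illustrative):

```python
import bisect
import random

def insert_node(positions, r, v):
    """Place a new node on the unit circle: sample r random points, inspect
    the v arcs clockwise of each point, and settle at the midpoint of the
    largest arc seen. positions is a sorted list of existing node positions."""
    if not positions:
        positions.append(random.random())
        return
    n = len(positions)
    best_len, best_start = -1.0, 0.0
    for _ in range(r):
        p = random.random()
        i = bisect.bisect_left(positions, p)       # first node clockwise of p
        for k in range(v):                          # v consecutive arcs
            j = (i - 1 + k) % n
            a, b = positions[j], positions[(j + 1) % n]
            length = (b - a) % 1.0 or 1.0           # lone node owns the whole circle
            if length > best_len:
                best_len, best_start = length, a
    bisect.insort(positions, (best_start + best_len / 2) % 1.0)

random.seed(7)
positions = []
for _ in range(64):
    insert_node(positions, r=1, v=8)               # r*v comfortably above log k
arcs = [(positions[(i + 1) % 64] - positions[i]) % 1.0 for i in range(64)]
ratio = max(arcs) / min(arcs)                       # small w.h.p. per the analysis
```

Each arrival inspects only O(log k) arcs, which is the point: near-uniform spacing without any node learning the global layout.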
We explore the possibility of using multiple processors to improve the encoding and decoding times of Lempel-Ziv schemes. A new layout of the processors, based on a full binary tree, is suggested, and it is shown how LZSS and LZW can be adapted to take advantage of such parallel architectures. The layout is then generalized to higher-order trees. Experimental results show an improvement in compression over the standard method of parallelization and an improvement in time over the sequential method. (C) 2004 Elsevier B.V. All rights reserved.