The proceedings contain 46 papers from SPAA 2003, the Fifteenth Annual ACM Symposium on Parallelism in Algorithms and Architectures. The topics discussed include: optimal sharing of bags of tasks in heterogeneous clusters; minimizing total flow time and total completion time with immediate dispatching; a practical algorithm for constructing oblivious routing schemes; a polynomial-time tree decomposition to minimize congestion; and online oblivious routing.
The proceedings contain 7 papers from SPAA 2003, the 15th Annual Symposium on Parallelism in Algorithms and Architectures. The topics discussed include: a practical algorithm for constructing oblivious routing schemes; novel architectures for P2P applications: the continuous-discrete approach; quantifying instruction criticality for shared memory multiprocessors; relaxing the problem-size bound for out-of-core columnsort; the complexity of verifying memory coherence; a near-optimal scheduler for switch-memory-switch routers; and on local algorithms for topology control and routing in ad hoc networks.
ISBN: (Print) 9781581136616
We study integrated prefetching and caching in single and parallel disk systems. There exist two very popular approximation algorithms called Aggressive and Conservative for minimizing the total elapsed time in the single disk problem. For D parallel disks, approximation algorithms are known for both the elapsed time and stall time performance measures. In particular, there exists a D-approximation algorithm for the stall time measure that uses D-1 additional memory locations in cache. In the first part of the paper we investigate approximation algorithms for the single disk problem. We give a refined analysis of the Aggressive algorithm, showing that the original analysis was too pessimistic. We prove that our new bound is tight. Additionally we present a new family of prefetching and caching strategies and give algorithms that perform better than Aggressive and Conservative. In the second part of the paper we investigate the problem of minimizing stall time in parallel disk systems. We present a polynomial time algorithm for computing a prefetching/caching schedule whose stall time is bounded by that of an optimal solution. The schedule uses at most 3(D - 1) extra memory locations in cache. This is the first polynomial time algorithm for computing schedules with a minimum stall time. Our algorithm is based on the linear programming approach of [1]. However, in order to achieve minimum stall times, we introduce the new concept of synchronized schedules in which fetches on the D disks are performed completely in parallel.
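To make the single-disk setting concrete, the following is a toy simulation of the Aggressive rule under a simplified cost model (one time unit per served request, one fetch in flight at a time, eager fetching with furthest-next-use eviction). The model and function names are illustrative assumptions, not the paper's exact formulation:

```python
def next_use(refs, block, start):
    """Index of the next reference to `block` at or after `start` (inf if none)."""
    for j in range(start, len(refs)):
        if refs[j] == block:
            return j
    return float("inf")

def aggressive(refs, cache, fetch_time):
    """Toy single-disk Aggressive prefetching: whenever the disk is idle,
    fetch the first missing block in the remaining reference sequence,
    evicting the cached block whose next use is furthest away -- but only
    if that victim is next needed after the fetched block.  Serving a
    cached request costs one time unit.  Returns the total elapsed time."""
    cache, t, i, fetch = set(cache), 0, 0, None
    while i < len(refs):
        if fetch and fetch[1] <= t:             # fetch finished: install block
            cache.add(fetch[0])
            fetch = None
        if fetch is None:
            missing = next((b for b in refs[i:] if b not in cache), None)
            if missing is not None:
                if not cache:                    # room available: fetch directly
                    fetch = (missing, t + fetch_time)
                else:
                    victim = max(cache, key=lambda b: next_use(refs, b, i))
                    if next_use(refs, victim, i) > next_use(refs, missing, i):
                        cache.discard(victim)
                        fetch = (missing, t + fetch_time)
        if refs[i] in cache:                     # hit: serve in one unit
            t += 1
            i += 1
        elif fetch and fetch[0] == refs[i]:      # miss on in-flight block: stall
            t = fetch[1]
        else:                                    # miss with no useful fetch: wait
            t += 1
    return t
```

With no misses the elapsed time is just the number of requests; a miss adds stall time only for the part of the fetch not hidden behind earlier hits.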
Diffusive schemes have been widely analyzed for parallel and distributed load balancing. It is well known that their convergence rates depend on the eigenvalues of some associated matrices and on the expansion properties of the underlying graphs. In the first part of this paper we make use of these relationships in order to obtain new spectral bounds on the edge and node expansion of graphs. We show that these new bounds are better than the classical bounds for several graph classes. In the second part of the paper, we consider the load balancing problem for indivisible unit size tokens. Since known diffusion schemes do not completely balance the load for such settings, we propose a randomized distributed algorithm based on Markov chains to reduce the load imbalance. We prove that this approach provides the best asymptotic result that can be achieved in l1- or l2-norm concerning the final load situation.
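For readers unfamiliar with diffusive schemes, the standard first-order diffusion iteration (whose convergence rate is governed by the eigenvalues the abstract mentions) can be sketched as follows; this is the classical scheme for divisible load, not the paper's randomized algorithm for indivisible tokens:

```python
def diffusion_step(load, nbrs, alpha):
    """One round of first-order diffusion: every node exchanges an
    alpha-fraction of the load difference with each neighbour.  The total
    load is conserved while differences shrink at a rate set by the
    spectrum of the diffusion matrix."""
    return [load[i] + alpha * sum(load[j] - load[i] for j in nbrs[i])
            for i in range(len(load))]

# 4-node cycle; alpha = 1/4 is a safe choice for maximum degree 2
nbrs = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
load = [8.0, 0.0, 0.0, 0.0]
for _ in range(100):
    load = diffusion_step(load, nbrs, 0.25)
# load is now numerically flat at the average, 2.0 on every node
```

On this cycle the second-largest eigenvalue of the iteration matrix is 1/2, so the imbalance halves every round; for indivisible unit-size tokens the iteration cannot reach the exact average, which is the gap the paper's randomized approach addresses.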
ISBN: (Print) 9781581136616
We propose a new approach for constructing P2P networks based on a dynamic decomposition of a continuous space into cells corresponding to processors. We demonstrate the power of these design rules by suggesting two new architectures, one for DHT (Distributed Hash Table) and the other for dynamic expander networks. The DHT network, which we call Distance Halving, allows logarithmic routing and load while preserving constant degrees. It offers an optimal tradeoff between the degree and the dilation in the sense that degree d guarantees a dilation of O(log_d n). Another advantage over previous constructions is its relative simplicity. A major new contribution of this construction is a dynamic caching technique that maintains low load and storage even under the occurrence of hot spots. Our second construction builds a network that is guaranteed to be an expander. The resulting topologies are simple to maintain and implement. Their simplicity makes it easy to modify and add protocols. A small variation yields a DHT which is robust against random faults. Finally, we show that, using our approach, it is possible to construct any family of constant-degree graphs in a dynamic environment, though with worse parameters. Therefore we expect that more distributed data structures could be designed and implemented in a dynamic environment.
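The logarithmic-dilation routing can be illustrated on a fixed-size discrete skeleton of such a continuous-discrete network. The sketch below uses a de Bruijn-style graph (node v linked to 2v mod n and 2v+1 mod n); the function name and the exact discretisation are illustrative assumptions, not the paper's construction verbatim:

```python
def distance_halving_route(u, w, k):
    """Route from u to w on the 2**k-node de Bruijn-style skeleton of a
    continuous-discrete network.  Each hop shifts one bit of the
    destination into the current address, so any route takes at most
    k = log2(n) hops -- the logarithmic dilation claimed above."""
    n = 1 << k
    path = [u]
    for b in range(k - 1, -1, -1):        # inject w's bits, high bit first
        u = ((u << 1) | ((w >> b) & 1)) % n
        path.append(u)
    return path
```

After k hops the source's bits have been shifted out entirely and only the destination's bits remain, so the walk always ends at w, and every step follows an edge of the graph.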
Accommodating the uncertain latency of load instructions is one of the most vexing problems in in-order microarchitecture design and compiler development. Compilers can generate schedules with a high degree of instruction-level parallelism but cannot effectively accommodate unanticipated latencies; incorporating traditional out-of-order execution into the microarchitecture hides some of this latency but redundantly performs work done by the compiler and adds additional pipeline stages. Although effective techniques, such as prefetching and threading, have been proposed to deal with anticipable, long latency misses, the shorter, more diffuse stalls due to difficult-to-anticipate, first- or second-level misses are less easily hidden on in-order architectures. This paper addresses this problem by proposing a microarchitectural technique, referred to as two-pass pipelining, wherein the program executes on two in-order back-end pipelines coupled by a queue. The "advance" pipeline executes instructions greedily, without stalling on unanticipated latency dependences (executing independent instructions while otherwise blocking instructions are deferred). The "backup" pipeline allows concurrent resolution of instructions that were deferred in the other pipeline, resulting in the absorption of shorter misses and the overlap of longer ones. This paper argues that this design is both achievable and a good use of transistor resources and shows results indicating that it can deliver significant speedups for in-order processor designs.
We discuss parallel sorting algorithms and their implementations suitable for cluster architectures in order to optimize cluster resources. We focus on the time spent in computation and the load balancing properties when processors are running at different speeds, i.e. correlated by a multiplicative constant factor (our weak definition of a heterogeneous platform). One scheme is under study: parallel sorting by sampling (either the regular sampling technique introduced by Shi and Schaeffer [J. Parallel Distrib. Comput. 14 (4) (1992) 361] or the over-partitioning scheme introduced by Li and Sevcik [Parallel sorting by over-partitioning, in: Proceedings of the Sixth Annual Symposium on Parallel Algorithms and Architectures, ACM Press, New York, June 1994]). What matters in this paper is mainly the load balance factor and not necessarily the execution time; it is clear that improved load balance leads to improved execution time. The results presented in the paper demonstrate that load balancing for the case of computers with heterogeneous processing capacity is more challenging than for the homogeneous case. The survey, through the sorting case study, allows us to identify some algorithmic issues and software challenges in mastering heterogeneous cluster platforms in order to better utilize them: data decomposition techniques, scheduling, and load balancing methods. (C) 2002 Elsevier Science B.V. All rights reserved.
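The regular-sampling scheme can be sketched sequentially; each part and each bucket would live on its own processor in a real cluster. This is a minimal illustration of the technique, not the cited papers' implementations:

```python
import bisect

def psrs(data, p):
    """Sequential sketch of Parallel Sorting by Regular Sampling: sort p
    parts independently, pick p-1 splitters from regularly spaced samples
    of the sorted parts, then route each key to the bucket selected by its
    splitter interval and sort the buckets."""
    n = len(data)
    parts = [sorted(data[i * n // p:(i + 1) * n // p]) for i in range(p)]
    step = max(1, n // (p * p))
    sample = sorted(x for part in parts for x in part[::step][:p])
    splitters = [sample[i * len(sample) // p] for i in range(1, p)]
    buckets = [[] for _ in range(p)]
    for part in parts:
        cuts = [0] + [bisect.bisect_right(part, s) for s in splitters] + [len(part)]
        for i in range(p):
            buckets[i].extend(part[cuts[i]:cuts[i + 1]])
    return [x for b in buckets for x in sorted(b)]
```

Regular sampling bounds the largest bucket (classically by about 2n/p on homogeneous processors); the abstract's point is that such guarantees degrade when processor speeds differ, which is where over-partitioning and rebalancing come in.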
ISBN: (Print) 0898714346
We present a significant improvement for parallel integer sorting. On the EREW (exclusive read exclusive write) PRAM our algorithm sorts n integers in the range {0, 1, ..., m − 1} in time O(log n) with O(n √(log n)/k) operations using word length k log(m + n), where 1 ≤ k ≤ log n. In this paper we present the following four variants of our algorithm. (1) The first variant sorts integers in {0, 1, ..., m − 1} in time O(log n) and in linear space with O(n) operations using word length log m log n. (2) The second variant sorts integers in {0, 1, ..., n − 1} in time O(log n) and in linear space with O(n √(log n)) operations using word length log n. (3) The third variant sorts integers in {0, 1, ..., m − 1} in time O(log^(3/2) n) and in linear space with O(n √(log n)) operations using word length log(m + n). (4) The fourth variant sorts integers in {0, 1, ..., m − 1} in time O(log n) and space O(n m^ε) with O(n √(log n)) operations using word length log(m + n). Our algorithms can then be generalized to the situation where the word length is k log(m + n), 1 ≤ k ≤ log n.
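The word-length parameter k reflects word-level parallelism: with k keys packed into one word, a single machine operation acts on all of them, which is where the /k factor in the operation count comes from. A minimal sketch of the packing idea (illustrative only, not the paper's sorting algorithm):

```python
def pack(vals, field):
    """Pack integers into a single word, `field` bits per lane.  If every
    value fits in field-1 bits, lane-wise sums cannot carry across lanes."""
    w = 0
    for v in reversed(vals):
        w = (w << field) | v
    return w

def unpack(w, count, field):
    """Split a packed word back into its `count` lanes of `field` bits."""
    mask = (1 << field) - 1
    return [(w >> (i * field)) & mask for i in range(count)]

# One machine addition adds all three lanes at once:
a = pack([1, 2, 3], 8)
b = pack([10, 20, 30], 8)
sums = unpack(a + b, 3, 8)   # [11, 22, 33]
```

Packed comparisons and merges work similarly by reserving a spare test bit per lane, so sorting k-tuples of short keys costs roughly one word operation per tuple instead of k.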
ISBN: (Print) 9781581135299
The utility of algorithm parallelism for coping with increased processor-to-memory latencies using "latency hiding" is part of the folklore of parallel computing. Latency hiding techniques increase the traffic to memory and therefore may "hit another wall": limited bandwidth to memory. The current paper attempts to stimulate research in the following general direction: show that algorithm parallelism need not conflict with limited bandwidth. A general technique for using parallel algorithms to enhance serial implementation in the face of processor-memory latency problems is revisited. Two techniques for alleviating memory bandwidth constraints are presented. Both techniques can be incorporated in a […]. There is often considerable parallelism in many of the algorithms which are known as useful serial algorithms. Interestingly enough, all the examples provided for the use of the two techniques come from such serial algorithms.