ISBN (Print): 9781595936677
Given a set of points P ⊆ ℝ², a conflict-free coloring of P with respect to rectangle ranges is an assignment of colors to the points of P such that each non-empty axis-parallel rectangle T in the plane contains a point whose color is distinct from the colors of all other points in P ∩ T. This notion has been the subject of recent interest, and is motivated by frequency assignment in wireless cellular networks: one naturally would like to minimize the number of frequencies (colors) assigned to base stations (points), such that within any range (for instance, a rectangle), there is no interference. We show that any set of n points in ℝ² can be conflict-free colored with Õ(n^(β+ε)) colors in expected polynomial time, for any arbitrarily small ε > 0 and β = (3 − √5)/2 < 0.382. This improves upon the previously known bound of O(√(n log log n / log n)).
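As a concrete illustration of the definition above, here is a small brute-force checker (a sketch in Python; the function name is my own, and it relies on the observation that it suffices to test rectangles whose sides pass through point coordinates, since any rectangle contains the same point subset as some such minimal rectangle):

```python
def is_conflict_free(points, colors):
    """Check that a coloring is conflict-free w.r.t. axis-parallel
    rectangles: every non-empty rectangle must contain a point whose
    color occurs exactly once inside that rectangle."""
    xs = sorted({p[0] for p in points})
    ys = sorted({p[1] for p in points})
    for x1 in xs:
        for x2 in (x for x in xs if x >= x1):
            for y1 in ys:
                for y2 in (y for y in ys if y >= y1):
                    # Collect the points falling inside this rectangle.
                    inside = [colors[i] for i, (x, y) in enumerate(points)
                              if x1 <= x <= x2 and y1 <= y <= y2]
                    if not inside:
                        continue
                    counts = {}
                    for c in inside:
                        counts[c] = counts.get(c, 0) + 1
                    # Conflict-free requires some color with count 1.
                    if 1 not in counts.values():
                        return False
    return True
```

For three points on a shallow "V", two colors can fail (the rectangle spanning the two outer points sees the same color twice), while three distinct colors trivially succeed.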
ISBN (Print): 9781595936677
In chip multiprocessors (CMPs), limiting the number of off-chip cache misses is crucial for good performance. Many multithreaded programs provide opportunities for constructive cache sharing, in which concurrently scheduled threads share a largely overlapping working set. In this paper, we compare the performance of two state-of-the-art schedulers proposed for fine-grained multithreaded programs: Parallel Depth First (PDF), which is specifically designed for constructive cache sharing, and Work Stealing (WS), which is a more traditional design. Our experimental results indicate that PDF scheduling yields a 1.3-1.6X performance improvement relative to WS for several fine-grain parallel benchmarks on projected future CMP configurations; we also report several issues that may limit the advantage of PDF in certain applications. These results also indicate that PDF more effectively utilizes off-chip bandwidth, making it possible to trade off on-chip cache for a larger number of cores. Moreover, we find that task granularity plays a key role in cache performance. Therefore, we present an automatic approach for selecting effective grain sizes, based on a new working set profiling algorithm that is an order of magnitude faster than previous approaches. This is the first paper demonstrating the effectiveness of PDF on real benchmarks; providing a direct comparison between PDF and WS; revealing the limiting factors for PDF in practice; and presenting an approach for overcoming these factors.
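The core idea of PDF can be sketched in a few lines. The toy Python model below (the function name and the synchronous greedy step model are my own simplification, not the paper's implementation) assigns each task its priority in the sequential depth-first execution order and, at each step, runs the p highest-priority ready tasks, so co-scheduled tasks stay close in the sequential order and tend to share working sets:

```python
import heapq

def pdf_schedule(dag, order, p):
    """Greedy Parallel Depth First scheduling sketch.
    dag:   maps each task to the set of its predecessor tasks.
    order: maps each task to its sequential depth-first priority.
    p:     number of processors per step.
    Returns the list of task batches executed at each step."""
    indeg = {t: len(preds) for t, preds in dag.items()}
    succs = {t: [] for t in dag}
    for t, preds in dag.items():
        for q in preds:
            succs[q].append(t)
    # Ready tasks, ordered by depth-first priority.
    ready = [(order[t], t) for t, d in indeg.items() if d == 0]
    heapq.heapify(ready)
    steps = []
    while ready:
        # Run up to p earliest-priority ready tasks this step.
        batch = [heapq.heappop(ready)[1] for _ in range(min(p, len(ready)))]
        steps.append(batch)
        for t in batch:
            for s in succs[t]:
                indeg[s] -= 1
                if indeg[s] == 0:
                    heapq.heappush(ready, (order[s], s))
    return steps
```

On a diamond DAG a → {b, c} → d with two processors, this yields the batches [a], [b, c], [d]; WS, by contrast, would let each processor run its local deque independently, with no preference for tasks adjacent in the sequential order.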
ISBN (Print): 9781595936677
This paper addresses the problem of scheduling a DAG of unit-length tasks on asynchronous processors, that is, processors having different and changing speeds. The objective is to minimize the makespan, that is, the time to execute the entire DAG. Asynchrony is modeled by an oblivious adversary, which is assumed to determine the processor speeds at each point in time. The oblivious adversary may change processor speeds arbitrarily and arbitrarily often, but makes speed decisions independently of any random choices of the scheduling algorithm. This paper gives bounds on the makespan of two randomized online firing-squad scheduling algorithms, ALL and LEVEL. These two schedulers are shown to have good makespan even when asynchrony is arbitrarily extreme. Let W and D denote, respectively, the number of tasks and the longest path in the DAG, and let π_ave denote the average speed of the p processors during the execution. In ALL, each processor repeatedly chooses a random task to execute from among all ready tasks (tasks whose predecessors have been executed). Scheduler ALL is shown to have makespan
T_p = Θ(W/(p·π_ave)) when W/D ≥ p log p;
T_p = Θ((log p)^α · W/(p·π_ave) + (log p)^(1−α) · D/π_ave) when W/D = p(log p)^(1−2α), for α ∈ [0,1];
T_p = Θ(D/π_ave) when W/D ≤ p/log p;
both in expectation and with high probability. A family of DAGs is exhibited for which this analysis is tight. In LEVEL, each of the p processors repeatedly chooses a random task to execute from among all critical tasks (ready tasks at the lowest level of the DAG). This second scheduler is shown to have a makespan of T_p = Θ(W/(p·π_ave) + [log* p − log*(pD/W)] · D/π_ave), both in expectation and with high probability. Thus, LEVEL is always at least as good as ALL asymptotically, and sometimes better.
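A toy simulation can make the ALL scheduler concrete. In the sketch below (my own simplification, not the paper's model: the adversary is a hypothetical `speeds` callback, and a processor of speed s in (0, 1] is modeled as completing its chosen task in a unit step with probability s), each processor independently picks a uniformly random ready task every step:

```python
import random

def simulate_all(dag, speeds, seed=0):
    """Toy simulation of the ALL scheduler.
    dag:    maps each task to the set of its predecessor tasks.
    speeds: callback t -> list of per-processor speeds at step t
            (the oblivious adversary; speeds fixed in advance).
    Returns the number of unit steps until all tasks complete."""
    rng = random.Random(seed)
    indeg = {t: len(preds) for t, preds in dag.items()}
    succs = {t: [] for t in dag}
    for t, preds in dag.items():
        for q in preds:
            succs[q].append(t)
    ready = {t for t, d in indeg.items() if d == 0}
    done = set()
    step = 0
    while len(done) < len(dag):
        step += 1
        finished = set()
        for s in speeds(step):
            # Each processor picks a uniformly random ready task;
            # several processors may duplicate work on the same task.
            pick = rng.choice(sorted(ready))
            if rng.random() < s:
                finished.add(pick)
        for t in finished:
            ready.discard(t)
            done.add(t)
            for u in succs[t]:
                indeg[u] -= 1
                if indeg[u] == 0:
                    ready.add(u)
    return step
```

With a single full-speed processor on a three-task chain (W = D = 3), the makespan is exactly 3 steps, matching the Θ(D/π_ave) regime.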
ISBN (Print): 1595934529
The proceedings contain 43 papers. The topics discussed include: publish and perish: definition and analysis of an n-person publication impact game; exponential separation of quantum and classical online space complexity; minimizing the stretch when scheduling flows of biological requests; position paper and brief announcement: the FG programming environment - good and good for you; efficient parallel algorithms for dead sensor diagnosis and multiple access channels; on the communication complexity of randomized broadcasting in random-like graphs; strip packing with precedence constraints and strip packing with release times; on space-stretch trade-offs: lower bounds; a performance analysis of local synchronization; the cache complexity of multithreaded cache oblivious algorithms; and deterministic load balancing and dictionaries in the parallel disk model.
Andrews et al. [Automatic method for hiding latency in high bandwidth networks, in: Proceedings of the ACM Symposium on Theory of Computing, 1996, pp. 257-265; Improved methods for hiding latency in high bandwidth networks, in: Proceedings of the Eighth Annual ACM Symposium on Parallel Algorithms and Architectures, 1996, pp. 52-61] introduced a number of techniques for automatically hiding latency when performing simulations of networks with unit-delay links on networks with arbitrary unequal-delay links. In their work, they assume that processors of the host network are identical in computational power to those of the guest network being simulated. They further assume that the links of the host are able to pipeline messages, i.e., they are able to deliver P packets in time O(P + d), where d is the delay on the link. In this paper we examine the effect of eliminating one or both of these assumptions. In particular, we provide an efficient simulation of a linear array of homogeneous processors connected by unit-delay links on a linear array of heterogeneous processors connected by links with arbitrary delay. We show that the slowdown achieved by our simulation is optimal. We then consider the case of simulating cliques by cliques; i.e., a clique of heterogeneous processors with arbitrary-delay links is used to simulate a clique of homogeneous processors with unit-delay links. We reduce the slowdown from the obvious bound of the maximum link delay to the average of the link delays. In the case of the linear array we consider links both with and without pipelining. For the clique simulation the links are not assumed to support pipelining. The main motivation of our results (as was the case with Andrews et al.) is to mitigate the degradation of performance when executing parallel programs designed for different architectures on a network of workstations (NOW). In such a setting it is unlikely that the links provided by the NOW will support pipelining and it is quite probab
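The clique-on-clique improvement is easy to state numerically: the obvious per-step slowdown bound is the maximum link delay, while the result above reduces it to the average link delay. A trivial sketch (the function name is my own) comparing the two bounds for a given delay vector:

```python
def slowdown_bounds(delays):
    """Given the host clique's link delays (in units of the guest's
    unit delay), return (obvious_bound, improved_bound): the obvious
    slowdown bound is the maximum delay, and the improved bound from
    the clique-on-clique simulation is the average delay."""
    return max(delays), sum(delays) / len(delays)
```

For delays [1, 2, 9], the obvious bound is 9 while the improved bound is 4.0, a substantial gain when one slow link would otherwise dominate.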