Search results - Inner Mongolia University Library

Proceedings of the 1998 10th Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA

Authors: Vishkin, U.; Dascal, Sh.; Berkovich, E.; Nuzman, J. (Univ of Maryland and Tel-Aviv Univ)

ISBN: (print) 9780897919890

This paper envisions an extension to a standard instruction set which efficiently implements PRAM-style algorithms using explicit multi-threaded instruction-level parallelism (ILP); that is, Explicit Multi-Threading (XMT), a fine-grained computational paradigm covering the spectrum from algorithms through architecture to implementation, is introduced; new elements are added where needed.

Keywords: parallel algorithms

SPAA 2021 - Proceedings of the 33rd ACM Symposium on Parallelism in Algorithms and Architectures

33rd ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 2021

ISBN: (print) 9781450380706

The proceedings contain 49 papers. The topics discussed include: fast stencil computations using fast Fourier transforms; low-span parallel algorithms for the binary-forking model; provable advantages for graph algorithms in spiking neural networks; algorithms for right-sizing heterogeneous data centers; efficient parallel self-adjusting computation; speed scaling with explorable uncertainty; efficient online weighted multi-level paging; paging and the address-translation problem; massively parallel algorithms for distance approximation and spanners; efficient load-balancing through distributed token dropping; finding subgraphs in highly dynamic networks; near-optimal time-energy trade-offs for deterministic leader election; and efficient stepping algorithms and implementations for parallel shortest paths.

Efficient external memory algorithms by simulating coarse-grained parallel algorithms

Proceedings of the 1997 9th Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA

Authors: Dehne, Frank; Dittrich, Wolfgang; Hutchinson, David (Carleton Univ, Ottawa, Ont, Canada)

ISBN: (print) 9780897918909

External memory (EM) algorithms are designed for computational problems in which the size of the internal memory of the computer is only a small fraction of the problem size. For certain large scale applications this is necessarily true. Typically, the cost models proposed for external memory algorithms have measured only the number of I/O operations, and the algorithms have been specially crafted for the EM situation. In the past, several attempts have been made to relate the large body of work based on parallel algorithms to EM, but with limited success. In this paper we provide simulation techniques which produce efficient EM algorithms from efficient algorithms developed under BSP-like parallel computing models. Our techniques can accommodate one or multiple processors on the EM target machine, each with one or more disks, and they also adapt to the disk blocking factor of the target machine. In addition to the main simulation result we obtain a more comprehensive cost model for EM algorithms, which considers the total costs incurred by the algorithm including computation, I/O and communication costs.

Keywords: parallel algorithms

A work-optimal CGM algorithm for the LIS problem

13th Annual Symposium on Parallel Algorithms and Architectures (SPAA 2001)

Authors: Thierry, G.; Jean-Frédéric, M.; David, S. (Lab. de Recherche en Info. d'Amiens, Univ. de Picardie Jules Verne, CURI, 5 rue du Moulin Neuf, 80000 Amiens, France)

This paper presents a work-optimal CGM algorithm that solves the Longest Increasing Subsequence Problem. It can be implemented in the CGM with P processors in O(N^2/P) time and O(P) communication steps. It is the first...
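For orientation, the O(N^2/P) bound corresponds to O(N^2) total work. The following is a minimal sequential sketch of the textbook quadratic dynamic program for the longest increasing subsequence, which has that same O(N^2) work. It is not the CGM algorithm of the paper, whose details are not given in this record. The function name `lis_quadratic` and the choice of strictly increasing subsequences are assumptions made for illustration.

```python
from typing import List, Sequence


def lis_quadratic(seq: Sequence[int]) -> List[int]:
    """Return one longest strictly increasing subsequence of seq.

    Textbook O(N^2) dynamic program (hypothetical helper, not the paper's
    CGM algorithm).
    """
    n = len(seq)
    if n == 0:
        return []
    best = [1] * n   # best[i]: length of the longest increasing subsequence ending at i
    prev = [-1] * n  # prev[i]: index of the predecessor of i in that subsequence
    for i in range(n):
        for j in range(i):
            if seq[j] < seq[i] and best[j] + 1 > best[i]:
                best[i] = best[j] + 1
                prev[i] = j
    # Walk back from the position where the longest subsequence ends.
    i = max(range(n), key=best.__getitem__)
    out = []
    while i != -1:
        out.append(seq[i])
        i = prev[i]
    return out[::-1]
```

The two nested loops make the quadratic work explicit; a parallel version would have to distribute that work across the P processors.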

Keywords: parallel processing systems

On a scheme for parallel sorting on heterogeneous clusters

Future Generation Computer Systems, 2002, vol. 18, no. 3, pp. 353-372

Authors: Cérin, C; Gaudiot, JL (Univ So Calif, Los Angeles, CA 90089, USA; Univ Picardie Jules Verne, Laria, F-80000 Amiens, France)

We discuss parallel sorting algorithms and their implementations suitable for cluster architectures in order to optimize cluster resources. We focus on the time spent in computation and the load balancing properties when processors are running at different speeds, i.e. correlated by a multiplicative constant factor (our weak definition of heterogeneous platform). One scheme is under study: parallel sorting by sampling (either the regular sampling technique introduced by Shi and Schaeffer [J. Parallel Distrib. Comput. 14 (4) (1992) 361] or the over-partitioning scheme introduced by Li and Sevcik [Parallel sorting by over-partitioning, in: Proceedings of the Sixth Annual Symposium on Parallel Algorithms and Architectures, ACM Press, New York, June 1994]). What is important in the paper is mainly the load balance factor and not necessarily the execution time. It is clear that improved load balance leads to improved execution time. The results presented in the paper demonstrate that load balancing for the case of computers with heterogeneous processing capacity is more challenging than for the homogeneous case. The survey, through the sorting case study, allows us to identify some algorithmic issues and software challenges to master heterogeneous cluster platforms in order to better utilize them: data decomposition techniques, scheduling and load balancing methods. (C) 2002 Elsevier Science B.V. All rights reserved.
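As a rough illustration of sorting by regular sampling, the sketch below simulates the scheme sequentially for p equal-speed processors. It is not the authors' implementation and does not model heterogeneous speeds. The function name `sort_by_regular_sampling`, the sampling positions, and the load balance ratio shown in the usage note (largest partition divided by the mean partition size) are assumptions made for illustration; the abstract does not define them.

```python
import bisect
import heapq
from typing import List, Tuple


def sort_by_regular_sampling(data: List[int], p: int) -> Tuple[List[int], List[int]]:
    """Sequentially simulate sorting by regular sampling on p processors.

    Returns the sorted data and the number of elements each simulated
    processor holds after the exchange step (hypothetical helper).
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    n = len(data)
    if n == 0:
        return [], [0] * p
    # Phase 1: split the input into p nearly equal chunks and sort each locally.
    bounds = [i * n // p for i in range(p + 1)]
    chunks = [sorted(data[bounds[i]:bounds[i + 1]]) for i in range(p)]
    # Phase 2: every non-empty chunk contributes p regularly spaced samples.
    samples: List[int] = []
    for chunk in chunks:
        samples.extend(chunk[k * len(chunk) // p] for k in range(p) if chunk)
    samples.sort()
    # Phase 3: pick p - 1 pivots at regular positions in the sorted samples.
    pivots = [samples[k * len(samples) // p] for k in range(1, p)]
    # Phase 4: cut every chunk at the pivots; piece k goes to processor k.
    buckets: List[List[List[int]]] = [[] for _ in range(p)]
    for chunk in chunks:
        cuts = [0] + [bisect.bisect_right(chunk, pv) for pv in pivots] + [len(chunk)]
        for k in range(p):
            buckets[k].append(chunk[cuts[k]:cuts[k + 1]])
    # Phase 5: each processor merges the sorted pieces it received.
    merged = [list(heapq.merge(*pieces)) for pieces in buckets]
    sizes = [len(m) for m in merged]
    return [x for m in merged for x in m], sizes
```

With `result, sizes = sort_by_regular_sampling(values, 4)`, one simple balance measure under the assumption above is `max(sizes) / (len(values) / 4)`; a value close to 1 means the final partitions are nearly even.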

Keywords: performance evaluation and modeling of parallel integer sorting algorithms; sorting by regular sampling and by over-partitioning; data distribution; load balancing strategies; BSP programming

Dense edge-disjoint embedding of binary trees in the mesh

4th Annual ACM Symposium on Parallel Algorithms and Architectures - SPAA '92

Authors: Gibbons, Alan; Paterson, Michael (Univ of Warwick, Coventry, United Kingdom)

ISBN: (print) 089791483X

We present an embedding of the complete binary tree with n leaves in the √n × √n mesh, for any n = 2^(2m) where m is a positive integer. The embedding has the following properties: at most two tree nodes (one of which is a leaf) are mapped onto each mesh node, paths of the tree are mapped onto edge-disjoint paths in the mesh (each mesh edge being considered as two anti-parallel directed edges) and the maximum distance from a leaf to the root of the tree is √n + O(log n) mesh steps. This embedding facilitates efficient implementation of many P-RAM algorithms on the mesh, particularly those using the balanced binary tree technique. Such an embedding offers greater flexibility of use and improves the time complexity of these implementations by a constant factor compared with previously described embeddings.

Keywords: parallel processing systems

Efficient Galois field arithmetic on SIMD architectures

Fifteenth Annual ACM Symposium on Parallelism in Algorithms and Architectures

Authors: Bhaskar, Raghav; Dubey, Pradeep K.; Kumar, Vijay; Rudra, Atri; Sharma, Animesh (Project CODES, INRIA Rocquencourt, Le Chesnay, France; Strat. Plan. and Optimization Team, ***, Seattle, WA, United States; Department of Computer Science, University of Texas at Austin, Austin, TX, United States; Fiorano Software Ltd, New Delhi, India)

The use of efficient Galois Field Arithmetic on SIMD architecture was presented. SIMD architectures were used for obtaining high speed implementation in the fields where data parallelism was encountered. In regard wit...

ISBN: (print) 9781581136616

Keywords: Computer architecture

New preconditioner for the parallel solution of positive definite Toeplitz systems

SPAA '90 - Proceedings of the 2nd Annual ACM Symposium on Parallel Algorithms and Architectures

Authors: Bini, Dario; Di Benedetto, Fabio (Univ di Pisa, Pisa, Italy)

ISBN: (print) 0897913701

We introduce a new preconditioner for solving a symmetric Toeplitz system of equations by the conjugate gradient method. This choice leads to an algorithm which is particularly suitable for parallel computations and, compared to the circulant preconditioner of [C3], has a better asymptotic convergence rate and a lower arithmetic cost per iteration.

Keywords: Computer Programming

On the analysis of randomized load balancing schemes

Proceedings of the 1997 9th Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA

Authors: Mitzenmacher, Michael (Digital Systems Research Cent, Palo Alto, CA, United States)

ISBN: (print) 9780897918909

It is well known that simple randomized load balancing schemes can balance load effectively while incurring only a small overhead, making such schemes appealing for practical systems. In this paper, we provide new analyses for several such dynamic randomized load balancing schemes. Unlike previous analyses, we do not assume that in equilibrium each server is stochastically independent from other servers. Our work extends a previous analysis of the supermarket model, a model that abstracts a simple, efficient load balancing scheme in the setting where jobs arrive at a large system of parallel processors. In this model, customers arrive at a system of n servers as a Poisson stream of rate λn, λ

Keywords: parallel processing systems

Provably efficient scheduling for languages with fine-grained parallelism

Proceedings of the 7th Annual ACM Symposium on Parallel Algorithms and Architectures, SPAA '95

Authors: Blelloch, Guy E.; Gibbons, Phillip B.; Matias, Yossi (Carnegie Mellon Univ, Pittsburgh, PA, United States)

ISBN: (print) 9780897917179

Most high-level parallel programming languages allow for fine-grained parallelism. Programs written in such languages can express the full parallelism in the program without specifying the mapping of program tasks to processors. When executing such programs, the major concern is to dynamically schedule tasks to processors in order to minimize execution time and the amount of memory needed. In this paper, a class of parallel schedules that are provably efficient in both time and space, even for programs whose task structure is revealed only during execution, is identified. Following this, an efficient dynamic scheduling algorithm that generates schedules in this class, for languages with nested fine-grained parallelism, is described.

Keywords: parallel processing systems
