ISBN:
(Print) 9781450307437
The proceedings contain 50 papers. The topics discussed include: graph expansion and communication costs of fast matrix multiplication; near linear-work parallel SDD solvers, low-diameter decomposition, and low-stretch subgraphs; linear-work greedy parallel approximate set cover and variants; optimizing hybrid transactional memory: the importance of nonspeculative operations; parallelism and data movement characterization of contemporary application classes; work-stealing for mixed-mode parallelism by deterministic team-building; full reversal routing as a linear dynamical system; reclaiming the energy of a schedule, models and algorithms; a tight runtime bound for synchronous gathering of autonomous robots with limited visibility; convergence of local communication chain strategies via linear transformations: or how to trade locality for speed; and convergence to equilibrium of logit dynamics for strategic games.
This special issue contains 6 selected papers whose preliminary versions appeared in the Proceedings of the 23rd annual ACM symposium on parallelism in algorithms and architectures (SPAA), held June 2011, in San Jose, California, USA. These papers were selected by the special issue co-editors from 35 papers that were presented at the conference. The authors were invited to submit full versions of their papers, which were then fully refereed according to the usual standards of Theory of Computing Systems. The selected papers are representative of the breadth and depth of the research in parallelism in algorithms and architectures that was presented at SPAA 2011.
ISBN:
(Print) 9781450307437
As the gap between the cost of communication (i.e., data movement) and computation continues to grow, the importance of pursuing algorithms which minimize communication also increases. Toward this end, we seek asymptotic communication lower bounds for general memory models and classes of algorithms. Recent work [2] has established lower bounds for a wide set of linear algebra algorithms on a sequential machine and on a parallel machine with identical processors. This work extends these previous bounds to a heterogeneous model in which processors access data and perform floating point operations at differing speeds. We also present an algorithm for dense matrix multiplication which attains the lower bound.
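The heterogeneous model above assigns work in proportion to each processor's speed so that all processors finish at roughly the same time. A minimal sketch of that idea, assuming a static partition of the rows of a dense matrix product; the function name and the speed values are illustrative, not taken from the paper:

```python
# Hypothetical sketch: partition the rows of an n x n matrix product
# among heterogeneous processors in proportion to their speeds
# (flops/s), so each finishes its share of the work at about the
# same time. Not the paper's algorithm; a load-balancing toy model.

def partition_rows(n, speeds):
    """Assign each processor a contiguous band of rows proportional
    to its relative speed. Returns a list of (start, end) pairs."""
    total = sum(speeds)
    bounds, start = [], 0
    for i, s in enumerate(speeds):
        # The last processor takes the remainder to absorb rounding.
        end = n if i == len(speeds) - 1 else start + round(n * s / total)
        bounds.append((start, end))
        start = end
    return bounds

# A processor twice as fast receives a band twice as wide.
bands = partition_rows(100, [1.0, 2.0, 1.0])
```

A communication-optimal algorithm must additionally choose block sizes per processor so that data movement, not just flop count, is balanced; the sketch captures only the flop-proportional split.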
ISBN:
(Print) 9781450349826
Massive spatial parallelism at low energy gives FPGAs the potential to be core components in large scale high performance computing (HPC) systems. In this paper we present four major design steps that harness high-level synthesis (HLS) to implement scalable spatial FPGA algorithms. To aid productivity, we introduce the open source library hlslib to complement HLS. We evaluate kernels designed with our approach on an FPGA accelerator board, demonstrating high performance and board utilization with enhanced programmer productivity. By following our guidelines, programmers can use HLS to develop efficient parallel algorithms for FPGA, scaling their implementations with increased resources on future hardware.
ISBN:
(Print) 9781467391603
The proceedings contain 11 papers. The topics discussed include: NUMA aware I/O in virtualized systems; the BXI interconnect architecture; exploiting offload enabled network interfaces; a brief introduction to the OpenFabrics interfaces - a new network API for maximizing high performance application efficiency; UCX: an open source framework for HPC network APIs and beyond; OWN: optical and wireless network-on-chip for kilo-core architectures; Amon: an advanced mesh-like optical NoC; impact of InfiniBand DC transport protocol on energy consumption of all-to-all collective algorithms; implementing ultra low latency data center services with programmable logic; and enhanced overloaded CDMA interconnect (OCI) bus architecture for on-chip communication.
ISBN:
(Print) 9781509016150
HPC simulations suffer from failures of numerical reproducibility because of floating-point arithmetic peculiarities. Different computing distributions of a parallel computation may yield different numerical results. We are interested in a finite element computation of hydrodynamic simulations within the openTelemac software, where parallelism is provided by domain decomposition. One main task in a finite element simulation consists of building one large linear system and solving it. Here the building step relies on an element-by-element storage mode and the solving step applies the conjugate gradient algorithm. The subdomain parallelism is merged within these steps. We study why reproducibility fails in this process and which operations have to be corrected. We detail how to use compensation techniques to compute a numerically reproducible resolution. We illustrate this approach with the reproducible version of one test case provided by the openTelemac software suite.
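The compensation techniques mentioned above build on error-free transformations: each floating-point addition is split into a rounded sum and its exact rounding error, and carrying the error term makes the result (almost) independent of summation order. A minimal sketch of the idea, not the paper's openTelemac implementation:

```python
# Sketch of compensated summation via Knuth's TwoSum error-free
# transformation: a + b == s + e holds exactly in IEEE 754
# arithmetic, so accumulating e recovers the bits that an ordinary
# sum loses. This order-insensitivity is the basis of reproducible
# parallel reductions; function names here are illustrative.

def twosum(a, b):
    """Return (s, e) with s = fl(a + b) and a + b = s + e exactly."""
    s = a + b
    bp = s - a                      # part of b that made it into s
    e = (a - (s - bp)) + (b - bp)   # exact rounding error of s
    return s, e

def compensated_sum(values):
    s, err = 0.0, 0.0
    for v in values:
        s, e = twosum(s, v)
        err += e                    # accumulate the low-order bits
    return s + err
```

On an ill-conditioned input such as `[1e16, 1.0, -1e16]`, the ordinary left-to-right sum loses the `1.0`, while the compensated version recovers it; Python's standard library offers `math.fsum` for the same purpose.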
This paper presents an approach for the development of microcode for parallel and pipelined machines. The approach is geared towards mapping programs with real-time constraints and/or massive time requirements onto sy...
ISBN:
(Print) 9781595937537
Grid networks are large distributed systems that share and virtualize heterogeneous resources. Quality of Service (QoS) is a key and complex issue for Grid services provisioning. Currently, most Grid networks offer best-effort (BE) services. Thus, QoS architectures initially developed for the Internet, such as DiffServ (DS), have been adapted to the Grid environment. Given the widespread adoption of the Internet, many Grid networks will be deployed over this technology in the years to come. In this paper, we compare two Flow-Aware Networking (FAN) architectures, mainly from the second generation (2GFAN). The purpose is to answer the question of which 2GFAN architecture performs better under Grid traffic. FAN is a promising alternative to DS for QoS provisioning in Internet networks. DS provides QoS differentiation through explicit packet marking and classification, whereas FAN consists of per-flow admission control and implicit flow differentiation through priority fair queuing. The main difference between the two 2GFAN architectures is the fair queuing algorithm. Thus, to the knowledge of the authors, this is the first time two priority per-flow fair queuing algorithms are compared under Grid traffic. A GridFTP session may be seen as a succession of parallel TCP flows with large volumes of data transfers. Metrics used are average delay, average goodput and the average rejection rate.
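Per-flow fair queuing of the kind FAN relies on can be illustrated with deficit round robin (DRR), a classic scheme in this family; the sketch below is a generic DRR model, not either of the two 2GFAN algorithms compared in the paper, and the flow names, packet sizes and quantum are invented for illustration:

```python
# Hypothetical sketch of deficit round robin (DRR) per-flow fair
# queuing: each flow earns a fixed quantum of credit per round and
# may transmit a packet only when its accumulated deficit covers
# the packet size, which equalizes throughput across flows with
# different packet sizes.
from collections import deque

def drr_schedule(flows, quantum, rounds):
    """flows: dict mapping flow name -> deque of packet sizes.
    Returns the names of flows in packet-transmission order."""
    deficit = {f: 0 for f in flows}
    sent = []
    for _ in range(rounds):
        for f, q in flows.items():
            deficit[f] += quantum
            while q and q[0] <= deficit[f]:
                deficit[f] -= q.popleft()
                sent.append(f)
            if not q:
                deficit[f] = 0   # idle flows keep no credit
    return sent
```

A flow sending 600-byte packets against a flow sending 300-byte packets transmits half as many packets per round, so both receive roughly equal byte throughput.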
VLIW architectures have been shown to be able to exploit large amounts of fine-grain parallelism in the execution of sequential imperative programs. In this paper, a new computing model is presented, which allows VLIW techniques to be adapted to operate a distributed-memory multiprocessor machine. The model, called VLIW-in-the-large, can be adopted in conjunction with a suitable hardware framework to obtain consistent speedups in the execution of both sequential and parallel-natured software. The authors show that the advantages of the VLIW-in-the-large computing model with respect to the classical VLIW approach are: (i) better utilization of hardware resources; (ii) extension of the applicability of the VLIW techniques to multiprocessor architectures, in such a way that they can be used for multi-style, multi-grain parallelism exploitation; (iii) compact realization of processing elements, suitable for VLSI massively parallel architectures.