Search Results - Inner Mongolia University Library

Exploiting Locality in Single Assignment Data Structures Updated Through Split-Phase Transactions

Cluster Computing, 2001, Volume 4, Issue 4, pp. 281-293

Authors: Amaral, José Nelson; Lin, Wen-Yen; Gaudiot, Jean-Luc; Gao, Guang R.

Affiliations: Department of Computing Science, University of Alberta, Edmonton, Canada; Tia Mobile Inc., Pasadena, USA; Department of Electrical Engineering, University of Southern California, Los Angeles, USA; Computer Architecture and Parallel Systems Laboratory, Department of Electrical and Computer Engineering, University of Delaware, Newark, USA

We present the design, implementation, and evaluation of single-assignment data structures and of a software-controlled cache in an existing multi-threaded architecture platform – the Efficient architecture for Running Threads (EARTH). The I-Structure Software-Controlled Cache (ISSC) exploits temporal and spatial locality of EARTH split-phase memory transactions for single-assignment memory references. Our experimental evaluation indicates that the caching mechanism for single-assignment storage makes the EARTH memory system more robust to variations in the latency of memory operations. As a consequence, the system can be ported to a wider range of machine platforms and deliver speedup for both regular and irregular applications.

Design and implementation of an efficient thread partitioning algorithm

3rd International Symposium on High Performance Computing, ISHPC 2000

Authors: Amaral, José Nelson; Gao, Guang; Kocalar, Erturk Dogan; O'Neill, Patrick; Tang, Xinan

Affiliations: Computer Architecture and Parallel Systems Laboratory, University of Delaware, Newark, DE, United States; Dep. of Comp. Science, Univ. of Alberta, Canada

ISBN: (print) 9783540411284

The development of fine-grain multi-threaded program execution models has created an interesting challenge: how to partition a program into threads that can exploit machine parallelism, achieve latency tolerance, and maintain reasonable locality of reference? A successful algorithm must produce a thread partition that best utilizes multiple execution units on a single processing node and handles long and unpredictable latencies. In this paper, we introduce a new thread partitioning algorithm that can meet the above challenge for a range of machine architecture models. A quantitative affinity heuristic is introduced to guide the placement of operations into threads. This heuristic addresses the trade-off between exploiting parallelism and preserving locality. The algorithm is surprisingly simple due to the use of a time-ordered event list to account for the multiple execution unit activities. We have implemented the proposed algorithm and our experiments, performed on a wide range of examples, have demonstrated its efficiency and effectiveness. © Springer-Verlag Berlin Heidelberg 2000.

Keywords: Economic and social effects

Coping with very high latencies in petaflop computer systems

2nd International Symposium on High Performance Computing, ISHPC 1999

Authors: Ryan, Sean; Amaral, José N.; Gao, Guang; Ruiz, Zachary; Marquez, Andres; Theobald, Kevin

Affiliations: Computer Architecture and Parallel Systems Laboratory, University of Delaware, Newark, DE, United States

ISBN: (print) 3540659692

The very long and highly variable latencies in the deep memory hierarchy of a petaflop-scale architecture design, such as the Hybrid Technology Multi-Threaded architecture (HTMT) [13], present a new challenge to its programming and execution model. A solution to coping with such high and variable latencies is to directly and explicitly expose the different memory regions of the machine to the program execution model, allowing better management of communication. In this paper we describe the novel percolation model that lies at the heart of the HTMT program execution model [13]. The Percolation Model combines multithreading with dynamic prefetching of coarse-grain contexts. In the past, prefetching techniques have concentrated on moving blocks of data within the memory hierarchy. Instead of only moving contiguous blocks of data, the thread percolation approach manages contexts that include data, program instructions, and control states. The main contributions of this paper include the specification of the HTMT runtime execution model based on the concept of percolation, and a discussion of the role of the compiler in a machine that exposes the memory hierarchy to the programming model. © 1999, Springer-Verlag. All rights reserved.

Keywords: Solvents

Superconducting processors for HTMT: issues and challenges

Frontiers of Massively Parallel Computation

Authors: K.B. Theobald; G.R. Gao; T.L. Sterling

Affiliations: Computer Architecture and Parallel Systems Laboratory, Department of Electrical and Computer Engineering, University of Delaware, Newark, DE, USA; NASA Jet Propulsion Laboratory / Center for Advanced Computing Research, California Institute of Technology, Pasadena, CA, USA

The Hybrid Technology Multi-Threading project is a long-term study of the feasibility of combining several emerging technologies to reach 1 petaFLOPS within ten years. HTMT will combine high-speed superconductor processors, semiconductor memories with built-in processors, high-speed optical interconnects, and high-density holographic storage. While there are major challenges in all aspects of this project, those in processor architecture are the focus of this paper. Fundamental differences between RSFQ circuits and conventional semiconductor circuits, including a radical jump in clock speed, make today's processor design approaches inappropriate for HTMT. Sequential instruction dispatching, even within the lowest programming unit (a strand), will lead to unacceptably high latencies, hence poor performance. We propose alternative processor designs which use fine-grain synchronizations between individual instructions in order to avoid these bottlenecks.

Keywords: Random access memory; Optical buffering; Holography; Holographic optical components; Delay; Computer architecture; Optical interconnections; Electrical capacitance tomography; Quantum computing; Clocks

Parallel scientific computing in Promoter on the interface between application modelling and language design

1998 ACM Symposium on Applied Computing, SAC 1998

Authors: Besch, Matthias; Heber, Gerd; Wilhelmi, Matthias

Affiliations: Parallel and Distributed Systems, GMD Laboratory, GMD-FIRST, Rudower Chaussee 5, Berlin D-12489, Germany; Computer Architecture and Parallel Systems Laboratory, University of Delaware, 140 Evans Hall, Newark, DE 19716, United States

ISBN: (print) 0897919696

A key issue of problem-oriented parallel programming is an appropriate concept for representing the spatial structures of an application and modelling local or global interactions operating on them. This paper advocates the use of so-called index spaces as a unified and powerful expression tool. It discusses the interface between application modelling and programming abstractions, and presents their embeddings in the high-level, data-parallel language Promoter. © 1998 ACM.

Keywords: parallel programming
