Single-issue processor cores are very energy efficient but suffer from the von Neumann bottleneck, in that they must explicitly fetch and issue the loads/stores necessary to feed their ALU/FPU. Each instruction spent on moving data is a cycle not spent on computation, limiting ALU/FPU utilization to 33 percent on reductions. We propose "Stream Semantic Registers" (SSR) to boost utilization and increase energy efficiency. SSR is a lightweight, non-invasive RISC-V ISA extension which implicitly encodes memory accesses as register reads/writes, eliminating a large number of loads/stores. We implement the proposed extension in the RTL of an existing multi-core cluster and synthesize the design for a modern 22 nm technology. Our extension provides a significant 2x to 5x architectural speedup across different kernels at a small 11 percent increase in core area. Sequential code runs 3x faster on a single core, and 3x fewer cores are needed in a cluster to achieve the same performance. The utilization increase to almost 100 percent leads to a 2x energy efficiency improvement in a multi-core cluster. The extension reduces instruction fetches by up to 3.5x and instruction cache power consumption by up to 5.6x. Compilers can automatically map loop nests to SSRs, making the changes transparent to the programmer.
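To make the utilization argument concrete, the following is a minimal sketch (not taken from the paper) of a dot-product reduction: in the plain loop each iteration issues two loads and one fused multiply-add, so the FPU is busy for roughly one instruction in three, matching the 33 percent figure; with SSR, the two input streams would arrive implicitly through register reads, leaving essentially only the FMA per iteration.

    /* Illustrative only: a dot-product reduction whose per-iteration
     * instruction mix motivates the ~33 percent FPU utilization figure.
     * Names and structure are hypothetical, not from the paper. */
    #include <stddef.h>

    double dot(const double *a, const double *b, size_t n) {
        double acc = 0.0;
        for (size_t i = 0; i < n; ++i) {
            /* Conventional code: each iteration issues
             *   load a[i], load b[i], fma   -> FPU busy ~1 cycle in 3.
             * With SSR, a[i] and b[i] would be read directly from
             * stream-semantic registers, so only the fma remains and
             * FPU utilization approaches 100 percent. */
            acc += a[i] * b[i];
        }
        return acc;
    }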
We propose FusedCache, a two-level set-associative Racetrack memory (RM) cache design that exploits RM's high density to provide fast uniform access at one level and non-uniform access at the next. FusedCache is well suited for private L1/L2 caches: L1 data are kept aligned with the RM access points, while the remaining non-aligned data serve as L2. It uses traditional LRU eviction for L1 misses. Promotion and demotion between L1 and L2 are performed through shifts and, when necessary, background swap operations. These swap operations do not require physical stores or loads, making accesses both faster and more energy efficient. Further, unlike a traditional inclusive cache hierarchy, fused L1 cache lines naturally exist in L2, avoiding duplicated storage and tag structures, promotions, and evictions. L1 status on each track is strictly enforced by track LRU maintenance and background swapping. Our results demonstrate that compared to an iso-area L1 SRAM cache replacement, FusedCache improves application performance by 7 percent while reducing cache energy by 33 percent. Compared to an iso-capacity two-level (L1/L2) SRAM cache replacement, FusedCache provides similar performance with a dramatic 69 percent cache energy reduction. Compared to a TapeCache L1 scheme, FusedCache gains a 7 percent performance improvement with a 6 percent cache energy saving, which translates to a 13 percent improvement in energy-delay product.
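As a rough illustration of the access policy described above (a sketch under our own assumptions, not the authors' hardware implementation), a lookup that misses the aligned L1 position but hits elsewhere on the same track would be treated as an L2 hit and promoted to the access point, demoting the displaced L1 line back onto the track:

    /* Hypothetical sketch of FusedCache-style promotion on one track.
     * Position 0 is the racetrack access point (the "L1" slot); the
     * remaining positions hold the fused "L2" lines. Real hardware uses
     * shifts and background swaps; this only models the final placement. */
    #include <stdbool.h>
    #include <stddef.h>

    #define TRACK_LEN 8

    typedef struct {
        int line[TRACK_LEN];  /* cached line tags; line[0] is at the access point */
    } track_t;

    /* Returns true on a hit anywhere on the track. An L2 hit (pos > 0) is
     * promoted by swapping with the current L1 resident, mirroring the
     * shift/background-swap promotion described in the abstract. */
    bool track_access(track_t *t, int tag) {
        for (size_t pos = 0; pos < TRACK_LEN; ++pos) {
            if (t->line[pos] == tag) {
                if (pos > 0) {               /* L2 hit: promote to the access point */
                    int demoted = t->line[0];
                    t->line[0] = t->line[pos];
                    t->line[pos] = demoted;  /* demoted L1 line stays on-track as L2 */
                }
                return true;                 /* L1 hit, or promoted L2 hit */
            }
        }
        return false;                        /* miss: fill handled by LRU eviction */
    }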
Software-defined radio (SDR) clouds combine SDR concepts with cloud computing technology for designing and managing future base stations. They provide a scalable solution for the evolution of wireless communications. The authors focus on the resource management implications and propose a hierarchical approach for dynamically managing the real-time computing constraints of wireless communications systems that run on the SDR cloud.
ISBN (print): 9781450388160
GPUs rely on large register files to unlock thread-level parallelism for high throughput. Unfortunately, large register files are power hungry, making it important to seek new approaches to improve their utilization. This paper introduces a new register file organization for efficient register-packing of narrow integer and floating-point operands, designed to leverage advances in static analysis. We show that the hardware/software co-designed register file organization yields a performance improvement of up to 79%, and 18.6% on average, with a modest output-quality degradation.
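To illustrate the register-packing idea at a software level (a hypothetical sketch; the paper's mechanism lives in the GPU register file hardware and in compiler static analysis), two narrow 16-bit operands can share a single 32-bit register slot instead of occupying two full-width entries:

    /* Hypothetical illustration of packing two narrow (16-bit) integer
     * operands into one 32-bit register slot, the software analogue of
     * the register-file packing performed in hardware. */
    #include <stdint.h>

    static inline uint32_t pack16x2(uint16_t lo, uint16_t hi) {
        return (uint32_t)lo | ((uint32_t)hi << 16);  /* both operands in one slot */
    }

    static inline uint16_t unpack_lo(uint32_t r) { return (uint16_t)(r & 0xFFFFu); }
    static inline uint16_t unpack_hi(uint32_t r) { return (uint16_t)(r >> 16); }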