Search Results - Inner Mongolia University Library

application-specific customisation of multi-threaded soft processors

IEE PROCEEDINGS-COMPUTERS AND DIGITAL TECHNIQUES, 2006, Vol. 153, No. 3, pp. 173-180

Authors: Dimond, R.; Mencer, O.; Luk, W. Affiliation: Univ London Imperial Coll Sci Technol & Med Dept Comp London SW7 2RH England

A multi-threaded microprocessor with a customisable instruction set, CUStomisable Threaded ARchitecture (CUSTARD), is proposed. CUSTARD features include design space exploration and a compiler for automatic selection of custom instructions. Custom instructions, optimised for a specific application, accelerate frequently performed computations by implementing them as dedicated hardware. Field programmable gate array implementations of CUSTARD are evaluated using media and cryptography benchmarks and compared with the commercial MicroBlaze processor. An area overhead as low as 28% for four interleaved threads and a speedup of up to 355% over a processor without custom instructions are demonstrated.

Keywords: DATA encryption (Computer science); SYSTEM analysis; COMPUTER science; ENGINEERING; MICROprocessors; FIELD programmable gate arrays; PROGRAMMABLE logic devices; GATE array circuits; SWITCHING circuits; DIGITAL electronics

16th international conference on application-specific Systems, Architecture and processors - Copyright Page

international conference on application specific Systems (ASAP), Architectures and processors

Copyright and Reprint Permissions: Abstracting is permitted with credit to the source. Libraries may photocopy beyond the limits of US copyright law, for private use of patrons, those articles in this volume that carry a code at the bottom of the first page, provided that the per-copy fee indicated in the code is paid through the Copyright Clearance Center. The papers in this book comprise the proceedings of the meeting mentioned on the cover and title page. They reflect the authors' opinions and, in the interests of timely dissemination, are published as presented and without change. Their inclusion in this publication does not necessarily constitute endorsement by the editors or the Institute of Electrical and Electronics Engineers, Inc.

An Efficient application specific Instruction Set Processor (ASIP) for Tensor Computation

30th IEEE international conference on application-specific Systems, Architectures and processors (ASAP)

Authors: Huang, Wei-pei; Cheung, Ray C. C.; Yan, Hong. Affiliation: City Univ Hong Kong Dept Elect Engn Hong Kong Peoples R China

ISBN: (print) 9781728116013

Over the past decade, tensor computation has been widely used in different areas. Various software toolboxes have been released to assist tensor computation. However, there is still no hardware architecture to accelerate tensor computation. This paper presents an efficient application specific instruction set processor (ASIP) for tensor computation. Different tensor computations are fully optimized in terms of resource usage and performance. We implement the ASIP on an FPGA platform. We test our design by implementing the CANDECOMP/PARAFAC (CP) decomposition. Our design achieves low resource usage and runs at 141 MHz.

Keywords: tensor computation; ASIP; hardware architecture; field-programmable gate array (FPGA)

Voltage range evaluation of an optically reconfigurable gate array VLSI

35th IEEE international conference on application-specific Systems, Architectures and processors (ASAP)

Authors: Shimamura, Yuki; Watanabe, Minoru; Watanabe, Nobuya. Affiliation: Okayama Univ Fac Engn Dept Informat Technol 3-1-1 Tushima NakaKita Ku Okayama 7008530 Japan

ISBN: (print) 9798350349641; 9798350349634

Currently available very large-scale integrations (VLSIs) are vulnerable to radiation, as measured in terms of soft errors and total ionizing dose. Therefore, following a repairable VLSI concept, we have been developing a radiation-hardened optically reconfigurable gate array VLSI that can support the use of a partially damaged VLSI. Earlier development efforts fabricated a 1 Grad total-ionizing-dose tolerant radiation-hardened optically reconfigurable gate array VLSI using the repairable VLSI concept. However, since stabilized power supply units are also vulnerable to radiation, a radiation-hardened optically reconfigurable gate array VLSI with no stabilization function must be used with a battery in intense radiation environments such as the Fukushima Daiichi Nuclear Power Plant. This paper presents the operating voltage range of a radiation-hardened optically reconfigurable gate array VLSI. The findings confirm that direct battery drive is possible for a radiation-hardened optically reconfigurable gate array VLSI.

Keywords: FPGA; optically reconfigurable gate array; radiation-tolerant device; soft-error; VLSI

Design of Low Power On-Chip Processor arrays

23rd IEEE international conference on application-specific Systems, Architectures and processors (ASAP)

Authors: Lari, Vahid; Muddasani, Shravan; Boppu, Srinivas; Hannig, Frank; Teich, Juergen. Affiliation: Univ Erlangen Nurnberg Dept Comp Sci Nurnberg Germany

ISBN: (print) 9780769547688

In this paper, we present an ultra-low-power design for a class of massively parallel architectures called tightly-coupled processor arrays. The key idea is to exploit the decentralized resource management inherent to invasive computing for power saving. We propose concepts and study different architecture trade-offs for hierarchical power management that temporarily shuts down regions of processors through power gating. Moreover, a) overall system chip energy consumption, b) hardware cost, and c) timing overheads are compared for different sizes of power domains. Experimental results show that up to 70% of system energy consumption may be saved for selected characteristic algorithms and different resource utilizations.

Keywords: Energy sources array processors Power (Psychology) system energy consumption LOW POWER Parallel architectures PROCESSOR power Power management of resources Power system-on-chip tight coupling

IEEE 17th international conference on application-specific Systems, Architectures and processors - Copyright

international conference on application specific Systems (ASAP), Architectures and processors

WinoCNN: Kernel Sharing Winograd Systolic array for Efficient Convolutional Neural Network Acceleration on FPGAs

32nd IEEE international conference on application-specific Systems, Architectures and processors (ASAP)

Authors: Liu, Xinheng; Chen, Yao; Hao, Cong; Dhar, Ashutosh; Chen, Deming. Affiliations: Univ Illinois Champaign IL 61820 USA; Adv Digital Sci Ctr Singapore Singapore; Georgia Inst Technol Atlanta GA 30332 USA

ISBN: (print) 9781665427012

The combination of Winograd's algorithm and systolic array architecture has demonstrated the capability of improving DSP efficiency in accelerating convolutional neural networks (CNNs) on FPGA platforms. However, handling arbitrary convolution kernel sizes in FPGA-based Winograd processing elements and supporting efficient data access remain underexplored. In this work, we are the first to propose an optimized Winograd processing element (WinoPE), which can naturally support multiple convolution kernel sizes with the same amount of computing resources and maintains high runtime DSP efficiency. Using the proposed WinoPE, we construct a highly efficient systolic array accelerator, termed WinoCNN. We also propose a dedicated memory subsystem to optimize data access. Based on the accelerator architecture, we build accurate resource and performance models to explore optimal accelerator configurations under different resource constraints. We implement our proposed accelerator on multiple FPGAs, where it outperforms state-of-the-art designs in terms of both throughput and DSP efficiency. Our implementation achieves DSP efficiency up to 1.33 GOPS/DSP and throughput up to 3.1 TOPS with the Xilinx ZCU102 FPGA. These are 29.1% and 20.0% better than the best solutions reported previously, respectively.

Keywords: Winograd algorithm; CNN; systolic array; FPGA; DSP efficiency

application-specific processor architecture: Then and now

JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2008, Vol. 53, No. 1-2, pp. 197-215

Author: Cappello, Peter. Affiliation: Univ Calif Santa Barbara Dept Comp Sci Santa Barbara CA 93106 USA

We first relate the architecture of systolic arrays to the technological and economic design forces acting on architects of special-purpose systems some 20 years ago. We then observe that those same design forces are now bearing down on the architects of contemporary general-purpose processors, who consequently are producing general-purpose processors whose architectural features are increasingly similar to those of systolic arrays. We then describe some economic and technological forces that are changing the landscape of architectural research. At base, they are the increasing complexity of technology and applications, the fragmenting of the general-purpose processor market, and the judicious use of hardware configurability. We describe a 2D architectural taxonomy, identifying what we believe to be a "sweet spot" for architectural research.

Keywords: application-specific processor; computer architecture; field-programmable gate array; FPGA; general-purpose processor; processor array; systolic array; taxonomy

Register transfer modeling and simulation for array processors

Proceedings of the 1994 international conference on application specific array processors

Authors: Chou, W.H.; Kung, S.Y. Affiliation: Princeton Univ Princeton United States

This paper presents a register transfer modeling scheme for array processor simulation. Its main goals are to verify the application specific design by real data computation, and to help fine tune the array architecture by precise timing analysis. The data flow graph of the design is translated into a register transfer language which is further combined with a hardware description module. An interactive simulator SISim v2.0 has been implemented to simulate the behavior of such a system. The results are compared with the expected values to verify the array processor design. The recorded timing information can help the designer to analyze the system and improve the performance and resource utilization.

Keywords: Parallel processing systems

VLSI design and implementation of the array processors of a multilayer vision system architecture

Proceedings of the international conference on application specific array processors, ASAP'95

Authors: Saha, B.; Mertoguno, J.S.; Bourbakis, N.G. Affiliation: Binghamton Univ Binghamton United States

This paper describes the VLSI design and simulation of the lower layer processors of the KYDON vision system. KYDON is a completely autonomous, hierarchical, multilayered image understanding system. The VLSI design of the individual components as well as the timing simulation results of the processor of every layer are presented. The system runs at 50 MHz and promises a high processing rate of 300 image frames/sec.

Keywords: Computer vision
