Search Results - Inner Mongolia University Library

26th International Conference on Field-Programmable Logic and Applications (FPL)

Authors: Liang, Wei; Yin, Wenbo; Kang, Ping; Wang, Lingli (Fudan Univ, State Key Lab ASIC & Syst, Shanghai, Peoples R China)

ISBN: (print) 9782839918442

Key-value stores (KVS) have become critical in many applications because of the recent data explosion. There is a strong demand to improve the throughput and reduce the latency of KVS. An FPGA-based parallel architecture can bring excellent performance and power efficiency. Cuckoo hashing has proven to be an efficient approach to implementing KVS, with good memory utilization and constant worst-case access time. In this paper, an FPGA-based KVS implementation is proposed based on Cuckoo hashing, with a decoupled storage to achieve 81.7% memory utilization and a pipeline scheme to achieve high performance. The latency of insert, search and delete operations is only 40 ns. The throughput for search and delete can reach 200 million requests per second (MRPS), which is 5x faster than [1]. Even when the load factor reaches 0.9, the throughput for insert can still achieve 147 MRPS.

Keywords: field programmable gate arrays (FPGA)
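
The abstract above relies on Cuckoo hashing for constant worst-case lookups. As a rough software illustration only (not the paper's FPGA design), the following minimal sketch uses two tables, so a search probes at most two slots. The class name `CuckooTable`, the hash derivation, the `max_kicks` limit and the overflow `stash` are all assumptions made for this example.

```python
class CuckooTable:
    """Two-table cuckoo hash: each key has one candidate slot per table."""

    def __init__(self, slots_per_table=8, max_kicks=32):
        self.n = slots_per_table
        self.max_kicks = max_kicks
        self.tables = [[None] * self.n, [None] * self.n]
        self.stash = []  # assumed overflow list for entries that cannot be placed

    def _index(self, which, key):
        # Hypothetical pair of hash functions built from Python's hash().
        return hash((which, key)) % self.n

    def _locate(self, key):
        # Probe the single candidate slot in each of the two tables.
        for t in (0, 1):
            i = self._index(t, key)
            entry = self.tables[t][i]
            if entry is not None and entry[0] == key:
                return t, i
        return None

    def search(self, key):
        loc = self._locate(key)
        if loc is not None:
            return self.tables[loc[0]][loc[1]][1]
        for k, v in self.stash:
            if k == key:
                return v
        return None

    def delete(self, key):
        loc = self._locate(key)
        if loc is not None:
            self.tables[loc[0]][loc[1]] = None
            return True
        for j, (k, _) in enumerate(self.stash):
            if k == key:
                del self.stash[j]
                return True
        return False

    def insert(self, key, value):
        self.delete(key)  # drop any old copy so each key is stored once
        item = (key, value)
        t = 0
        for _ in range(self.max_kicks):
            i = self._index(t, item[0])
            # Place the item and pick up whatever occupied the slot.
            item, self.tables[t][i] = self.tables[t][i], item
            if item is None:
                return
            t = 1 - t  # re-insert the evicted entry into the other table
        self.stash.append(item)  # give up after max_kicks displacements
```

Insertion is the only operation with variable cost, since an evicted entry may trigger further evictions; search and delete touch a fixed number of slots (plus the small stash in this sketch).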


Invited paper: Enhanced architectures, design methodologies and CAD tools for dynamic reconfiguration of Xilinx FPGAs


16th International Conference on Field Programmable Logic and Applications

Authors: Lysaght, Patrick; Blodget, Brandon; Mason, Jeff; Young, Jay; Bridgford, Brendan (Xilinx Res Labs, San Jose, CA 95124, USA; Xilinx, Longmont, CO 80503, USA)

ISBN: (print) 9781424403127

We describe architectural enhancements to Xilinx FPGAs that provide better support for the creation of dynamically reconfigurable designs. These are augmented by a new design methodology that uses pre-routed IP cores for communication between static and dynamic modules and permits static designs to route through regions otherwise reserved for dynamic modules. A new CAD tool flow to automate the methodology is also presented. The new tools initially target the Virtex-II, Virtex-II Pro and Virtex-4 families and are derived from Xilinx's commercial CAD tools.

Keywords: field programmable gate arrays (FPGA)


Wirelength prediction for FPGAs


17th International Conference on Field Programmable Logic and Applications

Authors: Pandit, Audip; Akoglu, Ali (Univ Arizona, Dept Elect & Comp Engn, Tucson, AZ 85721, USA)

ISBN: (print) 9781424410590

FPGA CAD tools require wirelength predictions to make informed decisions through clustering, placement and routing stages towards power, area or delay based design goals. Unfortunately, there has been minimal work devoted to estimating individual wirelengths early in the CAD flow. Rent's rule can be used to generate a wirelength distribution but cannot be used to predict lengths of individual wires. Hence, this paper explores "structural metrics" that have been found to possess strong predictive qualities in the ASIC domain. To our knowledge this is a first study in the application of these metrics in the FPGA CAD flow. Results show that the studied metrics capture characteristics of placement optimization carried out by VPR, and hence, are good indicators of post-placement wirelengths.

Keywords: field programmable gate arrays (FPGA)


H2PIPE: High Throughput CNN Inference on FPGAs with High-Bandwidth Memory


34th International Conference on Field-Programmable Logic and Applications (FPL)

Authors: Doumet, Mario; Stant, Marius; Hall, Mathew; Betz, Vaughn (Univ Toronto, Dept Elect & Comp Engn, Toronto, ON, Canada; Microsoft Corp, Redmond, WA 98052, USA; Vector Inst, Toronto, ON, Canada)

ISBN: (print) 9798331530082; 9798331530075

Convolutional Neural Networks (CNNs) combine large amounts of parallelizable computation with frequent memory access. Field Programmable Gate Arrays (FPGAs) can achieve low latency and high throughput CNN inference by implementing dataflow accelerators that pipeline layer-specific hardware to implement an entire network. By implementing a different processing element for each CNN layer, these layer-pipelined accelerators can achieve high compute density, but having all layers processing in parallel requires high memory bandwidth. Traditionally this has been satisfied by storing all weights on chip, but this is infeasible for the largest CNNs, which are often those most in need of acceleration. In this work we augment a state-of-the-art dataflow accelerator (HPIPE) to leverage both High-Bandwidth Memory (HBM) and on-chip storage, enabling high performance layer-pipelined dataflow acceleration of large CNNs. Based on profiling results of HBM's latency and throughput against expected address patterns, we develop an algorithm to choose which weight buffers should be moved off chip and how deep the on-chip FIFOs to HBM should be to minimize compute unit stalling. We integrate the new hardware generation within the HPIPE domain-specific CNN compiler and demonstrate good bandwidth efficiency against theoretical limits. Compared to the best prior work we obtain speed-ups of at least 19.4x, 5.1x and 10.5x on ResNet-18, ResNet-50 and VGG-16 respectively.

Keywords: field programmable gate arrays (FPGA)


TAIGA: A NEW RISC-V SOFT-PROCESSOR FRAMEWORK ENABLING HIGH PERFORMANCE CPU ARCHITECTURAL FEATURES


27th International Conference on Field Programmable Logic and Applications (FPL)

Authors: Matthews, Eric; Shannon, Lesley (Simon Fraser Univ, Sch Engn Sci, Burnaby, BC, Canada)

ISBN: (print) 9789090304281

Recently, there has been an increased focus on integration of reconfigurable fabric with modern processors. However, existing soft-processors are optimized to leverage older FPGA fabrics, focus primarily on resource minimization and have fixed-pipeline designs that limit the scope for tightly integrated hardware accelerators. In this work, we present Taiga: a RISC-V, 32-bit, soft-processor architecture supporting the RISC-V Multiply/Divide and Atomic operations extensions (RV32IMA) designed to support Linux-based shared-memory systems. The processor design is highly configurable and features a standardized interface for functional units, allowing for ease of integration of new functional units. Despite a more complex pipeline, our design uses approximately 33% fewer slices while clocking 39% faster than a LEON3 based system built on a Xilinx Zynq XC7Z020.

Keywords: Program processors; Field programmable gate arrays; Pipelines; Rockets; Table lookup; Fabrics


Formal modeling of process migration


17th International Conference on Field Programmable Logic and Applications

Authors: Blumer, Aric D.; Mortveit, Henning; Patterson, Cameron D. (Virginia Polytech Inst & State Univ, Bradley Dept Elect & Comp Engn, Blacksburg, VA 24061, USA; Virginia Polytech Inst & State Univ, Virginia Bioinformat Inst, Blacksburg, VA 24061, USA)

ISBN: (print) 9781424410590

This paper develops a formal model of process migration that describes programs, processes, and the migration of those processes within a migration realm. A migration realm is a group of processors modeled as finite state machines. The model is motivated by a migration application between software and Field Programmable Gate Array (FPGA) hardware, and the theorems of the model guide the use of FPGA resources while guaranteeing complete and correct execution of a process. By defining different types of migration realms, this paper also develops a migration realm taxonomy.

Keywords: field programmable gate arrays (FPGA)


High speed tabulation system using an FPGA designed for distribution tables of frequent DNA subsequences


17th International Conference on Field Programmable Logic and Applications

Authors: Yamaguchi, Yoshiki; Maruyama, Tsutomu; Konishi, Fumikazu; Konagaya, Akihiko (Univ Tsukuba, 1-1-1 Tenou Dai, Tsukuba, Ibaraki 3058573, Japan; RIKEN Genom Sci Ctr, Genom Sci Cent, Kanagawa 2300045, Japan)

ISBN: (print) 9781424410590

A method is described for enumerating the frequencies of DNA subsequences on a system comprising a host computer and a field programmable gate array (FPGA) board with one FPGA. Frequencies of subsequences with lengths of up to K_0 + K_1 + K_2 (24 in the current implementation) are enumerated in three phases. In these three phases, subsequences with lengths of up to K_0, K_0 + K_1, and K_0 + K_1 + K_2, respectively, are enumerated; these three phases are executed simultaneously on a pipelined circuit, resulting in high performance. The enumeration of frequent subsequences in databases, which are becoming larger and larger, will enable subsequences that are unique and/or repeatedly used in many parts of the sequences to be found.

Keywords: field programmable gate arrays (FPGA)
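
For orientation, the task this abstract describes (tabulating how often each DNA subsequence occurs, for every length up to a bound) can be written as a plain software reference. This is only a naive sketch, not the three-phase pipelined circuit of the paper; the function name `count_subsequences` and its parameters are assumptions for this example.

```python
from collections import Counter


def count_subsequences(seq, max_len):
    """Count every contiguous subsequence of seq with length 1..max_len."""
    counts = Counter()
    for k in range(1, max_len + 1):
        # Slide a window of width k across the sequence.
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


if __name__ == "__main__":
    table = count_subsequences("ACGTACGA", 3)
    # Show the most frequent subsequences first.
    for subseq, freq in table.most_common(5):
        print(subseq, freq)
```

The number of distinct subsequences can grow as 4^k with the length k, which gives a sense of why long subsequences are costly to tabulate in software.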


Application-specific customisation of multi-threaded soft processors


IEE Proceedings - Computers and Digital Techniques, 2006, Vol. 153, No. 3, pp. 173-180

Authors: Dimond, R.; Mencer, O.; Luk, W. (Univ London Imperial Coll Sci Technol & Med, Dept Comp, London SW7 2RH, England)

A multi-threaded microprocessor with a customisable instruction set, CUStomisable Threaded ARchitecture (CUSTARD), is proposed. CUSTARD features include design space exploration and a compiler for automatic selection of custom instructions. Custom instructions, optimised for a specific application, accelerate frequently performed computations by implementing them as dedicated hardware. Field programmable gate array implementations of CUSTARD are evaluated using media and cryptography benchmarks and compared with the commercial MicroBlaze processor. An area overhead as low as 28% for four interleaved threads and a speedup of up to 355% over a processor without custom instructions are demonstrated.

Keywords: cryptography; microprocessor chips; FPGA implementation; multi-threading; logic and switching circuits; media benchmark; MicroBlaze processor; data security; distributed systems software; logic circuits; design space exploration; multithreaded microprocessor; customisable instruction set; dedicated hardware; field programmable gate arrays; instruction sets; custom instruction automatic selection; cryptography benchmark; interleaved thread; customisable threaded architecture; multithreaded soft processor; microprocessors and microcomputers


A new scalable hardware architecture for RSA algorithm


17th International Conference on Field Programmable Logic and Applications

Authors: Guedue, Tamer (Tubitak, Natl Res Inst Elect & Cryptol, TR-41470 Kocaeli, Turkey)

ISBN: (print) 9781424410590

A new scalable systolic hardware architecture for RSA cryptosystems is presented. The kernel of the architecture can operate with different input precisions, which enables an area-time tradeoff in the design. The add-shift Montgomery algorithm is used for modular multiplication. Unlike previous approaches, after the add operation the result is shifted to the previous systole to divide by the radix. This simplifies the structure of the processing elements. The R-L binary Montgomery exponentiation algorithm is used. The square and multiply operations are performed in parallel. The architecture is implemented in Xilinx Virtex-5 FPGA (Field Programmable Gate Array) chips for different radixes. The DSP48E slices in the FPGA chips are used to increase the throughput of the design. The results are compared with the literature. The highest performance per area is obtained with the radix-2^16 design.

Keywords: field programmable gate arrays (FPGA)
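
The abstract above names two building blocks: add-shift Montgomery multiplication and right-to-left (R-L) binary exponentiation. The sketch below shows both in software for the simplest case, radix 2; it is not the systolic, higher-radix design of the paper. The function names and the choice R = 2^k with k equal to the bit length of the modulus are assumptions for this example, and the modulus must be odd.

```python
def mont_mul(a, b, n, k):
    """Add-shift Montgomery product a*b*2^(-k) mod n, for odd n < 2^k and a, b < n."""
    s = 0
    for i in range(k):
        if (a >> i) & 1:
            s += b        # add step: accumulate b for each set bit of a
        if s & 1:
            s += n        # make s even without changing it modulo n
        s >>= 1           # shift step: exact division by the radix (2)
    return s - n if s >= n else s


def mont_pow_rl(base, exp, n):
    """Compute base**exp mod n (n odd) with R-L binary exponentiation."""
    k = n.bit_length()
    r2 = (1 << (2 * k)) % n              # R^2 mod n, used to enter Montgomery form
    x = mont_mul(base % n, r2, n, k)     # base * R mod n
    acc = mont_mul(1, r2, n, k)          # 1 * R mod n
    while exp:
        # Both products below read only the current acc and x, so the
        # multiply and the square are independent within one iteration.
        if exp & 1:
            acc = mont_mul(acc, x, n, k)
        x = mont_mul(x, x, n, k)
        exp >>= 1
    return mont_mul(acc, 1, n, k)        # leave Montgomery form


if __name__ == "__main__":
    print(mont_pow_rl(5, 117, 1000003) == pow(5, 117, 1000003))
```

Scanning the exponent from the least significant bit is what makes the square and the multiply independent in each iteration, which is consistent with the abstract's statement that the two operations run in parallel.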


A seamless DFT/FFT self-adaptive architecture for embedded radar applications


30th International Conference on Field-Programmable Logic and Applications (FPL)

Authors: Mazuet, Julien; Narozny, Michel; Dezan, Catherine; Diguet, Jean-Philippe (Thales LAS France, Elancourt, France; UBO, UBS, Lab STICC, CNRS, Brest, France)

ISBN: (print) 9781728199023

Radar is one of the domains where adaptability is paramount and algorithms must be adapted to system state. However, most systems include static implementations on FPGA or ASIC to process the massive amount of data from multiple sensors in parallel. The classic approach is to configure hardware logic through registers to switch radar modes, which requires hardwiring all configurations. In embedded systems, FPGA dynamic partial reconfiguration (DPR) is a promising solution to reuse scarce resources. In this paper, we use DPR for radar processing in order to switch between a classic discrete Fourier transform (DFT) sum and a fast Fourier transform (FFT) to enhance Doppler extraction. Our study explores the pros and cons of both methods. Based on these observations, we propose a new architecture and decision method that relies on radar QoS for enabling an efficient self-adaptive solution. Finally, we provide a case study and a hardware-in-loop simulation with a reconfigurable radar implementation.

Keywords: FPGA; DPR; radar; QoS; DFT
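
The abstract above contrasts a direct DFT sum with an FFT. As a generic textbook illustration (not the paper's hardware architecture or decision method), the sketch below computes the same spectrum both ways. The function names are assumptions, and the recursive radix-2 FFT shown here only accepts power-of-two lengths.

```python
import cmath


def dft(x):
    """Direct DFT sum: O(n^2) operations, works for any length n."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def fft(x):
    """Recursive radix-2 FFT: O(n log n), length must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd-indexed half.
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


if __name__ == "__main__":
    # A complex tone at bin 2 of an 8-point transform.
    n = 8
    signal = [cmath.exp(2j * cmath.pi * 2 * t / n) for t in range(n)]
    spectrum = fft(signal)
    print(max(range(n), key=lambda k: abs(spectrum[k])))
```

One general difference is visible in the code: the direct sum can be evaluated for a single output bin or an arbitrary length, while this FFT form computes all bins at once and constrains the length.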

