ISBN: (print) 9781479900046
Realistic benchmarks are important for FPGA architecture and CAD evaluation. This paper provides a demo illustrating how designs described in HDL can be converted to BLIF using the Titan flow and then used in academic CAD tools.
ISBN: (print) 9782839918442
FPGAs are used in many long-life systems that serve mission-critical needs. The supply chain and life-cycle management of these devices have long relied on ensuring adequate controls are in place. In this paper, a technique is presented that provides measurement vectors by determining characteristics of both the supply properties and the aging of the FPGA. Asynchronous ring oscillators are placed throughout the FPGA, and the measurements of these oscillators are compared with those of other chips, both within a manufacturing lot and across manufacturing lots. Through these non-invasive measurements, the "health history" of the FPGA can be evaluated and utilized in supply chain decisions before and during system operation.
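The lot-level comparison described in this abstract can be illustrated with a minimal sketch. The abstract does not state which statistic is used, so the z-score below is an assumption, and the function name `lot_zscore` and the sample frequencies are hypothetical.

```python
from statistics import mean, stdev


def lot_zscore(chip_freq, lot_freqs):
    """Return how many sample standard deviations a chip's ring-oscillator
    frequency lies from the mean of a reference lot.

    Assumption: a z-score is used purely for illustration; the paper's
    actual comparison metric is not given in the abstract.
    """
    mu = mean(lot_freqs)
    sigma = stdev(lot_freqs)  # sample standard deviation, needs >= 2 values
    if sigma == 0:
        return 0.0
    return (chip_freq - mu) / sigma


# Hypothetical frequencies in MHz for chips from one manufacturing lot.
reference_lot = [100.0, 102.0, 98.0]
score = lot_zscore(96.0, reference_lot)
```

A strongly negative score would mean the oscillator runs slower than its lot peers, which is the kind of deviation one might examine for aging, under the assumptions stated above.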
ISBN: (print) 9781479900046
This paper presents an alternative FPGA design compilation flow that reduces the back-end time required to implement a design. Beginning with the GReasy front-end and proceeding through the TFlow back-end, this flow consists of a rapid method for design assembly, decoupled from the vendor tools. This enables software-like turnaround time for faster prototyping and increased productivity.
ISBN: (print) 9789090304281
Many CPU design houses, including Intel, IBM, and ARM, have added dedicated support for cryptography in recent processor generations. While adding accelerators and/or dedicated instructions boosts cryptographic performance, we are investigating a different approach that adds no extra silicon area: we study replacing the hardened NEON SIMD unit of an ARM Cortex-A9 with an identically sized FPGA fabric, called an interlay. This fabric will be used for implementing cryptographic instructions in soft logic. We show that this approach can outperform the hardened NEON by up to 7.7x on AES and provide functionality that is not available in the hardened ARM.
ISBN: (print) 9781424419609
The poor scalability of current mesh-based FPGA interconnection networks is impeding attempts to build next-generation FPGAs with larger logic capacity. A few alternative interconnection network architectures have been proposed for future FPGAs, but they still have several design challenges that need to be addressed. In this paper we propose sFPGA, a scalable FPGA architecture that is a hybrid between hierarchical interconnection and Network-on-Chip. The logic resources in sFPGA are organized into an array of logic tiles. The tiles are connected by a hierarchical network of switches, which route data packets over the network. In addition, we propose a design flow for sFPGA that integrates seamlessly with current design flows. We have validated the sFPGA design flow through a case study on our emulation prototype.
ISBN: (print) 9789090304281
Stencil computations represent a highly recurrent class of algorithms in various high-performance computing scenarios. The Streaming Stencil Time-step (SST) architecture is a recent implementation of stencil computations on Field-Programmable Gate Arrays (FPGAs). In this paper, we propose an automated framework for SST-based architectures capable of achieving the maximum performance level for a given FPGA device through 1) maximization of the number of basic modules instantiated in the design and 2) optimization of the design floorplanning. Experimental results show that the proposed approach reduces the design time by up to 15x with respect to naive design space exploration approaches, and improves performance by 13%.
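For readers unfamiliar with the term, the following is a minimal software sketch of a stencil time-step. It is a generic 1D three-point averaging stencil chosen for illustration; it is not the SST architecture, and the function names `stencil_step` and `run_stencil` are hypothetical.

```python
def stencil_step(grid):
    """Apply one time-step of a 1D three-point averaging stencil.

    Each interior cell becomes the mean of itself and its two neighbours;
    the two boundary cells are held fixed (an assumed boundary condition).
    """
    out = list(grid)
    for i in range(1, len(grid) - 1):
        out[i] = (grid[i - 1] + grid[i] + grid[i + 1]) / 3.0
    return out


def run_stencil(grid, steps):
    """Iterate the stencil for a given number of time-steps."""
    for _ in range(steps):
        grid = stencil_step(grid)
    return grid


result = run_stencil([0.0, 3.0, 0.0], 1)
```

Every output cell of a time-step depends only on a fixed neighbourhood of the previous time-step, which is the property that makes such computations amenable to streaming hardware pipelines.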
ISBN: (print) 9781424403127
This paper examines various activity estimation techniques in order to determine which are most appropriate for use in the context of field-programmable gate arrays (FPGAs). Specifically, the paper compares how different activity estimation techniques affect the accuracy of FPGA power models and the ability of power-aware FPGA CAD tools to minimize power. After comparing various existing techniques, the most suitable existing techniques are combined with two novel enhancements to create a new activity estimation tool called ACE-2.0. Finally, the new publicly available tool is compared to existing tools to validate the improvements. Using activities estimated by ACE-2.0, the power estimates and power savings were both within 1% of the results obtained using simulated activities.
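The "simulated activities" used as the baseline in this abstract can be sketched as statistics computed from a per-cycle signal trace. This is only an illustration of simulation-based activity measurement, not the ACE-2.0 estimator; the function names and the sample trace are hypothetical.

```python
def static_probability(trace):
    """Fraction of cycles in which the signal is logic 1."""
    return sum(trace) / len(trace)


def switching_activity(trace):
    """Average number of transitions per cycle boundary in a 0/1 trace."""
    if len(trace) < 2:
        return 0.0
    toggles = sum(1 for prev, cur in zip(trace, trace[1:]) if prev != cur)
    return toggles / (len(trace) - 1)


# Hypothetical per-cycle values of one net taken from a simulation.
trace = [0, 1, 1, 0, 1]
activity = switching_activity(trace)
probability = static_probability(trace)
```

Dynamic power models typically weight each net's capacitance by a switching activity of this kind, which is why the accuracy of the activity figures matters for power estimation.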
ISBN: (print) 9782839918442
One of the most important topics today is packet processing in data centers with respect to power consumption and efficient utilization of computational resources. The ARM architecture has proved to be an energy-efficient computational platform. Together with an integrated FPGA on a single die, it potentially offers high performance relative to its power consumption. DPDK, a set of libraries and drivers intended primarily for fast packet processing, is becoming a standard approach for packet processing, especially in data centers. In this paper, we explore the potential of packet processing based on DPDK and FPGA SoC architectures. In particular, we focus on the potential of the ARM Cortex-A9 and Cortex-A53 CPUs.
ISBN: (print) 9782839918442
An improved architecture for efficiently computing the sum of absolute differences (SAD) on FPGAs is proposed in this work. It is based on a configurable adder/subtractor implementation in which each adder input can be negated at runtime. The negation of both inputs at the same time is explicitly allowed and used to compute the sum of absolute values in a single adder stage. The architecture can be mapped to modern FPGAs from Xilinx and Altera. An analytic complexity model as well as synthesis experiments yield an average look-up table (LUT) reduction of 17.4% for an input word size of 8 bits compared to the state of the art. As the SAD computation is a resource-demanding part of image processing applications, the proposed circuit can be used to replace the SAD core of many applications to enhance their efficiency.
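The idea of an adder whose inputs can each be negated at runtime can be modelled behaviourally in software. The sketch below is an illustrative model only, not the proposed circuit: the function names are hypothetical, and the sign tests stand in for whatever control logic the hardware uses to select negation.

```python
def sad_reference(xs, ys):
    """Plain definition of the sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(xs, ys))


def add_with_negation(a, b, neg_a, neg_b):
    """Behavioural model of an adder with independently negatable inputs."""
    return (-a if neg_a else a) + (-b if neg_b else b)


def sum_of_abs(d0, d1):
    """Compute |d0| + |d1| in one 'adder stage' by negating each
    negative input; both inputs may be negated at the same time."""
    return add_with_negation(d0, d1, d0 < 0, d1 < 0)


def sad_pairwise(xs, ys):
    """SAD built from the model: differences first, then pairwise
    sums of absolute values."""
    diffs = [add_with_negation(x, y, False, True) for x, y in zip(xs, ys)]
    if len(diffs) % 2:
        diffs.append(0)  # pad so the differences can be consumed in pairs
    return sum(sum_of_abs(diffs[i], diffs[i + 1])
               for i in range(0, len(diffs), 2))


total = sad_pairwise([1, 5, 3], [4, 2, 3])
```

Folding the absolute-value step into the addition is what removes a separate stage in this model; the LUT savings reported in the abstract come from the hardware mapping, which the sketch does not capture.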
ISBN: (print) 9781424410590
Reconfigurable logic devices are classified as fine-grained or coarse-grained on the basis of their basic logic cell architecture. In general, each architecture has its own merits; therefore, it is difficult to achieve a balance between operation speed and implementation area across various applications. In this paper, we propose a Variable Grain Logic Cell (VGLC) architecture, which consists of a 4-bit ripple carry adder with configuration memory bits, and we also develop a technology mapping tool. Its key feature is variable granularity, a trade-off between the coarse-grained and fine-grained types required for the implementation of arithmetic and random logic, respectively. As a result, the critical path delay and the number of configuration memory bits are reduced by 49.7% and 48.5%, respectively, in the benchmark circuits.