ISBN: (print) 9780993428005
The proceedings contain 97 papers. The topics discussed include: using island-style bi-directional intra-CLB routing in low power FPGAs; energy efficient partitioning of dynamic reconfigurable MRAM-FPGAs; automatic support for multi-module parallelism from computational patterns; fine-tuning CLB placement to speed up reconfigurations in NVM-based FPGAs; a technology mapper for depth-constrained FPGA logic cells; parallel feature extraction and heterogeneous object-detection for multi-camera driver assistance systems; generating FPGA accelerators for chemical similarity assessment; 7MOPS/lemon-battery image processing demonstration with an ultra-low power reconfigurable accelerator CMA-SOTB-2; NetFPGA - rapid prototyping of high bandwidth devices in open source; optimizing energy efficient low-swing interconnect for sub-threshold FPGAs; reduction calculator in an FPGA based switching hub for high performance clusters; and serial and parallel interleaved modular multipliers on FPGA platform.
ISBN: (print) 9783319312934; 9783319312910
Fuzzy logic systems have been implemented successfully in the design of a wide variety of control systems. They provide a powerful way to design nonlinear controllers using human expert knowledge. In this article, we present an approach to designing and implementing a fuzzy logic proportional-integral (Fuzzy-PI) controller for an omnidirectional robot navigation system using a field-programmable gate array (FPGA). First, we define the kinematic model of the robot system; then we design, simulate, and optimize the navigation controller using MATLAB and the Robotino Sim platform. The main goal of this work is the design of the Fuzzy-PI controller and its hardware implementation using FPGA resources. The controller can be implemented on an FPGA using either a software or a hardware approach. For the latter approach, the Fuzzy-PI algorithm is implemented in VHDL, synthesized, optimized, placed and routed, and downloaded onto an FPGA board.
Digital circuit design may impose critical requirements, such as power consumption, robustness, and performance, when implemented in VLSI (Very Large Scale Integration). The asynchronous paradigm presents int...
ISBN: (print) 9781538622544
Bayesian neural networks (BNNs) have been proposed to address the problem of model uncertainty in training. By associating weights with conditional probability distributions, a BNN is able to resolve the overfitting issues commonly seen in conventional neural networks. The frequent use of Gaussian random variables requires a properly optimized Gaussian Random Number Generator (GRNG). The high hardware cost of conventional GRNGs makes the hardware realization of BNNs challenging. In this paper, a new hardware acceleration architecture for variational inference in BNNs is proposed to facilitate the use of BNNs in larger-scale applications. In addition, the proposed implementation introduces the RAM-based Linear Feedback GRNG (RLF-GRNG) for effective weight sampling in BNNs. The RAM-based Linear Feedback method can effectively utilize RAM resources for parallel Gaussian random number generation while requiring limited and sharable control logic. Implementation on an Altera Cyclone V FPGA suggests that the RLF-GRNG uses far fewer RAM resources than other GRNG methods. Experimental results show that the proposed hardware implementation of a BNN can still attain accuracy similar to that of a software implementation.
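The abstract does not spell out how the RLF-GRNG works internally, so the following is only a minimal software sketch of the general idea of deriving approximately Gaussian values from linear-feedback bit streams. It assumes a textbook 16-bit Fibonacci LFSR (taps 16, 14, 13, 11) and a central-limit approximation that counts ones over a window of bits; the function names, the window length, and the seed are illustrative assumptions, not the authors' design.

```python
import math


def lfsr16_step(state: int) -> int:
    """Advance a 16-bit Fibonacci LFSR (taps 16, 14, 13, 11) by one shift."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)


def gaussian_samples(count: int, bits_per_sample: int = 64, seed: int = 0xACE1):
    """Return `count` roughly standard-normal values.

    Each value is the number of ones among `bits_per_sample` consecutive
    LFSR output bits, centred and scaled as a Binomial(n, 1/2) variable.
    This is a generic approximation, not the RLF-GRNG from the abstract.
    """
    if seed == 0:
        raise ValueError("an all-zero LFSR state never changes")
    state = seed
    mean = bits_per_sample / 2
    scale = math.sqrt(bits_per_sample / 4)
    samples = []
    for _ in range(count):
        ones = 0
        for _ in range(bits_per_sample):
            state = lfsr16_step(state)
            ones += state & 1  # low bit of the state is the output bit
        samples.append((ones - mean) / scale)
    return samples


if __name__ == "__main__":
    values = gaussian_samples(1000)
    print("sample mean:", sum(values) / len(values))
```

In hardware, the counting step maps naturally onto a population-count circuit, which is one reason bit-stream approaches of this kind are attractive on FPGAs; how the paper shares RAM and control logic across generators is not reproduced here.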
Smart Systems create new challenges for semiconductor components that need to provide, on the one hand, the ease of software programmability and, on the other hand, the efficiency of hardware. This dichotomy is driven by the need for a software stack that supports flexibility, scalability, and fast development cycles, and by the need for hardware optimization to support performance, efficiency, and cost. In this talk, we will discuss the technical trends in major market segments such as datacenters, 5G wireless infrastructure, and software-defined networking, and the resulting requirements for semiconductor platforms from both a hardware and a software perspective. In response to these challenges, we will demonstrate how FPGA technology is evolving from a programmable hardware solution catering to ASIC refugees towards an All Programmable architecture empowering system and software engineers.
As transistor scaling is slowing down [1], other opportunities for ensuring continued performance increases have to be explored. Field-programmable gate arrays (FPGAs) are in the spotlight these days, not only due to their malleability and energy efficiency, but also because FPGAs have recently been integrated into the cloud [2]. The latter makes them available to everyone in need of the immense computing power and data throughput they can offer. However, one important issue needs to be resolved first: the time to compile an industrial-scale design for an FPGA must be drastically reduced. Researchers have been looking for ways to accelerate FPGA routing through parallelism, since routing is one of the most time-consuming compilation steps. However, the ideal solution has not been found yet. This paper provides a survey of parallel FPGA routers, with the aim of identifying their strengths and weaknesses and thus suggesting directions for further acceleration efforts.
Iterative stencils are kernels in various application domains, such as numerical simulations and medical imaging, that merit FPGA acceleration. The best architecture depends on many factors, such as the target platform, off-chip memory bandwidth, problem size, and performance requirements. We generate a family of FPGA stencil accelerators targeting emerging System on Chip platforms (e.g., Xilinx Zynq or Intel SoC). Our designs come with design knobs to explore trade-offs. We also propose performance models to home in on the most interesting design points, and show how they accurately lead to optimal designs. The optimal choice depends on problem sizes and performance goals.
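For readers unfamiliar with the term, the sketch below shows what an iterative stencil computes in software. It assumes a one-dimensional, 3-point averaging (Jacobi-style) stencil with fixed boundary cells; the function name and the choice of stencil are illustrative assumptions and do not reflect the accelerators or performance models in the abstract.

```python
def jacobi_1d(grid, iterations):
    """Apply a 3-point averaging stencil `iterations` times.

    Every interior cell becomes the mean of itself and its two neighbours,
    computed from the previous iteration's values. Boundary cells are fixed.
    """
    current = list(grid)
    for _ in range(iterations):
        updated = current[:]  # separate buffer so reads see only old values
        for i in range(1, len(current) - 1):
            updated[i] = (current[i - 1] + current[i] + current[i + 1]) / 3.0
        current = updated
    return current


if __name__ == "__main__":
    print(jacobi_1d([0, 0, 3, 0, 0], 1))  # [0, 1.0, 1.0, 1.0, 0]
```

Each output cell depends only on a small, fixed neighbourhood of the previous iteration, which is what makes such kernels candidates for pipelined hardware; the amount of data that must be streamed from off-chip memory per iteration is one of the factors the abstract lists.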
Physically unclonable functions (PUFs) are used for IP protection, hardware authentication, and supply chain security. While many PUF constructions have been put forward in the past decade, only a few of them are applicable to FPGA platforms. Strict constraints on placement and routing are the main disadvantage of existing PUFs on FPGAs, because they demand considerable effort from the designer. In this paper we propose a new delay-based PUF construction, called the Monte Carlo PUF, that does not require low-level placement and routing control. This construction relies on an on-chip Monte Carlo method that measures the delays of logic elements in order to extract a unique device fingerprint. The proposed construction allows a trade-off between evaluation time and error rate. The Monte Carlo PUF is implemented and evaluated on Xilinx Spartan-6 FPGAs.
Bit matrix compression is a highly relevant operation in computer arithmetic. Essentially a multi-operand addition, it is the key operation behind fast multiplication and many higher-level operations such as multiply-accumulate, the computation of the dot product, or the implementation of FIR filters. Compressor implementations have been constantly evolving for greater efficiency, both in general and in the context of concrete applications or specific implementation technologies. This paper builds on this history and describes a generic implementation of a bit matrix compressor for Xilinx FPGAs that does not require a generator tool. It contributes FPGA-oriented metrics for the evaluation of elementary parallel bit counters, a systematic analysis and partial decomposition of previously proposed counters, and a fully implemented construction heuristic with a flexible compression target matching the device capabilities. The generic implementation is agnostic of the aspect ratio of the input matrix and can be used for multiplication in the same way as for single-column population count operations.
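To make the notion of compressing a bit matrix concrete, here is a small software model. It assumes the simplest possible counter, a 3:2 full adder, applied column by column until no column holds more than two bits, followed by one ordinary addition of the two remaining rows. The function name is hypothetical, and the sketch does not model the FPGA-specific parallel counters or the construction heuristic described in the abstract.

```python
def compress_and_add(operands, width):
    """Sum non-negative integers below 2**width via bit matrix compression.

    columns[i] holds every bit of weight 2**i. Each pass replaces groups of
    three bits in a column with one sum bit (same weight) and one carry bit
    (next weight), which preserves the total value of the matrix.
    """
    columns = [[(x >> i) & 1 for x in operands] for i in range(width)]

    while any(len(col) > 2 for col in columns):
        reduced = [[] for _ in range(len(columns) + 1)]
        for i, col in enumerate(columns):
            j = 0
            while len(col) - j >= 3:
                a, b, c = col[j:j + 3]
                reduced[i].append(a ^ b ^ c)                          # sum bit
                reduced[i + 1].append((a & b) | (a & c) | (b & c))    # carry bit
                j += 3
            reduced[i].extend(col[j:])  # leftover bits pass through unchanged
        columns = reduced

    # At most two bits per column remain: read them as two rows and add once.
    row0 = sum(col[0] << i for i, col in enumerate(columns) if len(col) > 0)
    row1 = sum(col[1] << i for i, col in enumerate(columns) if len(col) > 1)
    return row0 + row1


if __name__ == "__main__":
    print(compress_and_add([13, 7, 9, 15, 2], 4))  # 46
```

Every full adder removes one bit from the matrix, so the loop terminates, and only the final two-row addition needs a carry-propagate adder. A single-column population count corresponds to calling the function with one-bit operands and `width=1`.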
The P4 language provides a way to describe custom network packet processing behavior that involves header parsing, matching, and assembling modified packets. Such an abstraction represents a significant step towards removing the limitations of fixed-function networking devices. Our live demonstration shows a straightforward use of an algorithm and tool that maps a P4 program to a general architecture of an FPGA-based networking device. Network traffic is received, parsed, filtered, and modified by the generated circuit at the full line rate of 100 Gbps Ethernet. The results of our ongoing joint research project NFV200 show that FPGA technology can be used to improve network flexibility without the usual burden of tedious and error-prone HDL coding.