Despite the vast hardware resources available in field-programmable gate arrays (FPGAs), implementing very large and complex designs on these platforms is a challenging problem. One effective solution is task scheduling, in which the FPGA configuration is changed according to the required task. This requires access to external nonvolatile memory, which significantly reduces system performance. In this letter, the development of a nonvolatile spintronic FPGA structure with task-scheduling capability is described. By exploiting the nonvolatility of the magnetic tunnel junction device, the structure can store two different configurations, which eliminates the need for external memory access and significantly enhances system performance.
Direction-of-arrival (DOA) estimation of radio signals is of utmost importance in many commercial and military applications. In this study, the authors propose an efficient field-programmable gate array (FPGA) architecture for implementing a recently published DOA estimation algorithm. This algorithm estimates DOAs by making use of the QR decomposition of the received data matrix of four- and eight-element uniform linear antenna arrays. The hardware implementation has been thoroughly analysed and experimentally validated by building a real-time prototype of the DOA estimation algorithm. The experimental results show good agreement between the DOA estimates obtained by the prototype and the true values.
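The QR decomposition step mentioned in this abstract can be illustrated with a small software sketch. The abstract does not say how the DOAs are extracted from the QR factors, so the code below covers only the factorisation of a simulated data matrix. The half-wavelength element spacing, the single unit-amplitude source, the noise level, and every function name (`steering_vector`, `simulate_snapshots`, `qr_mgs`) are assumptions made for illustration. Modified Gram-Schmidt is used here only because it is compact; it is not necessarily the method the authors implement in hardware.

```python
import cmath
import math
import random


def steering_vector(theta_deg, num_elements):
    """Steering vector of a uniform linear array (assumed half-wavelength spacing)."""
    phase = math.pi * math.sin(math.radians(theta_deg))
    return [cmath.exp(-1j * phase * m) for m in range(num_elements)]


def simulate_snapshots(theta_deg, num_elements, num_snapshots, noise_std, rng):
    """Rows are time snapshots, columns are antenna elements (hypothetical layout)."""
    a = steering_vector(theta_deg, num_elements)
    data = []
    for _ in range(num_snapshots):
        # Unit-amplitude source with a random phase per snapshot.
        s = cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
        data.append([
            s * am + complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
            for am in a
        ])
    return data


def qr_mgs(a):
    """QR factorisation of a tall, full-rank complex matrix by modified Gram-Schmidt.

    `a` is a list of rows. Returns (q, r) with a = q r, where q has orthonormal
    columns and r is upper triangular.
    """
    rows, cols = len(a), len(a[0])
    v = [[a[i][j] for i in range(rows)] for j in range(cols)]  # working columns
    q_cols = []
    r = [[0j] * cols for _ in range(cols)]
    for j in range(cols):
        norm = math.sqrt(sum(abs(x) ** 2 for x in v[j]))
        r[j][j] = norm
        qj = [x / norm for x in v[j]]
        q_cols.append(qj)
        # Remove the component along qj from every remaining column.
        for k in range(j + 1, cols):
            r[j][k] = sum(qc.conjugate() * x for qc, x in zip(qj, v[k]))
            v[k] = [x - r[j][k] * qc for x, qc in zip(v[k], qj)]
    q = [[q_cols[j][i] for j in range(cols)] for i in range(rows)]
    return q, r


if __name__ == "__main__":
    rng = random.Random(0)
    x = simulate_snapshots(20.0, 4, 16, 0.1, rng)  # four-element array
    q, r = qr_mgs(x)
    print("diagonal of R:", [round(abs(r[i][i]), 3) for i in range(4)])
```

The same functions work for the eight-element case by passing `num_elements=8`, provided the number of snapshots is at least the number of elements.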
Field-programmable gate arrays (FPGAs) offer an alternative to application-specific integrated circuits (ASICs) that is attractive in scenarios where flexibility may be required or where chip volumes are not high enough to justify the cost of a custom chip. The flexibility of FPGAs offers users the power and performance benefits of a custom hardware implementation, compared to software running on a processor, without committing to one specific design. However, this flexibility can lead to inefficiencies in terms of development time, area, performance, and power. FPGAs are used for a variety of applications, and different optimizations can be applied to increase the efficiency of FPGA implementations. This thesis considers techniques that can be applied to achieve efficient implementation of machine learning and other applications on an FPGA from the perspectives of architecture and computer-aided design (CAD). We consider the use of a high-level synthesis (HLS) tool to synthesize an accelerator for deep convolutional neural networks (CNNs) on an FPGA. We implement a complete end-to-end system running on an Arria 10 SoC FPGA. The accelerator implements zero-skipping and reduced-precision convolution with minimal impact on accuracy. We evaluate various versions of the accelerator through software changes and tool constraints alone. Then, we propose changes to the carry-chain architecture in FPGAs to improve the resource utilization of binarized CNNs (BNNs). We add carry-chain circuitry that propagates the sum instead of the carry. We demonstrate that we are able to reduce FPGA resource utilization while keeping the additional circuit area small. Lastly, we propose to re-map some of the look-up tables (LUTs) to use the existing carry-chain architecture in order to increase performance by adding a post-LUT mapping step to the FPGA CAD flow. Using a subject graph that closely matches the underlying hardware, we are able to select critical paths to
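
As an illustration of the zero-skipping and reduced-precision ideas named in this abstract, the following is a minimal software sketch rather than the accelerator's actual design. It assumes that zero-skipping means omitting multiply-accumulate operations whose activation operand is zero, and that reduced precision means signed fixed-point quantization with saturation. The abstract specifies neither, and the names `quantize` and `dot_zero_skip` are hypothetical.

```python
def quantize(values, num_bits, scale):
    """Map real values to signed num_bits-wide integers, saturating at the range limits."""
    lo = -(1 << (num_bits - 1))
    hi = (1 << (num_bits - 1)) - 1
    return [max(lo, min(hi, round(v / scale))) for v in values]


def dot_zero_skip(activations, weights):
    """Dot product that skips multiply-accumulates for zero activations.

    Returns the accumulated sum and the number of multiplications performed.
    """
    acc = 0
    macs = 0
    for a, w in zip(activations, weights):
        if a == 0:
            continue  # a zero operand contributes nothing to the sum
        acc += a * w
        macs += 1
    return acc, macs


if __name__ == "__main__":
    acts = [0, 3, 0, -2, 0, 0, 5, 1]  # sparse, e.g. after a ReLU-like layer
    wts = [4, -1, 7, 2, 9, -3, 1, 6]
    total, macs = dot_zero_skip(acts, wts)
    dense = sum(a * w for a, w in zip(acts, wts))
    print(total, dense, macs)
```

Skipping zero operands leaves the result identical to the dense dot product while reducing the number of multiplications, which is why sparsity can be exploited without affecting accuracy. Any accuracy impact in this sketch would come from the quantization step alone.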