Search Results - Inner Mongolia University Library

A Deep Learning Approach for the Design of Narrow Transition-Band FIR Filter

CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2022, Vol. 41, No. 10, pp. 5578-5613

Authors: Roy, Subhabrata; Chandra, Abhijit. Affiliations: Jadavpur Univ Dept Instrumentat & Elect Engn Kolkata 700106 India

Deep neural network (DNN), being an important member of the machine learning family, has been employed to serve a wide range of applications in the area of signal and image processing like pattern recognition, speech recognition, language processing, image segmentation, etc. To this aim, this paper concentrates on the design of a narrow transition-band finite impulse response (FIR) filter with the aid of a back-propagation-based deep learning approach. The proposed deep learning-based approach offers a unified design framework for a variety of FIR filters. Convergence behaviour of the proposed algorithm has been proved analytically in situations when weights between adjacent layers are updated continuously. Simulation results have shown the frequency response characteristics of several FIR filters with narrow transition-band, designed with the help of the proposed approach. Advantage of our design strategy has also been established in terms of magnitude response over a number of state-of-the-art techniques of recent interest. Simulation results have shown noticeable improvement in terms of transition bandwidth when compared with a few existing works. The designed filter is subsequently implemented on Altera's Cyclone IV field programmable gate array (FPGA) chip, and hardware efficiency of the suggested design has strongly been established by correlating its hardware cost with many of the state-of-the-art FIR filters.

Keywords: Deep neural network; Finite impulse response filter; field programmable gate array; Low complexity; Narrow-band filter

FPGAN: An FPGA Accelerator for Graph Attention Networks With Software and Hardware Co-Optimization

IEEE ACCESS, 2020, Vol. 8, pp. 171608-171620

Authors: Yan, Weian; Tong, Weiqin; Zhi, Xiaoli. Affiliations: Shanghai Univ Sch Comp Engn & Sci Shanghai 200444 Peoples R China; Shanghai Univ Shanghai Inst Adv Commun & Data Sci Shanghai 200444 Peoples R China

The Graph Attention Networks (GATs) exhibit outstanding performance in multiple authoritative node classification benchmark tests (including transductive and inductive). The purpose of this research is to implement an FPGA-based accelerator called FPGAN for graph attention networks that achieves significant improvement on performance and energy efficiency without losing accuracy compared with PyTorch baseline. It eliminates the dependence on digital signal processors (DSPs) and large amounts of on-chip memory and can even work well on low-end FPGA devices. We design FPGAN with software and hardware co-optimization across the full stack from algorithm through architecture. Specifically, we compress model to reduce the model size, quantify features to perform fixed-point calculation, replace multiplication addition cell (MAC) with shift addition units (SAUs) to eliminate the dependence on DSPs, and design an efficient algorithm to approximate SoftMax function. We also adjust the activation functions and fuse operations to further reduce the computation requirement. Moreover, all data is vectorized and aligned for scalable vector computation and efficient memory access. All the above optimizations are integrated into a universal hardware pipeline for various structures of GATs. We evaluate our design on an Inspur F10A board with an Intel Arria 10 GX1150 and 16 GB DDR3 memory. Experimental results show that FPGAN can achieve 7.34 times speedup over Nvidia Tesla V100 and 593 times over Xeon CPU Gold 5115 while maintaining accuracy, and 48 times and 2400 times on energy efficiency respectively.

Keywords: field programmable gate arrays; Acceleration; Computational modeling; Computer architecture; Optimization; Hardware; Energy efficiency; Graph attention networks; model optimization; inference accelerating; heterogeneous computing; parallel computing; shift operation
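
The abstract above mentions replacing multiply-accumulate cells with shift-addition units so that no DSP blocks are needed. The record does not describe how FPGAN does this, so the following is only a minimal sketch of the general idea, assuming each weight is rounded to a signed power of two and the inputs are fixed-point integers. All function names are hypothetical and are not taken from the paper.

```python
import math


def quantize_to_power_of_two(w):
    """Return (sign, exponent) so that sign * 2**exponent approximates w.

    Assumption: rounding happens in the log2 domain; the paper may use
    a different quantization scheme.
    """
    if w == 0:
        return 0, 0
    sign = 1 if w > 0 else -1
    exponent = round(math.log2(abs(w)))
    return sign, exponent


def shift_multiply(x, sign, exponent):
    """Multiply the integer x by sign * 2**exponent using only shifts."""
    if sign == 0:
        return 0
    # A non-negative exponent is a left shift, a negative one a right shift.
    shifted = x << exponent if exponent >= 0 else x >> -exponent
    return sign * shifted


def shift_add_dot(xs, weights):
    """Dot product built from shifts and additions instead of multiplies."""
    acc = 0
    for x, w in zip(xs, weights):
        sign, exponent = quantize_to_power_of_two(w)
        acc += shift_multiply(x, sign, exponent)
    return acc
```

With weights that are already exact powers of two, such as `[0.5, -0.25, 2.0]`, the result matches the ordinary dot product. Other weights pick up a rounding error, which is the accuracy cost this kind of scheme has to manage.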

Hardware Implementation of Reconfigurable 1D Convolution

JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2016, Vol. 82, No. 1, pp. 1-16

Authors: Rao, Lei; Zhang, Bin; Zhao, Jizhong. Affiliations: Xi An Jiao Tong Univ Xian 710049 Shaanxi Peoples R China

Convolution has been extensively used in image processing and computer vision, including image enhancement, smoothing, and structure extraction. However, convolution operation typically requires a significant amount of computing resources. A novel one-dimensional (1D) convolution processor with reconfigurable architecture is implemented in this study. This processor is a combination of a line buffer, controller units, as well as a reconfigurable and separable convolution module. The use of a reconfigurable architecture and separable convolution approach improves the flexibility and performance of the convolution processor. The reconfigurable and separable convolution array, which is the main component of the processor, can simultaneously execute convolution operation with different kernels, with a maximum kernel size of up to 24 x 24. Experimental results show that the maximum frame rate of the processor is approximately 194 frames per second (fps), which exceeds the real-time requirement. Synthesis results show that the processor occupies 13.39 mm² at a 204 MHz system clock and consumes a power of 419 mW at maximum kernel size at a 120 MHz system clock in SMIC 0.18 µm CMOS technology. Verification experiments on field programmable gate arrays (FPGAs) demonstrate that the processor is suitable for real-time image processing applications even for high-resolution images.

Keywords: Convolution; field programmable gate array; Reconfigurable architecture
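
The separable convolution approach named in the abstract above rests on a standard identity: when a 2D kernel is the outer product of a column vector and a row vector, the 2D convolution equals a 1D pass along the rows followed by a 1D pass along the columns. The sketch below illustrates that identity in NumPy. It is not the processor's architecture, and the function names are hypothetical.

```python
import numpy as np


def conv2d_direct(image, kernel):
    """'Valid' 2D convolution written out directly, used as a reference."""
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]  # convolution flips the kernel
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * flipped)
    return out


def conv2d_separable(image, col_kernel, row_kernel):
    """Two 1D passes, equivalent to using np.outer(col_kernel, row_kernel)."""
    # Pass 1: convolve every row with the row kernel.
    rows_done = np.apply_along_axis(
        lambda r: np.convolve(r, row_kernel, mode="valid"), 1, image)
    # Pass 2: convolve every column of the intermediate result.
    return np.apply_along_axis(
        lambda c: np.convolve(c, col_kernel, mode="valid"), 0, rows_done)
```

For a k x k separable kernel, the two-pass form needs about 2k multiplications per output pixel instead of k², which is the usual motivation for building hardware around 1D convolution units.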

ROSETTA: A Resource and Energy-Efficient Inference Processor for Recurrent Neural Networks Based on Programmable Data Formats and Fine Activation Pruning

IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2023, Vol. 11, No. 3, pp. 650-663

Authors: Kim, Jiho; Kim, Tae-Hwan. Affiliations: Korea Aerosp Univ Sch Elect & Informat Engn Goyang 10540 Gyeonggi Do South Korea

Recurrent neural networks (RNNs) are extensively employed to perform inference based on the temporal features of the input data. However, their computational workload and power consumption involved in inference are prohibitively high in practice, which may be problematic to achieve a high-speed inference in devices with tight limitations in the available silicon resources and power supply. This paper presents an efficient inference processor for RNNs, named ROSETTA. ROSETTA supports multiple data formats programmable for each vector operand to achieve a wide range or high precision with a limited data size. ROSETTA consistently performs every vector operation based on homogeneous processing units with a high utilization rate. Moreover, ROSETTA skips operations and reduces memory accesses to achieve high energy efficiency by pruning the activation elements in a fine-grained manner. Implemented in a low-cost 28 nm field-programmable gate array, ROSETTA exhibits a resource and energy efficiency as high as 2.51 - 1.14 MOP/s/LUT and 434.01 - 113.29 GOP/s/W, respectively, while producing near-floating-point inference results. The resource and energy efficiency of ROSETTA are higher than those of the previous processor implemented in the same device by up to 206.1% and 304.0%, respectively. The functionality has been verified for several RNN models of various types under a fully-integrated inference system.

Keywords: Logic gates; field programmable gate arrays; Speech recognition; Energy efficiency; Convolutional neural networks; Recurrent neural networks; Integrated circuit modeling; Accelerator; inference; microarchitecture

HopliteRT*: Real-Time NoC for FPGA

IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2020, Vol. 39, No. 11, pp. 3650-3661

Authors: Gonzalez, Yilian Ribot; Nelissen, Geoffrey. Affiliations: Polytech Inst Porto ISEP CISTER Res Ctr P-4200465 Porto Portugal; Eindhoven Univ Technol Dept Math & Comp Sci NL-5612 AZ Eindhoven Netherlands

With the increasing number of computation nodes integrated in multi and many-core platforms, network-on-chips (NoCs) emerged as a new communication medium in systems-on-chips (SoCs). HopliteRT is a new NoC design that was recently proposed to address the needs of real-time systems whilst respecting the constraints of field-programmable gate array (FPGA) platforms. In this article, we: 1) introduce priority-based routing in HopliteRT; 2) change the network topology in order to improve the packets' worst-case traversal time (WCTT); 3) identify a flaw in the existing timing analysis of HopliteRT; and 4) develop a new timing analysis that is proven correct. We also show by means of experiments that the modifications of HopliteRT proposed in this article allow for at least 2x improvement on the worst and average case traversal time of high priority packets, without impacting the quality of service of low-priority packets. The timing properties of high priority flows are greatly improved for negligible additional hardware costs. The proposed NoC has been implemented in Verilog and synthesized for a Xilinx Virtex-7 FPGA platform.

Keywords: field programmable gate array; network-on-chips; real-time embedded systems; systems-on-chips; timing analysis

Infrared Target Detection and Recognition Method in Airborne Photoelectric System

JOURNAL OF AEROSPACE INFORMATION SYSTEMS, 2019, Vol. 16, No. 3, pp. 94-106

Authors: Ding, Meng; Sun, Zhejun; Wei, Li; Cao, Yunfeng; Yao, Yuheng. Affiliations: Nanjing Univ Aeronaut & Astronaut Coll Civil Aviat Nanjing 210016 Jiangsu Peoples R China; Nanjing Univ Aeronaut & Astronaut Jin Cheng Coll Nanjing 210016 Jiangsu Peoples R China; Nanjing Univ Aeronaut & Astronaut Coll Astronaut Nanjing 210016 Jiangsu Peoples R China; Univ Southern Calif Viterbi Sch Engn 649 W 34th St Los Angeles CA 90089 USA

Infrared target detection and recognition are investigated by considering the wide application requirements of an airborne photoelectric system. The proposed algorithm can be divided into three parts. First, on the basis that the target of infrared images dominates the background in the frequency domain, this paper presents a method of candidate region detection. The detection algorithm first generates a saliency map using the discrete cosine transform and then identifies candidate regions by computing and comparing saliency scores of different regions. Second, to extract the features of each candidate region for recognition, the paper presents a local descriptor and subsequently uses locality-constrained linear coding and a pooling operator to obtain the feature vector of the target, and then further completes target recognition via a simple linear classifier. Finally, as preliminary research on the engineering application of related algorithms, the detection and recognition algorithms are transplanted to an embedded platform. The paper conducts experiments on six test sequences to evaluate the performance of the proposed algorithms and the computing efficiency on the embedded platform. An evaluation experiment and comparison experiment verify the effectiveness and practicability of the proposed algorithms.

Keywords: Computing; Frequency Domain; field programmable gate array; Convolutional Neural Network; Airport Towers; Sensor Technology; Unmanned Aerial Vehicle; Graphics Processing Unit; Support Vector Machine; Focal Plane arrays

A New Imaging System for Real-Time Process Control

IEEE SENSORS JOURNAL, 2017, Vol. 17, No. 12, pp. 3844-3852

Authors: Saied, Imran; Mahmoud, Meribout. Affiliations: Coll Engn Dept Elect Engn Abu Dhabi U Arab Emirates

In this paper, a new tomography technique called electrical charge tomography for two-phase flow imaging is presented. The probe consists of a few pairs of electrodes which are electrically energized to generate electrical charges within the fluid under test. The intensity of these charges depends on the chemical and physical properties of the fluid, as well as on its molecular distribution. Another group of electrodes surrounding the cross section of the fluid under test is used to capture the induced electrical charges. These are then converted into an electrical signal using a highly sensitive charge amplifier. A postprocessing unit, which consists of an analog-to-digital converter followed by a field programmable gate array (FPGA) module, is then used for high-level signal processing (i.e., a dedicated dynamic thresholding algorithm) and image reconstruction. Experimental results demonstrate the capability of the system to accurately generate 2-D cross-sectional images, where the error is up to 14% lower than when using an electrical capacitance tomography (ECT) probe. The other advantage of this technique over ECT is the reduced data acquisition time, since in ECT a minimum time is required for the charge and discharge of the capacitance in order to achieve acceptable accuracy. This makes the probe another attractive concept for future tomography systems targeting real-time applications.

Keywords: Electrical capacitance tomography; imaging systems; field programmable gate array; solid contaminants measurement; gas pipeline monitoring

Normally-Off Computing for Crystalline Oxide Semiconductor-Based Multicontext FPGA Capable of Fine-Grained Power Gating on Programmable Logic Element With Nonvolatile Shadow Register

IEEE JOURNAL OF SOLID-STATE CIRCUITS, 2015, Vol. 50, No. 9, pp. 2199-2211

Authors: Aoki, Takeshi; Okamoto, Yuki; Nakagawa, Takashi; Kozuma, Munehiro; Kurokawa, Yoshiyuki; Ikeda, Takayuki; Yamade, Naoto; Okazaki, Yutaka; Miyairi, Hidekazu; Fujita, Masahiro; Koyama, Jun; Yamazaki, Shunpei. Affiliations: Semicond Energy Lab Co Ltd Atsugi Kanagawa 2430036 Japan; Univ Tokyo VDEC Tokyo 1130032 Japan

Normally-off computing (Noff computing) using a multicontext field programmable gate array (MC-FPGA) consisting of crystalline oxide semiconductor FETs has been developed. The Noff computing discussed in this paper is a control architecture for an MC-FPGA capable of performing fine-grained power gating on each programmable logic element (PLE) whose registers include a volatile register and also a nonvolatile shadow register for storing and loading data in the volatile register. The MC-FPGA performs fine-grained control of power supplied only to PLEs contributing to effective calculation, when context switching happens. With an MC-FPGA fabricated with a hybrid process of a 1.0 µm crystalline oxide semiconductor FET on a 0.5 µm CMOS FET, it has been confirmed that the proposed Noff computing can resume the previous task when a context switches back to it, increases PLE use efficiency, and reduces the power consumption by 27.7% at operating frequencies of 20 MHz with a driving voltage of 2.5 V.

Keywords: CAAC-IGZO; crystalline IGZO; crystalline oxide semiconductor; field programmable gate array; multicontext; nonvolatile memory; normally-off computing; oxide semiconductor; power gating; shadow register

An Architecture for Coexistence with Multiple Users in Frequency Hopping Cognitive Radio Networks

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 2014, Vol. 32, No. 3, pp. 563-571

Authors: McLean, Ryan K.; Silvius, Mark D.; Hopkinson, Kenneth M.; Flatley, Bridget N.; Hennessey, Ethan S.; Medve, Curtis C.; Thompson, Jared J.; Tolson, Matthew R.; Dalton, Clark V. Affiliations: Air Force Inst Technol Dept Elect & Comp Engn Wright Patterson AFB OH 43433 USA

The radio frequency (RF) spectrum is a limited resource. Spectrum allotment disputes stem from this scarcity as many radio devices are confined to a fixed frequency. One alternative is to incorporate reconfigurability within a cognitive radio platform, thereby enabling the radio to adapt to dynamic RF spectrum environments. In this way, the radio is able to actively observe the RF spectrum, orient itself to the current RF environment, decide on a mode of operation, and act accordingly, thereby sharing the spectrum and operating in a more flexible manner. This research presents a novel architecture for the purpose of adapting radio operation to the current RF spectrum environment. Specifically, this research makes three contributions: (1) a framework for testing and evaluating clustering algorithms in the context of cognitive radio networks, (2) a new RF spectrum map merging technique for adaptive waveform selection, with initial integration testing on a field-programmable gate array (FPGA), and (3) a novel cognitive radio network emulation framework for testing and evaluating totally-ordered multicast as a means for inter-node communication.

Keywords: Cognitive radio; adaptive waveform; clustering; radio environment map; multicast; field programmable gate array

FPGA-Accelerated Quantum Computing Emulation and Quantum Key Distillation

IEEE MICRO, 2021, Vol. 41, No. 4, pp. 49-57

Authors: Li, He; Pang, Yaru. Affiliations: Univ Cambridge Cambridge CB2 1TN England; UCL London WC1E 6BT England

In the past decades, field-programmable gate arrays (FPGAs) have demonstrated an interesting physical platform to facilitate quantum information processing, particularly in the emergence of domain-specific hardware accelerators for quantum computing emulation and quantum key distillation. While conventional general-purpose hardware platforms have been used for quantum information processing, FPGAs promise deep pipeline parallelism, adaptable interface, and trivial support for custom-precision operation. Therefore, the time is ripe for describing recent development of quantum computing emulators and quantum key distillation accelerators on FPGAs. In this article, we provide a comprehensive review of the state-of-the-art in this active field, with a balance between theoretical, implementational, and technological results. Challenges and promising research opportunities are also discussed.

Keywords: field programmable gate arrays; Quantum Computing; Quantum Cryptography; Quantum Information Processing; FPGAs; Quantum Computing Emulators; Quantum Key Distillation Accelerators; FPGA Accelerated Quantum Computing Emulation; Interesting Physical Platform; Domain Specific Hardware Accelerators; General Purpose Hardware Platforms; Qubit; Logic gates; Hardware; Computational Modeling; Process Control; Hardware Emulation
