Search results - Inner Mongolia University Library

VLSI Architectures and Hardware Implementation of Ultra Low-Latency and Area-Efficient Pietra-Ricci Index Detector for Spectrum Sensing

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2024, vol. 71, no. 5, pp. 2348-2361

Authors: Pereira, Elivander Judas Tadeu; Guimaraes, Dayan Adionel; Shrestha, Rahul. Affiliations: Inatel, Natl Inst Telecommun, BR-37540000 Santa Rita do Sapuca, Brazil; IIT Mandi, Sch Comp & Elect Engn, Mandi 175005, India

The Pietra-Ricci index detector (PRIDe) has been recently proposed as one of the simplest techniques for centralized, data-fusion cooperative spectrum sensing, attaining robustness against time-varying signal and noise levels, constant false alarm rate, and high detection power. In this paper, we propose the design and implementation of the PRIDe detector, targeting field programmable gate array (FPGA) and application-specific integrated circuit (ASIC) solutions. Novel approaches are proposed for computing the PRIDe's test statistic, including the absolute value of complex quantities, the complex multiplier-accumulator, and the spectrum occupancy decision. The absolute value operation, which is critical to the PRIDe test statistic computational cost, applies the coordinate rotation digital computer (CORDIC) algorithm as a low latency and resource-efficient option. Register transfer level (RTL) and Monte Carlo simulations show that the resulting ultra-low latency PRIDe detector architectures attain no performance loss with respect to floating-point simulations. One of the two proposed ASIC design versions of the PRIDe sensor occupies 34.9% lower area compared to the most area-efficient sensor reported in literature, whereas the other one is $5.7\times$ faster than the fastest state-of-the-art sensor. In a nutshell, the proposed detector architecture delivers the highest area and power efficiencies, considering the scaled values of area-time product (ATP) and power-delay product (PDP) metrics, in comparison to implementations reported to date.

Keywords: Cognitive radio; coordinate rotation digital computer; field programmable gate array; application-specific integrated circuit; Pietra-Ricci index detector; spectrum sensing
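
The abstract above names CORDIC as the method for the absolute value of complex quantities. As a rough illustration only, the following is a minimal software sketch of generic vectoring-mode CORDIC, not the architecture proposed in the paper. The function name `cordic_abs` and the parameters `iterations` and `frac_bits` are hypothetical choices made for this example.

```python
import math

def cordic_abs(re, im, iterations=16, frac_bits=16):
    """Approximate |re + j*im| using vectoring-mode CORDIC (illustrative sketch)."""
    # Convert to scaled integers to mimic fixed-point hardware.
    # Taking |re| folds the input into the right half-plane without
    # changing the magnitude, so the iteration converges.
    x = abs(int(round(re * (1 << frac_bits))))
    y = int(round(im * (1 << frac_bits)))
    for i in range(iterations):
        # Rotate the vector toward the x-axis using only shifts and adds.
        if y >= 0:
            x, y = x + (y >> i), y - (x >> i)
        else:
            x, y = x - (y >> i), y + (x >> i)
    # The rotations stretch the vector by a constant gain (about 1.6468),
    # which is divided out once at the end.
    gain = math.prod(math.sqrt(1.0 + 2.0 ** (-2 * i)) for i in range(iterations))
    return x / gain / (1 << frac_bits)

print(cordic_abs(3.0, 4.0))  # close to 5.0
```

In hardware the gain correction is typically a single constant multiplication, which is part of why the method avoids explicit squaring and square-root units.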

ROSETTA: A Resource and Energy-Efficient Inference Processor for Recurrent Neural Networks Based on Programmable Data Formats and Fine Activation Pruning

IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2023, vol. 11, no. 3, pp. 650-663

Authors: Kim, Jiho; Kim, Tae-Hwan. Affiliation: Korea Aerosp Univ, Sch Elect & Informat Engn, Goyang 10540, Gyeonggi Do, South Korea

Recurrent neural networks (RNNs) are extensively employed to perform inference based on the temporal features of the input data. However, the computational workload and power consumption involved in inference are prohibitively high in practice, which makes high-speed inference difficult on devices with tight limits on silicon resources and power supply. This paper presents an efficient inference processor for RNNs, named ROSETTA. ROSETTA supports multiple data formats programmable for each vector operand to achieve a wide range or high precision with a limited data size. ROSETTA consistently performs every vector operation based on homogeneous processing units with a high utilization rate. Moreover, ROSETTA skips operations and reduces memory accesses to achieve high energy efficiency by pruning the activation elements in a fine-grained manner. Implemented in a low-cost 28 nm field-programmable gate array, ROSETTA exhibits a resource and energy efficiency as high as 2.51 - 1.14 MOP/s/LUT and 434.01 - 113.29 GOP/s/W, respectively, while producing near-floating-point inference results. The resource and energy efficiency of ROSETTA are higher than those of the previous processor implemented in the same device by up to 206.1% and 304.0%, respectively. The functionality has been verified for several RNN models of various types under a fully-integrated inference system.

Keywords: Logic gates; field programmable gate arrays; Speech recognition; Energy efficiency; Convolutional neural networks; Recurrent neural networks; Integrated circuit modeling; Accelerator; field programmable gate array; inference; microarchitecture; recurrent neural networks
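
The abstract above describes skipping operations by pruning activation elements in a fine-grained manner. The sketch below shows the general idea in software, assuming a simple magnitude threshold; it is not ROSETTA's actual pruning rule or datapath. The name `pruned_matvec` and the `threshold` parameter are hypothetical.

```python
def pruned_matvec(weights, activations, threshold=0.0):
    """Matrix-vector product that skips activations at or below a threshold.

    weights is a list of rows; activations is a flat list. Returns the
    output vector and the number of activation elements that were skipped.
    """
    rows = len(weights)
    out = [0.0] * rows
    skipped = 0
    for j, a in enumerate(activations):
        if abs(a) <= threshold:
            # Pruned element: no multiplications, and column j of the
            # weight matrix never needs to be fetched.
            skipped += 1
            continue
        for i in range(rows):
            out[i] += weights[i][j] * a
    return out, skipped

result, skipped = pruned_matvec([[1, 2, 3], [4, 5, 6]], [1.0, 0.0, 2.0])
print(result, skipped)
```

Iterating column by column is a design choice made here so that one pruned activation removes a whole column of weight reads, which is where the memory-access saving would come from.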

FPGA Implementation of Image Registration Using Accelerated CNN

SENSORS, 2023, vol. 23, no. 14, p. 6590

Authors: Aydin, Seda Guzel; Bilge, Hasan Sakir. Affiliations: Bingol Univ, Dept Elect & Elect Engn, TR-12000 Bingol, Turkiye; Gazi Univ, Biomed Calibrat & Res Ctr BIYOKAM, TR-06560 Ankara, Turkiye

Background: Accurate and fast image registration (IR) is critical during surgical interventions where the ultrasound (US) modality is used for image-guided intervention. Convolutional neural network (CNN)-based IR methods have resulted in applications that respond faster than traditional iterative IR methods. However, general-purpose processors are unable to operate at the maximum speed possible for real-time CNN algorithms. Due to its reconfigurable structure and low power consumption, the field programmable gate array (FPGA) has gained prominence for accelerating the inference phase of CNN applications. Methods: This study proposes an FPGA-based ultrasound IR CNN (FUIR-CNN) to regress three rigid registration parameters from image pairs. To speed up the estimation process, the proposed design makes use of fixed-point data and parallel operations carried out by unrolling and pipelining techniques. Experiments were performed on three US datasets in real time using the xc7z020, and the xcku5p was also used during implementation. Results: The FUIR-CNN produced results for the inference phase 139 times faster than the software-based network, with a negligible drop in regression performance, at a clock frequency of 200 MHz. Conclusions: Comprehensive experimental results demonstrate that the proposed end-to-end FPGA-based accelerated CNN achieves a negligible loss, a high speed for registration parameters, less power when compared to the CPU, and the potential for real-time medical imaging.

Keywords: accelerated CNN; field programmable gate array; image registration; ultrasound
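
The abstract above mentions the use of fixed-point data to speed up estimation. As a hedged illustration of what that conversion involves, the sketch below quantizes a real value to a signed fixed-point integer with saturation. The word length (16 bits) and fraction length (8 bits) are assumptions for the example; the abstract does not state the formats used in FUIR-CNN, and the names `to_fixed` and `from_fixed` are hypothetical.

```python
def to_fixed(value, total_bits=16, frac_bits=8):
    """Quantize a real value to a signed fixed-point integer, saturating on overflow."""
    scale = 1 << frac_bits
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    quantized = int(round(value * scale))
    # Clamp instead of wrapping around, as saturating hardware would.
    return max(lowest, min(highest, quantized))

def from_fixed(quantized, frac_bits=8):
    """Convert a fixed-point integer back to a real value."""
    return quantized / (1 << frac_bits)

q = to_fixed(1.5)
print(q, from_fixed(q))
```

Fewer fraction bits save logic and memory at the cost of a larger rounding error, which is the trade-off behind the "negligible drop" reported in the abstract.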

FPGA-based Implementation of a Resource-Efficient UNET Model for Brain Tumour Segmentation

INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2024, vol. 15, no. 1, pp. 622-630

Authors: Neiso, Modise Kagiso; Muchuka, Nicasio Maguu; Mambo, Shadrack Maina. Affiliations: PAUSTI, Dept Elect & Elect Engn, Juja, Kenya; Egerton Univ, Dept Elect & Control Engn, Nakuru, Kenya; Walter Sisulu Univ, Elect Engn Dept, Ibika, South Africa

In this study, an optimized UNET model is used for FPGA-based inference in the context of brain tumour segmentation using the BraTS dataset. The presented model features reduced depth and fewer filters, tailored to enhance efficiency on FPGA hardware. The implementation leverages High-Level Synthesis for Machine Learning (HLS4ML) to optimize and convert a Keras-based UNET model to Hardware Description Language (HDL) in the Kintex Ultrascale (xcku085flva1517-3-e) FPGA. Resource strategy, First In First Out (FIFO) depth optimization, and precision adjustment were employed to optimize FPGA resource utilization. Resource strategy is demonstrated to be effective, with resource utilization reaching a saturation point at a reuse factor of 1000. Following FIFO optimization, significant reductions are observed, including a 55 percent decrease in Block RAM (BRAM) usage, a 43 percent reduction in Flip-Flops (FF), and a 49 percent reduction in LookUp Tables (LUT). In C/RTL co-simulation, the proposed FPGA-based UNET model achieves an Intersection over Union (IoU) score of 74 percent, demonstrating comparable segmentation accuracy to the original Keras model. These findings underscore the viability of the optimized UNET model for efficient brain tumour segmentation on FPGA platforms.

Keywords: UNET; field programmable gate array; high-level synthesis for machine learning; brain tumour segmentation
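
The abstract above reports segmentation quality as an Intersection over Union (IoU) score. For readers unfamiliar with the metric, here is a minimal sketch for binary masks stored as flat lists. The function name `iou` is hypothetical, and returning 1.0 when both masks are empty is a convention assumed for this example, not something stated in the abstract.

```python
def iou(pred, target):
    """Intersection over Union of two binary masks given as flat sequences."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0

print(iou([1, 1, 0, 0], [1, 0, 1, 0]))
```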

A Priori-Knowledge-Free Real-Valued Capon-Like Method and Implementation on FPGA

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2024, vol. 71, no. 12, pp. 6532-6543

Authors: Chen, Yili; Fu, Zhe; Li, Jianzhong. Affiliation: Guangdong Univ Technol, Sch Integrated Circuits, Guangzhou 510000, Guangdong, Peoples R China

Improving real-time computational efficiency is a major research direction in Direction-Of-Arrival (DOA) estimation. In this paper, a novel computationally efficient real-valued DOA estimator is presented, in which the estimation is performed without the need for EigenValue Decomposition (EVD) and therefore avoids estimating the source number in advance. Following the comparison between the traditional MUSIC algorithm and the Capon Method, we present a general form of DOA estimation, which reveals that the construction of the noise subspace in the traditional MUSIC algorithm derives from the activation function performed on the eigenvalues. Unlike the classic subspace-based algorithm, our proposed activation-like function eliminates the reliance on subspace decomposition, thereby removing the need for source number estimation and mitigating performance degradation caused by incorrect estimations. Moreover, existing real-valued DOA algorithms would estimate both the true DOAs and their corresponding mirror DOAs, and the space-shifting property is used to eliminate the mirror DOAs. In addition, the field programmable gate array (FPGA) implementation for our proposed real-valued algorithm is developed, showing a dramatic reduction of the hardware resource consumption and computation burden compared with the complex-valued MUSIC. Experiments illustrate that our proposed algorithm is computationally more efficient, and achieves higher estimation resolution compared to the existing methods.

Keywords: Direction-of-arrival estimation; Estimation; Multiple signal classification; Covariance matrices; field programmable gate arrays; Signal processing algorithms; Computational efficiency; Direction-of-arrival (DOA) estimation; subspace method; field programmable gate array; real-valued computation
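
The comparison between MUSIC and the Capon method mentioned in the abstract above can be illustrated with the standard textbook forms of the two spectra. The notation here is an assumption for illustration and is not taken from the paper: $\mathbf{R}$ is the array covariance matrix, $\mathbf{a}(\theta)$ the steering vector, $M$ the number of sensors, and $K$ the number of sources. The paper's own activation-like function is not given in the abstract and is not reproduced here.

```latex
% Eigen-decomposition of the covariance matrix, eigenvalues in descending order
\mathbf{R} = \sum_{i=1}^{M} \lambda_i \mathbf{u}_i \mathbf{u}_i^{H},
\qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_M

% A common form covering both estimators: f acts on the eigenvalues
P(\theta) = \frac{1}{\mathbf{a}^{H}(\theta)
  \left( \sum_{i=1}^{M} f(\lambda_i)\, \mathbf{u}_i \mathbf{u}_i^{H} \right)
  \mathbf{a}(\theta)}

% Capon: f(\lambda_i) = 1/\lambda_i, so the bracket equals \mathbf{R}^{-1}
% MUSIC: f(\lambda_i) = 0 for i \le K and 1 for i > K,
%        so the bracket equals the noise-subspace projector and K must be known
```

Seen this way, MUSIC applies a hard step to the eigenvalues that depends on knowing $K$, whereas Capon applies a smooth function that does not, which is consistent with the abstract's remark about the noise subspace arising from a function applied to the eigenvalues.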

RF Drone Detection System Based on a Distributed Sensor Grid With Remote Hardware-Accelerated Signal Processing

IEEE ACCESS, 2023, vol. 11, pp. 138759-138772

Authors: Flak, Przemyslaw; Czyba, Roman. Affiliation: Silesian Tech Univ, Fac Automat Control Elect & Comp Sci, Dept Automat Control & Robot, PL-44100 Gliwice, Poland

Unmanned Aerial Vehicles (UAVs), sometimes known as drones, have evolved from military to civilian applications, opening up novel perspectives in a variety of everyday services. The rapidly growing consumer interest in amateur drones equipped with high-end cameras compromises the everyday safety and privacy of people. In the literature, a variety of sensing techniques based on different physical phenomena have been proposed for drone detection. Among acoustic, optical, or radar detection systems, passive radiofrequency sensing is the only one that can identify a drone even before it takes off and additionally indicate the operator's location. A spectrogram-based method is developed and optimised in terms of computing location, resulting in the possibility of sensor grid deployment over a standard Ethernet network. The detection phase involves hardware-accelerated energy sensing to extract the data frames from the background noise. Drone presence is then identified using machine learning based solely on preamble pattern recognition, which reduces the computational effort. The presented procedure is evaluated in an isolated setting employing an open-source dataset and tuned across multiple neural network architectures. Next, the complete sensor processing chain is examined in a real-life scenario. The analytical energy detector stage reaches a margin of roughly -8.7 dB in the signal-to-noise ratio (SNR). With 1.1 M parameters, the proposed neural network achieves 99.93% simulation accuracy in up to -9.5 dB SNR range. Even after quantization for embedded platform implementation, the device can be used as a stand-alone early intrusion detector or as part of a distributed sensor grid.

Keywords: Convolutional neural network; drones; field programmable gate array; software defined radio; spectrogram; surveillance; unmanned aerial vehicles

A Novel Ergodic Cellular Automaton Model of Gene-Protein Network: Theoretical Nonlinear Analyses and Efficient FPGA Implementation

IEEE ACCESS, 2023, vol. 11, pp. 300-312

Authors: Shirafuji, Shogo; Torikai, Hiroyuki. Affiliation: Hosei Univ, Grad Sch Sci & Engn, Koganei, Tokyo 1848584, Japan

A novel ergodic cellular automaton model of gene-protein network is presented. It is shown that the presented model can predict occurrences of typical nonlinear phenomena of a conventional ordinary differential equation gene-protein network model. In addition, theoretical analysis methods of the presented model are proposed. Using the analysis methods, an important advantage of the presented model is revealed: the ergodic cellular automaton is better suited to predict the occurrences of the nonlinear phenomena of the differential equation gene-protein network model compared to a regular (standard) cellular automaton. Furthermore, the presented model is implemented by a field programmable gate array and experiments validate its operations. It is then revealed that the presented model is much more hardware-efficient compared to a standard numerical integration formula of the differential equation model.

Keywords: Integrated circuit modeling; Biological system modeling; Mathematical models; Automata; Bifurcation; Orbits; Computational modeling; Gene-protein network; cellular automaton; nonlinear dynamics; bifurcation phenomena; field programmable gate array

FPGA Implementation of Efficient CFAR Algorithm for Radar Systems

SENSORS, 2023, vol. 23, no. 2, p. 954

Authors: Sim, Yunseong; Heo, Jinmoo; Jung, Yongchul; Lee, Seongjoo; Jung, Yunho. Affiliations: Korea Aerosp Univ, Sch Elect & Informat Engn, Goyang Si 10540, South Korea; Korea Aerosp Univ, Dept Smart Air Mobil, Goyang Si 10540, South Korea; Korea Elect Technol Inst KETI, Seongnam 13509, South Korea; Sejong Univ, Dept Informat & Commun Engn, Seoul 05006, South Korea; Sejong Univ, Dept Convergence Engn Intelligent Drone, Seoul 05006, South Korea

The constant false-alarm rate (CFAR) algorithm is essential for detecting targets during radar signal processing. It has been improved to accurately detect targets, especially in nonhomogeneous environments, such as multitarget or clutter edge environments. For example, there are sort-based and variable index-based algorithms. However, these algorithms require large amounts of computation, making them difficult to apply in radar applications that require real-time target detection. We propose a new CFAR algorithm that determines the environment of a received signal through a new decision criterion and applies the optimal CFAR algorithms such as the modified variable index (MVI) and automatic censored cell averaging-based ordered data variability (ACCA-ODV). The Monte Carlo simulation results of the proposed CFAR algorithm showed a high detection probability of 93.8% in homogeneous and nonhomogeneous environments based on an SNR of 25 dB. In addition, this paper presents the hardware design, field-programmable gate array (FPGA)-based implementation, and verification results for the practical application of the proposed algorithm. We reduced the hardware complexity by time-sharing sum and square operations and by replacing division operations with multiplication operations when calculating decision parameters. We also developed a low-complexity and high-speed sorter architecture that performs sorting for the partial data in leading and lagging windows. As a result, the implementation used 8260 LUTs and 3823 registers and took 0.6 μs to operate. Compared with the previously proposed FPGA implementation results, it is confirmed that the complexity and operation speed of the proposed CFAR processor are very suitable for real-time implementation.

Keywords: radar signal processing; constant false alarm rate; target detection; field programmable gate array; automotive radar; drone detection radar
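
The abstract above refers to leading and lagging windows and to replacing division with multiplication. The sketch below illustrates both ideas with plain cell-averaging CFAR, which is a simpler baseline and not the MVI or ACCA-ODV variants that the paper selects between. The name `ca_cfar` and the parameters `num_train`, `num_guard` and `scale` are hypothetical values chosen for this example.

```python
def ca_cfar(power, num_train=4, num_guard=1, scale=5.0):
    """Return indices of cells flagged as targets by cell-averaging CFAR.

    num_train and num_guard are counted per side of the cell under test.
    """
    half = num_train + num_guard
    detections = []
    for k in range(half, len(power) - half):
        # Training cells on each side, excluding the guard cells.
        leading = power[k - half : k - num_guard]
        lagging = power[k + num_guard + 1 : k + half + 1]
        noise_sum = sum(leading) + sum(lagging)
        count = len(leading) + len(lagging)
        # Division-free form of: power[k] > scale * noise_sum / count
        if power[k] * count > scale * noise_sum:
            detections.append(k)
    return detections

samples = [1.0] * 20
samples[10] = 50.0
print(ca_cfar(samples))
```

Multiplying both sides by the training-cell count removes the divider from the comparison, which is the same kind of substitution the abstract describes for its decision parameters.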

Low area FPGA implementation of modified histogram estimation architecture with CSAC-DPROM-OBC for medical image enhancement application

INTERNATIONAL JOURNAL OF NANOTECHNOLOGY, 2023, vol. 20, nos. 1-4, pp. 259-280

Authors: Bonagiri, Koteswar Rao; Kande, Giri Babu; Reddy, P. Chandrasekhar. Affiliations: Jawaharlal Nehru Technol Univ, Dept Elect & Commun Engn, Hyderabad 500085, India; Marrilaxman Reddy Inst Technol & Management, Domara Pocham Pally 500043, Telangana, India; Vasireddy Venkatadri Inst Technol, Dept Elect & Commun Engn, Namburu 522508, Andhrapradesh, India

In this work, a modified histogram estimation (MHE) architecture is proposed to verify the histogram count on the FPGA platform, and the basic HE (BHE) architecture is also implemented for comparison. The proposed MHE architecture is newly developed to reduce the logic elements involved in the HE process. In the MHE architecture, a dual port read only memory (DPROM), a carry select adder based counter (CSAC), and an optimal bin counter (OBC) are used to evaluate the HE count with effective accuracy. The 256-sample MHE reduces area, power and delay by 17.62%, 15.41% and 23.01%, respectively. Additionally, the performance of the proposed MHE is compared with four existing methods: HOG, HBS, MBPA and DMH. The MHE architecture uses 2177 flip-flops on a Virtex-6 device, fewer than HOG and MBPA.

Keywords: area; basic histogram estimation; CSAC; carry select adder based counter; delay; DPROM; dual port read only memory; field programmable gate array; medical image enhancement; MHE; modified histogram estimation; optimal bin counter; power

Low-Complexity FPGA Implementation of 106.24Gbps DP-QPSK Coherent Optical Receiver With Fractional Oversampling Rate Based on One FIR Filter for Resampling, Retiming and Equalizing

JOURNAL OF LIGHTWAVE TECHNOLOGY, 2023, vol. 41, no. 16, pp. 5244-5251

Authors: Song, Jingwei; Li, Yan; Qiu, Jifang; Hong, Xiaobin; Guo, Hongxiang; Yang, Zhisheng; Wu, Jian. Affiliation: Beijing Univ Posts & Telecommun, State Key Lab Informat Photon & Opt Commun, Beijing 100876, Peoples R China

A novel low-complexity combined resampling, retiming and equalizing (RRE) algorithm is proposed. The RRE algorithm uses a single FIR filter for resampling, retiming and equalizing, thus lowering the complexity. In the numerical simulation, with an oversampling rate of 32/27, compared to the traditional time-domain scheme with a 15-tap CMA equalizer and the frequency-domain scheme based on a 256-point FFT, the RRE algorithm with a 15-tap RRE filter lowers the error vector magnitude (EVM) by 0.036 dB and 0.043 dB and lowers the complexity by 48.3% and 31.9%, respectively. In the offline experiment, with a received optical power of -35 dBm, compared to the same two schemes, the RRE algorithm with a 15-tap RRE filter lowers the EVM by 0.26 dB and 0.36 dB and lowers the complexity by 48.3% and 31.9%, respectively. The RRE algorithm also enables a real-time 106.24 Gbps (26.56 GBaud) DP-QPSK coherent optical receiver based on a single FPGA chip using four 6-bit ADCs with a sampling rate of ~31.48 GSa/s. The FPGA-based receiver achieves a sensitivity of -34 dBm at a BER of 1E-3. As far as we know, this is the highest reported bit rate of a coherent receiver based on a single FPGA chip.

Keywords: Clock recovery; coherent optical communication; digital signal processing; field programmable gate array
