Search Results - Inner Mongolia University Library

IEEE 31st International Symposium on Industrial Electronics (ISIE)

Authors: Bosson, Serge Pacome; Janous, Stepan; Kosan, Tomas; Peroutka, Zdenek. Univ West Bohemia, Fac Elect Engn, Plzen, Czech Republic; Univ West Bohemia, Res & Innovat Ctr Elect Engn, Plzen, Czech Republic

ISBN: (print) 9781665482400

The paper deals with the design, implementation, and experimental validation of embedded drive-system diagnostics for an interior permanent magnet synchronous motor drive in safety-critical applications. The drive is intended for traction applications; it therefore uses a combination of a digital signal processor (DSP) and a field programmable gate array (FPGA), as is often the case in modern industrial drives. Real-time harmonic monitoring is employed to indicate the development or existence of a fault within the drive system. This is achieved by embedding real-time measurement and control within the DSP, operating in conjunction with the motor control, to estimate the machine's transient reactance after applying an excitation with voltage pulses using the switching of the inverter, and a real-time frequency analysis of that parameter within the FPGA, operating independently of the motor control. Experimental testing is used to validate the proposed condition monitoring algorithm on a laboratory prototype of an interior permanent magnet synchronous motor drive with a rated power of 4.5 kW.

Keywords: fault detection; digital signal processors; field programmable gate array; interior permanent magnet synchronous motor; real-time systems; pulse-width modulated inverter; transient excitation; fast Fourier transform; hardware FFT processor; spectral analysis

FPGA Implementation of Phase Recovery Technique for Complex Transforms

IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS)

Authors: Bhaskar, Poorvi; Yuvaraj, S.; Palanisamy, P.; Thilagavathy, R. SRA Inst Sci & Technol, Dept ECE, Chennai, Tamil Nadu, India; NIT, Dept ECE, Trichy, Tamil Nadu, India

ISBN: (print) 9781665486842

ECG signals are among the most important signals for checking the condition of the human heart. Continuous heart monitoring produces a large amount of ECG signal data, so efficient compression techniques are needed. The Discrete Anamorphic Stretch Transform (DAST) is one of the most efficient techniques. It is a one-dimensional complex transform that includes a phase recovery technique for recovering the phase from the magnitudes. This paper deals with implementing the phase recovery block in a field programmable gate array (FPGA). The phase recovery block plays a key role in reconstructing the phases from the magnitudes. First, the required signal is passed through the linear filter, or phase recovery filter. Then the phase value is estimated using a non-iterative algorithm that depends on the linearity and causality conditions. The new approach for the phase recovery block can also be used for any complex signal transmission. The input ECG signal is taken from the MIT-BIH Arrhythmia database, and the implementation is carried out on an Artix-7 NEXYS 4 DDR FPGA board. The performance of the phase recovery block is quantified in terms of hardware and computational complexity.

Keywords: Electrocardiogram; Discrete Anamorphic Stretch Transform; Coordinate Rotation Digital Computer; Phase Recovery; field programmable gate array

RT-FLOW: FPGA Implementation of Real-Time Optical-Flow-Based SLAM for High-Speed Tracking and High-Quality Mapping

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2025

Authors: Li, Mengjie; Zhang, Yiming; He, Siqi; Liu, Qi; Zeng, Xiaoyang; Chen, Chixiao; Zhu, Haozhe. Fudan Univ, Coll Integrated Circuits & Micronano Elect, State Key Lab Integrated Chips & Syst, Shanghai 200438, Peoples R China

Simultaneous Localization and Mapping (SLAM) is pivotal for autonomous robotics, yet feature-based SLAM systems struggle with sparse environmental representations and robustness under dynamic conditions. Optical-flow-based SLAM (OpF-SLAM) addresses these limitations by leveraging pixel-level motion data for dense mapping; however, its computational intensity hinders real-time deployment. This paper presents RT-FLOW, an FPGA-based accelerator for OpF-SLAM that achieves real-time performance through three key innovations: 1) A feature-context encoding engine that exploits inter-frame similarity to resolve data dependency in correlation construction, reducing latency by 77.5%. 2) A heterogeneous mixed-precision flow update engine guided by correlation sparsity, enabling 3.7x faster optical flow computation with negligible accuracy loss. 3) A pivoting-free linear solver using Householder transformations for stable pose optimization. Implemented on a Xilinx XCZU7EV FPGA, RT-FLOW processes full-image pixels per frame at 65 fps with an energy efficiency of 0.358 μJ/point, outperforming previous FPGA designs. Evaluated on benchmark datasets, RT-FLOW demonstrates robustness in diverse environments while maintaining sub-110 mJ/frame energy consumption. This work bridges the gap between algorithmic potential and hardware feasibility for high-density SLAM, empowering next-generation mobile robots with real-time scene understanding capabilities.

Keywords: Simultaneous localization and mapping; field programmable gate array; optical flow; hardware accelerator; deep neural network
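
The third innovation listed in the abstract above, a pivoting-free linear solver based on Householder transformations, can be illustrated with a small software sketch. The code below is a generic textbook Householder QR least-squares solve in plain Python, not the RT-FLOW hardware design; the function name `householder_solve` and the dense list-of-lists matrix layout are assumptions made only for this illustration.

```python
import math

def householder_solve(a, b):
    """Solve min ||a x - b|| by Householder QR without any row/column pivoting.

    a: m x n matrix as a list of rows (m >= n, assumed full column rank).
    b: right-hand side of length m.
    """
    m, n = len(a), len(a[0])
    r = [row[:] for row in a]  # working copy, reduced to upper-triangular R
    y = b[:]                   # transformed in step with r, ends as Q^T b
    for k in range(n):
        # Norm of column k from the diagonal downwards.
        norm_x = math.sqrt(sum(r[i][k] ** 2 for i in range(k, m)))
        if norm_x == 0.0:
            raise ValueError("matrix is rank deficient")
        # Choose the sign that avoids cancellation when forming v.
        alpha = -math.copysign(norm_x, r[k][k])
        v = [r[i][k] for i in range(k, m)]
        v[0] -= alpha
        vtv = sum(vi * vi for vi in v)
        # Apply the reflection H = I - 2 v v^T / (v^T v) to the remaining columns.
        for j in range(k, n):
            s = 2.0 * sum(v[i] * r[k + i][j] for i in range(len(v))) / vtv
            for i in range(len(v)):
                r[k + i][j] -= s * v[i]
        # Apply the same reflection to the right-hand side.
        s = 2.0 * sum(v[i] * y[k + i] for i in range(len(v))) / vtv
        for i in range(len(v)):
            y[k + i] -= s * v[i]
    # Back substitution on the top n x n triangle of R.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        acc = y[i] - sum(r[i][j] * x[j] for j in range(i + 1, n))
        x[i] = acc / r[i][i]
    return x
```

Because each reflection is orthogonal, a zero on the diagonal (for example the matrix `[[0, 1], [1, 0]]`) needs no row exchange, which is the property that makes this family of solvers attractive for fixed hardware dataflows.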

Accelerating Deep Neuroevolution on Distributed FPGAs for Reinforcement Learning Problems

ACM JOURNAL ON EMERGING TECHNOLOGIES IN COMPUTING SYSTEMS, 2021, vol. 17, no. 2, pp. 1-17

Authors: Asseman, Alexis; Antoine, Nicolas; Ozcan, Ahmet S. IBM Almaden Res Ctr, 650 Harry Rd, San Jose, CA 95120, USA

Reinforcement learning, augmented by the representational power of deep neural networks, has shown promising results on high-dimensional problems, such as game playing and robotic control. However, the sequential nature of these problems poses a fundamental challenge for computational efficiency. Recently, alternative approaches such as evolutionary strategies and deep neuroevolution demonstrated competitive results with faster training time on distributed CPU cores. Here we report record training times (running at about 1 million frames per second) for Atari 2600 games using deep neuroevolution implemented on distributed FPGAs. Combined hardware implementation of the game console, image preprocessing and the neural network in an optimized pipeline, multiplied with the system level parallelism enabled the acceleration. These results are the first application demonstration on the IBM Neural Computer, which is a custom designed system that consists of 432 Xilinx FPGAs interconnected in a 3D mesh network topology. In addition to high performance, experiments also showed improvement in accuracy for all games compared to the CPU implementation of the same algorithm.

Keywords: Genetic algorithm; field programmable gate array; neuroevolution; reinforcement learning; artificial neural network

Efficient FPGA based architecture for high-order FIR filtering using simultaneous DSP and LUT reduced utilization

IET CIRCUITS DEVICES & SYSTEMS, 2021, vol. 15, no. 5, pp. 475-484

Authors: Maamoun, Mountassar; Hassani, Adnane; Dahmani, Samir; Saadi, Hocine Ait; Zerari, Ghania; Chabini, Noureddine; Beguenane, Rachid. Univ Blida, Dept Elect, LATSI Lab, Blida, Algeria; ENS Kouba, LSIC Lab, Dept Phys, Algiers, Algeria; Royal Mil Coll Canada, Dept Elect & Comp Engn, Kingston, ON, Canada

This paper proposes an efficient high-order finite impulse response (FIR) filter structure for field programmable gate array (FPGA)-based applications with simultaneously reduced digital signal processing (DSP) and look-up-table (LUT) utilization. The real-time updating of the filter coefficients is also considered. To achieve these objectives, both the speed and the structure of the FPGA are efficiently exploited. The gap between the required input sampling frequency and the maximum frequency allowed by the FPGA is used to run additional computing sequences. Furthermore, the special structures of the FPGA Look-up-table Shift-Register (LUT-SR) and their internal connections are fully employed for pipelining and selecting the input samples. The FPGA Block RAMs (BRAMs) are employed for handling the reconfigurable filter coefficients, and the FPGA DSP slices are used for computing on the output data of the BRAMs and the multiplexers. To synchronize the BRAM unit addressing with the LUT multiplexer selection, a single unit is used for simultaneous control. The obtained results show that the proposed reconfigurable 16-tap FIR filter offers reductions of 79.3% and 74.4% in slice utilization over the hybrid variable size partitioning (VP-Hybrid) based structure and the Radix-2(R) based structure, respectively, when implemented on a Xilinx Spartan-6 XC6SLX45 FPGA. Moreover, an improvement in efficiency is achieved compared to all reputed FPGA-based architectures.

Keywords: LUT-SR; LUT multiplexer selection; reputed FPGA-based architectures; high-order FIR; FIR filters; special structures; random-access storage; simultaneous DSP; efficient FPGA based architecture; reconfigurable 16-tap FIR filter; efficient high-order finite impulse response filter structure; FPGA block RAM; required input sampling frequency; table lookup; simultaneous digital signal processing; high-order FIR filtering; field programmable gate array; LUT reduced utilization; FPGA DSP slices; field programmable gate arrays; digital signal processing chips; look-up-table reduced utilization; reconfigurable filter coefficients; Xilinx Spartan-6 XC6SLX45 FPGA; BRAM unit addressing; FPGA look-up-table shift-register

Evaluation of Tradeoffs in the Design of FPGA Fabrics Using Electrostrictive 2-D FETs

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2021, vol. 29, no. 4, pp. 691-701

Authors: Baskaran, Saambhavi; Sampson, Jack. Penn State Univ, Sch Elect Engn & Comp Sci, Dept Comp Sci & Engn, University Pk, PA 16802, USA

The electrostrictive 2-D field-effect transistor (EFET) is a steep-slope device that promises to offer aggressive length and voltage scalability. Two key features of this device are its high-drive strength with high ON-OFF current ratio and the isolated back-gate terminal, which provides us the fourth knob to control the transistor drive strength. The disadvantage of the technology is the increased device capacitance incurred due to the additional piezoelectric layer in the transistor structure. Second, although the back-gate biasing of EFETs provides us the fourth knob of control, statically biasing the back gate increases the static power consumption. Despite the idiosyncrasies of the technology, this work shows the use of EFETs in field-programmable gate arrays (FPGAs) to be advantageous because the added energy cost of device capacitance gets amortized by the improvement in performance and energy efficiency of using high-drive EFET transistors in the FPGA interconnect architecture. We also show that co-optimization of back-bias voltage along with transduction efficiency is essential in the FPGA subcircuit level for achieving an energy-efficient architecture. This work highlights the specific design approach tradeoffs that differ from prior CMOS approaches and provides guidance for the engineering parameters necessary for EFETs to evolve as a competitive technology.

Keywords: field programmable gate arrays; Capacitance; Logic gates; Strain; Photonic band gap; Delays; Routing; Electrostrictive field-effect technology; field programmable gate array; integrated circuit technology beyond CMOS; two-dimensional materials

Biomimetic FPGA-based spatial navigation model with grid cells and place cells

NEURAL NETWORKS, 2021, vol. 139, pp. 45-63

Authors: Krishna, Adithya; Mittal, Divyansh; Virupaksha, Siri Garudanagiri; Nair, Abhishek Ramdas; Narayanan, Rishikesh; Thakur, Chetan Singh. Indian Inst Sci, Dept Elect Syst Engn, NeuRonICS Lab, Bangalore 560012, Karnataka, India; Indian Inst Sci, Mol Biophys Unit, Cellular Neurophysiol Lab, Bangalore 560012, Karnataka, India

The mammalian spatial navigation system is characterized by an initial divergence of internal representations, with disparate classes of neurons responding to distinct features including location, speed, borders and head direction; an ensuing convergence finally enables navigation and path integration. Here, we report the algorithmic and hardware implementation of biomimetic neural structures encompassing a feed-forward trimodular, multi-layer architecture representing grid-cell, place-cell and decoding modules for navigation. The grid-cell module comprised neurons that fired in a grid-like pattern, and was built of distinct layers that constituted the dorsoventral span of the medial entorhinal cortex. Each layer was built as an independent continuous attractor network with distinct grid-field spatial scales. The place-cell module comprised neurons that fired at one or a few spatial locations, organized into different clusters based on convergent modular inputs from different grid-cell layers, replicating the gradient in place-field size along the hippocampal dorsoventral axis. The decoding module, a two-layer neural network that constitutes the convergence of the divergent representations in preceding modules, received inputs from the place-cell module and provided specific coordinates of the navigating object. After vital design optimizations involving all modules, we implemented the tri-modular structure on a Zynq Ultrascale+ field-programmable gate array silicon chip, and demonstrated its capacity to precisely estimate the navigational trajectory with minimal overall resource consumption involving a mere 2.92% Look Up Table utilization. Our implementation of a biomimetic, digital spatial navigation system is stable, reliable, reconfigurable, real-time with execution time of about 32 s for 100k input samples (in contrast to 40 minutes on Intel Core i7-7700 CPU with 8 cores clocking at 3.60 GHz) and thus can be deployed for autonomous-robotic navigation without

Keywords: Path integration; Autonomous robot navigation; Time-multiplexing; Continuous attractor network; field programmable gate array; Neuromorphic computing

A Self-Attention Network for Deep JSCCM: The Design and FPGA Implementation

IEEE Global Communications Conference (GLOBECOM)

Authors: Fujimaki, Shohei; Inoue, Yoshiaki; Hisano, Daisuke; Maruta, Kazuki; Nakayama, Yu; Hara-Azumi, Yuko. Tokyo Inst Technol, Sch Engn, Tokyo, Japan; Osaka Univ, Grad Sch Engn, Osaka, Japan; Tokyo Univ Sci, Dept Elect Engn, Tokyo, Japan; Tokyo Univ Agr & Technol, Inst Engn, Tokyo, Japan

ISBN: (print) 9781665435406

Deep joint source-channel coding and modulation (JSCCM) is a promising technology for realizing efficient communication in extreme environments such as underwater areas. Previous works have shown that deep convolutional neural networks (CNNs) can successfully learn a JSCCM encoder and decoder, outperforming conventional separation-based coding and modulation schemes in low signal-to-noise ratio settings. This paper proposes a new architecture for deep JSCCM based on the self-attention mechanism. We show that the proposed architecture achieves significant performance improvement compared with the CNN-based schemes while requiring a smaller network size in terms of the number of weight parameters. Furthermore, we present an efficient hardware implementation of the proposed JSCCM encoder on a field programmable gate array (FPGA). In particular, we demonstrate that a systolic-array-like structure is effective for FPGA implementation of the proposed JSCCM scheme based on the self-attention mechanism.

Keywords: Combined source-channel coding; field programmable gate array; image coding; deep learning; underwater communication

Distributed Deep Learning With GPU-FPGA Heterogeneous Computing

IEEE MICRO, 2021, vol. 41, no. 1, pp. 15-22

Authors: Tanaka, Kenji; Arikawa, Yuki; Ito, Tsuyoshi; Morita, Kazutaka; Nemoto, Naru; Terada, Kazuhiko; Teramoto, Junji; Sakamoto, Takeshi. NTT Corp, NTT Device Technol Labs, Atsugi, Kanagawa 2430198, Japan; NTT Corp, NTT Software Innovat Ctr, Tokyo 1808585, Japan

In distributed deep learning (DL), collective communication algorithms such as Allreduce, used to share training results between graphical processing units (GPUs), are an inevitable bottleneck. We hypothesize that the cache access latency incurred at every Allreduce is a significant bottleneck in current computational systems with high-bandwidth interconnects for distributed DL. To reduce how often this latency is incurred, it is important to aggregate data at the network interfaces. We implement a data aggregation circuit in a field-programmable gate array (FPGA). Using this FPGA, we propose a novel Allreduce architecture and training strategy without accuracy degradation. Measurement results show that Allreduce latency is reduced to 1/4. Our system can also conceal about 90% of the communication overhead and improve scalability by 20%. The end-to-end training time in distributed DL with ResNet-50 and ImageNet is reduced to 87.3% without any degradation in validation accuracy.

Keywords: Cache Storage; Data Aggregation; Deep Learning; Artificial Intelligence; field programmable gate arrays; Graphics Processing Units; Neural Chips; Distributed Deep Learning; GPU FPGA Heterogeneous Computing; Collective Communication Algorithms; Graphical Processing Units; Cache Access Latency; High Bandwidth Interconnects; Network Interfaces; field programmable gate array; Novel Allreduce Architecture; Communication Overhead; Distributed DL; Training Strategy; Allreduce Latency Reduction; Res Net 50; Image Net; Random Access Memory; Training Data; Distributed Databases; Bandwidth

FPGA based Adaptive Lock-in Amplifier

19th IEEE-India-Council International Conference (INDICON)

Authors: Aparna, A., V; Sukumar, N.; Sumathi, P. Indian Inst Technol, Dept Elect Engn, Roorkee 247667, Uttarakhand, India

ISBN: (print) 9781665473507

An adaptive lock-in amplifier (LIA) is proposed that works for sinusoidal signals in the frequency range of 9-11 kHz and an amplitude range of 0.3-10 V. LIAs can extract useful signals from a very noisy environment. For an adaptive LIA, the reference signal has to be generated by a phase locked loop (PLL) from the incoming signal. The phase and frequency error between the reference and input signals may reduce the accuracy of the LIA system. To eliminate this error, a PLL with an enhanced phase detector is proposed. Using this quadrature PLL (QPLL), the accuracy of the LIA is effectively increased in the designed frequency range. The simulation results show that the proposed model can extract the amplitude of a signal buried in noise and harmonics with a signal-to-noise ratio (SNR) as small as 10 dB. The system-on-chip implementation of the adaptive LIA is carried out on an Altera Stratix III FPGA device. The implementation has been tested with noise as well as harmonics over the designed frequency and amplitude range.

Keywords: Adaptive Lock-in amplifier; field programmable gate array; Quadrature phase locked loop; Phase detector
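
The lock-in principle that the abstract above relies on can be sketched in a few lines of plain Python. This is a minimal illustration of fixed-reference quadrature demodulation only: it assumes the reference frequency is already known, whereas the paper's adaptive LIA derives the reference from the input with a quadrature PLL, which is not modelled here. The function name `lock_in_estimate` and its parameters are hypothetical.

```python
import math

def lock_in_estimate(samples, sample_rate_hz, ref_freq_hz):
    """Estimate amplitude and phase of the component A*cos(2*pi*f*t + phi).

    Assumes ref_freq_hz matches the signal frequency and that the record
    covers a whole number of reference periods, so averaging acts as the
    low-pass filter of the lock-in.
    """
    n = len(samples)
    i_sum = 0.0  # in-phase accumulator (input times cosine reference)
    q_sum = 0.0  # quadrature accumulator (input times sine reference)
    for k, x in enumerate(samples):
        theta = 2.0 * math.pi * ref_freq_hz * k / sample_rate_hz
        i_sum += x * math.cos(theta)
        q_sum += x * math.sin(theta)
    # Mixing halves the amplitude, so scale the averages by 2.
    i_avg = 2.0 * i_sum / n   # equals A*cos(phi) for a clean input
    q_avg = 2.0 * q_sum / n   # equals -A*sin(phi) for a clean input
    amplitude = math.hypot(i_avg, q_avg)
    phase = math.atan2(-q_avg, i_avg)
    return amplitude, phase
```

Harmonics and broadband noise are uncorrelated with the two references, so they average towards zero while the component at the reference frequency survives; a longer record narrows the effective bandwidth and improves noise rejection.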
