The purpose of a communication system is to transmit an information-bearing message signal through a channel that separates a transmitter from a receiver. The modulated carrier is often corrupted by various noise and interference sources. The co-channel separation system is a demodulation function that operates within the same carrier modulation system. Here, we adopted a field-programmable gate array (FPGA) design platform to develop and implement co-channel separation for an amplitude-locked loop demodulation chip-design digital system under additive white Gaussian noise interference. In this paper, a compact reconfigurable I/O system with a built-in FPGA chip is applied to integrate the communication and chip-design functions across the two fields via programming in a graphical language. Additionally, the FPGA chip-design system runs all of the program code in hardware and provides high reliability and determinism. This cross-field idea is adopted to save time and reduce complexity in the design and development of a custom circuitry system. The FPGA chip-design system described in this paper is also used to build a prototype design model of a digital communication chip, followed by a presentation of the steps necessary for building and program verification. The communication and chip-design concept may provide very useful physical applications for the industry.
A new fault detection circuit for on-chip design is presented in this article. The circuit function to detect substation faults has been investigated and verified on an Altera DE1 platform with Cyclone II 2C20 field-programmable gate array. The experimental results showed that the hardware prototyping is feasible for practical applications. Compared to existing fault diagnosis methods, the proposed hardware implementation is more suitable for real-time applications as it is able to achieve high-speed inference. Additionally, the computational burden on host computers in a supervisory control and data acquisition system can thus be reduced through the presented framework.
In this paper, a novel prototype laboratory is presented for engineering education, in which experiments are based on fractional calculus. Prototypes of analog and digital fractional-order proportional-integral-derivative (PID) controllers are built in the laboratory. These fractional-order PID controllers are applied to linear and nonlinear plants to demonstrate the effectiveness of fractional-order calculus in real time. The experiments are designed, developed, and implemented on analog and digital platforms. The controllers are integrated to control the DC motor, brushless DC motor, and magnetic levitation modules through hardware-in-the-loop as well as stand-alone systems. The analog fractional-order PID implementation is carried out using passive components (i.e., resistors and capacitors) with an operational amplifier, whereas the real-time digital implementation is carried out using a field-programmable gate array and a digital signal processor. This paper describes how the experiments on fractional calculus can be tailored for graduate and undergraduate education and extended for research in this emerging area.
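To make the idea of a digital fractional-order PID concrete, the following is a minimal sketch, assuming a Grünwald-Letnikov discretization of the fractional operators. The abstract does not say which approximation the FPGA and DSP implementations use, so the function names, gains, and orders below are hypothetical illustrations rather than the authors' design.

```python
def gl_weights(alpha, n):
    """Grünwald-Letnikov weights w_0..w_n for order alpha (assumed discretization)."""
    w = [1.0]
    for j in range(1, n + 1):
        # Recursion for (-1)^j * binomial(alpha, j)
        w.append(w[-1] * (1.0 - (alpha + 1.0) / j))
    return w


def gl_derivative(samples, alpha, h):
    """Approximate the order-alpha derivative at the newest sample.

    samples: signal history, oldest first, uniformly spaced by h.
    alpha > 0 differentiates, alpha < 0 integrates, alpha = 0 is the identity.
    """
    n = len(samples) - 1
    w = gl_weights(alpha, n)
    acc = sum(w[j] * samples[n - j] for j in range(n + 1))
    return acc / h ** alpha


def frac_pid_output(errors, kp, ki, kd, lam, mu, h):
    """u = Kp*e + Ki*D^(-lam) e + Kd*D^(mu) e, evaluated at the newest error sample."""
    return (kp * errors[-1]
            + ki * gl_derivative(errors, -lam, h)
            + kd * gl_derivative(errors, mu, h))
```

With `lam = mu = 1` the expression reduces to an ordinary discrete PID (rectangular integration plus a backward difference), which is a convenient sanity check before trying non-integer orders.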
The design and implementation of a sparse matrix-matrix multiplication architecture on field-programmable gate arrays is presented. The performance of the design, in terms of computational latency, as well as the associated power-delay and energy-delay tradeoffs, is studied. Taking advantage of the sparsity of the input matrices, the proposed design allows user-tunable power-delay and energy-delay tradeoffs by employing different numbers of processing elements (PEs) in the architecture design and different block sizes in the blocking decomposition. This ability allows designers to employ different on-chip computational architectures for different system power-delay and energy-delay requirements. This is in contrast to conventional dense matrix-matrix multiplication architectures, which always favor the maximum number of PEs and the largest block size. In our implementation, for 90%-sparsity matrix-matrix multiplications, lower energy consumption and a better power-delay product favor fewer PEs and a smaller block size. To achieve a better energy-delay product, however, more PEs and a larger block size are preferred. Copyright (c) 2011 John Wiley & Sons, Ltd.
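As a software illustration of why sparsity reduces the work in matrix-matrix multiplication, here is a minimal sketch that stores only nonzero entries and multiplies only where both operands are nonzero. It is not the paper's PE architecture or blocking decomposition; the dict-of-dicts storage format and the function name are assumptions chosen for brevity.

```python
def sparse_matmul(a, b):
    """Multiply two sparse matrices stored as {row: {col: value}}.

    Only nonzero entries are stored, so the inner loops visit only
    pairs (a[i][k], b[k][j]) where both factors are nonzero.
    """
    c = {}
    for i, a_row in a.items():
        c_row = {}
        for k, a_ik in a_row.items():
            # Rows of b that are entirely zero are simply absent.
            for j, b_kj in b.get(k, {}).items():
                c_row[j] = c_row.get(j, 0) + a_ik * b_kj
        if c_row:
            c[i] = c_row
    return c


a = {0: {0: 1, 2: 2}, 1: {1: 3}}
b = {0: {1: 4}, 2: {0: 5}}
product = sparse_matmul(a, b)  # row 1 of the product is all zero and is omitted
```

The number of multiply-accumulate operations here scales with the number of matching nonzero pairs rather than with the full matrix dimensions, which is the property a sparsity-aware hardware design can exploit.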
Spike timing-dependent plasticity (STDP) is crucial for training spiking neural networks (SNNs), offering a hardware-compatible and energy-efficient alternative to backpropagation. Current STDP hardware platforms encounter significant challenges, such as slowness, high energy consumption, and limited configurability. To overcome these issues, this paper presents a high-performance SNN training platform. A parallel multi-ring first-in-first-out structure with event-driven processing for spike handling is proposed, which enhances training efficiency, and the flexible pre- and post-synaptic parallelism enhances speed and flexibility. A dataflow strategy that considers spike sparsity unifies spike representations by updating weights only upon spike arrival, thereby promoting logical symmetry and enabling parallelization. Additionally, three encoding strategies, including a hybrid encoding, are implemented to address diverse scenarios. Leveraging a Xilinx field-programmable gate array and a Jetson Xavier NX, the proposed platform achieved remarkable performance gains. On the MNIST dataset, the platform demonstrated a 22.51× speedup, a 2.13% accuracy boost, and a 14.79× reduction in energy consumption. On the Fashion-MNIST dataset, it improved accuracy by 10.74% and the training speed by 1.89× ...
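For readers unfamiliar with STDP, the following is a minimal sketch of the textbook pair-based rule with an exponential timing window. The abstract does not give the platform's actual update rule, so the function name and the parameters `a_plus`, `a_minus`, and `tau` are hypothetical placeholders, not values from the paper.

```python
import math


def stdp_delta_w(t_pre, t_post, a_plus=0.01, a_minus=0.012, tau=20.0):
    """Weight change for one pre/post spike pair (times in ms).

    Pre-before-post (dt > 0) potentiates the synapse; post-before-pre
    (dt < 0) depresses it. The magnitude decays exponentially with |dt|.
    All parameter values are illustrative assumptions.
    """
    dt = t_post - t_pre
    if dt > 0:
        return a_plus * math.exp(-dt / tau)
    if dt < 0:
        return -a_minus * math.exp(dt / tau)
    return 0.0
```

Because the rule depends only on spike times, it can be evaluated when a spike arrives rather than on every time step, which is consistent with the event-driven weight updates the abstract describes.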
In this paper, a software-defined radio (SDR) based transceiver system is designed and implemented on a system-on-chip (SoC) platform, which consists of a high-speed Arm embedded processor and a reconfigurable field-programmable gate array (FPGA). In the proposed SDR transceiver, the real-time baseband signal generation and adaptive digital predistortion (ADPD) units are implemented on the SoC platform. An ADPD solution based on the memory polynomial model is implemented to linearize radio frequency (RF) power amplifiers (PAs). Implementing the ADPD on a reconfigurable FPGA platform makes the system flexible and cost-effective. The PA characterization, in terms of model extraction and coefficient calculation, is done in real time. The calculated coefficients are updated in the transmission path to precondition the transmitted signal before it is applied to the PA. The proposed ADPD is applied at the baseband level. Therefore, it can be used for different classes of PA operating at different RF carrier frequencies. A long-term evolution (LTE) signal with 20 MHz bandwidth and 11 dB peak-to-average power ratio (PAPR) is used for simulation and measurement purposes. In measurement, the LTE signal is amplified using a GaN-based harmonically tuned continuous Class-F PA. The performance of the implemented ADPD scheme is analyzed in terms of normalized mean square error (NMSE), adjacent channel power ratio (ACPR), and error vector magnitude (EVM).
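The memory polynomial model mentioned above can be sketched in a few lines. This is a minimal illustration assuming the commonly used form y(n) = Σ_k Σ_m a[k][m] · x(n−m) · |x(n−m)|^k on complex baseband samples. The abstract does not state the nonlinearity order, memory depth, or coefficient extraction method, so the function name and coefficient layout are assumptions.

```python
def memory_polynomial(x, coeffs):
    """Apply a memory polynomial to complex baseband samples.

    x: list of complex samples.
    coeffs[k][m]: coefficient of x[n-m] * |x[n-m]|**k
                  (k = 0 is the linear term, m is the memory tap).
    Samples before the start of x are treated as zero.
    """
    y = []
    for n in range(len(x)):
        acc = 0j
        for k, row in enumerate(coeffs):
            for m, a_km in enumerate(row):
                if n - m >= 0:
                    s = x[n - m]
                    acc += a_km * s * abs(s) ** k
        y.append(acc)
    return y


# Linear, memoryless coefficients leave the signal unchanged.
signal = [1 + 1j, 2 + 0j, -1j]
unchanged = memory_polynomial(signal, [[1.0, 0.0]])
```

In a predistorter, a structure of this kind is applied to the baseband signal ahead of the PA, with the coefficients chosen so that the cascade of predistorter and amplifier is approximately linear.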
This paper presents a digital interpolation module for high-resolution sinusoidal encoders. The proposed strategy performs phase-shifting manipulation on the sinusoidal encoder signals and takes advantage of the linear sections of the absolute values of the phase-shifted sinusoidal signals. Thus, an approximately linear signal is obtained, from which the displacement can be linearly determined. Simulation results show that the theoretical error of the proposed interpolation strategy is within 0.072 degrees over the 360-degree signal period. Furthermore, a digital interpolation module is developed by implementing the interpolation strategy in a field-programmable gate array (FPGA), and performance evaluation experiments are carried out by applying it to a linear optical encoder with a pitch of 20 μm. The effectiveness of the digital interpolation module has been validated by both simulation and experimental results. (C) 2018 Published by Elsevier B.V.
High-quality digital tachometers are incorporated into servo, mechatronic, robotic and precision production systems for the calculation of accurate, high-bandwidth, digital velocity information. The M/T-type tachometer and the related constant sample-time digital tachometer (CSDT) have been shown to perform well in many such systems. However, sensor nonideality can introduce very significant errors into the tachometer output. In this paper, it is shown that performance can be greatly improved (i.e., the noise present in the velocity signal significantly reduced) by oversampling the counter values used for velocity calculation. The counting and oversampling operations inherent to the oversampled CSDT (OCSDT) are implemented using a field-programmable gate array (FPGA). The design of the digital circuitry is described in detail, with particular emphasis on the circuits required for implementation and control of the oversampling operation. The FPGA acts as a peripheral device to a digital signal processor (DSP). Besides implementing some division-based calculations to generate a velocity word, the DSP can carry out other measurement and control functions, as required by the overall system. Simulation studies and experimental results are used to highlight the advantages of the oversampling technique.
This paper proposes an area-, speed-, and power-optimized band-pass digital signal processing filter targeted at a Kintex-7 field-programmable gate array device. The filter was designed using MATLAB and Simulink, and the code was generated using HDL Coder from MathWorks. The implementation was created using a novel high-level synthesis design method, which reduces the pessimism associated with bit-width constraints in synthesis for inputs, outputs, and intermediate data nodes. The Register Transfer Level (RTL) code generated by MATLAB HDL Coder was implemented on a Xilinx Kintex-7 using Vivado software. The obtained results are superior to those of previous implementations for the same filter specifications. We also performed an RTL simulation of the filter and compared the functional verification results with a golden double-precision implementation in MATLAB. The results suggest that constraining the bit width and reducing pessimism have less than 1% impact on the filter accuracy, within the limits specified by the architecture specifications.
The extended greatest common divisor (XGCD) computation is a critical component in various cryptographic applications and algorithms, including both pre- and post-quantum cryptosystems. In addition to computing the greatest common divisor (GCD) of two integers, the XGCD also produces Bézout coefficients b_a and b_b, which satisfy GCD(a, b) = a × b_a + b × b_b. In particular, computing the XGCD for large integers is of significant interest. Most recently, XGCD computation between 6479-bit integers is required for solving Nth-degree truncated polynomial ring unit (NTRU) trapdoors in FALCON, a National Institute of Standards and Technology (NIST)-selected post-quantum digital signature scheme. To this point, existing literature has primarily focused on exploring software-based implementations for XGCD. The few existing high-performance hardware architectures require significant hardware resources and may not be desirable for practical usage, and the lightweight architectures suffer from poor performance. To fill the research gap, this work proposes a novel FPGA-based scalable and lightweight accelerator for large integer XGCD (FELIX). First, a new algorithm suitable for scalable and lightweight computation of XGCD is proposed. Next, a hardware accelerator (FELIX) is presented, including both constant- and variable-time versions. Finally, a thorough evaluation is carried out to showcase the efficiency of the proposed FELIX. In certain configurations, FELIX involves 81% less equivalent area-time product (eATP) than the state-of-the-art design for 1024-bit integers, and achieves a 95% reduction in latency over the software for 6479-bit integers (FALCON parameter set) with reasonable resource usage. Overall, the proposed FELIX is highly efficient, scalable, lightweight, and suitable for very large integer computation, making it the first such XGCD accelerator in the literature (to the best of our knowledge).
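The Bézout identity above can be checked with a short software reference. The sketch below uses the textbook iterative extended Euclidean algorithm for non-negative inputs. It is not the new algorithm proposed for FELIX, which the abstract does not describe, and it is variable-time. The function name `xgcd` is chosen here for illustration.

```python
def xgcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (g, b_a, b_b) with g = gcd(a, b) = a*b_a + b*b_b.

    Textbook extended Euclid; assumes a, b >= 0. Python integers are
    arbitrary precision, so 6479-bit operands work without extra code.
    """
    old_r, r = a, b
    old_s, s = 1, 0   # running coefficients of a
    old_t, t = 0, 1   # running coefficients of b
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


g, b_a, b_b = xgcd(240, 46)
# g is 2, and 240 * b_a + 46 * b_b equals g
```

A reference of this kind is mainly useful for generating test vectors: any hardware result (g, b_a, b_b) can be validated by checking the identity directly. Bézout coefficients are not unique, so a different algorithm may return a different valid pair.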