Fractional-order chaotic oscillators are an active research topic, and many new mathematical models have been introduced that are suitable for developing novel applications across related fields of science and engineering. The challenge, however, lies in their implementation on electronic devices. Accordingly, we highlight the implementation of different families of fractional-order chaotic oscillators using field-programmable gate arrays (FPGAs). We detail the hardware implementation of the mathematical models solved with the Grünwald-Letnikov method, and highlight the short-memory principle, whose memory length is realized using specialized random-access-memory and read-only-memory blocks. In addition, we show how to reduce hardware resources by reusing blocks controlled by a special architecture introduced herein, enabling efficient processing of the data. Finally, using a Cyclone IV GX FPGA on the DE2i-150 board from Altera, a DAS1612 digital-to-analog converter, and 32-bit fixed-point arithmetic, we provide experimental results observed on a LeCroy oscilloscope, showing working frequencies of fractional-order chaotic attractors between 77.59 and 84.9 MHz. (C) 2019 Elsevier B.V. All rights reserved.
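The Grünwald-Letnikov scheme with the short-memory principle can be sketched in software before mapping it to hardware. The following Python sketch is illustrative only (the function names and the scalar test system are assumptions, not the paper's HDL); it shows the binomial-coefficient recurrence and the truncated history window that the RAM/ROM blocks would hold:

```python
def gl_coefficients(alpha, L):
    """Binomial coefficients of the Grunwald-Letnikov scheme via the
    recurrence c_0 = 1, c_j = (1 - (1 + alpha)/j) * c_{j-1}."""
    c = [1.0]
    for j in range(1, L + 1):
        c.append((1.0 - (1.0 + alpha) / j) * c[-1])
    return c

def gl_solve(f, x0, alpha, h, steps, L):
    """Explicit Grunwald-Letnikov integration of D^alpha x = f(x) with the
    short-memory principle: only the last L past states are kept (on an
    FPGA this sliding window would live in RAM/ROM blocks)."""
    c = gl_coefficients(alpha, L)
    hist = [x0]          # truncated history window
    x = x0
    for _ in range(steps):
        mem = sum(c[j] * hist[-j] for j in range(1, min(len(hist), L) + 1))
        x = f(x) * h**alpha - mem
        hist.append(x)
        if len(hist) > L:
            hist.pop(0)  # short memory: discard the oldest state
    return x
```

For alpha = 1 the coefficients beyond c_1 vanish and the scheme reduces to the forward Euler method, which is a convenient sanity check.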
A Gaussian Mixture Model (GMM) based machine learning algorithm has been applied to the problem of gamma/neutron pulse shape discrimination (PSD). The algorithm has been successfully implemented on a standard PC as well as a field-programmable gate array (FPGA). Here we describe the GMM classifier and its implementation on these two different types of hardware. We compare the performance of the algorithm on these two platforms against each other, along with other standard techniques applied in PSD. Our results show that the FPGA-based GMM classifier outperforms the standard PSD techniques in terms of classification accuracy at low particle energy and executes more quickly than its CPU-based counterpart.
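As a rough software analogue of such a classifier (not the authors' implementation; the one-dimensional feature, initialization, and names below are illustrative assumptions), a two-component GMM can be fitted with expectation-maximization in a few lines:

```python
import numpy as np

def fit_gmm_1d(x, n_iter=200):
    """Fit a two-component 1-D Gaussian mixture with EM. Components are
    initialised at the data extremes so index 0 tracks the lower cluster
    (e.g. a gamma-like PSD feature) and index 1 the higher one."""
    mu = np.array([x.min(), x.max()], dtype=float)
    var = np.full(2, x.var() + 1e-6)
    w = np.full(2, 0.5)
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component per sample
        d = x[:, None] - mu
        p = w * np.exp(-d**2 / (2 * var)) / np.sqrt(2 * np.pi * var)
        r = p / p.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances
        nk = r.sum(axis=0)
        w = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu)**2).sum(axis=0) / nk + 1e-9
    return w, mu, var
```

Classification then assigns each pulse to the component with the larger responsibility; on an FPGA the per-sample density evaluations reduce to a handful of multiply-accumulate operations.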
INS-GPS integration is a fundamental task used to enhance the accuracy of a standalone inertial navigation system. However, its implementation complexity has been a challenge for most embedded systems. This paper proposes a low-cost FPGA-based INS-GPS integration system, which consists of a Kalman filter and a soft processor. Moreover, we also evaluate the navigation algorithm on a low-cost ARM processor. Processing times and localization accuracy are compared in both cases for single- and double-precision floating-point formats. Experimental results show the advantages of the FPGA-based approach over the ARM-based approach. The proposed architecture can operate at 100 Hz and demonstrates the advantage of using FPGAs to design low-cost INS-GPS localization systems.
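The idea of a loosely coupled integration can be sketched with a toy one-axis Kalman filter, where the IMU drives the prediction and GPS position fixes drive the correction. The state layout, noise values, and function name below are illustrative assumptions, not the paper's design:

```python
import numpy as np

def kf_step(x, P, accel, z_gps, dt, q=0.1, r=4.0):
    """One predict/update cycle of a loosely coupled 1-axis INS-GPS filter.
    x = [position, velocity]; accel is the IMU reading, z_gps a GPS position
    fix. q and r are illustrative process/measurement noise variances."""
    F = np.array([[1.0, dt], [0.0, 1.0]])
    B = np.array([0.5 * dt**2, dt])
    H = np.array([[1.0, 0.0]])
    Q = q * np.array([[dt**3 / 3, dt**2 / 2], [dt**2 / 2, dt]])
    # predict: INS mechanisation propagates position and velocity
    x = F @ x + B * accel
    P = F @ P @ F.T + Q
    # update: correct the drift with the GPS fix
    y = z_gps - H @ x
    S = H @ P @ H.T + r
    K = (P @ H.T) / S
    x = x + (K * y).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P
```

A real integration filter carries many more states (attitude, sensor biases), which is exactly where the fixed matrix structure benefits an FPGA datapath.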
In order to solve the problem of virtual power loss introduced by the L/C-based switch model in real-time simulation of power electronic converters, a modified algorithm is proposed which can be adopted in the L/C-based switch model. This paper analyzes the causes of virtual power loss from a numerical algorithm perspective and concludes that the primary cause is the initial error of the L/C-based switch model during switching. Therefore, an additional modified algorithm is proposed to eliminate it. To minimize the impact of the proposed algorithm on the computation speed and resource consumption of field-programmable gate array (FPGA) in real-time simulation, an FPGA implementation architecture is proposed. Compared with the conventional electromagnetic transient simulation algorithm, the proposed algorithm only requires one additional clock cycle and barely increases the resource consumption. The effectiveness and superiority of the proposed algorithm are verified on a self-built FPGA-based real-time simulation platform.
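The companion-model identity behind the L/C switch method can be illustrated in a few lines. This is the standard trapezoidal (Dommel) discretization, not the paper's modified algorithm, and the numeric values are purely illustrative:

```python
# Trapezoidal companion models used by the L/C-based switch:
#   switch ON  -> small inductance L:  G_on  = dt / (2 * L)
#   switch OFF -> small capacitance C: G_off = 2 * C / dt
# Choosing L * C = (dt / 2)**2 keeps the conductance identical in both
# states, so the nodal matrix never needs refactoring when a switch
# toggles. The history source, however, carries an initial error at the
# switching instant -- the virtual power loss the modified algorithm targets.
dt = 1.0e-6                    # illustrative simulation time step
L = 5.0e-7                     # illustrative ON-state inductance
C = (dt / 2.0) ** 2 / L        # matched OFF-state capacitance
G_on = dt / (2.0 * L)
G_off = 2.0 * C / dt
assert abs(G_on - G_off) < 1e-12 * G_on
```

Keeping the conductance constant is what makes the method attractive for FPGA real-time solvers: the admittance matrix, and hence its factorization, can be frozen at synthesis time.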
FPGAs are drawing increasing attention for molecular dynamics (MD) problems, and have already been applied to two-body potentials, force fields composed of these potentials, and similar workloads, achieving competitive performance against traditional counterparts such as CPUs and GPUs. However, as far as we know, FPGA solutions for more complex, real-world MD problems, such as multi-body potentials, are seldom seen. This work explores the prospects of state-of-the-art FPGAs in accelerating multi-body potentials. An FPGA-based accelerator with a customized parallel dataflow covering multi-body potential computation, motion update, and internode communication is designed. Major contributions include: (1) parallelization applied at different levels of the accelerator; (2) an optimized dataflow mixing atom-level and cell-level pipelines to achieve high throughput; (3) a mixed-precision method using different precision at different stages of the simulation; and (4) a communication-efficient method for internode communication. Experiments show that our single-node accelerator is over 2.7x faster than an 8-core CPU design, performing 20.501 ns/day on a 55,296-atom system for the Tersoff simulation. Regarding power efficiency, our accelerator is 28.9x higher than an Intel i7-11700 and 4.8x higher than an RTX 3090 when running the same test case.
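The cell decomposition that a cell-level pipeline iterates over can be sketched as a simple software model (the binning scheme and names are assumptions for illustration, not the accelerator's dataflow):

```python
import numpy as np

def build_cell_list(pos, box, cutoff):
    """Bin atoms into cells with side length >= cutoff, so every interaction
    partner of an atom lies in its own or a neighbouring cell. A hardware
    pipeline can then stream one cell (and its neighbours) at a time."""
    n_cells = np.maximum((box // cutoff).astype(int), 1)
    side = box / n_cells
    idx = np.minimum((pos // side).astype(int), n_cells - 1)
    cells = {}
    for atom, (i, j, k) in enumerate(idx):
        cells.setdefault((i, j, k), []).append(atom)
    return cells, n_cells
```

For a multi-body potential such as Tersoff, the per-atom work depends on neighbour pairs and triples inside this stencil, which is why mixing an atom-level pipeline inside a cell-level one helps keep the datapath full.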
In 5G and beyond networks, low-latency digital signatures are essential to ensure the security, integrity, and non-repudiation of massive data in communication processes. The binary-finite-field-based elliptic curve digital signature algorithm (ECDSA) is particularly suitable for achieving low-latency digital signatures due to its carry-free characteristics. This paper proposes a low-latency and universal architecture for point multiplication (PM) and double point multiplication (DPM) based on the differential addition chain (DAC), designed for signing and verification in ECDSA. By employing the DAC, the area-time product of DPM can be decreased and throughput efficiency can be increased. Besides, the execution pattern of the proposed architecture is uniform, resisting simple power analysis and high-order power analysis. Based on the data dependency, two Karatsuba-Ofman multipliers and four non-pipelined squarers are utilized in the architecture to achieve a compact timing schedule with no idle multiplier cycles during the computation. Consequently, the calculation latency of DPM is minimized to five clock cycles in each loop. The proposed architecture is implemented on Xilinx Virtex-7, performing DPM in 3.584, 5.656, and 7.453 μs with 8135, 13372, and 17898 slices over GF(2^163), GF(2^233), and GF(2^283), respectively. Among existing designs resistant to high-order analysis, our architecture demonstrates throughput efficiency improvements of 36.7% over GF(2^233) and 9.8% over GF(2^283), respectively.
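The uniform execution pattern can be illustrated with the classic Montgomery ladder, shown here over plain integer addition as a stand-in for curve-point operations (a simplified sketch of the principle; the paper's DAC-based schedule is more elaborate):

```python
def montgomery_ladder(k, P, add, double, identity):
    """Uniform double-and-add ladder computing k*P: every bit of k triggers
    exactly one add and one double regardless of its value, which is what
    blunts simple power analysis. The invariant R1 - R0 == P holds
    throughout, as in differential addition chains."""
    R0, R1 = identity, P
    for bit in bin(k)[2:]:          # most significant bit first
        if bit == '0':
            R1 = add(R0, R1)
            R0 = double(R0)
        else:
            R0 = add(R0, R1)
            R1 = double(R1)
    return R0

# Toy check in the additive group of integers (stand-in for curve points):
assert montgomery_ladder(13, 7, lambda a, b: a + b,
                         lambda a: 2 * a, 0) == 13 * 7
```

On a curve, `add` would be differential point addition and `double` point doubling over the binary field; the fixed per-bit operation count is what yields the constant five-cycle loop latency pursued in the paper.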
Embedded systems (electronic systems with a dedicated purpose that are part of larger devices) are increasing in relevance with the rise of the Internet of Things (IoT). Such systems are often resource-constrained, battery-powered, connected to the internet, and exposed to an increasing number of threats. An approach to detect such threats is anomaly-based intrusion detection with machine-learning techniques. However, most of these techniques were not created with energy efficiency in mind. This paper presents an anomaly-based method for network intrusion detection in embedded systems. The proposed method maintains classifier reliability even when the network traffic content changes. The reliability is achieved through a new rejection mechanism and a combination of classifiers. The proposed approach is energy-efficient and well suited for hardware implementation. The experiments presented in this paper show that the hardware versions of the machine learning algorithms consume 46% of the energy used by their software counterparts, and the feature extraction and packet capture modules consume 58% and 37% of their respective software counterparts. (C) 2018 Elsevier Ltd. All rights reserved.
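A minimal sketch of a rejection mechanism over a classifier ensemble might look as follows. The averaging rule and threshold here are illustrative assumptions; the paper's combination scheme differs in detail:

```python
import numpy as np

def ensemble_with_reject(votes, threshold=0.75):
    """Combine per-classifier class probabilities and reject (return None)
    when agreement is too weak. Rejected traffic can then be routed to a
    heavier analysis path instead of being silently misclassified.
    `votes` is an (n_classifiers, n_classes) array; threshold is illustrative."""
    avg = np.asarray(votes, dtype=float).mean(axis=0)
    label = int(avg.argmax())
    return label if avg[label] >= threshold else None

# Confident agreement yields a label; a split vote is rejected.
assert ensemble_with_reject([[0.9, 0.1], [0.8, 0.2]]) == 0
assert ensemble_with_reject([[0.6, 0.4], [0.4, 0.6]]) is None
```

The threshold comparison and averaging reduce to cheap fixed-point operations, which is consistent with the energy figures reported for the hardware version.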
This paper describes a novel hardware-oriented algorithm that can be implemented on a field-programmable gate array in a high-speed vision platform for detecting multiple objects with clear texture information in images of 512 x 512 pixels at 10000 frames per second (fps) against complex backgrounds. The proposed algorithm is specially designed for devices with limited hardware resources that must process video streams with high frame rates, high data throughput, and high parallelism at low latency. It is based on the conventional histogram of oriented gradients (HOG) descriptor and support vector machine classifier. Considering the trade-off between speed and accuracy, many hardware-based optimizations were implemented. The data throughput is nearly 29.30 Gbps, while the latency for feature extraction is 0.76 μs (61 clock periods). After hardware-based image processing, the source image and the detected object features can be transferred to a personal computer for recording or post-processing at 10000 fps. Several experiments were conducted to demonstrate the performance of the proposed algorithm for ultra-high-speed detection of moving objects with clear texture information.
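The per-cell HOG computation at the heart of such a pipeline can be sketched in Python as a software reference (not the hardware-optimized version; the 9-bin unsigned-orientation setup is the conventional HOG choice, assumed here):

```python
import numpy as np

def cell_hog(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations for one
    cell -- the core HOG operation a pipeline evaluates per streamed block."""
    gy, gx = np.gradient(cell.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 180.0   # unsigned orientation
    hist, _ = np.histogram(ang, bins=n_bins, range=(0.0, 180.0), weights=mag)
    return hist

# A vertical edge produces horizontal gradients (orientation near 0 degrees),
# so the weight should concentrate in the first bin.
cell = np.zeros((8, 8))
cell[:, 4:] = 1.0
assert cell_hog(cell).argmax() == 0
```

In hardware, the arctangent and binning are typically replaced by comparisons against precomputed slope thresholds, one of the speed/accuracy trade-offs the abstract alludes to.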
The execution of centroid extraction algorithms on a microprocessor consumes considerable resources compared to the other steps involved in star trackers. This paper presents a method to identify star centroids in star trackers by pre-processing the pixels with a field-programmable gate array (FPGA) directly in the stream transmitted by an image sensor. The dedicated hardware filters the star pixels and transmits them to a processor, which computes the centroids of the respective image using an infinite impulse response filter. This substantially decreases memory consumption and reduces processor usage during the attitude determination computation, making the process more attractive for small satellites. A hardware-in-the-loop simulation is presented to test the performance of the system. It was possible to achieve subpixel precision in the estimation of centroid coordinates, as well as lower execution times compared with methods that process whole images.
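The streaming weighted-centroid idea can be sketched as a simple accumulator. This is a simplified model (the paper's processor side uses an IIR-filter formulation; the threshold and tuple layout here are assumptions): only pixels above the threshold, i.e. the ones an FPGA pre-filter would forward, contribute, so no frame buffer is needed.

```python
def stream_centroid(pixels, threshold):
    """Intensity-weighted centroid over a pixel stream. `pixels` yields
    (x, y, intensity) tuples, as if coming straight from the sensor; the
    running sums are all the state required, giving subpixel precision
    without storing the image."""
    sx = sy = si = 0.0
    for x, y, i in pixels:
        if i > threshold:
            sx += x * i
            sy += y * i
            si += i
    if si == 0:
        return None                  # no star pixels seen
    return sx / si, sy / si

# Symmetric 3-pixel star around x = 5; the faint (0, 0) pixel is filtered out.
stream = [(4, 7, 10), (5, 7, 20), (6, 7, 10), (0, 0, 1)]
assert stream_centroid(stream, threshold=5) == (5.0, 7.0)
```

Because the accumulators are simple multiply-adds, the filtering stage maps naturally onto FPGA logic while the division is deferred to the processor.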
The DRAM-Based Reconfigurable Acceleration Fabric (DRAF) uses commodity DRAM technology to implement a bit-level, reconfigurable fabric that improves area density by 10 times and power consumption by more than 3 times over conventional field-programmable gate arrays. Latency overlapping and multicontext support allow DRAF to meet the performance and density requirements of demanding applications in datacenter and mobile environments.