Circuit clustering algorithms fit synthesised circuits into field programmable gate array (FPGA) configurable logic blocks (CLBs) efficiently. This fundamental process in the FPGA CAD flow directly impacts both the effort required and the performance achievable in the subsequent place-and-route processes. Circuit clustering is limited by the hardware constraints of the specific target architecture. Hence, better circuit clustering approaches are essential for improving device utilisation whilst at the same time optimising circuit performance parameters such as power and delay. In this study, the authors present a method based on a multi-objective genetic algorithm (MOGA) to facilitate circuit clustering. They address a number of challenges, including CLB input bandwidth constraints, improvement of CLB utilisation, and minimisation of interconnects between CLBs. The authors' new approach has been validated using the 'Golden 20' MCNC benchmark circuits that are regularly used in FPGA-related literature. The results show that the method proposed in this study achieves improvements of up to 50% in clustering, routability and timing when compared to state-of-the-art approaches including VPack, T-VPack, RPack, DPack, HDPack, MOPack and iRAC. The key contribution of this work is a flexible EDA flow that can incorporate the numerous objectives required to successfully tackle real-world circuit design on FPGA, providing better device utilisation at increased design performance.
Feature point matching is an important step for many vision-based unmanned-aerial-vehicle applications. This paper presents the development of a new feature descriptor for feature point matching that is well suited for micro unmanned aerial vehicles equipped with a low-resource, compact, lightweight, low-power embedded vision sensor. The Basis Sparse-Coding Inspired Similarity descriptor uses theory taken from sparse coding to provide an efficient image feature description method for frame-to-frame feature point matching. This descriptor requires simple mathematical operations, uses comparatively small memory storage, and can support color and grayscale feature descriptions. It is an excellent candidate for implementation on low-resource systems that require real-time performance, where complex mathematical operations are prohibitively expensive. To demonstrate its performance, the feature matching result was used to calculate a frame-to-frame homography that is essential to unmanned-aerial-vehicle applications such as pose estimation and obstacle detection for navigation. The proposed descriptor was tested on two video sequences and one dataset of real aerial images. Experimental results show that, along with performing in situations where existing complex descriptors cannot be used, the Basis Sparse-Coding Inspired Similarity descriptor also performs slightly better than these other methods on the task of homography calculation. Our experimental results and analysis show that the Basis Sparse-Coding Inspired Similarity descriptor is an excellent candidate for a resource-limited vision sensor for micro unmanned aerial vehicles.
This paper presents a COordinate Rotation DIgital Computer (CORDIC)-based architecture of the sliding discrete Fourier transform (SDFT) for real-time spectrum analysis. The architecture includes a refreshing mechanism that reduces and bounds the error accumulation caused by the recursive nature of existing SDFT algorithms. The proposed design is scalable with the transform length and the number of DFT bins to be calculated, and can provide high throughput for a single bin evaluation. The paper also compares the conventional and the modulated SDFT architectures based on the CORDIC algorithm in terms of the angle-approximation and truncation errors. The proposed design is synthesized on the Xilinx Virtex-6 FPGA platform and is implemented in ASIC using a 90 nm standard cell library.
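The recursion that makes the SDFT prone to error accumulation can be illustrated in software. The following is a minimal floating-point sketch, not the paper's CORDIC architecture: it uses the standard single-bin SDFT update and, as an assumption made only for illustration, models "refreshing" as a periodic direct recomputation of the bin. The names `direct_dft_bin`, `sliding_dft_bin` and `refresh_every` are hypothetical.

```python
import cmath
from collections import deque


def direct_dft_bin(window, k):
    """Bin k of the DFT of one full window, computed directly (no recursion)."""
    n = len(window)
    return sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
               for i, x in enumerate(window))


def sliding_dft_bin(samples, n, k, refresh_every=None):
    """Yield bin k of the length-n DFT for every successive window of samples.

    Standard SDFT update: X_new = (X_old - x_oldest + x_newest) * exp(+j*2*pi*k/n).
    refresh_every is an illustrative assumption: every so many steps the bin is
    recomputed directly, which discards any accumulated rounding error.
    """
    twiddle = cmath.exp(2j * cmath.pi * k / n)
    window = deque(samples[:n], maxlen=n)
    value = direct_dft_bin(window, k)
    yield value
    for step, newest in enumerate(samples[n:], start=1):
        oldest = window[0]
        window.append(newest)  # maxlen drops the oldest sample automatically
        if refresh_every and step % refresh_every == 0:
            value = direct_dft_bin(window, k)
        else:
            value = (value - oldest + newest) * twiddle
        yield value
```

Each recursive step costs one complex multiplication regardless of the transform length, which is why the recursion is attractive for single-bin evaluation; the price is that rounding errors carry over from one window to the next unless the value is reset in some way.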
Database management systems have become an indispensable tool for industry, government, and academia, and form a significant component of modern datacenters. They can be used in a multitude of scenarios, including online analytical processing, data mining, e-commerce, and scientific analysis. Given the exponential growth in new data produced each year, there is pressure on software and hardware developers to create datacenters that can cope with increasing requirements. The authors look at the organization of a modern relational database management system and propose optimizations and redesigns for storage access, memory, and CPU.
The Boolean satisfiability (SAT) problem is a key problem in the theory and application of computing. A parallel multi-threaded SAT solver named pprob SAT+, running on configurable hardware, is proposed. In the algorithm, multiple threads are executed simultaneously to hide circuit stalls. To improve the working frequency and throughput of the SAT solver, a deep pipeline strategy is adopted. When all data are stored in the block random access memory of the field programmable gate array, the solver achieves maximum performance. If part of the data is stored in external memory, the size of the problem instances that the SAT solver can handle increases greatly. The experimental results show that the three-thread SAT solver achieves a speedup of approximately 2.4 times over the single-thread version, and that pprob SAT+ achieves a substantial improvement when a solution is found.
This paper describes a non-IQ controller for digital Low Level RF (LLRF) feedback control. With this non-IQ sampling method, an arbitrary frequency relationship between the ADC/DAC sampling clocks and the IF signals can be employed. The nonlinearity in digital conversion can be reduced and the dynamic performance of the system improved. This paper analyzes the nonlinearity in conventional IQ sampling, gives the state variable description of the non-IQ algorithm, presents an implementation and its synchronization, and compares its performance with that of IQ sampling.
This study investigated and compared the practical methods used for the efficient field-programmable gate array (FPGA) implementation of space-time adaptive processing (STAP). The most important part of calculating the STAP weights is the QR decomposition (QRD), which can be implemented using the modified Gram-Schmidt (MGS) algorithm. The results show that the method that uses QRD with less computational burden leads to a more effective implementation. Its structure was parameterised with the vector size to create a trade-off between the hardware and performance factors. For this purpose, the QRD-MGS algorithm was first modified to increase the speed, and then the STAP weight vector was calculated. The implementation results show that decreasing the vector size decreases the resource utilisation, computational burden and power consumption. While the computation time increases slightly, the update rate of the STAP weights is maintained. For example, the STAP weights in a system with 6 antenna arrays, 10 received pulses and 200 range samples were computed in 262 μs using a vector size of 17 on the Arria10 FPGA, of which a maximum of 155 μs corresponds to the QRD-MGS algorithm and 107 μs corresponds to the other parts. Therefore, the QRD-MGS algorithm is the most important component of the calculation of the STAP weight vector, and its simplification leads to an efficient implementation.
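For readers unfamiliar with QRD-MGS, the textbook modified Gram-Schmidt procedure can be sketched as follows. This is only the standard algorithm, not the modified, vector-size-parameterised variant that the study implements. It also assumes real-valued data for brevity, whereas STAP data are generally complex. The function name `mgs_qr` and the column-list data layout are hypothetical.

```python
import math


def mgs_qr(columns):
    """Textbook modified Gram-Schmidt QR of a full-rank real matrix.

    columns: the matrix given as a list of column vectors (lists of floats).
    Returns (q, r): q is a list of orthonormal column vectors, r is an
    upper-triangular matrix (list of rows) such that A = Q R.
    """
    n = len(columns)
    v = [list(col) for col in columns]  # working copies, updated in place
    q = []
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        norm = math.sqrt(sum(x * x for x in v[j]))
        if norm == 0.0:
            raise ValueError("columns are linearly dependent")
        r[j][j] = norm
        qj = [x / norm for x in v[j]]
        q.append(qj)
        # MGS step: remove the q_j component from every remaining column
        # immediately, using the already-updated vectors rather than the
        # original columns (this is what distinguishes MGS from classical GS).
        for k in range(j + 1, n):
            r[j][k] = sum(a * b for a, b in zip(qj, v[k]))
            v[k] = [b - r[j][k] * a for a, b in zip(qj, v[k])]
    return q, r
```

The inner loop consists of independent dot products and vector updates of a fixed length, which is one reason the method maps naturally onto parallel hardware and why the vector length is a plausible tuning knob between resources and speed.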
Advanced technology used for arithmetic computing applications comprises a large number of approximate multipliers and approximate *** Truncation and Rounding-based Scalable Approximate Multiplier (TRSAM) distinguishes a variety of modes based on height (h) and truncation (t) as TRSAM(h, t) in the *** TRSAM operation produces higher absolute error in the Least Significant Bit (LSB) data shift unit. A new scalable approximate multiplier approach that uses truncation and rounding, TRSAM(3, 7), is proposed to increase the multiplier *** the help of the foremost one bit architecture, the proposed scalable approximate multiplier approach reduces the partial *** proposed approximate TRSAM multiplier architecture gives better results in terms of area, delay, and *** accuracy of 95.2% and the energy utilization of 24.6 nJ is observed in the proposed multiplier *** proposed approach shows 0.11%, 0.23%, and 0.24% less Mean Absolute Relative Error (MARE) when compared with the existing approach for the input of 8-bit, 16-bit, and 32-bit *** also shows 0.13%, 0.19%, and 0.2% less Variance of Absolute Relative Error (VARE) when compared with the existing approach for the input of 8-bit, 16-bit, and 32-bit *** proposed approach is implemented with a field-programmable gate array (FPGA) and shows delays of 3.640, 6.481, 12.505, 22.572, and 36.893 ns for the input of 8-bit, 16-bit, 32-bit, 64-bit, and 128-bit *** proposed approach is applied in digital filter design, which shows a Peak Signal-to-Noise Ratio (PSNR) of 25.05 dB and a Structural Similarity Index Measure (SSIM) of 0.98 with 393 pJ energy consumption when used in image *** proposed approach is simulated with Xilinx and MATLAB and implemented with FPGA.
This paper presents a new intelligent system incorporating wavelet transform, artificial neural network and fuzzy logic to automate the classification of power quality disturbances. This novel and efficient hardware method, based on FPGA technology, showed improved performance over existing approaches for power quality disturbance detection and classification on six types of waveform: sag, swell, transient, fluctuation, interruption and normal. The approach obtained an average classification accuracy of 98.19%. The design was successfully implemented, tested and validated on an Altera APEX EP20K200EBC652-1X FPGA, utilizing 1209 logic cells and achieving a maximum frequency of 263.71 MHz.
A p-norm extreme learning machine (ELM) based on a sparsity constraint is presented in this study for tracking the fundamental frequency, harmonic and dc components in current power signals, with application to phasor measurement units for wide area power networks in a smart grid environment. Real-time power applications are typically equipped with an on-board controller and hence have limited capacity to host a complex architecture. Moreover, the data acquired online on a sample-by-sample basis are corrupted by noise with diverse statistical features. Hence, approaches with an improved learning paradigm and a compact model that can deal with noise of varied statistical characteristics are essential. The proposed approach formulates a cost function with a recursive p-norm error criterion and a sparsity penalty; it updates the output weights successively and sets some coefficients of the output weights to zero, which promotes quicker convergence and higher accuracy. Exhaustive computer simulations have been carried out with synthetic and real-time signals to track the dynamic changes in the power signal amplitude, phase and frequency; they demonstrate the accuracy, efficiency and robustness of the proposed p-norm ELM. Additionally, the new ELM network is validated on field programmable gate array (FPGA) hardware to prove its practicability for current developments in phasor measurement units.