Harris corner detection is an algorithm frequently used in image processing and computer vision applications to detect corners in an input image. Most modern image processing applications need real-time implementations of algorithms such as Harris corner detection on hardware platforms such as field-programmable gate arrays (FPGAs). FPGAs provide the higher algorithmic throughput needed to meet real-time requirements or to process higher data rates. High-level synthesis tools offer designers a higher level of abstraction with continued verification during the design flow and are therefore gaining popularity in the design community. This paper proposes a high-speed, area-optimized implementation of the Harris corner detection algorithm. The proposed implementation was realized using a novel high-level synthesis (HLS) design method based on application-specific bit widths for intermediate data nodes. Register transfer level (RTL) code was generated using MATLAB HDL Coder for HLS. The generated hardware description language (HDL) code was implemented on a Xilinx ZedBoard using Vivado software and verified for functionality in real time with an input video stream. The obtained results are superior to those of previous implementations in terms of area (smaller gate count on the target FPGA) and speed for the same target board.
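For readers unfamiliar with the detector itself, the following is a minimal floating-point sketch of the Harris response, R = det(M) - k * trace(M)^2, written with NumPy. It is not the paper's HLS design: the function name `harris_response`, the sensitivity constant `k = 0.05`, the 3 x 3 box window, and the synthetic test image are all assumptions chosen for illustration, and the application-specific bit widths described in the abstract are not modeled.

```python
import numpy as np

def harris_response(image, k=0.05, window=3):
    """Return the Harris response R = det(M) - k * trace(M)^2 per pixel.

    k and window are illustrative assumptions, not values from the paper.
    """
    img = image.astype(np.float64)
    # Central-difference gradients; np.gradient returns (d/drow, d/dcol).
    iy, ix = np.gradient(img)
    ixx, iyy, ixy = ix * ix, iy * iy, ix * iy

    def box_sum(a):
        # Sum each pixel's window x window neighbourhood (zero padding).
        pad = window // 2
        p = np.pad(a, pad, mode="constant")
        out = np.zeros_like(a)
        for dy in range(window):
            for dx in range(window):
                out += p[dy:dy + a.shape[0], dx:dx + a.shape[1]]
        return out

    sxx, syy, sxy = box_sum(ixx), box_sum(iyy), box_sum(ixy)
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace

# Synthetic image: a bright square on a dark background.
img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0
r = harris_response(img)
corner_score = r[5, 5]    # on a corner of the square
edge_score = r[10, 5]     # in the middle of an edge
flat_score = r[10, 10]    # inside the flat interior
```

On this image the response is positive at the corner, negative along the edge, and zero in the flat region, which is the property a corner detector thresholds on.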
This thesis deals with the design and implementation of hardware acceleration of linear genetic programming (LGP) for symbolic regression. It contains a theoretical introduction to modern hardware design methods and to genetic programming. The subsequent parts describe the design and implementation of an LGP accelerator for symbolic regression.
This bachelor's thesis examines the principles of communication on the CAN bus and the design and implementation of a controller for this bus. The controller is implemented in VHDL for the FITKit school development platform. The thesis also describes the design of the physical-layer circuits that connect the FITKit to the bus.
Sensors are integrated to manage the city's property information technology. Data collection is the foundation of these services, and the sensor network is an integral part of town planning. Wireless sensors are installed in various locations and can therefore transfer data remotely, reducing layer consumption and maintenance costs; small built-in sensors are much needed. Because each sensor is vital and supplies prerequisite information for town planning, integrated sensor systems have shaped the concept of town planning. An FPGA can be used to solve any problem that is computable. This is evidenced by the fact that an FPGA can implement a soft microprocessor such as the Xilinx MicroBlaze or Altera Nios II. Most importantly, reverse planning in town design clarifies the importance of landscape design in town planning. The aim is to derive the concept of outline town planning from the perspective of town design, highlight the critical touchpoints in the smart town landscape system, and guide smart, strong principles for town design, planning, and creation. In this role, for sustainable development, the city is developed effectively, and a beautiful town landscape becomes possible.
Several scheduling algorithms that have been proposed for real-time operating systems (RTOSs) are supposed to be optimal. However, optimal scheduling is only theoretical, because a system can become overloaded and then cannot meet the deadlines of its tasks. Besides, these algorithms are implemented in the RTOS, which generates additional overheads that can lead to the "nonscheduling" of certain independent tasks. In this paper, we propose an original solution for nonschedulable independent tasks in embedded systems. This solution, named Hybrid Fuzzy Earliest Deadline First Scheduling algorithm (HFEDFS), is based on the Earliest Deadline First algorithm (EDF) and fuzzy logic. It is characterized by a rejection policy and a rescheduling mechanism. The experimental results show that our proposed algorithm improves the system's performance. To reduce the extra overheads of the RTOS, this algorithm is implemented on a field-programmable gate array (FPGA) circuit (Xilinx Virtex-5 LX50T-1156 board from Digilent).
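To make the combination of EDF ordering and a rejection policy concrete, here is a minimal sketch in Python. It is not HFEDFS: there is no fuzzy logic and no rescheduling mechanism, and it assumes that all tasks are released at time 0 and run non-preemptively. The `Task` fields, the function name `edf_with_rejection`, and the example task set are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    exec_time: int  # worst-case execution time, in time units
    deadline: int   # absolute deadline, in time units

def edf_with_rejection(tasks):
    """Order tasks by earliest deadline; reject any that would miss theirs.

    Assumes every task is released at time 0 and runs to completion.
    """
    scheduled, rejected = [], []
    clock = 0
    for task in sorted(tasks, key=lambda t: t.deadline):
        finish = clock + task.exec_time
        if finish <= task.deadline:
            scheduled.append((task.name, clock, finish))
            clock = finish
        else:
            # Rejection policy: drop the task instead of letting it
            # delay later tasks past their deadlines.
            rejected.append(task.name)
    return scheduled, rejected

# Hypothetical overloaded task set (total demand exceeds what fits).
tasks = [Task("A", 2, 4), Task("B", 3, 5), Task("C", 1, 9), Task("D", 4, 7)]
scheduled, rejected = edf_with_rejection(tasks)
```

In this example task D cannot finish by its deadline once A and B have run, so it is rejected and C still meets its deadline.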
This paper proposes a closed-loop control implementation fully embedded into an FPGA for a permanent-magnet synchronous motor (PMSM) drive based on a four-level active-clamped converter. The proposed FPGA controller comprises a field-oriented control to drive the PMSM, a DC-link voltage balancing closed-loop control (VBC), and a virtual-vector-based modulator for a four-level active-clamped converter. The VBC and the modulator operate together to keep the DC-link capacitor voltages balanced. The FPGA design methodology is carefully described, and the main aspects of achieving an optimal FPGA implementation with low resource usage are discussed. Experimental results under different operating conditions are presented to demonstrate the good performance and the feasibility of the proposed controller for motor-drive applications.
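Field-oriented control conventionally relies on the Clarke and Park transforms to map three-phase currents into a rotating d-q frame. The sketch below shows the standard amplitude-invariant form of both transforms in Python; it is a generic illustration under that assumption, not the paper's FPGA implementation, and the function names and test currents are hypothetical.

```python
import math

def clarke(ia, ib, ic):
    """Amplitude-invariant Clarke transform: (a, b, c) -> (alpha, beta)."""
    alpha = (2.0 / 3.0) * (ia - 0.5 * ib - 0.5 * ic)
    beta = (2.0 / 3.0) * (math.sqrt(3.0) / 2.0) * (ib - ic)
    return alpha, beta

def park(alpha, beta, theta):
    """Park transform: rotate (alpha, beta) by the rotor angle theta."""
    d = alpha * math.cos(theta) + beta * math.sin(theta)
    q = -alpha * math.sin(theta) + beta * math.cos(theta)
    return d, q

# Balanced three-phase currents with unit amplitude, aligned with theta.
theta = 0.7
ia = math.cos(theta)
ib = math.cos(theta - 2.0 * math.pi / 3.0)
ic = math.cos(theta + 2.0 * math.pi / 3.0)
d, q = park(*clarke(ia, ib, ic), theta)
```

For balanced currents aligned with the rotor angle, the result is a constant d component equal to the amplitude and a zero q component, which is what lets the current loops be regulated as DC quantities.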
Modern embedded systems are packed with dedicated field-programmable gate arrays (FPGAs) to accelerate overall system performance. However, FPGAs are susceptible to reconfiguration overheads. These overheads arise mainly because the configuration data are fetched from off-chip memory at run time and because tasks are improperly managed during execution. To reduce these overheads, our proposed methodology focuses on the prefetch heuristic, the reuse technique, and the available memory hierarchy to provide an efficient mapping of tasks over the available memories. Our paper includes a new replacement policy that reduces the overall time and energy reconfiguration overheads for static systems in their subsequent iterations. The results show that most of the reconfiguration overheads are eliminated when the applications are managed and executed according to our methodology.
A hardware architecture for parallel computation is proposed for generating Fraunhofer computer-generated holograms (CGHs). A pipeline-based integrated circuit architecture is realized by employing the modified Fraunhofer analytical formalism, which is large scale and enables all components to be operated concurrently. The architecture of the CGH contains five modules to calculate initial parameters of amplitude, amplitude compensation, phases, and phase compensation, respectively. The precalculator of amplitude is fully adopted considering the "reusable design" concept. Each complex operation type (such as square arithmetic) is reused only once by means of a multichannel selector. The implemented hardware calculates an 800 × 600 pixel hologram in parallel using 39,319 logic elements, 21,074 registers, and 12,651 memory bits in an Altera field-programmable gate array environment with stable operation at 50 MHz. Experimental results demonstrate that the quality of the images reconstructed from the hardware-generated hologram is comparable to that of a software implementation. Moreover, the calculation speed is approximately 100 times faster than that of a personal computer with an Intel i5-3230M 2.6 GHz CPU for a triangular object. © 2015 Society of Photo-Optical Instrumentation Engineers (SPIE)
Human activity recognition (HAR) technology is related to human safety and convenience, making it crucial for it to infer human activity accurately. Furthermore, it must consume low power at all times when detecting human activity and be inexpensive to operate. For this purpose, a low-power and lightweight design of the HAR system is essential. In this paper, we propose a low-power and lightweight HAR system using point-cloud data collected by radar. The proposed HAR system uses a pillar feature encoder that converts 3D point-cloud data into a 2D image and a classification network based on depth-wise separable convolution for lightweighting. The proposed classification network achieved an accuracy of 95.54%, with 25.77 M multiply-accumulate operations and 22.28 K network parameters implemented in a 32 bit floating-point format. This network achieved 94.79% accuracy with 4 bit quantization, which reduced memory usage to 12.5% compared to existing 32 bit format networks. In addition, we implemented a lightweight HAR system optimized for low-power design on a heterogeneous computing platform, a Zynq UltraScale+ ZCU104 device, through hardware-software implementation. It took 2.43 ms of execution time to perform one frame of HAR on the device and the system consumed 3.479 W of power when running.
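The two lightweighting ideas in this abstract can be checked with simple arithmetic. The Python sketch below compares the weight count of a standard convolution with that of a depth-wise separable one, and computes weight memory at 32-bit and 4-bit precision. The layer shape (3 x 3 kernel, 32 input channels, 64 output channels) is a hypothetical example, biases are ignored, and "22.28 K parameters" is assumed to mean 22,280; none of the helper names come from the paper.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out

def depthwise_separable_params(kernel, c_in, c_out):
    """Depth-wise (one kernel per input channel) plus 1 x 1 point-wise weights."""
    return kernel * kernel * c_in + c_in * c_out

def weight_memory_bytes(num_params, bits_per_weight):
    """Storage needed for the weights at a given precision."""
    return num_params * bits_per_weight / 8

# Hypothetical layer: 3 x 3 kernel, 32 input channels, 64 output channels.
std = standard_conv_params(3, 32, 64)
sep = depthwise_separable_params(3, 32, 64)

# Network size from the abstract, assuming 22.28 K means 22,280 weights.
params = 22_280
mem_fp32 = weight_memory_bytes(params, 32)
mem_int4 = weight_memory_bytes(params, 4)
ratio = mem_int4 / mem_fp32
```

The ratio of 4-bit to 32-bit storage is 4/32 = 0.125, which matches the 12.5% memory figure quoted in the abstract.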
Measuring electroencephalography (EEG) signals has a variety of important applications. Most processing systems use statistical machine learning algorithms. To increase accuracy of such systems, more data are measured in the form of increased channel count, called high-definition EEG (HD-EEG). EEG processing is hampered by noise from different instruments. While traditional detrending algorithms are highly resource-intensive, the problem is compounded for HD-EEG detrending. The generic-compounding architecture is not scalable. In such an architecture, one instance of the hardware accelerator is used for each channel. This is unsuitable for wearable devices which have limited computational and energy resources. In this article, we propose a time-sliced architecture to optimize resource and power utilization for a multi-channeled system. This is accomplished using time-division multiplexers and demultiplexers to share resources between different channels. The adaptive maximum-mean-minimum (AMaMeMi) filter is a computationally efficient algorithm, reported earlier for detrending EEGs. We apply guidelines of the proposed architecture, on the AMaMeMi filter, to design an efficient HD-EEG detrending hardware accelerator. The proposed accelerator is implemented for various channel counts. We use the Xilinx field-programmable gate array with part number XC7VX980T-1FFG1930 for implementation. We verify the correctness of the proposed architecture by comparing its output with that of the generic-compounding architecture. For a 1024-channeled system, the proposed time-sliced architecture provides 99% reduction in lookup table utilization, 70% in flip-flop utilization, 95% in area utilization, and 70% in estimated power utilization.
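The difference between the two architectures can be sketched in software. In the Python example below, `per_channel` models the generic-compounding approach with one filter instance per channel, while `time_sliced` models a single shared processing unit that visits the channels in turn and keeps only per-channel state. A simple moving average stands in for the AMaMeMi filter, whose details are not given in the abstract; all names, the tap count, and the sample data are assumptions.

```python
class MovingAverage:
    """Stand-in per-channel filter (NOT the AMaMeMi filter)."""

    def __init__(self, taps):
        self.taps = taps
        self.history = []

    def step(self, sample):
        self.history = (self.history + [sample])[-self.taps:]
        return sum(self.history) / len(self.history)

def per_channel(frames, taps):
    """Generic-compounding model: one filter instance per channel."""
    filters = [MovingAverage(taps) for _ in frames[0]]
    return [[f.step(x) for f, x in zip(filters, frame)] for frame in frames]

def time_sliced(frames, taps):
    """Time-sliced model: one shared unit, per-channel state memory."""
    n_channels = len(frames[0])
    state = [[] for _ in range(n_channels)]

    def shared_unit(history, sample):
        # The single compute unit; it holds no channel-specific state.
        history = (history + [sample])[-taps:]
        return history, sum(history) / len(history)

    outputs = []
    for frame in frames:
        row = []
        for ch in range(n_channels):  # multiplexer: one channel per slot
            state[ch], y = shared_unit(state[ch], frame[ch])
            row.append(y)             # demultiplexer: route result back
        outputs.append(row)
    return outputs

# frames[t][ch]: three channels sampled at four time steps.
frames = [[1.0, 10.0, -2.0], [3.0, 10.0, 0.0], [5.0, 4.0, 2.0], [7.0, 4.0, 4.0]]
same = per_channel(frames, taps=2) == time_sliced(frames, taps=2)
```

Comparing the two outputs mirrors the verification step described in the abstract, where the time-sliced design is checked against the generic-compounding one. The hardware trade-off is that the shared unit must run at least as many times faster as there are channels.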