ISBN: 9781424438464 (print)
This paper presents a low-cost FPGA-based solution for a real-time moving-object tracking system. A specialized architecture is presented, based on a soft RISC processor capable of running the kernel-based mean shift tracking algorithm. The system includes a frame grabber unit that stores each video frame in DDR RAM using direct memory access, a video display unit for monitoring the tracking statistics, and a soft processor capable of running the mean shift tracking algorithm within the required time constraint.
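As a rough illustration of the mean shift step the soft processor would execute, the following sketch moves a square window to the weighted centroid of the pixels it covers until the shift becomes small. It is a simplification under stated assumptions: the abstract does not give the kernel, the target model, or the weight computation, so the function name `mean_shift`, the flat (uniform) kernel, and the precomputed per-pixel `weights` map are all hypothetical.

```python
def mean_shift(weights, cx, cy, half, max_iter=20, eps=0.5):
    """Shift a (2*half+1)-wide window toward the weighted centroid.

    weights : 2D list of non-negative per-pixel likelihoods (assumed to be
              computed elsewhere, e.g. from a colour histogram).
    cx, cy  : starting window centre (x = column, y = row).
    Returns the converged centre as floats.
    """
    height, width = len(weights), len(weights[0])
    for _ in range(max_iter):
        total = sum_x = sum_y = 0.0
        # Visit only the pixels inside the window, clipped to the frame.
        for y in range(max(0, int(cy) - half), min(height, int(cy) + half + 1)):
            for x in range(max(0, int(cx) - half), min(width, int(cx) + half + 1)):
                w = weights[y][x]
                total += w
                sum_x += w * x
                sum_y += w * y
        if total == 0:
            break  # nothing to follow inside the window
        new_cx, new_cy = sum_x / total, sum_y / total
        moved = abs(new_cx - cx) + abs(new_cy - cy)
        cx, cy = new_cx, new_cy
        if moved < eps:
            break  # converged
    return cx, cy
```

Each iteration touches only the pixels under the window, which is one reason this kind of tracker can fit a per-frame time budget on a modest processor.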
Dynamic reconfiguration of FPGAs allows the dynamic management of the various tasks that make up an application. For optimization purposes, this feature allows tasks to be placed online in an available region of the FPGA. Dynamic reconfiguration of tasks leads to communication problems, since tasks are not present in the matrix for the whole computation time. This dynamicity needs to be supported by the interconnection network. In this paper, we propose the implementation of a flexible interconnection network supporting such dynamicity. The proposed architecture is fully compliant with present state-of-the-art dynamically reconfigurable circuits such as the Xilinx Virtex family of FPGAs.
In this paper we present a radix $r = 2^k$ divider for fixed-point operands. The divider operates in radix $r = 2^k$, producing k bits at each iteration. The proposed digit-recurrence algorithm has two architectures: one for general hardware implementation and one optimized for configurable logic (FPGAs). Results show a speedup of more than three times over a classical non-restoring divider implemented in Xilinx devices. Additionally, a throughput-latency-area comparison of pipelined and sequential divider implementations is presented.
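To make "producing k bits at each iteration" concrete, here is a minimal behavioural sketch of a radix-2^k digit-recurrence division on unsigned integers. It is not the paper's algorithm: the abstract does not describe its digit-selection function, so this sketch assumes a simple restoring-style selection that picks the largest multiple of the divisor not exceeding the partial remainder. The function name and parameters are hypothetical.

```python
def radix_2k_divide(dividend, divisor, n_bits, k):
    """Divide unsigned integers, retiring k quotient bits per iteration.

    n_bits is the operand width and must be a multiple of k.
    Returns (quotient, remainder).
    """
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    if not 0 <= dividend < (1 << n_bits):
        raise ValueError("dividend does not fit in n_bits")
    if k <= 0 or n_bits % k != 0:
        raise ValueError("n_bits must be a positive multiple of k")

    radix_mask = (1 << k) - 1
    remainder = 0
    quotient = 0
    # One iteration per radix-2^k digit, most significant digit first.
    for i in range(n_bits // k - 1, -1, -1):
        digit_in = (dividend >> (i * k)) & radix_mask
        remainder = (remainder << k) | digit_in
        # Digit selection (assumed): largest q with q * divisor <= remainder.
        # The previous remainder was < divisor, so q always fits in k bits.
        q = 0
        while (q + 1) * divisor <= remainder:
            q += 1
        remainder -= q * divisor
        quotient = (quotient << k) | q
    return quotient, remainder
```

With `n_bits = 12` and `k = 4` the loop runs three times instead of the twelve iterations a radix-2 divider would need, which is the source of the latency reduction; the cost is the more complex digit selection in each step.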
In the design of embedded systems for security applications, flexibility and tamper resistance are two important factors. The high frequency of updates, together with the high cost and long design time of ASICs, motivates the use of a secure FPGA as an alternative. In this paper, a secure FPGA is proposed for the secure implementation of crypto devices. The FPGA architecture is based on an asynchronous methodology and is resistant to multiple side-channel attacks such as power attacks and fault attacks. An implementation of the AES algorithm shows the native resistance of SCAR-FPGA.
Low-cost FPGAs have a number of Configurable Logic Blocks (CLBs) comparable to that of resource-rich FPGAs but far fewer routing tracks. This makes it difficult for CAD tools to map a circuit onto these devices successfully and optimally. Instead of switching to resource-rich FPGAs, designers can employ a depopulation-based clustering technique, which underuses CLBs and hence improves routability by spreading the logic over the architecture. However, all depopulation-based clustering algorithms to date increase critical path delay. In this paper, we present a timing-driven, non-uniform, depopulation-based clustering technique, T-NDPack, that targets critical path delay and channel width constraints simultaneously. We adjust the capacity of each CLB based on the criticality of the logic block. The paper analyzes the effect of depopulation strategies on area and delay performance. Results show that T-NDPack reduces minimum channel width by 11.07% while increasing the number of CLBs by 13.28%. More importantly, T-NDPack decreases critical path delay by 2.89%.
Modern embedded systems may consist of many devices with complex interconnections between them. Many of these devices may also need to be configured and programmed during development and production. The JTAG port and boundary scan techniques are the industry standard for that task, but are not supported by all devices. This paper presents the use of an FPGA as the means of interconnection between all devices and as a single configuration interface for the system. The flexible, reconfigurable interconnections in the FPGA allow for faster development times. The FPGA also provides a single, standard JTAG interface to the user during development and production. The FPGA acts as a bridge between the standard programming interface and proprietary programming protocols and interfaces, easing development and simplifying the programming task during production and in the field.
This paper presents the design and implementation on FPGA devices of an algorithm for computing the similarity between neighboring frames in a video sequence using luminance information. Making use of the well-known flexibility of reconfigurable logic devices, we have designed a hardware implementation of the algorithm, which is used in video segmentation and indexing. The experimental work has established a tradeoff between concurrent sequential resources and functional blocks in order to achieve maximum operation speed with minimum silicon area. To evaluate the efficiency of the designed system, we have compared the performance of the hardware solution with that of software calculations on general-purpose processors with and without the MMX extension.
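The abstract does not say which luminance-based similarity measure is used. As one plausible software baseline of the kind the hardware is compared against, the sketch below computes a luminance histogram per frame and scores two frames by histogram intersection. The measure, the bin count, and all function names are assumptions for illustration; only the ITU-R BT.601 luma weights are standard.

```python
def luminance(r, g, b):
    """Luma of an 8-bit RGB pixel using the ITU-R BT.601 weights."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def luma_histogram(frame, bins=16):
    """Histogram of luma values for a frame given as rows of (r, g, b)."""
    hist = [0] * bins
    for row in frame:
        for r, g, b in row:
            index = min(int(luminance(r, g, b) * bins / 256), bins - 1)
            hist[index] += 1
    return hist


def frame_similarity(frame_a, frame_b, bins=16):
    """Histogram intersection in [0, 1]; 1.0 means identical histograms."""
    hist_a = luma_histogram(frame_a, bins)
    hist_b = luma_histogram(frame_b, bins)
    total = sum(hist_a)
    if total == 0 or total != sum(hist_b):
        raise ValueError("frames must be non-empty and the same size")
    return sum(min(a, b) for a, b in zip(hist_a, hist_b)) / total
```

A shot boundary detector would typically flag a cut when the similarity between consecutive frames drops below a threshold; the per-pixel accumulation is regular and independent across pixels, which is what makes this kind of computation a natural fit for parallel hardware.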
Decimal multiplication is an integral part of financial, commercial, and internet-based computations. This paper presents a novel double digit decimal multiplication (DDDM) technique that performs two digit multiplications simultaneously in one clock cycle. The design offers low latency and high throughput. When multiplying two n-digit operands to produce a 2n-digit product, the design has a latency of $\lceil (n/2) + 1 \rceil$ cycles. The paper presents area and delay comparisons for 7-digit, 16-digit, and 34-digit double digit decimal multipliers on different families of Xilinx, Altera, Actel, and QuickLogic FPGAs. The multipliers presented can be extended to support decimal floating-point multiplication for the IEEE P754 standard.
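The latency figure can be checked with a small behavioural model. The sketch below consumes two decimal digits of the multiplier per modelled cycle and then counts one extra cycle, giving $\lceil n/2 \rceil + 1$, which equals the stated $\lceil (n/2) + 1 \rceil$. What the extra cycle does in the real design is not stated in the abstract; it is added here only so the count matches the formula, and the function name is hypothetical.

```python
def dddm_model(a, b, n):
    """Multiply two n-digit decimal operands, two multiplier digits per cycle.

    Returns (product, cycles). Purely behavioural: it models the cycle
    count, not the BCD datapath.
    """
    if not (0 <= a < 10 ** n and 0 <= b < 10 ** n):
        raise ValueError("operands must be non-negative n-digit integers")
    product = 0
    cycles = 0
    shift = 0
    for _ in range((n + 1) // 2):      # ceil(n / 2) iterations
        digit_pair = b % 100           # two least significant digits
        product += a * digit_pair * 10 ** shift
        b //= 100
        shift += 2
        cycles += 1
    cycles += 1                        # assumed final cycle (see lead-in)
    return product, cycles
```

For the operand sizes named in the abstract this gives 5 cycles for 7 digits, 9 cycles for 16 digits, and 18 cycles for 34 digits.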
This paper proposes techniques for accelerating a software-based image registration algorithm for 3D medical images, targeting a reconfigurable hardware platform. Various methods, including dedicated fixed-point arithmetic, error-model-based bit-width analysis, architecture exploration, and application-specific memory modules, are applied to address issues arising from the software algorithm and to maximize the performance of FPGA technology. Because FPGA devices are reconfigurable, the system can be extended to swap in modules optimized for different parameters and to adopt more advanced registration algorithms. We show that a single core on a 412 MHz XC5VLX330T FPGA can evaluate a rigid transformation of a 3D image with 16 million voxels in 35 ms. With 30 cores on an FPGA, it is over 108 times faster than a multi-threaded implementation running on a 2.5 GHz Intel Quad-Core Xeon platform.
Cube, a massively parallel FPGA-based platform, is presented. The machine is built from boards that each contain 64 FPGA devices; eight boards can be connected in a cube structure for a total of 512 FPGA devices. With high-bandwidth systolic inter-FPGA communication and a flexible programming scheme, the result is a low-power, high-density, and scalable supercomputing machine suitable for various large-scale parallel applications. An RC4 key search engine was built as a demonstration application. In a fully implemented Cube, the engine can perform a full search of the 40-bit key space within 3 minutes, which is 359 times faster than a multi-threaded software implementation running on a 2.5 GHz Intel Quad-Core Xeon processor.
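A software sketch helps put the key search figures in context. The code below implements standard RC4 keystream generation and a brute-force search over candidate 40-bit keys, plus the arithmetic for the aggregate rate implied by covering the whole key space in 3 minutes. The search interface (`search_40bit_key`, matching on a known keystream prefix, big-endian key encoding) is an assumption; the abstract does not describe how the engine tests candidates.

```python
def rc4_keystream(key: bytes, n: int) -> bytes:
    """Return the first n bytes of the RC4 keystream for the given key."""
    s = list(range(256))
    j = 0
    for i in range(256):                      # key-scheduling algorithm
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for _ in range(n):                        # pseudo-random generation
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) % 256])
    return bytes(out)


def search_40bit_key(target: bytes, limit: int = 1 << 40):
    """Try keys 0 .. limit-1 (5 bytes, big-endian) against a known keystream."""
    for candidate in range(limit):
        key = candidate.to_bytes(5, "big")
        if rc4_keystream(key, len(target)) == target:
            return candidate
    return None


def required_rate(key_bits: int = 40, seconds: float = 180.0) -> float:
    """Keys per second needed to cover the whole key space in the given time."""
    return 2 ** key_bits / seconds
```

Covering 2^40 keys in 180 seconds requires roughly 6.1 × 10^9 keys per second in aggregate, or about 1.2 × 10^7 per device if the work were split evenly over 512 FPGAs. Because the abstract says "within 3 minutes", these are lower bounds on the actual rate. Each candidate key is independent of every other, so the search partitions trivially across devices.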