ISBN: 9781424403127 (print)
The PhD project described in this paper aims to use word-length optimization techniques to automatically optimize the dynamic power consumption of high-level descriptions of DSP algorithms intended for implementation on FPGA, before or during synthesis. By developing models which can quickly estimate the power consumed by a system from a high-level description of the algorithm it implements, our work will allow existing word-length optimization techniques to minimize the power consumption of a system, subject to acceptable signal distortion constraints.
FPGAs have become an attractive choice for scientific computing. In this paper, we propose a high performance design for LU decomposition, a key kernel in many scientific and engineering applications. Our design achieves the optimal performance for LU decomposition using the available hardware resources. The design is parameterized; thus, it can be easily adapted to various hardware constraints. Experimental results show that our design achieves high performance and offers good scalability. Our implementation on a Xilinx Virtex-II Pro XC2VP100 achieves superior sustained floating-point performance over existing FPGA-based implementations and optimized libraries on state-of-the-art processors.
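For readers unfamiliar with the kernel itself, the following is a minimal software sketch of textbook LU decomposition (Doolittle form, no pivoting). It is not the hardware design described in the abstract; the function name `lu_decompose` and the choice to omit row exchanges are assumptions made purely for illustration.

```python
def lu_decompose(a):
    """Doolittle LU factorization without pivoting, so that A = L * U.

    `a` is a square matrix given as a list of rows. Returns (lower, upper),
    where `lower` has a unit diagonal. Illustrative sketch only.
    """
    n = len(a)
    lower = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    upper = [[float(x) for x in row] for row in a]
    for k in range(n):
        pivot = upper[k][k]
        if pivot == 0.0:
            # A production routine would swap rows (partial pivoting) here.
            raise ZeroDivisionError("zero pivot; this sketch does no row exchange")
        for i in range(k + 1, n):
            factor = upper[i][k] / pivot
            lower[i][k] = factor
            # Eliminate column k from row i.
            for j in range(k, n):
                upper[i][j] -= factor * upper[k][j]
    return lower, upper
```

The innermost update over `j` is independent across rows `i`, which is the kind of parallelism a hardware implementation can exploit.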
Domain-specific design flows can enable an efficient path to implementation, as well as making the design process intuitive and the designs reusable. When targeting FPGAs, there are few techniques in high level synthesis that enable thorough exploration of the inherent flexibility of the FPGA fabric as an implementation medium. In this paper, we propose a new methodology, based on micro-coded data paths, that enables design space exploration of processing engine architectures implemented in programmable logic that range from a fixed finite state machine to a soft processor. As a use case, these processing engines can be embedded within programmable logic threads that are used to carry out network packet processing. We demonstrate the application of this methodology on a network address translation application, and show that micro-coded data paths indeed enable both human designers and automated tools to explore the design space in a structured way, thus exploiting the full potential of the FPGA technology.
This paper presents preliminary work exploring adaptive field-programmable gate arrays (AFPGAs). An AFPGA is adaptive in the sense that the functionality of subcircuits placed on the chip can change in response to changes observed on certain control signals. We describe the high-level architecture, which adds control logic and SRAM bits to a traditional FPGA to produce an AFPGA. We also describe a synthesis method that identifies and resynthesizes mutually exclusive pieces of logic so that they may share the resources available in an AFPGA. The architectural feature and its associated synthesis method help reduce circuit size by 28% on average and up to 40% on select circuits.
In recent years FPGAs have become very important for electronic designs: they are very flexible, provide high configurability and allow short turnaround times. Especially for Rapid Prototyping (RP), another feature plays an important role: the nearly infinite reprogrammability. However, handling these devices in the engineering process is not easy. Therefore our approach presents an efficient, flexible and versatile FPGA configuration methodology based on partial bitstream merging at design time.
Block matching motion estimation takes a great part of the processing time for video encoding. Accelerating this process is a must to reach real-time video coding. The best motion vector is obtained by the full-search block matching algorithm, which usually has to be implemented in hardware. In recent years, several FPGA-based designs have been proposed, since these devices support a high number of processing elements operating in parallel. In this paper, a survey of recent architectures that perform the full-search block matching algorithm in FPGAs is presented. A further comparison in terms of frames per second reached, hardware cost in CLB slices and system frequency is presented.
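As a reference point for the algorithm being surveyed, here is a minimal software sketch of full-search block matching. It assumes the sum of absolute differences (SAD) as the matching cost, which is a common choice but is not stated in the abstract; the names `sad` and `full_search` and the frame layout (lists of pixel rows) are likewise illustrative assumptions.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search):
    """Exhaustively test every displacement in [-search, search]^2 and
    return (cost, dx, dy) for the lowest-cost candidate, or None if no
    candidate block fits inside the reference frame."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if by + dy < 0 or by + dy + n > height:
                continue
            if bx + dx < 0 or bx + dx + n > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

Every candidate displacement is evaluated independently, so the SAD computations can run side by side; this is what makes the algorithm a natural fit for many parallel processing elements.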
We describe architectural enhancements to Xilinx FPGAs that provide better support for the creation of dynamically reconfigurable designs. These are augmented by a new design methodology that uses pre-routed IP cores for communication between static and dynamic modules and permits static designs to route through regions otherwise reserved for dynamic modules. A new CAD tool flow to automate the methodology is also presented. The new tools initially target the Virtex-II, Virtex-II Pro and Virtex-4 families and are derived from Xilinx's commercial CAD tools.
This paper examines various activity estimation techniques in order to determine which are most appropriate for use in the context of field-programmable gate arrays (FPGAs). Specifically, the paper compares how different activity estimation techniques affect the accuracy of FPGA power models and the ability of power-aware FPGA CAD tools to minimize power. After comparing various existing techniques, the most suitable existing techniques are combined with two novel enhancements to create a new activity estimation tool called ACE-2.0. Finally, the new publicly available tool is compared to existing tools to validate the improvements. Using activities estimated by ACE-2.0, the power estimates and power savings were both within 1% of the results obtained using simulated activities.
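To make the term "simulated activities" concrete, the sketch below measures a signal's switching activity from a simulated bit trace and plugs it into the standard CMOS dynamic power estimate P = 0.5 * alpha * C * Vdd^2 * f. This is a generic illustration, not the method used by ACE-2.0; the function names and parameters are hypothetical.

```python
def switching_activity(trace):
    """Average number of transitions per clock cycle for one signal,
    given its sampled logic values (one 0/1 sample per cycle)."""
    if len(trace) < 2:
        return 0.0
    toggles = sum(1 for prev, cur in zip(trace, trace[1:]) if prev != cur)
    return toggles / (len(trace) - 1)


def dynamic_power(activity, capacitance_f, vdd_v, freq_hz):
    """Standard CMOS estimate of dynamic power for one net:
    P = 0.5 * activity * C * Vdd^2 * f (watts, given SI inputs)."""
    return 0.5 * activity * capacitance_f * vdd_v ** 2 * freq_hz
```

An estimation tool replaces the simulated trace with a predicted activity value, so its quality can be judged by how closely the resulting power figure tracks the simulation-based one.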
Configurable architectures offer the unique opportunity of customizing the storage allocation to meet specific applications' needs. In this paper we describe a compiler approach to map the arrays of a loop-based computation to internal memories of a configurable architecture with the objective of minimizing the overall execution time. We present an algorithm that considers the data access patterns of the arrays along the critical path of the computation as well as the available storage and memory bandwidth. We demonstrate experimental results of the application of this approach for a set of kernel codes when targeting a field-programmable gate array (FPGA). The results reveal that our algorithm outperforms naive and custom data layouts for these kernels by an average of 33% and 15%, respectively, in terms of execution time, while taking into account the available hardware resources.
The affective content of a video is defined as the expected amount and type of emotion that are contained in a video. Utilizing this affective content will extend the current scope of application possibilities. The dimensional approach to representing emotion can play an important role in the development of an affective video content analyzer. The three basic affect dimensions are defined as valence, arousal and control [1]. This paper presents a novel FPGA-based system for modeling the arousal content of a video based on user saliency and film grammar. The design is implemented on a Xilinx Virtex-II XC2V6000 on board an RC300 board and it runs 25 times faster than a Pentium 4-based PC at 3.4 GHz.