ISBN: 193241505X (print)
In this paper we propose an original and fast design space exploration method targeting reconfigurable architectures. This method takes place during the first steps of a design flow that works at the algorithmic level. It takes a high-level specification of the application as input and relies on a functional model to describe the architectures being compared. This paper describes the projection step of the flow and presents an allocation heuristic based on reducing communication costs.
ISBN: 193241505X (print)
Reconfigurable computing architectures aim to dynamically adapt their hardware to the application at hand. As research shows, the time it takes to reconfigure the hardware forms an overhead that can significantly impair the benefits of hardware customization. Multi-context devices are one promising approach to overcome the limitations posed by long reconfiguration times. In contrast to more traditional reconfigurable architectures, multi-context devices hold several configurations on-chip. On demand, these devices quickly switch to another context. In this paper we present a co-simulation environment to investigate design trade-offs for hybrid multi-context architectures. Our architectural model comprises a reconfigurable unit closely coupled to a CPU core. As a case study, we discuss the implementation of an FIR filter partitioned into several contexts. We outline the mapping process and present simulation results for single- and multi-context reconfigurable units coupled with embedded and superscalar CPUs.
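The abstract does not say how the FIR filter is split across contexts. As a rough illustration only, the Python sketch below assumes one possible scheme: the taps are divided into groups, each group stands in for one context, and every context adds its partial sums into a shared output buffer. The function names and the `taps_per_context` parameter are hypothetical and are not taken from the paper.

```python
def fir(samples, taps):
    """Reference FIR filter: y[n] = sum over k of taps[k] * samples[n - k]."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * samples[n - k]
        out.append(acc)
    return out


def fir_by_contexts(samples, taps, taps_per_context):
    """Same filter, evaluated one tap group at a time.

    Assumption: each tap group models one hardware context. Only one
    group is "loaded" at a time, and its partial sums are accumulated
    into the output buffer before the next group is processed.
    """
    out = [0] * len(samples)
    for start in range(0, len(taps), taps_per_context):
        context_taps = taps[start:start + taps_per_context]
        for n in range(len(samples)):
            for j, h in enumerate(context_taps):
                idx = n - (start + j)  # absolute tap index is start + j
                if idx >= 0:
                    out[n] += h * samples[idx]
    return out
```

Because the FIR sum is linear in the taps, the grouped evaluation gives the same output as the reference filter for any group size. What changes in hardware is the balance between area per context and the number of context switches.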
ISBN: 193241505X (print)
This paper presents an innovative, application-adaptive modeling methodology for variable precision floating-point (FP) computing. Implementing floating-point applications on custom devices can be costly. Since implementation cost can be traded off against precision (represented by mantissa bitwidth) for some FP applications, automatically obtaining the appropriate reduced bitwidth leads to a low-cost implementation of FP computation. The methodology introduced in this paper targets model-based bitwidth optimization. Arithmetic models developed using the methodology can be used in optimization algorithms to compute the minimal bitwidth set. This differs from most existing methodologies, which are simulation-based. The methodology makes use of a behavioral representation of the application and a behavioral profiling process to obtain the necessary parameters. Two examples are introduced to demonstrate the usage and accuracy of the modeling methodology.
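To make the precision side of this trade-off concrete, the Python sketch below rounds a double to a chosen number of mantissa (fraction) bits and reports the resulting relative error. It is not the paper's model-based methodology. It only illustrates what a reduced mantissa bitwidth does to a single value, and the function names are hypothetical.

```python
import math


def round_to_mantissa_bits(x: float, bits: int) -> float:
    """Round x to the nearest value with `bits` explicit mantissa bits.

    The implicit leading 1 is not counted, so bits=23 corresponds to the
    IEEE 754 single-precision mantissa (exponent range is not limited here).
    """
    if x == 0.0 or not math.isfinite(x):
        return x
    m, e = math.frexp(x)      # x == m * 2**e with 0.5 <= |m| < 1
    scale = 2 ** (bits + 1)   # leading bit plus `bits` fraction bits
    return math.ldexp(round(m * scale) / scale, e)


def relative_error(x: float, bits: int) -> float:
    """Relative error introduced by rounding x to `bits` mantissa bits."""
    return abs(round_to_mantissa_bits(x, bits) - x) / abs(x)
```

For any nonzero finite input the relative error is bounded by 2^-(bits+1), so each mantissa bit removed at most doubles the worst-case rounding error of a single operation.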
ISBN: 1932415424 (print)
The Dynamo system provides a flexible hardware/software runtime environment for image processing applications. Many image processing applications consist of a series of algorithms applied to the image, forming a computation pipeline. Image processing algorithms can be implemented using reconfigurable hardware, such as Field Programmable Gate Arrays (FPGAs), or in software. Dynamo is a Java-based runtime partitioning system that permits the image analyst to specify the pipeline and an image to be processed. From this specification, Dynamo dynamically selects the most efficient combination of hardware and software component implementations to minimize pipeline runtime, generates the source code for a HW/SW implementation of the pipeline, processes the input image using the generated pipeline, and returns the results to the analyst. This paper describes Dynamo's design and implementation, and presents experimental results. Dynamo chooses a mix of hardware and software implementations to minimize run time for several different image processing applications.
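The abstract does not describe Dynamo's selection algorithm or cost model, and Dynamo itself is Java-based. The Python sketch below is therefore only one hypothetical way to frame the selection problem. It assumes each pipeline stage has an estimated software time and hardware time, that moving the image between software and hardware costs a fixed `transfer_cost`, and that the image starts and ends on the software side. All names are illustrative.

```python
def choose_implementations(stages, transfer_cost):
    """Pick "sw" or "hw" for each stage to minimise total pipeline time.

    stages: list of (sw_time, hw_time) estimates, one pair per stage.
    transfer_cost: assumed fixed cost of moving the image across the
        HW/SW boundary.
    Returns (total_time, choices), where choices is a list of "sw"/"hw".
    """
    # best[impl] = (time so far, choices so far) given the image now lives
    # on side `impl`; the input image starts on the software side.
    best = {"sw": (0, [])}
    for sw_time, hw_time in stages:
        times = {"sw": sw_time, "hw": hw_time}
        new_best = {}
        for impl, run_time in times.items():
            candidates = []
            for prev, (cost, choices) in best.items():
                hop = transfer_cost if prev != impl else 0
                candidates.append((cost + hop + run_time, choices + [impl]))
            new_best[impl] = min(candidates)
        best = new_best
    # The result must be returned to the analyst on the software side.
    finals = []
    for impl, (cost, choices) in best.items():
        hop = transfer_cost if impl != "sw" else 0
        finals.append((cost + hop, choices))
    return min(finals)
```

With stages `[(10, 2), (3, 4), (8, 1)]` and a transfer cost of 2, the middle stage stays in hardware even though its software version is faster, because crossing the boundary twice would cost more than it saves.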
ISBN: 9781932415742 (print)
Advances in field programmable gate arrays (FPGAs), which are the platform of choice for reconfigurable computing, have made it possible to use FPGAs in increasingly many areas of computing, including complex scientific applications. These applications demand high-performance, high-precision floating-point arithmetic. Until now, most of the research has not focused on compliance with IEEE standard 754, concentrating instead on custom formats and bitwidths. In this paper, we present double-precision floating-point cores that are parameterized by their degree of pipelining and the features of IEEE standard 754 that they implement. We then analyze the effects of supporting the standard when these cores are used in an FPGA-based accelerator for Lennard-Jones force and potential calculations that are part of molecular dynamics (MD) simulations.
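For reference, the standard 12-6 Lennard-Jones pair potential is V(r) = 4ε[(σ/r)^12 - (σ/r)^6], and the magnitude of the corresponding force is F(r) = (24ε/r)[2(σ/r)^12 - (σ/r)^6]. The Python sketch below evaluates both in software double precision. It says nothing about how the paper's FPGA cores are organised, and the function name and default parameters (reduced units, ε = σ = 1) are assumptions made for illustration.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Return (potential, force) of the 12-6 Lennard-Jones pair interaction.

    r: distance between the two particles (must be positive).
    The force is -dV/dr, so a positive value means repulsion.
    """
    if r <= 0:
        raise ValueError("distance r must be positive")
    sr6 = (sigma / r) ** 6
    sr12 = sr6 * sr6  # reuse the sixth power instead of recomputing
    potential = 4.0 * epsilon * (sr12 - sr6)
    force = 24.0 * epsilon * (2.0 * sr12 - sr6) / r
    return potential, force
```

At r = σ the potential is zero. At r = 2^(1/6) σ the potential reaches its minimum of -ε and the force vanishes, which makes these two points convenient sanity checks for a hardware implementation.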
ISBN: 193241505X (print)
The need for flexible computational power has motivated many researchers to incorporate run-time reconfigurable logic into their architectures. Most contemporary experiments include commercial FPGAs serving as reconfigurable hardware. Unfortunately, the FPGA does not exhibit the same run-time flexibility as the Instruction Set Processor (ISP), e.g. when it comes to ease and speed of setting up a task. In addition, FPGAs tend to be less suited than traditional ISPs to accommodate control-flow dominated tasks. It is possible to alleviate some of these issues by using a reconfiguration hierarchy (e.g. placing and configuring an ASIP or coarse-grain reconfigurable block into the FPGA). This paper illustrates how our operating system transparently manages the complexity of hierarchical reconfiguration. In addition, this paper highlights the benefits and drawbacks of employing multiple hierarchical levels of configuration. As a proof of concept, we developed a filtering application on top of an in-house 16-bit micro-controller and a parameterizable filter block, both instantiated inside an FPGA.
ISBN: 193241505X (print)
Topic Area: Software. Implementation details and performance measurements for a brute-force RC5 keycrack application executing on a cluster containing commodity reconfigurable hardware are presented. The purpose of the application is to gauge the maximum real-world speedups attainable on a metacomputer that forms the underlying execution platform. The operation of the metacomputer and its associated tools, designed to target cluster-based distributed reconfigurable hardware in a high-level manner, is discussed in detail.
ISBN: 1932415424 (print)
This paper presents experimental results on extraction of common tasks or core clusters in Control Data Flow Graphs (CDFGs) of applications, to embed them in a Hybrid-FPGA environment. After removing common sub-graphs from the CDFG, the remaining computations are implemented on a LUT-based reconfigurable area. A new LUT-based packing mechanism using live-in live-out variable analysis and scheduling information is introduced as part of the routing architecture design methodology [1]. We conducted experiments on MPEG-4, GNU Scientific, Biochemical and Molecular modeling libraries. A map report based on the Spartan 2E architecture was obtained. Results show that partial reconfiguration with the use of computation cores embedded in a sea of LUTs offers the potential for massive savings in gate density and switching requirements by eliminating the need for unnecessary and redundant sub-circuit pattern configurations.
ISBN: 193241505X (print)
This paper describes an experimental platform, optimised for Active Network processing. The system is based on a PC running Linux and operating as a router. The active applications are downloaded to an FPGA-based PCI board. An embedded hardware module monitors and manages the active applications at run time. The objective of this investigation is a stable and safe reconfigurable platform for the active applications.