ISBN: 9781424410590 (print)
Modern FPGAs' parallel computing capability and their ability to be reconfigured make them an ideal platform for building accelerators for supercomputing systems. As a multicore processor, the recently announced Cell Broadband Engine™ offers tremendous computing power. In this paper, we introduce a prototype system that combines these two types of computing devices in a reconfigurable blade, and we describe its architecture, memory system and abundant interfaces. On the reconfigurable blade it is desirable that the FPGA devices can be partially reconfigured at run-time. This paper presents the dynamic partial reconfiguration (DPR) technique and its design flow for the reconfigurable blade. We report our experimental results of the blade performing partial reconfiguration. DPR allows the reconfigurable blade to be a powerful, run-time changeable computing engine. A sample application is presented that was both simulated for the Cell processor and dynamically loaded to run on the FPGA.
ISBN: 9781424410590 (print)
This paper presents a novel architecture for domain-specific FPGA devices. This architecture can be optimised for both speed and density by exploiting domain-specific information to produce efficient reconfigurable logic with multiple granularity. In the reconfigurable logic, general-purpose fine-grained units are used for implementing control logic and bit-oriented operations, while domain-specific coarse-grained units and heterogeneous blocks are used for implementing datapaths; the precise amount of each type of resource can be customised to suit specific application domains. Issues and challenges associated with the design flow and the architecture modelling are addressed. Examples of the proposed architecture for speeding up floating point applications are illustrated. Current results indicate that the proposed architecture can achieve a 2.5 times improvement in speed and an 18 times reduction in area on average, when compared with traditional FPGA devices on selected floating point benchmark circuits.
ISBN: 9781424410590 (print)
Since the 1990s, reusable functional blocks, well known as IP-Cores, have been integrated on one silicon die. These Systems-on-Chip (SoC) used a bus-based system for inter-module communication. Technology, performance and flexibility issues require the introduction of a novel communication system called Network-on-Chip (NoC). This method was introduced around 1999 and has since been investigated by several research groups with the aim of connecting different IP-Cores through an effective, flexible and scalable communication network. Exploiting the flexibility of FPGAs, run-time adaptivity through run-time reconfiguration opens a new area of research by considering dynamic and partial reconfiguration. Since software parts of an electronic system can also be included in reconfigurable hardware by integrating IP-based microcontrollers, the reconfigurable architecture provides a flexible, multi-adaptive heterogeneous platform for HW/SW co-designs. This paper presents an approach for exploiting dynamic and partial reconfiguration with Xilinx Virtex-II FPGAs for an adaptive circuit-switched Network-on-Chip, and the related techniques for adapting the system during run-time to the requirements of the presented image processing application.
ISBN: 9789090304281 (print)
In this paper we present a framework for the seamless utilization of hardware accelerators in heterogeneous SoCs that are used to speed up the processing of Spark data analytics applications.
ISBN: 9781424410590 (print)
This paper describes a correlator that is optimized for the Xilinx Virtex-4 SX FPGA, and its application in the SKAMP radio telescope at the Molonglo Radio Observatory. The digital backend of the SKAMP telescope consists of more than 800 Virtex-4 FPGAs. Correlation is performed between every pairing of antenna inputs, so the SKAMP telescope, with its 384 inputs, has approximately 74,000 antenna correlations; with 100 MHz of input bandwidth from each antenna, this requires real-time processing of more than 7 tera complex multiply-accumulates per second. The correlation cell described takes advantage of the hard IP blocks found within the Virtex-4 FPGA to perform one 4+4-bit complex correlation per cycle at a clock rate exceeding 256 MHz. At the core of each cell is an efficient 4-bit signed complex multiplier, implemented using the 18-bit signed multiplier of the Virtex-4 DSP slice, and a short-term accumulator, implemented using the adjacent Block RAM. Nearly 30,000 correlation cells are instantiated across 192 Virtex-4 SX35 devices in order to process all the data from the SKAMP telescope.
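The figures quoted in this abstract can be checked with a short back-of-the-envelope calculation. The sketch below is illustrative only and rests on two assumptions that the abstract does not state: that the roughly 74,000 correlations count every input pair including each input paired with itself, and that each correlation needs one complex multiply-accumulate per hertz of bandwidth per second. The function name `correlation_budget` is hypothetical.

```python
def correlation_budget(n_inputs, bandwidth_hz, cell_clock_hz):
    """Estimate pair count, CMAC rate, and correlation cells needed.

    Assumptions (not from the abstract): pairs include autocorrelations,
    and each pair needs one complex multiply-accumulate (CMAC) per Hz of
    bandwidth per second.
    """
    # Number of unordered pairs, including each input with itself.
    n_pairs = n_inputs * (n_inputs + 1) // 2
    # Total complex multiply-accumulates required per second.
    cmac_per_s = n_pairs * bandwidth_hz
    # Cells needed if each cell does one CMAC per clock cycle (ceiling division).
    cells = -(-cmac_per_s // cell_clock_hz)
    return n_pairs, cmac_per_s, cells


pairs, rate, cells = correlation_budget(384, 100_000_000, 256_000_000)
print(pairs)  # 73920, i.e. "approximately 74,000"
print(rate)   # 7392000000000, i.e. "more than 7 tera" CMAC/s
print(cells)  # 28875, i.e. "nearly 30,000" cells at 256 MHz
```

Under these assumptions the three quoted figures are mutually consistent: about 74,000 pairs at 100 MHz give roughly 7.4 tera CMAC/s, which needs just under 29,000 cells clocked at 256 MHz.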
ISBN: 9781467381239 (print)
We propose embedding hard NoCs on FPGAs to improve system-level communication, as detailed in our previous studies [1-6]. This demo paper outlines the three main design and simulation tools that we have been using to experiment with embedded NoCs on FPGAs.
ISBN: 9781424410590 (print)
Many automatic algorithms have been proposed for analyzing magnetic resonance imaging (MRI) data sets. These algorithms allow clinical researchers to generate quantitative data analyses with consistently accurate results. With the increasingly large data sets being used in brain mapping, there has been a significant rise in the need for methods to accelerate these algorithms, as their computation time can consume many hours. This paper presents the results from a recent study on implementing such quantitative analysis algorithms on High-Performance Reconfigurable Computers (HPRCs). A brain tissue classification algorithm for MRI, the Partial Volume Estimation (PVE), is implemented on an SGI RASC RC100 system using the Mitrion-C High-Level Language (HLL). The CPU-based PVE algorithm is profiled, and computationally intensive floating-point functions are implemented on FPGA accelerators. The images resulting from the FPGA-based algorithm are compared to those generated by the CPU-based algorithm for verification. The Similarity Indexes (SI) for pure tissues are calculated to measure the accuracy of the images resulting from the FPGA-based implementation. The portion of the PVE algorithm that was implemented in hardware achieved an 11x performance improvement over the CPU-based implementation. The overall performance improvement of the FPGA-accelerated PVE algorithm was 3.5x with four FPGAs.
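The gap between the 11x speedup of the accelerated portion and the 3.5x overall speedup is what Amdahl's law would predict when part of the run time stays on the CPU. The sketch below is a rough illustration, not the authors' analysis: it assumes the two figures can be plugged directly into Amdahl's law, which the abstract does not confirm (for example, it does not say how the four FPGAs contribute to the 11x figure). The function names are hypothetical.

```python
def amdahl_speedup(p, s):
    """Overall speedup when a fraction p of run time is accelerated by s."""
    return 1.0 / ((1.0 - p) + p / s)


def accelerated_fraction(overall, s):
    """Invert Amdahl's law: fraction of run time that must be accelerated
    by factor s to reach the given overall speedup."""
    return (1.0 - 1.0 / overall) / (1.0 - 1.0 / s)


# Assumption: 11x applies to the hardware portion, 3.5x to the whole run.
p = accelerated_fraction(3.5, 11.0)
print(round(p, 4))                      # implied accelerated fraction
print(round(amdahl_speedup(p, 11.0), 4))  # recovers the overall speedup
```

Read this way, about 79% of the original CPU run time would have been moved to hardware, with the remaining fifth or so bounding the overall gain.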
In this paper, we present an improved CORDIC algorithm that reduces logic cell usage in an embedded programmable logic device (EPLD) design. This algorithm improves on the well-known CORDIC algorithm by introducing ...
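For readers unfamiliar with the baseline, the following is a minimal sketch of the classic CORDIC algorithm in rotation mode. It is not the improved variant this abstract refers to, whose details are cut off in the listing. It uses floating point for readability; in hardware the multiplications by `2 ** -i` become bit shifts and the arctangent values come from a small lookup table. The function name `cordic_sin_cos` and the default of 16 iterations are illustrative choices.

```python
import math


def cordic_sin_cos(theta, iterations=16):
    """Classic rotation-mode CORDIC; returns (sin(theta), cos(theta)).

    Converges for |theta| up to roughly 1.74 rad. Floating point is used
    here for clarity; a hardware design would use fixed-point shifts and adds.
    """
    # Precomputed elementary rotation angles atan(2^-i).
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Accumulated gain of the pseudo-rotations, compensated up front.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = 1.0 / gain, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0  # rotate toward zero residual angle
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return y, x


s, c = cordic_sin_cos(0.5)
print(abs(s - math.sin(0.5)) < 1e-3, abs(c - math.cos(0.5)) < 1e-3)
```

Each iteration needs only two shift-and-add updates and one table lookup, which is why the logic cell count of the iteration stage is the natural target for optimization in a programmable logic design.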