ISBN: 3540229892 (print)
A high-performance reconfigurable coarse-grain data-path, part of a hybrid reconfigurable platform, is introduced. The data-path consists of coarse-grain components whose flexibility and universality are shown to increase the system's performance through significant reductions in latency. An automated methodology for mapping applications onto the proposed data-path is also presented. Results on DSP benchmarks show performance improvements of up to 44% over existing high-performance data-paths.
This PhD project examines the reconfiguration of FPGAs at the architectural level in order to develop efficient techniques for supporting application-specific reconfiguration. This abstract presents a technique for reducing the reconfiguration delay of an FPGA application at the time when configuration data is loaded onto the device. Consider an on-chip configuration c1 and a new configuration c2 that we want to load onto the device (c1 and c2 might not span the entire device). The amount of configuration data to load can be reduced if we write only those parts of c2 that are not already present in c1. In particular, a judicious placement of the two configurations can maximise the overlap that we seek to exploit.
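The overlap idea can be illustrated with a minimal sketch. It assumes a toy model that is not taken from the abstract: each configuration is a dictionary from (row, column) cells to configuration words, c2 is anchored at (0, 0), and every placement offset is tried exhaustively. The names `words_to_write` and `best_placement` are hypothetical, and real devices are configured in larger units than single cells.

```python
from itertools import product

def words_to_write(c1, c2, offset):
    """Count the words of c2 that differ from what c1 already holds on-chip."""
    dr, dc = offset
    return sum(1 for (r, c), word in c2.items() if c1.get((r + dr, c + dc)) != word)

def best_placement(c1, c2, rows, cols):
    """Try every offset that keeps c2 on the device; return the cheapest one."""
    height = max(r for r, _ in c2) + 1
    width = max(c for _, c in c2) + 1
    offsets = product(range(rows - height + 1), range(cols - width + 1))
    return min(offsets, key=lambda off: words_to_write(c1, c2, off))

# c1 is already on a 2x4 device; c2 shares three of its four words with c1.
c1 = {(0, 1): "A", (0, 2): "B", (1, 1): "C", (1, 2): "D"}
c2 = {(0, 0): "A", (0, 1): "B", (1, 0): "C", (1, 1): "X"}
best = best_placement(c1, c2, rows=2, cols=4)
```

In this toy case, shifting c2 one column to the right leaves a single word to write, whereas placing it at the origin requires all four.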
Selecting which program transformations to apply when mapping computations to FPGA-based computing architectures can lead to prohibitively long design space exploration cycles. An alternative is to develop fast, yet accurate, performance and area models to quickly understand the impact and interaction of the transformations. In this paper, we present a combined analytical performance and area modeling approach for complete FPGA designs in the presence of loop transformations. Our approach takes into account the impact of input/output memory bandwidth and memory interface resources, which are often the limiting factor in the effective implementation of computations. Our preliminary results reveal that our modeling is very accurate, making it suitable for use in a compiler tool to quickly explore very large design spaces.
AAA is a methodology developed for the fast prototyping of real-time embedded applications, and SynDEx is the software tool based on this methodology. Based on formal transformations, AAA helps the designer to implement signal and image processing algorithms on multicomponent architectures. This includes support for both algorithm and architecture specification, resource allocation and optimization, performance prediction, and multicomponent code generation. Since AAA did not initially support configurable components, this paper presents an extension of AAA/SynDEx for FPGAs. The extension likewise includes support for specification, optimization, performance prediction, and automatic VHDL code generation. This paper focuses on the implementation of this work in SynDEx-Ic.
Due to their flexibility, increased logic density and low design costs, Field-Programmable Gate Arrays (FPGAs) have become a viable option for implementing many kinds of applications, such as custom computing machines, rapid system prototyping, hardware emulation, and IP verification and evaluation. This paper proposes an alternative approach that allows IP providers to deliver their IP to customers for functional evaluation before purchase. IP cores are mapped into SRAM-based FPGA logic and distributed as a bitstream file for a particular device, so that customers can use their FPGA boards to try out the IP as a black-box, pre-verified design component. This paper also presents a simple hardware/software infrastructure and its prototype implementation, which allows for seamless integration of hardware IP into an existing simulation environment. In addition, a case study demonstrates the proposed approach, and some security issues concerning bitstream-level IP distribution are discussed.
ASIPs and reconfigurable processors are architectural choices to extend the capabilities of a given processor. ASIPs suffer from fixed hardware after design, while ASIPs and reconfigurable processors suffer from the lack of a pre-established instruction set, making them difficult to program. As an intermediate choice, reconfigurable coprocessor systems (RCSs) contain dedicated hardware (coprocessors) coupled to a standard processor core to accelerate specific tasks, allowing hardware functionality to be inserted or substituted at execution time. This paper proposes a generic model for RCSs, targeted to reconfigurable devices with self-reconfiguration capabilities. A proof-of-concept case study is presented as well.
In 1964, Elgot and Robinson introduced the Random-Access Stored Program (RASP) machine model "to capture some of the most salient features of the central processing unit of a modern digital computer." After four decades of progress in computer science, this model is now somewhat outdated. Intriguingly though, the 1964 paper presented two theorems showing that programs of 'finitely determined' instructions are properly more powerful if modification of addresses in instructions is permitted during execution than when it is forbidden. In this paper, we celebrate the 40th birthday of these results by using them to prove that allowing programmability of circuits during execution adds extra computational power. To do this, we accord front-line computational status to programmable circuitry, and conduct a theoretical study based on a tradition dating back to Gödel, Turing and Church in the 1930s. In particular, we introduce a new Local Access Stored Circuit (LASC) model of programmable circuitry, intended to form a solid basis for a broad range of future computational research.
We present an evaluation of accelerating fault simulation by hardware emulation on an FPGA. Fault simulation is an important subtask in test pattern generation and is used frequently throughout the test generation process. To evaluate the achievable simulation speed-up, we carried out a feasibility study of using reconfigurable hardware, emulating the circuit under analysis together with fault insertion structures on an FPGA. Experiments showed that emulation is beneficial for circuits and methods that require large numbers of test vectors, e.g., sequential circuits and/or genetic algorithms.
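For readers unfamiliar with the task being accelerated, the following is a minimal software sketch of single stuck-at fault simulation. The three-input circuit, the net names, and the functions `simulate` and `detected_faults` are all hypothetical illustrations. The abstract does not describe the authors' fault insertion structures, so this is not their method.

```python
from itertools import product

NETS = ["a", "b", "c", "n1", "n2", "y"]
# Every net can be stuck at 0 or stuck at 1.
FAULTS = [(net, value) for net in NETS for value in (0, 1)]

def simulate(inputs, fault=None):
    """Evaluate y = (a AND b) OR (NOT c), optionally with one stuck-at fault."""
    def drive(name, value):
        # Fault insertion: a faulty net ignores its driver and holds the stuck value.
        if fault is not None and fault[0] == name:
            return fault[1]
        return value

    a = drive("a", inputs["a"])
    b = drive("b", inputs["b"])
    c = drive("c", inputs["c"])
    n1 = drive("n1", a & b)
    n2 = drive("n2", 1 - c)
    return drive("y", n1 | n2)

def detected_faults(vectors):
    """Return the faults whose output differs from the fault-free circuit."""
    detected = set()
    for vec in vectors:
        good = simulate(vec)
        for fault in FAULTS:
            if simulate(vec, fault) != good:
                detected.add(fault)
    return detected

one_vector = [{"a": 1, "b": 1, "c": 1}]
all_vectors = [dict(zip("abc", bits)) for bits in product((0, 1), repeat=3)]
```

The inner loop runs once per fault per vector, which suggests why workloads with many test vectors are the ones where hardware emulation pays off.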
This paper proposes a new adaptable FPGA logic element based on fracturable 6-LUTs, which fundamentally alters the longstanding belief that a 4-LUT is the most efficient area/delay tradeoff. We describe theory and benchmarking results showing a 15% performance increase with a 12% area decrease vs. a standard BLE4. The ALM structure is one of a number of architectural improvements giving Altera's 90nm Stratix II architecture a 50% performance advantage over its 130nm Stratix predecessor.
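The general idea behind a fracturable LUT can be sketched as follows. The sketch assumes only the textbook view that a k-input LUT is a 2^k-entry truth table and that a 6-input function splits into two 5-input halves selected by the sixth input. It does not model the ALM's actual input sharing or modes, which the abstract does not detail, and all function names are hypothetical.

```python
from itertools import product

def build_table(func, k):
    """Truth table of a k-input function; input i is bit i of the table index."""
    return [func(*[(idx >> i) & 1 for i in range(k)]) for idx in range(2 ** k)]

def lut_eval(table, inputs):
    """Look up a LUT; inputs[0] is the least significant index bit."""
    return table[sum(bit << i for i, bit in enumerate(inputs))]

def fracture(table6):
    """Split a 64-entry table into the two 32-entry halves chosen by input 5."""
    return table6[:32], table6[32:]

def eval_as_6lut(table6, inputs):
    """Use both halves as one 6-input function via a 2:1 mux on input 5."""
    low, high = fracture(table6)
    return lut_eval(high if inputs[5] else low, inputs[:5])

parity6 = build_table(lambda *bits: sum(bits) % 2, 6)

# The same 64 bits can instead hold two unrelated 5-input functions.
and5 = build_table(lambda *bits: int(all(bits)), 5)
or5 = build_table(lambda *bits: int(any(bits)), 5)
packed = and5 + or5
```

Used one way, the 64 bits implement a single 6-input function. Used the other way, the two halves of `packed` act as independent smaller LUTs.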
In this article we present a compact and efficient co-processor that calculates the Advanced Encryption Standard (AES). It implements the whole functionality of the AES algorithm: all key lengths (128-bit, 192-bit, and 256-bit) are supported for both encryption and decryption. Furthermore, it supports the Cipher Block Chaining mode. Due to an innovative AES State representation, the complete AES co-processor is well suited for low-end FPGAs. The integrated AMBA interface also facilitates the integration of the co-processor in System-on-Chip designs. An implementation on a Xilinx Virtex-E FPGA device uses only 1,125 CLB slices and no block RAMs. Our FPGA implementation reaches a throughput of 215 Mbps at a clock frequency of 161.0 MHz.
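The reported throughput and clock frequency imply a rough cycle count per 128-bit AES block. The short calculation below is a derived estimate, not a figure from the abstract. It assumes that Mbps means 10^6 bits per second and that the throughput is sustained; the abstract does not state which key length the figure refers to. The function name `cycles_per_block` is hypothetical.

```python
def cycles_per_block(throughput_bps, clock_hz, block_bits=128):
    """Clock cycles spent per block at a given sustained throughput."""
    return block_bits * clock_hz / throughput_bps

# Figures quoted in the abstract: 215 Mbps at 161.0 MHz.
estimate = cycles_per_block(215e6, 161.0e6)
```

Under these assumptions the estimate is just under 96 cycles per block, which is consistent with a compact design that trades speed for area rather than a fully unrolled one.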