Cognitive radios that support multiple standards and modify their operation depending on environmental conditions are becoming more important as the demand for higher bandwidth and efficient spectrum use increases. Traditional implementations in custom ASICs cannot support such flexibility, as standards change at an ever faster pace, while software baseband implementations fail to achieve the required performance. Hence, FPGAs offer an ideal platform, bringing together flexibility, performance, and efficiency. This work explores possible techniques for designing multi-standard radios on FPGAs, and how partial reconfiguration can be leveraged in a way that is accessible to domain experts with minimal FPGA knowledge.
Hardware fault emulation for Application Specific Integrated Circuits (ASICs) on FPGAs can considerably reduce the time required for fault simulation. This paper presents a methodology to emulate ASIC faults on state-of-the-art FPGAs. The fault emulation follows a fully automated process consisting of constrained technology mapping of the ASIC netlist, creation of a fault dictionary, generation of faulty partial bitstreams, and fault emulation. The proposed approach exploits run-time partial reconfiguration for fault injection and avoids full netlist recompilation. The method's feasibility is assessed on carefully selected circuits, and the overhead in terms of area and timing is reported.
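As an illustration of the fault dictionary step only, the following is a minimal software sketch, not the authors' flow. It assumes a single stuck-at fault model (the abstract does not name the fault model) and a made-up three-gate netlist; `NETLIST`, `simulate`, and `build_fault_dictionary` are hypothetical names. In the described methodology each fault would correspond to a faulty partial bitstream, whereas here each fault is simply simulated in Python.

```python
from itertools import product

# Hypothetical netlist: net -> (gate type, input nets), in topological order.
# It computes y = NOT((a AND b) OR c).
NETLIST = {
    "n1": ("AND", ("a", "b")),
    "n2": ("OR", ("n1", "c")),
    "y": ("NOT", ("n2",)),
}
INPUTS = ("a", "b", "c")
OUTPUT = "y"
GATES = {
    "AND": lambda x: int(all(x)),
    "OR": lambda x: int(any(x)),
    "NOT": lambda x: 1 - x[0],
}

def simulate(vector, fault=None):
    """Evaluate the netlist; fault is (net, stuck_value) or None."""
    values = dict(zip(INPUTS, vector))
    if fault and fault[0] in values:
        values[fault[0]] = fault[1]          # stuck-at fault on a primary input
    for net, (gate, ins) in NETLIST.items():
        values[net] = GATES[gate]([values[i] for i in ins])
        if fault and fault[0] == net:
            values[net] = fault[1]           # stuck-at fault on an internal net
    return values[OUTPUT]

def build_fault_dictionary():
    """Map each (net, stuck_value) to the input vectors that expose it."""
    nets = list(INPUTS) + list(NETLIST)
    vectors = list(product((0, 1), repeat=len(INPUTS)))
    dictionary = {}
    for net in nets:
        for stuck in (0, 1):
            fault = (net, stuck)
            dictionary[fault] = [
                v for v in vectors if simulate(v, fault) != simulate(v)
            ]
    return dictionary

fault_dictionary = build_fault_dictionary()
```

Each dictionary entry lists the input vectors whose output differs from the fault-free circuit, which is the information a fault emulator needs in order to decide whether a given test set detects the fault.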
FPGA-based rapid prototyping is widely used for fast simulation when verifying hardware structures. In this paper, we propose flipSyrup, a prototyping framework for cycle-accurate hardware simulations on abstract FPGA platforms. To mitigate the development complexity of FPGA-based simulators, the framework provides two abstractions of resources on FPGA platforms: memory systems and inter-FPGA interconnections on multi-FPGA platforms. The framework enables designers to describe target hardware using abstract interfaces that behave as ideal memory systems and interconnections on FPGA platforms. Our evaluation shows that the simulation slowdown introduced by these abstractions is not critical.
Increasing chip sizes and better programming tools have made it possible to push the boundaries of application acceleration with FPGAs. Two applications, localization microscopy and electron tomography, are presented in the author's PhD thesis and summarized in this paper. Both have been ported from imperative languages to the dataflow paradigm, which maps well onto long processing pipelines in custom hardware. The results show a 200-fold acceleration over an Intel i5 450 CPU for localization microscopy and a 5-fold acceleration over an Nvidia Tesla C1060 for electron tomography, while maintaining full accuracy. The main challenge arose from the need to fully understand and rewrite most of the imperative source in a form suitable for dataflow computing.
This paper presents a Zynq-capable version of GNU Radio, an open-source rapid radio deployment tool, with an enhanced flow that utilizes the processing capability of FPGAs. This work features TFlow, an FPGA back-end compilation accelerator for instant FPGA assembly. The Xilinx Zynq architecture integrates the FPGA fabric and CPU onto a single chip, which eliminates the need for a controlling host computer and thus provides a single, portable, low-power, embedded platform. By exploiting the computational advantages of FPGAs in the GNU Radio flow, a larger class of software-defined radios can be implemented. Once the FPGA is programmed with a design, modules can be parameterized to realize an even larger class of applications and further solidify the concept of rapid assembly of software-defined radios.
We introduce a library for the productive development of image processing accelerators using C-based high-level synthesis. The key concept of our approach is to provide a set of generic building blocks that is applicable to a multitude of image processing applications. An efficient memory architecture that facilitates easy integration of point and local image processing operators is the centerpiece of the library. The generic building blocks are kept very compact and can be tailored to support sophisticated processing techniques. The representation enables the designer to comply with specific design requirements, such as stringent timing constraints or limited resource budgets. Results show a significant gain in productivity compared to hand-coded implementations, while delivering comparable performance and resource requirements.
An approach to estimating the performance of FPGA architectures is proposed, based on a semi-supervised model tree algorithm. The proposed approach avoids synthesis, mapping, packing, placement, and routing, which are essential steps in a traditional flow for obtaining the performance of an FPGA. It is thus time-efficient, while the predicted performance remains close to the result obtained through the traditional method (a tool flow called VTR). This can be used effectively during the early FPGA design stage to choose an optimal architecture under a given metric. Comparisons are made between the performance obtained by the proposed approach and by VTR on a commercial 40nm technology. Results show that the proposed approach has an MRE below 7.62% compared to VTR, and reduces the time cost by a factor of thousands when used in architecture design space exploration.
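The abstract does not expand "MRE"; assuming it stands for mean relative error, the accuracy figure can be reproduced from paired predictions and reference values as in the sketch below. The function name and the sample delay values are hypothetical and are not taken from the paper.

```python
def mean_relative_error(predicted, reference):
    """Return the mean of |p - r| / |r| over paired samples."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - r) / abs(r) for p, r in zip(predicted, reference))
    return total / len(reference)

# Hypothetical critical-path delays in ns: model prediction vs. full CAD flow.
predicted_delay = [10.5, 19.0, 31.5]
reference_delay = [10.0, 20.0, 30.0]
mre = mean_relative_error(predicted_delay, reference_delay)
print(f"MRE = {mre:.2%}")
```

Under this definition, an MRE below 7.62% means that, averaged over the evaluated architectures, the prediction deviates from the VTR result by less than 7.62% of the VTR value.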
Graph mining is an important research area within the domain of data mining. One of the most challenging tasks in graph mining is frequent subgraph mining. This work presents, to the best of our knowledge, the first FPGA-based implementation of gSpan, the most efficient and well-known algorithm for the Frequent Subgraph Mining (FSM) problem. The proposed system, named High Performance Computing-gSpan (HPC-gSpan), achieves a manyfold speedup over the official software solution of the gboost library when executed on a high-end CPU for various real-world datasets.
High-speed and energy-efficient computations are mandatory in the financial and insurance industry to survive in competition and meet federal reporting requirements. For a hybrid CPU/FPGA system, we propose a modular pricing engine and derive a novel algorithmic extension able to exploit online dynamic reconfiguration. The result is a high-performance and energy-efficient pricing system suitable for exotic option pricing in the state-of-the-art Heston market model. With the online reconfiguration extension, our hybrid pricing system is nearly two orders of magnitude faster than high-end Intel CPUs, while consuming the same power.
Mapping complex mathematical expressions to DSP blocks through standard inference from pipelined code is inefficient and results in significantly reduced throughput. In this paper, we demonstrate the benefit of considering the structure and pipeline arrangement of DSP blocks during mapping. We have developed a tool that can map mathematical expressions using RTL inference, through high-level synthesis with Vivado HLS, and through a custom approach that incorporates DSP block structure. We show that the proposed method results in circuits that run at around double the frequency of the other methods, demonstrating that the structure of the DSP block must be considered when scheduling complex expressions.